Strange behaviour of load data infile - mysql

I have the following simple CSV file which I want to load into a MySQL table:
ViolationUtil1,RU
FiftyFifty2013_prof1,UM
Lunch_util1,RM
...
It contains several rows with two fields separated by a comma. I load it using the following command:
LOAD DATA LOCAL INFILE 'domains.txt'
INTO TABLE domains
FIELDS TERMINATED BY ",";
The table domains is defined in the following way:
CREATE TABLE domains (
domain varchar(63),
code varchar(3)
);
Strangely, the first letter of the first column disappears!
mysql> select * from domains;
+----------------------+------+
| domain               | code |
+----------------------+------+
|iolationUtil1         | RU
|iftyFifty2013_prof1   | UM
|unch_util1            | RM
+----------------------+------+
When I changed the definition of the "code" column to:
code char(2)
I mysteriously got the correct result:
mysql> select * from domains;
+----------------------+------+
| domain               | code |
+----------------------+------+
| ViolationUtil1       | RU   |
| FiftyFifty2013_prof1 | UM   |
| Lunch_util1          | RM   |
+----------------------+------+
What is happening here?
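A hedged reading of the symptom, consistent with the answers further down this page: if domains.txt has Windows \r\n line endings, the default LINES TERMINATED BY '\n' leaves the trailing \r stored in code. A varchar(3) keeps all three characters, 'RU' plus '\r', and when the client prints the row the carriage return sends the cursor back to the start of the line, so the rest of the output overwrites the first characters; char(2) silently truncates the '\r' away, which is why the output looks correct again. A sketch of a check and the likely fix:
-- does the stored value end in a 0D (carriage return) byte?
SELECT code, LENGTH(code), HEX(code) FROM domains;
-- if so, reload declaring the Windows line terminator
LOAD DATA LOCAL INFILE 'domains.txt'
INTO TABLE domains
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\r\n';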

Related

How to import CSV data with JSON-type fields into a MySQL database where the table has columns of the corresponding JSON type

I have a MySQL table and a CSV file. The table has a JSON-type column, and the CSV file has a corresponding JSON-type field. When I use the "load data local infile..." method to import the CSV file into MySQL, there is a problem with this process.
Here are my table details:
mysql> desc test;
+---------+--------------+------+-----+---------+----------------+
| Field   | Type         | Null | Key | Default | Extra          |
+---------+--------------+------+-----+---------+----------------+
| id      | int          | NO   | PRI | NULL    | auto_increment |
| content | json         | YES  |     | NULL    |                |
| address | varchar(255) | NO   |     | NULL    |                |
| type    | int          | YES  |     | 0       |                |
+---------+--------------+------+-----+---------+----------------+
And my SQL statement:
mysql> load data local infile '/Users/kk/Documents/test.csv'
-> into table test
-> fields terminated by ','
-> lines terminated by '\n'
-> ignore 1 rows
-> (id,address,content,type);
ERROR 3140 (22032): Invalid JSON text: "The document root must not be followed by other values." at position 3 in value for column 'test.content'.
My CSV file data is as follows:
"id","address","content","type"
1,"test01","{\"type\": 3, \"chain\": 1, \"address\": \"test01\"}",1
2,"test02","{\"type\": 3, \"chain\": 2, \"address\": \"test02\"}",1
If you are able to hand-craft a single INSERT statement that works (example here), you could go via a preprocessor written in a simple scripting language: Python, AutoIt, PowerShell, ... Using a preprocessor, you have more control over fields, quoting, ordering, etc. than with a direct import in MySQL.
So, for example (assuming you use Python):
python split.py /Users/kk/Documents/test.csv > /tmp/temp.sql
mysql -h myhostname -u myUser mydatabase < /tmp/temp.sql
where temp.sql would be something like
insert into test (content, address, type) values ('{"type":3,"chain":1,"address":"test01"}', 'test01', 1);
...
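An alternative worth trying (an untested sketch, not part of the answer above): the JSON values in the CSV are wrapped in double quotes and contain commas, so declaring the quoting to LOAD DATA may be enough. With ENCLOSED BY '"' the commas inside the JSON no longer split fields, and MySQL's default ESCAPED BY '\' should turn the \" sequences back into plain quotes:
load data local infile '/Users/kk/Documents/test.csv'
into table test
fields terminated by ',' enclosed by '"'
lines terminated by '\n'
ignore 1 rows
(id, address, content, type);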

Loading data to a table from a file

I need to load data into a table from a file using the LOAD DATA command.
I've got a txt file that looks something like this:
1 "MARCA"#"MODELO"#"MATRICULA"#PRECIO
2 "CITROEN"#"PICASSA"#"CPG-2044"#12000
3 "CITROEN"#"PICASSA"#"CPR-1762"#12500
4 "CITROEN"#"C4"#"FPP-1464"#13500
5 "CITROEN"#"C4"#"FDR-4563"#13000
6 "CITROEN"#"C3"#"BDF-8856"#8000
7 "CITROEN"#"C3"#"BPZ-7878"#7500
8 "CITROEN"#"C2"#"CDR-1515"#5000
9 "CITROEN"#"C2"#"BCC-3434"#4500
Now, my first table is constructed as follows:
mysql> show columns from MARCAS;
+----------+-------------+------+-----+---------+----------------+
| Field    | Type        | Null | Key | Default | Extra          |
+----------+-------------+------+-----+---------+----------------+
| ID_MARCA | int(11)     | NO   | PRI | NULL    | auto_increment |
| MARCA    | varchar(50) | YES  |     | NULL    |                |
+----------+-------------+------+-----+---------+----------------+
Now, I don't really know how to import data partially, as what I need to do is just load the first 'column'. What I came up with is:
load data local infile '/myfile.txt'
into table MARCAS
fields terminated by '#'
lines terminated by '\n';
but that just does nothing (apart from suspending the terminal).
Help please?
You can also discard an input value by assigning it to a user variable
and not assigning the variable to a table column.
Source: http://dev.mysql.com/doc/refman/5.7/en/load-data.html
load data local infile '/myfile.txt'
into table MARCAS
fields terminated by '#'
lines terminated by '\n'
(ID_MARCA, MARCA, @ignore1, @ignore2, @ignore3);
footnotes:
Your query is most unusual in the sense that you have your column names in upper case and SQL keywords in lower case. The usual convention is the other way round!
You have said your MySQL console gets suspended; I believe what you mean is that it takes a long time to return after this query is typed. If you have a large number of rows, there's nothing unusual in that.
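A related pattern from the same manual page, sketched here untested against your exact file: read a field into a user variable first and transform it with SET on the way in, e.g. to strip the double quotes the file carries around MARCA, since no ENCLOSED BY clause is given:
load data local infile '/myfile.txt'
into table MARCAS
fields terminated by '#'
lines terminated by '\n'
(@marca, @ignore1, @ignore2, @ignore3)
set MARCA = trim(both '"' from @marca);
ID_MARCA is omitted from the column list here so the auto_increment takes care of it.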

MySQL issue when importing specific CSV files with blank values in random rows

I brought this up earlier, but after doing some research, I realized I was looking in the wrong place. Here is the situation. I create this table:
CREATE TABLE PC_Contacts
(
POC VARCHAR(255) PRIMARY KEY NOT NULL,
Phone_1 VARCHAR(255),
Phone_2 VARCHAR(255)
);
I import a CSV file into MySQL which has the values for my table PC_Contacts:
USE Network
LOAD DATA INFILE 'C:\\ProgramData\\MySQL\\MySQL Server 5.7\\Uploads\\PC_Contacts.csv'
INTO Table PC_Contacts
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;
My output after importing looks like this:
+------------------+--------------+---------------+
| POC              | Phone_1      | Phone_2       |
+------------------+--------------+---------------+
|April Wilson      | 123-456-5000 | 123-456-5006
|                  | 123-456-2222 |
|                  | 123-456-5331 |
|                  | 123-456-7772 |
|Anton Watson      | 123-456-1258 | 123-456-6005
|Elisa Kerring     | 123-456-1075 | 123-456-4475
Now, as you may recall from my code above, POC is the PK. In the original CSV file I had a value on every line. However, as you see, any row with no value on the right disturbs the display of the left columns. Yet if I pull up the table in the GUI, it shows the cells as populated, so the data is there. If I were to put in xxx-xxx-xxxx for the missing values, it would fix the issue:
+------------------+--------------+---------------+
| POC              | Phone_1      | Phone_2       |
+------------------+--------------+---------------+
|April Wilson      | 123-456-5000 | 123-456-5006
|Nicky Nite        | 123-456-2222 | xxx-xxx-xxxx
|Nicole            | 123-456-5331 | xxx-xxx-xxxx
|Becky             | 123-456-7772 | xxx-xxx-xxxx
|Anton Watson      | 123-456-1258 | 123-456-6005
|Elisa Kerring     | 123-456-1075 | 123-456-4475
Obviously my intention is to see the values without having to apply special formatting on the command line. Is there maybe a special SELECT command for that?
Here is a link to a portion of the .CSV, as requested:
https://drive.google.com/file/d/0B0MMqHN75RpGdkZhcGp0SWtmams/view?usp=sharing
Your CSV file contains a carriage return before the newline at the end of each row, which breaks the formatting. Use:
SELECT POC, Phone_1, REPLACE(Phone_2, '\r', '') AS Phone_2 FROM PC_Contacts;
Or change your import query as follows:
USE Network
LOAD DATA INFILE 'C:\\ProgramData\\MySQL\\MySQL Server 5.7\\Uploads\\PC_Contacts.csv'
INTO Table PC_Contacts
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS;
And use a simple SELECT:
SELECT * FROM PC_Contacts;
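If the table has already been imported with the stray carriage returns, a one-off cleanup along the same lines as the REPLACE() trick above avoids a re-import (a sketch; only the last column can pick up the '\r'):
UPDATE PC_Contacts SET Phone_2 = REPLACE(Phone_2, '\r', '');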

Hive Table returning empty result set on all queries

I created a Hive table which loads data from a text file, but it returns an empty result set for all queries.
I tried the following command:
CREATE TABLE table2(
id1 INT,
id2 INT,
id3 INT,
id4 STRING,
id5 INT,
id6 STRING,
id7 STRING,
id8 STRING,
id9 STRING,
id10 STRING,
id11 STRING,
id12 STRING,
id13 STRING,
id14 STRING,
id15 STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE
LOCATION '/user/biadmin/lineitem';
The command gets executed and the table gets created, but it always returns 0 rows for all queries, including SELECT * FROM table2;
Sample data:
Single line of the input data:
1|155190|7706|1|17|21168.23|0.04|0.02|N|O|1996-03-13|1996-02-12|1996-03-22|DELIVER IN PERSON|TRUCK|egular courts above the|
I have attached the screen shot of the data file.
Output for command: DESCRIBE FORMATTED table2;
| Wed Apr 16 20:18:58 IST 2014 : Connection obtained for host: big-instght-15.persistent.co.in, port number 1528. |
| # col_name data_type comment |
| |
| id1 int None |
| id2 int None |
| id3 int None |
| id4 string None |
| id5 int None |
| id6 string None |
| id7 string None |
| id8 string None |
| id9 string None |
| id10 string None |
| id11 string None |
| id12 string None |
| id13 string None |
| id14 string None |
| id15 string None |
| |
| # Detailed Table Information |
| Database: default |
| Owner: biadmin |
| CreateTime: Mon Apr 14 20:17:31 IST 2014 |
| LastAccessTime: UNKNOWN |
| Protect Mode: None |
| Retention: 0 |
| Location: hdfs://big-instght-11.persistent.co.in:9000/user/biadmin/lineitem |
| Table Type: MANAGED_TABLE |
| Table Parameters: |
| serialization.null.format |
| transient_lastDdlTime 1397486851 |
| |
| # Storage Information |
| SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe |
| InputFormat: org.apache.hadoop.mapred.TextInputFormat |
| OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat |
| Compressed: No |
| Num Buckets: -1 |
| Bucket Columns: [] |
| Sort Columns: [] |
| Storage Desc Params: |
| field.delim | |
+-----------------------------------------------------------------------------------------------------------------+
Thanks!
Please make sure that the location /user/biadmin/lineitem.txt actually exists and that you have data present there. Since you are using the LOCATION clause, your data must be present there instead of in the default warehouse location, /user/hive/warehouse.
Do a quick ls to verify that:
bin/hadoop fs -ls /user/biadmin/lineitem.txt
Also, make sure that you are using the proper delimiter.
Did you try loading the file explicitly with a LOAD command? In Hive the statement is LOAD DATA LOCAL INPATH; the field delimiter comes from the table definition rather than from the load statement:
LOAD DATA LOCAL INPATH '/user/biadmin/lineitem.txt' INTO TABLE table2;
Documentation: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
Are you using a managed table or an external table? If it is an external table, you should use the EXTERNAL keyword while creating the table, and then load data into it using the LOAD command. If it is a managed table, after loading data into the table you can see the data in your Hive warehouse directory in Hadoop; the default path is "/user/hive/warehouse/yourtablename".
You should run the LOAD command in the Hive shell.
I was able to load the data into the table. The problem was that
LOCATION '/user/biadmin/lineitem';
wasn't loading any data. But when I gave the directory containing the file as the path, like:
LOCATION '/user/biadmin/tpc-h';
where I had put the lineitem.txt file in the tpc-h directory,
it worked!
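For reference, a minimal sketch of the pattern that finally worked, with a shortened column list and an assumed table name lineitem_ext; the key point is that LOCATION names a directory containing the data file, not the file itself (Hive simply ignores trailing fields beyond the declared columns):
CREATE EXTERNAL TABLE lineitem_ext (
id1 INT,
id2 INT,
id3 INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE
LOCATION '/user/biadmin/tpc-h';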

mysql load query not working perfectly

I want to insert data into a table via the LOAD command in MySQL, but whenever I run my query the data is entered only into the first column, and the other one is NULL.
My text file is:
- 1 server
- 2 client
- 3 network
- 4 system
The first column is the error code and the second is the comment, and the query is:
load data local infile 'C:/Users/nco/Desktop/help.txt' into table help;
After that, select * from help;
And the output is:
mysql> select * from help;
+------------+-------------+
| error_code | description |
+------------+-------------+
|          1 | NULL        |
|          2 | NULL        |
|          3 | NULL        |
|          4 | NULL        |
+------------+-------------+
4 rows in set (0.03 sec)
Any idea what the problem might be?
If you created the file on Windows with an editor that uses \r\n as a line terminator, you should use this statement instead:
LOAD DATA LOCAL INFILE 'C:/Users/nco/Desktop/help.txt' INTO TABLE help
LINES TERMINATED BY '\r\n';
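LOAD DATA reports problems like this as warnings rather than errors, so inspecting them right after the load can confirm the diagnosis (a general MySQL suggestion, not part of the original answer):
LOAD DATA LOCAL INFILE 'C:/Users/nco/Desktop/help.txt' INTO TABLE help
LINES TERMINATED BY '\r\n';
SHOW WARNINGS;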