I have a table of the following format:
mysql> describe tweet_info;
+-----------+--------------+------+-----+-------------------+-----------------------------+
| Field     | Type         | Null | Key | Default           | Extra                       |
+-----------+--------------+------+-----+-------------------+-----------------------------+
| tweet_id  | bigint(20)   | NO   | PRI | NULL              |                             |
| user_id   | bigint(20)   | YES  |     | NULL              |                             |
| tweet     | varchar(140) | YES  |     | NULL              |                             |
| timestamp | timestamp    | NO   |     | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| fav_count | int(11)      | YES  |     | NULL              |                             |
| lat       | float        | YES  |     | NULL              |                             |
| longi     | float        | YES  |     | NULL              |                             |
| hashtags  | varchar(140) | YES  |     | NULL              |                             |
+-----------+--------------+------+-----+-------------------+-----------------------------+
8 rows in set (0.00 sec)
and a file called mini.txt of the following schema:
<tweet_id> <user_id> <tweet_text> <timestamp> <favourite_count> <latitude> <longitude> <hashtags>
244435656850411520 522575984 #SGodoyAlmirall #hongostibetanos Sat Sep 08 14:02:56 +0000 2012 0 -70.29044372 -18.48140825 hongostibetanos
When I used the following query:
load data infile 'mini.txt' into table tweet_info fields terminated by '\t' lines terminated by '\n';
The query works fine and all lines in the file are inserted into my database; the only problem is that the timestamps are not handled correctly and all of them stay NULL. After searching the internet a bit, I found that the format of the timestamp can be set as follows:
load data infile 'mini.txt' into table tweet_info fields terminated by '\t' lines terminated by '\n' (@var4) SET timestamp=STR_TO_DATE(@var4,'%a %b %d %H:%i:%s +0000 %Y');
However, this generates the following error:
ERROR 1062 (23000): Duplicate entry '0' for key 'PRIMARY'
This seems weird since:
1. There are no duplicates in my file (I have manually checked the small file on which I am currently running my command).
2. The first command said nothing about duplicate entries and completed fine.
I would be really grateful if someone could help me out.
You need to list all the columns, using the table's column names, in the column list:
load data infile 'mini.txt'
into table tweet_info
fields terminated by '\t'
lines terminated by '\n'
(tweet_id, user_id, tweet, @var4, fav_count, lat, longi, hashtags)
SET timestamp=STR_TO_DATE(@var4,'%a %b %d %H:%i:%s +0000 %Y');
Your code was assigning the first column in the input file to @var4, converting that to a date, and then inserting a row with only the timestamp column specified. All the other columns were left at their defaults, so every row got tweet_id = 0, and each row after the first collided with that primary key value.
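If you want to sanity-check the format string first, you can run the conversion by itself on the sample value from mini.txt; with this format it should come back as 2012-09-08 14:02:56:
SELECT STR_TO_DATE('Sat Sep 08 14:02:56 +0000 2012',
                   '%a %b %d %H:%i:%s +0000 %Y');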
When I try to import a CSV into a MySQL table, I get an error:
Data too long for column 'incident' at row 1
I'm sure none of the values are longer than 12 characters (the column is varchar(12)), but I still get the error.
MariaDB [pagerduty]>
LOAD DATA INFILE '/var/lib/mysql/pagerduty/script_output.csv'
REPLACE INTO TABLE incidents
ignore 1 lines;
ERROR 1406 (22001): Data too long for column 'incident' at row 1
MariaDB [pagerduty]>
LOAD DATA INFILE '/var/lib/mysql/pagerduty/script_output.csv'
INTO TABLE incidents
ignore 1 lines;
ERROR 1406 (22001): Data too long for column 'incident' at row 1
When trying with REPLACE, only one column (the one set as the primary key) gets populated.
MariaDB [pagerduty]>
LOAD DATA INFILE '/var/lib/mysql/pagerduty/script_output.csv'
IGNORE INTO TABLE incidents
ignore 1 lines;
Query OK, 246 rows affected, 1968 warnings (0.015 sec)
Records: 246 Deleted: 0 Skipped: 0 Warnings: 1968
Columns:
+----------------+--------------+------+-----+---------+-------+
| Field          | Type         | Null | Key | Default | Extra |
+----------------+--------------+------+-----+---------+-------+
| incident       | varchar(12)  | NO   | PRI | NULL    |       |
| description    | varchar(300) | YES  |     | NULL    |       |
| status         | varchar(12)  | YES  |     | NULL    |       |
| urgency        | varchar(7)   | YES  |     | NULL    |       |
| service        | varchar(27)  | YES  |     | NULL    |       |
| trigger        | varchar(25)  | YES  |     | NULL    |       |
| team           | varchar(20)  | YES  |     | NULL    |       |
| incident_start | datetime(6)  | YES  |     | NULL    |       |
| incident_end   | datetime(6)  | YES  |     | NULL    |       |
| resolved_by    | varchar(20)  | YES  |     | NULL    |       |
+----------------+--------------+------+-----+---------+-------+
10 rows in set (0.003 sec)
By default, MySQL looks for a TAB character to separate values. Your file is using a comma, so MySQL reads the entire line and assumes it is the value for the first column only.
You need to tell MySQL that the column terminator is a comma, and while you're at it, tell it about the enclosing double quotes.
Try this:
LOAD DATA INFILE '/var/lib/mysql/pagerduty/script_output.csv' REPLACE INTO TABLE incidents
columns terminated by ','
optionally enclosed by '"'
ignore 1 lines;
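If a load still reports warnings (like the 1968 above), you can inspect the first few right after the statement to see exactly which columns and rows were affected:
SHOW WARNINGS LIMIT 10;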
If you THINK your data appears OK and it's still nagging about the data being too long, how about creating a new temporary table structure and setting your first column, incident, to a varchar(100) just for grins... maybe even a few others if they too might be causing a problem.
Import the data to THAT table to see whether you get the same error or not.
If there's no error, then check the maximum trimmed length of the data in the respective columns and analyze the data itself... bad format, longer than expected, etc.
Once resolved, you can pull the data into production. You could also always PRE-LOAD the data into this larger table structure and truncate it before each load, so a fresh load creates no dups via the primary key.
I have had to do that in the past; it was also efficient for pre-qualifying lookup table IDs for new incoming data. You can additionally apply data cleansing in the temp table before pulling into production.
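A minimal sketch of that staging approach, assuming a scratch table named incidents_staging (the name and the widened length are illustrative, not something from the original post):
-- clone the structure, then widen the suspect column
CREATE TABLE incidents_staging LIKE incidents;
ALTER TABLE incidents_staging MODIFY incident VARCHAR(100);
-- truncate before each load so the primary key can't cause duplicates
TRUNCATE TABLE incidents_staging;
LOAD DATA INFILE '/var/lib/mysql/pagerduty/script_output.csv'
INTO TABLE incidents_staging
COLUMNS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
IGNORE 1 LINES;
-- then see how long the data really is
SELECT MAX(CHAR_LENGTH(TRIM(incident))) FROM incidents_staging;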
I have a CSV file containing rows like the following:
2,1,abc123,1,2,"Hello World"
2,1,abc123,1,2,"Hello World2"
2,1,abc123,1,2,"Hello World3"
I'm running the following query:
LOAD DATA LOCAL INFILE :path INTO TABLE errors
CHARACTER SET 'utf8'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\\'
LINES TERMINATED BY '\n'
(import_id, type, code, row, cell, message);
It does not insert any of my rows into the database.
Here's the structure for the errors table:
+-----------+------------------+------+-----+---------+----------------+
| Field     | Type             | Null | Key | Default | Extra          |
+-----------+------------------+------+-----+---------+----------------+
| id        | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| import_id | int(10) unsigned | NO   | MUL | NULL    |                |
| type      | int(10) unsigned | NO   |     | NULL    |                |
| code      | varchar(128)     | YES  |     | NULL    |                |
| row       | int(10) unsigned | YES  |     | NULL    |                |
| cell      | varchar(32)      | YES  |     | NULL    |                |
| message   | varchar(128)     | YES  |     | NULL    |                |
+-----------+------------------+------+-----+---------+----------------+
I've noticed that if I change the order of the columns, it works.
For example, in my CSV file
1,abc123,1,2,"Hello World",2
1,abc123,1,2,"Hello World2",2
1,abc123,1,2,"Hello World3",2
I also changed the query:
LOAD DATA LOCAL INFILE :path INTO TABLE errors
CHARACTER SET 'utf8'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\\'
LINES TERMINATED BY '\n'
(type, code, row, cell, message, import_id);
Why does it work with a different column order?
Please verify your MySQL version, and check that LOAD DATA LOCAL INFILE is enabled both in the client tool you are using and on the server side. It is considered a security risk and is disabled by default in newer versions of MySQL Server.
The MySQL official docs have more info on the security issues around LOAD DATA LOCAL.
Also note that you probably need local_infile=1; you can check it with the following command:
SHOW GLOBAL VARIABLES LIKE 'local_infile';
To enable it, use the following command:
SET GLOBAL local_infile = 1;
Also verify that the lines are terminated by '\n' and not by '\r\n' (the latter is typical for files created on Windows).
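If the file does come from Windows, the same statement with a CRLF terminator would look like this (only the LINES clause changes):
LOAD DATA LOCAL INFILE :path INTO TABLE errors
CHARACTER SET 'utf8'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\\'
LINES TERMINATED BY '\r\n'
(import_id, type, code, row, cell, message);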
Hope that helps!
In the other question, I asked why non-empty values become NULL. In this question, I'm asking how to make empty values become NULL rather than zero.
I found that empty float values are always loaded as 0, not NULL. How can I change that?
Here is the code with which I created the table and loaded the data.
CREATE TABLE Products(sku INTEGER, name VARCHAR(255), description TEXT,
regularPrice FLOAT,
customerReviewAverage FLOAT default NULL );
LOAD DATA LOCAL INFILE 'product.csv'
INTO TABLE Products
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(sku, name, @description, regularPrice, customerReviewAverage)
SET description = IF(@description='',NULL,@description);
This is a sample of data in product.csv.
19658847,Glanzlichter - CD,,12.99,5.0
19658856,Glanzlichter - CD,,6.99,
19658865,Glanzlichter - CD,,8.99,
1965886,Beach Boys '69 - CASSETTE,,6.99,4.5
The issue is how MySQL interprets an empty field vs a missing field. From the docs for LOAD DATA INFILE...
If an input line has too few fields, the table columns for which input fields are missing are set to their default values.
An empty field value is interpreted differently from a missing field. For string types, the column is set to the empty string. For numeric types, the column is set to 0.
In this case it seems MySQL considers it to be empty. You can see this from show warnings.
mysql> show warnings;
+---------+------+------------------------------------------------------------+
| Level   | Code | Message                                                    |
+---------+------+------------------------------------------------------------+
| Warning | 1265 | Data truncated for column 'customerReviewAverage' at row 2 |
| Warning | 1265 | Data truncated for column 'customerReviewAverage' at row 3 |
+---------+------+------------------------------------------------------------+
2 rows in set (0.00 sec)
Whereas if we remove the trailing commas so the field is missing...
19658847,Glanzlichter - CD,,12.99,5.0
19658856,Glanzlichter - CD,,6.99
19658865,Glanzlichter - CD,,8.99
1965886,Beach Boys '69 - CASSETTE,,6.99,4.5
Then the missing column is set to NULL.
mysql> show warnings;
+---------+------+--------------------------------------------+
| Level   | Code | Message                                    |
+---------+------+--------------------------------------------+
| Warning | 1261 | Row 2 doesn't contain data for all columns |
| Warning | 1261 | Row 3 doesn't contain data for all columns |
+---------+------+--------------------------------------------+
mysql> select * from products;
+----------+---------------------------+-------------+--------------+-----------------------+
| sku      | name                      | description | regularPrice | customerReviewAverage |
+----------+---------------------------+-------------+--------------+-----------------------+
| 19658847 | Glanzlichter - CD         | NULL        |        12.99 |                     5 |
| 19658856 | Glanzlichter - CD         | NULL        |         6.99 |                  NULL |
| 19658865 | Glanzlichter - CD         | NULL        |         8.99 |                  NULL |
|  1965886 | Beach Boys '69 - CASSETTE | NULL        |         6.99 |                   4.5 |
+----------+---------------------------+-------------+--------------+-----------------------+
4 rows in set (0.00 sec)
Doing the same thing for @customerReviewAverage as for @description worked for me in MySQL 5.7.
LOAD DATA
LOCAL INFILE 'product.csv'
INTO TABLE Products
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(sku, name, @description, regularPrice, @customerReviewAverage)
SET description = IF(@description='',NULL,@description),
customerReviewAverage = IF(@customerReviewAverage='',NULL,@customerReviewAverage);
mysql> select * from products;
+----------+---------------------------+-------------+--------------+-----------------------+
| sku      | name                      | description | regularPrice | customerReviewAverage |
+----------+---------------------------+-------------+--------------+-----------------------+
| 19658847 | Glanzlichter - CD         | NULL        |        12.99 |                     5 |
| 19658856 | Glanzlichter - CD         | NULL        |         6.99 |                  NULL |
| 19658865 | Glanzlichter - CD         | NULL        |         8.99 |                  NULL |
|  1965886 | Beach Boys '69 - CASSETTE | NULL        |         6.99 |                   4.5 |
+----------+---------------------------+-------------+--------------+-----------------------+
4 rows in set (0.00 sec)
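As an aside, NULLIF can express the same empty-to-NULL conversion a bit more compactly; as far as I know this is equivalent to the IF version above:
SET description = NULLIF(@description, ''),
    customerReviewAverage = NULLIF(@customerReviewAverage, '');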
I want to import data from Excel into a MySQL database using the command line client.
This is an example of how my csv-file is built:
Name 1 | 1 | 2 | 3 |
Name 2 | 1 | 2 | 3 |
Name 3 | 1 | 2 | 3 |
I'm using the code:
LOAD DATA LOCAL INFILE 'path to file.csv'
INTO TABLE table_name
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';
I get the "Query OK" and with this code the formatting on the table should be almost exactly as the csv-file but I get this result:
| NULL | NULL | NULL |
| NULL | NULL | NULL |
| NULL | NULL | NULL |
What is wrong?
It seems you have used '|' as the delimiter in your CSV file instead of a comma. Try:
LOAD DATA LOCAL INFILE 'path to file.csv'
INTO TABLE table_name
FIELDS TERMINATED BY '|'
LINES TERMINATED BY '\n';
Assume the following table structure:
CREATE TABLE `table_name` (
`name` VARCHAR(20) CHARACTER SET utf8 DEFAULT NULL,
`value1` INT(11) DEFAULT NULL,
`value2` INT(11) DEFAULT NULL,
`value3` INT(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
and the file, csv-file.csv:
Name 1,1,2,3
Name 2,1,2,3
Name 3,1,2,3
when I run the statement:
mysql> LOAD DATA INFILE '/path/csv-file.csv'
-> INTO TABLE `table_name`
-> FIELDS TERMINATED BY ','
-> LINES TERMINATED BY '\n';
mysql> SELECT `name`, `value1`, `value2`, `value3`
FROM `table_name`;
I get the following result:
+--------+--------+--------+--------+
| name   | value1 | value2 | value3 |
+--------+--------+--------+--------+
| Name 1 |      1 |      2 |      3 |
| Name 2 |      1 |      2 |      3 |
| Name 3 |      1 |      2 |      3 |
+--------+--------+--------+--------+
3 rows in set (0.00 sec)
I'm trying to load different data, from different files, into multiple columns in MySQL. I'm not a big database guy, so maybe I have my data structured wrong. :)
Here's how I have it set up:
DATABASE: mydb
TABLE: aixserver1
COLUMNS: os, hostname, num_users, num_groups, pkg_epoch
as shown in mysql:
+---------------+-----------+------+-----+-------------------+----------------+
| Field         | Type      | Null | Key | Default           | Extra          |
+---------------+-----------+------+-----+-------------------+----------------+
| id            | int(11)   | NO   | PRI | NULL              | auto_increment |
| cur_timestamp | timestamp | NO   |     | CURRENT_TIMESTAMP |                |
| pkg_epoch     | int(11)   | NO   |     | NULL              |                |
| os            | char(5)   | YES  |     | NULL              |                |
| hostname      | char(40)  | YES  |     | NULL              |                |
| num_users     | int(10)   | YES  |     | NULL              |                |
| num_groups    | int(10)   | YES  |     | NULL              |                |
+---------------+-----------+------+-----+-------------------+----------------+
So basically I want to populate pkg_epoch, os, hostname, num_users and num_groups into the database. The data I want to load is inside 5 flat files on the server. I'm using ruby to load the data.
My question is: how do I load all these values from those files into my table at once? If I do my inserts one at a time, the other fields become NULL; i.e., if I load data into just the hostname column, all the other columns in that row become NULL.
What am I missing? :)
You can do this a couple of ways, but the trick is to use a variable placeholder. Here is an example using the database's LOAD DATA function:
LOAD DATA INFILE '/PATH/TO/FILE' IGNORE INTO TABLE tableName FIELDS TERMINATED BY '\t' LINES
TERMINATED BY '\r' (@skip, @skip, @skip, login_name, pwd, @skip, @skip, @skip, @skip, @skip, first_name, last_name);
You see, I just assign a variable @skip (or @anything) to the fields I don't want to include, and name the columns that I do want.
I can get you halfway there with this, but I'm uncertain of the best approach if you build your own loader in Ruby. I would suggest you retrieve the file and let MySQL import it using LOAD DATA, as it will be very performant and you can use the trick above.
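Adapted to the table above, a sketch might look like the following, assuming the five values are first merged into one tab-separated line per server (the file name combined.txt is hypothetical). The id and cur_timestamp columns are omitted from the list because auto_increment and the column default fill them in:
LOAD DATA INFILE '/path/to/combined.txt'
INTO TABLE aixserver1
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
(os, hostname, num_users, num_groups, pkg_epoch);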