Loading csv into mysql selecting columns - mysql

I am trying to learn how to use MySQL efficiently. Now, I want to load a CSV containing the bibliography of an author into a MySQL database. This is the code I have for generating the database and trying to upload the file:
USE stephenkingbooks;
DROP TABLE IF EXISTS stephenkingbooks;
CREATE TABLE stephenkingbooks
(
`id` int unsigned NOT NULL auto_increment,
`original_title` varchar(255) NOT NULL,
`spanish_title` varchar(255) NOT NULL,
`year` decimal(4) NOT NULL,
`pages` decimal(10) NOT NULL,
`in_collection` enum('Y','N') NOT NULL DEFAULT 'N',
`read` enum('Y','N') NOT NULL DEFAULT 'N',
PRIMARY KEY (id)
);
LOAD DATA LOCAL INFILE '../files/unprocessed_sking.csv'
INTO TABLE stephenkingbooks (column1, column2, column4, column3)
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS;
The CSV file is formatted like this:
Carrie,Carrie,Terror,199,19745,"En 1976, el director de cine Brian de Palma hizo la primera película basada en la novela.7 3"
My idea is to load only the first two columns, the first corresponding to the original_title and the second being the spanish_title (the same in MySQL and the CSV); after that, column 3 in the CSV would be the pages and column 4 the year.
In addition, for the year column, I only want to take the first four digits of the field, because some of them include a reference number that is not part of the year. For example, Carrie was released in 1974, but the CSV includes a 5 in the date that I do not want to keep.
My problem is that I am not able to get this working without errors in my terminal... any suggestions?

13.2.6 LOAD DATA INFILE Syntax
...
You must also specify a column list if the order of the fields in the
input file differs from the order of the columns in the table.
...
Try:
mysql> LOAD DATA INFILE '../files/unprocessed_sking.csv'
-> INTO TABLE `stephenkingbooks`
-> FIELDS TERMINATED BY ','
-> ENCLOSED BY '"'
-> LINES TERMINATED BY '\r\n'
-> (`original_title`, `spanish_title`, @`genre`, @`pages`, @`year`)
-> SET `year` = LEFT(@`year`, 4), `pages` = @`pages`;
Query OK, 1 row affected (0.00 sec)
Records: 1 Deleted: 0 Skipped: 0 Warnings: 0
mysql> SELECT
-> `id`,
-> `original_title`,
-> `spanish_title`,
-> `year`,
-> `pages`,
-> `in_collection`,
-> `read`
-> FROM `stephenkingbooks`;
+----+----------------+---------------+------+-------+---------------+------+
| id | original_title | spanish_title | year | pages | in_collection | read |
+----+----------------+---------------+------+-------+---------------+------+
|  1 | Carrie         | Carrie        | 1974 |   199 | N             | N    |
+----+----------------+---------------+------+-------+---------------+------+
1 row in set (0.00 sec)
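For reference, the LOCAL and IGNORE 1 ROWS options from the question slot into the same statement. Note that the column list and SET clause must come after the FIELDS, LINES, and IGNORE clauses, and the listed names must be actual table columns or @variables; both points differ from the question's attempt, which is what raises the syntax error. A combined sketch (untested):
LOAD DATA LOCAL INFILE '../files/unprocessed_sking.csv'
INTO TABLE `stephenkingbooks`
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS
(`original_title`, `spanish_title`, @`genre`, @`pages`, @`year`)
SET `year` = LEFT(@`year`, 4), `pages` = @`pages`;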

Related

mysql change HEX number to decimal in LOAD DATA LOCAL INFILE

All,
I have a test.csv file with Id in HEX numbers as below:
Id, DateTime,...
66031851, ...
2E337E4E, ...
The table_test is created using MySQL as below:
CREATE TABLE table_test(
Id BIGINT NOT NULL,
DateTime DATETIME NOT NULL,
OtherId BIGINT NOT NULL,
...,
PRIMARY KEY (Id, DateTime, OtherId)
)ENGINE=InnoDB DEFAULT CHARSET=utf8;
The created table_test is as below:
+----------+------------+------+-----+---------+-------+
| Field    | Type       | Null | Key | Default | Extra |
+----------+------------+------+-----+---------+-------+
| Id       | bigint(20) | NO   | PRI | NULL    |       |
| DateTime | datetime   | NO   | PRI | NULL    |       |
...
I am using MySQL as below to load the data into the table:
load data local infile 'test.csv'
replace into table table_test
character set utf8mb4
fields terminated by ','
ENCLOSED BY '\"'
lines terminated by '\n'
ignore 1 lines
SET Id = CONV(Id, 16, 10);
Also tried:
SET Id=cast(CONV(Id, 16, 10) AS UNSIGNED)
and
SET Id=cast(CONV(CONVERT(Id,CHAR), 16, 10) AS UNSIGNED)
But the HEX numbers with letters, like "2E337E4E", do not work: they become some very big number, bigger than a BIGINT. But when I try the following in MySQL:
select CONV('2E337E4E', 16, 10);
it works as expected, with the correct result "775126606". So I think I am missing a step in LOAD DATA to make the Id a string for CONV(). I searched for some time but did not find a solution.
Does anyone have an idea or hint?
Thanks very much
Zhihong
The typical solution for this type of problem is to load the value into a user-defined variable, then do the conversion in a SET statement.
Something like this should work for you:
load data local infile 'test.csv'
replace into table table_test
character set utf8mb4
fields terminated by ','
ENCLOSED BY '\"'
lines terminated by '\n'
ignore 1 lines
(@Id, `DateTime`, <explicitly list all other columns>)
SET Id = CONV(@Id, 16, 10);
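As a quick sanity check after loading, the stored values can be compared against CONV() run directly on the hex strings (the expected 775126606 comes from the test in the question):
SELECT Id FROM table_test ORDER BY Id;
-- '2E337E4E' should now be stored as 775126606,
-- matching SELECT CONV('2E337E4E', 16, 10);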

MySQL export table to text file fields name

Let's say I have the following table in MySQL:
create table test_tbl
(
col1 varchar(100),
col2 varchar(100),
amount int,
created datetime
)
Data insert
Insert into test_tbl values('unu', 'doi', 10, '05/01/2015');
Insert into test_tbl values('patru', 'trei', 400, '04/01/2015');
I need to export all the data from that table in the following format; the file should be a txt file.
"col1"="unu","col2"="doi","amount"="10","created"="05/01/2015"
"col1"="patru","col2"="trei","amount"="400","created"="04/01/2015"
So the logic is: each column name with its value, separated by commas.
Is it possible to get such a result in MySQL?
To export the table's data:
SELECT CONCAT('"col1"="',col1,'","col2"="',col2,'","amount"="',amount,'","created"="',DATE_FORMAT(created,'%d/%m/%Y'),'"') t
FROM test_tbl
INTO OUTFILE '/tmp/test.txt'
CHARACTER SET latin1
FIELDS ENCLOSED BY ''
LINES TERMINATED BY '\r\n';
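If the server refuses to write the file, checking secure_file_priv first can save time (a general MySQL note, not from the original answer; many servers restrict INTO OUTFILE to a single directory):
SHOW VARIABLES LIKE 'secure_file_priv';
-- an empty value means no restriction, a path restricts output to that
-- directory, and NULL disables INTO OUTFILE entirely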
To import the table from the CSV:
mysql> CREATE TABLE `test_tbl` (
-> `col1` varchar(100) DEFAULT NULL,
-> `col2` varchar(100) DEFAULT NULL,
-> `amount` int DEFAULT NULL,
-> `created` datetime DEFAULT NULL
-> ) ENGINE=InnoDB DEFAULT CHARSET=latin1
-> ;
Query OK, 0 rows affected (0.44 sec)
mysql> load data local infile 'test.txt' into table test_tbl fields terminated by ',' ENCLOSED BY '"' lines terminated by '\r\n' (@col1, @col2, @col3, @col4)
-> set col1 = substr(@col1,8), col2 = substr(@col2,8), amount = substr(@col3,10), created = str_to_date(substr(@col4,11), '%d/%m/%Y');
Query OK, 2 rows affected (0.09 sec)
Records: 2 Deleted: 0 Skipped: 0 Warnings: 0
mysql> select * from test_tbl;
+-------+------+--------+---------------------+
| col1  | col2 | amount | created             |
+-------+------+--------+---------------------+
| unu   | doi  |     10 | 2015-01-05 00:00:00 |
| patru | trei |    400 | 2015-01-04 00:00:00 |
+-------+------+--------+---------------------+
2 rows in set (0.00 sec)
Maybe this could work.
Use CONCAT to build a string like this
SELECT
CONCAT('"col1"="',col1,'","col2"="',col2,'","amount"="',amount,'","created"="',created,'"') t
FROM test_tbl;
Then you can also dump it to a text file using INTO OUTFILE.
SELECT
CONCAT('"col1"="',col1,'","col2"="',col2,'","amount"="',amount,'","created"="',created,'"') t
FROM test_tbl
INTO OUTFILE 'C:/yourtextfile.txt'
CHARACTER SET latin1
FIELDS ENCLOSED BY ''
LINES TERMINATED BY '\r\n';
Since the CONCAT result is a single column, you don't need to enclose any fields; the values are already formatted inside the string. Only a line break is used to terminate each row.
Hope it works!
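One caveat that applies to both answers, added here as a general MySQL note: CONCAT() returns NULL as soon as any argument is NULL, so nullable columns are safer wrapped in IFNULL(). A sketch:
SELECT
CONCAT('"col1"="', IFNULL(col1, ''), '","col2"="', IFNULL(col2, ''), '"') t
FROM test_tbl;
-- without IFNULL, a single NULL column would turn the entire output line into NULL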

Set timestamp on insert when loading CSV [duplicate]

This question already has answers here:
How can i add date as auto update when import data from csv file?
(2 answers)
Closed 9 years ago.
I have a Timestamp field that is defined to be automatically updated with the CURRENT_TIMESTAMP value.
It works fine when I fire a query, but when I import a CSV (which I'm forced to do, since one of the fields is longtext), the update does not work.
I have tried to:
Give the timestamp column as the now() function in the CSV
Manually enter a timestamp like 2013-08-08 in the CSV
Neither approach works.
From what I gather after your question update, you're actually updating rows using a CSV and expecting the ON UPDATE clause to set the value of your timestamp field.
Sadly, when loading a CSV into a database you're not updating but inserting data, overwriting existing records. At least, that's the case when using a LOCAL INFILE: if the INFILE isn't local, the query produces an error; if it is a local file, these errors (duplicates) produce warnings and the operation continues.
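That duplicate handling can also be made explicit: LOAD DATA accepts a REPLACE or IGNORE keyword for rows that collide on a unique key. A minimal sketch, assuming tbl has a unique key (a general option, not something from the original question):
LOAD DATA LOCAL INFILE 'your.csv'
REPLACE INTO TABLE tbl -- overwrite colliding rows; use IGNORE to keep the old ones instead
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(field_name1, field_name2, field_name3);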
If this isn't the case for you, perhaps consider following one of the examples on the doc pages:
LOAD DATA INFILE 'your.csv'
INTO TABLE tbl
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(field_name1, field_name2, field_name3)
SET updated = NOW();
Just in case you can't, won't, or forget to add additional information: loading a CSV into a MySQL table is quite easy:
LOAD DATA
LOCAL INFILE '/path/to/file/filename1.csv'
INTO TABLE db.tbl
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(`field_name1`,`field_name2`,`field_name3`)
If you create a table along the lines of:
CREATE TABLE tbl(
id INT AUTO_INCREMENT PRIMARY KEY, -- since your previous question mentioned auto-increment
field_name1 VARCHAR(255) NOT NULL, -- normal fields
field_name2 INTEGER(11) NOT NULL,
field_name3 VARCHAR(255) NOT NULL DEFAULT '',
-- when not specified, this field will receive current_timestamp as value:
inserted TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
-- if row is updated, this field will hold the timestamp of update-time
updated TIMESTAMP NOT NULL DEFAULT 0
ON UPDATE CURRENT_TIMESTAMP
) ENGINE = INNODB
CHARACTER SET utf8 COLLATE utf8_general_ci;
This query is untested, so please be careful with it; it's just to give a general idea of what you need to do to get the insert timestamp in there.
This example table will work like so:
> INSERT INTO tbl (field_name1, field_name2) VALUES ('foobar', 123);
> SELECT * FROM tbl WHERE field_name1 = 'foobar' AND field_name2 = 123;
This will show:
+----+-------------+-------------+-------------+---------------------+---------------------+
| id | field_name1 | field_name2 | field_name3 | inserted            | updated             |
+----+-------------+-------------+-------------+---------------------+---------------------+
|  1 | foobar      |         123 |             | 2013-08-07 00:00:00 | 0000-00-00 00:00:00 |
+----+-------------+-------------+-------------+---------------------+---------------------+
As you can see, because we didn't explicitly insert a value into the last three fields, MySQL used their DEFAULT values. For field_name3, an empty string was used; for inserted, the default was CURRENT_TIMESTAMP; for updated, the default value was 0, which, because the field type is TIMESTAMP, is represented by the value 0000-00-00 00:00:00. If you were to run the following query next:
UPDATE tbl
SET field_name3 = 'an update'
WHERE field_name1 = 'foobar'
AND field_name2 = 123
AND id = 1;
The row would look like this:
+----+-------------+-------------+-------------+---------------------+---------------------+
| id | field_name1 | field_name2 | field_name3 | inserted            | updated             |
+----+-------------+-------------+-------------+---------------------+---------------------+
|  1 | foobar      |         123 | an update   | 2013-08-07 00:00:00 | 2013-08-07 00:00:20 |
+----+-------------+-------------+-------------+---------------------+---------------------+
That's all. Some basics can be found on mysqltutorial.org, but it's best to keep the official manual ready; it's not bad once you get used to it.
Perhaps this question might be worth a quick peek, too.

Load data from CSV inside bit field in mysql

What's the right syntax to insert a value into a column of type bit(1) in MySQL?
My column definition is:
payed bit(1) NOT NULL
I'm loading the data from a CSV where the data is saved as 0 or 1.
I've tried to do the insert using b'value' or 0bvalue (for example, b'1' or 0b1), as indicated in the manual.
But I keep getting this error:
Warning | 1264 | Out of range value for column 'payed' at row 1
What's the right way to insert a bit value?
I'm not doing the insert manually; I'm loading the data from a CSV (using LOAD DATA INFILE) in which the data for the column is 0 or 1.
This is my load query. I've renamed the fields for privacy reasons; there's no error in that definition:
load data local infile 'input_data.csv' into table `table`
fields terminated by ',' lines terminated by '\n'
(id, year, field1, @date2, @date1, field2, field3, field4, field5, field6, payed, field8, field9, field10, field11, project_id)
set
date1 = str_to_date(@date1, '%a %b %d %H:%i:%s UTC %Y'),
date2 = str_to_date(@date2, '%a %b %d %H:%i:%s UTC %Y');
show warnings;
This is an example row of my CSV:
200014,2013,0.0,Wed Feb 09 00:00:00 UTC 2014,Thu Feb 28 00:00:00 UTC 2013,2500.0,21,Business,0,,0,40.0,0,PROSPECT,1,200013
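The date format string can be tested in isolation against that sample row before running the full load; a quick check, assuming the %a %b %d %H:%i:%s UTC %Y format used above:
SELECT STR_TO_DATE('Wed Feb 09 00:00:00 UTC 2014', '%a %b %d %H:%i:%s UTC %Y');
-- expected: 2014-02-09 00:00:00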
Update:
I didn't find a solution with the bit column, so I changed its data type from bit to tinyint to make it work.
I've finally found the solution, and I'm posting it here for future reference. I found help in the MySQL LOAD DATA manual page.
So for test purpose my table structure is:
+--------+-------------+------+-----+---------+-------+
| Field  | Type        | Null | Key | Default | Extra |
+--------+-------------+------+-----+---------+-------+
| id     | int(11)     | NO   | PRI | NULL    |       |
| nome   | varchar(45) | YES  |     | NULL    |       |
| valore | bit(1)      | YES  |     | NULL    |       |
+--------+-------------+------+-----+---------+-------+
My csv test file is:
1,primo_valore,1
2,secondo_valore,0
3,terzo_valore,1
The query to load the CSV into the table is:
load data infile 'test.csv' into table test
fields terminated by ',' lines terminated by '\n'
(id, nome, @valore) set
valore = cast(@valore as signed);
show warnings;
As you can see, to load the CSV you need to do a cast, cast(@valore as signed), and in your CSV you can use the integer notation 1 or 0 to indicate the bit value. This is because BIT values cannot be loaded using binary notation (for example, b'011010').
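Since the client prints BIT values as raw bytes, a readable verification query helps after the load; a small check against the three test rows above:
SELECT id, nome, valore + 0 AS valore_num, BIN(valore) AS valore_bin
FROM test;
-- valore + 0 coerces the bit to an integer; BIN() renders it as a binary string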
Replace the "0" values in the csv by no value at all. That worked for me.
You can use the BIN() function like this:
INSERT INTO `table` (`column`) VALUES (BIN(1)), (BIN(0));
Let me guess: I think you should ignore the first line of your CSV file in the LOAD query.
See "IGNORE number LINES"

Loading data from text file to mysql database by eliminating duplicates

I want to load the data from a text file into a database; if the data already exists, I need to skip it while loading.
I am using the query below to load the data from the text file into a MySQL database.
"Load data infile 'F:/wbrdata.txt' into table wbrdatatable
fields terminated by ','
optionally enclosed by ""
lines terminated by '\r\n'
Ignore 1 lines (channel, time, pulserate, dwellid, targetid);"
It is appending the data to the existing table data. I want to skip the data that already exists in the table (duplicates) while loading the file into the database.
How can I achieve this?
Thank you
Regards
Sankar
Try loading the text file into a temporary table (with the same structure as the target table), then remove the duplicates from the temporary table and copy the rest into the target table.
Example (suppose that wbrdatatable_temp is a temporary table holding all the data from the text file):
CREATE TABLE wbrdatatable(
id INT(11) NOT NULL AUTO_INCREMENT,
column1 VARCHAR(255) DEFAULT NULL,
PRIMARY KEY (id)
);
INSERT INTO wbrdatatable VALUES
(1, '111'),
(2, '222'),
(3, '333'),
(4, '444'),
(5, '555');
CREATE TABLE wbrdatatable_temp(
id INT(11) NOT NULL AUTO_INCREMENT,
column1 VARCHAR(255) DEFAULT NULL,
PRIMARY KEY (id)
);
INSERT INTO wbrdatatable_temp VALUES
(1, '111'),
(2, '222'),
(10, '100'), -- new record that should be added
(11, '200'); -- new record that should be added
-- Copy only new records!
INSERT INTO wbrdatatable
SELECT t1.* FROM wbrdatatable_temp t1
LEFT JOIN wbrdatatable t2
ON t1.id = t2.id AND t1.column1 = t2.column1
WHERE t2.id IS NULL;
-- Test result
SELECT * FROM wbrdatatable;
+----+---------+
| id | column1 |
+----+---------+
|  1 | 111     |
|  2 | 222     |
|  3 | 333     |
|  4 | 444     |
|  5 | 555     |
| 10 | 100     | -- only new record is added
| 11 | 200     | -- only new record is added
+----+---------+
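If the target table has, or can be given, a unique key that identifies a row, the staging table can be skipped entirely: LOAD DATA itself accepts an IGNORE keyword that silently drops rows colliding with existing keys. A sketch, where the choice of key columns is an assumption about the data:
ALTER TABLE wbrdatatable
ADD UNIQUE KEY uq_wbr (channel, time, dwellid, targetid); -- hypothetical key choice

LOAD DATA INFILE 'F:/wbrdata.txt'
IGNORE INTO TABLE wbrdatatable
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(channel, time, pulserate, dwellid, targetid);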
Try this logic (a set-based version is sketched below):
1. Upload the text file data.
2. Check each record with a SELECT statement against your database:
if (recordExists == true)
    skip
else
    save
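That per-record check can be expressed as one set-based statement instead of a row-by-row loop; a sketch, assuming a staging table wbrdatatable_temp loaded with the question's columns (as in the first answer):
INSERT INTO wbrdatatable (channel, time, pulserate, dwellid, targetid)
SELECT t.channel, t.time, t.pulserate, t.dwellid, t.targetid
FROM wbrdatatable_temp t
WHERE NOT EXISTS (
    SELECT 1 FROM wbrdatatable w
    WHERE w.channel = t.channel
      AND w.time = t.time -- assumed identity columns; widen as needed
);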
Regards