Importing a CSV file with a custom delimiter in MySQL

I'm using MySQL from my terminal as well as from cmd.
I have a table:
mysql> create table stu (sno int, name char(10));
Here is the CSV file (b.csv):
sno, name
1,a
2,b
3,"c,d"
I imported that into MySQL:
mysql> load data local infile 'd://b.csv' into table stu fields terminated by ',' enclosed by '"' lines terminated by '\n' ignore 1 rows;
Query OK, 3 rows affected (0.05 sec)
Records: 3 Deleted: 0 Skipped: 0 Warnings: 0
mysql> select * from stu;
+------+------+
| sno | name |
+------+------+
| 1 | a |
| 2 | b |
| 3 | c,d |
+------+------+
3 rows in set (0.00 sec)
When I change the default delimiter alone :
I have another file with a custom delimiter and a custom quotechar (a.csv):
*rno*-*name*
*1*-*ron*
*3*-*vince*
*5*-*Abi-nav*
When I import this into MySQL:
mysql> load data local infile 'd://a.csv' into table stu1 fields terminated by "-"
enclosed by "*" lines terminated by '\n' ignore 1 rows;
Query OK, 2 rows affected, 1 warning (0.05 sec)
Records: 2 Deleted: 0 Skipped: 0 Warnings: 1
mysql> show warnings;
+---------+------+---------------------------------------------------------------------------+
| Level | Code | Message |
+---------+------+---------------------------------------------------------------------------+
| Warning | 1262 | Row 1 was truncated; it contained more data than there were input columns |
+---------+------+---------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> select * from stu1;
+------+------------+
| sno | name |
+------+------------+
| 1 | ron*
*3 |
| 5 | *Abi-nav*
+------+------------+
2 rows in set (0.00 sec)
I get this problem only when I change the default quotechar from " to another character, on both Ubuntu and Windows.
Why do I get this warning and this output? How can I solve this?
Thanks in advance.
EDIT :
I tried what was suggested in the answer:
mysql> select version();
+-------------------------+
| version() |
+-------------------------+
| 5.7.31-0ubuntu0.16.04.1 |
+-------------------------+
1 row in set (0.04 sec)
mysql> load data local infile '/home/data.csv' into table stu fields terminated by "-" enclosed by '*' lines terminated by '\n' ignore 1 rows;
Query OK, 3 rows affected (0.19 sec)
Records: 3 Deleted: 0 Skipped: 0 Warnings: 0
mysql> select * from stu;
+------+---------+
| sno | name |
+------+---------+
| 1 | ron |
| 3 | vince |
| 5 | Abi-nav |
+------+---------+
3 rows in set (0.00 sec)
Then here you go :
mysql> load data local infile '/home/a.csv' into table stu fields terminated by "-" enclosed by '|' lines terminated by '\n' ignore 1 rows;
Query OK, 2 rows affected, 1 warning (0.09 sec)
Records: 2 Deleted: 0 Skipped: 0 Warnings: 1
mysql> show warnings;
+---------+------+---------------------------------------------------------------------------+
| Level | Code | Message |
+---------+------+---------------------------------------------------------------------------+
| Warning | 1262 | Row 1 was truncated; it contained more data than there were input columns |
+---------+------+---------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> select * from stu;
+------+------------+
| sno | name |
+------+------------+
| 1 | ron|
|3 |
| 5 | |Abinav|
|
+------+------------+
2 rows in set (0.00 sec)
Again I got the warning. Why does that happen?

*rno*-*name*
*1*-*ron*
*3*-*vince*
*5*-*Abi-nav*
mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.20 |
+-----------+
1 row in set (0.00 sec)
mysql> create table stu1 (sno int, name char(10));
Query OK, 0 rows affected (1.23 sec)
mysql> load data local infile '/data/data.csv' into table stu1 fields terminated by "-" enclosed by "*" lines terminated by '\n' ignore 1 rows;
Query OK, 3 rows affected (0.07 sec)
Records: 3 Deleted: 0 Skipped: 0 Warnings: 0
mysql> select * from stu1;
+------+---------+
| sno | name |
+------+---------+
| 1 | ron |
| 3 | vince |
| 5 | Abi-nav |
+------+---------+
3 rows in set (0.00 sec)
It works in MySQL version 8.0.20.

Under Windows I needed to change the line termination so that it runs (MySQL 8.0.22):
Load data infile 'C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/a.csv'
into table stu
fields terminated by "-"
enclosed by "*"
lines terminated by '\r\n'
ignore 1 LINES;
The main problem I see is that we can't see your CSV file; furthermore, you seem to switch between Windows and Linux.
As my example shows, the CSV was produced on Windows and so had \r\n as line termination, and this is what the MySQL interpreter had problems with when I used \n alone.
It had nothing to do with field termination or enclosure.

Your problem may be due to line termination (\r\n on Windows, \n on Unix/Linux).
You can open the file in a program like Notepad++ and set the line termination to \n.
Then delete all line breaks and recreate them.
If the load statement then works, that was the issue.
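To check which line termination a file actually uses before choosing LINES TERMINATED BY, something like the following Python sketch may help. The function names and the sample bytes are made up for illustration (the sample mimics a.csv as it might look when saved on Windows); this is only one way to do it.

```python
def detect_line_ending(data: bytes) -> str:
    """Return 'CRLF', 'LF', 'mixed', or 'none' for the given file contents."""
    crlf = data.count(b"\r\n")
    lf_only = data.count(b"\n") - crlf  # newlines not preceded by \r
    if crlf and lf_only:
        return "mixed"
    if crlf:
        return "CRLF"
    if lf_only:
        return "LF"
    return "none"


def normalize_to_lf(data: bytes) -> bytes:
    """Convert Windows (CRLF) line endings to Unix (LF)."""
    return data.replace(b"\r\n", b"\n")


# Hypothetical sample: a.csv as it might look when written on Windows.
sample = b"*rno*-*name*\r\n*1*-*ron*\r\n*3*-*vince*\r\n*5*-*Abi-nav*\r\n"

print(detect_line_ending(sample))                   # CRLF
print(detect_line_ending(normalize_to_lf(sample)))  # LF
```

If the result is CRLF, either load with lines terminated by '\r\n' or normalize the file first. For a real file, read it in binary mode (open(path, "rb")) so Python does not translate the line endings.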

Related

Why can data inside one column not be seen with a SELECT command in the terminal, when it can be seen in the server web view?

Data inside the column "ward_code" cannot be seen with a SELECT command in the terminal, while it can be seen in the web server view.
The data was imported into the database with the command below:
MariaDB [samaritan]> LOAD DATA LOCAL INFILE "C:////kino.txt"
-> INTO TABLE ward
-> COLUMNS TERMINATED BY "\t";
Query OK, 21 rows affected, 6 warnings (0.107 sec)
Records: 21 Deleted: 0 Skipped: 0 Warnings: 6
MariaDB [samaritan]>
I am just curious why the data inside ward_code is not visible in the terminal. Sorry for my English, and thanks in advance.
After adding SHOW WARNINGS; immediately after LOAD DATA, I got the following:
MariaDB [samaritan]> LOAD DATA LOCAL INFILE "C:///kino.txt"
-> INTO TABLE ward
-> COLUMNS TERMINATED BY "\t";
Query OK, 21 rows affected, 6 warnings (0.083 sec)
Records: 21 Deleted: 0 Skipped: 0 Warnings: 6
MariaDB [samaritan]> show warnings;
+---------+------+-------------------------------------------------+
| Level | Code | Message |
+---------+------+-------------------------------------------------+
| Warning | 1265 | Data truncated for column 'ward_name' at row 7 |
| Warning | 1265 | Data truncated for column 'ward_name' at row 17 |
| Warning | 1261 | Row 21 doesn't contain data for all columns |
| Warning | 1261 | Row 21 doesn't contain data for all columns |
| Warning | 1261 | Row 21 doesn't contain data for all columns |
| Warning | 1261 | Row 21 doesn't contain data for all columns |
+---------+------+-------------------------------------------------+
6 rows in set (0.000 sec)
MariaDB [samaritan]>

What syntax should I use to update a SET column in MySQL?

I created a column called oilcompany that has SET data (Hunt, Pioneer, Chevron, BP).
I can enter any one of those into the oilcompany column and change from one to another, but I cannot figure out how to change from one oilcompany to multiple oilcompany values (e.g. Hunt and BP). Any suggestions?
In the MySQL documentation there are no examples for UPDATE statements, but I normally use two ways to update this kind of column:
Using text values
Using numeric values
Creating the test environment
mysql> CREATE TABLE tmp_table(
-> id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
-> oilcompany SET('Hunt', 'Pioneer', 'Chevron', 'BP')
-> );
Query OK, 0 rows affected (0.54 sec)
mysql> INSERT INTO tmp_table(oilcompany) VALUES ('Hunt'), ('Pioneer');
Query OK, 2 rows affected (0.11 sec)
Records: 2 Duplicates: 0 Warnings: 0
mysql> SELECT * FROM tmp_table;
+----+------------+
| id | oilcompany |
+----+------------+
| 1 | Hunt |
| 2 | Pioneer |
+----+------------+
2 rows in set (0.00 sec)
Alternative#1: Using Text Values
As a SET is a collection of ENUM elements, and any ENUM element can be treated as a string, we can do things like:
mysql> UPDATE tmp_table
-> SET oilcompany = 'Hunt,BP'
-> WHERE id = 1;
Query OK, 1 row affected (0.07 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> SELECT * FROM tmp_table;
+----+------------+
| id | oilcompany |
+----+------------+
| 1 | Hunt,BP |
| 2 | Pioneer |
+----+------------+
2 rows in set (0.00 sec)
Alternative#2: Using Numeric Values
A SET value is stored internally as a number of up to 64 bits, in which each bit represents one SET element.
In our table: 'Hunt'=1, 'Pioneer'=2, 'Chevron'=4, 'BP'=8.
MySQL also allows these numbers to be used instead of text values. To see the numeric value in a SELECT, use the SET column inside a numeric expression (e.g. adding zero).
Let's see the current values:
mysql> SELECT id, oilcompany+0, oilcompany FROM tmp_table;
+----+--------------+------------+
| id | oilcompany+0 | oilcompany |
+----+--------------+------------+
| 1 | 9 | Hunt,BP |
| 2 | 2 | Pioneer |
+----+--------------+------------+
2 rows in set (0.00 sec)
Here 9 = 'Hunt' (1) + 'BP' (8) and 2 = 'Pioneer' (2).
Now, let's change the 'Pioneer' row to 'Hunt' (1) + 'Chevron' (4):
mysql> UPDATE tmp_table
-> SET oilcompany = 5
-> WHERE id = 2;
Query OK, 1 row affected (0.08 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> SELECT id, oilcompany+0, oilcompany FROM tmp_table;
+----+--------------+--------------+
| id | oilcompany+0 | oilcompany |
+----+--------------+--------------+
| 1 | 9 | Hunt,BP |
| 2 | 5 | Hunt,Chevron |
+----+--------------+--------------+
2 rows in set (0.00 sec)
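The arithmetic behind the numeric values can be sketched in a few lines of Python. This is only an illustration of the bit mapping described above: the MEMBERS list mirrors the order in the column definition, and the helper names are made up.

```python
# Members in the order they appear in the column definition of tmp_table.
MEMBERS = ["Hunt", "Pioneer", "Chevron", "BP"]


def set_to_number(names):
    """Combine member names into the number that oilcompany+0 would show."""
    value = 0
    for name in names:
        value |= 1 << MEMBERS.index(name)  # first member = bit 0, second = bit 1, ...
    return value


def number_to_set(value):
    """Turn a numeric SET value back into the comma-separated text form."""
    return ",".join(name for i, name in enumerate(MEMBERS) if value & (1 << i))


print(set_to_number(["Hunt", "BP"]))  # 9
print(number_to_set(5))               # Hunt,Chevron
```

The text form always lists members in definition order, which is why 5 comes back as Hunt,Chevron regardless of the order used when it was assigned.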

Import Only Second Column of CSV into MySQL

I want to import the second column of a CSV file into MySQL. Here's the CSV file:
Name,Source,Follows
John,Youtube,Y
Kat,FB,N
Jacob,Twitter,N
Here's the code I have so far:
DROP TABLE temp;
CREATE TABLE temp
(ID INT AUTO_INCREMENT primary key,
sn VARCHAR(50)
);
DESCRIBE temp;
LOAD DATA LOCAL INFILE '/Users/...temp.csv' INTO TABLE Person
FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
IGNORE 1 LINES
(@col2) set sn=@col2;
SELECT * FROM temp;
However, I get an empty set.
Try:
File: /path/to/temp.csv:
Name,Source,Follows
John,Youtube,Y
Kat,FB,N
Jacob,Twitter,N
MySQL Command-Line:
mysql> DROP TABLE IF EXISTS `temp`;
Query OK, 0 rows affected (0.00 sec)
mysql> CREATE TABLE IF NOT EXISTS `temp` (
-> `ID` INT AUTO_INCREMENT PRIMARY KEY,
-> `sn` VARCHAR(50)
-> );
Query OK, 0 rows affected (0.00 sec)
mysql> LOAD DATA LOCAL INFILE '/path/to/temp.csv'
-> INTO TABLE `temp`
-> FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
-> IGNORE 1 LINES
-> (@`null`, `sn`, @`null`);
Query OK, 3 rows affected (0.00 sec)
Records: 3 Deleted: 0 Skipped: 0 Warnings: 0
mysql> SELECT `ID`, `sn`
-> FROM `temp`;
+----+---------+
| ID | sn |
+----+---------+
| 1 | Youtube |
| 2 | FB |
| 3 | Twitter |
+----+---------+
3 rows in set (0.00 sec)
UPDATE
mysql> LOAD DATA LOCAL INFILE '/tmp/temp.csv'
-> INTO TABLE `temp`
-> FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
-> IGNORE 1 LINES
-> (@`null`, `sn`);
Query OK, 3 rows affected, 3 warnings (0.00 sec)
Records: 3 Deleted: 0 Skipped: 0 Warnings: 3
mysql> SHOW WARNINGS;
+---------+------+---------------------------------------------------------------------------+
| Level | Code | Message |
+---------+------+---------------------------------------------------------------------------+
| Warning | 1262 | Row 1 was truncated; it contained more data than there were input columns |
| Warning | 1262 | Row 2 was truncated; it contained more data than there were input columns |
| Warning | 1262 | Row 3 was truncated; it contained more data than there were input columns |
+---------+------+---------------------------------------------------------------------------+
3 rows in set (0.00 sec)
mysql> SELECT `ID`, `sn`
-> FROM `temp`;
+----+---------+
| ID | sn |
+----+---------+
| 1 | Youtube |
| 2 | FB |
| 3 | Twitter |
+----+---------+
3 rows in set (0.00 sec)
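If you would rather cut the file down before loading it, the same "keep only the second field" idea can be sketched in Python. This is an alternative to the user-variable approach above, not part of it; the function name is made up and the sample text is the CSV from the question.

```python
import csv
import io

# The CSV from the question, inlined so the sketch is self-contained.
CSV_TEXT = "Name,Source,Follows\nJohn,Youtube,Y\nKat,FB,N\nJacob,Twitter,N\n"


def second_column(text):
    """Return the second field of every data row, skipping the header."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row, like IGNORE 1 LINES
    return [row[1] for row in reader if len(row) > 1]


print(second_column(CSV_TEXT))  # ['Youtube', 'FB', 'Twitter']
```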

How to remove certain strings from rows in MySQL

I am attempting to clean up a messy table containing unnecessary words.
The example below shows the typical content:
row1 |
-------------
text <12> |
more [dada] |
(123) foo |
la {55w} da |
Basically, what I define as unnecessary content is every word starting and ending with a particular pair of symbols: <...>, [...], {...} and (...). Usually I would use the REPLACE function, but since the data inside the symbols is arbitrary, that is not really possible.
Is it possible to use some kind of regex in the REPLACE function?
UPDATE
Please note that the content wrapped inside the symbols can be any letters and numbers; it is basically unpredictable.
OK, I see now!
Use REPLACE like this; see the example (it will clean everything inside '()'):
mysql> CREATE TABLE tbl (
-> txt VARCHAR(255)
-> );
Query OK, 0 rows affected (0.50 sec)
mysql> INSERT INTO tbl VALUES
-> ('sometext (asdebtrw)'),
-> ('some other text ( sd sdasddebtrw)'),
-> ('somesdaftext ( (sd)( ))ebt rw)()'),
-> ('sometext1'),
-> ('sometext2'),
-> ('sometext1 (replacethistext) anothertext1'),
-> ('s'),
-> ('w(sdf) rr')
-> ;
Query OK, 8 rows affected (0.00 sec)
Records: 8 Duplicates: 0 Warnings: 0
mysql> select * from tbl;
+------------------------------------------+
| txt |
+------------------------------------------+
| sometext (asdebtrw) |
| some other text ( sd sdasddebtrw) |
| somesdaftext ( (sd)( ))ebt rw)() |
| sometext1 |
| sometext2 |
| sometext1 (replacethistext) anothertext1 |
| s |
| w(sdf) rr |
+------------------------------------------+
8 rows in set (0.00 sec)
mysql> UPDATE tbl
-> SET txt = REPLACE(txt, SUBSTRING(txt, LOCATE('(', txt), LENGTH(txt) - LOCATE(')', REVERSE(txt)) - LOCATE('(', txt) + 2), '')
-> WHERE txt LIKE '%(%)%';
Query OK, 5 rows affected (0.20 sec)
Rows matched: 5 Changed: 5 Warnings: 0
mysql> select * from tbl;
+-------------------------+
| txt |
+-------------------------+
| sometext |
| some other text |
| somesdaftext |
| sometext1 |
| sometext2 |
| sometext1 anothertext1 |
| s |
| w rr |
+-------------------------+
8 rows in set (0.22 sec)
REGEXP_REPLACE is your friend here:
SELECT REGEXP_REPLACE('ab12cd','[0-9]','') AS remove_digits;
-> abcd
Note that REGEXP_REPLACE is available in MariaDB 10.0.5 and later, and in MySQL only from 8.0.4 onward.
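For the four bracket types in the question, the pattern itself can be tried out in Python first. This is a sketch of the regular expression only, run with Python's re module; whether the identical pattern string works unchanged in your server's REGEXP_REPLACE is an assumption you would need to verify. The function name is made up, and nested brackets are not handled.

```python
import re

# One alternative per bracket type; each matches an opening symbol, any
# characters other than that bracket pair, and the closing symbol.
BRACKETED = re.compile(r"<[^<>]*>|\[[^\[\]]*\]|\{[^{}]*\}|\([^()]*\)")


def strip_bracketed(text: str) -> str:
    """Remove <...>, [...], {...} and (...) segments, then tidy whitespace."""
    cleaned = BRACKETED.sub("", text)
    return " ".join(cleaned.split())


# The sample rows from the question.
rows = ["text <12>", "more [dada]", "(123) foo", "la {55w} da"]
for row in rows:
    print(repr(strip_bracketed(row)))
# 'text'
# 'more'
# 'foo'
# 'la da'
```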

MySQL Select Into Outfile Without Quotes

Is it possible to SELECT ... INTO OUTFILE without enclosing fields in any character, and if so, how?
So far this doesn't work:
SELECT hour_stamp,
day_stamp,
month_stamp,
hour,
day,
month,
year,
quarter,
day_of_week,
week_of_year,
SUBSTR(hour_text,1,24),
SUBSTR(day_text,1,24)
FROM date_dim
INTO OUTFILE '/media/ssd0/temp/dates.tsv'
FIELDS TERMINATED BY '\t'
ENCLOSED BY '';
I'm not sure if the engine matters in this case, but it may be important to note that I am using InfoBright on a Linux machine.
The output is as follows:
1293840000000 1293840000000 1293840000000 0 1 1 2011 1 5 52 "2011-01-01T00:00:00" "2011-01-01T00:00:00"
1293843600000 1293840000000 1293840000000 1 1 1 2011 1 5 52 "2011-01-01T01:00:00" "2011-01-01T00:00:00"
1293847200000 1293840000000 1293840000000 2 1 1 2011 1 5 52 "2011-01-01T02:00:00" "2011-01-01T00:00:00"
1293850800000 1293840000000 1293840000000 3 1 1 2011 1 5 52 "2011-01-01T03:00:00" "2011-01-01T00:00:00"
Adding OPTIONALLY ENCLOSED BY '' might have the desired effect.
Try it without anything:
SELECT hour_stamp,
day_stamp,
month_stamp,
hour,
day,
month,
year,
quarter,
day_of_week,
week_of_year,
SUBSTR(hour_text,1,24),
SUBSTR(day_text,1,24)
FROM date_dim
INTO OUTFILE '/media/ssd0/temp/dates.tsv';
Here is a sample
mysql> desc veto.prova;
+------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| provaemail | varchar(255) | NO | | NULL | |
+------------+--------------+------+-----+---------+----------------+
2 rows in set (0.05 sec)
mysql> select * from veto.prova;
+----+--------------------------+
| id | provaemail |
+----+--------------------------+
| 1 | redwards@logicworks.net |
| 2 | rolandoedwards@yahoo.com |
+----+--------------------------+
2 rows in set (0.00 sec)
mysql> select id,provaemail from prova into outfile 'C:/lwdba/prova.txt';
Query OK, 2 rows affected (0.01 sec)
mysql>
What does it look like on disk?
C:\>cd lwdba
C:\LWDBA>type prova.txt
1 redwards@logicworks.net
2 rolandoedwards@yahoo.com
C:\LWDBA>
I tried something weird: I terminated the fields with \0.
mysql> select id,provaemail,substr(provaemail,1,5) from prova into outfile 'C:/lwdba/prova9.txt' fields terminated by '\0';
Query OK, 2 rows affected, 1 warning (0.00 sec)
mysql> show warnings;
+---------+------+------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+---------+------+------------------------------------------------------------------------------------------------------------------------+
| Warning | 1475 | First character of the FIELDS TERMINATED string is ambiguous; please use non-optional and non-empty FIELDS ENCLOSED BY |
+---------+------+------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> select id,provaemail,substr(provaemail,1,5) from prova into outfile 'C:/lwdba/prova8.txt' fields enclosed by '\0';
Query OK, 2 rows affected (0.00 sec)
mysql>
The files look like this:
C:\LWDBA>type prova9.txt
1 redwards@logicworks.net redwa
2 rolandoedwards@yahoo.com rolan
C:\LWDBA>type prova8.txt
1 redwards@logicworks.net redwa
2 rolandoedwards@yahoo.com rolan
C:\LWDBA>
I used \0 because it is a null character.
The double quote anomaly you are seeing is probably due to the InfoBright Storage Engine and how it renders character output of function calls.
Here is a weird suggestion, but I don't know if it will work...
If you make a subquery, the data is always stored in a MySQL temp table. Alter the query:
mysql> select * from (select id,provaemail,substr(provaemail,1,5) as stuff from prova) A
-> into outfile 'C:/lwdba/prova444.txt' fields terminated by '\0' enclosed by '\0';
Query OK, 2 rows affected (0.00 sec)
mysql>
In your case, that would be:
SELECT * FROM (
SELECT hour_stamp,
day_stamp,
month_stamp,
hour,
day,
month,
year,
quarter,
day_of_week,
week_of_year,
SUBSTR(hour_text,1,24) ht,
SUBSTR(day_text,1,24) dt
FROM date_dim) A
INTO OUTFILE '/media/ssd0/temp/dates.tsv';
See if that does something different.
Did you try with "Null"?
SELECT hour_stamp,
day_stamp,
month_stamp,
hour,
day,
month,
year,
quarter,
day_of_week,
week_of_year,
SUBSTR(hour_text,1,24),
SUBSTR(day_text,1,24)
FROM date_dim
INTO OUTFILE '/media/ssd0/temp/dates.tsv'
FIELDS TERMINATED BY '\t'
ENCLOSED BY 'NULL';
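If none of the clause variations gets rid of the quotes, one fallback is to strip them from the exported file afterwards. The Python sketch below assumes a tab-separated file in which only " is used for enclosure and no field contains a tab; the function name is made up and the sample row is a shortened version of the output shown in the question.

```python
import csv
import io

# Shortened, illustrative version of one exported row from the question.
quoted = '1293840000000\t0\t"2011-01-01T00:00:00"\t"2011-01-01T00:00:00"\n'


def strip_quotes(tsv_text):
    """Re-write tab-separated text without any enclosing quote characters."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quotechar='"')
    out = io.StringIO()
    # QUOTE_NONE never adds quotes; the writer raises csv.Error if a field
    # contains a tab, because no escape character is configured here.
    writer = csv.writer(out, delimiter="\t", quoting=csv.QUOTE_NONE,
                        lineterminator="\n")
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


print(strip_quotes(quoted), end="")
```

For a real export you would read from and write to files opened with newline="" instead of the in-memory strings used here.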