MySQL fields terminated by tab - mysql

I am trying to upload a tab-delimited file with MySQL. I want a query something like this: LOAD DATA LOCAL INFILE 'file' INTO TABLE tbl FIELDS TERMINATED BY 'TAB' Is there something I can substitute for TAB to make this work?

Have you tried '\t'? The escape sequence backslash + "t" is interpreted as a tab. I haven't tried it myself, but it might be what you need.

Just tried to find the answer to this question myself to save re-saving my file with commas separating instead of tabs...
From an old MySQL reference manual, a long way down the page, you can find that TAB is the default separator for files loaded using LOAD DATA on MySQL.
See: http://dev.mysql.com/doc/refman/4.1/en/load-data.html
I just loaded a CSV file in this way into MySQL 5.1.
BW

fields terminated by '\t'
Try this one
Note :
Field and Line Handling
For both the LOAD DATA and SELECT ... INTO OUTFILE statements, the syntax of the FIELDS and LINES clauses is the same. Both clauses are optional, but FIELDS must precede LINES if both are specified.
If you specify a FIELDS clause, each of its subclauses (TERMINATED BY, [OPTIONALLY] ENCLOSED BY, and ESCAPED BY) is also optional, except that you must specify at least one of them. Arguments to these clauses are permitted to contain only ASCII characters.
If you specify no FIELDS or LINES clause, the defaults are the same as if you had written this:
FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\'
LINES TERMINATED BY '\n' STARTING BY ''
Backslash is the MySQL escape character within strings in SQL statements. Thus, to specify a literal backslash, you must specify two backslashes for the value to be interpreted as a single backslash. The escape sequences '\t' and '\n' specify tab and newline characters, respectively.
In other words, the defaults cause LOAD DATA to act as follows when reading input:
Look for line boundaries at newlines.
Do not skip any line prefix.
Break lines into fields at tabs.
Do not expect fields to be enclosed within any quoting characters.
Interpret characters preceded by the escape character \ as escape sequences. For example, \t, \n, and \\ signify tab, newline, and backslash, respectively. See the discussion of FIELDS ESCAPED BY later for the full list of escape sequences.
Conversely, the defaults cause SELECT ... INTO OUTFILE to act as follows when writing output:
Write tabs between fields.
Do not enclose fields within any quoting characters.
Use \ to escape instances of tab, newline, or \ that occur within field values.
Write newlines at the ends of lines.
see: https://dev.mysql.com/doc/refman/8.0/en/load-data.html
for more details.
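As a rough illustration of the defaults quoted above (fields split at tabs, backslash as the escape character), here is a minimal Python sketch. It is not MySQL's parser: the function name `split_default_line` is made up for this example, and only the \t, \n and \\ sequences are handled; \N (NULL) and the other sequences from the manual are not.

```python
# Simplified emulation of LOAD DATA's default field handling.
# Assumption: only \t, \n and \\ are treated as escape sequences here;
# any other escaped character is kept as the character itself.
ESCAPES = {"t": "\t", "n": "\n", "\\": "\\"}


def split_default_line(line):
    """Split one input line (without its trailing newline) into fields."""
    fields, current = [], []
    chars = iter(line)
    for ch in chars:
        if ch == "\\":
            # Consume the character after the backslash as an escape.
            nxt = next(chars, "")
            current.append(ESCAPES.get(nxt, nxt))
        elif ch == "\t":
            # An unescaped tab ends the current field.
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


# The middle field contains an escaped tab (backslash followed by "t").
print(split_default_line("a\tb\\tc\td"))
```

The point of the sketch is only that a literal tab separates fields, while backslash + "t" inside a field is read back as a tab within the value.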

Related

MySQL bulk load

I'm trying to load CSV files into a MySQL table.
Delimiter: , (comma)
In the source data, a few of the field values are enclosed in double quotes, and inside the double quotes there are commas.
There are a few records in which / is part of the field data and needs to be escaped.
By default / is getting escaped, and when I specified " as the escape character, " is getting escaped. As we have multiple special characters inside the same file, we need to escape multiple special characters.
Any suggestions?
Eg:
id name location
1 A "Location , name here"
2 B "Different Location"
3 C Another Location
4 D Location / with escape character
LOAD DATA LOCAL INFILE 'data.csv' INTO TABLE table_name FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n' IGNORE 1 LINES;
I don't think it's possible. According to the LOAD DATA reference:
Any of the field- or line-handling options can specify an empty string (''). If not empty, the FIELDS [OPTIONALLY] ENCLOSED BY and FIELDS ESCAPED BY values must be a single character.
Only a single character is supported for ESCAPED BY.
My suggestion is to use a programming language (e.g. PHP, C#) to open the file and preprocess it line by line, for example with regular expressions, before loading it.
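One way such a preprocessing step could look, sketched in Python with the standard csv module instead of hand-written regular expressions. The function name `normalize_csv` and the sample rows (a comma-separated guess at the rows from the question) are assumptions for illustration. The idea is to parse the mixed quoted/unquoted input once and write it back with every field quoted, so the load statement only has to deal with one convention.

```python
import csv
import io


def normalize_csv(src, dst):
    """Re-emit comma-separated input with every field quoted.

    src and dst are text file objects (open real files with newline="").
    """
    reader = csv.reader(src)  # understands fields like "Location , name here"
    writer = csv.writer(dst, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for row in reader:
        writer.writerow(row)


# Hypothetical sample data modelled on the question's example rows.
sample = io.StringIO(
    '1,A,"Location , name here"\n'
    "4,D,Location / with escape character\n"
)
out = io.StringIO()
normalize_csv(sample, out)
print(out.getvalue())
```

With uniformly quoted output, FIELDS TERMINATED BY ',' ENCLOSED BY '"' should be enough on the MySQL side, though that part is untested here.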

MySQL Load data infile -- double quotes in a double quoted value as "a "double" quoted value"

I have a csv file with millions of rows. Here is the command I am using to load data
load data local infile 'myfile' into table test.mytable
fields terminated by ',' optionally enclosed by '"'
lines terminated by '\n' ignore 1 lines
This handles almost everything except some lines where there are double quotes inside a double-quoted string, as in:
"first column",second column,"third column has "double quotes" inside", fourth column
It truncates the third column and gives me a warning that the row does not contain data for all columns.
Appreciate your help
The CSV is broken. There is no way MySQL or any other program can import it reliably. Double quotes need to be escaped if they appear inside a column.
You might fix the CSV with a script. If a quote doesn't have a comma in front of or behind it, it's probably part of the text and should be escaped.
The following regular expression uses negative lookbehind and lookahead to find quotes that don't have a comma, or the start or end of the line, directly in front of or behind them.
/(?<!^)(?<!,)(\s*)"(\s*)(?!,)(?!$)/
See it on regex101
On the command line you can run
perl -pe 's/(?<!,)(?<!^)(\s*)"(\s*)(?!,)(?!$)/\1\\"\2/g' data.csv > data-fixed.csv
Note that this method isn't foolproof. If there is a double quote that does have a comma next to it but is part of the text, there is little you can do to fix the CSV. In that case, the script simply has no way of knowing whether it's a column delimiter or not.
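For readers who prefer Python over Perl, the same idea can be sketched as follows. This is a simplified variant, not a port of the exact pattern above: it drops the whitespace groups and only looks at the characters directly next to each quote. The names `STRAY_QUOTE` and `escape_stray_quotes` are made up for this example, and the same caveat about quotes next to commas applies.

```python
import re

# A quote counts as "stray" when it is not at the start or end of the line
# and has no comma directly before or after it.
STRAY_QUOTE = re.compile(r'(?<!^)(?<!,)"(?!,)(?!$)')


def escape_stray_quotes(line):
    """Backslash-escape quotes that look like part of the text.

    Expects a single line without its line ending.
    """
    return STRAY_QUOTE.sub(lambda m: '\\"', line)


row = '"first column",second column,"third column has "double quotes" inside", fourth column'
print(escape_stray_quotes(row))
```

The backslash escape matches MySQL's default ESCAPED BY character; a different escape character would need a different replacement string.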
Try this:
mysqlimport --fields-optionally-enclosed-by='"' --fields-terminated-by=, --lines-terminated-by="\r\n" --user=YOUR_USERNAME --password YOUR_DATABASE YOUR_TABLE.csv

Prevent LOAD DATA INFILE from escaping double double quotes

I have csv data like the following:
"E12 98003";1085894;"HELLA";"8GS007949261";"";1
"5 3/4"";652493;"HELLA";"9HD140976001";"";1
Some fields are included in double quotes. The problem is that
as you may see in the second line the data in the first column contains a double quotation mark at the end as part of the data.
I tried something along the lines of:
LOAD DATA INFILE 'file.csv'
INTO TABLE mytable
FIELDS TERMINATED BY ';' ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
but it will use the quotation mark in the data to escape the field enclosing quotation mark. I also tried ESCAPED BY '' and ESCAPED BY '\\' with no success.
Is there a way to stop the LOAD DATA INFILE command from escaping the double double quotation marks?
Or should I parse the csv and put double quotation marks when there is only one?
I am parsing the files anyway using powershell to change the encoding to utf8. Is there some way to fix this quickly there? My powershell code:
function Convert-FileToUTF8 {
param([string]$infile,
[string]$outfile,
[System.Int32]$encodingCode)
$encoding = [System.Text.Encoding]::GetEncoding($encodingCode)
$text = [System.IO.File]::ReadAllText($infile, $encoding)
[System.IO.File]::WriteAllText($outfile, $text)
}
Ok, I did it using a .NET regular expression to fix the csv. It is costly, but not too much.
I wrote
$text = [regex]::Replace($text, "(?m)(?<!^)(?<!\;)""(?!\;)(?!\r?$)", '""');
just before the last line in the function and it seems to work ok. Since I am a novice in regular expressions this could probably be improved.
The main problem is that the input data is not valid CSV, as stated in RFC 4180, section 2, rule 7:
If double-quotes are used to enclose fields, then a double-quote appearing inside a field must be escaped by preceding it with another double quote.
But in your PowerShell script you could try to fix this issue with an extra line, using the Replace method on $text once you have its value:
$text = $text.Replace('"";', '""";')
This should be enough, as the loader will deal well with unescaped double quotes if they appear elsewhere in the data, as stated on mysql.com (my highlight):
If the field begins with the ENCLOSED BY character, instances of that character are recognized as terminating a field value only if followed by the field or line TERMINATED BY sequence.
Of course, if the badly formatted CSV has data that contains ";, then you still have a problem. But it is very hard to determine whether such an occurrence terminates the data or should be seen as part of the data, even for humans :-)
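To compare the two fixes suggested in this thread on the sample rows from the question, here is a small Python sketch. The pattern is a port of the .NET regular expression from the earlier answer, and `str.replace` stands in for `$text.Replace`; the variable names are made up for this example. On this sample the plain replacement also touches the empty `"";` fields, which is probably not intended, so it seems worth checking either approach against real data first.

```python
import re

# Port of the .NET pattern: a quote that is not at the start of a line,
# not next to a ';', and not at the end of a line.
LONE_QUOTE = re.compile(r'(?m)(?<!^)(?<!;)"(?!;)(?!\r?$)')

text = (
    '"E12 98003";1085894;"HELLA";"8GS007949261";"";1\r\n'
    '"5 3/4"";652493;"HELLA";"9HD140976001";"";1\r\n'
)

# Equivalent of $text.Replace('"";', '""";'): also matches empty fields.
plain = text.replace('"";', '""";')

# Doubles only quotes that sit inside a value.
doubled = LONE_QUOTE.sub('""', text)

print(plain)
print(doubled)
```

In the regex result the first row is unchanged and the second row starts with `"5 3/4""";`, while the plain replacement turns every empty `""` field followed by `;` into `"""`.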
Another thing to pay attention to as found on mysql.com:
If the input values are not necessarily enclosed within quotation marks, use OPTIONALLY before the ENCLOSED BY keywords.
In addition: importing CSV files into MySQL with the values enclosed in quotes works fine when using the ENCLOSED BY option, UNLESS the enclosed field is the last field in a row AND you used Excel to create the CSV file. Excel omits the field separator after the last field in a row. MySQL doesn't mind, unless the last field is enclosed in quotes. Then the import terminates at that line.
Examples:
This works fine: ...;value2;value3 (no trailing separator)
This also works fine ...;"value 2";value3 (value enclosed in quotes)
This also works fine ...;value 2;"value3"; (last field value enclosed in quotes and trailing separator)
But this breaks the import: ...;value2;"value 3" (last field value enclosed in quotes and no trailing separator)
Took me some time to figure this out; hope sharing this saves somebody else that time.

MySQL INTO OUTFILE issue with new lines in content

I'm exporting a database report with a shell script. If I run the query in phpMyAdmin the file comes out fine, with new lines only at the end of each database row.
However, when I run the query in my shell script using INTO OUTFILE to generate the file, I get \n, \r and \r\n inside some of the columns' content. I can't work out what causes this or how to avoid it.
The issue only seems to occur in the colour column, which is the third in the example export.
Query:
mysql $MYSQLOPTS << EOFMYSQL
SELECT Product_Name, Item_Size, Item_Colour, Item_Price, Current_Stock, Item_Price * Current_Stock AS Stock_Value
FROM Items
ORDER BY Product_Name
INTO OUTFILE '$FILE'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
EOFMYSQL
Example result:
"Scarf_in_Peach","ONE SIZE","12/04-B2B2 ",10.00,3,30.00
"Scarf_in_Pink","ONE SIZE ","11/06-odds-C1C12100",10.00,0,0.00
"Scarf_in_Red","ONE SIZE ","11/06-B7B2-C1C12100",10.00,0,0.00
"Scarf_in_Sand_","ONE SIZE","11/06-B1I3-C1C12100
",10.00,0,0.00
"Scarf_in_Sand_/_Blue_Flowers","ONE SIZE","12/04-B2E2-C1C12100 ",10.00,4,40.00
"Scarf_in_Teal","ONE SIZE","11/06-B5G1-C1C12100
",10.00,0,0.00
"Scarf_in_Teal_/_Red_Flowers","ONE SIZE","12/04 - B2B2 ",10.00,1,10.00
"Sunrise_Skinnies","16","ODD-R1S009-1-BLUE",20.00,0,0.00
"Sunrise_Skinnies","8","ODD-R1S009-1
BLUE",20.00,0,0.00
You have 2 options:
1. Replace carriage return and line feed characters with an empty string within your query. Pro: it is completely up to you which characters you filter out and from which fields. Con: you have to write an expression for each affected field manually.
2. Use the FIELDS ESCAPED BY option of the SELECT ... INTO OUTFILE ... command:
FIELDS ESCAPED BY controls how to write special characters. If the
FIELDS ESCAPED BY character is not empty, it is used when necessary to
avoid ambiguity as a prefix that precedes following characters on
output:
The FIELDS ESCAPED BY character
The FIELDS [OPTIONALLY] ENCLOSED BY character
The first character of the FIELDS TERMINATED BY and LINES TERMINATED BY values
ASCII NUL (the zero-valued byte; what is actually written following the escape character is ASCII “0”, not a zero-valued byte)
The FIELDS TERMINATED BY, ENCLOSED BY, ESCAPED BY, or LINES TERMINATED
BY characters must be escaped so that you can read the file back in
reliably. ASCII NUL is escaped to make it easier to view with some
pagers.
Pro: this is a fast, standard approach that you can easily apply to all of your exports. Con: it is less flexible. For example, if LINES TERMINATED BY is set to \n, then \r is not going to be escaped, which can still cause issues on some systems.
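If changing the query is not convenient, the clean-up from the first option can also be done after the export. The sketch below is a different technique from the two options above: it post-processes the file in Python with the csv module. It assumes the file looks like the example result in the question (quoted strings, raw line breaks inside a value, no escape character written); the function name `strip_line_breaks` is made up for this example.

```python
import csv
import io


def strip_line_breaks(src, dst):
    """Copy CSV rows from src to dst, removing CR/LF inside field values.

    src and dst are text file objects (open real files with newline="").
    """
    writer = csv.writer(dst, lineterminator="\n")
    for row in csv.reader(src):
        writer.writerow([v.replace("\r", "").replace("\n", "") for v in row])


# One exported row whose colour column ends with a line break.
exported = io.StringIO(
    '"Scarf_in_Sand_","ONE SIZE","11/06-B1I3-C1C12100\n",10.00,0,0.00\n'
)
cleaned = io.StringIO()
strip_line_breaks(exported, cleaned)
print(cleaned.getvalue())
```

Note that the writer here only quotes values that need it, so the quoting of the output differs from the original export.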

error with MySQL load data infile field with double quotes

I have .csv file data like this:
"UPRR 38 PAN AM "M"","1"
and I loaded the data into a table with two columns (a and b) using the command below.
LOAD DATA LOCAL INFILE 'E:\monthly_data.csv'
INTO TABLE test_data_table
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n';
But when I select from the table, it gives the unexpected results shown below.
a contains:
UPRR 38 PAN AM "M","1
... and b is NULL.
Thanks
You can replace all the instances of a doubled double quote in your file. Either:
A. open the files and find and replace them by hand, or
B. write a script that opens the files and replaces the extra quote that is messing things up.
You have this:
ENCLOSED BY '"'
Thus " is not a regular character any more. It's a special character that has a special meaning: it highlights the start and end of a column value. If you want to type a " that does not behave that way you need to escape it. The RFC 4180 - Common Format and MIME Type for Comma-Separated Values (CSV) Files document explains how to do that:
If double-quotes are used to enclose fields, then a double-quote
appearing inside a field must be escaped by preceding it with
another double quote
a;b
"UPRR 38 PAN AM ""M""";1
As they say, garbage in, garbage out ;-)
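To see the RFC 4180 rule in action, here is a short Python sketch using the standard csv module, which doubles embedded quotes by default when writing and undoes the doubling when reading. It uses the comma-separated layout from the question; the variable names are only for illustration.

```python
import csv
import io

# Write the problematic value with every field quoted.
buf = io.StringIO()
writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\r\n")
writer.writerow(['UPRR 38 PAN AM "M"', "1"])
print(buf.getvalue())

# Reading the escaped line back recovers the original value.
parsed = next(csv.reader(io.StringIO(buf.getvalue())))
print(parsed)
```

The written line has the inner quotes doubled, which is the form the ENCLOSED BY '"' load in the question can read unambiguously.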