I am using this query to import data from a txt file into my table:
LOAD DATA LOCAL INFILE '~/Desktop/data.txt' INTO TABLE codes LINES TERMINATED BY '\n' (code)
This works fine, but when I look at the "code" field, every entry has a line break at the end. Is there a way to get rid of this?
The LOAD DATA INFILE command is not really suitable for data cleansing, but you may get lucky. First, determine exactly which characters make up those line breaks.
It is possible that the text file uses Windows-style line breaks (\r\n). In that case, use LINES TERMINATED BY '\r\n'. If the line breaks consist of different characters but are consistent across all lines, include those in the LINES TERMINATED BY clause.
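Assuming the file does use Windows line endings, your original query only needs its line terminator changed; a minimal sketch:

LOAD DATA LOCAL INFILE '~/Desktop/data.txt'
INTO TABLE codes
LINES TERMINATED BY '\r\n'
(code);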
If the line break characters are inconsistent, you may have to create a stored procedure or use an external programming language to cleanse your data.
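Often a simple post-load cleanup is enough, though. If the leftover character is a trailing carriage return, an UPDATE with TRIM can strip it in place (a sketch, assuming only '\r' is left at the end of each value):

UPDATE codes
SET code = TRIM(TRAILING '\r' FROM code);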
Related
Hi there, I am new to web development.
I am trying to import a CSV file into MySQL Workbench using the 'Table Data Import Wizard'. However, I have read that my file needs to be CSV (MS-DOS), or I get the following error: "Can't analyze file. Please try to change encoding type. If that doesn't help, maybe the file is not: csv, or the file is empty."
I cannot use CSV (MS-DOS) because my data contains a lot of different special characters, including those from Nordic Europe. When I convert my CSV (comma delimited) to CSV (MS-DOS), the special characters are no longer the same.
Is there a way to import a comma-delimited CSV file into MySQL Workbench? Or is there a better solution for getting my data into the table, such as somehow keeping the special characters the same in the MS-DOS file?
You can import regular CSVs without an issue; just make sure the encoding matches.
Something like
LOAD DATA
INFILE 'yourfile.csv'
INTO TABLE tablename
FIELDS
TERMINATED BY ','
ENCLOSED BY '"'
LINES
TERMINATED BY '\n'
IGNORE 1 LINES;
should work. If your CSV doesn't have headers, remove the IGNORE 1 LINES line. If your formatting is different, change the enclosing and terminating characters accordingly.
You can look up the exact syntax in the manual.
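Since the underlying problem here is special characters, it may also help to declare the file's encoding explicitly with a CHARACTER SET clause; a sketch assuming the file is saved as UTF-8 (the filename and table name are placeholders):

LOAD DATA INFILE 'yourfile.csv'
INTO TABLE tablename
CHARACTER SET utf8mb4
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;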
Your CSV should work fine. You just need to use LOAD DATA INFILE.
You will likely need to define these settings, though:
LOAD DATA INFILE 'c:/tmp/discounts.csv'
INTO TABLE discounts
-- comma separated? maybe pipes '|'?
FIELDS TERMINATED BY ','
-- what surrounds input and is it optional? then add OPTIONALLY before ENCLOSED
ENCLOSED BY '"'
-- what is at the end of each line?
LINES TERMINATED BY '\n'
-- how many header rows are there, if any?
IGNORE 1 ROWS;
I've spent a fair amount of time googling this, but I can't seem to find exactly what I'm looking for. The problem with my .csv file is that while the line terminator is ',,,,', some lines do not include it. When I import the file, everything is fine until it hits one of those lines; then it treats two records as a single record about twice as long as the standard number of columns, and everything is thrown off from that point forward. What I need to do is skip the records (data between ',,,,' terminators) that have more than the correct number of columns (15). I realize this will essentially skip two records each time this happens, but that's fine for what I'm doing with a pretty large dataset.
I've come across the IGNORE keyword, but that doesn't seem to apply. What I'm looking for is something like: for each record during import, skip the record if record.columns.count > 15. Here is my import statement; thanks for any help.
LOAD DATA LOCAL INFILE "/Users/foo/Desktop/csvData.csv"
INTO TABLE csvData
COLUMNS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
ESCAPED BY '"'
LINES TERMINATED BY ',,,,';
If you just want to skip the malformed records, a simple awk command to filter only the good records is:
awk -F, '{ if (NF == 15) print; }' csvData.csv > csvData_fixed.csv
Then LOAD DATA from the fixed file.
If you want to get fancier, you could write a script using awk (or Python or whatever you like) to rewrite the malformed records in the proper format.
Re your comment: the awk command reads your original file and outputs only the lines that have exactly 15 comma-separated fields.
Apparently your input data has no lines that have exactly 15 fields, even though you described it that way.
Another thought: it's a little weird to use ',,,,' as the line terminator in your original LOAD DATA command. Normally the line terminator is '\n', a newline character. When you redefine the line terminator as ',,,,', MySQL will keep reading text until it finds ',,,,', even if that means reading dozens of fields across multiple lines of text. Perhaps you could set your line terminator to ',,,,\n'.
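A minimal revision of the original statement along those lines (assuming each record's trailing ',,,,' is immediately followed by a newline in the file):

LOAD DATA LOCAL INFILE "/Users/foo/Desktop/csvData.csv"
INTO TABLE csvData
COLUMNS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
ESCAPED BY '"'
LINES TERMINATED BY ',,,,\n';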
I have a table in MySQL, and I am adding data to it from a CSV file.
My code is:
LOAD DATA LOCAL INFILE 'C:/myaddress/file.csv'
INTO TABLE db.mytable
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
(`Currency`,`field2`,`field3`);
This loads fine except for the first row.
I'm adding USD, but when I query the table, the first value comes back with extra hidden bytes in front of it (a UTF-8 byte-order mark, which can display as ï»¿USD).
This only happens for the first row. Does anybody know why this happens?
Solution:
This is an encoding issue. To solve it, there are two options:
1. Encode the file differently.
2. Add a dummy first line and use IGNORE 1 LINES.
I think this is related to your CSV. It has been saved as UTF-8 with a BOM, while being read as ISO-8859-1 (Spanish?).
If you are using an editor like Notepad++, open your CSV, select Encoding -> UTF-8 without BOM from the top menu, save, and try again.
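For option 2, the idea is that the BOM bytes get discarded along with the skipped first line; a sketch of the original statement with the skip added (assuming a dummy line has been prepended to the file):

LOAD DATA LOCAL INFILE 'C:/myaddress/file.csv'
INTO TABLE db.mytable
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(`Currency`,`field2`,`field3`);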
I'm trying to import a text file containing:
http://pastebin.com/qhzrq3M7
Into my database using the command
LOAD DATA LOCAL INFILE 'C:/Users/Gary/Desktop/XML/jobs.txt'
INTO TABLE jobs
FIELDS TERMINATED BY '\t';
But I keep getting the error Row 1-13 doesn't contain data for all columns
Make sure the last field of each row also ends with \t. Alternatively, use LINES TERMINATED BY:
LOAD DATA LOCAL INFILE 'C:/Users/Gary/Desktop/XML/jobs.txt'
INTO TABLE jobs
COLUMNS TERMINATED BY '\t'
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r';
\r is a carriage return character, similar to the newline character (i.e. \n)
I faced the same issue. Here is how I fixed it:
Open the CSV file in Notepad++ (or another text editor).
There was a blank line at the end of my file; I deleted it, and that resolved the issue.
The URL below may also help you resolve the issue:
http://www.thoughtspot.com/blog/5-magic-fixes-most-common-csv-file-problems
If you're on Windows, make sure to use LINES TERMINATED BY '\r\n', as explained in the MariaDB docs.
It sounds like LOAD DATA LOCAL INFILE expects to see a value for each column.
You can edit the file by hand to delete those rows (they could be blank lines), or you can create a temp table, load each row into a single column, and write a MySQL statement that splits the rows on tabs and inserts the values into the target table, as sketched below.
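A minimal sketch of that approach, assuming the target table has three columns (the column names col1..col3 are placeholders):

CREATE TEMPORARY TABLE jobs_raw (line TEXT);

LOAD DATA LOCAL INFILE 'C:/Users/Gary/Desktop/XML/jobs.txt'
INTO TABLE jobs_raw
LINES TERMINATED BY '\n'
(line);

-- SUBSTRING_INDEX picks out the n-th tab-separated piece of each raw line
INSERT INTO jobs (col1, col2, col3)
SELECT
  SUBSTRING_INDEX(line, '\t', 1),
  SUBSTRING_INDEX(SUBSTRING_INDEX(line, '\t', 2), '\t', -1),
  SUBSTRING_INDEX(SUBSTRING_INDEX(line, '\t', 3), '\t', -1)
FROM jobs_raw
WHERE line <> '';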
Make sure there are no backslashes at the end of any field. In the CSV viewed as text, this would look like "\,", which is obviously a no-no: the backslash escapes the comma, so the comma is ignored and you won't have enough columns.
(This primarily applies when you don't have field enclosures, such as quotes, around each field.)
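If your data legitimately contains trailing backslashes, one option is to turn off backslash escaping entirely with an empty ESCAPED BY clause; a sketch (the path and table name are placeholders):

LOAD DATA LOCAL INFILE '/path/to/data.csv'
INTO TABLE mytable
FIELDS TERMINATED BY ',' ESCAPED BY ''
LINES TERMINATED BY '\n';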
Good Day
I have created a bat file to import a text file to my MySQL database and it looks as follows:
sqlcmd /user root /pass password /db "MyDB" /command "LOAD DATA LOCAL INFILE 'file.csv' INTO TABLE TG_Orders FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'"
My problem is that I cannot get the "Treat consecutive delimiters as one" to work...
How would I add that?
Now that we have actually got to the real crux of the problem, this is not a consecutive delimiter problem - it's a CSV file format problem.
If your CSV file contains fields like B121,535 and they are not enclosed within quote marks of some kind, and your delimiter is ',', then no amount of SQL jiggery-pokery will sort out your problem. Unquoted fields with embedded commas like this will always be interpreted as two separate fields unless enclosed within quote marks.
Post a sample line from the CSV file which is causing problems and we can diagnose further. Failing that, export the data from the initial system again, making sure that the formatting is correct (either enclose everything in speech marks or just the string fields).
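To make that concrete (the second field here is hypothetical): an unquoted line such as

B121,535,20.00

is parsed as three fields, while the quoted form

"B121,535",20.00

is parsed as two, which FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' then handles correctly.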
Finally, are you sure that your database is MySQL and not Microsoft SQL Server? The only references to SQLCMD.EXE I can find all point to Microsoft sites in relation to SQL Server Express, and even then it has a different option structure (-U for user rather than /user). If that is the case, you could have saved a lot of hassle by using the correct tags. If not, then I would say SQLCMD.EXE is a custom-written application from somewhere, and the problem could stem from that. In that case, if the CSV formatting is correct, we can't help; you're on your own.