I am dumping some csv data into mysql with this query
LOAD DATA LOCAL INFILE 'path/LYRIC.csv' INTO TABLE LYRIC CHARACTER SET euckr FIELDS TERMINATED BY '|';
When I run this, I see the following logs in the console:
...
[2017-09-13 11:24:10] ' for column 'SONG_ID' at row 3
[2017-09-13 11:24:10] [01000][1261] Row 3 doesn't contain data for all columns
[2017-09-13 11:24:10] [01000][1261] Row 3 doesn't contain data for all columns
...
I think the csv has line feeds inside a column's data, which breaks the whole parsing process.
A single record in the csv looks like this:
000001|2014-11-17 18:10:00|2014-11-17 18:10:00|If I were your September
I''d whisper a sunset to fly through
And if I were your September
|0|dba|asass|2014-11-17 18:10:00||||2014-11-17 18:10:00
So LOAD DATA pushes line 1 as a record, then tries line 2, and so on, even though this is a single record.
How can I fix it? Should I request a different type of file from the client?
P.S. I am new to this csv work.
Multiline fields in csv should be surrounded with double quotes, like this:
000001|2014-11-17 18:10:00|2014-11-17 18:10:00|"If I were your September
I''d whisper a sunset to fly through
And if I were your September
"|0|dba|asass|2014-11-17 18:10:00||||2014-11-17 18:10:00
And any double quote inside that field should be escaped with another double quote.
Of course, the parser has to support (and maybe be instructed to use) multiline fields.
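For example, MySQL's LOAD DATA supports this if you tell it about the enclosing character; a sketch based on the statement in the question (assuming the client re-exports the file with quoted fields and \n line endings):
LOAD DATA LOCAL INFILE 'path/LYRIC.csv'
INTO TABLE LYRIC
CHARACTER SET euckr
FIELDS TERMINATED BY '|' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n';
-- With an enclosing character defined, line terminators inside an enclosed
-- field are read as part of the field value, not as the end of a record.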
Related
I have a column of data, let's call it bank_date, that I receive from an external vendor as a csv file every day. As such the dates in that column show as '1/1/2020'.
I am trying to upload that raw csv file directly to SQL daily. We used to store bank_date as text, but we have converted it to a DATE data type, and now it keeps zeroing out every time, with some sort of truncation / "datetime value incorrect" error.
I have now tested 17 different versions using (mostly) STR_TO_DATE, CAST, and CONVERT, and feel like I'm close, but I'm not quite getting the syntax right.
I did find 2 other workarounds that are successful, but my boss specifically wants the file uploaded and converted directly through the import process (not by manipulating the raw csv data) for safety reasons. For reference:
Workaround 1: Convert the csv date column to the YYYY-MM-DD format and save the file. The issue with this is that if you open that CSV file again, Excel auto-changes the date format back to the standard mm/dd/yyyy. If someone doesn't know to watch out for this and re-opens the csv file to double-check something, they're going to get an error when they upload, and the problem is not easy to identify.
Workaround 2: Create an extra dummy_date column in the table with a text data type and upload as normal. Then copy the data into the correct bank_date column using STR_TO_DATE as follows: UPDATE bank_table SET bank_date = STR_TO_DATE(dummy_date, '%c/%e/%Y'); The issue with this is that it creates extra unnecessary data, which can cause confusion when other people don't know that one of the columns is not intended for querying.
Here is my current code:
USE database_name;
LOAD DATA LOCAL INFILE 'C:/Users/Shelly/Desktop/Date Import.csv'
INTO TABLE bank_table
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS
(bank_date, bank_amount)
SET bank_date = str_to_date(bank_date,'%Y-%m-%d');
The "SET" line is what I cannot work out on syntax to convert a csv's 1/5/2020' to SQL's 2020-1-5 format. Every test I've made either produces 0000-00-00 or nulls the column cells. I'm thinking maybe I need to tell SQL how to understand the csv's format in order for it to know how to convert it. Newbie here and stuck.
You need to specify the format of the date as it appears in the file, not the "required" output format:
SET bank_date = str_to_date(bank_date,'%c/%e/%Y');
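If the value still comes out as 0000-00-00, note that naming the DATE column directly in the column list makes MySQL coerce the raw string into the column before the SET clause runs. The pattern shown in the MySQL documentation reads the field into a user variable first; a sketch against the statement above (the variable name @raw_date is arbitrary):
LOAD DATA LOCAL INFILE 'C:/Users/Shelly/Desktop/Date Import.csv'
INTO TABLE bank_table
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS
(@raw_date, bank_amount)
SET bank_date = STR_TO_DATE(@raw_date, '%c/%e/%Y');  -- @raw_date still holds the untouched '1/5/2020' text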
I have a table with multiple columns, including 5 phone number columns, all of which have a data type like
varchar(15)
But when I insert a 10-digit phone number without - or (), like 9999999999, only the phonenumber3, phonenumber4, and phonenumber5 columns get updated as 9999999999.0.
I'm reading from a CSV and writing into the table using pandas to_sql().
Why is this happening?
If the CSV data is generated by the server (unaltered by a user), leading-zero data is preserved on import, assuming " quotes are used as the field container. Everything should be fine.
If the CSV data has been altered by the user (edited and re-saved), the leading zeros will be trimmed, because a spreadsheet editor like Excel or LibreOffice omits them when saving. The workaround is to edit the CSV file in a text editor (e.g. Sublime) instead; the columns will then be preserved as "012345678", with the quotes as the container.
I am getting the below error when I try to load a CSV from my system into a Snowflake table:
Unable to copy files into table.
Numeric value '"4' is not recognized File '#EMPP/ui1591621834308/snow.csv', line 2, character 25 Row 1, column "EMPP"["SALARY":5] If you would like to continue loading when an error is encountered, use other values such as 'SKIP_FILE' or 'CONTINUE' for the ON_ERROR option. For more information on loading options, please run 'info loading_data' in a SQL client.
You appear to be loading your CSV with the file format option of FIELD_OPTIONALLY_ENCLOSED_BY='"' specified.
This option will allow reading any fields properly quoted with the " character, and even support such fields carrying the delimiter character as well as the " character if properly escaped. Some examples that could be considered valid:
CSV FORM | ACTUAL DATA
------------------------
abc | abc
"abc" | abc
"a,bc" | a,bc
"a,""bc""" | a,"bc"
In particular, notice that the final example follows the specified rule:
When a field contains this character, escape it using the same character. For example, if the value is the double quote character and a field contains the string A "B" C, escape the double quotes as follows:
A ""B"" C
If your CSV file carries quote marks within the data but is not necessarily quoting the fields (and delimiters and newlines do not appear within data fields), you can remove the FIELD_OPTIONALLY_ENCLOSED_BY option from your file format definition and just read the file as plain comma (,) delimited fields.
If your CSV does use quoting, ensure that whatever is producing the CSV files is using a valid CSV format writer and not simple string munging, and recreate it with the quotes properly escaped. If the above data example is to be considered valid in quoted form, it must instead appear within the file as "4" or 4.
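For instance, a file format without the option might look like this (a sketch; the format name my_csv_format is hypothetical):
CREATE OR REPLACE FILE FORMAT my_csv_format
  TYPE = 'CSV'
  FIELD_DELIMITER = ',';
-- With no FIELD_OPTIONALLY_ENCLOSED_BY, a " in the data is read as a literal character.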
The error message is saying that you have a value in your file that contains a "4 which is being loaded into a table column that has a number type. Since that isn't a number, it fails. This appears to be happening in the very first row of your file, so you could open it up and take a look at the value. If it's just one record, you can add ON_ERROR = 'CONTINUE' to your command, so that it skips it and moves on.
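A sketch of that COPY (the stage name @my_stage is an assumption; the table name comes from the error message):
COPY INTO EMPP
FROM @my_stage/snow.csv
FILE_FORMAT = (TYPE = 'CSV' FIELD_DELIMITER = ',')
ON_ERROR = 'CONTINUE';  -- skip rows that fail to parse instead of aborting the whole load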
I have a text file that has the field headers on the first line, and the following lines are space-separated fields. The spacing between fields varies. There are 45 fields. Specifically, this is an eBird dataset. I want to load this text file into a newly created table within a database I created, so that each header is the field header in the database and each line below the headers is a record. Here is a small example of the file's format:
Header1 Header2 Header3 Header4
082739 United States US-CA-01 1
I haven't tried anything yet, because I want to know what to do before I load this data into a table. I have this command prepared:
LOAD DATA INFILE <file_path> INTO TABLE <table_name> FIELDS TERMINATED BY ' ';
Will this command populate the table so that each field corresponds to each header, or do I need to tell the command how many spaces there are between each field? Will the command know that each line is terminated by a newline if not told?
Looks like there's an R package for filtering and processing the EBD dataset. Using R you should be able to process the data into a more manageable format for insertion into MariaDB.
For example, you can separate the fields with commas to generate CSV-formatted data that can be readily ingested by MariaDB.
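Once the data is comma-separated, the MariaDB side could look like this (a sketch; the table name, file path, and quoting are assumptions):
LOAD DATA LOCAL INFILE '/path/to/ebird.csv'
INTO TABLE ebird_observations
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;  -- skip the header row so it is not loaded as data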
How can I load 10,000 rows from a test.xls file into a mysql db table?
When I use the below query it shows this error.
LOAD DATA INFILE 'd:/test.xls' INTO TABLE karmaasolutions.tbl_candidatedetail (candidate_firstname,candidate_lastname);
My primary key is candidateid and has the below properties.
The test.xls contains data like the below.
I have added rows starting from candidateid 61 because up to 60 there are already candidates in the table.
Please suggest solutions.
Export your Excel spreadsheet to CSV format.
Import the CSV file into mysql using a similar command to the one you are currently trying:
LOAD DATA INFILE 'd:/test.csv'
INTO TABLE karmaasolutions.tbl_candidatedetail
FIELDS TERMINATED BY ','  -- CSV fields are comma-separated; without this, LOAD DATA expects tabs
(candidate_firstname,candidate_lastname);
To import data from Excel (or any other program that can produce a text file) is very simple using the LOAD DATA command from the MySQL command prompt.
1. Save your Excel data as a csv file (in Excel 2007, use Save As).
2. Check the saved file in a text editor such as Notepad to see what it actually looks like, i.e. what delimiter was used etc.
3. Start the MySQL command prompt (I'm lazy, so I usually do this from the MySQL Query Browser via Tools > MySQL Command Line Client to avoid having to enter the username and password etc.).
4. Enter this command (note that backslashes must be doubled in MySQL string literals):
LOAD DATA LOCAL INFILE 'C:\\temp\\yourfile.csv'
INTO TABLE database.table
FIELDS TERMINATED BY ';'
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
(field1, field2);
If you copy and paste, make sure the single quotes (') and double quotes (") stay plain ASCII characters; blog software sometimes changes them into similar but different characters.
Done! Very quick and simple once you know it :)
Some notes from my own import; they may not apply to you if you run a different language version, MySQL version, Excel version etc.:
TERMINATED BY: this is why I included step 2. I thought a csv would default to comma-separated, but at least in my case semicolon was the default.
ENCLOSED BY: my data was not enclosed by anything, so I left this as an empty string ''.
LINES TERMINATED BY: at first I tried with only '\n' but had to add the '\r' to get rid of a carriage return character being imported into the database.
Also make sure that if you do not import into the primary key field/column, it has auto increment on; otherwise only the first row will be imported.
Original Author reference