LOAD DATA FROM S3 command failing because of timestamp - mysql

I'm running the "LOAD DATA FROM S3" command to load a CSV file from S3 into Aurora MySQL. The command works fine if I run it in MySQL Workbench (it reports the exception below as a warning but still inserts the dates fine), but when I run it from Java I get the following exception:
com.mysql.cj.jdbc.exceptions.MysqlDataTruncation:
Data truncation: Incorrect datetime value: '2018-05-16T00:31:14-07:00'
Is there a workaround? Is there something I need to set up on the MySQL side or in my app to make this transformation seamless? Should I somehow run a REPLACE() command on the timestamp?
Update 1:
When I use REPLACE to remove the "-07:00" from the original timestamp (2018-05-16T00:31:14-07:00), it loads the data correctly. Here's my load statement:
LOAD DATA FROM S3 's3://bucket/object.csv'
REPLACE
INTO TABLE sample
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(#myDate)
SET `created-date` = replace(#myDate, '-07:00', ' ');
For obvious reasons this isn't a good solution. Why would the LOAD statement work in MySQL Workbench but not in my Java code? Can I set some parameter to make it work? Any help is appreciated!

The way I solved it was by using MySQL's SUBSTRING function in the SET part of the LOAD DATA query (instead of REPLACE):
SUBSTRING(#myDate, 1, 10)
This way the trailing '-07:00' was removed. (I actually opted to remove the time as well, since I didn't need it, but you can keep it for TIMESTAMP columns too.)
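If you would rather keep the timezone information instead of truncating it, another option is to normalize the timestamps in application code before the load. A minimal Python sketch (illustrative only; the original question uses Java, but the same logic applies there):

```python
from datetime import datetime, timezone

def to_mysql_datetime(iso_ts: str, as_utc: bool = False) -> str:
    """Convert an ISO 8601 timestamp with a UTC offset, e.g.
    '2018-05-16T00:31:14-07:00', to the 'YYYY-MM-DD HH:MM:SS'
    format that MySQL DATETIME/TIMESTAMP columns accept."""
    dt = datetime.fromisoformat(iso_ts)  # Python 3.7+ parses the '-07:00' offset
    if as_utc:
        # Shift to UTC instead of silently dropping the offset
        dt = dt.astimezone(timezone.utc)
    return dt.strftime('%Y-%m-%d %H:%M:%S')

print(to_mysql_datetime('2018-05-16T00:31:14-07:00'))               # local wall time
print(to_mysql_datetime('2018-05-16T00:31:14-07:00', as_utc=True))  # normalized to UTC
```

Whether you want the wall-clock time or the UTC-normalized time depends on how the column is used; dropping the offset (as SUBSTRING does) keeps the wall time but loses the zone.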

Related

MySql naming the automatically downloaded CSV file using first and last date

The MySQL query gives me data from 2020-09-21 to 2022-11-02. I want to save the file as FieldData_20200921_20221102.csv.
MySQL query:
SELECT 'datetime','sensor_1','sensor_2'
UNION ALL
SELECT datetime,sensor_1,sensor_2
FROM `field_schema`.`sensor_table`
INTO OUTFILE "FieldData.csv"
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
;
Present output file:
Presently I name the file FieldData.csv, and that is exactly what I get. But I want the query to automatically append the first and last dates to the filename, so I can tell the date range of the data without having to open the file.
Expected output file
FieldData_20200921_20221102.csv
MySQL's SELECT ... INTO OUTFILE syntax accepts only a fixed string literal for the filename, not a variable or an expression.
To make a custom filename, you would have to format the filename yourself and then write dynamic SQL so the filename could be a string literal. But to do that, you first would have to know the minimum and maximum date values in the data set you are dumping.
I hardly ever use SELECT ... INTO OUTFILE, because it can only create the outfile on the database server. I usually want the file to be saved on the server where my client application is running, and the database server's filesystem is not accessible to the application.
Both the file naming problem and the filesystem access problem are better solved by avoiding the SELECT ... INTO OUTFILE feature, and instead writing to a CSV file using application code. Then you can name the file whatever you want.
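As a sketch of that client-side approach in Python: query the rows, derive the date range, format the filename, and write the CSV yourself. The rows would really come from a database cursor (e.g. `cursor.execute(...)`); they are mocked here, and the column names follow the question:

```python
import csv
from datetime import date

def csv_filename(prefix: str, first: date, last: date) -> str:
    # Builds a FieldData_20200921_20221102.csv-style name from the date range
    return f"{prefix}_{first:%Y%m%d}_{last:%Y%m%d}.csv"

# In real code these rows would come from:
#   cursor.execute("SELECT datetime, sensor_1, sensor_2 FROM field_schema.sensor_table ORDER BY datetime")
rows = [("2020-09-21 00:00:00", 1.2, 3.4),
        ("2022-11-02 23:59:00", 5.6, 7.8)]

first = date.fromisoformat(rows[0][0][:10])   # first row's date
last = date.fromisoformat(rows[-1][0][:10])   # last row's date
name = csv_filename("FieldData", first, last)

with open(name, "w", newline="") as f:
    w = csv.writer(f, quoting=csv.QUOTE_NONNUMERIC)  # mimics ENCLOSED BY '"'
    w.writerow(["datetime", "sensor_1", "sensor_2"])  # header row
    w.writerows(rows)
```

This also sidesteps the INTO OUTFILE restriction that the file can only be written on the database server.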

MariaDB CSV output limitations

I have some data in a MariaDB 10.2 database that I'm trying to do the following to:
Compose a JSON document using nested JSON_SET calls
Export a list of query results into a CSV file for use later on in another tool.
These JSON documents vary in size depending on the field values, but in general they are in the range of 250-400 characters. The issue I'm running into is that the JSON documents are getting truncated to 278 characters (often resulting in malformed JSON that cannot be used).
Is there a configuration or query parameter that I can use to configure this? I tried Googling for it earlier, but so far I've been unable to find anything.
Would appreciate any help!
As an example, the query looks like:
SELECT field_1, JSON_SET(JSON_SET('{"foo":{}}', '$.foo.bar', field_2), '$.foo.baz', field_3)
FROM test
INTO OUTFILE '/tmp/test.csv'
FIELDS TERMINATED BY '|'
LINES TERMINATED BY '\n';
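If the truncation turns out to happen at export time, one workaround is to compose the JSON and write the pipe-delimited file in application code instead of using INTO OUTFILE. A minimal Python sketch that builds the same document structure as the nested JSON_SET calls (field values are placeholders):

```python
import json

def make_doc(field_2, field_3) -> str:
    # Equivalent of:
    #   JSON_SET(JSON_SET('{"foo":{}}', '$.foo.bar', field_2), '$.foo.baz', field_3)
    return json.dumps({"foo": {"bar": field_2, "baz": field_3}})

# One output line, mimicking FIELDS TERMINATED BY '|'
line = "|".join(["value_1", make_doc("x", "y")])
print(line)
```

Composed this way, the document length is limited only by application memory, so there is no server-side truncation to configure around.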

How can I load 10,000 rows of test.xls file into mysql db table?

How can I load 10,000 rows of test.xls file into mysql db table?
When I use the query below, it shows this error:
LOAD DATA INFILE 'd:/test.xls' INTO TABLE karmaasolutions.tbl_candidatedetail (candidate_firstname,candidate_lastname);
My primary key is candidateid and has the properties shown below.
The test.xls contains data like the following.
I have added rows starting from candidateid 61 because up to 60 there are already candidates in the table.
Please suggest a solution.
Export your Excel spreadsheet to CSV format.
Import the CSV file into MySQL using a command similar to the one you are currently trying:
LOAD DATA INFILE 'd:/test.csv'
INTO TABLE karmaasolutions.tbl_candidatedetail
(candidate_firstname,candidate_lastname);
Importing data from Excel (or any other program that can produce a text file) is very simple using the LOAD DATA command from the MySQL command prompt.
1. Save your Excel data as a CSV file (in Excel 2007, using Save As).
2. Check the saved file in a text editor such as Notepad to see what it actually looks like, i.e. what delimiter was used, etc.
3. Start the MySQL command prompt (I'm lazy, so I usually do this from the MySQL Query Browser via Tools -> MySQL Command Line Client, to avoid having to enter the username and password).
4. Enter this command:
LOAD DATA LOCAL INFILE 'C:/temp/yourfile.csv'
INTO TABLE database.table
FIELDS TERMINATED BY ';'
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
(field1, field2);
Done! Very quick and simple once you know it :)
Some notes from my own import; they may not apply to you if you run a different language version, MySQL version, Excel version, etc.
TERMINATED BY: this is why I included step 2. I thought a CSV would default to comma-separated, but at least in my case semicolon was the default.
ENCLOSED BY: my data was not enclosed by anything, so I left this as an empty string ''.
LINES TERMINATED BY: at first I tried with only '\n' but had to add the '\r' to get rid of a carriage-return character being imported into the database.
Also make sure that if you do not import into the primary key field/column, it has auto-increment on; otherwise only the first row will be imported.
Original Author reference
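The delimiter check in step 2 can also be automated. A small Python sketch using the standard library's csv.Sniffer (the sample text here is made up):

```python
import csv

# First few lines of the exported file, as produced by Excel's Save As
sample = "firstname;lastname\nJohn;Doe\nJane;Roe\n"

# Restrict the candidates to the delimiters Excel plausibly uses
dialect = csv.Sniffer().sniff(sample, delimiters=";,\t")
print(dialect.delimiter)  # this is what goes into FIELDS TERMINATED BY
```

In real use you would read the sample from the file (`open(path).read(4096)`) rather than hard-coding it.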

how to populate a database?

I have a MySQL database with a single table that includes an auto-increment ID, a string, and two numbers. I want to populate this database with many strings coming from a text file, with all numbers initially set to 0.
Is there a way to do this quickly? I thought of creating a script that generates many INSERT statements, but that seems somewhat primitive and slow, especially since MySQL is on a remote site.
Yes, use LOAD DATA INFILE (docs are here). Example:
LOAD DATA INFILE 'csvfile'
INTO TABLE table
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 0 LINES
(cola, colb, colc)
SET cold = 0,
cole = 0;
Notice the SET clause; that is where you assign the default values.
Change the FIELDS TERMINATED BY ',' line to match your field separator.
The other answers only respond to half of your question. For the other half (zeroing the numeric columns):
Either:
Set the default value of your numeric columns to 0, and
in your text file, simply omit the numeric values.
This causes the fields to be read by LOAD DATA INFILE as null, and the default value you set (0) will be assigned.
Or:
Once you have your data in the table, issue a MySQL command to zero the fields, such as:
UPDATE table SET first_numeric_column_name = 0, second_numeric_column_name = 0 WHERE 1;
And to sum everything up, use LOAD DATA INFILE.
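Either variant can also be sidestepped entirely by writing the zeros into the CSV up front, so the LOAD DATA statement needs no SET clause and no column defaults. A minimal Python sketch (column and file names are hypothetical):

```python
import csv

# Hypothetical input: one string per line, as in the question's text file
strings = ["alpha", "beta", "gamma"]

with open("load_me.csv", "w", newline="") as f:
    w = csv.writer(f)
    for s in strings:
        # String column plus the two numeric columns, pre-zeroed
        w.writerow([s, 0, 0])
```

The resulting file can then be ingested with a plain `LOAD DATA INFILE 'load_me.csv' INTO TABLE t FIELDS TERMINATED BY ',' (cola, colb, colc);`.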
If you have access to the server's file system, you can use LOAD DATA.
If you don't want to fight with the syntax, the easiest way (if you're on Windows) is to use HeidiSQL.
It has a friendly wizard for this purpose.
Maybe I can help you with the right syntax if you post a sample line from the text file.
I recommend using SB Data Generator by Softbuilder (which I work for); download and install the free trial.
First, create a connection to your MySQL database, then go to "Tools -> Virtual Data" and import your test data (the file must be in CSV format).
After importing the test data, you will be able to preview and query it in the same window without inserting it into the database.
Now, if you want to insert the test data into your database, go to "Tools -> Data Generation" and select "Generate data from virtual data".
SB data generator from Softbuilder

Having troubles loading data in InfoBright ICE

ICE Version: infobright-3.5.2-p1-win_32
I’m trying to load a large file but keep running into problems with errors such as:
Wrong data or column definition. Row: 989, field: 5.
This is row 989, field 5:
"(450)568-3***"
Note: the last 3 chars are numbers as well, but I didn't want to post somebody's phone number on here.
It's really no different from any of the other entries in that field.
The datatype of that field is VARCHAR(255) NOT NULL
Also, if you upgrade to the current release 4.0.6, we now support row level error checking during LOAD and support a reject file.
To enable the reject-file functionality, you must specify BH_REJECT_FILE_PATH and one of the associated parameters (BH_ABORT_ON_COUNT or BH_ABORT_ON_THRESHOLD). For example, if you want to load data from the file DATAFILE.csv into table T but you expect that 10 rows in this file might be wrongly formatted, you would run the following commands:
set @BH_REJECT_FILE_PATH = '/tmp/reject_file';
set @BH_ABORT_ON_COUNT = 10;
load data infile 'DATAFILE.csv' into table T;
If less than 10 rows are rejected, a warning will be output, the load will succeed and all problematic rows will be output to the file /tmp/reject_file. If the Infobright Loader finds a tenth bad row, the load will terminate with an error and all bad rows found so far will be output to the file /tmp/reject_file.
I've run into this issue when the last line of the file is not terminated by the value of --lines-terminated-by="\n".
For example, if I am importing a file with 9,000 lines of data, I have to make sure there is a new line at the end of the file.
Depending on the size of the file, you can just open it with a text editor and hit the return key on the last line.
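Checking and fixing the missing final newline can be scripted rather than done by hand in an editor. A small Python sketch (the file name is just an example):

```python
def ensure_trailing_newline(path: str, terminator: str = "\n") -> None:
    """Append the line terminator if the file does not already end with it.
    Pass terminator='\r\n' to match --lines-terminated-by='\r\n'."""
    with open(path, "rb") as f:
        data = f.read()
    if data and not data.endswith(terminator.encode()):
        with open(path, "ab") as f:
            f.write(terminator.encode())

# Demo: a file exported without a final newline
with open("data.csv", "w") as f:
    f.write("row1,a\nrow2,b")
ensure_trailing_newline("data.csv")
```

The function is idempotent, so it is safe to run unconditionally before every load.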
I have found this to be consistent with the '\r\n' vs. '\n' difference. Even when running the loader on Windows, '\n' succeeds 100% of the time (assuming you don't have real issues with your data vs. the column definition).