MySQL LOAD DATA LOCAL INFILE vs. SQL file

Every day we load around 6GB of CSV files into MySQL using:
LOAD DATA LOCAL INFILE 'file$i.csv' INTO TABLE tableName FIELDS TERMINATED BY ',' ENCLOSED BY '\"' LINES TERMINATED BY '\n';
We have 6 files that go through this process, so it takes some time. Since we generate these files ourselves, we control the output format.
Originally we chose CSV because this was a smaller process and the data needed to be moved around and easily read by non-developers. Now, however, readability matters less than loading time, which has grown to hours.
Is it quicker to write each row as an INSERT statement into a single SQL file and execute that, or is CSV still quicker?
We're using the InnoDB storage engine.

If you use MyISAM tables, try ALTER TABLE table_name DISABLE KEYS; before loading the data and ALTER TABLE table_name ENABLE KEYS; after the import is done. This can greatly reduce the load time for large data sets. Note that DISABLE KEYS only affects nonunique indexes on MyISAM tables, so it does not help with InnoDB.
LOAD DATA is faster than a separate INSERT statement for each row.
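For the InnoDB tables mentioned in the question, the MySQL manual's bulk-loading advice is to switch off uniqueness and foreign key checks for the session during the import, and to load rows in primary key order where possible. The following is a minimal sketch under those assumptions, not a tuned configuration; the file name file1.csv is a hypothetical stand-in for one of the six files.

```sql
-- Sketch: session settings around an InnoDB bulk load.
-- Only safe if the CSV is already known to contain no duplicate keys
-- and no foreign key violations, because the checks are skipped.
SET unique_checks = 0;
SET foreign_key_checks = 0;

LOAD DATA LOCAL INFILE 'file1.csv'   -- hypothetical file name
INTO TABLE tableName
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n';

-- Restore the defaults for the rest of the session.
SET foreign_key_checks = 1;
SET unique_checks = 1;
```

Both variables are session-scoped here, so other connections keep their normal checks while the import runs.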

Related

Load Data InFile SQL statement is not allowed to be executed when creating MySQL Events

I know that the LOAD DATA INFILE statement is not allowed inside MySQL events.
I am trying to find an alternative, but so far I can't.
https://dev.mysql.com/doc/mysql-reslimits-excerpt/5.6/en/stored-program-restrictions.html
Does anyone know another way to load an external file into tables on a schedule?
Is there any reason you can't just run LOAD DATA into a temporary (staging) table that matches the expected input format? Add a few extra columns, such as pre-processing flags, if that helps. That way the load is a single step with no triggers or other interactions involved. A stored procedure can then query, fetch, process, and insert based on the records already loaded into that table. If some rows are skipped, so be it; the others process as normal. This is something I have done in the past and it might be an option for you.
Although I did not quite understand your question: if you are on a Linux-based system, you could create a simple bash script to load any external data into MySQL. Either dump it locally and load it, or pull it from an external source and load it the same way you would load any file.
For example, below I am trying to import data from a CSV file into my customers table, and I want to schedule it every 5 minutes.
So I put the LOAD DATA INFILE statement inside a MySQL event, as in the code below.
However, it returns an error: ERROR Code 1314: LOAD DATA is not allowed in stored procedures.
Is there an alternative way to import CSV data into a table on a 5-minute schedule?
USE sales;
CREATE EVENT import_file_event
ON SCHEDULE EVERY 5 MINUTE
DO
LOAD DATA INFILE 'C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/Importfile.csv'
INTO TABLE customers
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(@first_name,@last_name,@gender,@email_address,@number_of_complaints)
SET first_name = NULLIF(@first_name,''),
last_name = NULLIF(@last_name,''),
gender = NULLIF(@gender,''),
email_address = NULLIF(@email_address,'');
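The staging-table suggestion above can be sketched as follows. The file itself still has to be loaded from outside MySQL (for example, by a script run from the operating system's scheduler), since LOAD DATA is rejected inside events; the event then only moves rows that are already in the database. The table customers_staging, its column types, the state column, and the event name are all assumptions made for illustration.

```sql
USE sales;

-- Hypothetical staging table; column types are assumptions.
-- The external LOAD DATA would list the four data columns explicitly,
-- so that state keeps its default of 0.
CREATE TABLE IF NOT EXISTS customers_staging (
  first_name    VARCHAR(100),
  last_name     VARCHAR(100),
  gender        VARCHAR(20),
  email_address VARCHAR(255),
  state         TINYINT NOT NULL DEFAULT 0  -- 0 = new, 1 = claimed, 2 = done
);

-- DELIMITER is a mysql client command, needed here because the event
-- body contains several statements.
DELIMITER //

CREATE EVENT process_customers_staging
ON SCHEDULE EVERY 5 MINUTE
DO
BEGIN
  -- Claim the rows present right now; rows loaded later wait for the next run.
  UPDATE customers_staging SET state = 1 WHERE state = 0;

  INSERT INTO customers (first_name, last_name, gender, email_address)
  SELECT NULLIF(first_name, ''), NULLIF(last_name, ''),
         NULLIF(gender, ''), NULLIF(email_address, '')
  FROM customers_staging
  WHERE state = 1;

  -- Mark the claimed rows as done so they are not inserted again.
  UPDATE customers_staging SET state = 2 WHERE state = 1;
END //

DELIMITER ;
```

Claiming rows before copying them is one way to avoid marking rows as processed that arrived between the INSERT and the final UPDATE; a simpler delete-after-copy approach may be enough if loads and event runs never overlap.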

MySQL importing large csv

I have a large CSV file that I am trying to import into MySQL (around 45GB, around 150 million rows; most columns are small, but one holds variable-length text that can reach several KB). I am using LOAD DATA LOCAL INFILE to import it, but the server always times out my connection before it finishes. I have tried raising the global connection timeout variables, but the connection already lasts some hours before it times out. Is there another way to import a data set this large, or am I doing something wrong with this approach?
LOAD DATA LOCAL INFILE 'filename.csv' INTO TABLE table_name CHARACTER SET latin1
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\\'
LINES TERMINATED BY '\r\n';
I am executing this command on Windows 10, with the MySQL command line. My MySQL version is 8.0.
The way I have handled this in the past is by writing a PHP script that reads the file, writes the top 50% of the rows into a new file, and then deletes those rows from the original. Then I run two LOAD DATA INFILE statements, one for the original file and one for the new file.
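The split-file approach described above might look like this once the file has been divided. This is only a sketch: the part file names are hypothetical, and it assumes the split was done on row boundaries, which needs care if the quoted text column can contain embedded line breaks.

```sql
-- Assumes filename.csv was split beforehand into two files of complete rows.
-- File names and table_name are placeholders.
LOAD DATA LOCAL INFILE 'filename_part1.csv' INTO TABLE table_name CHARACTER SET latin1
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\\'
LINES TERMINATED BY '\r\n';

LOAD DATA LOCAL INFILE 'filename_part2.csv' INTO TABLE table_name CHARACTER SET latin1
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\\'
LINES TERMINATED BY '\r\n';
```

Each statement is shorter-lived than one load of the whole file, and a failure only requires re-running the part that failed. The same pattern extends to more than two parts.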

Scheduling MySQL procedure/macro to load CSV data

I'm a beginner with MySQL, so please bear with me.
I have a .csv file and I'm loading its data into a MySQL table using the following command:
load data infile 'D:/xampp/htdocs/test/test.csv' into table offers fields terminated by ',' enclosed by '"' lines terminated by '\n' ignore 1 rows;
It inserts the data into the table successfully.
Now my question is as follows.
The test.csv file (it has a huge volume of data) is going to be updated every 24 hours. I want a stored procedure/macro (or whatever it may be) that is called every 24 hours to load the updated data into the offers table, so that the table stays in sync with the .csv file.
Steps to remember:
Truncate the offers table before inserting into it.
Load the data using the command above.
Write a success status to a separate log table (optional).
I heard that LOAD DATA is not going to work in a stored procedure (I don't know exactly). Please give me any answers/suggestions.
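The three steps above can be sketched as a plain SQL script that an operating-system scheduler (Windows Task Scheduler or cron) runs every 24 hours through the mysql command-line client, which sidesteps the stored-procedure restriction. The script name refresh_offers.sql, the log table offers_load_log, and its columns are assumptions for illustration.

```sql
-- refresh_offers.sql (hypothetical file name)
-- Intended to be run from outside MySQL, for example:
--   mysql your_database < refresh_offers.sql
-- with credentials supplied through an option file.

-- Hypothetical log table for the optional success record.
CREATE TABLE IF NOT EXISTS offers_load_log (
  id          INT AUTO_INCREMENT PRIMARY KEY,
  loaded_at   DATETIME NOT NULL,
  rows_loaded INT NOT NULL
);

-- Step 1: empty the table (TRUNCATE cannot be rolled back).
TRUNCATE TABLE offers;

-- Step 2: reload from the refreshed CSV file.
LOAD DATA INFILE 'D:/xampp/htdocs/test/test.csv'
INTO TABLE offers
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

-- Step 3: record the load time and resulting row count.
INSERT INTO offers_load_log (loaded_at, rows_loaded)
SELECT NOW(), COUNT(*) FROM offers;
```

When the client reads a script in batch mode it stops at the first error unless --force is given, so a failed load should leave no new row in the log table.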

Regular transfer of .csv to database

I am working on a program that will regularly read the data from a .csv file and import it to my database. The csv is a copy from a database on another server so the table structure will be different when I upload to the new one.
What I am unsure of is the best method to do this on a nightly basis, and hopefully automate the process. Any suggestions?
The database is MySQL on an Apache server.
Consider running a LOAD DATA INFILE query from a scheduled script written in PHP, Python, or another language to load the file into a temp table:
LOAD DATA INFILE 'filename.csv'
INTO TABLE tempname
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n';
Then run an append query that migrates the differently structured temp table data into the final table:
INSERT INTO finaltable (Col1, Col2, Col3,...)
SELECT Col1, Col2, Col3, ... FROM tempname;
The best solution, in my opinion, is to create a PHP script that manipulates the CSV data so that the file format matches the tables in your database. After that, you can set up a cron job (Linux) or scheduled task (Windows) to run the script automatically at your desired time and recurrence. Hope this helps.

Does it make any sense that I use multi processes to insert data to MySQL?

I need to insert about 300 million records into MySQL. Does it make any sense to use multiple processes to do it?
Situation 1: all 300 million records are inserted into a single table.
Situation 2: the 300 million records are inserted into multiple tables.
What are the bottlenecks in these two situations?
The data source is about 800+ txt files.
I know there's a LOAD DATA INFILE command; I just want to understand this question. :D
Since you have lots of data, consider using LOAD DATA. According to the MySQL docs, it is the fastest method of importing data from files.
LOAD DATA INFILE
The LOAD DATA INFILE statement reads rows from a text file into a table at a very high speed.
Speed of INSERT Statements
When loading a table from a text file, use LOAD DATA INFILE. This is
usually 20 times faster than using INSERT statements. See Section
13.2.6, “LOAD DATA INFILE Syntax”.
...
INSERT is still much slower for loading data than LOAD DATA INFILE, even when using the strategies just outlined.
LOAD DATA INFILE '/path/to/your/file.csv'
INTO TABLE table_name
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n' -- or '\r\n'
IGNORE 1 LINES; -- use IGNORE if you have a header line in your file
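For comparison, one of the INSERT strategies the quoted manual section appears to refer to is sending many rows per statement instead of one statement per row. A minimal sketch, with a hypothetical column list and values:

```sql
-- Hypothetical columns and values, for illustration only.
-- One statement carries several rows, which avoids per-statement overhead.
INSERT INTO table_name (col1, col2)
VALUES
  (1, 'a'),
  (2, 'b'),
  (3, 'c');
```

Per the quote above, this still tends to be slower than LOAD DATA INFILE, so it is mainly useful when the data cannot be written to a file first.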