I am trying to get all the data from a very large table (around 13 million rows) on a remote host into a text file. I have tried the following command, but after some time the process gets killed and prints "Killed." in the console.
mysql --user=username --password -h host -e "select * from db.table_name" >> output_file.txt
My primary goal is to copy data from MySQL to Redshift, which I am doing by dumping all the data comma-delimited into a text file, uploading it to S3, and running a COPY query on Redshift.
P.S. For small tables the above command works fine, but not for large tables.
You could try mysqldump instead. It can be parameterized to output CSV, but note that the --fields-* and --lines-* options only take effect together with --tab, which writes tablename.txt (the data) into the given directory on the machine running the MySQL server. I haven't tried this myself, so you might want to check the docs, but something like this should work:
mysqldump --user=username --password -h host \
--fields-terminated-by="," --fields-enclosed-by="\"" --lines-terminated-by="\n" \
--tab=/tmp dbname tablename
If that does not work, you could try SELECT ... INTO OUTFILE. You will need to run it directly on the MySQL host, like this:
SELECT * INTO OUTFILE '/tmp/data.csv'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
ESCAPED BY '\\' LINES TERMINATED BY '\n'
FROM db.table_name;
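As a side note, the "Killed." message usually means the mysql client ran out of memory while buffering the whole result set; adding --quick to the original command makes it stream rows instead of caching them. And since the stated goal is MySQL to S3 to Redshift, a rough, untested sketch of the remaining steps might look like this (the bucket, cluster endpoint, table and IAM role names are all placeholders):
# upload the delimited export to S3
aws s3 cp output_file.txt s3://my-bucket/exports/table_name.csv
# load it into Redshift; psql can connect to a Redshift cluster directly
psql -h mycluster.abc123.us-east-1.redshift.amazonaws.com -p 5439 -U awsuser -d mydb -c "COPY myschema.table_name FROM 's3://my-bucket/exports/table_name.csv' IAM_ROLE 'arn:aws:iam::123456789012:role/myRedshiftRole' CSV;"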
Related
I've got a big table (~500m rows) in MySQL RDS and I need to export specific columns from it to CSV, to enable import into QuestDB.
Normally I'd use into outfile but this isn't supported on RDS as there is no access to the file system.
I've tried using workbench to do the export but due to size of the table, I keep getting out-of-memory issues.
Finally figured it out with help from this: Exporting a table from Amazon RDS into a CSV file
This solution works well as long as you have a sequential column of some kind, e.g. an auto incrementing integer PK or a date column. Make sure you have your date column indexed if you have a lot of data!
#!/bin/bash
# Maximum number of rows to export (total rows in the table); set a bit higher if live data is still being written
MAX=500000000
# Size of each export batch
STEP=1000000
mkdir -p parts
for (( c=0; c<= $MAX; c = c + $STEP ))
do
FILE_NAME="export$c"
mysql --port 3306 --protocol=TCP -h <rdshostname> -u <username> -p<password> --quick --database=<db> -e "select column1, column2, column3 from <table> order by <timestamp> ASC limit $STEP offset $c" | sed 's/\t/","/g;s/^/"/;s/$/"/;s/\n//g' > $FILE_NAME.csv
# split into chunks under QuestDB's 65k line limit
split -d -l 64999 --additional-suffix=.csv $FILE_NAME.csv ./parts/$FILE_NAME
done
# print out import statements to a file
for i in $(ls -v ./parts); do echo "COPY reading from '$i';" >> import.sql; done;
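If the table has an auto-incrementing integer PK, a variant I would expect to be faster for deep batches is to range over the key instead of using OFFSET (which rescans all the skipped rows on every batch). A rough sketch, with the id column name, credentials and column list as placeholders:
#!/bin/bash
# batch on an indexed auto-increment "id" column instead of LIMIT/OFFSET
STEP=1000000
MAX_ID=$(mysql --port 3306 --protocol=TCP -h <rdshostname> -u <username> -p<password> --skip-column-names -e "select max(id) from <db>.<table>")
mkdir -p parts
for (( START=0; START<=$MAX_ID; START=START+$STEP ))
do
FILE_NAME="export$START"
END=$(( START + STEP ))
mysql --port 3306 --protocol=TCP -h <rdshostname> -u <username> -p<password> --quick --database=<db> -e "select column1, column2, column3 from <table> where id >= $START and id < $END order by id ASC" | sed 's/\t/","/g;s/^/"/;s/$/"/' > $FILE_NAME.csv
# split into chunks under QuestDB's 65k line limit
split -d -l 64999 --additional-suffix=.csv $FILE_NAME.csv ./parts/$FILE_NAME
done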
A slightly different approach, which may be faster depending on the indexing you have in place, is to step through the data month by month:
#!/bin/bash
START_YEAR=2020
END_YEAR=2022
mkdir -p parts
for (( YEAR=$START_YEAR; YEAR<=$END_YEAR; YEAR++ ))
do
for (( MONTH=1; MONTH<=12; MONTH++ ))
do
NEXT_MONTH=1
let NEXT_YEAR=$YEAR+1
if [ $MONTH -lt 12 ]
then
let NEXT_MONTH=$MONTH+1
NEXT_YEAR=$YEAR
fi
FILE_NAME="export-$YEAR-$MONTH-to-$NEXT_YEAR-$NEXT_MONTH"
mysql --port 3306 --protocol=TCP -h <rdshost> -u app -p<password> --quick --database=<database> -e "select <column1>, <column2>, round(UNIX_TIMESTAMP(<dateColumn>)) * 1000000 as date from <table> where <table>.<dateColumn> >= '$YEAR-$MONTH-01 00:00:00' and <table>.<dateColumn> < '$NEXT_YEAR-$NEXT_MONTH-01 00:00:00' order by <table>.<dateColumn> ASC" | sed 's/\t/","/g;s/^/"/;s/$/"/;s/\n//g' > $FILE_NAME.csv
# split into chunks under QuestDB's 65k line limit
split -d -l 64999 --additional-suffix=.csv $FILE_NAME.csv ./parts/$FILE_NAME
done
done
# print out import statements to a file
for i in $(ls -v ./parts); do echo "COPY reading from '$i';" >> import.sql; done;
The above scripts will output an import.sql file containing all the SQL statements you need to import your data. See: https://questdb.io/docs/guides/importing-data/
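If running the generated COPY statements turns out to be awkward, QuestDB also documents a REST endpoint for CSV import that can take the chunks directly; a hedged sketch, assuming a QuestDB instance on localhost with its HTTP server on the default port 9000, and with the table name derived from the file name (check the linked guide for the details):
# POST each CSV chunk to QuestDB's /imp import endpoint
for f in $(ls -v ./parts); do curl -F data=@./parts/$f http://localhost:9000/imp; done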
Edit: this solution would only work when exporting the whole table, not when exporting specific columns
You could try using mysqldump with extra parameters for CSV conversion. AWS documents how to use mysqldump with RDS, and this stackoverflow question shows which extra parameters convert the output into CSV.
I am quoting the relevant part from that last link here (since there are a lot of answers and comments):
mysqldump <DBNAME> <TABLENAME> --fields-terminated-by ',' \
--fields-enclosed-by '"' --fields-escaped-by '\' \
--no-create-info --tab /var/lib/mysql-files/
You can use the SELECT ... INTO OUTFILE syntax to export the data to a file on the server.
You can then use the mysql command line client to connect to the RDS instance and retrieve the file from the server.
The only slight snag is that mysql won't connect to the RDS instance unless the instance is in a VPC, so if it isn't you'll need to connect to a bastion host first, then connect to the RDS instance from there.
SELECT * FROM mydb.mytable INTO OUTFILE '/tmp/mytable.csv' FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n';
You can then get the file from the server:
mysql -uusername -p -hmyrds.rds.amazonaws.com -P3306
When you have a prompt from the mysql command line client you can retrieve the file using the SELECT command:
SELECT LOAD_FILE('/tmp/mytable.csv');
You can then pipe the output to a file using:
SELECT LOAD_FILE('/tmp/mytable.csv') INTO OUTFILE '/tmp/mytable_out.csv';
You can then use the mysql command line client to connect to your questDB instance and load the data.
If you want to retrieve a specific column then you can specify the column name in the SELECT command when creating the file on the RDS server:
SELECT column1, column2, column3 FROM mydb.mytable INTO OUTFILE '/tmp/mytable.csv' FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n';
I'm trying to configure a cron job to automatically export a data table from a MySQL DB to CSV/XLS every day, and I'm stuck on the command field.
Which command should I use?
Thank you.
This is an old topic, but for posterity, what I've found works in this case is to simply write the SQL in a separate file and then source that file into mysql. That way, you don't have to worry about quotes, escapes, etc.
So, for example, I would have a SQL file like this, called exportinventory.sql:
SELECT *
FROM inventory INTO OUTFILE '/tmp/inventory.csv'
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n';
And then, from the command line:
mysql -u user -p database < exportinventory.sql
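Since the original question was about cron, the matching crontab entry might look like the line below (the schedule and paths are assumptions, and the credentials would need to come from somewhere non-interactive such as ~/.my.cnf, because the -p password prompt will not work under cron). Also note that INTO OUTFILE refuses to overwrite an existing file, so the previous export has to be removed or renamed first:
# crontab -e entry: run the export daily at 02:00
0 2 * * * rm -f /tmp/inventory.csv && mysql database < /home/user/exportinventory.sql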
There are three steps to accomplish this:
write a program that exports the data the way you want (PHP, Java, ...)
write a bash script that sets up the environment properly and calls this program
add this script to a user's crontab
You might find information on all three over here.
You can try something like:
mysql -h $DB_HOST -u $DB_USER --password=$DB_PASSWORD $DB_NAME -e "$YOUR_QUERY INTO OUTFILE '$PATH_TO_OUTPUT_FILE' FIELDS TERMINATED BY ',' ENCLOSED BY '\"' LINES TERMINATED BY '\n'"
I'm using a bash script to pull data from online sources. Right now I just have it writing to a text file, but it would be better if the script could automatically put this data into mysql tables. How can this be done? Examples would be helpful.
Suppose you download a .csv file that has a header row, and you have a database named test in MySQL.
Download the file first.
wget http://domain.com/data.csv -O data.csv
Load the data into the MySQL table tbl:
cat <<'FINISH' | mysql -uUSERNAME -pPASSWORD test
LOAD DATA INFILE 'data.csv' INTO TABLE `tbl`
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES;
FINISH
Here USERNAME must have FILE privilege.
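If the script does not run on the database host itself, a variant of the same idea is LOAD DATA LOCAL INFILE, which reads the file on the client side instead; it needs local_infile enabled on both client and server, but it does not need the FILE privilege:
# same load, but reading data.csv from the machine the script runs on
cat <<'FINISH' | mysql --local-infile=1 -uUSERNAME -pPASSWORD test
LOAD DATA LOCAL INFILE 'data.csv' INTO TABLE tbl
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES;
FINISH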
You can use bash like this:
#!/bin/bash
params="-uuser -ppasswd dbname"
echo "SELECT * FROM table;" | mysql $params
# or
mysql $params <<DELIMITER
SELECT * FROM table;
DELIMITER
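If the output is going to be parsed by the rest of the script, two flags are worth adding: --batch prints tab-separated rows without the ASCII table borders, and --skip-column-names drops the header row (the table and column names below are just placeholders):
# machine-friendly output: tab-separated, no borders, no header row
echo "SELECT id, name FROM mytable;" | mysql --batch --skip-column-names $params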
I am trying to output the result of an SQL query to a CSV file, but I can't manage to add the two options that would set it up properly. Here is the part of the command that works well:
mysql --host=localhost --user=root --password=pass --quick -e 'SELECT * FROM DB.TABLE' > '/stupidpath withaspace/stuff/myrep/export.csv'
I would like to add these two options to the command, but there is something about the quoting I don't get:
FIELDS TERMINATED BY ','
and
ENCLOSED BY '"'
How can I integrate this?
Probably the easiest way is to put your exporting SQL in a separate file and then feed that into mysql. The SQL file, exporter.sql, would look like this:
SELECT * INTO OUTFILE '/stupidpath withaspace/stuff/myrep/export.csv'
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
FROM DB.TABLE;
And then run it with:
mysql --host=localhost --user=root --password=pass --quick < exporter.sql
Putting the SQL in a separate file avoids the usual escaping and quoting problems of trying to send quotes into something from the shell.
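If you prefer to keep it as a one-liner, the quoting can be handled by switching to double quotes for the -e argument and escaping the inner double quote; a sketch based on the command from the question (remember that INTO OUTFILE writes the file from the server process, so the target directory must be writable by it):
mysql --host=localhost --user=root --password=pass --quick -e "SELECT * FROM DB.TABLE INTO OUTFILE '/stupidpath withaspace/stuff/myrep/export.csv' FIELDS TERMINATED BY ',' ENCLOSED BY '\"'"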
How do I store a MySQL query result into a local CSV file? I don't have access to the remote machine.
You're going to have to use the command line execution '-e' flag.
$> /usr/local/mysql/bin/mysql -u user -p -h remote.example.com -e "select t1.a,t1.b from db_schema.table1 t1 limit 10;" > hello.txt
This will generate a local hello.txt file in your current working directory with the output from the query.
Use MySQL's CONCAT function:
mysql -u <username> -p -h <hostname> -e "select concat(userid,',',surname,',',firstname,',',midname) as user from dbname.tablename;" > user.csv
You can delete the first line which contains the column name "user".
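Alternatively, the header row can be suppressed at the source with mysql's -N (--skip-column-names) option, so there is nothing to delete afterwards:
mysql -N -u <username> -p -h <hostname> -e "select concat(userid,',',surname,',',firstname,',',midname) from dbname.tablename;" > user.csv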
We can use the command-line execution flag -e together with a simple Python script to generate the result in CSV format.
Create a Python file (let's call it tab2csv) and put the following code in it:
#!/usr/bin/env python
import csv
import sys
tab_in = csv.reader(sys.stdin, dialect=csv.excel_tab)
comma_out = csv.writer(sys.stdout, dialect=csv.excel)
for row in tab_in:
    comma_out.writerow(row)
Run the following command, updating the MySQL credentials as appropriate:
mysql -u orch -p -h database_ip -e "select * from database_name.table_name limit 10;" | python tab2csv > outfile.csv
Result will be stored in outfile.csv.
I haven't had a chance to test it against content with difficult characters yet, but the fantastic mycli may be a solution for many.
Command line:
mycli --csv -e "select * from table;" mysql://user@host:port/db > file.csv
Interactive mode:
\n \T csv ; \o ~/file.csv ; select * from table1; \P
\n disables the pager, which otherwise requires pressing space to display each page
\T csv ; - sets the output format to csv
\o <filename> ; - appends next output to a file
<query> ;
\P turns the pager back on
I was facing this problem and spent some time reading around for a solution: importing into Excel, importing into Access, saving as a text file...
I think the best solution for Windows is the following:
use the INSERT ... SELECT command to create a "result" table. The ideal scenario would be for the fields of this result table to be created automatically, but that is not possible in MySQL
create an ODBC connection to the database
use access or excel to extract the data and then save or process in the way you want
For Unix/Linux I think the best solution might be to use the -e option tmarthal mentioned above and process the output through something like awk to get a proper format (CSV, XML, whatever).
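For example, a minimal awk pass over the tab-separated output of mysql -e might look like the line below (the table and column names are made up, and it assumes the values contain no embedded tabs, quotes or newlines):
mysql -u user -p -h remote.example.com -e "select id, name from db_schema.table1" | awk 'BEGIN { FS="\t"; OFS="," } { for (i = 1; i <= NF; i++) $i = "\"" $i "\""; print }' > table1.csv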
Run the MySQL query below from your app to generate the CSV:
SELECT order_id,product_name,qty FROM orders INTO OUTFILE '/tmp/orders.csv'
FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n'
It will create the CSV file in the /tmp directory of the machine running the MySQL server.
Then you can add logic to send the file back to the client, for example via download headers.
Make sure the MySQL user has the FILE privilege, which the server requires before it will write the file.
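Note also that many MySQL servers restrict, or disable, where INTO OUTFILE may write via the secure_file_priv setting, so it is worth checking that first:
-- an empty value means no restriction, NULL means server-side export is disabled
SHOW VARIABLES LIKE 'secure_file_priv';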
You can try doing:
SELECT a,b,c
FROM table_name
INTO OUTFILE '/tmp/file.csv'
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n';
We use the INTO OUTFILE clause here to store the output in a CSV file. We enclose the fields in double quotes to handle field values that contain commas, separate the fields with commas, and separate individual lines with newlines.