Exporting a table from Amazon RDS into a CSV file - mysql

I have a MySQL database running in Amazon RDS, and I want to know how to export an entire table to CSV format.
I currently use MySQL Server on Windows to query the Amazon database, but when I try to run an export I get an error, probably because there is no dedicated file server for Amazon RDS. Is there a solution to this?

Presumably, you are trying to export from an Amazon RDS database via a SELECT ... INTO OUTFILE query, which runs into this commonly encountered issue; see e.g. export database to CSV. The respective AWS team response confirms your assumption that the lack of server access prevents an export like that, and suggests an alternative approach: select the data in the MySQL command line client and pipe the output through sed to reformat it as CSV, like so:
mysql -u username -p --database=dbname --host=rdshostname --port=rdsport --batch
-e "select * from yourtable"
| sed 's/\t/","/g;s/^/"/;s/$/"/;s/\n//g' > yourlocalfilename
User fpalero provides an alternative and supposedly simpler approach, if you know and specify the fields upfront:
mysql -uroot -ppassword --database=dbtest
-e "select concat(field1,',',field2,',',field3) FROM tabletest" > tabletest.csv

First of all, Steffen's answer works in most cases.
I recently encountered some larger and more complex outputs where "sed" was not enough and decided to come up with a simple utility to do exactly that.
I built a tool called sql2csv that can parse the output of the MySQL CLI:
$ mysql my_db -e "SELECT * FROM some_mysql_table"
+----+----------+-------------+---------------------+
| id | some_int | some_str    | some_date           |
+----+----------+-------------+---------------------+
|  1 |       12 | hello world | 2018-12-01 12:23:12 |
|  2 |       15 | hello       | 2018-12-05 12:18:12 |
|  3 |       18 | world       | 2018-12-08 12:17:12 |
+----+----------+-------------+---------------------+
$ mysql my_db -e "SELECT * FROM some_mysql_table" | sql2csv
id,some_int,some_str,some_date
1,12,hello world,2018-12-01 12:23:12
2,15,hello,2018-12-05 12:18:12
3,18,world,2018-12-08 12:17:12
You can also use the built-in CLI:
sql2csv -u root -p "secret" -d my_db --query "SELECT * FROM some_mysql_table;"
1,12,hello world,2018-12-01 12:23:12
2,15,hello,2018-12-05 12:18:12
3,18,world,2018-12-08 12:17:12
More information is available on sql2csv (GitHub).

Assuming MySQL on RDS, an alternative is to use batch mode, which outputs tab-separated values and escapes newlines, tabs and other special characters. I haven't yet come across a CSV import tool that can't handle tab-separated data. So, for example:
$ mysql -h myhost.rds.amazonaws.com -u user -D my_database -p --batch --quick -e "SELECT * FROM my_table" > output.csv
As noted by Halfgaar, the --quick option flushes immediately, so it avoids out-of-memory errors for large tables. To quote strings (recommended), you'll need to do a bit of extra work in your query:
SELECT id, CONCAT('"', REPLACE(text_column, '"', '""'), '"'), float_column
FROM my_table
The REPLACE escapes any double-quote characters in the text_column values. I would also suggest using ISO 8601 strings for datetime fields, so:
SELECT CONCAT('"', DATE_FORMAT(datetime_column, '%Y%m%dT%T'), '"') FROM my_table
Be aware that CONCAT returns NULL if you have a NULL column value.
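If NULLs are possible, one way to keep the quoting without losing values to a NULL CONCAT result is to wrap the expression in IFNULL. A minimal sketch, reusing the hypothetical columns and connection details from above:
# IFNULL substitutes an empty quoted field ("") when text_column is NULL
mysql -h myhost.rds.amazonaws.com -u user -D my_database -p --batch --quick \
  -e "SELECT id, IFNULL(CONCAT('\"', REPLACE(text_column, '\"', '\"\"'), '\"'), '\"\"') AS text_column, float_column FROM my_table" \
  > output.csv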
I've run this on some fairly large tables with reasonable performance. 600M rows and 23 GB data took ~30 minutes when running the MySQL command in the same VPC as the RDS instance.

There is a newer way from AWS to do this: use DMS (Database Migration Service).
Here is the documentation on how to export table(s) to files on S3 storage: Using Amazon S3 as a target for AWS Database Migration Service - AWS Database Migration Service.
You can export in two formats: CSV or Parquet.
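On the AWS CLI side, the S3 target is just a DMS endpoint. A rough sketch of creating one is below; the identifier, role ARN and bucket name are placeholders, and the exact --s3-settings keys should be double-checked against the DMS documentation:
# create an S3 target endpoint for DMS; set DataFormat to "parquet" instead of "csv" if preferred
aws dms create-endpoint \
  --endpoint-identifier my-s3-target \
  --endpoint-type target \
  --engine-name s3 \
  --s3-settings '{"ServiceAccessRoleArn":"arn:aws:iam::123456789012:role/dms-s3-role","BucketName":"my-export-bucket","CsvDelimiter":",","DataFormat":"csv"}'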

I'm using the Yii framework on EC2, connecting to RDS MySQL. The key is to use fputcsv(). The following works perfectly, both on my localhost and in production.
$file = 'path/to/filename.csv';
$export_csv = "SELECT * FROM table";
$qry = Yii::app()->db->createCommand($export_csv)->queryAll();

$fh = fopen($file, "w+");
foreach ($qry as $row) {
    fputcsv($fh, $row, ',', '"');
}
fclose($fh);

If you use Steffen Opel's solution, you'll notice that it generates a header that includes the 'concat' string literal. Obviously this is not what you want. Most likely you will want the corresponding headers of your data.
This query works as is, apart from substituting your own column and table names:
mysql -h xxx.xxx.us-east-2.rds.amazonaws.com
--database=mydb -u admin -p
-e "SELECT 'column1','column2'
UNION ALL SELECT column1,column2
FROM table_name WHERE condition = value" > dataset.csv
I just opened the results in the Numbers OS X app and the output looks perfect.

With a very large table (~500m rows), even with --quick, nothing was being written to my export file and the process never finished (+6 hours). I wrote the following bash script to get around this. Another bonus is you have an indication of progress as each batch file gets written.
This solution works well as long as you have a sequential column of some kind, e.g. an auto incrementing integer PK or a date column. Make sure you have your date column indexed if you have a lot of data!
#!/bin/bash
# Maximum number of rows to export/total rows in table, set a bit higher if live data is being written
MAX=500000000
# Size of each export batch
STEP=1000000
for (( c=0; c<=$MAX; c=c+$STEP ))
do
  mysql --port 3306 --protocol=TCP -h <rdshostname> -u <username> -p<password> --quick --database=<db> -e "select column1, column2, column3 from <table> order by <timestamp> ASC limit $STEP offset $c" | sed 's/\t/","/g;s/^/"/;s/$/"/;s/\n//g' > export$c.csv
done
A slightly different approach, which may be faster depending on the indexing you have in place (and avoids increasingly expensive OFFSET scans), is to step through the data month by month:
#!/bin/bash
START_YEAR=2000
END_YEAR=2022
for (( YEAR=$START_YEAR; YEAR<=$END_YEAR; YEAR++ ))
do
  for (( MONTH=1; MONTH<=12; MONTH++ ))
  do
    NEXT_MONTH=1
    let NEXT_YEAR=$YEAR+1
    if [ $MONTH -lt 12 ]
    then
      let NEXT_MONTH=$MONTH+1
      NEXT_YEAR=$YEAR
    fi
    mysql --port 3306 --protocol=TCP -h <rdshostname> -u app -p<password> --quick --database=<database> -e "select column1, column2, column3 from <table> where <dateColumn> >= '$YEAR-$MONTH-01 00:00:00' and <dateColumn> < '$NEXT_YEAR-$NEXT_MONTH-01 00:00:00' order by <dateColumn> ASC" | sed 's/\t/","/g;s/^/"/;s/$/"/;s/\n//g' > export-$YEAR-$MONTH-to-$NEXT_YEAR-$NEXT_MONTH.csv
  done
done
Hopefully this helps someone

Related

Exporting data from mysql RDS for import into questDb

I've got a big table (~500m rows) in mysql RDS and I need to export specific columns from it to csv, to enable import into questDb.
Normally I'd use into outfile but this isn't supported on RDS as there is no access to the file system.
I've tried using workbench to do the export but due to size of the table, I keep getting out-of-memory issues.
Finally figured it out with help from this: Exporting a table from Amazon RDS into a CSV file
This solution works well as long as you have a sequential column of some kind, e.g. an auto incrementing integer PK or a date column. Make sure you have your date column indexed if you have a lot of data!
#!/bin/bash
# Maximum number of rows to export/total rows in table, set a bit higher if live data is being written
MAX=500000000
# Size of each export batch
STEP=1000000
mkdir -p parts
for (( c=0; c<=$MAX; c=c+$STEP ))
do
  FILE_NAME="export$c"
  mysql --port 3306 --protocol=TCP -h <rdshostname> -u <username> -p<password> --quick --database=<db> -e "select column1, column2, column3 from <table> order by <timestamp> ASC limit $STEP offset $c" | sed 's/\t/","/g;s/^/"/;s/$/"/;s/\n//g' > $FILE_NAME.csv
  # split down into chunks under questdb's 65k line limit
  split -d -l 64999 --additional-suffix=.csv $FILE_NAME.csv ./parts/$FILE_NAME
done
# print out import statements to a file
for i in $(ls -v ./parts); do echo "COPY reading from '$i';" >> import.sql; done;
A slightly different approach, which may be faster depending on the indexing you have in place, is to step through the data month by month:
#!/bin/bash
START_YEAR=2020
END_YEAR=2022
mkdir -p parts
for (( YEAR=$START_YEAR; YEAR<=$END_YEAR; YEAR++ ))
do
  for (( MONTH=1; MONTH<=12; MONTH++ ))
  do
    NEXT_MONTH=1
    let NEXT_YEAR=$YEAR+1
    if [ $MONTH -lt 12 ]
    then
      let NEXT_MONTH=$MONTH+1
      NEXT_YEAR=$YEAR
    fi
    FILE_NAME="export-$YEAR-$MONTH-to-$NEXT_YEAR-$NEXT_MONTH"
    mysql --port 3306 --protocol=TCP -h <rdshost> -u app -p<password> --quick --database=<database> -e "select <column1>, <column2>, round(UNIX_TIMESTAMP(<dateColumn>)) * 1000000 as date from <table> where <table>.<dateColumn> >= '$YEAR-$MONTH-01 00:00:00' and <table>.<dateColumn> < '$NEXT_YEAR-$NEXT_MONTH-01 00:00:00' order by <table>.<dateColumn> ASC" | sed 's/\t/","/g;s/^/"/;s/$/"/;s/\n//g' > $FILE_NAME.csv
    # split down into chunks under questdb's 65k line limit
    split -d -l 64999 --additional-suffix=.csv $FILE_NAME.csv ./parts/$FILE_NAME
  done
done
# print out import statements to a file
for i in $(ls -v ./parts); do echo "COPY reading from '$i';" >> import.sql; done;
The above scripts will output an import.sql containing all the SQL statements you need to import your data. See: https://questdb.io/docs/guides/importing-data/
You could try using mysqldump with extra params for CSV conversion. AWS documents how to use mysqldump with RDS, and you can see at this Stack Overflow question how to use extra params to convert the dump into CSV.
I am quoting here the relevant part from that last link (since there are a lot of answers and comments):
mysqldump <DBNAME> <TABLENAME> --fields-terminated-by ',' \
--fields-enclosed-by '"' --fields-escaped-by '\' \
--no-create-info --tab /var/lib/mysql-files/
Edit: this solution would only work if exporting the whole table, not when exporting specific columns.
You can use the SELECT ... INTO OUTFILE syntax to export the data to a file on the server.
You can then use the mysql command line client to connect to the RDS instance and retrieve the file from the server.
The only slight snag is that mysql won't connect to the RDS instance unless the instance is in a VPC, so if it isn't you'll need to connect to a bastion host first, then connect to the RDS instance from there.
SELECT * FROM mydb.mytable INTO OUTFILE '/tmp/mytable.csv' FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n';
You can then get the file from the server:
mysql -uusername -p -hmyrds.rds.amazonaws.com -P3306
When you have a prompt from the mysql command line client you can retrieve the file using the SELECT command:
SELECT LOAD_FILE('/tmp/mytable.csv');
You can then pipe the output to a file using:
SELECT LOAD_FILE('/tmp/mytable.csv') INTO OUTFILE '/tmp/mytable_out.csv';
You can then use the mysql command line client to connect to your questDB instance and load the data.
If you want to retrieve a specific column then you can specify the column name in the SELECT command when creating the file on the RDS server:
SELECT column1, column2, column3 FROM mydb.mytable INTO OUTFILE '/tmp/mytable.csv' FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n';

How to import or outfile data to another server using Linux and MariaDB

I need to set up a single table in a MariaDB database on a Linux server that gathers data (concatenating the same type of data into one table) from various other Linux MariaDB database servers. I can't get the data across the servers.
I am logged onto server A and connect to server B with -hB --port=3306 -u -p. I run my code and it runs perfectly, giving me exactly the data I need; the only thing is that the CSV file is stored on server B, where I am reading the data from, and I want the CSV file to be stored on server A.
I have used 'into outfile', and I then plan to use 'mysqlimport' to load all my files from servers B, C & D into a database on server A.
Perhaps I should use mysqldump rather?
My colleague achieves these results using BCPOUT.
mysql -hB --port=3306 -u -p < $SCRIPTPATH/mysqlcode.sql
SELECT *
FROM Database.Table
WHERE DATE(DateCreated) = CURDATE()
INTO OUTFILE '/data/file.csv' FIELDS TERMINATED BY ',';
I need to get a subset of data from numerous Linux MariaDB servers onto one Linux MariaDB server, where I can import the various subsets of data into a single database.
You can do it in the following two ways:
mysql -u root -ptest -h hostname --batch -e "select * from db.table where date = now()" | sed 's/\t/","/g;s/^/"/;s/$/"/;s/\n//g' > file_name.csv
OR
mysqldump -u root -ppwd dbname --tab='/home/user/Documents/db/' --tables stats --no-create-info --where='dates = "2017-12-31"'
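If the point is to land the file on server A rather than on server B, a third option is to run the client on server A, write the output locally, and then load it with LOAD DATA LOCAL INFILE. A minimal sketch (host, table and column names are placeholders, and local_infile may need to be enabled on both client and server):
# run on server A: pull today's rows from server B into a local tab-separated file
mysql -hB --port=3306 -u user -p --batch --skip-column-names \
  -e "SELECT * FROM Database.Table WHERE DATE(DateCreated) = CURDATE()" > /data/file.csv
# still on server A: load the local file into the consolidated table
mysql -u user -p --local-infile=1 Database \
  -e "LOAD DATA LOCAL INFILE '/data/file.csv' INTO TABLE ConsolidatedTable FIELDS TERMINATED BY '\t'"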

Oneliner pipe to PostgreSQL from MySQL with bash

So I am piping quite a lot of data using bash every day between 3 servers:
Server A is MySQL (connection over SSH).
Server B is just a CentOS server where I run the bash script.
Server C is PostgreSQL 9.6.
All was good until one table got one row with a double quote in the middle of a varchar. This is breaking my pipe at the insertion level (on pg side).
Indeed, when getting the data this way from MySQL, it is not quoted. So, I believe in the end it's because of the basic behaviour of COPY and its QUOTE parameter.
Here is the bash code:
ssh -o ConnectTimeout=5 -i "$SSH_KEY" "$SSH_USER"@"$SSH_IP" 'mysql -h "$MYHOST" -u "$USER" -p"$PWD" prod -e "SELECT * FROM tableA"' | \
psql -h "$DWH_IP" "$PG_DB" -c "COPY tableA FROM stdin WITH CSV HEADER DELIMITER E'\t' NULL AS 'NULL';"
I tried playing with the COPY parameter QUOTE but unsuccessfully.
Should I put some sed in the middle of the pipeline?
I also tried double quoting when getting the data out of mysql but could not find the relevant parameter when mysql is used in a pipe like this.
I'd like to keep things in one pipe (no MySQL->CSV then CSV->PG, please).
Thanks!
Here's a working sample of importing CSV to Postgres:
t=# create table so10 (i int,t text);
CREATE TABLE
t=# \q
postgres@vao-VirtualBox:~$ echo "1,Bro" | psql -d t -c "copy so10 from stdin with csv"
COPY 1
postgres@vao-VirtualBox:~$ psql t -c "select * from so10"
 i |  t
---+-----
 1 | Bro
(1 row)
You can open an SSH tunnel to MySQL and run mysql -h "$MYHOST" -u "$USER" -p"$PWD" prod -e "SELECT * FROM tableA" locally (instead of the echo in my example).
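If you want to stay in a single pipe, one option is to double any embedded double quotes and turn the tab-separated --batch output into quoted CSV before it reaches COPY. A sketch along the lines of the asker's command, assuming no values contain embedded tabs or newlines:
# escape embedded double quotes, then wrap every tab-separated field in quotes;
# note that NULLs arrive on the Postgres side as the literal string NULL
ssh -o ConnectTimeout=5 -i "$SSH_KEY" "$SSH_USER"@"$SSH_IP" \
  'mysql -h "$MYHOST" -u "$USER" -p"$PWD" prod --batch -e "SELECT * FROM tableA"' \
  | sed 's/"/""/g; s/\t/","/g; s/^/"/; s/$/"/' \
  | psql -h "$DWH_IP" "$PG_DB" -c "COPY tableA FROM stdin WITH CSV HEADER;"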

HOWTO read mysql SELECT into bash variables, then use those variables for INSERT INTO a different table

I am fairly new to this, so please be patient. My understanding of bash, however, is that I can run
mysql --host=hostname --user=username --password=password -e "SELECT * FROM database.table;"
but I have less than no idea from other manuals how to get the results into actual bash variables. Someone mentioned using
read a b c
do while
echo "..${a}..${b}..${c}.."
but I fail to see how that will read them into the variables?
Also, on reading the variables back in, I will be doing something like
#>WGET $a
then logging in to mysql again and doing something like
LOAD DATA INFILE data.csv INTO thattable ON DUPLICATE UPDATE
I also want to do something like
INSERT INTO thattable WHERE (I just loaded the info) date = today
but because there will be multiple dates, how do I do this? And yes, this all needs to be bash-able; PHP is too slow, and I want to avoid C unless it's the only way.
Thanks, I know this is a lot!
-AW
Option 1
Use "select ... \G", store result in a tmp file and grep for the columns.
By example:
mytmp=$(mktemp /tmp/mytemp.XXXXXX)
mysql --host=hostname --user=username --password=password -e "SELECT * FROM database.table \G;" > $mytmp
column_foo=$( fgrep COLUMN_FOO $mytmp | cut -d ':' -f2-)
column_bar=$( fgrep COLUMN_BAR $mytmp | cut -d ':' -f2-)
echo $column_foo
echo $column_bar
Option 2
If the number of columns is high, store all of them in an associative array:
mytmp=$(mktemp /tmp/mytemp.XXXXXX)
mysql --host=hostname --user=username --password=password -e "SELECT * FROM database.table \G;" | xargs -I{} echo {} > $mytmp
declare -A a
while IFS=':' read k s; do a[$k]=$s; done < $mytmp
echo ${a[COLUMN_FOO]}
echo ${a[COLUMN_BAR]}
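For completeness, the read a b c loop from the question can also be made to work directly, since --batch prints tab-separated rows and -N drops the header line. A minimal sketch with placeholder column and table names:
# read each tab-separated row into the variables a, b and c
while IFS=$'\t' read -r a b c; do
    echo "..${a}..${b}..${c}.."
done < <(mysql --host=hostname --user=username --password=password -N --batch \
    -e "SELECT col1, col2, col3 FROM database.table")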

saving blob field to disk from bash

I have a mysql database with a blob field containing a zip and I need to save it as a file on disk, from bash. I'm doing the following but the end result doesn't read as a zip... Am I doing something wrong or is the file stored not actually a zip (the entry in the database is actually created by a seismological station, so I have no control over it)?
echo "USE database; SELECT blobcolumn FROM table LIMIT 1" | mysql -u root > file.zip
I then open file.zip with a file editor and remove the first line, which contains the column header, but 'unzip' still doesn't recognize it as a zip file.
For a gzipped blob you can use:
echo "use db; select blob from table where id=blah" | mysql -N --raw -uuser -ppass > mysql.gz
I have not tried this with a zip file.
The proper way to do this would be to use DUMPFILE, otherwise mysql will mess up your data.
mysql -uroot -e "SELECT blobcolumn INTO DUMPFILE '/tmp/file.zip' FROM table LIMIT 1" database
I know this is an old question, but I needed the answer myself, so this is what worked for me.
I found that mysql appends a newline character at the end, which needs to be removed before the correct binary value remains.
echo "USE database; SELECT blobcolumn FROM table LIMIT 1" | mysql -N --raw -u root | head -c -1 > file.zip
You would need to skip the column header, like:
sql="USE database; SELECT blobcolumn FROM table LIMIT 1"
mysql -u root -N <<< "$sql" > file.zip