Need to add a delimiter in MySQL output from SHELL - mysql

First off, because my MySQL user doesn't have FILE rights on the server, I have to use the line below to pipe my SELECT output to a file from the shell, instead of doing it directly in MySQL with INTO OUTFILE and FIELDS TERMINATED BY '|', which I'm guessing would solve all my problems.
So I have the following line to grab my fields:
echo "select id, UNIX_TIMESTAMP(time), company from database.table_name" | mysql -h database.mysql.host.com -u username -ppassword user > /root/sql/output.txt
This outputs the following 3 columns:
63 1414574321 person one
50 1225271921 Another person
8 1225271921 Company with many names
10 1414574567 Person with Company
I then use that data in other scripts to do some tasks.
My issue is that some columns, of which the third here, 'company', is an example, have spaces in their data, which throws off my WHILE loops later.
I would like to add a delimiter to my output so it looks like this instead:
63|1414574321|person one
50|1225271921|Another person
8|1225271921|Company with many names
10|1414574567|Person with Company
and that way I could hopefully manipulate the data in blocks using awk -F'|' and IFS='|' later.
There are many, many more columns with variable lengths and numbers of words per column to be added once I get this working, so I cannot use a method that relies on position to add the delimiter.
I feel the delimiter needs to be set when the data is dumped in the first place.
I've tried things like:
echo "select (id, + '|' + UNIX_TIMESTAMP(time), + '|' + company) from database.table_name" | mysql -h database.mysql.host.com -u username -ppassword user > /root/sql/output.txt
without any luck; it just adds the characters to the header of the output file.
Does anyone out there see a solution to what I could do?
In case anyone wonders, I'm dumping data from 2 databases, comparing timestamps and writing back the latest data to both databases.

You could use the CONCAT_WS function to receive one concatenated string per row:
select concat_ws( '|', id, UNIX_TIMESTAMP(time) , company ) from database.table_name
Edit: Missing comma added, sorry!
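Putting that together with the pipeline from the question, a minimal sketch (connection details copied from the question; the -N flag, i.e. --skip-column-names, suppresses the header row that was polluting the earlier attempt):

echo "select concat_ws('|', id, UNIX_TIMESTAMP(time), company) from database.table_name" \
  | mysql -N -h database.mysql.host.com -u username -ppassword user > /root/sql/output.txt

# consume the pipe-delimited file later, splitting on '|'
while IFS='|' read -r id ts company; do
    echo "id=$id time=$ts company=$company"
done < /root/sql/output.txt

One caveat: this assumes '|' never occurs inside the column data itself; if it can, pick a separator in CONCAT_WS that cannot appear in the data.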

Related

Loading multiple csv files to MariaDB/MySQL through bash script

After trying for a full day, I'm hoping someone here can help me make the script below work. I've combined information from multiple threads (example) and websites, but I can't get it to work.
What I'm trying to do:
I'm trying to get a MariaDB10 database called 'stock_db' on my Synology NAS to load all *.csv files from a specific folder (where I save downloaded historical prices of stocks) and add them to a table called 'prices'. The files are all named following the pattern "price_history_'isin'.csv".
Below SQL statement works when running it individually from HeidiSQL on my Windows machine:
Working SQL
LOAD DATA LOW_PRIORITY LOCAL INFILE 'D:\\Downloads\\price_history_NL0010366407.csv'
IGNORE INTO TABLE `stock_db`.`prices`
CHARACTER SET utf8
FIELDS TERMINATED BY ';'
OPTIONALLY ENCLOSED BY '"'
ESCAPED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 2 LINES
(@vdate, @vprice)
SET
isin = 'NL0010366407',
date = STR_TO_DATE(@vdate, '%d-%m-%Y'),
price = @vprice
;
The issue
Unfortunately, when I try to batch-load all CSVs from a folder on my NAS through the script below, I keep getting the same error.
#!/bin/bash
for filename in ./price_history/*.csv; do
echo $filename
isin=${filename:30:12}
echo $isin
/volume1/@appstore/MariaDB10/usr/local/mariadb10/bin/mysql -u root -p \
"LOAD DATA LOW_PRIORITY LOCAL INFILE '$filename'\
IGNORE INTO TABLE 'stock_db.prices'\
CHARACTER SET utf8\
FIELDS TERMINATED BY ';'\
OPTIONALLY ENCLOSED BY '"'"'"'\
ESCAPED BY '"'"'"'\
LINES TERMINATED BY '\r\n'\
IGNORE 2 LINES (@vdate, @vprice)\
SET\
isin = '$isin',\
date = STR_TO_DATE(@vdate, '%d-%m-%Y'),\
price = @vprice;"
done
ERROR 1102 (42000): Incorrect database name
What I've tried
Took the database name out of stock_db.prices and mentioned it separately as [database] outside of the quoted SQL statement - Doesn't work
Changed quotes around 'stock_db.prices' in many different ways - Doesn't work
Separated the SQL into a separate file and referenced it '< stmt.sql' - Complicates things even further and couldn't get it to work at all (although probably preferred)
Considered (or even preferred) using a PREPARE statement, but seems I can't use this in combination with LOAD DATA (reference)
Bonus Question
If someone can help me do this without having to re-enter the user's password or put the password in the script, that would be a really nice bonus!
Update
Got the 'Incorrect database name' error resolved by adding the '-e' option.
Now I have a new error on the csv files:
ERROR 13 "Permission Denied"
even though the folder and files are set to full access for everyone.
Does anyone have any thoughts on this?
Thanks a lot!
Try setting the database using the -D option: change the first line to
/volume1/@appstore/MariaDB10/usr/local/mariadb10/bin/mysql -D stock_db -u root -p \ ...
You may also have an error in the line IGNORE INTO TABLE 'stock_db.prices'\ ; try removing the single quotes.
Create a file named .my.cnf in your user's home directory and put the following information into it:
[client]
password="my password"
Info about option files.
'stock_db.prices'
Incorrect quoting. This will work since neither are keywords:
stock_db.prices
This will also work:
`stock_db`.`prices`
Note that the db name and the table name are quoted separately, using backticks.
I can't predict what will happen with this nightmare:
'"'"'"'

Importing a series of .CSV files that contain one field while adding additional 'known' data in other fields

I've got a process that creates a csv file that contains ONE set of values that I need to import into a field in a MySQL database table. This process creates a specific file name that identifies the values of the other fields in that table. For instance, the file name T001U020C075.csv would be broken down as follows:
T001 = Test 001
U020 = User 020
C075 = Channel 075
The file contains a single row of data separated by commas for all of the test results for that user on a specific channel and it might look something like:
12.555, 15.275, 18.333, 25.000 ... (there are hundreds, maybe thousands, of results per user, per channel).
What I'm looking to do is to import directly from the CSV file adding the field information from the file name so that it looks something like:
insert into results (test_no, user_id, channel_id, result) values (1, 20, 75, 12.555)
I've tried to use "Bulk Insert" but that seems to want to import all of the fields where each ROW is a record. Sure, I could go into each file and convert the row to a column and add the data from the file name into the columns preceding the results but that would be a very time consuming task as there are hundreds of files that have been created and need to be imported.
I've found several "import CSV" solutions but they all assume all of the data is in the file. Obviously, it's not...
The process that generated these files is unable to be modified (yes, I asked). Even if it could be modified, it would only provide the proper format going forward and what is needed is analysis of the historical data. And, the new format would take significantly more space.
I'm limited to using either MATLAB or MySQL Workbench to import the data.
Any help is appreciated.
Bob
A possible SQL approach to getting the data loaded into the table would be to run a statement like this:
LOAD DATA LOCAL INFILE '/dir/T001U020C075.csv'
INTO TABLE results
FIELDS TERMINATED BY '|'
LINES TERMINATED BY ','
( result )
SET test_no = '001'
, user_id = '020'
, channel_id = '075'
;
We need the comma to be the line separator, and we can specify as the field separator some character that is guaranteed not to appear in the data. That way we get LOAD DATA to see a single "field" on each "line".
(If there isn't a trailing comma at the end of the file, after the last value, we need to test to make sure we are getting the last value, that is, the last "line" as we're telling LOAD DATA to read the file.)
We could use user-defined variables in place of the literals, but that leaves the part about parsing the filename. That's really ugly in SQL, but it could be done, assuming a consistent filename format...
-- parse filename components into user-defined variables
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(f.n,'T',-1),'U',1) AS t
, SUBSTRING_INDEX(SUBSTRING_INDEX(f.n,'U',-1),'C',1) AS u
, SUBSTRING_INDEX(f.n,'C',-1) AS c
, f.n AS n
FROM ( SELECT SUBSTRING_INDEX(SUBSTRING_INDEX( i.filename ,'/',-1),'.csv',1) AS n
FROM ( SELECT '/tmp/T001U020C075.csv' AS filename ) i
) f
INTO @ls_t
, @ls_u
, @ls_c
, @ls_n
;
While we're testing, we probably want to see the result of the parsing.
-- for debugging/testing
SELECT @ls_t
, @ls_u
, @ls_c
, @ls_n
;
And then there's the part about running the actual LOAD DATA statement. We've got to specify the filename again, and we need to make sure we're using the same filename ...
LOAD DATA LOCAL INFILE '/tmp/T001U020C075.csv'
INTO TABLE results
FIELDS TERMINATED BY '|'
LINES TERMINATED BY ','
( result )
SET test_no = @ls_t
, user_id = @ls_u
, channel_id = @ls_c
;
(The client will need read permission on the .csv file.)
Unfortunately, we can't wrap this in a procedure, because running a LOAD DATA statement is not allowed from a stored program.
Some would correctly point out that as a workaround, we could compile/build a user-defined function (UDF) to execute an external program, and a procedure could call that. Personally, I wouldn't do it. But it is an alternative we should mention, given the constraints.

Load multiple CSV's into MySQL using shell script

Inside a directory /mylog/ I have a bunch of CSV files. Each CSV file only has one line of data. I need this data to be inputted into a MySQL Database.
An example of a line would be:
2015-08-14 00:00:00,HOSTNAME,10271kB,17182kB,92874kB,10%,/dev/disk1,/
I need to remove the 'kB' from each file size and remove the % from the percentage field. I also need to make sure that the date time and hostname are always unique and no duplicate entries are ever put in.
This is what I've started writing so far. But I'm obviously missing the database name to use and the removal of the kB and %. If there's anything else wrong or missing, let me know. There's also the fact that mysql is called once per file; is there a way to do multiple LOAD DATA statements in one go?
Shell script:
#!/bin/bash
for f in /var/log/mylog/*.csv
do
mysql -e "load data local infile '"$f"' into table myTable fields TERMINATED BY ',' LINES TERMINATED BY '\n'" -u myUser --password=myPassword --local-infile
done
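A rough sketch of one way to handle the stripping and the database name, assuming a database called myDatabase, column names invented to match the sample line, and a unique key on (log_time, hostname) so that IGNORE silently skips duplicate rows; none of these names come from the question:

#!/bin/bash
for f in /var/log/mylog/*.csv
do
# NB: myDatabase, the column names, and the unique key are assumptions
mysql -u myUser --password=myPassword --local-infile -e "
LOAD DATA LOCAL INFILE '$f'
IGNORE INTO TABLE myDatabase.myTable
FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
(log_time, hostname, @used, @free, @total, @pct, device, mountpoint)
SET used_kb  = REPLACE(@used,  'kB', ''),
    free_kb  = REPLACE(@free,  'kB', ''),
    total_kb = REPLACE(@total, 'kB', ''),
    pct_used = REPLACE(@pct,   '%',  '');"
done

This still invokes mysql once per file; the per-file statements could instead be generated into one script and piped to a single mysql invocation.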

MySQL LOAD DATA INFILE with PostgreSQL COPY FROM command

I am very new to PostgreSQL. I was actually using MySQL before, but for a project-specific reason I need to use PostgreSQL and do a PoC.
Now the problem is:
I was using MySQL LOAD DATA INFILE command to load the column content from a file to my database table.
My table structure is :
Table name: MSISDN
Table Column Names: ID(primary key - auto_generated), JOB_ID, MSISDN, REGION, STATUS
But my input text file(rawBase.txt) is having only below columns:
MSISDN, REGION
so I was using the command below to load the above 2 columns with an initial JOB_ID and STATUS.
LOAD DATA INFILE 'D:\\project\\rawBase.txt' INTO TABLE MSISDN
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(MSISDN,REGION)
SET JOB_ID = 'XYZ1374147779999', STATUS = 0;
As you can see, there is an option available in the LOAD DATA INFILE command where I can SET a particular initial value for the columns (JOB_ID and STATUS) which are not present in the input text file.
NOW,
in the case of PostgreSQL, I want the same thing to happen.
There is a similar kind of command available, COPY FROM,
like below:
COPY MSISDN FROM 'D:\\project\\rawBase.txt' WITH DELIMITER AS ','
but I am not able to SET a particular initial value for the remaining columns (JOB_ID and STATUS) which are not present in my input text file, and I have not found any useful example of doing this.
Please give some suggestion if possible.
Regards,
Sandy
You may do it the "Unix way" using pipes:
cat rawbase.txt | awk '{print $0",XYZ1374147779999,0"}' | psql -d dbname -c "copy MSISDN FROM stdin with delimiter AS ','"
Now from the file paths in the question it appears you're using MS-Windows, but a Unix shell and command-line tools like awk are available for Windows through MSYS or Cygwin.
COPY with a column-list, and set a DEFAULT on the table columns you don't specify.
regress=> CREATE TABLE copydemo(a text not null, b text not null default 'blah');
regress=> \COPY copydemo(a) FROM stdin
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> blah
>> otherblah
>> \.
regress=> SELECT * FROM copydemo;
a | b
-----------+------
blah | blah
otherblah | blah
(2 rows)
You're probably COPYing from a file rather than stdin; I just did it on stdin for a quick demo of what I mean. The key thing is that columns that require values not in the CSV have DEFAULTs set, and that you specify a column-list in COPY, e.g. COPY copydemo(col1, col2).
There is unfortunately no equivalent to the COPY-specific SET that you want there. You can stage via a temporary table and do an INSERT INTO ... SELECT, as Igor suggested, if you can't or don't want to ALTER your table to set column DEFAULTs.
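A sketch of that staging approach, assuming the MSISDN table from the question; the database name, file path, and staging-table name are placeholders:

psql -d mydb <<'SQL'
CREATE TEMP TABLE msisdn_stage (msisdn text, region text);
-- \copy runs client-side, so no server file permissions are needed
\copy msisdn_stage FROM '/path/to/rawBase.txt' WITH DELIMITER AS ','
INSERT INTO msisdn (job_id, msisdn, region, status)
SELECT 'XYZ1374147779999', msisdn, region, 0
FROM msisdn_stage;
SQL

The temporary table vanishes when the session ends, so the three statements must run in the same psql invocation.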

How can I add headers and format MySQL query output files?

I connect to mysql from my Linux shell and use something like this:
SELECT * FROM students INTO OUTFILE '/tmp/students'.
Why do I see \N at line endings? I want each record in a row, but why do I see the \N explicitly printed?
How can I print all column headers in the first row?
SELECT ... INTO OUTFILE exports the result to a rather MySQL-specific delimited format. \N means a NULL value, not end-of-line.
Run e.g. from a command line:
echo 'select * from students' | mysql mydb >/tmp/students
The documentation for SELECT shows you what options you have when using INTO OUTFILE, but you can't export the headers directly that way. See the comments in that documentation for a hacky way of adding header columns, though.
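The usual hack is to UNION a literal row of column names onto the query; a sketch, assuming the students table has columns id and name:

mysql mydb -e "
SELECT 'id', 'name'
UNION ALL
SELECT id, name FROM students
INTO OUTFILE '/tmp/students_with_headers'
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'"

Note that without an ORDER BY the position of the header row isn't strictly guaranteed, though in practice it comes out first.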