How to import a JSON file into a PostgreSQL database in BASH

I'm working on a bash script and I need to import a JSON file into my Postgres database. The JSON file is really big and I have tried different approaches, but none of them worked.
I can't post an example of the JSON file because it is really big (about 15 MB).
Using a bash variable to store the data:
VAR=$(cat cucumber.json.1)
su -c "psql -A -t -d tareas -c \"insert into consulta (url, identificador, fecha, artefacto) values ('UNKNOWN', $identificadorBBDD, '$diaBBDD', :'$VAR')\"" postgres
It returns "argument list too long"; I think it's because the expanded variable makes the command line far too long and breaks the structure of the query.
I also tried the Postgres lo_import function, but I got the same result.
I also used a psql command to store the data, but it didn't work:
\set content `cat cucumber.json.1`
create temp table t (j json);
insert into t values (:'content');
Thanks for your help.

Thanks for your help. I finally solved it with the following code in my .sh script:
VAR=$(cat cucumber.json.1)
psql postgres://user:password@localhost:5432/tareas << EOF
insert into consulta (url, identificador, fecha ,artefacto) values ('UNKNOWN',
ident, 'date', '$VAR');
EOF
I don't know why it doesn't give me the same error as before, but this way the query works and the JSON file is imported into the database.
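For the record, the difference is that the heredoc sends the JSON to psql over standard input instead of packing it into a single command-line argument, which is what hit the "argument list too long" limit. A related sketch that also avoids interpolating the JSON into the SQL string (so embedded single quotes can't break the query): let psql read the file itself with \set and quote it with :'content'. The identifier and date below are placeholders, and the quoted 'EOF' stops the shell from expanding anything inside the heredoc.
psql postgres://user:password@localhost:5432/tareas << 'EOF'
\set content `cat cucumber.json.1`
insert into consulta (url, identificador, fecha, artefacto)
values ('UNKNOWN', 1, now(), :'content');
EOF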

Related

PostgreSQL multiple CSV import and add filename to each column

I've got 200k csv files and I need to import them all into a single postgresql table. It's a list of parameters from various devices; each csv file's name contains the device's serial number, and I need that serial number to be in one of the columns for each row.
So to simplify, I've got a few columns of data (no headers). Let's say the columns in each csv file are: Date, Variable, Value, and the file name looks like SERIALNUMBER_and_someOtherStuffIDontNeed.csv.
I'm trying to use cygwin to write a bash script to iterate over files and do it for me, however for some reason it won't work, showing 'syntax error at or near "as" '
Here's my code:
#!/bin/bash
FILELIST=/cygdrive/c/devices/files/*
for INPUT_FILE in $FILELIST
do
psql -U postgres -d devices -c "copy devicelist
(
Date,
Variable,
Value,
SN as CURRENT_LOAD_SOURCE(),
)
from '$INPUT_FILE
delimiter ',' ;"
done
I'm learning SQL so it might be an obvious mistake, but I can't see it.
Also, I know that this way I will get the full file name, not just the serial number bit I want, but I can probably handle that somehow later.
Please advise.
Thanks.
I don't think there is a CURRENT_LOAD_SOURCE() function in postgres. A work-around is to leave the name column NULL on copy, and patch it to the desired value just after the copy. I prefer a shell here-document because that makes quoting inside the SQL body easier. (BTW: even for 10K files, the globbing needed to obtain FILELIST might exceed ARG_MAX for the shell ...)
#!/bin/bash
FILELIST="`ls /tmp/*.c`"
for INPUT_FILE in $FILELIST
do
echo "File:" $INPUT_FILE
psql -U postgres -d devices <<OMG
-- I have a schema "tmp" for testing purposes
CREATE TABLE IF NOT EXISTS tmp.filelist(name text, content text);
COPY tmp.filelist ( content)
from '$INPUT_FILE' delimiter ',' ;
-- patch the name column for the rows that were just copied from this file
UPDATE tmp.filelist SET name = '$INPUT_FILE'
WHERE name IS NULL;
OMG
done
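If only the serial number (rather than the whole path) should end up in the name column, bash parameter expansion can trim it before the UPDATE. A minimal sketch, assuming the serial number is the part of the file name before the first underscore (as in the question's SERIALNUMBER_and_someOtherStuff.csv naming):
BASE=$(basename "$INPUT_FILE")   # e.g. SERIALNUMBER_and_someOtherStuff.csv
SERIAL=${BASE%%_*}               # keep only the text before the first "_"
Then use SET name = '$SERIAL' in the UPDATE instead of the full file name.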
For anyone interested in an answer, I've used a python script to change the file names and then another script using psycopg2 to connect to the database, and done everything in one connection. Took 10 minutes instead of 10 hours.
Here's the code:
Renaming the files (it also turned out that to import from CSV you need all the rows to be filled, and the information I needed was in the first 4 columns anyway, so I put together a solution that generates whole new CSVs instead of just renaming them):
import os
import csv

path = 'C:/devices/files'
os.chdir(path)
i = 0
for file in os.listdir(path):
    try:
        i += 1
        if i % 10000 == 0:
            # just to see the progress
            print(i)
        serial_number = file[:8]
        creader = csv.reader(open(file))
        cwriter = csv.writer(open('processed_' + file, 'w'))
        for cline in creader:
            new_line = [val for col, val in enumerate(cline) if col not in (4, 5, 6, 7)]
            new_line.insert(0, serial_number)
            # print(new_line)
            cwriter.writerow(new_line)
    except:
        print('problem with file: ' + file)
        pass
Updating database:
import os
import psycopg2

path = "C:\\devices\\files"
directory_listing = os.listdir(path)
conn = psycopg2.connect("dbname='devices' user='postgres' host='localhost'")
cursor = conn.cursor()
print(len(directory_listing))
i = 100001
while i < 218792:
    current_file = directory_listing[i]
    i += 1
    full_path = "C:/devices/files/" + current_file
    with open(full_path) as f:
        cursor.copy_from(file=f, table='devicelistlive', sep=",")
conn.commit()
conn.close()
Don't mind the while loop and the weird numbers; it's just because I was doing it in portions for testing purposes. It can easily be replaced with a for loop.

automate csv import in mysql db in linux environment

Is there a way to have a .csv imported into a SQL table automatically in a mysql db? I know how to do it manually, but there is a situation where a .csv is exported nightly from PeopleSoft and we want it imported automatically into a SQL table in a linux environment. Please give me a sample script to do that. If there's a way, can anyone point me in that direction (I'm not a SQL expert)!
You can try creating a stored procedure: write the LOAD DATA query into the SP, then create an event to call the SP.
I hope this helps.
CREATE EVENT IF NOT EXISTS `load_csv_event`
ON SCHEDULE EVERY 23 DAY_HOUR
DO CALL my_sp_load_csv();
Also, you can directly create an event and write the load query into it.
You could create a crontab job, for example:
* * * * * /path/to/load_script.sh
Where load_script.sh may look like this (do not forget to make it executable):
#!/bin/bash
IMPORTED_FILE_PATH=/path/to/your/imported/file.csv
TABLENAME=target_table_name
DATABASE=db_name
TMP_FILENAME=/tmp/${TABLENAME}.csv
# do nothing if imported file does not exist
[ -f "$IMPORTED_FILE_PATH" ] || exit 0
# if temporary file exists, then it means previous import job is running. Also do nothing
[ -f "$TMP_FILENAME" ] && exit 0
# Move it to tmp and rename to target table name
mv "$IMPORTED_FILE_PATH" "$TMP_FILENAME"
mysqlimport --user=mysqlusername --password=mysqlpassword --host=mysqlhost --local $DATABASE $TMP_FILENAME
rm -f "$TMP_FILENAME"
It is just an example (not tested). You should add error handling, logging, etc.
Also take a look at the mysqlimport manual.
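Since the export arrives nightly, the crontab line above (which polls every minute and exits quickly when the file is missing) can also be narrowed down to a single nightly run. A minimal sketch, with the schedule (02:00) and log path as placeholder choices:
0 2 * * * /path/to/load_script.sh >> /var/log/load_script.log 2>&1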

mdb-export changes the GUID for every row

I'm using mdb-tools on FreeBSD to convert a Microsoft Access DB to MySQL.
The script looks like this (to_mysql.sh):
#!/usr/local/bin/bash
echo "DROP TABLE IF EXISTS Student;"
mdb-schema -T Student $1 mysql
mdb-export -D '%Y-%m-%d %H:%M:%S' -I mysql $1 Student
And I'm using it like:
./to_mysql.sh accessDb.MDB > data.sql
The problem is that the GUID (the second column) in the mdb changes for all rows.
In the access DB one row looks like this:
|{D115266B-D5A3-4617-80F8-7B80EE3022DA}|2013-06-11 08.54.14|2015-12-17 14.57.01|2|2||||||0|111111-1111||Nameson|Name|||||3|||SA|0||||0|Gatan 2|222 22|1234 567
And when I convert it to MySQL using the script above it looks like this:
INSERT INTO `Student` (
`UsedFields`,`GUID`,`Changed`,`ChangedLesson`,`AccessInWebViewer`,`VisibleInWebViewer`
,`PasswordInWebViewer`,`Language`,`UserMan`,`SchoolID`,`Owner`,`DoNotExport`
,`Student`,`Category`,`LastName`,`FirstName`,`Signature`,`Sex`
,`Phone`,`SchoolType`,`Grade`,`EMail`,`Program`,`IgnoreLunch`
,`ExcludedTime`,`Individual timetable`,`Adress(TEXT) `,`Postnr(TEXT) `
,`Ort(TEXT) `
)
VALUES (
NULL,"{266bd115-d5a3-4617-f880-807b30eeda22}","2013-06-11 08:54:14"
,"2015-12-17 14:57:01",2,2,NULL,NULL,NULL,NULL,NULL,0,"111111-1111"
,NULL,"Nameson","Name ",NULL,NULL,NULL,NULL,"3",NULL,"SA"
,0,NULL,0,"Gatan 2","222 22","1234 567"
);
Everything is correct except the GUID column, it changes from:
{D115266B-D5A3-4617-80F8-7B80EE3022DA}
to
{266bd115-d5a3-4617-f880-807b30eeda22}
It looks like the characters are just being reordered, but I have no idea why.
Does anyone know why and how I can prevent this?
Thank you!
It seems like a byte-order issue in mdbtools. As a workaround, create a small sed script, mdb_fixguids, something like:
#!/bin/sed -f
s/{\(....\)\(....\)-\(....-....-....-............\)}/{\2\1-\3}/g;
s/{\(........-....-....\)-\(..\)\(..\)-\(..\)\(..\)\(..\)\(..\)\(..\)\(..\)}/{\1-\3\2-\5\4\7\6\9\8}/g
Put it somewhere in the PATH, make it executable, and use it in the conversion pipe, something like:
./to_mysql.sh accessDb.MDB | mdb_fixguids > data.sql
BTW :) this is the first time I needed all the possible backrefs in sed
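A quick way to sanity-check the script on a single GUID before running a full conversion (the sample value is the mangled GUID from the question; the output should match the original Access value, in lower case):
echo '{266bd115-d5a3-4617-f880-807b30eeda22}' | ./mdb_fixguids
# prints: {d115266b-d5a3-4617-80f8-7b80ee3022da}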

Heredocs, variables and single quotes in BASH and MySQL

I'm trying to send some data to a remote MySQL database using a BASH script on GNU/Linux, but I get various errors. Here's the line that's not working:
mysql --host=192.168.0.100 --user=petercapaldi --password=mypassword mystartrekcharacterbase << EOF
INSERT into myfourlegs values ('$PERSON','$THETIME','$THETIME','$THEDATE','$DAYOFWEEK');
EOF
and this too (just in case):
mysql --host=192.168.0.100 --user=petercapaldi --password=mypassword mystartrekcharacterbase << EOF
INSERT into myfourlegs values (\047$PERSON\047,\047$THETIME\047,\047$THETIME\047,\047$THEDATE\047,\047$DAYOFWEEK\047);
EOF
Scrap that. My fault - I missed the first field in the database. The single quotes work as they should with heredocs (i.e. '$VARIABLE' prints 'myvariable' just like $VARIABLE prints myvariable).
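For anyone landing here later, a minimal sketch of the behaviour being described: with an unquoted heredoc delimiter the shell expands variables but leaves single quotes alone, while a quoted delimiter suppresses expansion entirely.
PERSON="Peter"
cat << EOF
INSERT into myfourlegs values ('$PERSON');
EOF
# prints: INSERT into myfourlegs values ('Peter');
cat << 'EOF'
INSERT into myfourlegs values ('$PERSON');
EOF
# prints: INSERT into myfourlegs values ('$PERSON');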

Using shell script to insert data into remote MYSQL database

I've been trying to get a shell(bash) script to insert a row into a REMOTE database, but I've been having some trouble :(
The script is meant to upload a file to a server, get a URL, HASH, and a file size, connect to a remote mysql database, and insert the data into an existing table. I've gotten it working until the remote MYSQL database bit.
It looks like this:
#!/bin/bash
zxw=randomtext
description=randomtext2
for file in "$#"
do
echo -n *****
ident= *****
data= ****
size=` ****
hash=`****
mysql --host=randomhost --user=randomuser --password=randompass randomdb
insert into table (field1,field2,field3) values('http://www.example.com/$hash','$file','$size');
echo "done"
done
I'm a total noob at programming so yeah :P
Anyway, I added the \ to escape the brackets as I was getting errors. As it is right now, the script works fine until it connects to the mysql database. It just connects to the mysql database and doesn't run the insert command (and I don't even know if the insert command would work in bash).
PS: I've tried both the mysql commands from the command line one by one, and they worked, though I defined the hash/file/size and didn't have the escaping "".
Anyway, what do you guys think? Is what I'm trying to do even possible? If so how?
Any help would be appreciated :)
The insert statement has to be sent to mysql, not another line in the shell script, so you need to make it a "here document".
mysql --host=randomhost --user=randomuser --password=randompass randomdb << EOF
insert into table (field1,field2,field3) values('http://www.site.com/$hash','$file','$size');
EOF
The << EOF means take everything before the next line that contains nothing but EOF (no whitespace at the beginning) as standard input to the program.
This might not be exactly what you are looking for but it is an option.
If you want to bypass the annoyance of actually including your query in the sh script, you can save the query as .sql file (useful sometimes when the query is REALLY big and complicated). This can be done with simple file IO in whatever language you are using.
Then you can simply include in your sh script something like:
mysql -u youruser -pyourpass -h remoteHost < query.sql &
(Note there is no space after -p; with a space, mysql prompts for the password interactively and treats the next word as a database name.)
This is called batch mode execution. Optionally, you can include the ampersand at the end to ensure that that line of the sh script does not block.
Also if you are concerned about the same data getting entered multiple times and your rdbms getting inconsistent, you should explore MySql transactions (commit, rollback, etc).
Don't use raw SQL from bash; bash has no sane facility for sanitizing the data beforehand. Generate a CSV file and upload that instead.
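A minimal sketch of that approach for the script above, reusing its hash/file/size variables; the CSV path, table name, and column order are placeholder assumptions, and mysqlimport derives the table name from the file name:
# append one CSV row per uploaded file instead of building an INSERT by hand
printf '%s,%s,%s\n' "http://www.example.com/$hash" "$file" "$size" >> /tmp/mytable.csv

# bulk-load it afterwards; --local lets the client read a file on this machine,
# and /tmp/mytable.csv is loaded into the table "mytable"
mysqlimport --local --fields-terminated-by=',' \
    --host=randomhost --user=randomuser --password=randompass \
    randomdb /tmp/mytable.csv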