Importing csv with json value with psql COPY (problem with escaping) - json

I am trying to import csv file to table in postgres using COPY command. I have problem that one column is of json data type. I tried to escape json data in csv using dollars ($$...$$) docu_4.1.2.2.
This is first line of csv:
3f382d8c-bd27-4092-bd9c-8b50e24df7ec;370038757|PRIMARY_RESIDENTIAL;$${"CustomerData": "{}", "PersonModule": "{}"}$$
This is command used for import:
psql -c "COPY table(id, name, details) FROM '/path/table.csv' DELIMITER ';' ENCODING 'UTF-8' CSV;"
This is error I get:
ERROR: invalid input syntax for type json
DETAIL: Token "$" is invalid.
CONTEXT: JSON data, line 1: $...
COPY table, line 1, column details: "$${CustomerData: {}, PersonModule: {}}$$"
How should I escape/import json value using COPY? Should I give up and use something like pg_loader instead? Thank you

In case of failing with importing the JSON data please give a try to the following setup - this worked for me even for quite complicated data:
COPY "your_schema_name.yor_table_name" (your, column_names, here)
FROM STDIN
WITH CSV DELIMITER E'\t' QUOTE '\b' ESCAPE '\';
--here rows data
\.

Related

Postgresql can't copy JSON document

I created a table by using Postgresql after that I tried to copy the json document inside of that table with:
\copy document_json FROM '/path/to/document.json';
However it gives an error such as: ERROR: character with byte sequence 0x9e in encoding "WIN1254" has no equivalent in encoding "UTF8"
CONTEXT: COPY document_json, line 1
I couldn't find any explanation for this error. I would appreciate any help. Thank you!

Insert escaped quote into Postgres JSON column

The data that I need to insert into a Postgres JSON column from a text file is as before:
"{\"server\":\"[localhost:9001]\",\"event\":\"STARTED\",\"success\":true}"
Inserting directly will result in the following error:
ERROR: invalid input syntax for type json
DETAIL: Token "\" is invalid.
How can I insert this data without doing text pre-processing i.e. replacing \ escape character?
You're using the wrong quotes.
'{"server":"[localhost:9001]","event":"STARTED","success":true}'

Using \COPY to load CSV with JSON fields into Postgres

I'm attempting to load TSV data from a file into a Postgres table using the \COPY command.
Here's an example data row:
2017-11-22 23:00:00 "{\"id\":123,\"class\":101,\"level\":3}"
Here's the psql command I'm using:
\COPY bogus.test_table (timestamp, sample_json) FROM '/local/file.txt' DELIMITER E'\t'
Here's the error I'm receiving:
ERROR: invalid input syntax for type json
DETAIL: Token "sample_json" is invalid.
CONTEXT: JSON data, line 1: "{"sample_json...
COPY test_table, line 1, column sample_json: ""{\"id\":123,\"class\":101,\"level\":3}""
I verified the JSON is in the correct JSON format and read a couple similar questions, but I'm still not sure what's going on here. An explanation would be awesome
To load your data file as it is:
\COPY bogus.test_table (timestamp, sample_json) FROM '/local/file.txt' CSV DELIMITER E'\t' QUOTE '"' ESCAPE '\'
Your json is quoted. It shouldn't have surrounding " characters, and the " characters surrounding the field names shouldn't be escaped.
It should look like this:
2017-11-22 23:00:00 {"id":123,"class":101,"level":3}
The answer of Aeblisto almost did the trick for my crazy JSON fields, but needed to modify an only small bit - THE QUOTE with backslash - here it is in full form:
COPY "your_schema_name.yor_table_name" (your, column_names, here)
FROM STDIN
WITH CSV DELIMITER E'\t' QUOTE E'\b' ESCAPE '\';
--here rows data
\.

Insert JSON into PostgreSQL that contains quotation marks

I'm trying to import a JSON file into a table. I'm using the solution mentioned here: https://stackoverflow.com/a/33130304/1663462:
create temporary table temp_json (values text) on commit drop;
copy temp_json from 'data.json';
select
values->>'annotations' as annotationstext
from (
select json_array_elements(replace(values,'\','\\')::json) as values
from temp_json
) a;
Json file content is:
{"annotations": "<?xml version=\"1.0\"?>"}
I have verified that this is a valid JSON file.
The json file contains a \" which I presume is responsible for the following error:
CREATE TABLE
COPY 1
psql:insertJson2.sql:13: ERROR: invalid input syntax for type json
DETAIL: Expected "," or "}", but found "1.0".
CONTEXT: JSON data, line 1: {"annotations": "<?xml version="1.0...
Are there any additional characters that need to be escaped?
Because copy command processes escape ('\') characters for text format without any options there are two ways to import such data.
1) Process file using external utility via copy ... from program, for example using sed:
copy temp_json from program 'sed -e ''s/\\/\\\\/g'' data.json';
It will replace all backslashes to doubled backslashes, which will be converted back to single ones by copy.
2) Use csv import:
copy temp_json from 'data.json' with (format csv, quote '|', delimiter E'\t');
Here you should to set quote and delimiter characters such that it does not occur anywhere in your file.
And after that just use direct conversion:
select values::json->>'annotations' as annotationstext from temp_json;

Inserting valid json with copy into postgres table

Valid JSON can naturally have the backslash character: \. When you insert data in a SQL statement like so:
sidharth=# create temp table foo(data json);
CREATE TABLE
sidharth=# insert into foo values( '{"foo":"bar", "bam": "{\"mary\": \"had a lamb\"}" }');
INSERT 0 1
sidharth=# select * from foo;
data
\-----------------------------------------------------
{"foo":"bar", "bam": "{\"mary\": \"had a lamb\"}" }
(1 row)
Things work fine.
But if I copy the JSON to a file and run the copy command I get:
sidharth=# \copy foo from './tests/foo' (format text);
ERROR: invalid input syntax for type json
DETAIL: Token "mary" is invalid.
CONTEXT: JSON data, line 1: {"foo":"bar", "bam": "{"mary...
COPY foo, line 1, column data: "{"foo":"bar", "bam": "{"mary": "had a lamb"}" }"
Seems like postgres is not processing the backslashes. I think because of http://www.postgresql.org/docs/8.3/interactive/sql-syntax-lexical.html and
it I am forced to use double backslash. And that works, i.e. when the file contents are:
{"foo":"bar", "bam": "{\\"mary\\": \\"had a lamb\\"}" }
The copy command works. But is it correct to expect special treatment for json data types
because afterall above is not a valid json.
http://adpgtech.blogspot.ru/2014/09/importing-json-data.html
copy the_table(jsonfield)
from '/path/to/jsondata'
csv quote e'\x01' delimiter e'\x02';
PostgreSQL's default bulk load format, text, is a tab separated markup. It requires backslashes to be escaped because they have special meaning for (e.g.) the \N null placeholder.
Observe what PostgreSQL generates:
regress=> COPY foo TO stdout;
{"foo":"bar", "bam": "{\\"mary\\": \\"had a lamb\\"}" }
This isn't a special case for json at all, it's true of any string. Consider, for example, that a string - including json - might contain embedded tabs. Those must be escaped to prevent them from being seen as another field.
You'll need to generate your input data properly escaped. Rather than trying to use the PostgreSQL specific text format, it'll generally be easier to use format csv and use a tool that writes correct CSV, with the escaping done for you on writing.