Error parsing array of strings to JSON ARRAY column in Snowflake

I have an SQL column in my Snowflake table declared as the ARRAY data type. I tried doing a COPY INTO from a CSV where one row contained the value ["A","B"] for this column. However, this returned the following error:
Error parsing JSON: "[""A"",""B""]"
This seems to me to indicate that Snowflake doesn't accept an array of strings as valid JSON. However, I'm getting mixed results when testing this in the Snowflake console:
SELECT CHECK_JSON('["A", "B"]');      -- returns NULL (i.e. valid JSON)
SELECT CHECK_JSON('"[""A"",""B""]"'); -- returns an error
How should I format this data to deal with this error?

You can use https://jsonlint.com/ to validate your JSON:
["A", "B"] <-- This is a valid JSON
"[""A"",""B""]" <-- This is not a valid JSON:
Error: Parse error on line 1:
"[""A"",""B""]"
---^
Expecting 'EOF', '}', ':', ',', ']', got 'STRING'
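In the COPY INTO case above, that outer quoting is most likely CSV quoting: the field is enclosed in double quotes and the embedded quotes are doubled. A possible fix, assuming that is what your file looks like, is to declare the enclosure in the CSV file format so Snowflake strips it before parsing the JSON (my_table, my_stage and my_file.csv are placeholders):

COPY INTO my_table
FROM @my_stage/my_file.csv
FILE_FORMAT = (TYPE = 'CSV' FIELD_OPTIONALLY_ENCLOSED_BY = '"');

Once the enclosure is stripped, the column value arrives as ["A","B"], which is valid JSON.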

For me, I tried using ARRAY_CONSTRUCT on the column in my COPY INTO statement to fix the issue. However, the CSV column was being treated as a single string, so it produced an array with a single item. So what I tried instead was this:
REGEXP_SUBSTR_ALL(my_col, $$[^"\[\],]+$$)
This worked when I tested the query, but Snowflake does not allow the REGEXP_SUBSTR_ALL function in COPY statements. So what I actually had to do was this:
SPLIT(TRIM(REPLACE(my_col, '"', ''), '[]'), ',')
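For context, a minimal sketch of how that expression can sit inside a COPY statement with a transformation (the table name, stage name, and column positions are assumptions):

COPY INTO my_table (id, my_col)
FROM (
    SELECT
        t.$1,
        -- strip quotes and brackets, then split on commas to build the array
        SPLIT(TRIM(REPLACE(t.$2, '"', ''), '[]'), ',')
    FROM @my_stage t
)
FILE_FORMAT = (TYPE = 'CSV');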

Related

Athena (Trino SQL) parsing JSON document using fields (dot notation)

If the underlying JSON (a table column called document1 in Athena) is in the form of {a={b ...
I can parse it in Athena (Trino SQL) using
document1.a.b
However, if the JSON contains {a={"text": value1 ...
the quote marks will not parse correctly.
Is there a way to do JSON parsing of a 'field' with quotes?
If not, is there an elegant way of parsing the "text" and obtaining the string in value1? [Please see my comment below.]
I cannot change the quotes in the JSON or its Athena "table", so I would need something that works in Trino SQL syntax.
The error message is in the form of: SQL Error [100071] [HY000]: [Simba][AthenaJDBC](100071) An error has been thrown from the AWS Athena client. SYNTAX_ERROR: Expression [redacted] is not of type ROW
NOTE: This is not a duplicate of Oracle Dot Notation Question
Dot notation works only for columns typed as struct<…>. You can do that for JSON data, but judging from the error and your description this seems not to be the case. I assume your column is of type string.
If you have JSON data in a string column you can use JSON functions to parse and extract parts of them with JSONPath.
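For example, a sketch using json_extract_scalar (the column document1 and the keys a and text come from the question; my_table is an assumption, and the column is assumed to hold a JSON string):

-- parse the string column as JSON and pull out the nested "text" value
SELECT json_extract_scalar(document1, '$.a.text')
FROM my_table;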

SQL compilation error: JSON file format can produce one and only one column of type variant or object or array when copying from S3 to Snowflake

I have the following JSON stored in S3:
{"data":"this is a test for firehose"}
I have created the table test_firehose with a VARCHAR column data, and a file format called JSON with type JSON and everything else left at default values. I want to copy the content from S3 to Snowflake, and I have tried the following statement:
COPY INTO test_firehose
FROM 's3://s3_bucket/firehose/2020/12/30/09/tracking-1-2020-12-30-09-38-46'
FILE_FORMAT = 'JSON';
And I receive the error:
SQL compilation error: JSON file format can produce one and only one column of type
variant or object or array. Use CSV file format if you want to load more than one column.
How could I solve this? Thanks
If you want to keep your data as JSON (rather than just as text) then you need to load it into a column with a datatype of VARIANT, not VARCHAR.
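A minimal sketch, assuming the table can be recreated (the named file format reference is spelled out explicitly here):

-- recreate the table with a VARIANT column so the JSON file format can load into it
CREATE OR REPLACE TABLE test_firehose (data VARIANT);

COPY INTO test_firehose
FROM 's3://s3_bucket/firehose/2020/12/30/09/tracking-1-2020-12-30-09-38-46'
FILE_FORMAT = (FORMAT_NAME = 'JSON');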

Insert escaped quote into Postgres JSON column

The data that I need to insert into a Postgres JSON column from a text file is as follows:
"{\"server\":\"[localhost:9001]\",\"event\":\"STARTED\",\"success\":true}"
Inserting directly will result in the following error:
ERROR: invalid input syntax for type json
DETAIL: Token "\" is invalid.
How can I insert this data without doing text pre-processing, i.e. replacing the \ escape character?
You're using the wrong quotes.
'{"server":"[localhost:9001]","event":"STARTED","success":true}'

Importing csv with json value with psql COPY (problem with escaping)

I am trying to import a CSV file into a Postgres table using the COPY command. The problem is that one column is of the json data type. I tried to escape the JSON data in the CSV using dollar quoting ($$...$$) as described in the docs (section 4.1.2.2, dollar-quoted string constants).
This is first line of csv:
3f382d8c-bd27-4092-bd9c-8b50e24df7ec;370038757|PRIMARY_RESIDENTIAL;$${"CustomerData": "{}", "PersonModule": "{}"}$$
This is command used for import:
psql -c "COPY table(id, name, details) FROM '/path/table.csv' DELIMITER ';' ENCODING 'UTF-8' CSV;"
This is error I get:
ERROR: invalid input syntax for type json
DETAIL: Token "$" is invalid.
CONTEXT: JSON data, line 1: $...
COPY table, line 1, column details: "$${CustomerData: {}, PersonModule: {}}$$"
How should I escape/import the JSON value using COPY? Should I give up and use something like pgloader instead? Thank you
If importing the JSON data fails, give the following setup a try - it worked for me even for quite complicated data:
COPY your_schema_name.your_table_name (your, column_names, here)
FROM STDIN
-- QUOTE is set to a control character (backspace) that never appears in the
-- data, so embedded double quotes and dollar signs pass through untouched
WITH CSV DELIMITER E'\t' QUOTE E'\b' ESCAPE '\';
-- rows of data go here
\.
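Alternatively (an assumption based on standard CSV escaping, not part of the answer above): drop the dollar quoting and let ordinary CSV quoting protect the JSON, doubling any embedded double quotes, so the first line of the file becomes:

3f382d8c-bd27-4092-bd9c-8b50e24df7ec;370038757|PRIMARY_RESIDENTIAL;"{""CustomerData"": ""{}"", ""PersonModule"": ""{}""}"

With the field enclosed like this, the original psql COPY command with DELIMITER ';' and CSV should accept the json column unchanged.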

Convert XML data containing CDATA to JSON in Oracle

I have a table test_tab with a column tst_col of type XMLTYPE. While trying to convert tst_col from XML to JSON, I am able to do so using:
select xml2json(t.tst_col ).to_char() JSON_VAL FROM test_tab t;
But when tst_col contains CDATA, I get an error.
Whenever it encounters ]] it exits the array prematurely.
Kindly help me process XML to JSON when the XML contains CDATA.
AFAIK XML2JSON is not a part of the Oracle database...
SQL> select xml2Json(xmltype('<Foo/>')) from dual;
select xml2Json(xmltype('<Foo/>')) from dual
*
ERROR at line 1:
ORA-00904: "XML2JSON": invalid identifier
SQL>
However, if I were creating such a function I would definitely consider a different name for it...