PostgreSQL: read JSON that contains the character ' in a string

I am trying to read the OV-fiets JSON (http://fiets.openov.nl/locaties.json) into a Postgres database with json_array_elements. Some train station names contain the character ' .
Example ..... "description": "Helmond 't Hout"
I believe that my script fails because of the ' between Helmond and the t.
The script I use:
WITH data AS (SELECT 'paste the json from http://fiets.openov.nl/locaties.json'::json AS fc)
SELECT
row_number() OVER () AS gid,
feat->'locaties' AS locaties
FROM (
SELECT json_array_elements(fc->'locaties') AS feat
FROM data
) AS f;
The error I get:
syntax error at or near "Hout"
LINE 3: ...Images": [], "name": "HMH - OV-fiets - Helmond 't Hout", "ex.
How can I change the script to avoid the syntax error caused by the character '?

The easiest workaround here would probably be dollar quotes:
SELECT $dq$paste the json from http://fiets.openov.nl/locaties.json$dq$::json
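Putting it together, the whole query with dollar quoting would look like this sketch (the JSON literal stays abbreviated):
WITH data AS (
SELECT $dq$paste the json from http://fiets.openov.nl/locaties.json$dq$::json AS fc
)
SELECT
row_number() OVER () AS gid,
feat->'locaties' AS locaties
FROM (
SELECT json_array_elements(fc->'locaties') AS feat
FROM data
) AS f;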

In SQL, single quotes need to be escaped by doubling them, e.g.:
select 'Arthur''s house';
As an alternative (in Postgres) you can use dollar quoting to avoid changing the string:
SELECT $data$Arthur's house$data$
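Applied to the station name from the question, the doubled-quote form of a JSON literal would be:
SELECT '{"description": "Helmond ''t Hout"}'::json;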

Related

Error parsing JSON: more than one document in the input (Redshift to Snowflake SQL)

I'm trying to convert a query from Redshift to Snowflake SQL.
The Redshift query looks like this:
SELECT
cr.creatives as creatives
, JSON_ARRAY_LENGTH(cr.creatives) as creatives_length
, JSON_EXTRACT_PATH_TEXT(JSON_EXTRACT_ARRAY_ELEMENT_TEXT (cr.creatives,0),'previewUrl') as preview_url
FROM campaign_revisions cr
The Snowflake query looks like this:
SELECT
cr.creatives as creatives
, ARRAY_SIZE(TO_ARRAY(ARRAY_CONSTRUCT(cr.creatives))) as creatives_length
, PARSE_JSON(PARSE_JSON(cr.creatives)[0]):previewUrl as preview_url
FROM campaign_revisions cr
It seems like JSON_EXTRACT_PATH_TEXT isn't converted correctly, as the Snowflake query results in error:
Error parsing JSON: more than one document in the input
cr.creatives is formatted like this:
"[{""previewUrl"":""https://someurl.com/preview1.png"",""device"":""desktop"",""splitId"":null,""splitType"":null},{""previewUrl"":""https://someurl.com/preview2.png"",""device"":""mobile"",""splitId"":null,""splitType"":null}]"
It seems to me that you are not working with valid JSON data inside Snowflake.
Please review your file format used for the copy into command.
If you open the "JSON" text provided in a text editor, note that the information is not parsed or formatted as JSON because of the quoting you have. Once your issue with the double / escaped quotes is handled, you should be able to make good progress.
(Screenshot in the original answer: proper JSON on the left, original data on the right.)
If you are not inclined to reload your data, see if you can create a Javascript User Defined Function to remove the quotes from your string, then you can use Snowflake to process the variant column.
The following is a working JavaScript snippet that can be used to remove the double quotes for you:
// original text with the doubled double quotes, as stored in the column
var textOriginal = '[{""previewUrl"":""https://someurl.com/preview1.png"",""device"":""desktop"",""splitId"":null,""splitType"":null},{""previewUrl"":""https://someurl.com/preview2.png"",""device"":""mobile"",""splitId"":null,""splitType"":null}]';
function parseText(input){
    // collapse each doubled quote back to a single one, then parse
    var a = input.replaceAll('""', '"');
    a = JSON.parse(a);
    return a;
}
x = parseText(textOriginal);
console.log(x);
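Wrapped up as a Snowflake JavaScript UDF, the same idea might look like the sketch below (the function name is hypothetical; a regex replace is used since older Snowflake JavaScript engines may lack replaceAll):
create or replace function strip_doubled_quotes(s string)
returns variant
language javascript
as
$$
  // collapse "" back to " and parse the result into a variant
  return JSON.parse(S.replace(/""/g, '"'));
$$;

-- usage against the question's table:
select strip_doubled_quotes(cr.creatives)[0]:previewUrl::string as preview_url
from campaign_revisions cr;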
For anyone else seeing this double double quote issue in JSON fields coming from CSV files in a Snowflake external stage (slightly different issue than the original question posted):
The issue is likely that you need to use the FIELD_OPTIONALLY_ENCLOSED_BY setting. Specifically, FIELD_OPTIONALLY_ENCLOSED_BY = '"' when setting up your file format.
Example of creating such a file format:
create or replace file format mydb.myschema.my_tsv_file_format
type = CSV
field_delimiter = '\t'
FIELD_OPTIONALLY_ENCLOSED_BY = '"';
And an example of querying from a stage using this file format:
select
$1 field_one,
$2 field_two
-- ...and so on
from '@my_s3_stage/path/to/file/my_tab_separated_file.csv' (file_format => 'my_tsv_file_format')
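The file format matters at load time too; a sketch of the corresponding copy into, reusing the stage and file format above with the question's table:
copy into campaign_revisions
from @my_s3_stage/path/to/file/
file_format = (format_name = 'mydb.myschema.my_tsv_file_format');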

MySQL JSON string field returns encoded

First week having to deal with a MySQL database and JSON field types, and I cannot seem to figure out why values are encoded automatically and then returned in encoded format.
Given the following SQL
-- create a multiline string with a tab example
SET @str ="Line One
Line 2 Tabbed out
Line 3";
-- encode it
SET @j = JSON_OBJECT("str", @str);
-- extract the value by name
SET @strOut = JSON_EXTRACT(@J, "$.str");
-- show the object and attribute value.
SELECT @j, @strOut;
You end up with what appears to be a fully formed JSON object with a single attribute encoded:
@j = {"str": "Line One\n\tLine 2\tTabbed out\n\tLine 3"}
but using JSON_EXTRACT to get the attribute value, I get the encoded version including outer quotes:
@strOut = "Line One\n\tLine 2\tTabbed out\n\tLine 3"
I would expect to get my original string with the \n \t all unescaped to the original values and no outer quotes, as such:
Line One
Line 2 Tabbed out
Line 3
I can't seem to find any JSON_DECODE or JSON_UNESCAPE or similar functions.
I did find a JSON_ESCAPE() function but that appears to be used to manually build a JSON object structure in a string.
What am I missing to extract the values to the original format?
I like to use the handy operator ->> for this.
It was introduced in MySQL 5.7.13 and basically combines JSON_EXTRACT() and JSON_UNQUOTE():
SET @strOut = @J ->> '$.str';
You are looking for the JSON_UNQUOTE function:
SET @strOut = JSON_UNQUOTE( JSON_EXTRACT(@J, "$.str") );
The result of JSON_EXTRACT() is intentionally a JSON document, not a string.
A JSON document may be:
An object enclosed in { }
An array enclosed in [ ]
A scalar string value enclosed in " "
A scalar number or boolean value
A null. But this is not an SQL NULL; it's a JSON null. This leads to confusing cases: you can extract a JSON field whose JSON value is null, yet in an SQL expression it fails IS NULL tests, and it also fails to be equal to the SQL string 'null', because it's a JSON type, not a scalar type.
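A quick sketch of that last point (MySQL 5.7+):
SET @doc = '{"a": null}';
SELECT JSON_EXTRACT(@doc, '$.a') IS NULL;    -- 0: a JSON null is not an SQL NULL
SELECT JSON_TYPE(JSON_EXTRACT(@doc, '$.a')); -- 'NULL': it is a JSON null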

Quotes around dynamic expression Groovy SQL

I am doing a query to my database using Groovy. The query is working perfectly and bringing back the correct data; however, I get this error in my terminal:
In Groovy SQL please do not use quotes around dynamic expressions
(which start with $) as this means we cannot use a JDBC
PreparedStatement and so is a security hole. Groovy has worked around
your mistake but the security hole is still there.
Here is my query
sql.firstRow("""select elem
from site_content,
lateral jsonb_array_elements(content->'playersContainer'->'series') elem
where elem @> '{"id": "${id}"}'
""")
If I change it to just $id, or to
sql.firstRow("""select elem
from site_content,
lateral jsonb_array_elements(content->'playersContainer'->'series') elem
where elem @> '{"id": ?}'
""", id)
I get the following error
org.postgresql.util.PSQLException: The column index is out of range:
1, number of columns: 0.
Positional or named parameters are handled by Groovy SQL properly and should be used instead of "'$id'".
As @Opal mentioned and as described here, you should be passing your params either as a list or a map:
sql.execute "select * from tbl where a=? and b=?", [ 'aa', 'bb' ]
sql.execute "select * from tbl where a=:first and b=:last", first: 'aa', last: 'bb'

how to get rid of newline space while using substring_index

I have a field with a value like below:
utf8: "\xE2\x9C\x93"
id: "805265"
plan: initial
acc: "123456"
last: "1234"
doc: "1281468479"
validation: field
commit: Accept
I used the below query to extract the acc value:
select SUBSTRING_INDEX(SUBSTRING_INDEX(columnname, 'acc: "', -1),'last',1) as acc from table_name;
I am able to retrieve the acc value, but the problem is that when I export the result to a CSV file, the field keeps the newline that sits before last... how do I get rid of that space?
I would expect you would want to strip out the end quote as well. But to answer your specific question, you can just update your SUBSTRING_INDEX delimiter to include the newline, i.e. select SUBSTRING_INDEX(SUBSTRING_INDEX(columnname, 'acc: "', -1),'\nlast',1) as acc from table_name;.
Or, if you prefer, you can use the REPLACE function to strip out any unwanted characters.
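Building on that, a sketch that cuts at the closing quote instead, which removes both the quote and the newline in one go (assuming the acc value itself never contains a "):
select SUBSTRING_INDEX(SUBSTRING_INDEX(columnname, 'acc: "', -1), '"', 1) as acc
from table_name;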

MySQL to JSON not formed properly

I am trying to return JSON-formatted results from a MySQL query but cannot get the correct format. It needs to be e.g.
{comCom:'test 3', comUid:'63',... etc
but what I'm getting is without the apostrophes:
{comCom:test 3, comUid:63,... etc
I am running the query in PHP as follows (shortened for ease of reading)
$result = mysql_query("select...
...GROUP_CONCAT(CONCAT('{comCom:',ww.comment, ', comUid:',h.user_id,', comName:',h.name,', comPic:',h.live_prof_pic,',comUrl:',h.url,',comWhen:',time_ago(ww.dateadded),'}')) comment,...
How can I get the punctuation?
I know mysql_query is deprecated, by the way; I'm just in the process of moving things to MySQLi.
Can you not just escape the ' character with \'?
...GROUP_CONCAT(CONCAT('{comCom:\'',ww.comment, '\', comUid:\'',h.user_id,'\', comName:\'',h.name,'\', comPic:\'',h.live_prof_pic,'\',comUrl:\'',h.url,'\',comWhen:\'',time_ago(ww.dateadded),'\'}'))
or use a mixture of " with '
...GROUP_CONCAT(CONCAT("{comCom:'",ww.comment, "', comUid:'",h.user_id,"', comName:'",h.name,"', comPic:'",h.live_prof_pic,"',comUrl:'",h.url,"',comWhen:'",time_ago(ww.dateadded),"'}"))