Athena (Trino SQL) parsing JSON document using fields (dot notation) - json

I am parsing a JSON document (a table column called document1 in Athena) in Athena (Trino SQL) using fields (dot notation).
If the underlying JSON (the document1 column) is in the form {a={b ...
I can parse it in Athena (Trino SQL) using
document1.a.b
However, if the JSON contains {a={"text": value1 ...
the quote marks will not parse correctly.
Is there a way to do JSON parsing of a 'field' with quotes?
If not, is there an elegant way of parsing "text" to obtain the string in value1? [Please see my comment below].
I cannot change the quotes in the json and its Athena "table" so I would need something that works in Trino SQL syntax.
The error message is in the form of: SQL Error [100071] [HY000]: [Simba][AthenaJDBC](100071) An error has been thrown from the AWS Athena client. SYNTAX_ERROR: Expression [redacted] is not of type ROW
NOTE: This is not a duplicate of Oracle Dot Notation Question

Dot notation works only for columns typed as struct<…>. You can use that type for JSON data, but judging from the error and your description this seems not to be the case here. I assume your column is of type string.
If you have JSON data in a string column, you can use JSON functions such as json_extract and json_extract_scalar to parse it and extract parts of it with JSONPath.

Related

SQL compilation error: JSON file format can produce one and only one column of type variant or object or array when copying from S3 to Snowflake

I have the following JSON stored in S3:
{"data":"this is a test for firehose"}
I have created the table test_firehose with a varchar column data, and a file format called JSON with type JSON and the rest left at default values. I want to copy the content from S3 to Snowflake, and I have tried the following statement:
COPY INTO test_firehose
FROM 's3://s3_bucket/firehose/2020/12/30/09/tracking-1-2020-12-30-09-38-46'
FILE_FORMAT = 'JSON';
And I receive the error:
SQL compilation error: JSON file format can produce one and only one column of type
variant or object or array. Use CSV file format if you want to load more than one column.
How could I solve this? Thanks
If you want to keep your data as JSON (rather than just as text), then you need to load it into a column with a datatype of VARIANT, not VARCHAR.

Read multiple JSONs from single REST Service response and put to Database Table - Talend

I have searched a lot but have not found an exact solution.
I have a REST service whose response contains rows, each row being a JSON object, as shown below:
{"event":"click1","properties":{ "time":"2 dec 2018","clicks":29,"parent":"jbar","isLast":"NO"}}
{"event":"click2","properties":{ "time":"2 dec 2018","clicks":35,"parent":"jbar3","isLast":"NO"}}
{"event":"click3","properties":{ "time":"2 dec 2018","clicks":10,"parent":"jbar2","isLast":"NO"}}
{"event":"click4","properties":{ "time":"2 dec 2018","clicks":9,"parent":"jbar1","isLast":"YES"}}
Each row is a JSON object (all are similar to each other). I have a database table with all those fields as columns. I want to loop through the rows and load all the data using Talend. This is what I have tried:
tRestClient--tNormalize--tExtractJsonFields--tOracleOutput
I provided the loop criteria and mapping in the tExtractJsonFields component, but it is not working and throws an error saying "json can not be null or empty".
I need help with this.
Since your web service returns multiple JSON objects in the response, the response as a whole is not valid JSON but rather a sequence of JSON documents, one per line.
You need to break it into individual JSON objects.
You can add a tNormalize between tRESTClient and tExtractJsonFields, and normalize the response on the "\n" character.
The error "json can not be null or empty" is due to an error in your JsonPath queries. You have to set the loop query to "$" and reference the JSON properties using "event", "properties.time", and so on.
Could you try this:
In your tExtractJsonFields, set the readBy property to JsonPath without loop.
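Outside Talend, the same idea (split the response on "\n", then read "event" and "properties.time" from each object) can be sketched in Python. This is only an illustration of the logic, not Talend configuration; the function names split_json_lines and flatten are made up for this example, and the sample body is the one from the question.

```python
import json

# Sample response body copied from the question: one JSON object per line.
response_body = "\n".join([
    '{"event":"click1","properties":{ "time":"2 dec 2018","clicks":29,"parent":"jbar","isLast":"NO"}}',
    '{"event":"click2","properties":{ "time":"2 dec 2018","clicks":35,"parent":"jbar3","isLast":"NO"}}',
    '{"event":"click3","properties":{ "time":"2 dec 2018","clicks":10,"parent":"jbar2","isLast":"NO"}}',
    '{"event":"click4","properties":{ "time":"2 dec 2018","clicks":9,"parent":"jbar1","isLast":"YES"}}',
]) + "\n"


def split_json_lines(body):
    """Split the body on newlines and parse each non-blank line as JSON."""
    parsed = []
    for line in body.split("\n"):
        line = line.strip()
        if not line:
            continue  # skip blank lines, e.g. the one after a trailing newline
        parsed.append(json.loads(line))
    return parsed


def flatten(obj):
    """Map one parsed object to a flat dict matching the table columns."""
    props = obj["properties"]
    return {
        "event": obj["event"],
        "time": props["time"],
        "clicks": props["clicks"],
        "parent": props["parent"],
        "isLast": props["isLast"],
    }


rows = [flatten(obj) for obj in split_json_lines(response_body)]
for row in rows:
    print(row)
```

Each flattened dict corresponds to one row that would be written to the database table.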

Rails serialized JSON to string gives error on mysql JSON field

I have an old table that used to save JSON to a string SQL field using the Rails ActiveRecord serializer. I created a new table whose field has the JSON datatype, which was introduced in MySQL 5.7. But directly copying the data from the old field to the new one gives me a JSON error saying the JSON structure is wrong.
More specifically, the problem is with Unicode characters, which my database does not support yet, and the database is too large to just migrate everything to support them.
I am looking for a way to migrate the data from the old field to the new JSON field.
I have seen that replacing the \u of the Unicode character with \\u in the JSON string solves the issue, but I am not able to simply do this:
update table_name set column_name=REPLACE(column_name, '\u', '\\u');
since it gives an error again
ERROR 3140 (22032): Invalid JSON text: "Invalid escape character in string." at position 564 in value for column 'table_name.column_name'.
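To update the table values from \u to \\u in SQL, double the backslashes in the string literals, because MySQL treats the backslash as an escape character there: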
update table_name set column_name=REPLACE(column_name, '\\u', '\\\\u') [where condition];
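The effect of that replacement can be checked outside the database. The following is a minimal Python sketch, assuming a made-up sample value; escape_unicode_sequences is a hypothetical helper, not part of Rails or MySQL. Python string literals also use the backslash as an escape character, so the literals are doubled in the same way as in the SQL statement.

```python
import json


def escape_unicode_sequences(raw_json):
    """Replace each backslash-u with backslash-backslash-u.

    In Python source "\\u" is the two characters \\ and u, and "\\\\u" is
    the three characters \\, \\ and u, mirroring the doubled SQL literals.
    """
    return raw_json.replace("\\u", "\\\\u")


# Hypothetical stored value; its characters are: {"name": "caf\u00e9"}
old_value = '{"name": "caf\\u00e9"}'
new_value = escape_unicode_sequences(old_value)

# Before: the JSON parser decodes the escape into the Unicode character.
before = json.loads(old_value)["name"]

# After: the parser sees an escaped backslash followed by plain text,
# so the value keeps the literal characters \u00e9.
after = json.loads(new_value)["name"]

print(new_value)
print(repr(before), repr(after))
```

A plain replace like this also touches sequences that are already escaped, so it is only safe to run once on data that has not been converted yet.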

How to save an array to a mysql database in laravel

I want to save some tags data to the database from the UI form:
Tags: ["male","female","kids"]
I tried everything, but it saves as a string. I also tried checking whether I can alter the data type to array or JSON in MySQL, but I got this:
You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'JSON(255) NOT NULL' at line 1
I also tried json_encode() and json_decode() but made no headway. What can I do?
There is no Array data type in the database, and judging by the error your MariaDB version does not accept JSON as a column type either.
What you can do is make the tags field TEXT, as suggested by @Rick James, then encode your input as JSON with json_encode() before inserting and decode it with json_decode() after retrieving it from the DB.
JSON is basically a minimal, readable format for structuring data, which means it can be stored as a string.
Here is a good post about JSON, in case you need it:
What is JSON?
There is no datatype called JSON. Instead use TEXT without the (255).
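The encode-before-insert, decode-after-select round trip can be sketched as follows. This is an illustration in Python rather than Laravel code: json.dumps and json.loads stand in for PHP's json_encode() and json_decode(), an in-memory SQLite database stands in for MariaDB, and the table name products is made up.

```python
import json
import sqlite3

tags = ["male", "female", "kids"]

# In-memory SQLite is only a stand-in for the real MySQL/MariaDB database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, tags TEXT NOT NULL)")

# Encode the list as a JSON string before inserting it into the TEXT column.
conn.execute("INSERT INTO products (tags) VALUES (?)", (json.dumps(tags),))

# The database hands back a plain string ...
stored = conn.execute("SELECT tags FROM products WHERE id = 1").fetchone()[0]

# ... which is decoded into a list again after retrieval.
restored = json.loads(stored)
conn.close()

print(type(stored).__name__, stored)
print(type(restored).__name__, restored)
```

The column only ever holds text; the array structure exists in the application before encoding and after decoding.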

How to serialise a spark sql row when it contains a comma

I am using Spark Jobserver https://github.com/spark-jobserver/spark-jobserver and Apache Spark for some analytic processing.
I am receiving back the following structure from jobserver when a job finishes
"status": "OK",
"result": [
"[17799.91015625,null,hello there how areyou?]",
"[50000.0,null,Hi, im fine]",
"[0.0,null,All good]"
]
The result doesn't contain valid JSON, as explained here:
https://github.com/spark-jobserver/spark-jobserver/issues/176
So I'm trying to convert the returned structure into a JSON structure. However, I can't simply insert ' (single quotes) into the result string based on the comma delimiter, as sometimes the result itself contains a comma.
How can I convert a Spark SQL row into a JSON object in the above situation?
I actually found a better way in the end.
From Spark 1.3.0 onwards you can use .toJSON on a DataFrame to convert it to JSON:
df.toJSON.collect()
To output a DataFrame's schema as JSON you can use
df.schema.json
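To see why a per-row JSON string avoids the comma problem from the question, here is a small Python sketch that does not use Spark at all. The row text is taken from the question; the column names amount, ref and message are hypothetical, since the question does not show the schema.

```python
import json

# One row exactly as the job server returned it in the question.
row_text = "[50000.0,null,Hi, im fine]"

# Splitting on the comma breaks the last field into two pieces.
naive_fields = row_text.strip("[]").split(",")

# A JSON object per row keeps the comma inside a quoted string value.
# The column names below are assumptions made for this illustration.
row = {"amount": 50000.0, "ref": None, "message": "Hi, im fine"}
encoded = json.dumps(row)
decoded = json.loads(encoded)

print(naive_fields)
print(encoded)
print(decoded["message"])
```

The naive split yields four pieces for a three-column row, while the JSON string decodes back to the three original values.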