Extracting JSON array values from a MySQL query - mysql

I have a column of type JSON in MySQL, named IMEI_DATA.
IMEI_DATA: {"SUBSCRIBER_HISTORY": [{"IMEI": "12345678", "COUNTER": "1", "Service Flag": "7", "UPDATE_DATE_UNIX_TIME": "65667"}]}
I need to extract the IMEI field from it, and tried a JSON_EXTRACT query like this:
SELECT json_extract(IMEI_DATA,"$.SUBSCRIBER_HISTORY.IMEI") FROM SUBSCRIBER_HISTORY_JSON
However, it returns NULL.
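One likely cause, offered as a hedged note rather than a confirmed answer: SUBSCRIBER_HISTORY holds an array, so a path that steps straight from the array to IMEI matches nothing. In MySQL's JSON path syntax an array element is addressed with an index, as in $.SUBSCRIBER_HISTORY[0].IMEI, or with [*] to reach every element. The Python sketch below mirrors that lookup on the sample value from the question; the variable names are illustrative only.

```python
import json

# Sample value copied from the question's IMEI_DATA column.
imei_data = json.loads(
    '{"SUBSCRIBER_HISTORY": [{"IMEI": "12345678", "COUNTER": "1", '
    '"Service Flag": "7", "UPDATE_DATE_UNIX_TIME": "65667"}]}'
)

history = imei_data["SUBSCRIBER_HISTORY"]

# The array itself has no "IMEI" member, which is why a path like
# $.SUBSCRIBER_HISTORY.IMEI finds nothing.
array_has_imei_key = isinstance(history, dict) and "IMEI" in history

# Equivalent of $.SUBSCRIBER_HISTORY[0].IMEI: index into the array first.
first_imei = history[0]["IMEI"]

# Equivalent of $.SUBSCRIBER_HISTORY[*].IMEI: collect IMEI from every element.
all_imeis = [entry["IMEI"] for entry in history]

print(array_has_imei_key, first_imei, all_imeis)
```

The same reasoning applies to any key nested under an array: the index (or wildcard) step has to come before the member name.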

Related

How to query JSON data in Athena with an # symbol in the key name and duplicate keys

The data I have been tasked to query is structured like this:
{
"#timestamp": "2022-11-17T21:00:19.191+00:00",
"#version": 1,
"message": "log message",
"logger_name": "com.logger.name",
"thread_name": "tomcat-thread-13",
"level": "INFO",
"level_value": 20000,
"application_name": "app_name",
"vpc": "vpc_name",
"region": "eu-west-1",
"aid": "ffffffff-ffff-ffff-ffff-ffffffffffff",
"account": "prod",
"rq": "ffffffff-ffff-ffff-ffff-ffffffffffff",
"log_shipper": "firehose",
"application_name": "app_name",
"account": "prod",
"region": "eu-west-1"
}
As you can see there are some duplicate keys in here, so both the Hive and OpenX JSON SerDe throw an error and won't query it at all.
I've created a table using the Ion SerDe, which can read the data, but the #timestamp and #version fields are always blank. All the other fields are read correctly.
The initial table definition I had was this...
CREATE EXTERNAL TABLE firehose_logs_pe (
`#timestamp` STRING,
`#version` STRING,
<other columns>
)
ROW FORMAT SERDE
'com.amazon.ionhiveserde.IonHiveSerDe'
STORED AS ION
LOCATION 's3://s3-bucket-name/folder/'
I also tried to rename the fields and use a path extractor to get the values, like this...
CREATE EXTERNAL TABLE firehose_logs_pe (
ts STRING,
version STRING,
<other columns>
)
ROW FORMAT SERDE
'com.amazon.ionhiveserde.IonHiveSerDe'
WITH SERDEPROPERTIES (
'ion.ts.path_extractor' = '(`#timestamp`)',
'ion.version.path_extractor' = '(`#version`)'
)
STORED AS ION
LOCATION 's3://s3-bucket-name/folder/'
However, the values of the ts and version fields are still empty. The query also seems to run slower using the path extractors.
Is there any way to query this data in this format with Athena? As a test, I did a find and replace on one of the JSON files to remove the #, at which point everything worked as it should. However, this is not a practical solution when I have about 20 TB of data to query in hundreds of millions of files.

How to query JSON values from a column into one JSON array in MS-SQL 2016?

I'm starting to figure out how to handle JSON in MSSQL 2016+.
I simply created a table with an ID (int) column and a JSON (nvarchar) column.
Here are my queries to show the issue:
First query just returns the relational table result, nice and as expected.
SELECT * FROM WS_Test
-- Results:
1 { "name": "thomas" }
2 { "name": "peter" }
Second query returns just the json column as "JSON" created by MSSQL.
Not nice, because it outputs the json column content as a string and not as parsed JSON.
SELECT json FROM WS_Test FOR JSON PATH
-- Results:
[{"json":"{ \"name\": \"thomas\" }"},{"json":"{ \"name\": \"peter\" }"}]
Third query gives me two result rows with json column content as parsed JSON, good.
SELECT JSON_QUERY(json, '$') as json FROM WS_Test
-- Results:
{ "name": "thomas" }
{ "name": "peter" }
Fourth query gives me the json column contents as ONE (!) JSON object, perfectly parsed.
SELECT JSON_QUERY(json, '$') as json FROM WS_Test FOR JSON PATH
-- Results:
[{"json":{ "name": "thomas" }},{"json":{ "name": "peter" }}]
BUT:
I don't want the "json" property wrapping the json column content in each array object of example four. I just want ONE array containing the column contents, no less, no more. Like this:
[
{
"name": "peter"
},
{
"name": "thomas"
}
]
How can I achieve this with just T-SQL? Is this even possible?
The FOR JSON clause will always include the column names - however, you can simply concatenate all the values in your json column into a single result, and then add the square brackets around that.
First, create and populate a sample table (please save us this step in your future questions):
CREATE TABLE WS_Test
(
Id int,
Json nvarchar(1000)
);
INSERT INTO WS_Test(Id, Json) VALUES
(1, '{ "name": "thomas" }'),
(2, '{ "name": "peter" }');
For SQL Server 2017 or higher, use the built in string_agg function:
SELECT '[' + STRING_AGG(Json, ',') + ']' As Result
FROM WS_Test
For lower versions, you can use FOR XML PATH with STUFF to get the same result as STRING_AGG:
SELECT STUFF(
(
SELECT ',' + Json
FROM WS_Test
FOR XML PATH('')
), 1, 1, '[')+ ']' As Result
The result for both of these queries will be this:
Result
[{ "name": "thomas" },{ "name": "peter" }]
You can see a live demo on DB<>Fiddle
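The concatenation idea is not specific to T-SQL. As a rough illustration, the Python sketch below joins the raw JSON strings from the sample rows with commas, wraps them in square brackets, and parses the result to confirm it is a valid JSON array. The rows list is a stand-in for the WS_Test table, not a real database call.

```python
import json

# Stand-in for the rows of WS_Test (Id, Json); no database is involved here.
rows = [
    (1, '{ "name": "thomas" }'),
    (2, '{ "name": "peter" }'),
]

# Same shape as '[' + STRING_AGG(Json, ',') + ']'
result = "[" + ",".join(json_text for _id, json_text in rows) + "]"
print(result)

# Parsing confirms that the concatenated string is a well-formed JSON array.
parsed = json.loads(result)
print(parsed)
```

This only holds as long as every stored value is itself valid JSON; a single malformed row would make the whole concatenated array unparseable.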

How to parse the JSON value of a text column in Cassandra

I have a column of type text that contains a JSON value.
{
"customer": [
{
"details": {
"customer1": {
"name": "john",
"addresses": {
"address1": {
"line1": "xyz",
"line2": "pqr"
},
"address2": {
"line1": "abc",
"line2": "efg"
}
}
},
"customer2": {
"name": "robin",
"addresses": {
"address1": null
}
}
}
}
]
}
How can I extract the 'address1' JSON field of the column with a query?
First I am trying to fetch the JSON value; then I will parse it.
SELECT JSON customer from text_column;
With my query, I get the following error.
com.datastax.driver.core.exceptions.SyntaxError: line 1:12 no viable
alternative at input 'customer' (SELECT [JSON] customer...)
Cassandra version 2.1.13
You can't use SELECT JSON in Cassandra v2.1.x (CQL v3.2.x).
In Cassandra v2.1.x (CQL v3.2.x), the only supported operations after SELECT are:
DISTINCT
COUNT (*)
COUNT (1)
column_name AS new_name
WRITETIME (column_name)
TTL (column_name)
dateOf(), now(), minTimeuuid(), maxTimeuuid(), unixTimestampOf(), typeAsBlob() and blobAsType()
Cassandra v2.2.x (CQL v3.3.x) introduced SELECT JSON:
With SELECT statements, the new JSON keyword can be used to return each row as a single JSON encoded map. The remainder of the SELECT statement behavior is the same.
The result map keys are the same as the column names in a normal result set. For example, a statement like “SELECT JSON a, ttl(b) FROM ...” would result in a map with keys "a" and "ttl(b)". However, there is one notable exception: for symmetry with INSERT JSON behavior, case-sensitive column names with upper-case letters will be surrounded with double quotes. For example, “SELECT JSON myColumn FROM ...” would result in a map key "\"myColumn\"" (note the escaped quotes).
The map values will be JSON-encoded representations (as described below) of the result set values.
If your Cassandra version is 2.1.x or below, you can use a Python-based approach.
Write a Python script using the Cassandra Python API.
Fetch your row first, then call json.loads from Python's json module on the text column value. This parses the JSON text into a Python dict, from which you can extract the nested keys you need. See the code snippet below.
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider
import json

if __name__ == '__main__':
    auth_provider = PlainTextAuthProvider(username='xxxx', password='xxxx')
    cluster = Cluster(['0.0.0.0'],
                      port=9042, auth_provider=auth_provider)
    session = cluster.connect("keyspace_name")
    print("session created successfully")
    rows = session.execute('select * from user limit 10')
    for user_row in rows:
        # Parse the JSON text column into a Python dict
        customer_dict = json.loads(user_row.customer)
        print(customer_dict.keys())
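Once the column value is a dict, the nested address1 entries can be reached with ordinary key lookups. The sketch below runs without a Cassandra connection: it uses a trimmed copy of the sample document from the question as a string literal, and the helper name extract_address1 is made up for this example.

```python
import json

# Trimmed copy of the sample document from the question (with valid commas).
customer_json = """
{
  "customer": [
    {
      "details": {
        "customer1": {
          "name": "john",
          "addresses": {
            "address1": {"line1": "xyz", "line2": "pqr"},
            "address2": {"line1": "abc", "line2": "efg"}
          }
        },
        "customer2": {
          "name": "robin",
          "addresses": {"address1": null}
        }
      }
    }
  ]
}
"""


def extract_address1(document):
    """Map each customer key to its address1 value (None when it is null)."""
    found = {}
    for entry in document.get("customer", []):
        for key, customer in entry.get("details", {}).items():
            found[key] = customer.get("addresses", {}).get("address1")
    return found


address1_by_customer = extract_address1(json.loads(customer_json))
print(address1_by_customer)
```

Using .get with defaults keeps the lookup from raising when a customer has no addresses entry; a JSON null, as for customer2, comes back as None.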

Select JSON array's fields from SQL view to create a column

I have an SQL table in which one of the columns contains a JSON array in the following format:
[
{
"id":"1",
"translation":"something here",
"value":"value of something here"
},
{
"id":"2",
"translation":"something else here",
"value":"value of something else here"
},
..
..
..
]
Is there any way to use an SQL query to retrieve columns with the ID as the header and the "value" as the value of the column, instead of returning only one column with the JSON array?
For example, if I run:
SELECT column_with_json FROM myTable
It will return the above array, whereas I want it to return
1,2
value of something here, value of something else here
You can't use SQL to retrieve columns from the JSON stored inside the table: to the database engine the JSON is just unstructured text saved in a text field.
Some relational databases, like PostgreSQL, have a JSON type and functions that support JSON queries. If this is your case, you should be able to perform the query you want.
Check this for an example of how it works with PostgreSQL:
http://clarkdave.net/2013/06/what-can-you-do-with-postgresql-and-json/
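If the database in use has no JSON functions, the reshaping can also be done in application code after fetching the column. The Python sketch below assumes the column value has already been read into a string (column_with_json here is a hard-coded stand-in containing the two entries shown in the question) and builds a header row from the id fields and a data row from the value fields.

```python
import json

# Hard-coded stand-in for one value of column_with_json from myTable.
column_with_json = """
[
  {"id": "1", "translation": "something here", "value": "value of something here"},
  {"id": "2", "translation": "something else here", "value": "value of something else here"}
]
"""

items = json.loads(column_with_json)

# Header row from the "id" fields, data row from the "value" fields.
header = [item["id"] for item in items]
values = [item["value"] for item in items]

print(",".join(header))
print(", ".join(values))
```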

Yahoo YQL API - How to select a JSON field whose name is a reserved YQL keyword?

For example, I got the following JSON from a URL:
{ "time": "2014-05-10 06:23:36 UTC",
"values": [
{
"time_timetable_utc": "2014-05-10T06:25:00Z",
"time_realtime_utc": null,
"flags": ""
},
{
"time_timetable_utc": "2014-05-10T06:45:00Z",
"time_realtime_utc": null,
"flags": ""
}
]
}
This works in YQL:
select time from json where url="{url}"
It returns only the time field:
{"time": "2014-05-10 06:23:36 UTC"}
But if I want to get only the "values" array field with the following:
select values from json where url="{url}"
I get this error message:
Query syntax error(s) [line 1:7 expecting fields_or_star got 'values']
I just want to ask: is it possible to select a JSON field whose name is a reserved Yahoo YQL keyword?
I know this works:
select * from json where url="{url}" and itemPath="json.values"
But is it possible to do it without using the "itemPath" condition?
How to escape reserved word like "values" in YQL select?
Just want to ask: is it possible to select a field named "values"?
No. (Sorry!)