Extract value from MySQL JSON string stored as VARCHAR - mysql

I have a MySQL column with JSON-like strings stored as VARCHAR:
{'#type': 'Organization', 'legalName': 'some company inc.'}
I tried to extract it using the following:
SELECT JSON_EXTRACT(columnname, "$.legalName")
FROM tablename
WHERE indexfield='specifics'
But all I get is NULL as output, plus the warning message "Current selection does not contain a unique column."

JSON requires its names and string values to be wrapped in DOUBLE QUOTES, like this:
{"#type": "Organization", "legalName": "some company inc."}
So
SELECT JSON_EXTRACT('{"#type": "Organization", "legalName": "some company inc."}', "$.legalName") AS xxx;
works and returns
"some company inc."
while your example has single quotes:
{'#type': 'Organization', 'legalName': 'some company inc.'}
So
SELECT JSON_EXTRACT("{'#type': 'Organization', 'legalName': 'some company inc.'}", "$.legalName") AS xxx;
does not work!
So it looks like the JSON you stored was not valid JSON in the first place.
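If you cannot fix the data at the source, one workaround is to rewrite the quotes on the fly before extracting. This is only a sketch: it assumes the stored values never contain embedded single or double quotes (an apostrophe in a company name would break it), and it reuses the column and table names from the question:
SELECT JSON_UNQUOTE(JSON_EXTRACT(REPLACE(columnname, '\'', '"'), '$.legalName')) AS legal_name
FROM tablename
WHERE indexfield = 'specifics';
JSON_UNQUOTE strips the surrounding double quotes from the extracted value.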

None of the answers resolved this for me. What I did instead was export the table to CSV, load it into a Google Sheet (after splitting it into manageable chunks), and then use 'text to column' to extract the value from the JSON-like string.
After that, I imported the CSV into a temp table, tested it, and brought it back live.
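For anyone who would rather stay inside MySQL, plain string functions can do the same "text to column" job without the CSV round trip. A minimal sketch, assuming the key always appears as 'legalName': ' and the value itself contains no single quotes (column and table names from the question):
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(columnname, "'legalName': '", -1), "'", 1) AS legal_name
FROM tablename
WHERE indexfield = 'specifics';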

Related

Redshift JSON Parsing

I have some JSON data in a Redshift table column of type character varying. An example entry is:
[{"value":["*"], "key":"testData"}, {"value":"["GGG"], key: "differentData"}]
I want to return values based on keys; how can I do this? I'm attempting to do something like
json_extract_path_text(column, 'value') but unfortunately it errors out. Any ideas?
So the first issue is that your string isn't valid JSON. There are mismatched and missing quotes. I think you mean:
[{"value":["*"], "key":"testData"}, {"value":["GGG"], "key": "differentData"}]
I don't know if this is a data issue or a transcription error, but these functions won't work unless the JSON text is valid.
The next thing to consider is that at the top level this JSON is an array, so you will need the json_extract_array_element_text() function to pick up an element of the array. For example:
json_extract_array_element_text('json string', 0)
So putting this together we can extract the first "value" with (untested):
json_extract_path_text(
  json_extract_array_element_text(
    '[{"value":["*"], "key":"testData"}, {"value":["GGG"], "key": "differentData"}]', 0
  ), 'value'
)
This should return the string ["*"].
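Applied to a table, the same composition looks like this (untested; mycolumn and mytable are hypothetical names standing in for your actual column and table):
SELECT
  json_extract_path_text(json_extract_array_element_text(mycolumn, 0), 'value') AS test_data,
  json_extract_path_text(json_extract_array_element_text(mycolumn, 1), 'value') AS different_data
FROM mytable;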

In Couchbase Java Query DSL, how do I filter for property-names that are not from the ASCII alphabet?

Couchbase queries should support any String as a property name in a filter (WHERE clause).
But the query below returns no values for the field names "7", "#", "&", "", and "?". It does work for the field name "a".
Note that I'm using the Java DSL API, not N1ql directly.
OffsetPath statement = select("*").from(i(bucket.name())).where(x(fieldName).eq(x("$t")));
JsonObject placeholderValues = JsonObject.create().put("t", fieldVal);
N1qlQuery q = N1qlQuery.parameterized(statement, placeholderValues);
N1qlQueryResult result = bucket.query(q);
But my bucket does have each of these JsonObjects, including those with unusual property names, as shown by an unfiltered query:
{"a":"a"}
{"#":"a"}
{"&":"a"}
{"":"a"}
{"?":"a"}
How do I escape property names or otherwise support these legal names in queries?
(This question relates to another one, but that is about values and this is about field names.)
The field name is treated as an identifier, so back-ticks are needed to escape it, thus:
select("*").from(i(bucket.name())).where(x("`" + fieldName + "`").eq(x("$value")))
with parameterization of $value, of course.
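For reference, the N1QL this DSL call should generate looks roughly like the following (a sketch with a hypothetical bucket name), where the back-ticks quote the otherwise illegal identifier:
SELECT * FROM `mybucket` WHERE `#` = $value;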

BigQuery parse json child column with special character

I have loaded an entire JSON file into a STRING column of a BigQuery table. Now I am trying to access the keys using the JSON_EXTRACT_SCALAR function, but I am getting a null result for the child keys that contain the special character period (".") in their name.
Here's the snippet of the data:
{"server_received_time":"2019-01-17 15:00:00.482000","app":161,"device_carrier":null,"$schema":12,"city":"Caro","user_id":null,"uuid":"9018","event_time":"2019-01-17 15:00:00.045000","platform":"Web","os_version":"49","vendor_id":711,"processed_time":"2019-01-17 15:00:00.817195","user_creation_time":"2018-11-01 19:16:34.971000","version_name":null,"ip_address":null,"paying":null,"dma":null,"group_properties":{},"user_properties":{"location.radio":"ca","vendor.userTier":"free","vendor.userID":"a989","user.id":"a989","user.tier":"free","location.region":"ca"},"client_upload_time":"2019-01-17 15:00:00.424000","$insert_id":"e8410","event_type":"LOADED","library":"amp\/4.5.2","vendor_attribution_ids":null,"device_type":"Mac","device_manufacturer":null,"start_version":null,"location_lng":null,"server_upload_time":"2019-01-17 15:00:00.493000","event_id":64,"location_lat":null,"os_name":"Chrome","vendor_event_type":null,"device_brand":null,"groups":{},"event_properties":{"content.authenticated":false,"content.subsection1":"regions","custom.DNT":true,"content.subsection2":"ca","referrer.url":"","content.url":"","content.type":"index","content.title":"","custom.cookiesenabled":true,"app.pillar":"feed","content.area":"news","app.name":"oc"},"data":{},"device_id":"","language":"English","device_model":"Mac","country":"","region":"","is_attribution_event":false,"adid":null,"session_id":15,"device_family":"Mac","sample_rate":null,"idfa":null,"client_event_time":"2019-01-17 14:59:59.987000"}
{"server_received_time":"2019-01-17 15:00:00.913000","app":161,"device_carrier":null,"$schema":12,"city":"Fo","user_id":null,"uuid":"9052","event_time":"2019-01-17 15:00:00.566000","platform":"Web","os_version":"71","vendor_id":797,"processed_time":"2019-01-17 15:00:01.301936","user_creation_time":"2019-01-17 15:00:00.566000","version_name":null,"ip_address":null,"paying":null,"dma":"CO","group_properties":{},"user_properties":{"user.tier":"free"},"client_upload_time":"2019-01-17 15:00:00.157000","$insert_id":"69ae","event_type":"START WEB SESSION","library":"amp\/4.5.2","vendor_attribution_ids":null,"device_type":"Android","device_manufacturer":null,"start_version":null,"location_lng":null,"server_upload_time":"2019-01-17 15:00:00.925000","event_id":1,"location_lat":null,"os_name":"Chrome Mobile","vendor_event_type":null,"device_brand":null,"groups":{},"event_properties":{"content.subsection3":"home","content.subsection2":"archives","content.title":"","content.keywords.subject":["Lifestyle\/Recreation and leisure\/Outdoor recreation\/Boating","Lifestyle\/Relationships\/Couples","General news\/Weather","Oddities"],"content.publishedtime":154687,"app.name":"oc","referrer.url":"","content.subsection1":"archives","content.url":"","content.authenticated":false,"content.keywords.location":["Ot"],"content.originaltitle":"","content.type":"story","content.authors":["Archives"],"app.pillar":"feed","content.area":"news","content.id":"1.49","content.updatedtime":1546878600538,"content.keywords.tag":["24 1","boat house","Ot","Rockcliffe","River","m"],"content.keywords.person":["Ber","Shi","Jea","Jean\u00e9tien"]},"data":{"first_event":true},"device_id":"","language":"English","device_model":"Android","country":"","region":"","is_attribution_event":false,"adid":null,"session_id":15477,"device_family":"Android","sample_rate":null,"idfa":null,"client_event_time":"2019-01-17 14:59:59.810000"}
{"server_received_time":"2019-01-17 15:00:00.913000","app":16,"device_carrier":null,"$schema":12,"city":"","user_id":null,"uuid":"905","event_time":"2019-01-17 15:00:00.574000","platform":"Web","os_version":"71","vendor_id":7973,"processed_time":"2019-01-17 15:00:01.301957","user_creation_time":"2019-01-17 15:00:00.566000","version_name":null,"ip_address":null,"paying":null,"dma":"DCO","group_properties":{},"user_properties":{"user.tier":"free"},"client_upload_time":"2019-01-17 15:00:00.157000","$insert_id":"d045","event_type":"LOADED","library":"am-js\/4.5.2","vendor_attribution_ids":null,"device_type":"Android","device_manufacturer":null,"start_version":null,"location_lng":null,"server_upload_time":"2019-01-17 15:00:00.925000","event_id":2,"location_lat":null,"os_name":"Chrome Mobile","vendor_event_type":null,"device_brand":null,"groups":{},"event_properties":{"content.subsection3":"home","content.subsection2":"archives","content.subsection1":"archives","content.keywords.subject":["Lifestyle\/Recreation and leisure\/Outdoor recreation\/Boating","Lifestyle\/Relationships\/Couples","General news\/Weather","Oddities"],"content.type":"story","content.keywords.location":["Ot"],"app.pillar":"feed","app.name":"oc","content.authenticated":false,"custom.DNT":false,"content.id":"1.4","content.keywords.person":["Ber","Shi","Jea","Je\u00e9tien"],"content.title":"","content.url":"","content.originaltitle":"","custom.cookiesenabled":true,"content.authors":["Archives"],"content.publishedtime":1546878600538,"referrer.url":"","content.area":"news","content.updatedtime":1546878600538,"content.keywords.tag":["24 1","boat house","O","Rockcliffe","River","pr"]},"data":{},"device_id":"","language":"English","device_model":"Android","country":"","region":"","is_attribution_event":false,"adid":null,"session_id":1547737199081,"device_family":"Android","sample_rate":null,"idfa":null,"client_event_time":"2019-01-17 14:59:59.818000"}
Here's the sample query against the table:
SELECT
  CAST(JSON_EXTRACT_SCALAR(data, '$.uuid') AS INT64) AS uuid_id,
  CAST(JSON_EXTRACT_SCALAR(data, '$.event_time') AS TIMESTAMP) AS event_time,
  JSON_EXTRACT_SCALAR(data, '$[event_properties].app.name') AS app_name,
  JSON_EXTRACT_SCALAR(data, '$[user_properties].user.tier') AS user_tier
FROM
  mytable
The above query gives null results for the app_name and user_tier columns even though data exists for them.
Following the BigQuery JSON function documentation - JSON Functions in Standard SQL
In cases where a JSON key uses invalid JSONPath characters, you can escape those characters using single quotes and brackets, [' '].
and running the query as:
SELECT
  CAST(JSON_EXTRACT_SCALAR(data, "$.uuid_id") AS INT64) AS uuid_id,
  CAST(JSON_EXTRACT_SCALAR(data, "$.event_time") AS TIMESTAMP) AS event_time,
  JSON_EXTRACT_SCALAR(data, "$.event_properties.['app.name']") AS app_name,
  JSON_EXTRACT_SCALAR(data, "$.user_properties.['user.tier']") AS user_tier
FROM
  mytable
results in the following error:
Invalid token in JSONPath at: .['app.name']
Please advise. What am I missing here?
You have an extra . before the [. Use
"$.event_properties['app.name']"

OpenEdge ABL reserved keyword as temp-table field name (inferred from JSON data)

I am stuck with the following situation:
My method receives a response from an external REST API call. The JSON response structure is as below:
{
  "members": [
    {
      "email_address": "random#address.org",
      "status": "randomstatus"
    },
    ...etc...
  ]
}
I am reading this into a temp-table with READ-JSON (inferring the ABL schema from the JSON data) and then trying to process the temp-table. This is where I am stuck:
when I try to put together a query that references the temp-table field "status", an error is raised.
Example:
hQuery:QUERY-PREPARE('FOR EACH ' + httSubscriber:NAME + ' WHERE ' + hBuffer:BUFFER-FIELD(iStatus):NAME + ' = "randomstatus"').
gives:
**Unable to understand after -- "members WHERE".(247)
I have tried referencing directly by name as well, same result.
Probably the "status" is a reserved keyword in ABL. Might that be the case? And how can I get over this issue to reference that "status" field?
Unfortunately the format and key names of JSON response are not under my control and I have to work with that.
You could use SERIALIZE-NAME in the temp-table definition to internally rename the field in question. Then you would have to refer to the field by another name, but in its serialized form it would still be known as status.
Here's an example where the status-field is renamed to exampleStatus.
DEFINE TEMP-TABLE ttExample NO-UNDO
  FIELD exampleStatus AS CHARACTER SERIALIZE-NAME "status".
/* Code to read json goes here... */
/* Access the field */
FOR EACH ttExample:
  DISPLAY ttExample.exampleStatus.
END.
I've been known to do silly things like this:
JSONData = replace( JSONData, '"status":', '"xstatus":' ).
Try qualifying the field with the temp-table name (hard-coded or via string appending): temp-table name + '.' + hBuffer:BUFFER-FIELD(iStatus):NAME (...)
It should help the compiler understand you're talking about the field. Since the qualified name is not restricted, this should force its hand and allow you to query.
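Applied to the QUERY-PREPARE from the question, that suggestion would look something like this (an untested sketch reusing the handles from the question):
hQuery:QUERY-PREPARE('FOR EACH ' + httSubscriber:NAME + ' WHERE ' + httSubscriber:NAME + '.' + hBuffer:BUFFER-FIELD(iStatus):NAME + ' = "randomstatus"').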

hive: create table / data type syntax for comma separated files

The text file is comma-separated. However, one of the columns, e.g. "Issue", can hold values such as "Other (phone, health club, etc.)" that themselves contain commas.
Question: What should the data type of "Issue" be? And how should I format the table (ROW FORMAT DELIMITED ... TERMINATED BY) so that the commas inside the Issue column are handled correctly?
I had set it this way:
create table consumercomplaints (ComplaintID int,
Product string,
Subproduct string,
Issue string,
Subissue string,
State string,
ZIPcode int,
Submittedvia string,
Datereceived string,
Datesenttocompany string,
Company string,
Companyresponse string,
Timelyresponse string,
Consumerdisputed string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
location '/user/hive/warehouse/mydb/consumer_complaints.csv';
Sample data --
Complaint ID,Product,Sub-product,Issue,Sub-issue,State,ZIP code,Submitted via,Date received,Date sent to company,Company,Company response,Timely response?,Consumer disputed?
943291,Debt collection,,Cont'd attempts collect debt not owed,Debt is not mine,MO,63123,Web,07/18/2014,07/18/2014,"Enhanced Recovery Company, LLC",Closed with non-monetary relief,Yes,
943698,Bank account or service,Checking account,Deposits and withdrawals,,CA,93030,Web,07/18/2014,07/18/2014,U.S. Bancorp,In progress,Yes,
943521,Debt collection,,Cont'd attempts collect debt not owed,Debt is not mine,OH,44116,Web,07/18/2014,07/18/2014,"Vital Solutions, Inc.",Closed with explanation,Yes,
943400,Debt collection,"Other (phone, health club, etc.)",Communication tactics,Frequent or repeated calls,MD,21133,Web,07/18/2014,07/18/2014,"The CBE Group, Inc.",Closed with explanation,Yes,
I think you need to delimit your data with some control character, such as Control-A (Hive's default field delimiter). I don't think any data type will handle this for you. Alternatively, you can write a UDF to load the data and take care of the formatting in the UDF logic.
Short of writing a SerDe, you could do two things:
escape the commas in the original data before loading, using some character, e.g. \
and then create the Hive table using ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' ESCAPED BY '\\'
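Concretely, that changes only the row format clause of the question's DDL, something like this (a sketch, assuming the raw file has been pre-processed so each embedded comma appears as \,):
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' ESCAPED BY '\\'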
Or you can use a regex that takes care of commas enclosed within double quotes.
First apply a regex to the data as shown in the Hortonworks/Apache manuals:
regexp_extract(col_value, '^(?:([^,]*)\,?){1}', 1) player_id
(source: https://web.archive.org/web/20171125014202/https://hortonworks.com/tutorial/how-to-process-data-with-apache-hive/)
Ensure that you are able to load and see your data using this expression (barring the enclosed commas).
Then modify the expression to account for enclosed commas. You can do something like this:
String s = "a,\"hi, I am here\",c,d,\"ahoy, mateys\"";
String pattern ="^(?:([^\",]*|\"[^\"]*\"),?){4}";
p = Pattern.compile(pattern);
Matcher m = p.matcher(s);
if (m.find()) {
System.out.println("YES-"+m.groupCount());
System.out.println("=>"+m.group(1));
}
By changing {4} to {1}, {2}, ... you can get the respective fields.
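Back in Hive, the same pattern can pull the comma-containing Issue column (field 4) out directly. A sketch, assuming each raw line has been loaded into a single STRING column named line in a staging table consumercomplaints_raw (both names hypothetical):
SELECT regexp_extract(line, '^(?:([^",]*|"[^"]*"),?){4}', 1) AS issue
FROM consumercomplaints_raw;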