I am trying to create a table from nested JSON.
The second layer of the JSON is very complex, and I don't want to keep its schema in the table definition as a struct column.
I am looking for a solution that allows me to keep it as a string.
For example:
{
"request_id": "3dbd4ee3-96fc-4342-bd62",
"payload": { < COMPLEX NESTED JSON > },
"timestamp": 1569161622
}
I was trying to use the following create statement:
CREATE EXTERNAL TABLE data (
request_id string,
payload string,
`timestamp` int
)
ROW FORMAT serde 'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION 's3a://bucket'
Is there any SerDe property or mapping I can use to define the nested object as a string?
You can use the org.openx.data.jsonserde.JsonSerDe SerDe.
For more information on this SerDe, see https://github.com/rcongiu/Hive-JSON-Serde
Hope this helps
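To make the goal concrete, here is a minimal Python sketch (not Hive) of the row shape the question asks for: the outer fields are parsed, while the nested payload is kept as one JSON string. The payload contents and the helper name flatten_payload are made up for illustration, and the sketch says nothing about how either SerDe behaves internally.

```python
import json

# Hypothetical record shaped like the example in the question;
# the payload contents are invented for illustration.
raw_line = (
    '{"request_id": "3dbd4ee3-96fc-4342-bd62", '
    '"payload": {"a": {"b": [1, 2]}}, '
    '"timestamp": 1569161622}'
)


def flatten_payload(line: str) -> dict:
    """Parse one JSON record and re-serialize its nested payload as a string."""
    record = json.loads(line)
    # Compact separators keep the string free of extra whitespace.
    record["payload"] = json.dumps(record["payload"], separators=(",", ":"))
    return record


row = flatten_payload(raw_line)
print(row["request_id"], row["payload"], row["timestamp"])
```

The payload column can then be parsed later, on demand, by whatever consumes the table.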
I have one big JSON object.
How can I achieve the SQL below in PostgreSQL without using a table?
SELECT value->'col1' AS mycolumn
FROM json_object_keys('{"jcol1": "A", "jcol2": "B"}') as value
The JSON is {"activities-heart":[{"dateTime":"2016-10-17","restingHeartRate":65}}]}
Expected output: heartrate: 65
I'm testing Data Lake for an application I am developing. I'm new to U-SQL and Data Lake and am just trying to query all records in a JSON file. Right now it only returns one record, and I'm not sure why, because the file has about 200.
My code is:
DECLARE @input string = @"/MSEStream/output/2016/08/12_0_fc829ede3c1d4cf9a3278d43e7e4e9d0.json";
REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];
@allposts =
EXTRACT
id string
FROM @input
USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor();
@result =
SELECT *
FROM @allposts;
OUTPUT @result
TO "/ProcessedQueries/all_posts.csv"
USING Outputters.Csv();
Data Example:
{
"id":"398507",
"contenttype":"POST",
"posttype":"post",
"uri":"http://twitter.com/etc",
"title":null,
"profile":{
"#class":"PublisherV2_0",
"name":"Company",
"id":"2163171",
"profileIcon":"https://pbs.twimg.com/image",
"profileLocation":{
"#class":"DocumentLocation",
"locality":"Toronto",
"adminDistrict":"ON",
"countryRegion":"Canada",
"coordinates":{
"latitude":43.7217,
"longitude":-31.432},
"quadKey":"000000000000000"},
"displayName":"Name",
"externalId":"00000000000"},
"source":{
"name":"blogs",
"id":"18",
"param":"Twitter"},
"content":{
"text":"Description of post"},
"language":{
"name":"English",
"code":"en"},
"abstracttext":"More Text and links",
"score":{}
}
Thank you for the help in advance
The JsonExtractor takes an argument, a JSON Path expression, that specifies which items or objects are mapped to rows. If you don't specify anything, it takes the root, which produces a single row.
You want every item in the array, so specify it as:
USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor("[*]");
Here [*] is the JSON Path expression that means "give me all the elements of the array", which in this case is the top-level array.
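The difference between the two modes can be sketched in Python. This is only an analogy for the row mapping described above, not the extractor's implementation; the file content and both function names are hypothetical.

```python
import json

# Hypothetical file content: a top-level array of posts.
file_text = '[{"id": "398507"}, {"id": "398508"}, {"id": "398509"}]'
document = json.loads(file_text)


def rows_from_root(doc):
    """No path given: the whole document is treated as a single row."""
    return [doc]


def rows_from_each_element(doc):
    """Analogue of "[*]": every element of the top-level array is a row."""
    return list(doc)


ids = [row["id"] for row in rows_from_each_element(document)]
print(len(rows_from_root(document)), "row without a path")
print(len(ids), "rows with [*]:", ids)
```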
If your JSON has a node called id, the original script posted in the question returns the node named "id" under the root node. To get all the nodes, structure your script as follows:
@allposts =
EXTRACT
id string,
contenttype string,
posttype string,
uri string,
title string,
profile string
FROM @input
USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor();
Please let us know if it works. The alternative is to use a native extractor to read the whole file into a single string (as MRys mentioned, this works as long as your JSON is under 128 KB).
@allposts =
EXTRACT
json string
FROM @input
USING Extractors.Text(delimiter:'\b', quoting:false);
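The read-as-one-string alternative can be sketched in Python as follows. The 128 KB figure is the limit quoted in the answer; the sample text and the helper name parse_if_small are assumptions for illustration only.

```python
import json

MAX_STRING_BYTES = 128 * 1024  # size limit quoted in the answer above


def parse_if_small(text: str):
    """Parse the whole document from one string, refusing oversized input."""
    size = len(text.encode("utf-8"))
    if size > MAX_STRING_BYTES:
        raise ValueError(f"JSON is {size} bytes, over the {MAX_STRING_BYTES} byte limit")
    return json.loads(text)


# Hypothetical file content held in a single string.
file_text = '[{"id": "398507"}, {"id": "398508"}]'
posts = parse_if_small(file_text)
print(len(posts), "posts parsed from one string")
```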
I have a JSON object that contains JSON data under a key. I want to extract values such as name and address from that object and store them in variables.
Controller:
json_arr = new JSONArray(j_str);
int count = json_arr.length();
json_o.put("user", json_arr);
j_str contains the following data:
[{"Bollywood":[{"actor":[{"name":"AA","gender":"Male"},{"name":"BB","gender":"Male"}]}]},{"Hollywood":[{"actor":[{"name":"CC","gender":"Male"},{"name":"DD","gender":"Male"}]}]}]
The array is then put into a JSON object, json_o, under the key "user". How can I get a specific value, such as the name of the second actor from Hollywood (i.e. the value DD), and then store it in a string?
Short answer: use Jackson to map the JSON string to a Java object, and then extract that value into a variable.
Here is a quick guide on doing this with Jackson: http://www.mkyong.com/java/how-to-convert-java-object-to-from-json-jackson/
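Whichever library does the parsing, the path to the value is the same: the "user" array, the entry that has a "Hollywood" key, its first element, the "actor" array, the second actor, and finally "name". The following Python sketch walks that path on the data from the question; it is an illustration of the navigation, not the Jackson approach itself, and the helper name second_actor_name is hypothetical.

```python
import json

# Data copied from the question.
j_str = (
    '[{"Bollywood":[{"actor":[{"name":"AA","gender":"Male"},'
    '{"name":"BB","gender":"Male"}]}]},'
    '{"Hollywood":[{"actor":[{"name":"CC","gender":"Male"},'
    '{"name":"DD","gender":"Male"}]}]}]'
)

# Same shape as json_o in the question: the array stored under "user".
json_o = {"user": json.loads(j_str)}


def second_actor_name(root: dict, industry: str) -> str:
    """Return the name of the second actor listed under the given industry."""
    for entry in root["user"]:
        if industry in entry:
            actors = entry[industry][0]["actor"]
            return actors[1]["name"]
    raise KeyError(industry)


name = second_actor_name(json_o, "Hollywood")
print(name)  # DD
```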
Currently my backend sends the ObjectId to the UI as an object with timestamp, machineIdentifier, etc., but in the database it is stored as its hex representation. Is there any way (an annotation or something else) to serialize it to JSON as the hex representation?
I solved it this way:
JSONObject idObj = (JSONObject)obj.get("_id");
String strID = (String) idObj.get("$oid");
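The same unwrapping step can be sketched in Python, assuming the document arrives with "_id" wrapped as {"$oid": "<hex>"} as the two lines above imply. The sample document, its hex value, and the helper name flatten_object_id are made up for illustration.

```python
import json

# Hypothetical document; the hex value is only an example.
doc_text = '{"_id": {"$oid": "507f1f77bcf86cd799439011"}, "name": "example"}'


def flatten_object_id(doc: dict) -> dict:
    """Return a copy of doc with a {"$oid": hex} wrapper replaced by the hex string."""
    id_field = doc.get("_id")
    if isinstance(id_field, dict) and "$oid" in id_field:
        return {**doc, "_id": id_field["$oid"]}
    return doc


flattened = flatten_object_id(json.loads(doc_text))
print(json.dumps(flattened))
```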
What helped was changing the id type from ObjectId to String.