How to Query Nested JSON in Google BigQuery - json

I'm having trouble extracting data from a JSON string.
Because of the way the data was pulled, everything is nested into a single string under the data field.
How it looks in BigQuery:
Screenshot of the schema:
Below is an example of what the string looks like:
{"id":1381,"email":"J.Smith#gmail.com","name":"Jake Smith","sub_network_ids":[2375,2270],"extended_updated_at":"2022-01-27T00:02:14Z"}
If I simply wanted to pull the id, email, and name out of this string and into a table, how would I go about it? I tried using JSON_EXTRACT with UNNEST, but that didn't work out the way I expected.
Any help would be appreciated, thanks.
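As a minimal sketch of what the extraction amounts to, the Python below parses the example string and keeps only the three scalar fields. The names `sample` and `pick_fields` are hypothetical and exist only for illustration. Inside BigQuery itself, the equivalent would likely be one JSON_EXTRACT_SCALAR(data, '$.id') call per field, with no UNNEST, since id, email, and name are not arrays.

```python
import json

# Example string from the question (the value of the `data` column).
sample = (
    '{"id":1381,"email":"J.Smith#gmail.com","name":"Jake Smith",'
    '"sub_network_ids":[2375,2270],'
    '"extended_updated_at":"2022-01-27T00:02:14Z"}'
)


def pick_fields(raw, keys=("id", "email", "name")):
    """Parse one JSON string and return only the requested top-level keys."""
    record = json.loads(raw)
    return {key: record.get(key) for key in keys}


row = pick_fields(sample)
print(row)
```

Only `sub_network_ids` is an array in this record, so it is the only field for which an UNNEST-style expansion would be relevant.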

Related

Unable to use Where-Object as value is nested in array with PowerShell

I'm currently working through a CSV of SharePoint audit data and trying to evaluate several fields that are nested in the AuditData field.
I've tried all the usual approaches, such as $_.AuditData.CreatedDate, but can't figure out how to query the data that's nested in it.
It's my first time trying to evaluate against data that's structured like this, and any help on how best to go about it would be very much appreciated.
I've tried the Where-Object clause and it's not returning the data I would expect.
"PSComputerName","RunspaceId","PSShowComputerName","RecordType","CreationDate","UserIds","Operations","AuditData","ResultIndex","ResultCount","Identity","IsValid","ObjectState"
"REDACTED","REDACTED","REDACTED","REDACTED","REDACTED","REDACTED","REDACTED","{""AppAccessContext"":{""AADSessionId"":""REDACTED"",""CorrelationId"":""REDACTED"",""TokenIssuedAtTime"":""REDACTED"",""UniqueTokenId"":""REDACTED""},""CreationTime"":""REDACTED"",""Id"":""REDACTED"",""Operation"":""REDACTED"",""OrganizationId"":""REDACTED"",""RecordType""REDACTED""UserKey"":""REDACTED"",""UserType""REDACTED""Version""REDACTED""Workload"":""REDACTED"",""ClientIP"":""REDACTED"",""ObjectId"":""REDACTED"",""UserId"":""REDACTED"",""CorrelationId"":""REDACTED"",""EventSource"":""REDACTED"",""ItemType"":""REDACTED""",""ListId"":""REDACTED"",""ListItemUniqueId"":""REDACTED"",""Site"":""REDACTED"",""UserAgent"":""REDACTED"",""WebId"":"REDACTED"",""MachineId"":""REDACTED"",""FileSyncBytesCommitted"":""REDACTED"",""HighPriorityMediaProcessing""REDACTED""ImplicitShare"":""REDACTED"",""IsManagedDevice""REDACTED""SourceFileExtension"":""REDACTED"",""SiteUrl"":""REDACTED"",""SourceFileName"":""REDACTED"",""SourceRelativeUrl"":""REDACTED""}","REDACTED","REDACTED","REDACTED","REDACTED","REDACTED"
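The AuditData column holds a JSON document stored as plain text, so its fields only become addressable once that text has been parsed. A minimal sketch of the idea in Python follows. The two-column `csv_text` excerpt and its values are made up for illustration, since the redacted row above is not valid JSON as posted. In PowerShell the analogous step would presumably be passing AuditData through ConvertFrom-Json before filtering.

```python
import csv
import io
import json

# Hypothetical two-column excerpt standing in for the real export.
csv_text = (
    '"Operations","AuditData"\n'
    '"FileAccessed","{""CreationTime"":""2023-01-01T10:00:00"",""Operation"":""FileAccessed""}"\n'
    '"FileDeleted","{""CreationTime"":""2023-01-02T11:30:00"",""Operation"":""FileDeleted""}"\n'
)


def load_audit_rows(text):
    """Read the CSV and replace each AuditData string with the parsed object."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["AuditData"] = json.loads(row["AuditData"])
        rows.append(row)
    return rows


rows = load_audit_rows(csv_text)

# Filter on a field nested inside AuditData.
deleted = [r for r in rows if r["AuditData"]["Operation"] == "FileDeleted"]
print(deleted[0]["AuditData"]["CreationTime"])
```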

Advanced mapping of JSON in Azure Data Factory - some guidance requested

I'm trying to map a JSON document (sensor data) into a more meaningful representation using Mapping Data Flows. However, I'm having a hard time getting this to work and would really appreciate some insight or recommendations on how to solve the following:
The input is
What I would like to end up with is the following:
Any pointers as to how this can be implemented are more than welcome.
This can be accomplished using the Copy activity, followed by the split function in a Derived Column transformation in Azure Data Factory.
Use the Copy activity to read the JSON file as the source and, in the sink, use a SQL database to store the data as a table. In the Mapping tab, import the schema and map the JSON records to the corresponding column names. Refer to this third-party tutorial for guidance: https://sqlkover.com/dynamically-map-json-to-sql-in-azure-data-factory/
Next, use the Data Flow activity and choose as its source the SQL table that you used as the sink above.
Select the Derived Column transformation.
Use the split function.
Add the column that will hold the split values.
Use split(<column_name_to_split>, '_') to split the column on the _ delimiter. Change <column_name_to_split> to the name of the column you want to split.
Preview the data to check the result.
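To make the effect of the split step concrete, here is a small Python sketch of the same operation outside Data Factory. The column name `sensor` and its values are hypothetical, because the original input was not included in the post. The helper only mirrors what splitting a column on '_' produces.

```python
# Hypothetical rows; the real sensor payload was not shown in the question.
rows = [
    {"sensor": "temperature_room1"},
    {"sensor": "humidity_room2"},
]


def split_column(records, column, delimiter="_"):
    """Return copies of the rows with an extra '<column>_parts' list column."""
    return [
        {**record, column + "_parts": record[column].split(delimiter)}
        for record in records
    ]


result = split_column(rows, "sensor")
for record in result:
    print(record)
```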

SSRS - extract data from column containing JSON

I have a dataset with a column containing arrays of JSON data that looks like:
[{"name":"aaa","type":"yyy"},{"name":"bbb","type":"ccc"}]
or more specifically:
dataset with JSON array column
Is there any straightforward method of extracting the JSON data from the column, using something like JSON_QUERY, so that I can use it in a report?
As far as I can tell, the existing JSON array format is not usable with any of the T-SQL JSON functions.
The array in the column "jsonCol" needs to be in the form of:
{ "tag": [{"name":"aaa","type":"yyy"},{"name":"bbb","type":"ccc"}]}
and then I can extract each array element individually with:
SELECT JSON_QUERY(jsonCol, '$.tag[0]') as tag
FROM
So I could add a prefix and suffix string to the select statement to fix this as long as no one else will see it.
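The prefix-and-suffix idea can be sketched in Python to show what the wrapped document looks like and how the '$.tag[0]' path resolves against it. The variable names are hypothetical, and this only illustrates the string manipulation, not SQL Server's behaviour.

```python
import json

# Column value as shown in the question.
json_col = '[{"name":"aaa","type":"yyy"},{"name":"bbb","type":"ccc"}]'

# Add the prefix and suffix so the array sits under a "tag" key.
wrapped = '{ "tag": ' + json_col + '}'

doc = json.loads(wrapped)

# Equivalent of the path '$.tag[0]': first element of the "tag" array.
first = doc["tag"][0]
print(first)
```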

Alpha Anywhere: Can I populate JSON data into the list

Can I populate a list with JSON data? I have a general list containing data for several sessions, but I need to filter it by my current session and insert the result into another list. My idea is to use the filtered JSON data, since I have already managed to filter it in JSON format. I've looked into some threads that might be related but have found nothing so far. I hope someone can point me to the right page.
I missed this page, or maybe I overlooked it: https://forum.alphasoftware.com/showthread.php?119524-How-to-populate-a-List-from-a-JSON-formatted-field.
Anyway, populating JSON data into a list in Alpha Anywhere is easy. First, get the JSON data (in my case I produce it from another list). With this data (already in JSON format), I do the filtering using:
var filtered_json = find_in_object(JSON.parse('my_JSON_data'), {my_filter_condition});
The result should then display as [object object][object object], that is, an array of objects.
Finally, populate the result to the list.
var lObj = {dialog.object}.getControl('my_list_ID');
lObj.populate(filtered_json);
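The filtering step can be illustrated in Python, assuming it keeps the records whose fields match a condition object. This is an assumption about the behaviour described above, not Alpha Anywhere's API. The `session` and `item` fields and the `filter_records` helper are hypothetical.

```python
import json

# Hypothetical JSON produced from the general list.
raw = (
    '[{"session": "A", "item": "pen"},'
    ' {"session": "B", "item": "ink"},'
    ' {"session": "A", "item": "pad"}]'
)


def filter_records(records, condition):
    """Keep the records whose fields equal every key/value in `condition`."""
    return [
        record
        for record in records
        if all(record.get(key) == value for key, value in condition.items())
    ]


# Parse first, then filter; the result is a list of objects ready to populate a list.
filtered = filter_records(json.loads(raw), {"session": "A"})
print(filtered)
```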

Best way to parse a big and intricate JSON file with OpenRefine (or R)

I know how to parse JSON cells in OpenRefine, but this one is too tricky for me.
I've used an API to extract the calendars of 4730 Airbnb rooms, identified by their IDs.
Here is an example of one Json file : https://fr.airbnb.com/api/v2/calendar_months?key=d306zoyjsyarp7ifhu67rjxn52tv0t20&currency=EUR&locale=fr&listing_id=4212133&month=11&year=2016&count=12&_format=with_conditions
For each ID and each day of the year from now until November 2017, I would like to extract the availability of the room (true or false) and its price on that day.
I can't figure out how to parse out this information. I guess that it involves a series of nested forEach calls, but I can't find the right way to do this with OpenRefine.
I've tried, of course,
forEach(value.parseJson().calendar_months, e, e.days)
The result is an array of arrays of dictionaries, which confuses me.
Any help would be appreciated. If the operation is too difficult in OpenRefine, a solution with R (or Python) would also be fine for me.
Rather than just creating your Project as text, and working with GREL to parse out...
The best way is to select the part of the JSON record that you want to work with using our visual importer wizard for JSON and XML files (you can even use a URL pointing to a JSON file, as in your example). A video tutorial shows how here: https://www.youtube.com/watch?v=vUxdB-nl0Bw
Select the JSON part that contains the records you want to parse and work with (this can be any repeating part; just select one of them and OpenRefine will extract all the rest).
Limit the number of data rows that you want to load during creation, or leave the default of all rows.
Click Create Project and now you're in Rows mode. However, if you think that Records mode might be better suited to the context, just import the project again as JSON and then select the next outer area of the content, perhaps a larger array that contains a key field. In the example, the key field would probably be the date, which is why I highlight the whole record for a given date. This way OpenRefine will have keys for each record, and Records mode lets you work with them better than Rows mode.
Feel free to take this example, make it better and even more helpful for everyone, and add it to our Wiki section on How to Use.
I think you are on the right track. The output of:
forEach(value.parseJson().calendar_months, e, e.days)
is hard to read because OpenRefine and JSON both use square brackets to indicate arrays. What you are getting from this expression is an OR array containing twelve items (one for each month of the year). The items in the OR array are JSON - each one an array of days in the month.
To keep the steps manageable I'd suggest tackling it like this:
First use
forEach(value.parseJson().calendar_months,m,m.days).join("|")
You have to use 'join' because OR can't store OR arrays directly in a cell - it has to be a string.
Then use "Edit Cells->Split multi-valued cells". This gives you 12 rows per ID in OR, each containing a JSON expression.
Then use:
forEach(value.parseJson(),d,d).join("|")
This splits the JSON down into the individual days
Then use "Edit Cells->Split multi-valued cells" again to split the details for each day into its own cell.
Using the JSON from the example URL above, this gives me 441 rows for the single ID, each containing the JSON describing the availability and price for a single day. At this point you can use the 'fill down' function on the ID column to fill in the ID for each of the rows.
You've now got some pretty easy JSON in each cell - so you can extract availability using
value.parseJson().available
etc.
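Since the question also allows a Python solution, the same flattening (months, then days, then one row per day) can be sketched as below. The keys `calendar_months`, `days`, and `available` come from the expressions above. The `date` and `price` keys and the sample values are assumptions for illustration, as the real response layout is not shown here.

```python
import json

# Hypothetical miniature of the API response; only the nesting is taken from the post.
sample = json.dumps({
    "calendar_months": [
        {"days": [
            {"date": "2016-11-01", "available": True, "price": 50},
            {"date": "2016-11-02", "available": False, "price": 55},
        ]},
        {"days": [
            {"date": "2016-12-01", "available": True, "price": 60},
        ]},
    ]
})


def flatten_days(raw, listing_id):
    """Return one row per day, tagged with the listing ID."""
    doc = json.loads(raw)
    rows = []
    for month in doc["calendar_months"]:   # one entry per month
        for day in month["days"]:          # one entry per day in that month
            rows.append({
                "listing_id": listing_id,
                "date": day.get("date"),        # assumed key
                "available": day.get("available"),
                "price": day.get("price"),      # assumed key
            })
    return rows


rows = flatten_days(sample, 4212133)
for row in rows:
    print(row)
```

Running this once per listing ID and concatenating the results would give the same shape as the OpenRefine steps: one row per ID and day.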