Unify JSON files with jq - json

I'm new in the community and I'm not a dev, but I have a task I need to find a solution for. I hope I can get your ideas.
I have a set of JSON files. I want to use jq or another command-line tool to unify the files into one single file.
For example:
File 1 has the following format:
{
"Interactions": [
{
"ID": "ispring://presentations/F7385CB7-DFDC-4D05-90FA-B927DB3D170D/quizzes/",
"Type": "2",
"TimestampUtc": "8/27/2020 12:09:54 PM",
"Timestamp": "8/27/2020 12:09:54 PM",
"Weighting": "",
"Result": "1",
"Latency": "1000",
"Description": "What is the purpose of Summarizing next steps?\n\nSelect the correct box or boxes",
"LearnerResponse": "0_correct_answer[,]1_Rectangle_2",
"ScormActivityId": "12392705",
"InteractionIndex": "3",
"AULMRID": "38093846"
},
]
}
File 2:
{
"Interactions": [
{
"ID": "ispring://presentations/CAA34147-7B48-40C6-84FD-5CE8077DB2BF/quizzes/",
"Type": "2",
"TimestampUtc": "12/8/2020 6:19:12 PM",
"Timestamp": "12/8/2020 6:19:12 PM",
"Weighting": "",
"Result": "1",
"Latency": "1300",
"Description": "'Can't do' language tends to relay this impression...\n\nSelect one.",
"LearnerResponse": "4_All_of_the_above",
"ScormActivityId": "13334358",
"InteractionIndex": "3",
"AULMRID": "40715598"
},
]
}
And this is my expected result in a third file:
{
"Interactions": [
{
"ID": "ispring://presentations/F7385CB7-DFDC-4D05-90FA-B927DB3D170D/quizzes/",
"Type": "2",
"TimestampUtc": "8/27/2020 12:09:54 PM",
"Timestamp": "8/27/2020 12:09:54 PM",
"Weighting": "",
"Result": "1",
"Latency": "1000",
"Description": "What is the purpose of Summarizing next steps?\n\nSelect the correct box or boxes",
"LearnerResponse": "0_correct_answer[,]1_Rectangle_2",
"ScormActivityId": "12392705",
"InteractionIndex": "3",
"AULMRID": "38093846"
},
{
"ID": "ispring://presentations/CAA34147-7B48-40C6-84FD-5CE8077DB2BF/quizzes/",
"Type": "2",
"TimestampUtc": "12/8/2020 6:19:12 PM",
"Timestamp": "12/8/2020 6:19:12 PM",
"Weighting": "",
"Result": "1",
"Latency": "1300",
"Description": "'Can't do' language tends to relay this impression...\n\nSelect one.",
"LearnerResponse": "4_All_of_the_above",
"ScormActivityId": "13334358",
"InteractionIndex": "3",
"AULMRID": "40715598"
},
]
}
Any ideas on how to unify them?
Thank you!
RG

Try something like the following:
jq -n '{ Interactions: [ inputs.Interactions ] | add }' file1.json file2.json
This assumes that you have made both input files valid JSON by removing the trailing comma after the last object in each Interactions array.
For input file1.json:
{
"Interactions": [
{
"ID": "file1",
"Type": "1",
"Timestamp": "8/27/2020 11:11:11 PM"
}
]
}
and input file2.json:
{
"Interactions": [
{
"ID": "file2",
"Type": "2",
"Timestamp": "8/27/2020 22:22:22 PM"
}
]
}
this results in:
{
"Interactions": [
{
"ID": "file1",
"Type": "1",
"Timestamp": "8/27/2020 11:11:11 PM"
},
{
"ID": "file2",
"Type": "2",
"Timestamp": "8/27/2020 22:22:22 PM"
}
]
}
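For anyone who would rather not use jq, the same merge can be sketched in Python. This is only an illustration: the function name merge_interactions is made up here, the two documents are in-memory stand-ins for file1.json and file2.json, and it assumes every input is already valid JSON with an "Interactions" array.

```python
import json

def merge_interactions(documents):
    """Concatenate the "Interactions" arrays of several parsed JSON documents."""
    merged = []
    for doc in documents:
        # A document without the key contributes nothing.
        merged.extend(doc.get("Interactions", []))
    return {"Interactions": merged}

# In-memory stand-ins for file1.json and file2.json (hypothetical sample data).
file1 = json.loads('{"Interactions": [{"ID": "file1", "Type": "1"}]}')
file2 = json.loads('{"Interactions": [{"ID": "file2", "Type": "2"}]}')

result = merge_interactions([file1, file2])
print(json.dumps(result, indent=2))
```

With real files, each document would come from json.load on an opened file, and the result could be written back with json.dump.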

Related

JSON Parsing a Plain Text File

I have a lot of data I need to parse through.
I need to pull all the pids and prices.
`
[
{
"id": 159817,
"price": "10.69",
"stocked": true,
"store": {
"id": 809,
"nsn": "22036-0",
"pricingSource": "manual",
"lastUpdated": "2022-12-05T15:24:33.908Z"
},
"sharedFields": {
"type": "PRODUCT",
"id": 24549,
"pid": "12079",
"labels": [
{
"type": "default",
"value": "Chicken Sandwich",
"locale": "en"
},
{
"type": "fresh",
"value": "Chicken",
"locale": "en"
},
{
"type": "product_json",
"value": "Chicken",
"locale": "en"
}
],
"calMin": 600,
"calMax": 600,
"lastUpdated": "2021-12-31T13:49:22.794Z"
}
},
{
"id": 159818,
"price": "9.29",
"stocked": true,
"store": {
"id": 809,
"nsn": "22036-0",
"pricingSource": "manual",
"lastUpdated": "2022-12-05T15:24:33.908Z"
},
"sharedFields": {
"type": "PRODUCT",
"id": 25,
"pid": "1",
"labels": [
{
"type": "default",
"value": "Ham Sandwich",
"locale": "en"
},
{
"type": "fresh",
"value": "Ham",
"locale": "en"
}
],
"calMin": 540,
"calMax": 540,
"lastUpdated": "2021-07-09T19:30:00.326Z"
}
}
]
`
I need to place them into a string like this, but for about 150 products. I also need to rename "pid" to "productId":
[{ "productId": "46238", "price": 6.09 }, { "productId": "40240", "price": 1.49 }]
I need to add a string before this data, but I'm pretty confident I can figure that part out.
I am pretty open to the easiest suggestion, whether that be VBS, Excel macro, etc.
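One possible approach, assuming Python is acceptable alongside the tools mentioned above, is sketched below. The function name to_product_prices is hypothetical, the sample is a trimmed copy of the data in the question, and the conversion of price from a string to a number is an assumption based on the desired output.

```python
import json

def to_product_prices(items):
    """Build [{"productId": ..., "price": ...}] from the catalogue entries."""
    result = []
    for item in items:
        result.append({
            "productId": item["sharedFields"]["pid"],
            # The source stores price as a string; the desired output shows a number.
            "price": float(item["price"]),
        })
    return result

# Trimmed copy of the question's data, keeping only the fields that are used.
sample = json.loads("""
[
  {"id": 159817, "price": "10.69", "sharedFields": {"pid": "12079"}},
  {"id": 159818, "price": "9.29", "sharedFields": {"pid": "1"}}
]
""")

products = to_product_prices(sample)
print(json.dumps(products))
```

For the full data set, the list would be loaded from the text file with json.load instead of the inline sample.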

Moving a JSON array which have another array elements to the same LEVEL

I have the following JSON, which has an element array_within_array inside another array. I want to pull both the array_within_array elements and the elements of the outer array (event_array) up to the root level using jq.
So here there are 3 events.
meal_selection
login
placed_order
The placed_order event has two sub-events in an array, so after conversion there should be 4 events (1 from meal_selection, 1 from login and 2 from placed_order), all on the same level.
Here is the JSON
{
"region": "USA",
"user_id": "123",
"event_array": [{
"event_attributes": {
"date": "2021-08-17",
"category": "lunch",
"location": "office"
},
"event_name": "meal_selection",
"created_at": "2021-08-13 01:28:57"
},
{
"event_name": "login",
"created_at": "2021-08-13 01:29:02"
},
{
"event_attributes": {
"array_within_array": [
{
"date": "2021-08-17",
"category": "lunch",
"location": "office"
},
{
"date": "2021-08-18",
"category": "dinner",
"location": "home"
}
]
},
"event_name": "placed_order",
"created_at": "2021-08-13 01:28:08"
}
]
}
and I want to convert it to the following:
{
"region": "USA",
"user_id": "123",
"event_attributes": {
"date": "2021-08-17",
"category": "lunch",
"location": "office"
},
"event_name": "meal_selection",
"created_at": "2021-08-13 01:28:57"
}
{
"region": "USA",
"user_id": "123",
"event_name": "login",
"created_at": "2021-08-13 01:29:02"
}
{
"region": "USA",
"user_id": "123",
"event_attributes": {
"date": "2021-08-17",
"category": "lunch",
"location": "office"
},
"event_name": "placed_order",
"created_at": "2021-08-13 01:28:08"
}
{
"region": "USA",
"user_id": "123",
"event_attributes": {
"date": "2021-08-18",
"category": "dinner",
"location": "home"
},
"event_name": "placed_order",
"created_at": "2021-08-13 01:28:08"
}
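Here's a straightforward solution that keeps things simple by using two steps: first merge region and user_id into each event, then expand array_within_array where it exists. The test reads .event_attributes.array_within_array directly rather than calling has(...), because the login event has no event_attributes and has raises an error on null:
{ region, user_id } + (.event_array[])
| if .event_attributes.array_within_array
  then .event_attributes.array_within_array as $a
  | .event_attributes = $a[]
  else .
  end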
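The same two steps can be sketched in Python for comparison. This is an illustrative version only: flatten_events is a made-up name, and the sample document is a shortened form of the JSON in the question.

```python
def flatten_events(doc):
    """Yield one flat record per event, expanding array_within_array."""
    base = {"region": doc["region"], "user_id": doc["user_id"]}
    for event in doc["event_array"]:
        attrs = event.get("event_attributes")
        nested = attrs.get("array_within_array") if isinstance(attrs, dict) else None
        if nested:
            # One output record per element of the nested array.
            for item in nested:
                yield {**base, **event, "event_attributes": item}
        else:
            # Events without the nested array (or without attributes) pass through.
            yield {**base, **event}

# Shortened version of the question's document (hypothetical sample data).
doc = {
    "region": "USA",
    "user_id": "123",
    "event_array": [
        {"event_attributes": {"category": "lunch"}, "event_name": "meal_selection"},
        {"event_name": "login"},
        {"event_attributes": {"array_within_array": [
            {"category": "lunch"}, {"category": "dinner"}]},
         "event_name": "placed_order"},
    ],
}

events = list(flatten_events(doc))
for e in events:
    print(e)
```

The login event has no event_attributes at all, which is why the sketch checks the type before looking for the nested array.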

Nested json - store values in csv

I am trying to convert a nested JSON file into CSV. It's data from a darts API and the structure is always the same. However, I am having problems flattening the data and storing the values in a CSV file because of the nested structure.
json:
{
"summaries": [{
"sport_event": {
"id": "sr:sport_event:12967512",
"start_time": "2017-11-11T13:15:00+00:00",
"start_time_confirmed": true,
"sport_event_context": {
"sport": {
"id": "sr:sport:22",
"name": "Darts"
},
"category": {
"id": "sr:category:104",
"name": "International"
},
"competition": {
"id": "sr:competition:597",
"name": "Grand Slam of Darts"
},
"season": {
"id": "sr:season:47332",
"name": "Grand Slam of Darts 2017",
"start_date": "2017-11-11",
"end_date": "2017-11-20",
"year": "2017",
"competition_id": "sr:competition:597"
},
"stage": {
"order": 1,
"type": "league",
"phase": "stage_1",
"start_date": "2017-11-11",
"end_date": "2017-11-15",
"year": "2017"
},
"round": {
"number": 1
},
"groups": [{
"id": "sr:league:29766",
"name": "Grand Slam of Darts 2017, Group G",
"group_name": "G"
}]
},
"coverage": {
"live": true
},
"competitors": [{
"id": "sr:competitor:35936",
"name": "Smith, Michael",
"abbreviation": "SMI",
"qualifier": "home"
}, {
"id": "sr:competitor:83895",
"name": "Wilson, James",
"abbreviation": "WIL",
"qualifier": "away"
}]
},
"sport_event_status": {
"status": "closed",
"match_status": "ended",
"home_score": 5,
"away_score": 3,
"winner_id": "sr:competitor:35936"
}
}, {
"sport_event": {
"id": "sr:sport_event:12967508",
"start_time": "2017-11-11T13:40:00+00:00",
"start_time_confirmed": true,
"sport_event_context": {
"sport": {
"id": "sr:sport:22",
"name": "Darts"
},
"category": {
"id": "sr:category:104",
"name": "International"
},
"competition": {
"id": "sr:competition:597",
"name": "Grand Slam of Darts"
},
"season": {
"id": "sr:season:47332",
"name": "Grand Slam of Darts 2017",
"start_date": "2017-11-11",
"end_date": "2017-11-20",
"year": "2017",
"competition_id": "sr:competition:597"
},
"stage": {
"order": 1,
"type": "league",
"phase": "stage_1",
"start_date": "2017-11-11",
"end_date": "2017-11-15",
"year": "2017"
},
"round": {
"number": 1
},
"groups": [{
"id": "sr:league:29764",
"name": "Grand Slam of Darts 2017, Group F",
"group_name": "F"
}]
},
"coverage": {
"live": true
},
"competitors": [{
"id": "sr:competitor:70916",
"name": "Bunting, Stephen",
"abbreviation": "BUN",
"qualifier": "home"
}, {
"id": "sr:competitor:191262",
"name": "de Zwaan, Jeffrey",
"abbreviation": "DEZ",
"qualifier": "away"
}]
},
"sport_event_status": {
"status": "closed",
"match_status": "ended",
"home_score": 5,
"away_score": 4,
"winner_id": "sr:competitor:70916"
}
}
So for each sport_event I would like to store the variables:
"start_time"
from "season" the variable "name"
from "competitors" both "id" and "name"
from "sport_event_status" the "winner_id"
I have already tried to flatten the json file with this code:
import json

with open(r'path of file.json') as f:
    data = json.load(f)

def flatten(data):
    # Print every key/value pair, recursing into nested dicts and lists.
    for key, value in data.items():
        print(str(key) + '->' + str(value))
        if isinstance(value, dict):
            flatten(value)
        elif isinstance(value, list):
            for val in value:
                if isinstance(val, (str, list)):
                    pass
                else:
                    flatten(val)

flatten(data)
print(data)
This actually prints out the following:
id->sr:season:47332
name->Grand Slam of Darts 2017
start_date->2017-11-11
end_date->2017-11-20
year->2017
competition_id->sr:competition:597
Now my question is how to store the values I mentioned above in a csv file.
Thanks in advance for your support.
Using jq, you basically just have to transcribe your specification, adding a bit of context and taking care of an embedded array. Note that sport_event_status is a sibling of sport_event rather than a child, so winner_id has to be read before descending into sport_event:
.summaries[]
| .sport_event_status.winner_id as $winner # from "sport_event_status" the "winner_id"
| .sport_event # Your specification:
| [.start_time, # start_time
   .sport_event_context.season.name] # from "season" the variable "name"
  + [.competitors[] | .id, .name] # from "competitors" both "id" and "name"
  + [$winner]
| @csv
Invocation
E.g.
jq -rf program.jq my.json
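Since the question started from Python, here is a rough Python sketch of the same extraction using the standard csv module. The name summary_rows is invented for this example, the sample holds only the fields that are read, and an in-memory buffer stands in for the output file.

```python
import csv
import io

def summary_rows(data):
    """Yield one CSV row per entry in data["summaries"]."""
    for summary in data["summaries"]:
        event = summary["sport_event"]
        row = [event["start_time"], event["sport_event_context"]["season"]["name"]]
        for competitor in event["competitors"]:
            row += [competitor["id"], competitor["name"]]
        # winner_id sits next to sport_event, not inside it.
        row.append(summary["sport_event_status"].get("winner_id", ""))
        yield row

# Reduced copy of the first summary from the question (hypothetical sample data).
sample = {"summaries": [{
    "sport_event": {
        "start_time": "2017-11-11T13:15:00+00:00",
        "sport_event_context": {"season": {"name": "Grand Slam of Darts 2017"}},
        "competitors": [
            {"id": "sr:competitor:35936", "name": "Smith, Michael"},
            {"id": "sr:competitor:83895", "name": "Wilson, James"},
        ],
    },
    "sport_event_status": {"winner_id": "sr:competitor:35936"},
}]}

buffer = io.StringIO()  # stands in for open("out.csv", "w", newline="")
csv.writer(buffer).writerows(summary_rows(sample))
csv_text = buffer.getvalue()
print(csv_text)
```

The csv writer quotes names such as "Smith, Michael" automatically, so the embedded comma does not split the field.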

Find a record in json Object if the record has specific key in python

I have a JSON object which has 100000 records. I want to select the records that have a specific value for one of the keys.
Eg:
[{
"name": "bindu",
"age": "24",
"qualification": "b.tech"
},
{
"name": "naveen",
"age": "23",
"qualification": "b.tech"
},
{
"name": "parvathi",
"age": "23",
"qualification": "m.tech"
},
{
"name": "bindu s",
"status": "married"
},
{
"name": "naveen k",
"status": "unmarried"
}]
Now I want to combine the records whose names are 'bindu' and 'bindu s'. We can achieve this by iterating over the JSON object, but because the data set is large, that takes a long time. Is there any way to make this easier?
I want the output like
[{
"name": "bindu",
"age": "24",
"qualification": "b.tech",
"status": "married"
},
{
"name": "naveen",
"age": "23",
"qualification": "b.tech",
"status": "unmarried"
},
{
"name": "parvathi",
"age": "23",
"qualification": "m.tech",
"status": ""
}]
This will rename and merge your objects by first name.
jq 'map(.name |= split(" ")[0]) | group_by(.name) | map(add)'
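Because the question asks about Python, a single-pass sketch of the same idea is shown below. The name merge_by_first_name is hypothetical, and the grouping rule (first word of "name") is an assumption carried over from the jq filter; it touches each record once, so it should scale to 100000 records without nested loops.

```python
def merge_by_first_name(records):
    """Merge records whose names share the same first word."""
    merged = {}  # dicts keep insertion order, so first-seen names come first
    for record in records:
        first_name = record["name"].split(" ")[0]
        target = merged.setdefault(first_name, {})
        target.update(record)
        # Keep the shortened name rather than e.g. "bindu s".
        target["name"] = first_name
    return list(merged.values())

records = [
    {"name": "bindu", "age": "24", "qualification": "b.tech"},
    {"name": "naveen", "age": "23", "qualification": "b.tech"},
    {"name": "parvathi", "age": "23", "qualification": "m.tech"},
    {"name": "bindu s", "status": "married"},
    {"name": "naveen k", "status": "unmarried"},
]

people = merge_by_first_name(records)
for person in people:
    print(person)
```

Records with no partner, such as parvathi, come out without a "status" key; calling person.setdefault("status", "") on each result would add the empty value shown in the desired output.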

Convert nested json to csv to sheets json api

I want to convert my JSON to CSV so that I can upload it to Google Sheets and use it as a JSON API. Whenever the data changes, I will just change it in Google Sheets. But I'm having problems converting my JSON file to CSV, because the variables change whenever I convert it. I'm using https://toolslick.com/csv-to-json-converter to convert my JSON file to CSV.
What is the best way to convert nested JSON to CSV?
JSON
{
"options": [
{
"id": "1",
"value": "Jumbo",
"shortcut": "J",
"textColor": "#FFFFFF",
"backgroundColor": "#00000"
},
{
"id": "2",
"value": "Hot",
"shortcut": "D",
"textColor": "#FFFFFF",
"backgroundColor": "#FFFFFF"
}
],
"categories": [
{
"id": "1",
"order": 1,
"name": "First Category",
"active": true
},
{
"id": "2",
"order": 2,
"name": "Second Category",
"shortcut": "MT",
"active": true
}
],
"products": [
{
"id": "03c6787c-fc2a-4aa8-93a3-5e0f0f98cfb2",
"categoryId": "1",
"name": "First Product",
"shortcut": "First",
"options": [
{
"optionId": "1",
"price": 23
},
{
"optionId": "2",
"price": 45
}
],
"active": true
},
{
"id": "e8669cea-4c9c-431c-84ba-0b014f0f9bc2",
"categoryId": "2",
"name": "Second Product",
"shortcut": "Second",
"options": [
{
"optionId": "1",
"price": 11
},
{
"optionId": "2",
"price": 20
}
],
"active": true
}
],
"discounts": [
{
"id": "1",
"name": "S",
"type": 1,
"amount": 20,
"active": true
},
{
"id": "2",
"name": "P",
"type": 1,
"amount": 20,
"active": true
},
{
"id": "3",
"name": "G",
"type": 2,
"amount": 5,
"active": true
}
]
}
With Python this is fairly easy. Because the file holds a single JSON document rather than one object per line, load it whole with json.load, then write one of the top-level lists (for example "options") with csv.DictWriter. Maybe this code will help you get started:
import csv
import json

# The file is one JSON document, so parse it in a single call.
with open('your_json_file_here.json') as file:
    data = json.load(file)

# Pick one top-level list and the columns to export.
rows = data['options']
fieldnames = ['id', 'value']

# newline='' avoids blank lines between rows on Windows.
with open('create_new_file.csv', 'w', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction='ignore')
    writer.writeheader()
    writer.writerows(rows)
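The "products" list is the genuinely nested part, since each product carries its own "options" array. One way to handle it, sketched here under the assumption that one CSV row per product and option pair is wanted, is to flatten each pair into a plain dict first. The function name product_option_rows and the column names are illustrative, and an in-memory buffer stands in for the output file.

```python
import csv
import io

def product_option_rows(data):
    """Yield one flat dict per (product, option) pair."""
    for product in data["products"]:
        for option in product.get("options", []):
            yield {
                "productId": product["id"],
                "categoryId": product["categoryId"],
                "name": product["name"],
                "optionId": option["optionId"],
                "price": option["price"],
            }

# First product from the question's JSON (reduced to the fields that are used).
sample = {"products": [{
    "id": "03c6787c-fc2a-4aa8-93a3-5e0f0f98cfb2",
    "categoryId": "1",
    "name": "First Product",
    "options": [{"optionId": "1", "price": 23}, {"optionId": "2", "price": 45}],
}]}

flat_rows = list(product_option_rows(sample))
buffer = io.StringIO()  # stands in for a real file opened with newline=""
writer = csv.DictWriter(buffer, fieldnames=list(flat_rows[0]))
writer.writeheader()
writer.writerows(flat_rows)
print(buffer.getvalue())
```

Each top-level list ("options", "categories", "products", "discounts") would probably go into its own sheet, since their columns differ.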