Modify array elements while grouping by a specific key using jq - json

Is it possible to modify/replace array elements while grouping by a specific key (.[].Parameter.Id) such that this array:
[{
"Id": 48,
"Parameter": {
"Id": 17
}
}, {
"Id": 196,
"Parameter": {
"Id": 17
}
}]
becomes this:
[
{
"p17": [48, 196]
}
]
Here is the source JSON file for a complete example:
[{
"Id": 78,
"PromotionType": 2,
"Amount": "100",
"UpperLimit": null,
"Variables": [{
"Id": 100,
"Parameter": {
"Id": 30
}
}]
}, {
"Id": 84,
"PromotionType": 2,
"Amount": null,
"UpperLimit": null,
"Variables": [{
"Id": 48,
"Parameter": {
"Id": 17
}
}, {
"Id": 196,
"Parameter": {
"Id": 17
}
}, {
"Id": 59,
"Parameter": {
"Id": 21
}
}, {
"Id": 60,
"Parameter": {
"Id": 21
}
}, {
"Id": 62,
"Parameter": {
"Id": 21
}
}]
}, {
"Id": 59,
"PromotionType": 2,
"Amount": "666.6",
"UpperLimit": null,
"Variables": [{
"Id": 96,
"Parameter": {
"Id": 8
}
}, {
"Id": 47,
"Parameter": {
"Id": 17
}
}]
}]
What I want to achieve is this:
[{
"Id": 78,
"PromotionType": 2,
"Amount": "100",
"UpperLimit": null,
"Variables": [{
"p30": [100]
}]
}, {
"Id": 84,
"PromotionType": 2,
"Amount": null,
"UpperLimit": null,
"Variables": [{
"p17": [48, 196]
}, {
"p21": [59, 60, 62]
}]
}, {
"Id": 59,
"PromotionType": 2,
"Amount": "666.6",
"UpperLimit": null,
"Variables": [{
"p8": [96]
}, {
"p17": [47]
}]
}]
I have been reading through the jq manual and the jq cookbook and found some functions (e.g. with_entries, unique_by, inputs) that might help, but I could not figure out how to make them work.
The number of objects and inner objects is also not fixed, so I cannot simply replace elements using array indexes.
Any help would be appreciated.
Thanks,
Emre

jq solution:
jq 'map(.Variables
      |= (group_by(.Parameter.Id)
          | map(("p" + (.[0].Parameter.Id | tostring)) as $pid
                | { ($pid) : map(.Id) }
            )
         )
   )' input.json
The output:
[
{
"Id": 78,
"PromotionType": 2,
"Amount": "100",
"UpperLimit": null,
"Variables": [
{
"p30": [
100
]
}
]
},
{
"Id": 84,
"PromotionType": 2,
"Amount": null,
"UpperLimit": null,
"Variables": [
{
"p17": [
48,
196
]
},
{
"p21": [
59,
60,
62
]
}
]
},
{
"Id": 59,
"PromotionType": 2,
"Amount": "666.6",
"UpperLimit": null,
"Variables": [
{
"p8": [
96
]
},
{
"p17": [
47
]
}
]
}
]
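For readers who want to check the grouping logic outside jq, here is a minimal Python sketch of the same idea. It is not part of the original answer; the function name group_variables and the trimmed sample are made up for illustration. Like jq's group_by, it sorts by the grouping key before grouping.

```python
import json
from itertools import groupby


def group_variables(promotions):
    """Replace each promotion's Variables with one {"p<ParameterId>": [ids]} object per group."""
    result = []
    for promo in promotions:
        # jq's group_by sorts by the grouping key, so sort first to mirror that.
        ordered = sorted(promo["Variables"], key=lambda v: v["Parameter"]["Id"])
        grouped = [
            {f"p{param_id}": [v["Id"] for v in members]}
            for param_id, members in groupby(ordered, key=lambda v: v["Parameter"]["Id"])
        ]
        # Copy the other fields unchanged and swap in the grouped Variables.
        result.append({**promo, "Variables": grouped})
    return result


# Trimmed, hypothetical sample based on the question's second object.
sample = [
    {"Id": 84, "Amount": None, "Variables": [
        {"Id": 48, "Parameter": {"Id": 17}},
        {"Id": 196, "Parameter": {"Id": 17}},
        {"Id": 59, "Parameter": {"Id": 21}},
    ]},
]
print(json.dumps(group_variables(sample)))
```

Because the sort is stable, the Ids inside each group keep their input order.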

Related

Translate JSON file's specific field

I would like to translate the field "text" in the flight domain of the Taskmaster-2 dataset, which is a deeply nested JSON file. How can I do this using Google Cloud Translate?
Example (from English to Bangla):
Origin JSON file:
[ {
"conversation_id": "dlg-00100680-00e0-40fe-8321-6d81b21bfc4f",
"instruction_id": "flight-12",
"utterances": [
{
"index": 0,
"speaker": "USER",
"text": "Hello. I'd like to find a round trip commercial airline flight from San Francisco to Denver.",
"segments": [
{
"start_index": 26,
"end_index": 36,
"text": "round trip",
"annotations": [
{
"name": "flight_search.type"
}
]
},
Output JSON file:
[ {
"conversation_id": "dlg-00100680-00e0-40fe-8321-6d81b21bfc4f",
"instruction_id": "flight-12",
"utterances": [
{
"index": 0,
"speaker": "USER",
"text": "হ্যালো. আমি সান ফ্রান্সিসকো থেকে ডেনভার পর্যন্ত একটি রাউন্ড ট্রিপ বাণিজ্যিক এয়ারলাইন ফ্লাইট খুঁজতে চাই।",
"segments": [
{
"start_index": 26,
"end_index": 36,
"text": "রাউন্ড ট্রিপ",
"annotations": [
{
"name": "flight_search.type"
}
]
},
I extracted a few lines of data from flights.json and used the Python code below, which calls the Google Cloud Translation API, to translate English to Japanese. See also the API's list of supported languages.
test.json:
[
{
"conversation_id": "dlg-00100680-00e0-40fe-8321-6d81b21bfc4f",
"instruction_id": "flight-12",
"utterances": [
{
"index": 0,
"speaker": "USER",
"text": "Hello. I'd like to find a round trip commercial airline flight from San Francisco to Denver.",
"segments": [
{
"start_index": 26,
"end_index": 36,
"text": "round trip",
"annotations": [
{
"name": "flight_search.type"
}
]
},
{
"start_index": 68,
"end_index": 81,
"text": "San Francisco",
"annotations": [
{
"name": "flight_search.origin"
}
]
},
{
"start_index": 85,
"end_index": 91,
"text": "Denver",
"annotations": [
{
"name": "flight_search.destination1"
}
]
}
]
},
{
"index": 1,
"speaker": "ASSISTANT",
"text": "Hello, how can I help you?"
},
{
"index": 2,
"speaker": "ASSISTANT",
"text": "San Francisco to Denver, got it.",
"segments": [
{
"start_index": 0,
"end_index": 13,
"text": "San Francisco",
"annotations": [
{
"name": "flight_search.origin"
}
]
},
{
"start_index": 17,
"end_index": 23,
"text": "Denver",
"annotations": [
{
"name": "flight_search.destination1"
}
]
}
]
}
]
}
]
Code:
import json
from google.cloud import translate_v2 as translate

target = "ja"
translate_client = translate.Client()

# Load the extracted conversations.
with open('test.json', encoding='utf-8') as f:
    data = json.load(f)

for conv in data:
    for utt in conv["utterances"]:
        # Translate the utterance itself.
        utt["text"] = translate_client.translate(utt["text"], target_language=target)["translatedText"]
        # Segments are optional; translate their text when present.
        if "segments" in utt:
            for seg in utt["segments"]:
                seg["text"] = translate_client.translate(seg["text"], target_language=target)["translatedText"]

# print(data)  # prints a dictionary
# ensure_ascii=False keeps the translated characters readable instead of \u escapes.
json_string = json.dumps(data, indent=2, ensure_ascii=False)
print(json_string)  # prints a JSON string
Output:
[
{
"conversation_id": "dlg-00100680-00e0-40fe-8321-6d81b21bfc4f",
"instruction_id": "flight-12",
"utterances": [
{
"index": 0,
"speaker": "USER",
"text": "こんにちは。サンフランシスコからデンバーまでの民間航空会社の往復便を探したいのですが。",
"segments": [
{
"start_index": 26,
"end_index": 36,
"text": "往復",
"annotations": [
{
"name": "flight_search.type"
}
]
},
{
"start_index": 68,
"end_index": 81,
"text": "サンフランシスコ",
"annotations": [
{
"name": "flight_search.origin"
}
]
},
{
"start_index": 85,
"end_index": 91,
"text": "デンバー",
"annotations": [
{
"name": "flight_search.destination1"
}
]
}
]
},
{ "index": 1,
"speaker": "ASSISTANT",
"text": "こんにちは、どうすればいいですか?"
},
{
"index": 2,
"speaker": "ASSISTANT",
"text": "サンフランシスコからデンバーへ、了解。",
"segments": [
{
"start_index": 0,
"end_index": 13,
"text": "サンフランシスコ",
"annotations": [
{
"name": "flight_search.origin"
}
]
},
{
"start_index": 17,
"end_index": 23,
"text": "デンバー",
"annotations": [
{
"name": "flight_search.destination1"
}
]
}
]
}
]
}
]
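The loop in the code above handles exactly two levels (utterances and segments). If "text" fields can appear at other depths, a recursive walk is one option. The following is only a sketch: translate_text_fields is a hypothetical helper, and the translator is passed in as a plain function so the example runs without calling any API (str.upper stands in for the real translation call).

```python
def translate_text_fields(node, translate_fn):
    """Recursively replace every string stored under a "text" key, in place."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "text" and isinstance(value, str):
                # Overwriting an existing key while iterating is safe (size is unchanged).
                node[key] = translate_fn(value)
            else:
                translate_text_fields(value, translate_fn)
    elif isinstance(node, list):
        for item in node:
            translate_text_fields(item, translate_fn)
    return node


# Hypothetical miniature of the dataset structure.
conversation = [{
    "utterances": [{
        "index": 0,
        "text": "round trip to Denver",
        "segments": [{"start_index": 0, "text": "round trip"}],
    }]
}]

# str.upper is a stand-in translator used only for this demonstration.
translate_text_fields(conversation, str.upper)
print(conversation[0]["utterances"][0]["text"])
```

With the real client, the stand-in could be replaced by a small function that wraps the translate call used in the code above.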

find on id and append value to json parameter

I have the following data frame, df1:
A B C
123 B1 C1
456 B2 C2
And data frame df2:
A
[
{
"id": "123",
"details": {
"id": "123",
"color": null,
"param_1": {
"name": "mike"
},
"location": "US",
"items": [
{
"item_1": "#227858",
"offer_id": null,
"item_details": {
"detials_1": [{ "notes": "other:", "quantity": 1 }]
}
}
],
"version": 1,
}
}
]
[
{
"id": "456",
"details": {
"id": "456",
"color": null,
"param_1": {
"name": "james"
},
"location": "KR",
"items": [
{
"item_1": "#2221",
"offer_id": null,
"item_details": {
"detials_1": [{ "notes": "other", "quantity": 1 }]
}
}
],
"version": 2,
}
}
]
For each value in df1[A], I want to find the matching JSON in df2[A], matching on the first (top-level) id parameter. Once found, I want to replace the null value of the color parameter with df1[B] and the null value of offer_id with df1[C].
The output should be a new column containing the updated JSON:
df2[B]:
[
{
"id": "123",
"details": {
"id": "123",
"color": B1,
"param_1": {
"name": "mike"
},
"location": "US",
"items": [
{
"item_1": "#227858",
"offer_id": C1,
"item_details": {
"detials_1": [{ "notes": "other:", "quantity": 1 }]
}
}
],
"version": 1,
}
}
]
[
{
"id": "456",
"details": {
"id": "456",
"color": B2,
"param_1": {
"name": "james"
},
"location": "KR",
"items": [
{
"item_1": "#2221",
"offer_id": C2,
"item_details": {
"detials_1": [{ "notes": "other", "quantity": 1 }]
}
}
],
"version": 2,
}
}
]
I just started researching how to approach this, but I need guidance on the most efficient way. Any insight would be greatly appreciated.
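One possible starting point, sketched with the standard library only. Several things here are assumptions: each df2[A] cell is taken to be a valid JSON string (the samples as posted have trailing commas, which json.loads would reject), the ids are compared as strings, and fill_nulls and lookup are hypothetical names. The lookup dictionary stands in for df1 keyed by column A.

```python
import json


def fill_nulls(cell_json, lookup):
    """Fill null color / offer_id values using the row matched on the top-level id."""
    records = json.loads(cell_json)
    for record in records:
        match = lookup.get(record["id"])
        if match is None:
            continue  # no matching row for this id; leave the record unchanged
        details = record["details"]
        if details.get("color") is None:
            details["color"] = match["B"]
        for item in details.get("items", []):
            if item.get("offer_id") is None:
                item["offer_id"] = match["C"]
    return json.dumps(records)


# Hypothetical stand-in for df1, keyed by column A as strings.
lookup = {"123": {"B": "B1", "C": "C1"}, "456": {"B": "B2", "C": "C2"}}

# Trimmed, valid-JSON version of the first sample cell.
cell = ('[{"id": "123", "details": {"id": "123", "color": null, '
        '"items": [{"item_1": "#227858", "offer_id": null}]}}]')

filled = json.loads(fill_nulls(cell, lookup))
print(filled[0]["details"]["color"], filled[0]["details"]["items"][0]["offer_id"])
```

In pandas, the same function could then be applied to each cell of the column to build the new column, with the lookup built once from df1 so that each cell is a dictionary access rather than a scan.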

Cannot get jq to query json object [duplicate]

I have a JSON file that I am trying to query with jq, but I am unable to retrieve the observations. I am trying to retrieve each of the "observations" using the following command, without success:
cat sample3.json | jq .dataSets[0].series.0:0:0:0:0.observations.0[0]
I am able to retrieve everything up to the series using:
cat sample3.json | jq .dataSets[0].series
But once I try to drill down further I am getting a compile error:
$ cat sample3.json | jq .dataSets[0].series.0:0:0:0:0
jq: error: syntax error, unexpected LITERAL, expecting end of file (Unix shell quoting issues?) at <top-level>, line 1:
.dataSets[0].series.0:0:0:0:0
jq: 1 compile error
I am not sure what I am doing wrong here....
The input file is:
{
"header": {
"id": "b8be2cd5-33bf-4687-9e81-eb032f6f8a71",
"test": false,
"prepared": "2022-09-01T13:30:57.013+02:00",
"sender": {
"id": "ECB"
}
},
"dataSets": [
{
"action": "Replace",
"validFrom": "2022-09-01T13:30:57.013+02:00",
"series": {
"0:0:0:0:0": {
"attributes": [
0,
null,
0,
null,
null,
null,
null,
null,
null,
null,
null,
null,
0,
null,
0,
null,
0,
0,
0,
0
],
"observations": {
"0": [
1.4529,
0,
0,
null,
null
],
"1": [
1.4472,
0,
0,
null,
null
],
"2": [
1.4591,
0,
0,
null,
null
]
}
}
}
}
],
"structure": {
"links": [
{
"title": "Exchange Rates",
"rel": "dataflow",
"href": "https://sdw-wsrest.ecb.europa.eu:443/service/dataflow/ECB/EXR/1.0"
}
],
"name": "Exchange Rates",
"dimensions": {
"series": [
{
"id": "FREQ",
"name": "Frequency",
"values": [
{
"id": "D",
"name": "Daily"
}
]
},
{
"id": "CURRENCY",
"name": "Currency",
"values": [
{
"id": "AUD",
"name": "Australian dollar"
}
]
},
{
"id": "CURRENCY_DENOM",
"name": "Currency denominator",
"values": [
{
"id": "EUR",
"name": "Euro"
}
]
},
{
"id": "EXR_TYPE",
"name": "Exchange rate type",
"values": [
{
"id": "SP00",
"name": "Spot"
}
]
},
{
"id": "EXR_SUFFIX",
"name": "Series variation - EXR context",
"values": [
{
"id": "A",
"name": "Average"
}
]
}
],
"observation": [
{
"id": "TIME_PERIOD",
"name": "Time period or range",
"role": "time",
"values": [
{
"id": "2022-08-29",
"name": "2022-08-29",
"start": "2022-08-29T00:00:00.000+02:00",
"end": "2022-08-29T23:59:59.999+02:00"
},
{
"id": "2022-08-30",
"name": "2022-08-30",
"start": "2022-08-30T00:00:00.000+02:00",
"end": "2022-08-30T23:59:59.999+02:00"
},
{
"id": "2022-08-31",
"name": "2022-08-31",
"start": "2022-08-31T00:00:00.000+02:00",
"end": "2022-08-31T23:59:59.999+02:00"
}
]
}
]
},
"attributes": {
"series": [
{
"id": "TIME_FORMAT",
"name": "Time format code",
"values": [
{
"name": "P1D"
}
]
},
{
"id": "BREAKS",
"name": "Breaks",
"values": []
},
{
"id": "COLLECTION",
"name": "Collection indicator",
"values": [
{
"id": "A",
"name": "Average of observations through period"
}
]
},
{
"id": "COMPILING_ORG",
"name": "Compiling organisation",
"values": []
},
{
"id": "DISS_ORG",
"name": "Data dissemination organisation",
"values": []
},
{
"id": "DOM_SER_IDS",
"name": "Domestic series ids",
"values": []
},
{
"id": "PUBL_ECB",
"name": "Source publication (ECB only)",
"values": []
},
{
"id": "PUBL_MU",
"name": "Source publication (Euro area only)",
"values": []
},
{
"id": "PUBL_PUBLIC",
"name": "Source publication (public)",
"values": []
},
{
"id": "UNIT_INDEX_BASE",
"name": "Unit index base",
"values": []
},
{
"id": "COMPILATION",
"name": "Compilation",
"values": []
},
{
"id": "COVERAGE",
"name": "Coverage",
"values": []
},
{
"id": "DECIMALS",
"name": "Decimals",
"values": [
{
"id": "4",
"name": "Four"
}
]
},
{
"id": "NAT_TITLE",
"name": "National language title",
"values": []
},
{
"id": "SOURCE_AGENCY",
"name": "Source agency",
"values": [
{
"id": "4F0",
"name": "European Central Bank (ECB)"
}
]
},
{
"id": "SOURCE_PUB",
"name": "Publication source",
"values": []
},
{
"id": "TITLE",
"name": "Title",
"values": [
{
"name": "Australian dollar/Euro"
}
]
},
{
"id": "TITLE_COMPL",
"name": "Title complement",
"values": [
{
"name": "ECB reference exchange rate, Australian dollar/Euro, 2:15 pm (C.E.T.)"
}
]
},
{
"id": "UNIT",
"name": "Unit",
"values": [
{
"id": "AUD",
"name": "Australian dollar"
}
]
},
{
"id": "UNIT_MULT",
"name": "Unit multiplier",
"values": [
{
"id": "0",
"name": "Units"
}
]
}
],
"observation": [
{
"id": "OBS_STATUS",
"name": "Observation status",
"values": [
{
"id": "A",
"name": "Normal value"
}
]
},
{
"id": "OBS_CONF",
"name": "Observation confidentiality",
"values": [
{
"id": "F",
"name": "Free"
}
]
},
{
"id": "OBS_PRE_BREAK",
"name": "Pre-break observation value",
"values": []
},
{
"id": "OBS_COM",
"name": "Observation comment",
"values": []
}
]
}
}
}
The .foo syntax cannot be used if the key name contains anything other than alphanumeric characters or the underscore, or if its first character is a digit.
In a sufficiently recent version of jq you can instead write ."foo", which is an abbreviation of the basic form .["foo"].
So your query could begin with:
.dataSets[0].series."0:0:0:0:0"
The same rule applies to the keys under observations, which start with a digit, so they also need quoting, e.g. .observations."0".
If you pass the jq query on a command line, you may have to protect the double quotes from the shell, e.g. in bash by enclosing the whole query in single quotes.
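As a cross-check outside jq, the following Python sketch walks the same path on a trimmed copy of the input. It is only an illustration (the two-observation document is cut down from the question's file), but it shows that "0:0:0:0:0" and "0" are ordinary string keys, which is why jq needs the quoted or bracketed form for them.

```python
import json

# Trimmed copy of the question's input, keeping only the path being queried.
doc = json.loads("""
{"dataSets": [{"series": {"0:0:0:0:0": {"observations": {
  "0": [1.4529, 0, 0, null, null],
  "1": [1.4472, 0, 0, null, null]
}}}}]}
""")

# Equivalent of .dataSets[0].series["0:0:0:0:0"] in jq's bracket form.
series = doc["dataSets"][0]["series"]["0:0:0:0:0"]

# First element of each observation array, keyed by the (string) period index.
first_values = {period: obs[0] for period, obs in series["observations"].items()}
print(first_values)
```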

merge two lists within a object conditionally

Hello jq experts!
I'm a jq learner and have a JSON object composed of lists as follows:
{
"image_files": [
{
"id": "img_0001",
"width": 32,
"heigt": 32,
"file_name": "img_0001.png"
},
{
"id": "img_0002",
"width": 128,
"heigt": 32,
"file_name": "img_0002.png"
},
{
"id": "img_0003",
"width": 32,
"heigt": 32,
"file_name": "img_0003.png"
},
{
"id": "img_0004",
"width": 160,
"heigt": 32,
"file_name": "img_0004.png"
}
],
"annotations": [
{
"id": "ann_0001",
"image_id": "img_0001",
"label": "A",
"attributes": {
"type": "letter",
"augmented": false
}
},
{
"id": "ann_0002",
"image_id": "img_0002",
"label": "Good",
"attributes": {
"type": "word",
"augmented": false
}
},
{
"id": "ann_0003",
"image_id": "img_0003",
"label": "C",
"attributes": {
"type": "letter",
"augmented": false
}
},
{
"id": "ann_0004",
"image_id": "img_0004",
"label": "Hello",
"attributes": {
"type": "word",
"augmented": false
}
}
]
}
image_id in the annotations list is a foreign key referencing the id in the image_files list.
I want to join image_files and annotations on the condition annotations.attributes.type == "letter".
I expect the following output:
{
"letter_image_files_with_label": [
{
"id": "img_0001",
"width": 32,
"heigt": 32,
"file_name": "img_0001.png",
"label": "A"
},
{
"id": "img_0003",
"width": 32,
"heigt": 32,
"file_name": "img_0003.png",
"label": "C"
}
]
}
How can I produce the above result from the JSON input?
The join described in the jq manual does not seem suited to this kind of task.
Is there a way to do this? Please show me the ropes.
Thanks for taking the time to read this.
Indexing image_files with ids makes this pretty trivial.
INDEX(.image_files[]; .id) as $imgs | [
.annotations[]
| select(.attributes.type == "letter")
| $imgs[.image_id] + {label: .label}
]
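The same index-then-join idea can be sketched in Python for comparison. This is an illustration rather than part of the answer; letter_images_with_labels is a made-up name and the sample is trimmed. Note that the jq filter above yields the bare array, whereas this sketch also wraps it under the letter_image_files_with_label key requested in the question.

```python
def letter_images_with_labels(doc):
    """Join annotations of type "letter" to their image records via image_id."""
    # Index image records by id, like INDEX(.image_files[]; .id) in jq.
    images_by_id = {img["id"]: img for img in doc["image_files"]}
    joined = [
        # Merge the image record with the annotation's label, like $imgs[.image_id] + {label}.
        {**images_by_id[ann["image_id"]], "label": ann["label"]}
        for ann in doc["annotations"]
        if ann["attributes"]["type"] == "letter"
    ]
    return {"letter_image_files_with_label": joined}


# Trimmed, hypothetical sample following the question's structure.
doc = {
    "image_files": [
        {"id": "img_0001", "width": 32, "file_name": "img_0001.png"},
        {"id": "img_0002", "width": 128, "file_name": "img_0002.png"},
    ],
    "annotations": [
        {"id": "ann_0001", "image_id": "img_0001", "label": "A",
         "attributes": {"type": "letter", "augmented": False}},
        {"id": "ann_0002", "image_id": "img_0002", "label": "Good",
         "attributes": {"type": "word", "augmented": False}},
    ],
}
print(letter_images_with_labels(doc))
```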

Hapi.js response with "type":"Buffer"

I am using hapi.js and Sequelize with MySQL, with a route like this:
method: 'GET',
path: `/${GROUP_NAME}`,
options: {
    tags: ['api', GROUP_NAME],
    // "Get the number of beds by class"
    description: 'Mendapatkan jumlah tempat tidur berdasarkan kelas',
    // "Get the number of beds"
    notes: 'Mendapatkan jumlah tempat tidur',
    handler: async (request, h) => {
        return jlhttidurbyjenis.findAll({ attributes: ['VIP','KELAS 1','KELAS 2','KELAS 3','ICU','NICU','PICU','HCU','ICCU','ISOLASI']})
    },
    validate: {
    },
    response: {
    }
}
When I test it with Postman, the response looks like this:
[
{
"VIP": {
"type": "Buffer",
"data": [
50,
47,
50,
50
]
},
"KELAS 1": {
"type": "Buffer",
"data": [
48,
47,
48
]
},
"KELAS 2": {
"type": "Buffer",
"data": [
48,
47,
48
]
},
"KELAS 3": {
"type": "Buffer",
"data": [
48,
47,
48
]
},
"ICU": {
"type": "Buffer",
"data": [
48,
47,
48
]
},
"NICU": {
"type": "Buffer",
"data": [
48,
47,
48
]
},
"PICU": {
"type": "Buffer",
"data": [
48,
47,
48
]
},
"HCU": {
"type": "Buffer",
"data": [
48,
47,
48
]
},
"ICCU": {
"type": "Buffer",
"data": [
48,
47,
48
]
},
"ISOLASI": {
"type": "Buffer",
"data": [
48,
47,
48
]
}
}
]
How can I fix the script so that the response matches the database content, like this:
[
{
"VIP": "12/22",
"KELAS 1": "0/0",
"KELAS 2": "0/0",
"KELAS 3": "0/0",
"ICU": "0/0",
"NICU": "0/0",
"PICU": "0/0",
"HCU": "0/0",
"ICCU": "0/0",
"ISOLASI": "0/0"
}
]
That looks like a serialization issue on the Sequelize side: the columns are coming back as binary Buffers rather than strings. Look at the available configuration options for how Buffer values are serialized to JSON.
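To see what those Buffer objects actually contain, the byte values can be decoded as text. The sketch below uses Python purely for illustration (the question's own code is JavaScript), and decode_buffers is a hypothetical helper. It assumes the bytes are UTF-8 text, which holds for the values shown.

```python
def decode_buffers(row):
    """Turn {"type": "Buffer", "data": [...]} values into plain strings."""
    return {
        key: bytes(value["data"]).decode("utf-8")
        if isinstance(value, dict) and value.get("type") == "Buffer"
        else value
        for key, value in row.items()
    }


# Two columns copied from the response in the question.
row = {
    "VIP": {"type": "Buffer", "data": [50, 47, 50, 50]},
    "KELAS 1": {"type": "Buffer", "data": [48, 47, 48]},
}
print(decode_buffers(row))
```

The byte values are just the character codes of the stored text (47 is "/", 48 is "0", 50 is "2"), so the data is intact and only its representation in the response needs converting to a string.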