Related
Working with Typeform Surveys
Here is the survey design for question 1:
Here is the survey design for question 2:
Here is the survey design for question 3:
Response Data
I took the survey twice.
On the first submission:
(favorite color) red, blue, Other: yellow
(quest) To seek the holy grail
(another question) what is your name
On the second submission:
(favorite color) red, Other: orange
(quest) Other: something else
(another question) What... is the air-speed velocity of an unladen swallow?
Here is a sample JSON file, which I saved as option.json:
{ "event_id": "EVENT_ID_1", "event_type": "form_response", "form_response": { "submitted_at": "2022-07-12T22:51:01Z", "token": "TOKEN_1", "calculated": null, "answers": [{ "value": { "boolean": null, "file_url": null, "url": null, "phone_number": null, "email": null, "field": { "ref": "QUESTION_REF_1", "id": "QUESTION_ID_1", "type": "multiple_choice" }, "text": null, "number": null, "choices": { "labels": [{ "value": "red" }, { "value": "blue" }], "other": "yellow" }, "type": "choices", "date": null, "choice": null } }, { "value": { "boolean": null, "file_url": null, "url": null, "phone_number": null, "email": null, "field": { "ref": "QUESTION_REF_2", "id": "QUESTION_ID_2", "type": "multiple_choice" }, "text": null, "number": null, "choices": { "labels": [{ "value": "To seek the holy grail" }], "other": null }, "type": "choices", "date": null, "choice": null } }, { "value": { "boolean": null, "file_url": null, "url": null, "phone_number": null, "email": null, "field": { "ref": "QUESTION_REF_3", "id": "QUESTION_ID_3", "type": "short_text" }, "text": "what is your name", "number": null, "choices": null, "type": "text", "date": null, "choice": null } }], "form_id": "FORM_ID", "variables": [], "definition": { "id": "FORM_ID", "title": "Test Multiple Choice Other", "fields": [{ "value": { "ref": "QUESTION_REF_1", "id": "QUESTION_ID_1", "type": "multiple_choice", "title": "What is your favorite color?", "choices": [{ "value": { "id": "CHOICE_ID_1", "label": "red" } }, { "value": { "id": "CHOICE_ID_2", "label": "blue" } }], "allow_multiple_selections": "true", "allow_other_choice": "true" } }, { "value": { "ref": "QUESTION_REF_2", "id": "QUESTION_ID_2", "type": "multiple_choice", "title": "What is your quest?", "choices": [{ "value": { "id": "CHOICE_ID_3", "label": "To seek the holy grail" } }], "allow_multiple_selections": "true", "allow_other_choice": "true" } }, { "value": { "ref": "QUESTION_REF_3", "id": "QUESTION_ID_3", "type": "short_text", "title": "What else could I ask?", "choices": [], "allow_multiple_selections": null, "allow_other_choice": null } }] }, "hidden": null, "landed_at": "2022-07-12T22:49:34Z" }, "_sdc_received_at": "2022-07-12T22:51:33.248Z", "_sdc_sequence": "1657666262148", "_sdc_batched_at": "2022-07-12T22:57:17.785Z", "_sdc_table_version": "0" }
{ "event_id": "EVENT_ID_2", "event_type": "form_response", "form_response": { "submitted_at": "2022-07-12T22:53:08Z", "token": "TOKEN_2", "calculated": null, "answers": [{ "value": { "boolean": null, "file_url": null, "url": null, "phone_number": null, "email": null, "field": { "ref": "QUESTION_REF_1", "id": "QUESTION_ID_1", "type": "multiple_choice" }, "text": null, "number": null, "choices": { "labels": [{ "value": "red" }], "other": "orange" }, "type": "choices", "date": null, "choice": null } }, { "value": { "boolean": null, "file_url": null, "url": null, "phone_number": null, "email": null, "field": { "ref": "QUESTION_REF_2", "id": "QUESTION_ID_2", "type": "multiple_choice" }, "text": null, "number": null, "choices": { "labels": [], "other": "something else" }, "type": "choices", "date": null, "choice": null } }, { "value": { "boolean": null, "file_url": null, "url": null, "phone_number": null, "email": null, "field": { "ref": "QUESTION_REF_3", "id": "QUESTION_ID_3", "type": "short_text" }, "text": "What... is the air-speed velocity of an unladen swallow?", "number": null, "choices": null, "type": "text", "date": null, "choice": null } }], "form_id": "FORM_ID", "variables": [], "definition": { "id": "FORM_ID", "title": "Test Multiple Choice Other", "fields": [{ "value": { "ref": "QUESTION_REF_1", "id": "QUESTION_ID_1", "type": "multiple_choice", "title": "What is your favorite color?", "choices": [{ "value": { "id": "CHOICE_ID_1", "label": "red" } }, { "value": { "id": "CHOICE_ID_2", "label": "blue" } }], "allow_multiple_selections": "true", "allow_other_choice": "true" } }, { "value": { "ref": "QUESTION_REF_2", "id": "QUESTION_ID_2", "type": "multiple_choice", "title": "What is your quest?", "choices": [{ "value": { "id": "CHOICE_ID_3", "label": "To seek the holy grail" } }], "allow_multiple_selections": "true", "allow_other_choice": "true" } }, { "value": { "ref": "QUESTION_REF_3", "id": "QUESTION_ID_3", "type": "short_text", "title": "What else could I ask?", "choices": [], "allow_multiple_selections": null, "allow_other_choice": null } }] }, "hidden": null, "landed_at": "2022-07-12T22:51:27Z" }, "_sdc_received_at": "2022-07-12T22:53:38.604Z", "_sdc_sequence": "1657666388772", "_sdc_batched_at": "2022-07-12T22:57:17.818Z", "_sdc_table_version": "0" }
Then I created a table in BigQuery using that option.json file:
I am trying to flatten this in a way that I can easily share it with others.
sequence for multiple choice responses: one issue is that I need to keep track of how many options someone selects for each multiple choice question.
another problem is that I need to be able to handle the "other" option, which does not get saved in the same way as a regular multiple choice option.
What I have so far works fine as long as no one selects "optional" in the multiple choice types:
SELECT
tf.event_id,
a.value.field.id AS question_id,
f.value.title AS question_title,
a.value.type AS question_type,
CASE
WHEN a.value.type = 'choices' THEN c.value
WHEN a.value.type = 'text' THEN a.value.text
END AS value,
IF( a.value.type = 'choices' , choices_index+1 , 1 ) AS sequence,
COALESCE( ARRAY_LENGTH( a.value.choices.labels ) , 1 ) AS sequence_end,
FROM `<PROJECT_ID>.<DATASET_ID>.option` AS tf
LEFT JOIN UNNEST( tf.form_response.answers ) AS a
LEFT JOIN UNNEST( a.value.choices.labels ) AS c WITH OFFSET AS choices_index
INNER JOIN UNNEST( form_response.definition.fields ) f
ON a.value.field.id = f.value.id
Result:
Result in JSON:
[{
"event_id": "EVENT_ID_1",
"question_id": "QUESTION_ID_1",
"question_title": "What is your favorite color?",
"question_type": "choices",
"value": "red",
"sequence": "1",
"sequence_end": "2"
}, {
"event_id": "EVENT_ID_1",
"question_id": "QUESTION_ID_1",
"question_title": "What is your favorite color?",
"question_type": "choices",
"value": "blue",
"sequence": "2",
"sequence_end": "2"
}, {
"event_id": "EVENT_ID_1",
"question_id": "QUESTION_ID_2",
"question_title": "What is your quest?",
"question_type": "choices",
"value": "To seek the holy grail",
"sequence": "1",
"sequence_end": "1"
}, {
"event_id": "EVENT_ID_1",
"question_id": "QUESTION_ID_3",
"question_title": "What else could I ask?",
"question_type": "text",
"value": "what is your name",
"sequence": "1",
"sequence_end": "1"
}, {
"event_id": "EVENT_ID_2",
"question_id": "QUESTION_ID_1",
"question_title": "What is your favorite color?",
"question_type": "choices",
"value": "red",
"sequence": "1",
"sequence_end": "1"
}, {
"event_id": "EVENT_ID_2",
"question_id": "QUESTION_ID_2",
"question_title": "What is your quest?",
"question_type": "choices",
"value": null,
"sequence": null,
"sequence_end": "0"
}, {
"event_id": "EVENT_ID_2",
"question_id": "QUESTION_ID_3",
"question_title": "What else could I ask?",
"question_type": "text",
"value": "What... is the air-speed velocity of an unladen swallow?",
"sequence": "1",
"sequence_end": "1"
}]
I want the result to append the "optional" responses to the end of the list of the multiple-choice selections, which would mean my total query result should have 9 rows.
Specifically:
EVENT_ID_1, QUESTION_ID_1 should have 3 rows:
red (sequence: 1, sequence_end: 3)
blue (sequence: 2, sequence_end: 3)
yellow (sequence: 3, sequence_end: 3)
and
EVENT_ID_2, QUESTION_ID_1 should have 2 rows:
red (sequence: 1, sequence_end: 2)
orange (sequence: 2, sequence_end: 2)
and
EVENT_ID_3, QUESTION_ID_2 should have 1 non-null row:
something else (sequence: 1, sequence_end: 1)
I need your brilliance. Could you please help?
Basic SQL approach can be:
Find all the normal response values.
Find all the other response values.
Combine them together using union all
Below is an example from your shared data:
with titles as (
select distinct f.value.id question_id,f.value.title as question_title
from `TABLE_NAME` tf
,UNNEST( form_response.definition.fields ) f
)
,response as(
select tf.event_id
,a.value.field.id as question_id
,a.value.type AS question_type
,CASE
WHEN a.value.type = 'choices' THEN c.value
WHEN a.value.type = 'text' THEN a.value.text
END AS value
from `TABLE_NAME` tf
,UNNEST( tf.form_response.answers ) AS a
left join UNNEST( a.value.choices.labels ) AS c
)
,optionals as(
select distinct tf.event_id
,a.value.field.id as question_id
,a.value.type AS question_type
,a.value.choices.other value
from `TABLE_NAME` tf
,UNNEST( tf.form_response.answers ) AS a
where a.value.choices.other is not null
)
select t.question_title
,f.*
,row_number() over (partition by event_id,f.question_id,question_type ) sequence
,count(*) over (partition by event_id,f.question_id,question_type) sequence_end
from (
select * from response where value is not null
union all
select * from optionals
)f
left join titles t on f.question_id = t.question_id
order by event_id,question_id
Above is just an example, your query also works fine all you have to do is get another result set which will contain only optional value and merge both results together.
N.B Use Temp tables instead of CTE.
I'm trying to push JSON from a service in to Firebase, and it's not accepting it.
I cant figure out why. JSONLint validates it as valid.
The error I get is:
error: "Invalid data; couldn't parse JSON object, array, or value."
Here's the JSON payload:
{
"app_id": "e4805b8d5a9f4032b0bb8b6d9c6726b8",
"archived": false,
"attachments": {
"form.xml": {
"content_type": "text/xml",
"length": 4500,
"url": "https://someurl.com/form.xml"
}
},
"build_id": "3c5703e20346462ebcb07a3f36d5fe9b",
"domain": "my-test",
"edited_by_user_id": null,
"edited_on": null,
"form": {
"#type": "data",
"#name": "Beneficiary Registration",
"#uiVersion": "1",
"#version": "1",
"#xmlns": "http://openrosa.org/formdesigner/123456",
"beneficiary_age": {
"beneficiary_age_num": "8",
"beneficiary_exact_age_ind": "N"
},
"beneficiary_gender_cd": "M",
"beneficiary_id": "Hhr-O6I3C5L-J3G1L",
"case": {
"#case_id": "a07972bc-1a34-4c84-90ea-91250639f2a4",
"#date_modified": "2020-03-17T20:57:26.986000Z",
"#user_id": "2275c340c48ac83b6852035b0a15b5d3",
"#xmlns": "http://someurl.org/case/transaction/v2"
},
"case_name": "Hhr-O6I3C5L-J3G1L|joel galager|Katsekera|Katsekera|Mpando|Katsekera|KE",
"existing_beneficiaries": "Jane Doe\n\nJoan Doe",
"existing_beneficiaries_label": "",
"household_information": {
"beneficiary_household_information_display": "",
"beneficiary_location_hierarchy_1": "KE",
"beneficiary_location_hierarchy_1_text": "Kenya",
"beneficiary_location_hierarchy_2": "HF0001",
"beneficiary_location_hierarchy_2_text": "Katsekera",
"beneficiary_location_hierarchy_3": "TA0001",
"beneficiary_location_hierarchy_3_text": "Mpando",
"beneficiary_location_hierarchy_4": "GHV0001",
"beneficiary_location_hierarchy_4_text": "Katsekera",
"beneficiary_location_hierarchy_5": "V0001",
"beneficiary_location_hierarchy_5_text": "Katsekera",
"correct_information": "",
"hh_first_name": "John",
"hh_full_name": "John Doe",
"hh_id_fk": "Hhr-O6I3C5L",
"hh_last_name": "Doe",
"household_information": ""
},
"meta": {
"#xmlns": "http://openrosa.org/jr/xforms",
"appVersion": "Formplayer Version: 2.47",
"app_build_version": 1,
"commcare_version": null,
"deviceID": "Formplayer",
"drift": "0",
"geo_point": null,
"instanceID": "123-321-232",
"timeEnd": "2020-03-17T20:57:26.986000Z",
"timeStart": "2020-03-17T20:57:13.958000Z",
"userID": "2275c340c48ac83b6852035b0a15b5d3",
"username": "some_username"
},
"name_group": {
"beneficiary_first_name": "joel",
"beneficiary_full_name": "joel galager",
"beneficiary_last_name": "galager"
},
"subcase_0": {
"case": {
"#case_id": "2a4bfe27-a5c3-4f3a-8540-5f8ded86db85",
"#date_modified": "2020-03-17T20:57:26.986000Z",
"#user_id": "abc123",
"#xmlns": "http://commcarehq.org/case/transaction/v2",
"create": {
"case_name": "Hhr-O6I3C5L-J3G1L|joel galager|Katsekera|Katsekera|Mpando|Katsekera|KE",
"case_type": "beneficiary_case",
"owner_id": "abc123"
},
"index": {
"parent": {
"#text": "a07972bc-1a34-4c84-90ea-91250639f2a4",
"#case_type": "household_case"
}
},
"update": {
"beneficiary_age_num": "8",
"beneficiary_exact_age_ind": "N",
"beneficiary_first_name": "joel",
"beneficiary_full_name": "joel galager",
"beneficiary_gender_cd": "M",
"beneficiary_id": "Hhr-O6I3C5L-J3G1L",
"beneficiary_last_name": "galager",
"beneficiary_location_hierarchy_1": "KE",
"beneficiary_location_hierarchy_1_text": "Kenya",
"beneficiary_location_hierarchy_2": "HF0001",
"beneficiary_location_hierarchy_2_text": "Katsekera",
"beneficiary_location_hierarchy_3": "TA0001",
"beneficiary_location_hierarchy_3_text": "Mpando",
"beneficiary_location_hierarchy_4": "GHV0001",
"beneficiary_location_hierarchy_4_text": "Katsekera",
"beneficiary_location_hierarchy_5": "V0001",
"beneficiary_location_hierarchy_5_text": "Katsekera",
"hh_id_fk": "Hhr-O6I3C5L"
}
}
}
},
"id": "d547ccea-503c-4c50-b974-38ed564ae78a",
"indexed_on": "2020-03-17T21:01:04.695654",
"initial_processing_complete": true,
"is_phone_submission": true,
"metadata": {
"appVersion": "Formplayer Version: 2.47",
"app_build_version": 1,
"commcare_version": null,
"deviceID": "Formplayer",
"drift": "0",
"geo_point": null,
"instanceID": "01010101",
"location": null,
"timeEnd": "2020-03-17T20:57:26.986000Z",
"timeStart": "2020-03-17T20:57:13.958000Z",
"userID": "2275c340c48ac83b6852035b0a15b5d3",
"username": "myusername"
},
"problem": null,
"received_on": "2020-03-17T20:57:27.179986Z",
"resource_uri": "",
"server_modified_on": "2020-03-17T20:57:27.382548Z",
"type": "data",
"uiversion": "1",
"version": "1"
}
This is probably because of #type inside form.
The # character can't be used in RTDB paths.
See How data is structured from the docs.
Keys cannot contain
.
$
#
[
]
/
ASCII control characters 0-31 or 127
For the following JSON, I'd like to extract something like this ( is a TAB character).
CHROMOSOMES<TAB>HUMAN<TAB>1<TAB>1
...
STATUSES<TAB>name<TAB>Approved
...
ATTRIBUTES<TAB>HGNC<TAB>HGNC ID<TAB>gd_hgnc_id
...
ATTRIBUTES<TAB>EXTERNAL<TAB>NCBI Gene ID<TAB>md_eg_id<TAB>NCBI
...
ORDER_BY<TAB>HGNC ID<TAB>gd_hgnc_id
...
I'd like a smart way to extract the path info of this tree structure. Could you anybody show me the best way to do so? Thanks.
{
"CHROMOSOMES": {
"HUMAN": [
{
"name": "1",
"value": "1"
},
{
"name": "2",
"value": "2"
},
{
"name": "3",
"value": "3"
},
{
"name": "4",
"value": "4"
},
{
"name": "5",
"value": "5"
},
{
"name": "6",
"value": "6"
},
{
"name": "7",
"value": "7"
},
{
"name": "8",
"value": "8"
},
{
"name": "9",
"value": "9"
},
{
"name": "10",
"value": "10"
},
{
"name": "11",
"value": "11"
},
{
"name": "12",
"value": "12"
},
{
"name": "13",
"value": "13"
},
{
"name": "14",
"value": "14"
},
{
"name": "15",
"value": "15"
},
{
"name": "16",
"value": "16"
},
{
"name": "17",
"value": "17"
},
{
"name": "18",
"value": "18"
},
{
"name": "19",
"value": "19"
},
{
"name": "20",
"value": "20"
},
{
"name": "21",
"value": "21"
},
{
"name": "22",
"value": "22"
},
{
"name": "X",
"value": "X"
},
{
"name": "Y",
"value": "Y"
},
{
"name": "reserved loci",
"value": "reserved"
},
{
"name": "mitochondrial",
"value": "mito"
},
{
"name": "pseudoautosomal",
"value": "XandY"
}
]
},
"STATUSES": [
{
"name": "Approved",
"value": "Approved"
},
{
"name": "Entry and symbol withdrawn",
"value": "Entry Withdrawn"
}
],
"ATTRIBUTES": {
"HGNC": [
{
"name": "HGNC ID",
"value": "gd_hgnc_id"
},
{
"name": "Approved symbol",
"value": "gd_app_sym"
},
{
"name": "Approved name",
"value": "gd_app_name"
},
{
"name": "Status",
"value": "gd_status"
},
{
"name": "Locus type",
"value": "gd_locus_type"
},
{
"name": "Locus group",
"value": "gd_locus_group"
},
{
"name": "Previous symbols",
"value": "gd_prev_sym"
},
{
"name": "Previous name",
"value": "gd_prev_name"
},
{
"name": "Synonyms",
"value": "gd_aliases"
},
{
"name": "Name synonyms",
"value": "gd_name_aliases"
},
{
"name": "Chromosome",
"value": "gd_pub_chrom_map"
},
{
"name": "Date approved",
"value": "gd_date2app_or_res"
},
{
"name": "Date modified",
"value": "gd_date_mod"
},
{
"name": "Date symbol changed",
"value": "gd_date_sym_change"
},
{
"name": "Date name changed",
"value": "gd_date_name_change"
},
{
"name": "Accession numbers",
"value": "gd_pub_acc_ids"
},
{
"name": "Enzyme IDs",
"value": "gd_enz_ids"
},
{
"name": "NCBI Gene ID",
"value": "gd_pub_eg_id"
},
{
"name": "Ensembl gene ID",
"value": "gd_pub_ensembl_id"
},
{
"name": "Mouse genome database ID",
"value": "gd_mgd_id"
},
{
"name": "Specialist database links",
"value": "gd_other_ids"
},
{
"name": "Specialist database IDs",
"value": "gd_other_ids_list"
},
{
"name": "Pubmed IDs",
"value": "gd_pubmed_ids"
},
{
"name": "RefSeq IDs",
"value": "gd_pub_refseq_ids"
},
{
"name": "Gene group ID",
"value": "family.id"
},
{
"name": "Gene group name",
"value": "family.name"
},
{
"name": "CCDS IDs",
"value": "gd_ccds_ids"
},
{
"name": "Vega IDs",
"value": "gd_vega_ids"
},
{
"name": "Locus specific databases",
"value": "gd_lsdb_links"
}
],
"EXTERNAL": [
{
"name": "NCBI Gene ID",
"source": "NCBI",
"value": "md_eg_id"
},
{
"name": "OMIM ID",
"source": "OMIM",
"value": "md_mim_id"
},
{
"name": "RefSeq",
"source": "NCBI",
"value": "md_refseq_id"
},
{
"name": "UniProt ID",
"source": "UniProt",
"value": "md_prot_id"
},
{
"name": "Ensembl ID",
"source": "Ensembl",
"value": "md_ensembl_id"
},
{
"name": "Vega ID",
"source": "Vega",
"value": "md_vega_id"
},
{
"name": "UCSC ID",
"source": "UCSC",
"value": "md_ucsc_id"
},
{
"name": "Mouse genome database ID",
"source": "MGI",
"value": "md_mgd_id"
},
{
"name": "Rat genome database ID",
"source": "RGD",
"value": "md_rgd_id"
},
{
"name": "LNCipedia",
"source": "LNCipedia",
"value": "md_lncipedia"
},
{
"name": "GtRNAdb",
"source": "GtRNAdb",
"value": "md_gtrnadb"
}
]
},
"ORDER_BY": [
{
"name": "HGNC ID",
"value": "gd_hgnc_id"
},
{
"name": "Approved symbol",
"value": "gd_app_sym_sort"
},
{
"name": "Approved name",
"value": "gd_app_name"
},
{
"name": "Status",
"value": "gd_status"
},
{
"name": "Locus type",
"value": "gd_locus_type"
},
{
"name": "Locus group",
"value": "gd_locus_group"
},
{
"name": "Previous symbols",
"value": "gd_prev_sym"
},
{
"name": "Previous name",
"value": "gd_prev_name"
},
{
"name": "Synonyms",
"value": "gd_aliases"
},
{
"name": "Name synonyms",
"value": "gd_name_aliases"
},
{
"name": "Chromosome",
"value": "gd_pub_chrom_map_sort"
},
{
"name": "Date approved",
"value": "gd_date2app_or_res"
},
{
"name": "Date modified",
"value": "gd_date_mod"
},
{
"name": "Date symbol changed",
"value": "gd_date_sym_change"
},
{
"name": "Date name changed",
"value": "gd_date_name_change"
},
{
"name": "Accession numbers",
"value": "gd_pub_acc_ids"
},
{
"name": "Enzyme IDs",
"value": "gd_enz_ids"
},
{
"name": "NCBI Gene ID",
"value": "gd_pub_eg_id"
},
{
"name": "Ensembl gene ID",
"value": "gd_pub_ensembl_id"
},
{
"name": "Mouse genome database ID",
"value": "gd_mgd_id"
},
{
"name": "Specialist database links",
"value": "gd_other_ids"
},
{
"name": "Specialist database IDs",
"value": "gd_other_ids_list"
},
{
"name": "Pubmed IDs",
"value": "gd_pubmed_ids"
},
{
"name": "RefSeq IDs",
"value": "gd_pub_refseq_ids"
},
{
"name": "Gene group ID",
"value": "family.id"
},
{
"name": "Gene group name",
"value": "family.name"
},
{
"name": "CCDS IDs",
"value": "gd_ccds_ids"
},
{
"name": "Vega IDs",
"value": "gd_vega_ids"
},
{
"name": "Locus specific databases",
"value": "gd_lsdb_links"
},
{
"name": "NCBI Gene ID (supplied by NCBI)",
"value": "md_eg_id"
},
{
"name": "OMIM ID (supplied by OMIM)",
"value": "md_mim_id"
},
{
"name": "RefSeq (supplied by NCBI)",
"value": "md_refseq_id"
},
{
"name": "UniProt ID (supplied by UniProt)",
"value": "md_prot_id"
},
{
"name": "Ensembl ID (supplied by Ensembl)",
"value": "md_ensembl_id"
},
{
"name": "Vega ID (supplied by Vega)",
"value": "md_vega_id"
},
{
"name": "UCSC ID (supplied by UCSC)",
"value": "md_ucsc_id"
},
{
"name": "Mouse genome database ID (supplied by MGI)",
"value": "md_mgd_id"
},
{
"name": "Rat genome database ID (supplied by RGD)",
"value": "md_rgd_id"
},
{
"name": "LNCipedia ID (supplied by LNCipedia)",
"value": "md_lncipedia"
},
{
"name": "GtRNAdb ID (supplied by GtRNAdb)",
"value": "md_gtrnadb"
}
],
"OUTPUT": [
"Text",
"Make URL for text"
]
}
I'd like a smart way to extract the path info of this tree structure.
paths is your friend.
Given certain irregularities in the input, the exact requirements are
not always clear, but the following might be what you are looking for
and even if not, it would be easy to tweak in accordance with your
detailed requirements.
totsv.jq
def s: map(select(type=="string"));
paths as $p
| getpath($p)
| if type == "object" and has("name")
then ($p|s) + [.name, .value, (.source // empty)]
elif type == "array" and .[0] == "Text" then ($p|s) + .
else empty
end
| #tsv
Invocation
jq -crf totsv.jq chromosomes.json
Selection from output
CHROMOSOMES HUMAN 1 1
CHROMOSOMES HUMAN 2 2
...
STATUSES Approved Approved
STATUSES Entry and symbol withdrawn Entry Withdrawn
ATTRIBUTES HGNC HGNC ID gd_hgnc_id
...
ORDER_BY GtRNAdb ID (supplied by GtRNAdb) md_gtrnadb
OUTPUT Text Make URL for text
For future reference
Rather than give a very long sample input, it would be better
to give a small sample that is tightly woven with detailed requirements.
I'm trying to extract a specific value from JSON file.
the key value is: "info": "this is an example" (The key is unique)
I want to extract only the value: "this is an example"
My code:
cat 9.json | jq '.info'
result:
null
JSON file example:
{
"Event": {
"id": "13",
"orgc_id": "1",
"org_id": "1",
"date": "2019-01-09",
"threat_level_id": "3",
"info": "test9",
"published": false,
"uuid": "5c35d180",
"attribute_count": "2",
"analysis": "0",
"timestamp": "1547044733",
"distribution": "1",
"proposal_email_lock": false,
"locked": false,
"publish_timestamp": "1547034089",
"sharing_group_id": "0",
"disable_correlation": false,
"extends_uuid": "",
"event_creator_email": "o#cyhgfnt.com",
"Org": {
"id": "1",
"name": "Cygfdgfdnt",
"uuid": "5b9f938d-e3a0-4ecb-83b3-0bdeac1b41bc"
},
"Orgc": {
"id": "1",
"name": "Cyhgfgft",
"uuid": "5b9f938d-e3a0-4ecb-83b3-0bdeac1b41bc"
},
"Attribute": [{
"id": "292630",
"type": "domain",
"category": "Network activity",
"to_ids": true,
"uuid": "5c35dd94-cccc-4086-b386-682823717aa5",
"event_id": "1357",
"distribution": "5",
"timestamp": "1547034584",
"comment": "This is a comment",
"sharing_group_id": "0",
"deleted": false,
"disable_correlation": false,
"object_id": "0",
"object_relation": null,
"value": "dodskj.com",
"Galaxy": [],
"ShadowAttribute": [],
"Tag": [{
"id": "223",
"name": "kill-chain:Exploitation",
"colour": "#a80079",
"exportable": true,
"user_id": "0",
"hide_tag": false,
"numerical_value": null
}]
}, {
"id": "292631",
"type": "ip-dst",
"category": "Network activity",
"to_ids": true,
"uuid": "5c35dd94-fe90-4ef6-b3a9-682823717aa5",
"event_id": "1357",
"distribution": "5",
"timestamp": "1547044733",
"comment": "comment example",
"sharing_group_id": "0",
"deleted": false,
"disable_correlation": false,
"object_id": "0",
"object_relation": null,
"value": "8.8.6.6",
"Galaxy": [],
"ShadowAttribute": [],
"Tag": [{
"id": "247",
"name": "maec-malware-capabilities:maec-malware-capability=\"anti-removal\"",
"colour": "#3f0004",
"exportable": true,
"user_id": "0",
"hide_tag": false,
"numerical_value": null
}, {
"id": "465",
"name": "osint:lifetime=\"perpetual\"",
"colour": "#006ebe",
"exportable": true,
"user_id": "0",
"hide_tag": false,
"numerical_value": null
}]
}],
"ShadowAttribute": [],
"RelatedEvent": [],
"Galaxy": [{
"id": "3",
"uuid": "698774c7-8022-42c4-917f-8d6e4f06ada3",
"name": "Threat Actor",
"type": "threat-actor",
"description": "Threat actors are characteristics of malicious actors (or adversaries) representing a cyber attack threat including presumed intent and historically observed behaviour.",
"version": "3",
"icon": "user-secret",
"namespace": "misp",
"GalaxyCluster": [{
"id": "6397",
"collection_uuid": "7cdff317-a673-4474-84ec-4f1754947823",
"type": "threat-actor",
"value": "Sofacy",
"tag_name": "misp-galaxy:threat-actor=\"Sofacy\"",
"description": "The Sofacy Group (also known as APT28, Pawn Storm, Fancy Bear and Sednit) is a cyber espionage group believed to have ties to the Russian government. Likely operating since 2007, the group is known to target government, military, and security organizations. It has been characterized as an advanced persistent threat.",
"galaxy_id": "3",
"source": "MISP Project",
"authors": ["Alexandre Dulaunoy", "Florian Roth", "Thomas Schreck", "Timo Steffens", "Various"],
"version": "82",
"uuid": "5b4ee3ea-eee3-4c8e-8323-85ae32658754",
"tag_id": "608",
"meta": {
"cfr-suspected-state-sponsor": ["Russian Federation"],
"cfr-suspected-victims": ["Georgia", "France", "Jordan", "United States", "Hungary", "World Anti-Doping Agency", "Armenia", "Tajikistan", "Japan", "NATO", "Ukraine", "Belgium", "Pakistan", "Asia Pacific Economic Cooperation", "International Association of Athletics Federations", "Turkey", "Mongolia", "OSCE", "United Kingdom", "Germany", "Poland", "European Commission", "Afghanistan", "Kazakhstan", "China"],
"cfr-target-category": ["Government", "Military"],
"cfr-type-of-incident": ["Espionage"],
"country": ["RU"],
"refs": ["https:\/\/en.wikipedia.org\/wiki\/Sofacy_Group", "https:\/\/aptnotes.malwareconfig.com\/web\/viewer.html?file=..\/APTnotes\/2014\/apt28.pdf", "http:\/\/www.trendmicro.com\/cloud-content\/us\/pdfs\/security-intelligence\/white-papers\/wp-operation-pawn-storm.pdf", "https:\/\/www2.fireeye.com\/rs\/848-DID-242\/images\/wp-mandiant-matryoshka-mining.pdf", "https:\/\/www.crowdstrike.com\/blog\/bears-midst-intrusion-democratic-national-committee\/", "http:\/\/researchcenter.paloaltonetworks.com\/2016\/06\/unit42-new-sofacy-attacks-against-us-government-agency\/", "https:\/\/www.cfr.org\/interactive\/cyber-operations\/apt-28", "https:\/\/blogs.microsoft.com\/on-the-issues\/2018\/08\/20\/we-are-taking-new-steps-against-broadening-threats-to-democracy\/", "https:\/\/www.bleepingcomputer.com\/news\/security\/microsoft-disrupts-apt28-hacking-campaign-aimed-at-us-midterm-elections\/", "https:\/\/www.bleepingcomputer.com\/news\/security\/apt28-uses-lojax-first-uefi-rootkit-seen-in-the-wild\/"],
"synonyms": ["APT 28", "APT28", "Pawn Storm", "PawnStorm", "Fancy Bear", "Sednit", "TsarTeam", "Tsar Team", "TG-4127", "Group-4127", "STRONTIUM", "TAG_0700", "Swallowtail", "IRON TWILIGHT", "Group 74"]
}
}]
}],
"Object": [],
"Tag": [{
"id": "608",
"name": "misp-galaxy:threat-actor=\"Sofacy\"",
"colour": "#12e000",
"exportable": true,
"user_id": "0",
"hide_tag": false,
"numerical_value": null
}, {
"id": "118",
"name": "gdpr:special-categories=\"health\"",
"colour": "#3ce600",
"exportable": true,
"user_id": "0",
"hide_tag": false,
"numerical_value": null
}]
}
}
I suppose you are trying to get the .info field inside .Event which should have been written as below. Use -r for without quotes
jq '.Event.info'
I have following json:
{
"id": "1",
"name": "profile1",
"userId": "0",
"groupId": "3",
"attributes": [
{
"id": "104",
"name": "Enable",
"value": "1"
},
{
"id": "105",
"name": "TargetNode",
"value": "system1"
},
{
"id": "106",
"name": "Timeout",
"value": "30"
}
],
"xconns": [
{
"id": "1",
"locked": false,
"attributeList": [
{
"id": "101",
"name": "Lgrp",
"value": "1"
},
{
"id": "102",
"name": "IsRem",
"value": "1"
},
{
"id": "103",
"name": "Media",
"value": "1"
}
]
},
{
"id": "1",
"locked": false,
"attributeList": [
{
"id": "101",
"name": "Lgrp",
"value": "1"
},
{
"id": "102",
"name": "IsRem",
"value": "1"
},
{
"id": "103",
"name": "Media",
"value": "1"
}
]
},
{
"id": "1",
"locked": false,
"attributeList": [
{
"id": "101",
"name": "Lgrp",
"value": "1"
},
{
"id": "102",
"name": "IsRem",
"value": "1"
},
{
"id": "103",
"name": "Media",
"value": "1"
}
]
}
]
}
{
"id": "2",
"name": "profile2",
"userId": "7",
"groupId": "0",
"attributes": [
{
"id": "104",
"name": "Enable",
"value": "1"
},
{
"id": "105",
"name": "TargetNode",
"value": "system2"
},
{
"id": "106",
"name": "Timeout",
"value": "30"
}
],
"xconns": [
{
"id": "2",
"locked": false,
"attributeList": [
{
"id": "101",
"name": "Lgrp",
"value": "1"
},
{
"id": "102",
"name": "IsRem",
"value": "1"
},
{
"id": "103",
"name": "Media",
"value": "1"
}
]
},
{
"id": "2",
"locked": false,
"attributeList": [
{
"id": "101",
"name": "Lgrp",
"value": "1"
},
{
"id": "102",
"name": "IsRem",
"value": "1"
},
{
"id": "103",
"name": "Media",
"value": "1"
}
]
},
{
"id": "2",
"locked": false,
"attributeList": [
{
"id": "101",
"name": "Lgrp",
"value": "1"
},
{
"id": "102",
"name": "IsRem",
"value": "1"
},
{
"id": "103",
"name": "Media",
"value": "1"
}
]
}
]
}
I can filter following:
$ jq -r 'select([.attributes[] | .name == "TargetNode" ] | any ) | [{userId, groupId, id, name}] | .[] | if (.userId == "0") then del(.userId) else . end | if (.groupId == "0") then del(.groupId) else . end | to_entries | map("\(.key | ascii_upcase):\(.value)") | #tsv' file.json
GROUPID:3 ID:1 NAME:profile1
USERID:7 ID:2 NAME:profile2
I need to add also value of TargetNode:
GROUPID:3 ID:1 NAME:profile1 TARGETNODE:system1
USERID:7 ID:2 NAME:profile2 TARGETNODE:system2
is there a way to include it in
[{userId, groupId, id, name, TargetNode}]
to get the value of TargetNode and not null?
GROUPID:3 ID:1 NAME:profile1 TARGETNODE:null
USERID:7 ID:2 NAME:profile2 TARGETNODE:null
Update:
the solution provided by RomanPerekhrest is nearly ok, but there is one issue because the json file in real is much bigger, there are more attrobutes in "main secttion", for example:
{
"id": "1",
"name": "profile1",
"userId": "0",
"groupId": "3",
"attrib101": "A",
"attrib102": "B",
"attributes": [
...
...
it is cousing that RomanPerekhrest's jq filter returns too much...
how to rid of them too?
ID:1 NAME:profile1 GROUPID:3 ATTRIB101:A ATTRIB102:B TARGETNODE:system1
ID:2 NAME:profile2 USERID:7 ATTRIB101:C ATTRIB102:D TARGETNODE:system2
jq solution:
jq -r '.attributes |= map(select(.name == "TargetNode"))
| if (.attributes | length != 0) then .targetNode = .attributes[0].value else . end
| if (.userId == "0") then del(.userId) else . end
| if (.groupId == "0") then del(.groupId) else . end
| del(.attributes, .xconns) | to_entries
| map("\(.key | ascii_upcase):\(.value)") | #tsv' file.json
If an object with "name": "TargetNode" pair not exists - TARGETNODE won't be added into resulting structure
The output:
ID:1 NAME:profile1 GROUPID:3 TARGETNODE:system1
ID:2 NAME:profile2 USERID:7 TARGETNODE:system2