Parse nested JSON with dynamic keys in a Splunk query

I have multiple results keyed by MAC address, each containing device details.
This is the sample data:
"data": {
"a1:b2:c3:d4:11:22": {
"deviceIcons": {
"type": "Phone",
"icons": {
"3x": null,
"2x": "image.png"
}
},
"advancedDeviceId": {
"agentId": 113,
"partnerAgentId": "131",
"dhcpHostname": "Galaxy-J7",
"mac": "a1:b2:c3:d4:11:22",
"lastSeen": 12,
"model": "Android Phoe",
"id": 1
}
},
"a0:b2:c3:d4:11:22": {
"deviceIcons": {
"type": "Phone",
"icons": {
"3x": null,
"2x": "image.png"
}
},
"advancedDeviceId": {
"agentId": 113,
"partnerAgentId": "131",
"dhcpHostname": "Galaxy",
"mac": "a0:b2:c3:d4:11:22",
"lastSeen": 12,
"model": "Android Phoe",
"id": 1
}
}
}
}
How can I query Splunk over results like the above sample to get advancedDeviceId.model and advancedDeviceId.id in tabular format?

I think this will do what you want. The rex pulls the MAC address and the advancedDeviceId field name out of each flattened path, eval {item}=value turns each item/value pair back into a named field, and stats regroups those fields by address:
| spath
| untable _time column value
| rex field=column "data\.(?<address>[^.]+)\.advancedDeviceId\.(?<item>[^.]+)"
| table _time address item value
| eval {item}=value
| stats list(model) as model
        list(id) as id
        list(dhcpHostname) as dhcpHostname
        list(mac) as mac
        by address
Here is a "run anywhere" example that has two events each with two addresses:
| makeresults
| eval _raw="{\"data\":{\"a1:b2:c3:d4:11:21\":{\"deviceIcons\":{\"type\":\"Phone\",\"icons\":{\"3x\":null,\"2x\":\"image.png\"}},\"advancedDeviceId\":{\"agentId\":113,\"partnerAgentId\":\"131\",\"dhcpHostname\":\"Galaxy-J7\",\"mac\":\"a1:b2:c3:d4:11:21\",\"lastSeen\":12,\"model\":\"Android Phoe\",\"id\":1}},\"a0:b2:c3:d4:11:22\":{\"deviceIcons\":{\"type\":\"Phone\",\"icons\":{\"3x\":null,\"2x\":\"image.png\"}},\"advancedDeviceId\":{\"agentId\":113,\"partnerAgentId\":\"131\",\"dhcpHostname\":\"iPhone 6\",\"mac\":\"a0:b2:c3:d4:11:22\",\"lastSeen\":12,\"model\":\"Apple Phoe\",\"id\":2}}}}"
| append [
| makeresults
| eval _raw="{\"data\":{\"b1:b2:c3:d4:11:23\":{\"deviceIcons\":{\"type\":\"Phone\",\"icons\":{\"3x\":null,\"2x\":\"image.png\"}},\"advancedDeviceId\":{\"agentId\":113,\"partnerAgentId\":\"131\",\"dhcpHostname\":\"Nokia\",\"mac\":\"b1:b2:c3:d4:11:23\",\"lastSeen\":12,\"model\":\"Symbian Phoe\",\"id\":3}},\"b0:b2:c3:d4:11:24\":{\"deviceIcons\":{\"type\":\"Phone\",\"icons\":{\"3x\":null,\"2x\":\"image.png\"}},\"advancedDeviceId\":{\"agentId\":113,\"partnerAgentId\":\"131\",\"dhcpHostname\":\"Windows\",\"mac\":\"b0:b2:c3:d4:11:24\",\"lastSeen\":12,\"model\":\"Windows Phoe\",\"id\":4}}}}"
]
| spath
| untable _time column value
| rex field=column "data\.(?<address>[^.]+)\.advancedDeviceId\.(?<item>[^.]+)"
| table _time address item value
| eval {item}=value
| stats list(model) as model
list(id) as id
list(dhcpHostname) as dhcpHostname
list(mac) as mac
by address
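If I have traced the logic correctly, the run-anywhere example should produce a table along these lines (hand-derived from the sample events, not captured output):
+--------------------+---------------+-----+---------------+--------------------+
| address            | model         | id  | dhcpHostname  | mac                |
+--------------------+---------------+-----+---------------+--------------------+
| a0:b2:c3:d4:11:22  | Apple Phoe    | 2   | iPhone 6      | a0:b2:c3:d4:11:22  |
| a1:b2:c3:d4:11:21  | Android Phoe  | 1   | Galaxy-J7     | a1:b2:c3:d4:11:21  |
| b0:b2:c3:d4:11:24  | Windows Phoe  | 4   | Windows       | b0:b2:c3:d4:11:24  |
| b1:b2:c3:d4:11:23  | Symbian Phoe  | 3   | Nokia         | b1:b2:c3:d4:11:23  |
+--------------------+---------------+-----+---------------+--------------------+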

Related

Create a composite object from a complex JSON object using jq

I have a complex configuration file in JSON:
{
  "config": {
    ...,
    "extra": {
      ...
      "auth_namespace.com": {
        ...
        "name": "some_name",
        "id": 1,
        ...
      }
    },
    ...,
    "endpoints": [
      { ...,
        "extra": {
          "namespace_1.com": {...},
          "namespace_auth.com": { "scope": "scope1" }
        }},
      { ...
        # object without "extra" property
        ...
      },
      ...,
      { ...
        "extra": {
          "namespace_1.com": {...},
          "namespace_auth.com": { "scope": "scope2" }
        }},
      { ...
        "extra": {
          # scopes may repeat
          "namespace_auth.com": { "scope": "scope2" }
        }}
    ]
  }
}
And I want to get an output object with the properties "name", "id", and "scopes", where "scopes" is an array of unique values.
Something like this:
{
  "name": "some_name",
  "id": 1,
  "scopes": ["scope1", "scope2" ... "scopeN"]
}
I can get these properties separately, but I don't know how to combine them together:
[
  .config
  | (
      .extra["auth_namespace.com"]
      | select(.name)
      | {name, id}
    ) as $name_id
  | .endpoints[]
  | .extra["namespace_auth.com"].scope
  | select(.)
]
| unique
| {scopes: .}
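For the sample config this produces only the scopes; the $name_id binding is computed but never used in the final object:
{
  "scopes": [
    "scope1",
    "scope2"
  ]
}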
Perhaps the following is closer to what you're looking for:
.config
| (.extra."auth_namespace.com" | {id, name})
+ {scopes: .endpoints
| map( select(has("extra"))
| .extra."namespace_auth.com"
| select(has("scope"))
| .scope )
| unique }
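Saving that filter as combine.jq (a file name I am making up) and running jq -f combine.jq config.json against the sample should print the desired object, modulo key order (hand-derived; the sample's elided parts are assumed not to add scopes):
{
  "id": 1,
  "name": "some_name",
  "scopes": [
    "scope1",
    "scope2"
  ]
}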
Well, I found a solution. It's ugly, but it works.
I'd be grateful if someone could write a more elegant version.
.config
| (
.endpoints
| map(.extra["namespace_auth.com"] | select(.scope) | .[])
| unique
) as $s
| .extra["auth_namespace.com"] | select(.name)
| {name, id, scopes: $s}

Karate API framework: how to match response values with table columns?

I have the below API response sample:
{
  "items": [
    {
      "id": 11,
      "name": "SMITH",
      "prefix": "SAM",
      "code": "SSO"
    },
    {
      "id": 10,
      "name": "James",
      "prefix": "JAM",
      "code": "BBC"
    }
  ]
}
As per the above response, my test says that whenever I hit the API, ID 11 should be SMITH and ID 10 should be James.
So I thought to store this in a table and assert it against the actual response:
* table person
  | id | name  |
  | 11 | SMITH |
  | 10 | James |
  | 9  | RIO   |
Now how would I match them one by one? That is, first parse the first ID and first name from the API response and match them against the table's first ID and first name.
Please share any convenient way of doing this in Karate.
There are a few possible ways, here is one:
* def lookup = { 11: 'SMITH', 10: 'James' }
* def items =
  """
  [
    {
      "id": 11,
      "name": "SMITH",
      "prefix": "SAM",
      "code": "SSO"
    },
    {
      "id": 10,
      "name": "James",
      "prefix": "JAM",
      "code": "BBC"
    }
  ]
  """
* match each items contains { name: "#(lookup[_$.id+''])" }
And you already know how to use a table instead of JSON (see the sketch below).
Please read the docs and other Stack Overflow answers to get more ideas.
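For completeness, a minimal untested sketch of the table-driven variant, assuming a prior HTTP call has left the JSON above in response; note that table cells are evaluated as expressions, which is why the string values are quoted:
* table person
    | id | name    |
    | 11 | 'SMITH' |
    | 10 | 'James' |
* def lookup = {}
* karate.forEach(person, function(row){ lookup[row.id + ''] = row.name })
* match each response.items contains { name: "#(lookup[_$.id+''])" }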

Representing a DB schema in JSON

Let's say I have two tables in my database, employee and car, defined thusly.
employee:
+--------------+------------+
| col_name     | data_type  |
+--------------+------------+
| eid          | int        |
| name         | string     |
| salary       | int        |
| destination  | string     |
+--------------+------------+
car:
+------------+----------------+
| col_name   | data_type      |
+------------+----------------+
| cid        | int            |
| name       | string         |
| model      | string         |
| cylinders  | int            |
| price      | int            |
+------------+----------------+
I would like to export this schema to a JSON object so that I can populate an HTML dropdown menu from it - for instance, the table menu would list employee and car, and selecting employee would populate another dropdown with the column names and types for that table.
Given this use case, would the optimal JSON representation of the database be this?
{
  "employee": {
    "salary": "int",
    "destination": "string",
    "eid": "int",
    "name": "string"
  },
  "car": {
    "price": "int",
    "model": "string",
    "cylinders": "int",
    "name": "string",
    "cid": "int"
  }
}
EDIT:
Or would this be more appropriate?
{
  "employee": [
    {
      "type": "int",
      "colname": "eid"
    },
    {
      "type": "string",
      "colname": "name"
    },
    {
      "type": "int",
      "colname": "salary"
    },
    {
      "type": "string",
      "colname": "destination"
    }
  ],
  "car": [
    {
      "type": "int",
      "colname": "cid"
    },
    {
      "type": "string",
      "colname": "name"
    },
    {
      "type": "string",
      "colname": "model"
    },
    {
      "type": "int",
      "colname": "cylinders"
    },
    {
      "type": "int",
      "colname": "price"
    }
  ]
}
In the first example, all your data is stored in objects. Assuming the structure is stored in a variable mytables, you can get the table names with Object.keys(mytables), which returns ['employee', 'car']. The equivalent for the columns inside: Object.keys(mytables['employee']) returns ['salary', 'destination', 'eid', 'name'].
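To illustrate the dropdown use case with this first structure, here is a minimal browser-side sketch; the element ids #tables and #cols are my own invention, not from the question:
// Sketch: populate the table dropdown, then the column dropdown on selection
const mytables = {
  employee: { eid: 'int', name: 'string', salary: 'int', destination: 'string' },
  car: { cid: 'int', name: 'string', model: 'string', cylinders: 'int', price: 'int' }
};
const tableSelect = document.querySelector('#tables'); // assumed <select id="tables">
for (const tname of Object.keys(mytables)) {
  tableSelect.add(new Option(tname));
}
tableSelect.addEventListener('change', () => {
  const colSelect = document.querySelector('#cols');   // assumed <select id="cols">
  colSelect.length = 0;                                // drop the previous options
  for (const [col, type] of Object.entries(mytables[tableSelect.value])) {
    colSelect.add(new Option(col + ' (' + type + ')'));
  }
});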
In the second example I would suggest also storing the tables in an array, like the columns:
[
  {
    name: 'employee',
    cols: [
      {
        "type": "int",
        "colname": "eid"
      },
      ...
    ]
  },
  ...
]
Then you can easily iterate over the array and get the names by accessing mytables[i].name:
for (t in tables) {
  console.log(tables[t].name);
  for (c in tables[t].cols)
    console.log(" - ", tables[t].cols[c].colname, ": ", tables[t].cols[c].type);
}
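Assuming tables holds the array-based structure filled in for both tables, the loop should print something like:
employee
 -  eid :  int
 -  name :  string
...
car
 -  cid :  int
...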

Parse a table with Unicode chars in variable names from JSON with SAS Base

I've run into a problem parsing JSON with Unicode characters in the variable names.
I have the following JSON (example):
{
  "SASJSONExport":"1.0",
  "SASTableData+TEST":[
    {
      "\u041f\u0435\u0440\u0435\u043c\u0435\u043d\u043d\u0430\u044f":2,
      "\u0421\u0440\u0435\u0434\u043d\u0435\u0435":4,
      "\u0421\u0442\u0440\u043e\u043a\u0430":"\u0427\u0442\u043e\u002d\u0442\u043e\u0031"
    },
    {
      "\u041f\u0435\u0440\u0435\u043c\u0435\u043d\u043d\u0430\u044f":2,
      "\u0421\u0440\u0435\u0434\u043d\u0435\u0435":2,
      "\u0421\u0442\u0440\u043e\u043a\u0430":"\u0427\u0442\u043e\u002d\u0442\u043e\u0032"
    },
    {
      "\u041f\u0435\u0440\u0435\u043c\u0435\u043d\u043d\u0430\u044f":1,
      "\u0421\u0440\u0435\u0434\u043d\u0435\u0435":42,
      "\u0421\u0442\u0440\u043e\u043a\u0430":"\u0427\u0442\u043e\u002d\u0442\u043e\u0033"
    }
  ]
}
To parse the table from the JSON I use the SAS JSON engine:
libname jsonfl JSON fileref=injson;
The code above decodes the characters in the cell values, but the variable names come out looking like runs of missing-value underscores:
+--------------+---------------------------+------------+---------+---------+
| ordinal_root | ordinal_SASTableData_TEST | __________ | _______ | ______  |
+--------------+---------------------------+------------+---------+---------+
| 1            | 1                         | 2          | 4       | Что-то1 |
| 1            | 2                         | 2          | 2       | Что-то2 |
| 1            | 3                         | 1          | 42      | Что-то3 |
+--------------+---------------------------+------------+---------+---------+
The header should look like this:
+--------------+---------------------------+------------+---------+---------+
| ordinal_root | ordinal_SASTableData_TEST | Переменная | Среднее | Строка  |
+--------------+---------------------------+------------+---------+---------+
So I decided to replace the Unicode-escaped variable names with names like DIM_N_.
For that I must find all strings that match the regexp /([\s\w\d\\]+)\"\:/ .
But to pull those strings out of the JSON I need to treat the characters '{','}','[',']',',' as delimiters, and if I set them as the DLM they are consumed, so I cannot reassemble the JSON afterwards.
So I decided to insert the character ~ before each of them and use ~ as the delimiter instead.
data delim;
infile injson lrecl=1073741823 nopad;
file delim;
input char1 $char1. ##;
if char1 in ('{','}','[',']',',') then
put '7E'x;
put char1 $CHAR1. ##;
run;
This produces an invalid JSON file:
~
{"SASJSONExport":"1.0"~
,"SASTableData+TEST":~
[ ~
{"\u0056\u0061\u0072":2~
,"\u006d\u0065\u0061\u006e":4~
,"\u004e\u0061\u006d\u0065":"\u0073\u006d\u0074\u0068\u0031"~
}~
, ~
{"\u0056\u0061\u0072":2~
,"\u006d\u0065\u0061\u006e":2~
,"\u004e\u0061\u006d\u0065":"\u0073\u006d\u0074\u0068\u0032"~
}~
, ~
{"\u0056\u0061\u0072":1~
,"\u006d\u0065\u0061\u006e":42~
,"\u004e\u0061\u006d\u0065":"\u0073\u006d\u0074\u0068\u0033"~
} ~
]~
}
As the next step I parse the JSON, using ~ as the delimiter:
data transfer;
length column $2000;
retain r;
infile delim delimiter='7E'x nopad;
input char1 : $4000. ##;
r = prxparse('/([\s\w\d\\]+)\"\:/');
pos = prxmatch(r,char1);
column = prxposn(r,1,char1);
n= _n_;
run;
It works... but I feel these are bad practices, and the approach has real limitations.
UPD1
Setting the options
options VALIDFMTNAME=long VALIDMEMNAME=extend VALIDVARNAME=any;
returns:
+--------------+---------------------------+----------------------------+---------+--------------+
| ordinal_root | ordinal_SASTableData_TEST | __________                 | _______ | ______       |
+--------------+---------------------------+----------------------------+---------+--------------+
| 1            | 1                         | авфа2 фвафв = фвыа - тфвыа | 4       | Что-то1 ,,,, |
| 1            | 2                         | авфа2 фвафв = фвыа - тфвыа | 2       | Что-то2      |
| 1            | 3                         | авфа2 фвафв = фвыа - тфвыа | 2017    | Что-то3      |
+--------------+---------------------------+----------------------------+---------+--------------+
So my questions are:
Can I decode the whole file without the infile statement?
Can I use an infile delimiter but set some option so that the delimiter is not removed?
Adequate criticism is welcome.
UPD: I came to a solution without having to manually edit the JSON map file, by using a regex instead.
libname _all_ clear;
filename _all_ clear;
filename _PDFOUT temp;
filename _GSFNAME temp;
proc datasets lib=work kill memtype=data nolist; quit;
filename jsf '~/sasuser.v94/.json' encoding='utf-8';
data _null_;
file jsf;
length js varchar(*);
retain js;
input;
js=unicode(_infile_);
put js;
datalines;
{
  "SASJSONExport":"1.0",
  "SASTableData+TEST":[
    {
      "\u041f\u0435\u0440\u0435\u043c\u0435\u043d\u043d\u0430\u044f":2,
      "\u0421\u0440\u0435\u0434\u043d\u0435\u0435":4,
      "\u0421\u0442\u0440\u043e\u043a\u0430":"\u0427\u0442\u043e\u002d\u0442\u043e\u0031"
    },
    {
      "\u041f\u0435\u0440\u0435\u043c\u0435\u043d\u043d\u0430\u044f":2,
      "\u0421\u0440\u0435\u0434\u043d\u0435\u0435":2,
      "\u0421\u0442\u0440\u043e\u043a\u0430":"\u0427\u0442\u043e\u002d\u0442\u043e\u0032"
    },
    {
      "\u041f\u0435\u0440\u0435\u043c\u0435\u043d\u043d\u0430\u044f":1,
      "\u0421\u0440\u0435\u0434\u043d\u0435\u0435":42,
      "\u0421\u0442\u0440\u043e\u043a\u0430":"\u0427\u0442\u043e\u002d\u0442\u043e\u0033"
    }
  ]
}
;
run;
filename jsm '~/sasuser.v94/.json.map' encoding='utf-8';
libname jsd json fileref=jsf map=jsm automap=replace;
libname jsm json fileref=jsm;
data jsmm;
merge jsm.datasets jsm.datasets_variables;
by ordinal_DATASETS;
run;
proc sort data=jsmm; by ordinal_root ordinal_DATASETS; run;
data _null_;
set work.jsmm end=last;
if _N_=1 then do;
length s varchar(*) ds varchar(*);
retain s ds prx;
s='{"DATASETS":[';
ds='';
prx=prxparse('/[^_]/');
end;
if ds=dsname then s=s||',';
else do;
ds=dsname;
if _N_^=1 then s=s||']},';
s=cats(s,'{"DSNAME":"',ds,'","TABLEPATH":"',tablepath,'","VARIABLES":[');
end;
s=cats(s,'{"NAME":"',name,'","TYPE":"',type,'","PATH":"',path,'"');
if prxmatch(prx,name) > length(name) then
s=cats(s,',"LABEL":"',scan(path,-1,'/'),'"');
s=s||'}';
if last then do;
s=s||']}]}';
file jsm;
put s;
end;
run;
libname jsd json fileref=jsf map=jsm;
proc print data=jsd.SASTableData_TEST label noobs; run;
The first variant of the solution. It is the quick'n'dirty one. First, prepare the input data:
libname _all_ clear;
filename _all_ clear;
filename jsf '~/sasuser.v94/.json' encoding='utf-8';
data _null_;
file jsf;
length js varchar(*);
input;
js=unicode(_infile_);
put js;
datalines;
{
  "SASJSONExport":"1.0",
  "SASTableData+TEST": [
    {
      "\u041f\u0435\u0440\u0435\u043c\u0435\u043d\u043d\u0430\u044f":2,
      "\u0421\u0440\u0435\u0434\u043d\u0435\u0435":4,
      "\u0421\u0442\u0440\u043e\u043a\u0430":"\u0427\u0442\u043e\u002d\u0442\u043e\u0031"
    },
    {
      "\u041f\u0435\u0440\u0435\u043c\u0435\u043d\u043d\u0430\u044f":2,
      "\u0421\u0440\u0435\u0434\u043d\u0435\u0435":2,
      "\u0421\u0442\u0440\u043e\u043a\u0430":"\u0427\u0442\u043e\u002d\u0442\u043e\u0032"
    },
    {
      "\u041f\u0435\u0440\u0435\u043c\u0435\u043d\u043d\u0430\u044f":1,
      "\u0421\u0440\u0435\u0434\u043d\u0435\u0435":42,
      "\u0421\u0442\u0440\u043e\u043a\u0430":"\u0427\u0442\u043e\u002d\u0442\u043e\u0033"
    }
  ]
}
;
run;
The output file .json:
{
  "SASJSONExport":"1.0",
  "SASTableData+TEST": [
    {
      "Переменная":2,
      "Среднее":4,
      "Строка":"Что-то1"
    },
    {
      "Переменная":2,
      "Среднее":2,
      "Строка":"Что-то2"
    },
    {
      "Переменная":1,
      "Среднее":42,
      "Строка":"Что-то3"
    }
  ]
}
Then create the json map file .json.map:
filename jsmf '~/sasuser.v94/.json.map' encoding='utf-8';
libname jsm json fileref=jsf map=jsmf automap=create;
The .json.map contents:
{
  "DATASETS": [
    {
      "DSNAME": "root",
      "TABLEPATH": "/root",
      "VARIABLES": [
        {
          "NAME": "ordinal_root",
          "TYPE": "ORDINAL",
          "PATH": "/root"
        },
        {
          "NAME": "SASJSONExport",
          "TYPE": "CHARACTER",
          "PATH": "/root/SASJSONExport",
          "CURRENT_LENGTH": 3
        }
      ]
    },
    {
      "DSNAME": "SASTableData_TEST",
      "TABLEPATH": "/root/SASTableData+TEST",
      "VARIABLES": [
        {
          "NAME": "ordinal_root",
          "TYPE": "ORDINAL",
          "PATH": "/root"
        },
        {
          "NAME": "ordinal_SASTableData_TEST",
          "TYPE": "ORDINAL",
          "PATH": "/root/SASTableData+TEST"
        },
        {
          "NAME": "____________________",
          "TYPE": "NUMERIC",
          "PATH": "/root/SASTableData+TEST/Переменная"
        },
        {
          "NAME": "______________",
          "TYPE": "NUMERIC",
          "PATH": "/root/SASTableData+TEST/Среднее"
        },
        {
          "NAME": "____________",
          "TYPE": "CHARACTER",
          "PATH": "/root/SASTableData+TEST/Строка",
          "CURRENT_LENGTH": 12
        }
      ]
    }
  ]
}
Let's change the file a bit by removing the description of the unnecessary dataset and adding labels:
{
  "DATASETS": [
    {
      "DSNAME": "SASTableData_TEST",
      "TABLEPATH": "/root/SASTableData+TEST",
      "VARIABLES": [
        {
          "NAME": "ordinal_root",
          "TYPE": "ORDINAL",
          "PATH": "/root"
        },
        {
          "NAME": "ordinal_SASTableData_TEST",
          "TYPE": "ORDINAL",
          "PATH": "/root/SASTableData+TEST"
        },
        {
          "NAME": "____________________",
          "TYPE": "NUMERIC",
          "PATH": "/root/SASTableData+TEST/Переменная",
          "LABEL": "Переменная"
        },
        {
          "NAME": "______________",
          "TYPE": "NUMERIC",
          "PATH": "/root/SASTableData+TEST/Среднее",
          "LABEL": "Среднее"
        },
        {
          "NAME": "____________",
          "TYPE": "CHARACTER",
          "PATH": "/root/SASTableData+TEST/Строка",
          "LABEL": "Строка",
          "CURRENT_LENGTH": 12
        }
      ]
    }
  ]
}
And try again:
libname jsd json fileref=jsf map=jsmf;
proc print data=jsd.SASTableData_TEST label noobs; run;
The result:
+--------------+---------------------------+------------+---------+---------+
| ordinal_root | ordinal_SASTableData_TEST | Переменная | Среднее | Строка  |
+--------------+---------------------------+------------+---------+---------+
| 1            | 1                         | 2          | 4       | Что-то1 |
| 1            | 2                         | 2          | 2       | Что-то2 |
| 1            | 3                         | 1          | 42      | Что-то3 |
+--------------+---------------------------+------------+---------+---------+
All of this was done in SAS University Edition.

Unnesting nested JSON structures in Apache Drill

I have the following JSON (roughly) and I'd like to extract the information from the header and defects fields separately:
{
  "file": {
    "header": {
      "timeStamp": "2016-03-14T00:20:15.005+04:00",
      "serialNo": "3456",
      "sensorId": "1234567890"
    },
    "defects": [
      {
        "info": {
          "systemId": "DEFCHK123",
          "numDefects": "3",
          "defectParts": [
            "003", "006", "008"
          ]
        }
      }
    ]
  }
}
I have tried to access the individual elements with file.header.timeStamp etc., but that returns null. I have tried using flatten(file) but that gives me
Cannot cast org.apache.drill.exec.vector.complex.MapVector to org.apache.drill.exec.vector.complex.RepeatedValueVector
I've looked into kvgen() but don't see how it fits my case. I tried kvgen(file.header) but that gets me
kvgen function only supports Simple maps as input
which is what I had expected anyway.
Does anyone know how I can get header and defects, so I can process the information contained in them? Ideally, I'd just select the information from header because it contains no arrays or maps, so I can take individual records as they are. For defects I'd simply use FLATTEN(defectParts) to obtain a table of the defective parts.
Any help would be appreciated.
What version of Drill are you using? I tried querying the following file on the latest master (1.7.0-SNAPSHOT):
{
  "file": {
    "header": {
      "timeStamp": "2016-03-14T00:20:15.005+04:00",
      "serialNo": "3456",
      "sensorId": "1234567890"
    },
    "defects": [
      {
        "info": {
          "systemId": "DEFCHK123",
          "numDefects": "3",
          "defectParts": [
            "003", "006", "008"
          ]
        }
      }
    ]
  }
}
{
  "file": {
    "header": {
      "timeStamp": "2016-03-14T00:20:15.005+04:00",
      "serialNo": "3456",
      "sensorId": "1234567890"
    },
    "defects": [
      {
        "info": {
          "systemId": "DEFCHK123",
          "numDefects": "3",
          "defectParts": [
            "003", "006", "008"
          ]
        }
      }
    ]
  }
}
And the following queries are working fine:
1.
select t.file.header.serialno as serialno from `parts.json` t;
+-----------+
| serialno  |
+-----------+
| 3456      |
| 3456      |
+-----------+
2 rows selected (0.098 seconds)
2.
select flatten(t.file.defects) defects from `parts.json` t;
+--------------------------------------------------------------------------------------+
| defects                                                                              |
+--------------------------------------------------------------------------------------+
| {"info":{"systemId":"DEFCHK123","numDefects":"3","defectParts":["003","006","008"]}} |
| {"info":{"systemId":"DEFCHK123","numDefects":"3","defectParts":["003","006","008"]}} |
+--------------------------------------------------------------------------------------+
3.
select q.h.serialno as serialno, q.d.info.defectParts as defectParts from (select t.file.header h, flatten(t.file.defects) d from `parts.json` t) q;
+-----------+----------------------+
| serialno  | defectParts          |
+-----------+----------------------+
| 3456      | ["003","006","008"]  |
| 3456      | ["003","006","008"]  |
+-----------+----------------------+
2 rows selected (0.126 seconds)
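Building on query 3, flattening the inner defectParts array as well should give one row per defective part. This is an untested sketch along the same lines, not verified output:
select q.h.serialno as serialno, flatten(q.d.info.defectParts) as part
from (select t.file.header h, flatten(t.file.defects) d from `parts.json` t) q;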
PS: This should've been a comment but I don't have enough rep yet!
I don't have experience with Apache Drill, but I checked the manual. Isn't this what you're looking for?
https://drill.apache.org/docs/selecting-multiple-columns-within-nested-data/
https://drill.apache.org/docs/selecting-nested-data-for-a-column/