Convert an array of hashes to a simple hash with Logstash - JSON

I am trying to use Logstash to convert this:
"attributes": [
{
"value": "ValueA",
"name": "NameA"
},
{
"value": "ValueB",
"name": "NameB"
},
{
"value": "ValueC",
"name": "NameC"
}
],
To this:
"attributes": {
"NameA": "ValueA",
"NameB": "ValueB",
"NameC": "ValueC"
}
Any recommendations?
I don't want to split this list into multiple records...

I have found the solution. For anyone dealing with a similar problem, here is the solution and a short story...
In the beginning, I tried this:
ruby {
  code => '
    xx = event.get("[path][to][data_source]")
    event.set(['_my_destination'], Hash[xx.collect { |p| [p[:name], p[:value]] }])
  '
}
But it returned an error because the set method only accepts a string as the field reference.
So I tried to do it this way:
ruby {
  code => '
    event.get("[path][to][data_source]").each do |item|
      k = item[:name]
      event.set("[_my_destination][#{k}]", item[:value])
    end
  '
}
I spent a lot of time debugging it because it works everywhere except in Logstash :-D. After some grumbling, I finally fixed it. The solution, with commentary, is as follows.
ruby {
  code => '
    i = 0 # need an index to address each hash in the array
    event.get("[json_msg][mail][headers]").each do |item|
      # need to use the event.get function to get the values
      k = event.get("[json_msg][mail][headers][#{i}][name]")
      v = event.get("[json_msg][mail][headers][#{i}][value]")
      # now it is simple
      event.set("[json_msg][headers][#{k}]", v)
      i += 1
    end
  '
}
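In hindsight, a likely reason the second attempt failed is that Logstash hands the Ruby filter string-keyed hashes, so item[:name] silently returns nil. Under that assumption (an untested sketch, not verified against the original setup), this shorter variant should also work:

ruby {
  code => '
    # assumption: event.get returns hashes keyed by strings, not symbols
    event.get("[json_msg][mail][headers]").each do |item|
      event.set("[json_msg][headers][#{item["name"]}]", item["value"])
    end
  '
}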

I think you should be able to do it with a custom Ruby script - see the ruby filter. You'll use the Event API to manipulate the event.
Maybe the aggregate filter could also be used, but a Ruby script with one loop seems more straightforward to me.

Related

Python glom with a list of records: group entries by client_id as key

I just discovered glom and the tutorial makes sense, but I can't figure out the right spec to use for Chrome BrowserHistory.json entries to create a data structure grouped by client_id, or whether this is even the right use of glom. I think I can accomplish this by other means by looping over the JSON, but I was hoping to learn more about glom and its capabilities.
The JSON has Browser_History with a list of history entries, as follows:
{
  "Browser_History": [
    {
      "favicon_url": "https://www.google.com/favicon.ico",
      "page_transition": "LINK",
      "title": "Google Takeout",
      "url": "https://takeout.google.com",
      "client_id": "abcd1234",
      "time_usec": 1424794867875291
    },
    ...
I'd like a data structure where everything is grouped by client_id, i.e. with each client_id as the key to a list of dicts, something like:
{ 'client_ids' : {
    'abcd1234' : [ {
        "title" : "Google Takeout",
        "url" : "https://takeout.google.com",
        ...
      },
      ...
    ],
    'wxyz9876' : [ {
        "title" : "Google",
        "url" : "https://www.google.com",
        ...
      },
      ...
    ]
  }
}
Is this something glom is suited for? I've been playing around with it and reading, but I can't seem to get the spec right to accomplish what I need. The best I've got without an error is:
import json
from pprint import pprint
from glom import glom

with open(history_json) as f:
    history_list = json.load(f)['Browser_History']

spec = {
    'client_ids': ['client_id']
}
pprint(glom(history_list, spec))
which gets me a list of all the client_ids, but I can't figure out how to group them together as keys rather than have them as one big list. Any help would be appreciated, thanks!
This should do the trick, although I'm not sure it's the most "glom"-ic way to achieve it.
import glom

grouping_key = "client_id"   # field to group on
result_key = "client_ids"    # key for the grouped output

def group_combine(existing, incoming):
    # existing is a dictionary used for accumulating the data
    # incoming is each item in the list (your input)
    if grouping_key in incoming:
        if incoming[grouping_key] not in existing:
            existing[incoming[grouping_key]] = []
        existing[incoming[grouping_key]].append(incoming)
    return existing

data = {'Browser_History': [{}]}  # your data structure
fold_spec = glom.Fold(glom.T, init=dict, op=group_combine)
results = glom.glom(data["Browser_History"], {result_key: fold_spec})
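For example, reusing fold_spec and result_key from above and feeding in trimmed versions of the sample entries from the question, the grouped result comes out as hoped:

data = {'Browser_History': [
    {"title": "Google Takeout", "url": "https://takeout.google.com", "client_id": "abcd1234"},
    {"title": "Google", "url": "https://www.google.com", "client_id": "wxyz9876"},
]}
results = glom.glom(data["Browser_History"], {result_key: fold_spec})
# {'client_ids': {'abcd1234': [{...}], 'wxyz9876': [{...}]}}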

Selecting specific fields from an array of objects in Couchbase

I would like to get a single field from each element of an array of objects in a Couchbase document, but I am only able to fetch the whole array of objects.
I tried to fetch the array using the following query: select countryDetails from test;
{
  "type": "countries",
  "docName": "CountryData",
  "countryDetails": [
    {
      "name": "US",
      "code": "+1",
      "stateInfo": [
        {
          "name": "Florida",
          "id": "1212"
        },
        {
          "name": "NewYork",
          "id": "1214"
        }
      ]
    },
    {
      "name": "France",
      "code": "+33",
      "stateInfo": [
        {
          "name": "Grand Est",
          "id": "5212"
        },
        {
          "name": "Brittany",
          "id": "5214"
        }
      ]
    }
  ]
}
I would like to fetch the result as [ {"name" : "US", "code" : "+1" }, {"name" : "France", "code" : "+33"} ].
If you project countryDetails, it projects the whole sub-object. If you need part of the sub-object, you need to project that explicitly.
The following ARRAY construction will provide the data representation you are expecting.
SELECT ARRAY {v.name, v.code} FOR v IN t.countryDetails END AS countryDetails
FROM test AS t
WHERE t.type = "countries";
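Run against the sample document above, that query should return something along these lines (key order may differ):

[
  {
    "countryDetails": [
      { "code": "+1", "name": "US" },
      { "code": "+33", "name": "France" }
    ]
  }
]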
What you are trying to do does not seem to be possible. You can get closer to what you want with a query like this:
select raw countryDetails from test
But the results of this query are still wrapped in an extra level of array.

Parsing, Extracting & Returning JSON as Hash

I am trying to make a localized version of this app: SMS Broadcast Ruby App
I have been able to read the JSON data from a local file and sanitize the numbers, but I have been unable to extract the values and pair them into a scrubbed hash. Here's what I have so far.
def data_from_spreadsheet
  file = open(spreadsheet_url).read
  JSON.parse(file)
end

def contacts_from_spreadsheet
  contacts = {}
  data_from_spreadsheet.each do |entry|
    puts entry['name']['number']
    contacts[sanitize(number)] = name
  end
  contacts
end
Here's the JSON data sample I'm working with.
[
{
"name": "Michael",
"number": 9045555555
},
{
"name": "Natalie",
"number": 7865555555
}
]
Here's how I would like the JSON to be expressed after the contacts_from_spreadsheet method.
{
  '19045555555' => 'Michael',
  '17865555555' => 'Natalie'
}
Any help would be much appreciated.
You could create an array of pairs (one-entry hashes) using map and then call reduce to merge them into a single hash.
data = [
  {
    "name": "Michael",
    "number": 9045555555
  },
  {
    "name": "Natalie",
    "number": 7865555555
  }
]

data.map { |e| { e[:number] => e[:name] } }.reduce Hash.new, :merge
Result: {9045555555=>"Michael", 7865555555=>"Natalie"}
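If you'd rather skip the intermediate one-pair hashes, each_with_object builds the same hash in one pass:

data.each_with_object({}) { |e, h| h[e[:number]] = e[:name] }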
You don't seem to have number or name extracted in any way, so first you'll need to update your code to get those details.
That is, if entry is a parsed JSON object, you can do the following:
def contacts_from_spreadsheet
  contacts = {}
  data_from_spreadsheet.each do |entry|
    contacts[sanitize(entry['number'])] = entry['name']
  end
  contacts
end
Not really keeping this within JSON, but I have solved the problem. Here's what I used.
def data_from_spreadsheet
  file = open(spreadsheet_url).read
  YAML.load(file)
end

def contacts_from_spreadsheet
  contacts = {}
  data_from_spreadsheet.each do |entry|
    name = entry['name']
    number = entry['phone_number'].to_s
    contacts[sanitize(number)] = name
  end
  contacts
end
This returned a clean hash:
{"+19045555555"=>"Michael", "+17865555555"=>"Natalie"}
Thanks everyone who added input!

How to validate subsets of JSON keys using match contains when there are nested JSONs in the response

From a response, I extracted a subset like this.
{
  "base": {
    "first": {
      "code": "1",
      "description": "Its First"
    },
    "second": {
      "code": "2",
      "description": "Its Second"
    },
    "default": {
      "last": {
        "code": "last",
        "description": "No"
      }
    }
  }
}
I need to do a single validation using And match X contains to check that:
inside first, the code is "1"
inside default.last, the code is "last"
Instead of using a JSON path for every validation, I am trying to extract a specific portion and validate it. When there are no nested JSON paths, I can do it very easily using And match X contains; however, when there are nested JSONs, I am not able to.
Does this work for you:
* def first = get[0] response..first
* match first.code == '1'
* def last = get[0] response..default.last
* match last.code == 'last'
Edit: OK, it looks like you want to condense this into one line as far as possible, and more importantly to be able to do contains on nested nodes. Personally, I sometimes find this not worth the trouble, but here goes.
Refer also to these short-cuts: https://github.com/intuit/karate#contains-short-cuts
* def first = { code: "1" }
* match response.base.first contains first
* match response.base contains { first: '#(^first)' }
* def last = { code: 'last' }
* match response.base contains { first: '#(^first)', default: { last: '#(^last)' } }
Hmm, my question is slightly different, I think.
For example, if I point directly to first using a JSON path and save it to a variable savedResponse, I can do this validation:
And match savedResponse contains { code: "1" }
If there were 10 key-value combinations under first and I needed to validate 6 of those, I could use the same JSON path and easily do it using match contains.
Similarly, if I save the above response to a variable savedResponse, how can I validate multiple things using match contains? The statement below will not work anyway:
And match savedResponse contains { first: { code: "1" }, last: { code: "last" } }
However, if I modify something, will it work?

Trying to alter JSON so it displays over multiple rows instead of a single line using JSON_PRETTY_PRINT, not working

Here are my results:
[{"android_id":"4b76f380a2734530","date":"11\/11\/1992","entry":"Ate a peanut"},{"android_id":"4b76f380a2734530","date":"11\/11\/1994","entry":"Ate an banana"}]
What I want it to look like:
[
  {
    "android_id": "4b76f380a2734530",
    "date": "11\/11\/1992",
    "entry": "Ate a peanut"
  },
  {
    "android_id": "4b76f380a2734530",
    "date": "11\/11\/1994",
    "entry": "Ate an banana"
  }
]
I'm trying to use the JSON_PRETTY_PRINT option, but it doesn't work even though I'm using PHP 5.6.
My code:
// Print data as a JSON string.
$json = json_encode($diary_entrys);
$json_string = json_encode($json, JSON_PRETTY_PRINT);
print_r($json);
Never mind, I should have used decode instead of encode, works fine now.
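For anyone landing here: the snippet above pretty-prints a string that is already JSON-encoded, which only escapes it again. A minimal sketch of the fix, assuming $diary_entrys is a plain PHP array:

// Encode once, with the flag:
$json_string = json_encode($diary_entrys, JSON_PRETTY_PRINT);

// Or, if you already have a JSON string, decode and re-encode:
$json_string = json_encode(json_decode($json), JSON_PRETTY_PRINT);

print_r($json_string);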