The following segment of a JSON file needs to be transformed, essentially flattening each entry under sizes into a single hash:
"sizes" : {
  "small" : {
    "w" : "680",
    "h" : "261"
  },
  "large" : {
    "w" : "878",
    "h" : "337"
  },
  "medium" : {
    "w" : "878",
    "h" : "337"
  }
}
while the child attributes are accessible as:
parent['sizes'].each do |size|
w: size['w'],
h: size['h'],
label: size
end
will not work, because size then holds the whole entry rather than just its name. How can only the hash's key string (small, large, medium) be captured?
For hashes, each calls the block once per key, passing the key and the value as block parameters.
Like this:
parent['sizes'].each do |key, size|
  # format your preferred output here
  # key is small, large, medium
  # and size is {"w"=>"680", "h"=>"261"}, ...
  # for example
  {
    w: size['w'],
    h: size['h'],
    label: key
  }
end
I'm asking for advice: what would be, in your opinion, the best and simplest way to access and replace values in a nested hash or JSON by a path held in a variable, using Ruby?
For example imagine I have json or hash with this kind of structure:
{
"name":"John",
"address":{
"street":"street 1",
"country":"country1"
},
"phone_numbers":[
{
"type":"mobile",
"number":"234234"
},
{
"type":"fixed",
"number":"2342323423"
}
]
}
And I would like to access or change the fixed phone number by a path that could be specified in a variable like this: "phone_numbers/1/number" (the separator does not matter in this case).
This is needed to retrieve values from JSON/hashes and sometimes to replace values by specifying the path to them. I found some solutions that find a value by key, but they wouldn't work here because in some hashes/JSON the same key name appears in multiple places.
I saw this one: https://github.com/chengguangnan/vine , but it does not work when the payload looks like this, because the top level is an array rather than a hash:
[
{
"value":"test1"
},
{
"value":"test2"
}
]
Hope you have some great ideas how to solve this problem.
Thank you!
EDIT:
So I tried code below with this data:
x = JSON.parse('[
{
"value":"test1"
},
{
"value":"test2"
}
]')
y = JSON.parse('{
"name":"John",
"address":{
"street":"street 1",
"country":"country1"
},
"phone_numbers":[
{
"type":"mobile",
"number":"234234"
},
{
"type":"fixed",
"number":"2342323423"
}
]
}')
p x
p y.to_h
p x.get_at_path("0/value")
p y.get_at_path("name")
And got this:
[{"value"=>"test1"}, {"value"=>"test2"}]
{"name"=>"John", "address"=>{"street"=>"street 1", "country"=>"country1"}, "phone_numbers"=>[{"type"=>"mobile", "number"=>"234234"}, {"type"=>"fixed", "number"=>"2342323423"}]}
hash_new.rb:91:in `<main>': undefined method `get_at_path' for [{"value"=>"test1"}, {"value"=>"test2"}]:Array (NoMethodError)
For y.get_at_path("name") I got nil.
You can make use of Hash#dig to get the sub-values. It keeps calling dig on the result of each step until it reaches the end of the path, and Array has dig as well, so things keep working when a step lands on an array:
# you said the separator wasn't important, so it can be changed up here
SEPARATOR = '/'.freeze

class Hash
  def get_at_path(path)
    dig(*steps_from(path))
  end

  def replace_at_path(path, new_value)
    *steps, leaf = steps_from path
    # steps is empty in the "name" example, in that case, we are operating on
    # the root (self) hash, not a subhash
    hash = steps.empty? ? self : dig(*steps)
    # note that `hash` here doesn't _have_ to be a Hash, but it needs to
    # respond to `[]=`
    hash[leaf] = new_value
  end

  private

  # the example hash uses symbols as the keys, so we'll convert each step in
  # the path to symbols (a hash from JSON.parse has string keys unless you
  # pass symbolize_names: true). If a step doesn't contain a non-digit
  # character, we'll convert it to an integer to be treated as the index
  # into an array
  def steps_from path
    path.split(SEPARATOR).map do |step|
      if step.match?(/\D/)
        step.to_sym
      else
        step.to_i
      end
    end
  end
end
and then it can be used as such (hash contains your sample input):
p hash.get_at_path("phone_numbers/1/number") # => "2342323423"
p hash.get_at_path("phone_numbers/0/type") # => "mobile"
p hash.get_at_path("name") # => "John"
p hash.get_at_path("address/street") # => "street 1"
hash.replace_at_path("phone_numbers/1/number", "123-123-1234")
hash.replace_at_path("phone_numbers/0/type", "cell phone")
hash.replace_at_path("name", "John Doe")
hash.replace_at_path("address/street", "123 Street 1")
p hash.get_at_path("phone_numbers/1/number") # => "123-123-1234"
p hash.get_at_path("phone_numbers/0/type") # => "cell phone"
p hash.get_at_path("name") # => "John Doe"
p hash.get_at_path("address/street") # => "123 Street 1"
p hash
# => {:name=>"John Doe",
# :address=>{:street=>"123 Street 1", :country=>"country1"},
# :phone_numbers=>[{:type=>"cell phone", :number=>"234234"},
# {:type=>"fixed", :number=>"123-123-1234"}]}
I have this JSON (I'm not giving you the whole thing because it's very long, but you don't need the rest):
"cve" : {
  "data_type" : "CVE",
  "data_format" : "MITRE",
  "data_version" : "4.0",
  "CVE_data_meta" : {
    "ID" : "CVE-2018-9991",
    "ASSIGNER" : "cve@mitre.org"
  },
  "affects" : {
    "vendor" : {
      "vendor_data" : [ {
        "vendor_name" : "frog_cms_project",
        "product" : {
          "product_data" : [ {
            "product_name" : "frog_cms",
            "version" : {
              "version_data" : [ {
                "version_value" : "0.9.5"
              } ]
            }
          } ]
        }
      } ]
    }
  },
What I want to do is to print the vendor name of this cve.
So, what I did is :
import json

with open("nvdcve-1.0-2018.json", "r") as file:
    data = json.load(file)

increment = 0
number_cve = data["CVE_data_numberOfCVEs"]
while increment < int(number_cve):
    print(data['CVE_Items'][increment]['cve']['CVE_data_meta']['ID'])
    print(',')
    print(data['CVE_Items'][increment]['cve']['affects']['vendor']['vendor_data'][0]['vendor_name'])
    print("\n")
    increment += 1
The reason I used a while loop is that the JSON file contains a lot of CVEs, which is why I index with data['CVE_Items'][increment]['cve'] (this part works fine: the line print (data['CVE_Items'][increment]['cve']['CVE_data_meta']['ID']) works).
My error is in the print (data['CVE_Items'][increment]['cve']['affects']['vendor']['vendor_data'][0]['vendor_name']) line: Python raises a "list index out of range" error.
But if I'm reading this JSON correctly, vendor_data is an array with one element, so the vendor name should be at ['vendor_data'][0]['vendor_name'], shouldn't it?
The only way I found to read vendor_name is:
for value in data['CVE_Items'][a]['cve']['affects']['vendor']['vendor_data']:
    print (value['vendor_name'])
instead of print (data['CVE_Items'][increment]['cve']['affects']['vendor']['vendor_data'][0]['vendor_name'])
And using a for loop for a single iteration is pretty ugly :s, but at least value is the data['CVE_Items'][a]['cve']['affects']['vendor']['vendor_data'][0] that I wanted.
Does anyone know what is going on?
Make sure every CVE_Item has a non-empty vendor_data.
Example:
import json

with open("nvdcve-1.0-2018.json", "r") as file:
    data = json.load(file)

increment = 0
number_cve = data["CVE_data_numberOfCVEs"]
while increment < int(number_cve):
    print(data['CVE_Items'][increment]['cve']['CVE_data_meta']['ID'])
    print(',')
    # only index into vendor_data when it actually has an element
    if len(data['CVE_Items'][increment]['cve']['affects']['vendor']['vendor_data']) > 0:
        print(data['CVE_Items'][increment]['cve']['affects']['vendor']['vendor_data'][0]['vendor_name'])
    print("\n")
    increment += 1
Thanks to Ron Nabuurs' answer, I found that vendor_data is sometimes empty, so there is not always a vendor_name to read. That is why the for loop works and the indexed print does not
(a for loop over an empty list simply runs zero times).
So what I did is :
try:
    print(data['CVE_Items'][increment]['cve']['affects']['vendor']['vendor_data'][0]['vendor_name'])
    print(',')
except (IndexError, KeyError):
    # vendor_data is empty or the key is missing: skip this entry
    pass
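The same check can also be written without the counter or the try/except. The following is only a minimal sketch: the helper name vendor_names and the two-item sample feed (including the placeholder ID CVE-0000-0000) are assumptions made for illustration, not part of the NVD file.

```python
def vendor_names(data):
    """Yield (cve_id, vendor_name) pairs; vendor_name is None when none is listed."""
    for item in data["CVE_Items"]:
        cve = item["cve"]
        vendor_data = cve["affects"]["vendor"]["vendor_data"]
        # An empty list is falsy, so indexing only happens when an element exists.
        name = vendor_data[0]["vendor_name"] if vendor_data else None
        yield cve["CVE_data_meta"]["ID"], name


# Hypothetical two-item feed: the second entry has an empty vendor_data list.
sample = {
    "CVE_Items": [
        {"cve": {"CVE_data_meta": {"ID": "CVE-2018-9991"},
                 "affects": {"vendor": {"vendor_data": [
                     {"vendor_name": "frog_cms_project"}]}}}},
        {"cve": {"CVE_data_meta": {"ID": "CVE-0000-0000"},
                 "affects": {"vendor": {"vendor_data": []}}}},
    ]
}

for cve_id, name in vendor_names(sample):
    print(cve_id, name if name is not None else "(no vendor listed)", sep=",")
```

Iterating over data["CVE_Items"] directly also removes the dependency on CVE_data_numberOfCVEs matching the real length of the list.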
Let's say I have the following document in a MongoDB database:
{
"assist_leaders" : {
"Steve Nash" : {
"team" : "Phoenix Suns",
"position" : "PG",
"draft_data" : {
"class" : 1996,
"pick" : 15,
"selected_by" : "Phoenix Suns",
"college" : "Santa Clara"
}
},
"LeBron James" : {
"team" : "Cleveland Cavaliers",
"position" : "SF",
"draft_data" : {
"class" : 2003,
"pick" : 1,
"selected_by" : "Cleveland Cavaliers",
"college" : "None"
}
}
}
}
I'm trying to collect a few values under "draft_data" for each player in an ORDERED list. The list needs to look like the following for this particular document:
[ [1996, 15, "Phoenix Suns"], [2003, 1, "Cleveland Cavaliers"] ]
That is, each nested list must contain the values corresponding to the "class", "pick", and "selected_by" keys, in that order. I also need the "Steve Nash" data to come before the "LeBron James" data.
How can I achieve this using pymongo? Note that the structure of the data is not set in stone so I can change this if that makes the code simpler.
I'd extract the data and turn it into a list in Python, once you've retrieved the document from MongoDB:
for doc in db.collection.find():
    for name, info in doc['assist_leaders'].items():
        draft_data = info['draft_data']
        lst = [draft_data['class'], draft_data['pick'], draft_data['selected_by']]
        print(name, lst)
A list comprehension is the way to go here (note: don't forget .iteritems() in Python 2 or .items() in Python 3, or you'll get a ValueError: too many values to unpack).
import pymongo
import numpy as np

client = pymongo.MongoClient()
db = client[database_name]  # database_name: the name of your database

dataList = [v for i in ["Steve Nash", "LeBron James"]
            for key in ["class", "pick", "selected_by"]
            for document in db.collection_name.find({"assist_leaders": {"$exists": 1}})
            for k, v in document["assist_leaders"][i]["draft_data"].items()
            if k == key]
print(dataList)
# [1996, 15, 'Phoenix Suns', 2003, 1, 'Cleveland Cavaliers']
matrix = np.reshape(dataList, [2, 3])
print(matrix)
# note: NumPy coerces the mixed values to strings, so this prints
# [['1996' '15' 'Phoenix Suns']
#  ['2003' '1' 'Cleveland Cavaliers']]
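Once a document has been fetched, plain Python is enough to build the nested list and it keeps the original value types. This is only a sketch under assumptions: the helper name draft_rows is hypothetical, and the dict below stands in for one document returned by find() so the example runs without a database.

```python
# Stand-in for one document as pymongo would return it (a plain dict).
doc = {
    "assist_leaders": {
        "Steve Nash": {
            "team": "Phoenix Suns",
            "position": "PG",
            "draft_data": {"class": 1996, "pick": 15,
                           "selected_by": "Phoenix Suns", "college": "Santa Clara"},
        },
        "LeBron James": {
            "team": "Cleveland Cavaliers",
            "position": "SF",
            "draft_data": {"class": 2003, "pick": 1,
                           "selected_by": "Cleveland Cavaliers", "college": "None"},
        },
    }
}

PLAYERS = ["Steve Nash", "LeBron James"]   # desired row order
FIELDS = ["class", "pick", "selected_by"]  # desired column order


def draft_rows(document, players, fields):
    """Return one list per player, with the draft_data values in field order."""
    leaders = document["assist_leaders"]
    return [[leaders[p]["draft_data"][f] for f in fields] for p in players]


print(draft_rows(doc, PLAYERS, FIELDS))
# [[1996, 15, 'Phoenix Suns'], [2003, 1, 'Cleveland Cavaliers']]
```

Because the order comes from the PLAYERS and FIELDS lists rather than from the dictionary, the result does not depend on how the keys are stored.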
I have retrieved remote JSON using urllib.request in Python 3 and would like to dump, line by line, only the IP addresses (i.e. ip:127.0.0.1 would become 127.0.0.1, with the next IP on the next line) when an entry matches certain criteria. The other values are a score (one integer per category) and a category (one or more string values are possible).
I want to check whether the score is higher than, say, 10 AND the category is in a list of one OR more values. If an entry fits these parameters, I just need its IP address appended as a line in a text file.
Here is how I retrieve the json:
ip_fetch = urllib.request.urlopen('https://testonly.com/ip.json').read().decode('utf8')
I have the json module loaded, but don't know where to go from here.
Example of json data I'm working with, more than one category:
"127.0.0.1" : {
"Test" : "10",
"Prod" : "20"
},
I wrote a simple example that should show you how to iterate through JSON objects and how to write to a file:
import json

j = json.loads(test)  # test is the JSON string shown further down
threshold = 10
validCategories = ["Test"]

with open("test.txt", 'w') as f:
    for ip, categories in j.items():
        addToList = False
        for category, rank in categories.items():
            # scores are stored as strings, so convert before comparing
            if category in validCategories and int(rank) >= threshold:
                addToList = True
        if addToList:
            f.write("{}\n".format(ip))
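The filtering step can also be pulled into a small function, which makes it easier to test separately from the file writing. This is a sketch under assumptions: the name matching_ips is hypothetical, the comparison is strict (score higher than the threshold, as the question words it), and the sample data is inlined so the block runs on its own.

```python
import json


def matching_ips(data, valid_categories, threshold):
    """Return the IPs with at least one listed category scoring above threshold."""
    return [
        ip
        for ip, scores in data.items()
        if any(category in valid_categories and int(score) > threshold
               for category, score in scores.items())
    ]


sample = json.loads("""
{
  "127.0.0.1": {"Test": "10", "Prod": "20"},
  "127.0.0.2": {"Test": "5", "Prod": "20"},
  "127.0.0.3": {"Test": "5", "Prod": "5", "Test2": "20"}
}
""")

ips = matching_ips(sample, {"Test", "Prod"}, 10)
print("\n".join(ips))
```

The joined string can be written to a text file in one call instead of being printed.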
I hope that helps you to get started. For testing I used the following json-string:
test = """
{
"127.0.0.1" : {
"Test" : "10",
"Prod" : "20"
},
"127.0.0.2" : {
"Test" : "5",
"Prod" : "20"
},
"127.0.0.3" : {
"Test" : "5",
"Prod" : "5",
"Test2": "20"
}
}
"""
I've written a program which processes JSON objects. Now I want to verify whether I've missed something.
Is there a JSON example of all allowed JSON structure combinations? Something like this:
{
"key1" : "value",
"key2" : 1,
"key3" : {"key1" : "value"},
"key4" : [
[
"string1",
"string2"
],
[
1,
2
],
...
],
"key5" : true,
"key6" : false,
"key7" : null,
...
}
As you can see at http://json.org/ (on the right-hand side), the grammar of JSON isn't very difficult, but I've hit several exceptions because I forgot to handle some structure combinations that are possible. E.g. an array can contain "string, number, object, array, true, false, null", but my program couldn't handle arrays inside an array until I ran into an exception. So everything was fine until I got a valid JSON object with arrays inside an array.
I want to test my program with such a JSON object (which is what I'm looking for). After this test I want to feel certain that my program handles every possible valid JSON structure without an exception.
I don't need a nesting depth of 5 or so; depth 2, or at most 3, is enough, with every base type nesting all allowed base types inside it.
Have you thought of escaped characters and objects within an object?
{
  "key1" : {
    "key1" : "value",
    "key2" : [
      "String1",
      "String2"
    ]
  },
  "key2" : "\"This is a quote\"",
  "key3" : "This contains an escaped backslash: \\",
  "key4" : "This contains accented characters: \u00eb \u00ef"
}
Note: \u00eb and \u00ef are the characters ë and ï, respectively.
Choose a programming language that supports JSON.
Try to load your JSON; on failure, the exception's message is descriptive.
Example:
Python:
import json, sys

# pass the path of the JSON file as the first command-line argument
with open(sys.argv[1]) as f:
    json.loads(f.read())
Generate:
import random, json, string

def json_null(depth = 0):
    return None

def json_int(depth = 0):
    return random.randint(-999, 999)

def json_float(depth = 0):
    return random.uniform(-999, 999)

def json_string(depth = 0):
    return ''.join(random.sample(string.printable, random.randrange(10, 40)))

def json_bool(depth = 0):
    return random.randint(0, 1) == 1

def json_list(depth):
    lst = []
    if depth:
        for i in range(random.randrange(8)):
            lst.append(gen_json(random.randrange(depth)))
    return lst

def json_object(depth):
    obj = {}
    if depth:
        for i in range(random.randrange(8)):
            obj[json_string()] = gen_json(random.randrange(depth))
    return obj

def gen_json(depth = 8):
    # containers while depth remains, scalars at depth 0
    if depth:
        return random.choice([json_list, json_object])(depth)
    else:
        return random.choice([json_null, json_int, json_float, json_string, json_bool])(depth)

print(json.dumps(gen_json(), indent = 2))
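Random generation may take many runs to hit a particular combination, so a deterministic document can complement it. The sketch below is one possible way to do that, not an official test suite: the names SCALARS, all_values and build_document, and the sample values themselves, are assumptions chosen for illustration. It places every base type inside an array and inside an object, and then nests those one level further.

```python
import json

# One or more sample values per JSON base type (the values are arbitrary).
SCALARS = ["text", "\"quoted\" \\ \u00eb", 1, -2.5, 1e3, True, False, None]


def all_values():
    """Scalars plus empty and non-empty containers."""
    return SCALARS + [
        [],
        {},
        list(SCALARS),
        {"k%d" % i: v for i, v in enumerate(SCALARS)},
    ]


def build_document():
    """Nest every value inside an array and inside an object."""
    values = all_values()
    return {
        "array_of_everything": values,
        "object_of_everything": {"key%d" % i: v for i, v in enumerate(values)},
        "arrays_in_array": [[v] for v in values],
        "objects_in_array": [{"v": v} for v in values],
        "arrays_in_object": {"key%d" % i: [v] for i, v in enumerate(values)},
        "objects_in_object": {"key%d" % i: {"v": v} for i, v in enumerate(values)},
    }


document = build_document()
text = json.dumps(document, indent=2)
# Round-trip check: the serialized text parses back to the same structure.
assert json.loads(text) == document
print(text)
```

A top-level array is valid JSON too, so it may be worth feeding json.dumps(all_values()) to the program as a second test input.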