Parse JSON where object is variable - json

I get the following JSON response from the Alpha Vantage stock API:
"Time Series (Daily)": {
"2018-07-09": {
"1. open": "142.6000",
...
},
I get different dates depending on the location of the stock exchange. On some, it might say "2018-07-09", and on others "2018-07-10".
Is there a way to reach the attributes of an object without typing its key, something along the lines of:
["Time Series (Daily)"]["first_child"]["1. open"]
Using:
["Time Series (Daily)"][0]["1. open"]
doesn't work.

Since the date key varies, take the first value of the hash without naming the key:
ruby_hash["Time Series (Daily)"].values.first["1. open"]

Adapted from How can I get the key Values in a JSON array in ruby?
require 'json'

myData = JSON.parse('{"A": {"B": {"C": {"D": "open sesame"}}}}')

def getProperty(index, parsedData)
  parsedData[parsedData.keys[index]]
end

myDataFirstProperty = getProperty(0, myData) # => {"B"=>{"C"=>{"D"=>"open sesame"}}}
evenDeeper = getProperty(0, myDataFirstProperty) # => {"C"=>{"D"=>"open sesame"}}
This answer can also get the Nth property of the JSON, unlike the accepted answer which can only get the first.

Related

Check if a value exists in a json file with python

I have the following JSON file (banneds.json):
{
    "players": [
        {
            "avatar": "https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/07/07aa315f664efa92456569429230bc2c254c3ff8_full.jpg",
            "created": 1595050663,
            "created_by": "<#128152620136267776>",
            "nick": "teste",
            "steam64": 76561198046619692
        },
        {
            "avatar": "https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/21/21fa5c468597e9c890212b2e3bdb0fac781c040c_full.jpg",
            "created": 1595056420,
            "created_by": "<#128152620136267776>",
            "nick": "ingridão",
            "steam64": 76561199058918551
        }
    ]
}
I want to insert new values (entered by the user) only if they are not already in the JSON. However, when I search for a value that is already there, I get False. Here is an example of what I'm doing (not the original code, just an example):
import json

check = 76561198046619692

with open('banneds.json', 'r') as file:
    data = json.load(file)

if check in data:
    print(True)
else:
    print(False)
I always get the False result, but the value is there. Can someone shed some light on what I'm doing wrong? I tried all night to find a solution, but nothing works :(
Thanks for the help!
You are checking data as a dictionary object. if check in data checks whether the data object has a key matching the value of the check variable (data.keys() lists all the keys).
One easy way would be if check in data["players"].__str__(), which converts the value to a string and searches for a match, though this can give false positives if the number happens to appear in another field.
If you want to make sure that the check value only matches "steam64" values, you can write a simple function that iterates over all "players" and checks their "steam64" values. Another solution is to build a list (or set) of the "steam64" values for faster and easier checking.
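A minimal sketch of that set-based idea (assuming the banneds.json layout shown above; set membership tests are O(1)):
import json

with open('banneds.json', 'r') as file:
    data = json.load(file)

# Collect every banned steam64 once, then test membership against the set
banned_ids = {player['steam64'] for player in data['players']}

check = 76561198046619692
print(check in banned_ids)  # True for the sample file above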
You can use any() to check whether the value of a steam64 key is there.
For example:
import json

def check_value(data, val):
    return any(player['steam64'] == val for player in data['players'])

with open('banneds.json', 'r') as f_in:
    data = json.load(f_in)

print(check_value(data, 76561198046619692))
Prints:
True

Get corresponding time and value from json using Django (Python 2)

The following is a sample of the JSON that I'm fetching from the source. I need to get certain values from it, such as time, value, and value type. The time and value are in different arrays and at different depths.
So my aim is to pair each time with its value: for value 1 I should get the timestamp 1515831588000, for 2 the timestamp 1515838788000, and for 3 the timestamp 1515845987000.
Please mind that this is just a sample JSON so ignore any mistakes that I may have made while writing this array.
{
    "feed": {
        "component": [
            {
                "stream": [
                    {
                        "statistic": [
                            {
                                "type": "DOUBLE",
                                "data": [
                                    1,
                                    2,
                                    3
                                ]
                            }
                        ],
                        "time": [
                            1515831588000,
                            1515838788000,
                            1515845987000
                        ]
                    }
                ]
            }
        ]
    },
    "message": "",
    "success": true
}
Here is a function that I have written to get the values, but the timestamp that I get in the final step is incorrect. Please help me resolve this issue.
get_feed_data() is the function that returns the above JSON.
# Fetch the component UIDs from the database
component_uids = ComponentDetail.objects.values('uid')

# Make sure we have component UIDs to begin with
if component_uids:
    for component_uid in component_uids:
        # Fetch stream UIDs for each component from the database
        stream_uids = StreamDetail.objects.values('uid').filter(
            comp_id=ComponentDetail.objects.get(uid=component_uid['uid']))
        for stream_uid in stream_uids:
            feed_data = json.loads(get_feed_data(component_uid['uid'], stream_uid['uid']))
            sd = StreamDetail.objects.get(uid=stream_uid['uid'])
            for component in feed_data['feed']['component']:
                for stream in component['stream']:
                    t = {}
                    d = {}
                    stats = {}
                    # BUG: this loop leaves t holding only the LAST timestamp
                    for time_value in stream['time']:
                        t = time_value
                    for stats in stream['statistic']:
                        for data_value in stats['data']:
                            print({
                                'uid': sd,
                                'value_type': stats['type'],
                                'timestamp': t,
                                'value': data_value
                            })
I finally got it done using lists. I simply append the data to a list and use the index later in a loop to fetch the corresponding date and time.
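The same pairing can also be done directly with zip(), since the two arrays line up by index. A short sketch against the JSON above (not the poster's exact code):
for component in feed_data['feed']['component']:
    for stream in component['stream']:
        for stat in stream['statistic']:
            # 'time' and 'data' are parallel arrays, so zip pairs them by index
            for timestamp, value in zip(stream['time'], stat['data']):
                print({'value_type': stat['type'], 'timestamp': timestamp, 'value': value})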

Parse complex Json string contained in Hadoop

I want to parse a string of complex JSON in Pig. Specifically, I want Pig to understand my JSON array as a bag instead of as a single chararray. I found that complex JSON can be parsed by using Twitter's Elephant Bird or Mozilla's Akela library. (I found some additional libraries, but I cannot use a 'Loader'-based approach since I use HCatLoader to load data from Hive.)
But the problem is the structure of my data: each value of the MAP contains the value part of a complex JSON document. For example,
1. My table looks like this (note: the type of 'complex_data' is not STRING but MAP<STRING, STRING>!):
CREATE TABLE temp_table
(
    user_id BIGINT COMMENT 'user ID.',
    complex_data MAP<STRING, STRING> COMMENT 'complex json data'
)
COMMENT 'temp data.'
PARTITIONED BY (created_date STRING)
STORED AS RCFILE;
2. And 'complex_data' contains the following (the values I want to get are marked with two *s, so basically #'d'#'f' from each PARSED_STRING(complex_data#'c')):
{ "a": "[]",
"b": "\"sdf\"",
"**c**":"[{\"**d**\":{\"e\":\"sdfsdf\"
,\"**f**\":\"sdfs\"
,\"g\":\"qweqweqwe\"},
\"c\":[{\"d\":21321,\"e\":\"ewrwer\"},
{\"d\":21321,\"e\":\"ewrwer\"},
{\"d\":21321,\"e\":\"ewrwer\"}]
},
{\"**d**\":{\"e\":\"sdfsdf\"
,\"**f**\":\"sdfs\"
,\"g\":\"qweqweqwe\"},
\"c\":[{\"d\":21321,\"e\":\"ewrwer\"},
{\"d\":21321,\"e\":\"ewrwer\"},
{\"d\":21321,\"e\":\"ewrwer\"}]
},]"
}
3. So, I tried... (same approach for Elephant Bird)
REGISTER '/path/to/akela-0.6-SNAPSHOT.jar';
DEFINE JsonTupleMap com.mozilla.pig.eval.json.JsonTupleMap();

data = LOAD 'temp_table' USING org.apache.hive.hcatalog.pig.HCatLoader();
values_of_map = FOREACH data GENERATE complex_data#'c' AS attr:chararray; -- IT WORKS
-- dump values_of_map shows correct chararray data per each row
-- eg) ([{"d":{"e":"sdfsdf","f":"sdfs","g":"sdf"},... },
--       {"d":{"e":"sdfsdf","f":"sdfs","g":"sdf"},... },
--       {"d":{"e":"sdfsdf","f":"sdfs","g":"sdf"},... }])
--     ([{"d":{"e":"sdfsdf","f":"sdfs","g":"sdf"},... },
--       {"d":{"e":"sdfsdf","f":"sdfs","g":"sdf"},... },
--       {"d":{"e":"sdfsdf","f":"sdfs","g":"sdf"},... }]) ...
attempt1 = FOREACH data GENERATE JsonTupleMap(complex_data#'c'); -- THIS LINE CAUSES AN ERROR
attempt2 = FOREACH data GENERATE JsonTupleMap(CONCAT(CONCAT('{\\"key\\":', complex_data#'c'), '}')); -- IT ALSO DOES NOT WORK
I guessed that attempt1 failed because the value doesn't contain full JSON. However, when I CONCAT as in attempt2, I generate additional \ marks (so each line starts with {\"key\": ). I'm not sure whether these additional marks break the parsing rule or not. In any case, I want to parse the given JSON string so that Pig can understand it. If you have any method or solution, please feel free to let me know.
I finally solved my problem by using the jyson library with a jython UDF.
I know that I could solve it with Java or other languages.
But I think that jython with jyson is the simplest answer to this issue.
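For reference, here is a rough sketch of what such a UDF can look like. This is my illustration rather than the author's actual code: the function name, field paths, and output schema are assumptions based on the JSON shown above, and it presumes the jyson jar is on the classpath so that JysonCodec is importable:
# json_udf.py -- register in the Pig script with:
#   REGISTER 'json_udf.py' USING jython AS json_udf;
# (the outputSchema decorator is provided by Pig when the script is registered)
from com.xhaus.jyson import JysonCodec as json

@outputSchema('values:bag{t:tuple(f:chararray)}')
def extract_f(json_string):
    # complex_data#'c' holds a JSON array; emit one tuple per element,
    # carrying its d.f value
    out = []
    for element in json.loads(json_string):
        out.append((element['d']['f'],))
    return out
In the Pig script the call would then be something like: parsed = FOREACH data GENERATE json_udf.extract_f(complex_data#'c');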

Parsing Google Custom Search API for Elasticsearch Documents

After retrieving results from the Google Custom Search API and writing them to JSON, I want to parse that JSON into valid Elasticsearch documents. You can configure a parent-child relationship for nested results; however, this relationship seems not to be inferred from the data structure itself. I've tried loading it automatically, but with no results.
Below is some example input that doesn't include things like id or index. I'm trying to focus on creating the correct data structure. I've tried modifying graph algorithms like depth-first-search but am running into problems with the different data structures.
Here's some example input:
# mock data structure
google = {"content": "foo",
          "results": {"result_one": {"persona": "phone",
                                     "personb": "phone",
                                     "personc": "phone"},
                      "result_two": ["thing1",
                                     "thing2",
                                     "thing3"],
                      "result_three": "none"},
          "query": ["Taylor Swift", "Bob Dole", "Rocketman"]}
# correctly formatted documents for _source of elasticsearch entry
correct_documents = [
{"content":"foo"},
{"results": ["result_one", "result_two", "result_three"]},
{"result_one": ["persona", "personb", "personc"]},
{"persona": "phone"},
{"personb": "phone"},
{"personc": "phone"},
{"result_two":["thing1","thing2","thing3"]},
{"result_three": "none"},
{"query": ["Taylor Swift", "Bob Dole", "Rocketman"]}
]
Here is my current approach; it is still a work in progress:
def recursive_dfs(graph, start, path=[]):
    '''recursive depth first search from start'''
    path = path + [start]
    for node in graph[start]:
        if node not in path:
            path = recursive_dfs(graph, node, path)
    return path

def branching(google):
    """Get branches as a starting point for dfs"""
    branch = 0
    while branch < len(google):
        key = list(google.keys())[branch]
        if isinstance(google[key], dict):
            # recursive_dfs(google, key)
            pass
        else:
            print("branch {}: result {}\n".format(branch, google[key]))
        branch += 1

branching(google)
You can see that recursive_dfs() still needs to be modified to handle string and list data structures.
I'll keep going at this but if you have thoughts, suggestions, or solutions then I would very much appreciate it. Thanks for your time.
Here is a possible answer to your problem.
def myfunk(inHole, outHole):
    for key in inHole.keys():
        is_list = isinstance(inHole[key], list)
        is_dict = isinstance(inHole[key], dict)
        if is_list:
            outHole.append({key: inHole[key]})
        if is_dict:
            # Record the nested dict's keys, then recurse into it
            outHole.append({key: list(inHole[key].keys())})
            myfunk(inHole[key], outHole)
        if not (is_list or is_dict):
            outHole.append({key: inHole[key]})
    # sort() returns None, so return the list itself rather than outHole.sort()
    return outHole
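A quick usage check against the mock structure above:
docs = myfunk(google, [])
# docs now holds one flat {key: value} document per node, e.g.
# {'content': 'foo'}, {'results': ['result_one', 'result_two', 'result_three']}, ...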

How to fetch a JSON file to get a row position from a given value or argument

I'm using wget to fetch several dozen JSON files on a daily basis that go like this:
{
    "results": [
        {
            "id": "ABC789",
            "title": "Apple"
        },
        {
            "id": "XYZ123",
            "title": "Orange"
        }
    ]
}
My goal is to find each row's position in each JSON file given a value or set of values (i.e. "In which row is XYZ123 located?"). In the previous example, ABC789 is in row 1, XYZ123 in row 2, and so on.
As for now I use Google Refine to "quickly" visualize (using the Text Filter option) where XYZ123 is standing (row 2).
But since it takes a while to do this manually for each file, I was wondering if there is a quick and efficient way to do it in one go.
What can I do, and how can I fetch and make the request? Thanks in advance! FoF0
In Python:
import json

# assume json_string = your loaded data
data = json.loads(json_string)
mapped_vals = []
for ent in data['results']:
    mapped_vals.append(ent['id'])
The order of items in mapped_vals follows their order in the JSON data, since a list is a sequenced collection.
In PHP:
$data = json_decode($json_string);
$output = array();
foreach ($data->results as $values) {
    $output[] = $values->id;
}
Again, the ordered nature of PHP arrays ensures that the output will be ordered as-is with regard to indexes.
Either example could be modified to use a mapped dictionary (Python) or an associative array (PHP) if needed.
You could adapt these into functions that take the id value as an argument, track how far into the array they are, and, when the value is found, break out and return the current index, as sketched below.
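A Python sketch of that adaptation (enumerate() tracks the position; the 1-based row number matches the numbering used in the question):
def find_row(data, wanted_id):
    # Walk the results in order; return the 1-based row of the first match
    for index, ent in enumerate(data['results'], start=1):
        if ent['id'] == wanted_id:
            return index
    return None

print(find_row(data, 'XYZ123'))  # 2 for the sample file above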
Wow. I posted the original question 10 months ago, when I knew nothing about Python or computer programming whatsoever!
Answer
But I learned basic Python last December and came up with a solution that not only gets the rank order but also inserts the results into a MySQL database:
import urllib.request
import json

# Make the connection and get the content
response = urllib.request.urlopen('http://whatever.com/search?=ids=1212,125,54,454')
content = response.read()

# Decode the JSON search results into a dict
json_search = json.loads(content.decode("utf8"))

# Get the 'results' key-value pairs into a list
search_data_all = []
for i in json_search['results']:
    search_data_all.append(i)

# Prepare the MySQL list with the ranking order for each id item
ranks_list_to_mysql = []
for i in range(len(search_data_all)):
    d = {}
    d['id'] = search_data_all[i]['id']
    d['rank'] = i + 1
    ranks_list_to_mysql.append(d)
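The insert step itself is not shown above; a minimal sketch of how it might look with the mysql-connector-python package (the table name, columns, and connection details here are hypothetical):
import mysql.connector

# Hypothetical connection details; adjust to your setup
conn = mysql.connector.connect(host='localhost', user='user',
                               password='pass', database='mydb')
cursor = conn.cursor()

# executemany() fills the named placeholders from each dict in the list
cursor.executemany(
    'INSERT INTO ranks (id, `rank`) VALUES (%(id)s, %(rank)s)',
    ranks_list_to_mysql)
conn.commit()
conn.close()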