Parsing JSON string in Python - json

I am new to Python, so I beg your patience!
In the following response string, I need to get the latest price (determined by timestamp) where type = 'bid'. Please suggest how I can parse the output as JSON and read the latest price:
{"dollar_pound":[
{"type":"ask","price":0.01769341,"amount":1.10113151,"tid":200019988,"timestamp":1515919171},
{"type":"ask","price":0.017755,"amount":3.95681783,"tid":200019987,"timestamp":1515919154},
{"type":"bid","price":0.01778859,"amount":3.7753814,"tid":200019986,"timestamp":1515919152},
{"type":"ask","price":0.017755,"amount":0.01216145,"tid":200019985,"timestamp":1515919147},
{"type":"ask","price":0.017755,"amount":0.05679142,"tid":200019984,"timestamp":1515919135}]}
I tried this, but it didn't work:
parsed_json = json.loads(request.text)
price = parsed_json['price'][0]

I think this may be what you want - here's a short script to get the latest price of type "bid":
# Here's a few more test cases for bid prices to let you test out your script
parsed_json = {"dollar_pound":[
{"type":"ask","price":0.01769341,"amount":1.10113151,"tid":200019988,"timestamp":1515919171},
{"type":"ask","price":0.017755,"amount":3.95681783,"tid":200019987,"timestamp":1515919154},
{"type":"bid","price":0.01778859,"amount":3.7753814,"tid":200019986,"timestamp":1515919152},
{"type":"bid","price":0.01542344,"amount":3.7753814,"tid":200019983,"timestamp":1715929152},
{"type":"bid","price":0.023455,"amount":3.7753814,"tid":200019982,"timestamp":1515919552},
{"type":"ask","price":0.017755,"amount":0.01216145,"tid":200019985,"timestamp":1515919147},
{"type":"ask","price":0.017755,"amount":0.05679142,"tid":200019984,"timestamp":1515919135}]}
# To get items of type "bid"
def get_bid_prices(parsed_json):
    return filter(lambda x: x["type"] == "bid", parsed_json)

# Now, we want the latest "bid" price, i.e. the entry with the largest "timestamp"
latest_bid_price = max(get_bid_prices(parsed_json["dollar_pound"]), key=lambda x: x["timestamp"])

# Your result will be printed here
print(latest_bid_price)  # {"type": "bid", "price": 0.01542344, "amount": 3.7753814, "tid": 200019983, "timestamp": 1715929152}

For all the good fellas struggling like me, I would like to share the answer:
json_data = json.loads(req.text)
for x in json_data['dollar_pound']:
    print(x['price'])
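The two answers above can be combined into a single `max` over a generator that keeps only the "bid" entries. This is a compact sketch using a small inline sample in place of `request.text` (the field names are taken from the question):

```python
import json

# Inline sample standing in for request.text from the question
raw = '''{"dollar_pound":[
{"type":"ask","price":0.01769341,"amount":1.10113151,"tid":200019988,"timestamp":1515919171},
{"type":"bid","price":0.01778859,"amount":3.7753814,"tid":200019986,"timestamp":1515919152},
{"type":"bid","price":0.023455,"amount":3.7753814,"tid":200019982,"timestamp":1515919552}]}'''

parsed_json = json.loads(raw)

# Latest bid = the entry with the largest timestamp among type == "bid"
latest_bid = max((t for t in parsed_json["dollar_pound"] if t["type"] == "bid"),
                 key=lambda t: t["timestamp"])
print(latest_bid["price"])
```

Note that `max` raises `ValueError` if no "bid" entries exist at all, so wrap it in a `try`/`except` if an empty result is possible.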

Related

How to convert multi dimensional array in JSON as separate columns in pandas

I have a DB collection consisting of nested strings. I am trying to convert the contents under the "status" column into separate columns against each order ID, in order to track the time taken from "order confirmed" to "pick up confirmed". The string looks as follows:
I have tried the same using
xyz_db = db.logisticsOrders               # DB collection
df = pd.DataFrame(list(xyz_db.find()))    # JSON to dataframe
Using normalize:
parse1 = pd.json_normalize(df['status'])
It works fine in the case of non-nested arrays. But since status is a nested array, the output is as follows:
Using for:
data = df[['orderid','status']]
data = list(data['status'])
dfy = pd.DataFrame(columns=['statuscode','statusname','laststatusupdatedon'])
for i in range(0, len(data)):
    result = data[i]
    dfy.loc[i] = [data[i][0], data[i][0], data[i][0], data[i][0]]
It gives the result in the form of appended rows, which is not the format I am trying to achieve.
The output I am trying to get is :
Please help out!!
I'll share the approach I used for reading JSON; maybe it helps you.
You can use it with two or more lists:
def jsonify(z):
    genr = []
    if z == z and z is not None:
        z = eval(z)
        if type(z) in (dict, list, tuple):
            for dic in z:
                for key, val in dic.items():
                    if key == "name":
                        genr.append(val)
        else:
            return None
    else:
        return None
    return genr

top_genr['genres_N'] = top_genr['genres'].apply(jsonify)
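For the nested-array case the question describes, `pd.json_normalize` with `record_path` followed by a pivot gets the statuses into one column per status name. The record shape below is a guess at the collection's layout, using the field names from the question's `dfy` columns:

```python
import pandas as pd

# Guessed record shape, with the field names from the question's dfy columns
records = [
    {"orderid": "A1", "status": [
        {"statuscode": 10, "statusname": "order confirmed",
         "laststatusupdatedon": "2021-01-01"},
        {"statuscode": 20, "statusname": "pick up confirmed",
         "laststatusupdatedon": "2021-01-02"},
    ]},
    {"orderid": "A2", "status": [
        {"statuscode": 10, "statusname": "order confirmed",
         "laststatusupdatedon": "2021-01-03"},
    ]},
]

# One row per status update, keeping the parent order id
flat = pd.json_normalize(records, record_path="status", meta=["orderid"])

# One column per status name, so each order becomes a single row
wide = flat.pivot(index="orderid", columns="statusname",
                  values="laststatusupdatedon")
```

With the timestamps laid out as columns, the confirmed-to-pickup duration is just a column subtraction after converting with `pd.to_datetime`.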

extract rows and columns from dictionary of JSON responses consisting of lists of dictionaries in python

Sorry for the confusing title.
So I'm trying to read a boatload of JSON responses using grequests with this loop:
def GetData():
    urlRespDict = {}
    for OrderNo in LookupNumbers['id']:
        urls1 = []
        for idno in ParameterList:
            urlTemp = url0_const + OrderNo + url1_const + idno + param1_const
            urls1.append(urlTemp)
        urlRespDict[OrderNo] = grequests.map(grequests.get(u) for u in urls1)
    return urlRespDict
Which is all fine and dandy; my response is a dictionary of 4 keys, each holding a list of 136 responses.
When I read one of the responses with (key and index are random):
d1 = dict_responses['180378'][0].json()
I get a list of dictionaries that has a dictionary inside, see picture below.
Basically, all I want to get out is the value from the 'values' key, which in this case is '137' and
'13,80137'. Ideally I want to create a df that has columns with the 'key' (in this case '137') and rows with the values extracted from d1.
I've tried using apply(pd.Series) on the values dict, but it is very time consuming.
like:
df2 = [(pd.DataFrame.from_records(n))['values'].apply(pd.Series,dtype="string") for n in df1]
just to see the data.
I hope there's another alternative; I am not an experienced coder.
I hope I explained it well enough and that you can help. Thank you so much in advance!
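Without the actual payload it's hard to be exact, but one alternative to the per-row `apply(pd.Series)` is to pull the inner "values" dicts out with a list comprehension and build the frame once. The shape of `d1` below is a guess based on the description:

```python
import pandas as pd

# Guessed shape of one decoded response (the question's d1): a list of
# dicts, each holding a nested "values" dict keyed by the parameter id
d1 = [
    {"ts": 1, "values": {"137": "13,80137"}},
    {"ts": 2, "values": {"137": "13,90137"}},
]

# Build the frame in one shot instead of apply(pd.Series) per row,
# which constructs a new Series object for every element
df = pd.DataFrame([entry["values"] for entry in d1])
```

This yields one column per key ("137" here) and one row per reading, matching the df layout described.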

Missing element - help json python3

I'm new to Python and am using Python 3 to display the data from my weather station.
The problem I have is it used to work perfectly until I got a replacement station.
I found the problem
In the weather data sent there are 3 fields (not sure of the correct name) but they are
lightning_strike_last_distance
lightning_strike_last_distance_msg
lightning_strike_last_epoch
In my new station these fields are completely missing as there has been no lightning since I got the new one
As a result the station display just doesn't parse the weather data as those fields are not there.
How can I get the program to check whether those fields/elements (or whatever the correct name is) are present, and if they are, parse them as usual,
but if they are not, skip them and move on to the next section?
This is the relevant section of code
lightning_strike_last_distance = forecast_json["current_conditions"]["lightning_strike_last_distance"]
lightning1 = lightning_strike_last_distance*0.621371 # Convert km to miles
data.lightning_strike_last_distance = "{0:.2f} miles".format(lightning1)
lightning_strike_last_epoch = forecast_json["current_conditions"]["lightning_strike_last_epoch"]
data.lightning_strike_last_epoch = time.strftime("%d-%m-%Y %H:%M:%S", time.localtime(lightning_strike_last_epoch))
How can I fix it so the program skips those 3 elements/sections if they are missing?
Try the following pattern:
lightning_strike_last_distance = (
    forecast_json["current_conditions"]["lightning_strike_last_distance"]
    if "lightning_strike_last_distance" in forecast_json["current_conditions"]
    else None)
It will set lightning_strike_last_distance to the value if it is present, and set it to None if it is not present.
Repeat that pattern for all the other assignments.
To test it quickly, try:
data = {"a":{"b":1,},}
valueB = data["a"]["b"] if "b" in data["a"] else None
valueC = data["a"]["c"] if "c" in data["a"] else None
print (valueB)
print (valueC)
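An equivalent, slightly shorter pattern uses `dict.get`, which returns `None` (or a default of your choice) when the key is absent. A minimal sketch with a made-up payload:

```python
# Made-up payload with the lightning keys missing, as on the new station
forecast_json = {"current_conditions": {"air_temperature": 12.3}}

current = forecast_json["current_conditions"]
# .get returns None when the key is absent instead of raising KeyError
distance = current.get("lightning_strike_last_distance")
epoch = current.get("lightning_strike_last_epoch")

# Only convert when the value is actually present
if distance is not None:
    miles = "{0:.2f} miles".format(distance * 0.621371)
```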

Sort JSON dictionaries using datetime format not consistent

I have a JSON file (post responses from an API). I need to sort the dictionaries by a certain key in order to parse the JSON file in chronological order. After studying the data, I can sort it either by the date format in metadata or by the number sequences of the S5CV[0156]P0.xml
One text example that you can load in JSON here - http://pastebin.com/0NS5BiDk
I have written 2 snippets to sort the list of objects by a certain key. The 1st one sorts by the 'text' of the xml. The 2nd one by [metadata][0][value].
The 1st one works, but a few of the XMLs, even if they are higher in number, actually contain documents older than I expected.
For the 2nd snippet the date format is not consistent and sometimes the value is not present at all. I am struggling to extract the datetime in a consistent way. The second one also gives me an error I cannot figure out: string indices must be integers.
# 1st code (it works but not ideal)
# load post response r1 in json (python 3.5)
j = r1.json()
# iterate through dictionaries and sort by the 4 num of xml (ex. 0156)
list = []
for row in j["tree"]["children"][0]["children"]:
    list.append(row)
newlist = sorted(list, key=lambda k: k['text'][-9:])
print(newlist)
# 2nd code. I need something to make a consistent datetime,
# skip missing values and solve the list index error
list = []
for row in j["tree"]["children"][0]["children"]:
    list.append(row)

# extract the last 3 blocks of characters from [metadata][0][value],
# usually like "7th april, 1922.", and transform into datetime format
# using dparser.parse
def date(key):
    return dparser.parse(' '.join(key.split(' ')[-3:]), fuzzy=True)

def order(slist):
    try:
        return sorted(slist, key=lambda k: k[date(["metadata"][0]["value"])])
    except ValueError:
        return 0

print(order(list))
# update
orig_list = j["tree"]["children"][0]["children"]
cleaned_list = sorted((x for x in orig_list if extract_date(x) != DEFAULT_DATE),
                      key=extract_date)
first_date = extract_date(cleaned_list[0])
if first_date != DEFAULT_DATE:  # valid date found?
    cleaned_list[0]['date'] = first_date
    print(first_date)
middle_date = extract_date(cleaned_list[len(cleaned_list)//2])
if middle_date != DEFAULT_DATE:  # valid date found?
    cleaned_list[0]['date'] = middle_date
    print(middle_date)
last_date = extract_date(cleaned_list[-1])
if last_date != DEFAULT_DATE:  # valid date found?
    cleaned_list[0]['date'] = last_date
    print(last_date)
Clearly you can't use the .xml filenames to sort the data if it's unreliable, so the most promising strategy seems to be what you're attempting to do in the 2nd code.
When I mentioned needing a datetime to sort the items in my comments to your other question, I literally meant something like datetime.date instances, not strings like "28th july, 1933", which wouldn't provide the proper ordering needed since they would be compared lexicographically with one another, not numerically like datetime.dates.
Here's something that seems to work. It uses the re module to search for the date pattern in the strings that usually contain them (those with a "name" associated with the value "Comprising period from"). If there's more than one date match in the string, it uses the last one. This is then converted into a date instance and returned as the value to key on.
Since some of the items don't have valid date strings, a default one is substituted for sorting purposes. In the code below, the earliest valid date is used as the default, which makes all items with date problems appear at the beginning of the sorted list. Any items following them should be in the proper order.
Not sure what you should do about items lacking date information—if it isn't there, your only options are to guess a value, ignore them, or consider it an error.
# v3.2.1
import datetime
import json
import re

# default date when one isn't found
DEFAULT_DATE = datetime.date(1, 1, datetime.MINYEAR)  # 01/01/0001

MONTHS = ('january february march april may june july august september october'
          ' november december'.split())
# dictionary to map month names to numeric values 1-12
MONTH_TO_ORDINAL = dict(zip(MONTHS, range(1, 13)))

DMY_DATE_REGEX = (r'(3[01]|[12][0-9]|[1-9])\s*(?:st|nd|rd|th)?\s*'
                  + r'(' + '|'.join(MONTHS) + r')(?:[,.])*\s*'
                  + r'([0-9]{4})')
MDY_DATE_REGEX = (r'(' + '|'.join(MONTHS) + r')\s+'
                  + r'(3[01]|[12][0-9]|[1-9])\s*(?:st|nd|rd|th)?,\s*'
                  + r'([0-9]{4})')
DMY_DATE = re.compile(DMY_DATE_REGEX, re.IGNORECASE)
MDY_DATE = re.compile(MDY_DATE_REGEX, re.IGNORECASE)

def extract_date(item):
    metadata0 = item["metadata"][0]  # check only first item in metadata list
    if metadata0.get("name") != "Comprising period from":
        return DEFAULT_DATE
    else:
        value = metadata0.get("value", "")
        matches = DMY_DATE.findall(value)  # try dmy pattern (most common)
        if matches:
            day, month, year = matches[-1]  # use last match if more than one
        else:
            matches = MDY_DATE.findall(value)  # try mdy pattern...
            if matches:
                month, day, year = matches[-1]  # use last match if more than one
            else:
                print('warning: date patterns not found in "{}"'.format(value))
                return DEFAULT_DATE
        # convert strings found into numerical values
        year, month, day = int(year), MONTH_TO_ORDINAL[month.lower()], int(day)
        return datetime.date(year, month, day)

# test files: 'json_sample.txt', 'india_congress.txt', 'olympic_games.txt'
with open('json_sample.txt', 'r') as f:
    j = json.load(f)

orig_list = j["tree"]["children"][0]["children"]
sorted_list = sorted(orig_list, key=extract_date)
for item in sorted_list:
    print(json.dumps(item, indent=4))
To answer your latest follow-on questions, you could leave out all the items in the list that don't have recognizable dates by using extract_date() to filter them out beforehand in a generator expression with something like this:
# to obtain a list containing only entries with a parsable date
cleaned_list = sorted((x for x in orig_list if extract_date(x) != DEFAULT_DATE),
                      key=extract_date)
Once you have a sorted list of items that all have a valid date, you can do things like the following, again reusing the extract_date() function:
# extract and display dates of items in cleaned list
print('first date: {}'.format(extract_date(cleaned_list[0])))
print('middle date: {}'.format(extract_date(cleaned_list[len(cleaned_list)//2])))
print('last date: {}'.format(extract_date(cleaned_list[-1])))
Calling extract_date() on the same item multiple times is somewhat inefficient. To avoid that you could easily add the datetime.date value it returns to the object on-the-fly since it's a dictionary, and then just refer to it as often as needed with very little additional overhead:
# add extracted datetime.date entry to a list item[i] if a valid one was found
date = extract_date(some_list[i])
if date != DEFAULT_DATE:  # valid date found?
    some_list[i]['date'] = date  # save by adding it to object
This effectively caches the extracted date by storing it in the item itself. Afterwards, the datetime.date value can simply be referenced with some_list[i]['date'].
As a concrete example, consider this revised example of displaying the dates of the first, middle, and last objects:
# display dates of items in cleaned list
print('first date: {}'.format(cleaned_list[0]['date']))
middle = len(cleaned_list)//2
print('middle date: {}'.format(cleaned_list[middle]['date']))
print('last date: {}'.format(cleaned_list[-1]['date']))

How to fetch a JSON file to get a row position from a given value or argument

I'm using wget to fetch several dozen JSON files on a daily basis that go like this:
{
    "results": [
        {
            "id": "ABC789",
            "title": "Apple"
        },
        {
            "id": "XYZ123",
            "title": "Orange"
        }
    ]
}
My goal is to find a row's position in each JSON file given a value or set of values (i.e. "In which row is XYZ123 located?"). In the previous example ABC789 is in row 1, XYZ123 in row 2, and so on.
As for now I use Google Refine to "quickly" visualize (using the Text Filter option) where XYZ123 is standing (row 2).
But since it takes a while to do this manually for each file I was wondering if there is a quick and efficient way in one go.
What can I do and how can I fetch and do the request? Thanks in advance! FoF0
In python:
import json

# assume json_string = your loaded data
data = json.loads(json_string)

mapped_vals = []
for ent in data['results']:
    mapped_vals.append(ent['id'])
The order of items in the list will be indexed according to the json data, since the list is a sequenced collection.
In PHP:
$data = json_decode($json_string);
$output = array();
foreach ($data->results as $values) {
    $output[] = $values->id;
}
Again, the ordered nature of PHP arrays ensure that the output will be ordered as-is with regard to indexes.
Either example could be modified to use a mapped dictionary (python) or an associative array (php) if needs demand.
You could adapt these to functions that take the id value as an argument, track how far they are into the array, and when found, break out and return the current index.
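Such a function might look like the sketch below (`find_row` is a name made up here; positions are 1-based to match the question's "row 1"):

```python
import json

# Sample file contents from the question
json_string = '''{
    "results": [
        {"id": "ABC789", "title": "Apple"},
        {"id": "XYZ123", "title": "Orange"}
    ]
}'''

def find_row(data, target_id):
    """Return the 1-based row position of target_id, or None if absent."""
    for pos, entry in enumerate(data["results"], start=1):
        if entry["id"] == target_id:
            return pos
    return None

data = json.loads(json_string)
print(find_row(data, "XYZ123"))  # 2
```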
Wow. I posted the original question 10 months ago when I knew nothing about Python nor computer programming whatsoever!
Answer
But I learned basic Python last December and came up with a solution that not only gets the rank order but also inserts the results into a MySQL database:
import urllib.request
import json

# Make connection and get the content
response = urllib.request.urlopen("http://whatever.com/search?=ids=1212,125,54,454")
content = response.read()

# Decode Json search results to type dict
json_search = json.loads(content.decode("utf8"))

# Get 'results' key-value pairs into a list
search_data_all = []
for i in json_search['results']:
    search_data_all.append(i)

# Prepare MySQL list with ranking order for each id item
ranks_list_to_mysql = []
for i in range(len(search_data_all)):
    d = {}
    d['id'] = search_data_all[i]['id']
    d['rank'] = i + 1
    ranks_list_to_mysql.append(d)
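The MySQL insert step itself isn't shown above; here's a sketch of it using the stdlib's sqlite3 as a stand-in so it runs anywhere (with MySQL, a DB-API connector follows the same executemany pattern, just with its own placeholder style). The table and column names are made up:

```python
import sqlite3

# Abbreviated stand-in for the list built from the search response above
ranks_list_to_mysql = [{'id': 'ABC789', 'rank': 1}, {'id': 'XYZ123', 'rank': 2}]

# sqlite3 used so the sketch is self-contained; swap in a MySQL connection
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE id_ranks (id TEXT, rank_no INTEGER)")
# executemany with named placeholders pulls values out of each dict
conn.executemany("INSERT INTO id_ranks (id, rank_no) VALUES (:id, :rank)",
                 ranks_list_to_mysql)
rows = conn.execute("SELECT id, rank_no FROM id_ranks ORDER BY rank_no").fetchall()
```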