In my fetched JSON data, how can I separate out the balance? - json

So, I have been testing block.io api, and so far I have this:
knee = block_io.get_address_balance(labels='shibe1')
s1 = json.dumps(knee)
d2 = json.loads(s1)
print (d2)
It returns this batch of text:
{'status': 'success', 'data': {'network': 'DOGE', 'available_balance': '0.0', 'pending_received_balance': '0.0', 'balances': [{'user_id': 1, 'label': 'shibe1', 'address': 'A9Bda9UMBcb1183PtsBxnbj5QgP6jwkCFG', 'available_balance': '0.00000000', 'pending_received_balance': '0.00000000'}]}}
How can I grab only the available_balance part of it and print just that, instead of all of the JSON data?
EDIT: Please help! I can't find a solution.

Try using some regex.
import re
data="{'status': 'success', 'data': {'network': 'DOGE', 'available_balance': '0.129',
'pending_received_balance': '0.0', 'balances': [{'user_id': 1, 'label': 'shibe1',
'address': 'A9Bda9UMBcb1183PtsBxnbj5QgP6jwkCFG', 'available_balance': '0.00000000',
'pending_received_balance': '0.00000000'}]}}"
pattern = re.compile("(?<=available_balance': ').*?(?=')")
matches = pattern.finditer(data)
for match in matches:
print(match.group())
Breakdown:
import re imports the regex module built into Python.
data="{'status': 'success', 'data': {'network': 'DOGE', 'available_balance': '0.129',
'pending_received_balance': '0.0', 'balances': [{'user_id': 1, 'label': 'shibe1',
'address': 'A9Bda9UMBcb1183PtsBxnbj5QgP6jwkCFG', 'available_balance': '0.00000000',
'pending_received_balance': '0.00000000'}]}}" is a string containing the data to match. You can replace this with the json data.
pattern = re.compile("(?<=available_balance': ').*?(?=')") compiles the regex for finding the data for available balance.
Regex breakdown
(?<=available_balance': ') is a lookbehind, which means the match must be immediately preceded by available_balance': ', so only the balance values are captured.
.*? lazily matches any characters up to the point where the lookahead succeeds.
(?=') is a lookahead, which stops the match just before the next closing single quote without consuming it.
pattern.finditer(data) finds every match of the regex in data.
for match in matches:
    print(match.group()) loops over the matches and prints each one.
If you run this code, you will get the following results:
0.129
0.00000000
If you want the code adapted to your variables, here you go:
import re
pattern = re.compile("(?<=available_balance': ').*?(?=')")
matches = pattern.finditer(str(d2))  # finditer needs a string; str(d2) uses the single quotes the pattern expects
for match in matches:
    print(match.group())
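For completeness, a minimal alternative sketch that skips the regex entirely: d2 is already a Python dictionary after json.loads (and knee was presumably one to begin with, since json.dumps accepted it), so the balances can be read directly by key, based on the structure shown in the question.
# Direct dictionary access, no regex needed
print(d2['data']['available_balance'])               # wallet-level balance
for entry in d2['data']['balances']:                 # per-address balances
    print(entry['label'], entry['available_balance'])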

Related

How to handle the variable size json file in python to create DataFrame using pandas

I am trying to build a DataFrame using pandas, but I am not able to handle the case where the JSON chunks I receive vary in size.
eg:
1st chunk:
{'ad': 0,
'country': 'US',
'ver': '1.0',
'adIdType': 2,
'adValue': '5',
'data': {'eventId': 99,
'clickId': '',
'eventType': 'PURCHASEMADE',
'tms': '2019-12-25T09:57:04+0000',
'productDetails': {'currency': 'DLR',
'productList': [
{'segment': 'Girls',
'vertical': 'Fashion Jewellery',
'brickname': 'Traditional Jewellery',
'price': 8,
'quantity': 10}]},
'transactionId': '1254'},
'appName': 'xer.tt',
'appId': 'XR',
'sdkVer': '1.0.0',
'language': 'en',
'tms': '2022-04-25T09:57:04+0000',
'tid': '124'}
2nd chunk:
{'ad': 0,
'country': 'US',
'ver': '1.0',
'adIdType': 2,
'adValue': '78',
'data': {'eventId': 7,
'clickId': '',
'eventType': 'PURCHASEMADE',
'tms': '20219-02-25T09:57:04+0000',
'productDetails': {'currency': 'DLR',
'productList': [{'segment': 'Boys',
'vertical': 'Fashion',
'brickname': 'Casuals',
'price': 10,
'quantity': 5},
{'segment': 'Girls',
'vertical': 'Fashion Jewellery',
'brickname': 'Traditional Jewellery',
'price': 8,
'quantity': 10}]},
'transactionId': '3258'},
'appName': 'xer.tt',
'appId': 'XR',
'sdkVer': '1.0.0',
'language': 'en',
'tms': '2029-02-25T09:57:04+0000',
'tid': '124'}
Now, in productDetails the number of products changes: the first chunk lists only 1 product and its details, while the 2nd chunk lists 2 products and their details. Further chunks can have ANY number of products (i.e. chunks ~ records).
I tried writing some Python scripts for this but could not come up with a good solution.
PS: If any further detail is required please let me know in the comments.
Thanks!
What you can do is use pd.json_normalize, with the innermost list of records as your record_path and all the other data you are interested in as your meta. Here is an in-depth example of how you could construct that: pandas.io.json.json_normalize with very nested json
In your case, that would for example be (for a single object):
import pandas as pd

df = pd.json_normalize(
    obj,  # one parsed JSON chunk (a dict)
    record_path=["data", "productDetails", "productList"],
    meta=[
        ["data", "productDetails", "currency"],
        ["data", "transactionId"],
        ["data", "clickId"],
        ["data", "eventType"],
        ["data", "tms"],
        "ad",
        "country",
    ],
)
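Since the question is about a variable number of chunks, one way to extend this (a sketch, assuming the parsed chunks are collected in a Python list that I will call chunks) is to normalize each chunk and concatenate the results; json_normalize handles any number of entries in productList per chunk:
import pandas as pd

# chunks: a list of parsed JSON objects shaped like the two examples above
frames = [
    pd.json_normalize(
        chunk,
        record_path=["data", "productDetails", "productList"],
        meta=[["data", "productDetails", "currency"], ["data", "transactionId"], "ad", "country"],  # trim or extend as needed
    )
    for chunk in chunks
]
df = pd.concat(frames, ignore_index=True)  # one row per product, however many each chunk contains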

converting json into pandas dataframe

I have JSON output that I would like to convert to a pandas DataFrame. I downloaded it from a website via HTTPS, using an API key. Thanks much. Here is what I coded:
json_data = vehicle_miles_traveled.json()
print(json_data)
{'request': {'command': 'series', 'series_id': 'STEO.MVVMPUS.A'}, 'series': [{'series_id': 'STEO.MVVMPUS.A', 'name': 'Vehicle Miles Traveled, Annual', 'units': 'million miles/day', 'f': 'A', 'description': 'Includes gasoline and diesel fuel vehicles', 'copyright': 'None', 'source': 'U.S. Energy Information Administration (EIA) - Short Term Energy Outlook', 'geography': 'USA', 'start': '1990', 'end': '2023', 'lastHistoricalPeriod': '2021', 'updated': '2022-03-08T12:39:35-0500', 'data': [['2023', 9247.0281671], ['2022', 9092.4575671], ['2021', 8846.1232877], ['2020', 7933.3907104], ['2019', 8936.3589041], ['2018', 8877.6027397], ['2017', 8800.9479452], ['2016', 8673.2431694], ['2015', 8480.4712329], ['2014', 8289.4684932], ['2013', 8187.0712329], ['2012', 8110.8387978], ['2011', 8083.2931507], ['2010', 8129.4958904], ['2009', 8100.7205479], ['2008', 8124.3387978], ['2007', 8300.8794521], ['2006', 8257.8520548], ['2005', 8190.2136986], ['2004', 8100.5163934], ['2003', 7918.4136986], ['2002', 7823.3123288], ['2001', 7659.2054795], ['2000', 7505.2622951], ['1999', 7340.9808219], ['1998', 7192.7780822], ['1997', 7014.7205479], ['1996', 6781.9699454], ['1995', 6637.7369863], ['1994', 6459.1452055], ['1993', 6292.3424658], ['1992', 6139.7595628], ['1991', 5951.2712329], ['1990', 5883.5643836]]}]}
It largely depends on your final goal. You could add all the metadata to a DataFrame if you want to, but I assume you are mainly interested in reading the data field into a DataFrame.
We can just get those fields by accessing:
import pandas as pd

data = json_data['series'][0]['data']
# ...and pass them to the DataFrame constructor. We can specify the column names as well!
df = pd.DataFrame(data, columns=['year', 'other_col_name'])
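As a possible follow-up (a sketch; million_miles_per_day is a placeholder column name I am choosing to match the units field in the response), the year strings can be converted to integers and used as the index:
df = pd.DataFrame(data, columns=['year', 'million_miles_per_day'])
df['year'] = df['year'].astype(int)        # the API returns years as strings like '2023'
df = df.set_index('year').sort_index()     # chronological order, oldest year first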

can't convert text data to json

I am trying to convert the following (json) string into a python data type:
data = "{'id': 26, 'photo': '/media/f082b5af-ad0.png', 'first_name': 'Islam', 'last_name': 'Mansour', 'email': 'islammansour06+8#gmail.com', 'city': 'Giza', 'cv': '/media/fbb61609-442.pdf', 'reference': 'Facebook', 'campaign': OrderedDict([('id', 2), ('name', 'javascript')]), 'status': 'Invitation Sent', 'user': None, 'at': '2020-01-20', 'time': '23:02:58.359179', 'technologies': [OrderedDict([('id', 46), ('name', 'Django'), ('category', OrderedDict([('id', 24), ('name', 'Framework'), ('_type', 'skill')]))])]}"
I am trying to convert it to JSON by using
json.loads(data.replace("\'", "\""))
but I am getting the following error:
json.decoder.JSONDecodeError: Expecting value: line 1 column 219 (char 218)
The issue is that your data is not valid JSON.
The main problem starts here: [OrderedDict([('id', 46), ('name', 'Django'), ('category', OrderedDict([('id', 24), ('name', 'Framework'), ('_type', 'skill')]))])]}. This looks like a string representation of some Python objects.
Below is a more friendly representation of your JSON data.
I have marked the problematic parts (with **), basically everywhere there is an OrderedDict.
{
   "id": 26,
   "photo": "/media/f082b5af-ad0.png",
   "first_name": "Islam",
   "last_name": "Mansour",
   "email": "islammansour06+8#gmail.com",
   "city": "Giza",
   "cv": "/media/fbb61609-442.pdf",
   "reference": "Facebook",
   "campaign": **OrderedDict**([("id", 2), ("name", "javascript")]),
   "status": "Invitation Sent",
   "user": None,
   "at": "2020-01-20",
   "time": "23:02:58.359179",
   "technologies": [**OrderedDict**([("id", 46), ("name", "Django"), ("category", **OrderedDict**([("id", 24), ("name", "Framework"), ("_type", "skill")]))])]
}
You could try making use of an [online json parser][1] which might give you some friendlier output.
[1]: http://json.parser.online.fr/
As previously said, OrderedDict is not valid JSON, but it is valid Python.
To fix it:
from collections import OrderedDict  # imported directly, because that is how it appears in your string
import json

jsonCorrect = json.dumps(eval(data))
json.loads(jsonCorrect)  # it works
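One caution, which is my own note rather than part of the answer above: eval runs arbitrary Python, so it should only be used on strings you trust. A slightly more constrained sketch limits the names eval can resolve:
import json
from collections import OrderedDict

# Expose only OrderedDict to eval and block access to builtins.
parsed = eval(data, {"__builtins__": {}}, {"OrderedDict": OrderedDict})
clean = json.loads(json.dumps(parsed))  # plain dicts and lists; None becomes null and back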
Not sure why you are adding the replace call. Should work with just the following:
json.loads(data)
You can read about it here.

Why does dask.bag.read_text(filename).map(json.loads) return a list?

I need to read several json.gz files using Dask. I am trying to achieve this by using dask.bag.read_text(filename).map(json.loads), but the output is a nested list (the files contain lists of dictionaries), whereas I would like to get just a list of dictionaries.
I have included a small example that reproduces my problem, below.
import json
import gzip
import dask.bag as db
dict_list = [{'id': 123, 'name': 'lemurt', 'indices': [1,10]}, {'id': 345, 'name': 'katin', 'indices': [2,11]}]
filename = './test.json.gz'
# Write json
with gzip.open(filename, 'wt') as write_file:
    json.dump(dict_list, write_file)
# Read json
with gzip.open(filename, "r") as read_file:
    data = json.load(read_file)
# Read json with Dask
data_dask = db.read_text(filename).map(json.loads).compute()
print(data)
print(data_dask)
I would like to get the first output:
[{'id': 123, 'name': 'lemurt', 'indices': [1, 10]}, {'id': 345, 'name': 'katin', 'indices': [2, 11]}]
But instead I get the second one:
[[{'id': 123, 'name': 'lemurt', 'indices': [1, 10]}, {'id': 345, 'name': 'katin', 'indices': [2, 11]}]]
The read_text function returns a bag where each element is a line of text, so you have a bag of strings. You then parse each of those lines with json.loads, and since each line here contains an entire JSON list, every element becomes a list. So you have a list of lists.
In your case you might use map_partitions, and a function that expects a list containing a single line of text:
b = db.read_text("*.json.gz").map_partitions(lambda L: json.loads(L[0]))
Following the comment by @MRocklin, I ended up solving my problem by changing the way I was writing the json.gz files.
Instead of
with gzip.open(filename, 'wt') as write_file:
    json.dump(dict_list, write_file)
I used
with gzip.open(filename, 'wt') as write_file:
    for dd in dict_list:
        json.dump(dd, write_file)
        write_file.write("\n")
and kept reading the files as
db.read_text(filename).map(json.loads)
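If rewriting the files is not an option, another sketch (my own suggestion, using dask.bag's flatten, which concatenates nested sequences) works with the original one-list-per-file layout:
import json
import dask.bag as db

# Each line parses to a list of dicts; flatten() turns the resulting
# bag of lists into a bag of individual dictionaries.
data_dask = db.read_text('./test.json.gz').map(json.loads).flatten().compute()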

Simple Json decoding with SimpleJSON - Python

I've just started learning Python and I'm having a go at using a Google API. But I hit a brick wall trying to parse the JSON with simplejson.
How do I go about pulling single values (i.e. the product or brand fields) out of this mess below?
{'currentItemCount': 25, 'etag': '"izYJutfqR9tRDg1H4X3fGx1UiCI/hqqZ6pMwV1-CEu5NSqfJO0Ix-gs"', 'id': 'tag:google.com,2010:shopping/products', 'items': [{'id': 'tag:google.com,2010:shopping/products/1196682/8186421160532506003',
'kind': 'shopping#product',
'product': {'author': {'accountId': '1196682',
'name': "Dillard's"},
'brand': 'Merrell',
'condition': 'new',
'country': 'US',
'creationTime': '2011-03-10T08:11:08.000Z',
'description': u'Merrell\'s "Trail Glove" barefoot running shoe lets your feet follow their natural i$
'googleId': '8186421160532506003',
'gtin': '00797240569847',
'images': [{'link': 'http://dimg.dillards.com/is/image/DillardsZoom/03528718_zi_amazon?$product$'}],
'inventories': [{'availability': 'inStock',
'channel': 'online',
'currency': 'USD',
'price': 110.0}],
'language': 'en',
'link': 'http://www.dillards.com/product/Merrell-Mens-Trail-Glove-Barefoot-Running-Shoes_301_-1_301_5$
'modificationTime': '2011-05-25T07:42:51.000Z',
'title': 'Merrell Men\'s "Trail Glove" Barefoot Running Shoes'},
'selfLink': 'https://www.googleapis.com/shopping/search/v1/public/products/1196682/gid/8186421160532506003?alt=js$
The JSON you've pasted in the question is not valid. But once you've fixed that, here's how to use simplejson:
import simplejson as json
your_response_body = '["foo", {"bar":["baz", null, 1.0, 2]}]'
obj = json.loads(your_response_body)
print(obj[1]['bar'])
And a link to the documentation.
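Applied to the structure pasted in the question, the same idea would look roughly like this (a sketch; response_body stands for the raw JSON text returned by the API, a name I am assuming here):
import simplejson as json

result = json.loads(response_body)
for item in result['items']:
    product = item['product']
    print(product['brand'])   # e.g. Merrell
    print(product['title'])   # e.g. Merrell Men's "Trail Glove" Barefoot Running Shoes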