JSONDecodeError: Extra data: line 1 column 5 (char 4) - json

I'm trying to call an API with multiple zip codes and store the JSON responses in a DataFrame, but I get an error when I pass a list with more than one zip code:
import requests
import pandas as pd

ceps = ['69027320', '38411206', '78118187', '12245481']
jsonfile = []
for cep in ceps:
    url = "https://www.cepaberto.com/api/v3/cep?cep="+str(cep)
    headers = {'Authorization': 'Token token=111111111111111111111111111111'}
    response = requests.request("GET", url.format(cep=ceps), headers=headers)
    jsonfile.append(response.json())
df = pd.json_normalize(jsonfile)
Error output:
JSONDecodeError: Extra data: line 1 column 5 (char 4)
I suspect the error happens because I try to parse several objects without wrapping them in an array, but I can't think of a solution to make it work.
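As a hedged illustration of where this exact message can come from: Python's json decoder reports "Extra data" when the text starts with a valid JSON value and then continues. The sketch below assumes (the question does not confirm this) that one of the requests was answered with a plain-text body such as "429 Too Many Requests"; the body string is invented for the example.

```python
import json

# Hypothetical non-JSON body; the real API response is not shown in the question.
body = "429 Too Many Requests"

try:
    json.loads(body)
    error = None
except json.JSONDecodeError as exc:
    error = exc

# "429" is parsed as a complete JSON number, so the text after it is "extra data".
print(error)  # Extra data: line 1 column 5 (char 4)
```

Printing response.status_code and response.text for each zip code before calling response.json() would show whether this is what happens here.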

Related

raise JSONDecodeError("Expecting value", s, err.value) from None json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

def get_historical_candles(self, symbol, interval):
    data = dict()
    data['symbol'] = symbol
    data['interval'] = interval
    data['limit'] = 1000
    response = requests.get('https://api.binance.com/api/V3/kline', data)
    raw_candles = response.json()
    candles = []
    if raw_candles is not None:
        for c in raw_candles:
            candles.append([c[0], float(c[1]), float(c[2]), float(c[3]), float(c[4]), float(c[5])])
    print(candles)
How can I avoid the error mentioned in the title?
I have tried
Fetch Candlestick/Kline data from Binance API using Python (preferably requests) to get JSON Dat
and it works well.
However, I need to find a solution for this approach:
response = requests.get('https://api.binance.com/api/V3/kline', data)
Please read the Binance API documentation carefully: https://binance-docs.github.io/apidocs/spot/en/#kline-candlestick-data
To debug this error: it obviously comes from raw_candles = response.json(). Therefore, before deserializing the response content, it is a good idea to check the status code and body via response.status_code and response.text.
TL;DR:
An s is missing. The path should be api/v3/klines
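The debugging advice above can be sketched as a small helper that looks at the status code and text before decoding. This is only a sketch: decode_json_response and FakeResponse are hypothetical names, FakeResponse stands in for requests.Response so the example runs offline, and the 404 body is invented rather than taken from Binance.

```python
import json


def decode_json_response(response):
    """Return the decoded JSON body, or raise an error showing what the server sent."""
    if response.status_code != 200:
        raise RuntimeError(f"HTTP {response.status_code}: {response.text[:200]}")
    try:
        return response.json()
    except ValueError as exc:  # JSON decode errors are ValueError subclasses
        raise RuntimeError(f"Body is not JSON: {response.text[:200]}") from exc


class FakeResponse:
    """Hypothetical stand-in for requests.Response (status_code, text, json())."""

    def __init__(self, status_code, text):
        self.status_code = status_code
        self.text = text

    def json(self):
        return json.loads(self.text)


# A successful response decodes normally.
ok = decode_json_response(FakeResponse(200, "[[1, 2.0]]"))

# A failed response raises an error that includes the status and raw body.
try:
    decode_json_response(FakeResponse(404, "<html>Not Found</html>"))
    message = None
except RuntimeError as exc:
    message = str(exc)

print(ok)
print(message)
```

With the real library, the same helper could be called on the object returned by requests.get, since it only relies on status_code, text and json().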

How to Convert a data set from a database query to a specific JSON format for input to a REST api

I am fairly new to python and would like some assistance with a problem.
I have a SQL select query that returns a column with values. I would like to pass the records in a header request to a REST API call. The issue is that the API call expects the data in a specific JSON format.
How can I convert the data returned by the query to the specific JSON format shown below?
Query from SQL returns:
InfoId
------
1
2
3
4
5
6
I need to pass these values to a REST API as JSON in the following format:
{
    "InfoId": [
        1, 2, 3, 4, 5, 6
    ]
}
I have tried a couple of options to solve this problem.
I have tried converting the data to JSON using the pandas DataFrame.to_json method with the various orient parameters, but none of them returns the desired format shown above.
import requests
import json
import pyodbc
import pandas as pd
conn = pyodbc.connect('Driver={SQL SERVER};'
                      'Server=myServer;'
                      'Database=TestDb;'
                      'Trusted_Connection=yes;'
                      )
cursor = conn.cursor()
sql_query = pd.read_sql_query('SELECT InfoId FROM tbl_info', conn)
#print(sql_query)
print(sql_query.to_json(orient='values', index=False))
url = "http://ldapiserver:5000/patterns/v1/swirl?idType=Info"
#sample payload
#payload = "{\r\n \"InfoId\": [\r\n 1,2,3,4,5,6\r\n ]\r\n}"
payload = sql_query.to_json(orient='records')
headers = {
    'Content-Type': 'application/json'
}
response = requests.request("POST", url, headers=headers, data=json.dumps(payload, indent=4))
resp_body = response.json()
print(resp_body)
print(response.elapsed.total_seconds())
The second method I have tried is to convert the rows from the SQL query into a list object and then form the JSON string. It works that way, but I would like to automate it so that it can form the JSON string irrespective of the query.
import requests
import json
import pyodbc
conn = pyodbc.connect('Driver={SQL SERVER};'
                      'Server=myServer;'
                      'Database=TestDb;'
                      'Trusted_Connection=yes;'
                      )
cursor = conn.cursor()
cursor.execute("""
    SELECT InfoId FROM tbl_info
""")
rows = cursor.fetchall()
# Convert query to row arrays
rowarray_list = []
for row in rows:
    t = (row.InfoId)
    rowarray_list.append(t)
j = json.dumps(rowarray_list)
conn.close()
txt = '{"InfoId": ', j, '}'
# print(txt)
payload = txt[0]+txt[1]+txt[2]
url = "http://ldapiserver:5000/patterns/v1/swirl?idType=Info"
# payload = "{\r\n \"InfoId\": [\r\n 72,74\r\n ]\r\n}"
#print(json .dumps(payload, indent=4))
headers = {
    'Content-Type': 'application/json'
}
response = requests.request("POST", url, headers=headers, data=payload)
resp_body = response.json()
print(resp_body)
print(response.elapsed.total_seconds())
Appreciate any help with this.
Thank you.
To convert your SQL query to JSON,
.
.
.
rows = cursor.fetchall()
# convert each row to a dict keyed by column name
json_rows = [dict(zip([key[0] for key in cursor.description], row)) for row in rows]
Then you can return your response as you like.
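To get the column-oriented shape the question asks for ({"InfoId": [1, 2, ...]}) regardless of the query, one possible sketch groups the values by column name instead of by row. The helper name rows_to_column_json is hypothetical, and the rows below are plain tuples standing in for what cursor.fetchall() returns; with a real cursor, the column names would come from [key[0] for key in cursor.description] as in the answer above.

```python
import json


def rows_to_column_json(column_names, rows):
    """Build a JSON object mapping each column name to the list of its values."""
    columns = {name: [] for name in column_names}
    for row in rows:
        for name, value in zip(column_names, row):
            columns[name].append(value)
    return json.dumps(columns)


# Stand-in data shaped like the query result in the question.
column_names = ["InfoId"]
rows = [(1,), (2,), (3,), (4,), (5,), (6,)]

payload = rows_to_column_json(column_names, rows)
print(payload)  # {"InfoId": [1, 2, 3, 4, 5, 6]}
```

The resulting string is already JSON, so it would be passed as data=payload without a second json.dumps call.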

Iterate through a JSON file using Python

I am trying to loop through a simple JSON file (see link) and calculate the sum of all integers in the file.
When iterating through the file I receive the following error:
TypeError: string indices must be integers
Could you please help?
Code below:
import urllib.request, urllib.parse, urllib.error
import json
total=0
#url = input('Enter URL: ')
url = 'http://py4e-data.dr-chuck.net/comments_42.json'
uh=urllib.request.urlopen(url)
data = uh.read().decode()
print('Retrieved', len(data), 'characters')
print(data)
info = json.loads(data)
print('User count:', len(info)) #it displays "User count: 2" why?
for item in info:
    num=item["comments"][0]["count"]
    total=total+num
print (total)
The json file starts with a note. Your for-loop reads the keys of a dictionary, so the first item is 'note' (a string), which can only be subscripted with an integer, hence the error message.
You probably want to loop over info["comments"] which is the list with all dictionaries containing 'name' and 'count':
for item in info["comments"]:
    num=item["count"]
    total=total+num
print (total)
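The explanation above can be reproduced offline with a small made-up document that has the same top-level shape (a "note" string and a "comments" list). The names and counts below are invented for illustration and are not the contents of the real file.

```python
import json

# Made-up sample with the same top-level keys the answer describes.
data = '{"note": "sample data", "comments": [{"name": "A", "count": 97}, {"name": "B", "count": 3}]}'
info = json.loads(data)

# Iterating over a dict yields its keys, which is why len(info) is 2.
keys = list(info)

# Sum the counts from the list stored under "comments".
total = sum(item["count"] for item in info["comments"])

print(keys)
print(total)
```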

Using the results of multiple for loops to post a single json response

Okay, so this is a loaded question and I'm sure there's an easy method to use here, but I'm stuck.
Long story short, I am tasked with creating a function in Python (to be run in an AWS Lambda) which can perform acceptance tests on a series of URLs using python-requests. These requests will be used to assert the HTTP response codes and a custom HTTP header identifying whether an haproxy backend is correct.
The URLs themselves will be maintained in a YAML document which will be converted to a dict in Python and passed to a for loop which will use python-requests to HTTP GET the response code and header of each URL.
The issue I am having is getting a single body object to return the results of multiple for loops.
I have tried to find similar use cases but cannot.
import requests
import json
import yaml
def acc_tests():
    with open("test.yaml", 'r') as stream:
        testurls = yaml.safe_load(stream)
    results = {}
    # endpoint/path 1
    for url in testurls["health endpoints"]:
        r = requests.get(url, params="none")
        stat = r.status_code
        result = json.dumps(print(url, stat))
        results = json.dumps(result)
    # endpoint path with headers
    for url in testurls["xtvapi"]:
        headers = {'H': 'xtvapi.cloudtv.comcast.net'}
        r = requests.get(url, headers=headers, params="none")
        stat = r.status_code
        head = r.headers["X-FINITY-TANGO-BACKEND"]
        result = json.dumps((url, stat, head))
        results = json.dumps(result)
    return {
        'statusCode': 200,
        'body': json.dumps(results)
    }
acc_tests()
YAML file:
health endpoints:
- https://xfinityapi-tango-production-aws-us-east-1-active.r53.aae.comcast.net/tango-health/
- https://xfinityapi-tango-production-aws-us-east-1-active.r53.aae.comcast.net/
- https://xfinityapi-tango-production-aws-us-east-2-active.r53.aae.comcast.net/tango-health/
- https://xfinityapi-tango-production-aws-us-east-2-active.r53.aae.comcast.net/
- https://xfinityapi-tango-production-aws-us-west-2-active.r53.aae.comcast.net/tango-health/
- https://xfinityapi-tango-production-aws-us-west-2-active.r53.aae.comcast.net/
xtvapi:
- https://xfinityapi-tango-production-aws-us-east-1-active.r53.aae.comcast.net/
- https://xfinityapi-tango-production-aws-us-east-2-active.r53.aae.comcast.net/
- https://xfinityapi-tango-production-aws-us-west-2-active.r53.aae.comcast.net/
What I think is happening is that both for loops run one after another but the value of results ends up empty, and I'm not sure how to update or append to the results dict with the output of each loop.
Thanks folks. I ended up solving this by creating a dict with immutable keys for each test type and then using append to add the results to a nested list within the dict.
Here is the "working" code as it is in the AWS Lambda function:
from botocore.vendored import requests
import json
import yaml
def acc_tests(event, context):
    with open("test.yaml", 'r') as stream:
        testurls = yaml.safe_load(stream)
    results = {'tango-health': [], 'xtvapi': []}
    # Tango Health
    for url in testurls["health endpoints"]:
        r = requests.get(url, params="none")
        result = url, r.status_code
        assert r.status_code == 200
        results["tango-health"].append(result)
    # xtvapi default/cloudtv
    for url in testurls["xtvapi"]:
        headers = {'H': 'xtvapi.cloudtv.comcast.net'}
        r = requests.get(url, headers=headers, params="none")
        result = url, r.status_code, r.headers["X-FINITY-TANGO-BACKEND"]
        assert r.status_code == 200
        assert r.headers["X-FINITY-TANGO-BACKEND"] == "tango-big"
        results["xtvapi"].append(result)
    resbody = json.dumps(results)
    return {
        'statusCode': 200,
        'body': resbody
    }
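The accumulation pattern used in this solution (one dict, one list per test type, a single json.dumps at the end) can be tried without network access. In the sketch below, run_checks, fake_fetch and the example.invalid URLs are hypothetical; fake_fetch stands in for the requests.get calls and always reports a 200 with the expected backend header.

```python
import json


def run_checks(testurls, fetch):
    """Collect per-URL results into one dict and serialise it once at the end."""
    results = {"tango-health": [], "xtvapi": []}
    for url in testurls["health endpoints"]:
        status, _headers = fetch(url, {})
        results["tango-health"].append((url, status))
    for url in testurls["xtvapi"]:
        status, headers = fetch(url, {"H": "xtvapi.cloudtv.comcast.net"})
        results["xtvapi"].append((url, status, headers.get("X-FINITY-TANGO-BACKEND")))
    return {"statusCode": 200, "body": json.dumps(results)}


def fake_fetch(url, headers):
    """Hypothetical stand-in for an HTTP GET: returns (status_code, response_headers)."""
    return 200, {"X-FINITY-TANGO-BACKEND": "tango-big"}


testurls = {
    "health endpoints": ["https://example.invalid/tango-health/"],
    "xtvapi": ["https://example.invalid/"],
}
response = run_checks(testurls, fake_fetch)
body = json.loads(response["body"])
print(body)
```

Note that the tuples appended to each list come back as JSON arrays (Python lists) after decoding.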

Serialise and deserialise pandas periodIndex series

The pandas Series.to_json() function is creating unreadable JSON when using a PeriodIndex.
The error that occurs is:
json.decoder.JSONDecodeError: Expecting ':' delimiter: line 1 column 5 (char 4)
I've tried changing the orient, but in all of these combinations of serialising and deserialising the index is lost.
idx = pd.PeriodIndex(['2019', '2020'], freq='A')
series = pd.Series([1, 2], index=idx)
json_series = series.to_json() # This is a demo - in reality I'm storing this in a database, but this code throws the same error
value = json.loads(json_series)
A link to the pandas to_json docs
A link to the python json lib docs
The reason I'm not using json.dumps is that the pandas series object is not serialisable.
Python 3.7.3 Pandas 0.24.2
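A workaround is to convert the PeriodIndex to a regular Index of strings before dumping, and convert it back to a PeriodIndex after loading. Pass the frequency explicitly when converting back, since it may not be inferable from the loaded values:
regular_idx = period_idx.astype(str)
# then dump
# after load
period_idx = pd.to_datetime(regular_idx).to_period(freq='A')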
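A minimal round-trip sketch of this workaround, applied to the series from the question, might look like the following. It rebuilds the PeriodIndex directly from the loaded strings rather than going through pd.to_datetime, and it uses the "Y" alias for the annual frequency because "A" is deprecated in newer pandas releases; both choices are assumptions of this sketch, not part of the original answer.

```python
import json

import pandas as pd

idx = pd.PeriodIndex(["2019", "2020"], freq="Y")
series = pd.Series([1, 2], index=idx)

# Before dumping: swap the PeriodIndex for plain string labels.
plain = series.copy()
plain.index = series.index.astype(str)
json_series = plain.to_json()

# After loading: rebuild the series and restore the PeriodIndex explicitly.
loaded = json.loads(json_series)
restored = pd.Series(loaded)
restored.index = pd.PeriodIndex(restored.index, freq="Y")

print(restored)
```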