I have a web-service call (HTTP GET) in my Python script that returns a JSON response. The response appears to be a list of dictionaries. The script's purpose is to iterate through each dictionary, extract each piece of metadata (e.g. "ClosePrice": "57.74") and write each dictionary to its own row in MSSQL.
The issue is that I don't think Python recognizes the JSON output from the API call as a list of dictionaries; when I try a for loop, I get the error "must be int not str". I have tried converting the output to a list, a dictionary, and a tuple. I've also tried a list comprehension, with no luck. Further, if I copy and paste the data from the API call and assign it to a variable, Python recognizes that it's a list of dictionaries without issue. Any help would be appreciated. I'm using Python 2.7.
Here is the actual http call being made: http://test.kingegi.com/Api/QuerySystem/GetvalidatedForecasts?user=kingegi&market=us&startdate=08/19/13&enddate=09/12/13
Here is an abbreviated JSON output from the API call:
[
{
"Id": "521d992cb031e30afcb45c6c",
"User": "kingegi",
"Symbol": "psx",
"Company": "phillips 66",
"MarketCap": "34.89B",
"MCapCategory": "large",
"Sector": "basic materials",
"Movement": "up",
"TimeOfDay": "close",
"PredictionDate": "2013-08-29T00:00:00Z",
"Percentage": ".2-.9%",
"Latency": 37.48089483333333,
"PickPosition": 2,
"CurrentPrice": "57.10",
"ClosePrice": "57.74",
"HighPrice": null,
"LowPrice": null,
"Correct": "FALSE",
"GainedPercentage": 0,
"TimeStamp": "2013-08-28T02:31:08 778",
"ResponseMsg": "",
"Exchange": "NYSE "
},
{
"Id": "521d992db031e30afcb45c71",
"User": "kingegi",
"Symbol": "psx",
"Company": "phillips 66",
"MarketCap": "34.89B",
"MCapCategory": "large",
"Sector": "basic materials",
"Movement": "down",
"TimeOfDay": "close",
"PredictionDate": "2013-08-29T00:00:00Z",
"Percentage": "16-30%",
"Latency": 37.4807215,
"PickPosition": 1,
"CurrentPrice": "57.10",
"ClosePrice": "57.74",
"HighPrice": null,
"LowPrice": null,
"Correct": "FALSE",
"GainedPercentage": 0,
"TimeStamp": "2013-08-28T02:31:09 402",
"ResponseMsg": "",
"Exchange": "NYSE "
}
]
Small Part of code being used:
import os,sys
import subprocess
import glob
from os import path
import urllib2
import json
import time
try:
    data = urllib2.urlopen('http://api.kingegi.com/Api/QuerySystem/GetvalidatedForecasts?user=kingegi&market=us&startdate=08/10/13&enddate=09/12/13').read()
except urllib2.HTTPError, e:
    print "HTTP error: %d" % e.code
except urllib2.URLError, e:
    print "Network error: %s" % e.reason.args[1]
list_id=[x['Id'] for x in data] #test to see if it extracts the ID from each Dict
print(data) #Json output
print(len(data)) #should retrieve the number of dict in list
UPDATE
Answered my own question, here is the method below:
import json
import urllib

url = 'some url that is a list of dictionaries'  # GET call
u = urllib.urlopen(url)  # u is a file-like object (Python 2)
data = u.read()  # the raw response body, still a string
newdata = json.loads(data)  # parse the string into Python objects
print(type(newdata))  # printed data type will show as a list
print(len(newdata))  # the length of the list
newdict = newdata[0]  # each element in the list is a dict
print(type(newdict))  # this element is a dict
for var in newdata:  # visit every dict, starting with the first one
    print(var['Correct'], var['User'])
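For readers on Python 3, the same idea can be sketched roughly as follows. This is only an illustration, not the original method: `rows_from_json` and the `FIELDS` tuple are hypothetical names, the sample string stands in for the body returned by the HTTP call, and the resulting tuples would still have to be handed to whatever database driver is in use.

```python
import json

# Hypothetical selection of keys to store; adjust to the columns you need.
FIELDS = ("Id", "Symbol", "ClosePrice", "Correct")


def rows_from_json(text, fields=FIELDS):
    """Parse a JSON array of objects and return one tuple per object."""
    records = json.loads(text)  # str -> list of dicts
    # dict.get returns None for a missing key instead of raising KeyError
    return [tuple(record.get(name) for name in fields) for record in records]


# Stand-in for the response body (two shortened records from the question).
sample = """[
  {"Id": "521d992cb031e30afcb45c6c", "Symbol": "psx", "ClosePrice": "57.74", "Correct": "FALSE"},
  {"Id": "521d992db031e30afcb45c71", "Symbol": "psx", "ClosePrice": "57.74", "Correct": "FALSE"}
]"""

for row in rows_from_json(sample):
    print(row)
```

Each tuple has the shape a parameterized INSERT statement typically expects, one tuple per row.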
I am trying to convert JSON data to CSV. I am getting the values, but each block shows up as a single line in the result. I am new to Python, so any help is appreciated. I have tried the code below.
import pandas as pd
with open(r'C:\Users\anath\hard.json', encoding='utf-8') as inputfile:
    df = pd.read_json(inputfile)
df.to_csv(r'C:\Users\anath\csvfile.csv', encoding='utf-8', index=True)
Sample Json in the source file, short snippet
{
"issues": [
{
"issueId": 110052,
"revision": 84,
"definitionId": "DNS1012",
"subject": "urn:h:domain:fitestdea.com",
"subjectDomain": "fitestdea.com",
"title": "Nameserver name doesn\u0027t resolve to an IPv6 address",
"category": "DNS",
"severity": "low",
"cause": "urn:h:domain:ns1.gname.net",
"causeDomain": "ns1.gname.net",
"open": true,
"status": "active",
"auto": true,
"autoOpen": true,
"createdOn": "2022-09-01T02:29:09.681451Z",
"lastUpdated": "2022-11-23T02:26:28.785601Z",
"lastChecked": "2022-11-23T02:26:28.785601Z",
"lastConfirmed": "2022-11-23T02:26:28.785601Z",
"details": "{}"
},
{
"issueId": 77881,
"revision": 106,
"definitionId": "DNS2001",
"subject": "urn:h:domain:origin-mx.stagetest.test.com.test.com",
"subjectDomain": "origin-mx.stagetest.test.com.test.com",
"title": "Dangling domain alias (CNAME)",
"category": "DNS",
"severity": "high",
"cause": "urn:h:domain:origin-www.stagetest.test.com.test.com",
"causeDomain": "origin-www.stagetest.test.com.test.com",
"open": true,
"status": "active",
"auto": true,
"autoOpen": true,
"createdOn": "2022-08-10T09:34:36.929071Z",
"lastUpdated": "2022-11-23T09:33:32.553663Z",
"lastChecked": "2022-11-23T09:33:32.553663Z",
"lastConfirmed": "2022-11-23T09:33:32.553663Z",
"details": "{\"#type\": \"hardenize/com.hardenize.schemas.dns.DanglingProblem\", \"rrType\": \"CNAME\", \"rrDomain\": \"origin-mx.stagetest.test.com.test.com\", \"causeDomain\": \"origin-www.stagetest.test.com.test.com\", \"danglingType\": \"nxdomain\", \"rrEffectiveDomain\": \"origin-mx.stagetest.test.com.test.com\"}"
}
]
}
In the output I am getting, the entire record ends up in one cell. I was looking for a way to get the field names in the header and the values in columns. Is there a way to get only specific fields, such as title, severity, or issueId, rather than everything?
Try:
import json
import pandas as pd
with open("your_file.json", "r") as f_in:
    data = json.load(f_in)
df = pd.DataFrame(data["issues"])
print(df[["title", "severity", "issueId"]])
Prints:
                                                title  severity  issueId
0  Nameserver name doesn't resolve to an IPv6 address       low   110052
1                       Dangling domain alias (CNAME)      high    77881
To save as CSV you can do:
df[["title", "severity", "issueId"]].to_csv('data.csv', index=False)
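Try this: parse the file with json.load first, then pass the "issues" list to pd.json_normalize:
df = pd.json_normalize(json.load(inputfile)["issues"])
in place of the pd.read_json line (this also needs import json).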
Finally, this worked for me. Thanks @Andrej Kesely for the input. Sharing it as it might help others.
import pandas as pd
import json
with open(r'C:\Users\anath\hard.json', encoding='utf-8') as inputfile:
    data = json.load(inputfile)
df = pd.DataFrame(data["issues"])
print(df[["title", "severity", "issueId"]])
df[["title", "severity", "issueId"]].to_csv('data.csv', index=False)
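If pandas is not otherwise needed, the same selection of fields can be sketched with only the standard library. This is an alternative illustration, not what the answers above use: `issues_to_csv` and `WANTED` are hypothetical names, the sample record is shortened from the question, and an in-memory buffer stands in for the real output file.

```python
import csv
import io
import json

WANTED = ["title", "severity", "issueId"]  # columns to keep


def issues_to_csv(data, out_file, columns=WANTED):
    """Write one CSV row per entry in data["issues"], keeping only `columns`."""
    # extrasaction="ignore" silently drops keys that are not listed in columns
    writer = csv.DictWriter(out_file, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(data["issues"])


sample = json.loads(
    '{"issues": [{"issueId": 110052, "title": "Nameserver name", '
    '"severity": "low", "category": "DNS"}]}'
)
buffer = io.StringIO()  # stands in for open("data.csv", "w", newline="")
issues_to_csv(sample, buffer)
print(buffer.getvalue())
```

The header row comes from the column list, so the field names end up in the first line and every issue gets its own row.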
I have a simple python (version 3.10.2) script that uses the requests library to make a REST call to an API. The call returns a list of objects. I find that the json.loads() function will not parse the JSON returned in the response. It gives me the following error:
TypeError: the JSON object must be str, bytes or bytearray, not list
Oddly, the json.dumps() function can successfully format the same data.
Here is the code:
import requests
import json
def get_groups(url):
    # TODO SSL/TLS turned OFF (verify=False)
    response = requests.get(url + "/groups", verify=False)
    print("status code:", response.status_code)
    print("JSON:\n")
    print(json.dumps(response.json(), indent=2))
    json.loads(response.json())
Here is an example what json.dumps() is outputting:
[
{
"id": 6,
"web_url": "https://<URL redacted>/groups/test",
"name": "test",
"path": "test",
"description": "",
"visibility": "public",
"share_with_group_lock": false,
"require_two_factor_authentication": false,
"two_factor_grace_period": 48,
"project_creation_level": "developer",
"auto_devops_enabled": null,
"subgroup_creation_level": "maintainer",
"emails_disabled": null,
"mentions_disabled": null,
"lfs_enabled": true,
"default_branch_protection": 2,
"avatar_url": null,
"request_access_enabled": true,
"full_name": "test",
"full_path": "test",
"created_at": "2021-08-03T15:41:34.523Z",
"parent_id": null,
"ldap_cn": null,
"ldap_access": null
}
]
I have seen lots of postings about this and every one mentions using json.loads() to parse the JSON data. Not sure why it works for them, but it doesn't work for me.
Any ideas as to what is wrong?
As was pointed out by @tkausl, in this case response.json() on the HTTP/REST response object returns the JSON data already parsed. For some reason I missed that. I don't need to use the json library.
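The difference can be reproduced without any network call. The sketch below is only an illustration: `raw_body` is a made-up stand-in for the text the server sends, and `json.loads(raw_body)` plays the role that `response.json()` plays in the question.

```python
import json

raw_body = '[{"id": 6, "name": "test"}]'  # what the server sends: text

# text -> list of dicts (roughly what response.json() already does for you)
parsed = json.loads(raw_body)

try:
    json.loads(parsed)  # parsing a second time fails: the input is a list, not text
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)

# json.dumps goes the other way (objects -> text), which is why it works here
pretty = json.dumps(parsed, indent=2)
print(pretty)
```

In short, `json.loads` expects text and `json.dumps` expects Python objects, so only the second one accepts the list that `response.json()` returns.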
I am coding using Python, Flask, pandas. I am reading data from a REST API.
When I get the data from the REST API, the dispenserId used to be an Integer meaning that each value started with a number different from 0.
This weekend, I received dispenserIds starting with a 0 (zero) character, so calling json.load(path_to_filenamen) does not parse the JSON file anymore due to errors.
See the sample
{
"result": {
"dispensers": [
{
"dispenserId": 00000,
"dispenserName": "1st Floor",
"dispenserType": "H2",
"status": "Green",
"locationId": 12345
},
{
"dispenserId": 98765,
"dispenserName": "2nd Floor",
"dispenserType": "S4",
"status": "Green",
"locationId": 23456
},
{
"dispenserId": 00001,
"dispenserName": "3rd Floor",
"dispenserType": "H2",
"status": "Green",
"locationId": 34567
}
]
}
}
I receive "Exception has occurred: TypeError: string indices must be integers" when I call data["result"]["dispensers"].
How can I indicate to the JSON parser that the dispenserId is a string instead of an Integer?
A few things:
1. Your dispensers collection is not closed (there is no closing square bracket), so it cannot work.
2. Since these are integers, they should not have leading zeros. You should have:
"dispenserId": 0,
or
"dispenserId": 1,
Once this is corrected, a["result"]["dispensers"] will work just fine, regardless of the values of "dispenserId".
Or:
The values should be given as strings:
"dispenserId": "00000",
and then converted to integers:
int(a["result"]["dispensers"][0]["dispenserId"])
Either way, your JSON file does not conform to the JSON format: numbers may not have leading zeros.
This piece of code should "clean" your file by stripping the unwanted leading zeros and then parsing the result as JSON:
import re
import codecs
import json

# Drop leading zeros from any number that follows a colon (e.g. ": 00001" -> ": 1").
pattern = r"(:\s*)0*(\d)"
with codecs.open(path_to_filenamen, "r", "utf-8") as f:
    myJson = json.loads(re.sub(pattern, r"\1\2", f.read()))
myJson now holds the parsed data (a Python dict), so you can use myJson["result"]["dispensers"].
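If the leading zeros matter and the IDs should stay strings, as the question asks, a variation is to wrap the bare numbers in quotes before parsing instead of stripping the zeros. This is a rough sketch under the assumption that the key is always spelled "dispenserId" and its value is always a run of digits; `UNQUOTED_ID` and `load_with_string_ids` are hypothetical names.

```python
import json
import re

# Matches an unquoted, all-digit dispenserId value; the key name is taken
# from the sample in the question.
UNQUOTED_ID = re.compile(r'("dispenserId"\s*:\s*)(\d+)')


def load_with_string_ids(text):
    """Wrap bare dispenserId numbers in quotes, then parse the result."""
    return json.loads(UNQUOTED_ID.sub(r'\1"\2"', text))


sample = '{"result": {"dispensers": [{"dispenserId": 00001, "locationId": 34567}]}}'
data = load_with_string_ids(sample)
print(data["result"]["dispensers"][0]["dispenserId"])
```

Values that are already quoted are left alone, because the pattern only matches digits that follow the colon directly.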
I am on Windows 10. I recently obtained a large JSON file (200 MB) via webscraping, and I am now trying to import the file to MongoDB using Compass Community via the import data button. However, whenever I try to import the file, I get the following error:
Unexpected token l in JSON at position 0 while parsing near 'l
Here are the first few lines of the JSON file I am trying to import:
{
"bands": [{
"activity": "Split-up",
"bandMembers": ["https://www.metal-archives.com/artists/Jon_Powlowski/760544", "https://www.metal-archives.com/artists/Ruben_Martinez/760545", "https://www.metal-archives.com/artists/Greg_Eickmier/416646", "https://www.metal-archives.com/artists/Nedwob/471955"],
"bandName": "A // Solution",
"country": "United States",
"dateAdded": "2018-08-04",
"genre": "Crust Punk/Thrash Metal",
"label": {
"labelName": "Voltic Records",
"labelUrl": "https://www.metal-archives.com/labels/Voltic_Records/47794"
},
"location": "California",
"lyricalThemes": "N/A",
"releases": [{
"numReviews": 0,
"releaseName": "Butterfly",
"reviewAverage": null,
"type": "EP",
"url": "https://www.metal-archives.com/albums/A_--_Solution/Butterfly/723154",
"year": "1989"
}, {
"numReviews": 0,
"releaseName": "Things to Come",
"reviewAverage": null,
"type": "EP",
"url": "https://www.metal-archives.com/albums/A_--_Solution/Things_to_Come/723155",
"year": "1995"
}
],
"similarArtists": null,
"url": "https://www.metal-archives.com/bands/A_--_Solution/3540442600",
"yearFormed": "N/A",
"yearsActive": "N/A"
}, {
"activity": "Active",
Does anyone have an idea on how I can fix this error?
EDIT: I ran the import again after restarting Compass and got this:
Unexpected token : in JSON at position 0 while parsing near ': null,
Is this error related at all to the other one?
The import data button needs the object to be inlined according to https://docs.mongodb.com/compass/master/import-export/#import-data-into-a-collection.
Apart from that, I had issues with "Unexpected token : in JSON at position 0", and even though I have not figured out the cause yet, creating a new .json file and copying the content into it surprisingly worked.
Also, remember to leave a line break at the end of the file.
To convert the JSON into a one-line format, you could use the following Python script:
import json
import sys
import codecs
import os
def read_file(name):
    with open(name, encoding='utf8') as f:
        return f.read()

def write_file(name, text):
    # Create the target directory only if the path actually contains one.
    directory = os.path.dirname(name)
    if directory:
        os.makedirs(directory, exist_ok=True)
    with codecs.open(name, "w", "utf-8-sig") as temp:
        temp.write(text)

text = read_file(sys.argv[1])
data = json.loads(text)
# Dump the parsed data (not the raw text) so the output is one line of JSON.
result = json.dumps(data, ensure_ascii=False) + "\n"
write_file(sys.argv[2], result)
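If the goal is one document per band rather than a single document wrapping the whole array, a possible variation is to write each element of "bands" on its own line. Whether the importer wants this layout is an assumption on my part, so check the Compass documentation linked above. `write_one_document_per_line` is a hypothetical helper, and the second band in the sample is a placeholder.

```python
import io
import json


def write_one_document_per_line(records, out_file):
    """Serialize each record as compact JSON on its own line."""
    for record in records:
        out_file.write(json.dumps(record, ensure_ascii=False) + "\n")


# Shortened stand-in for the scraped file; "B" is a placeholder entry.
data = json.loads('{"bands": [{"bandName": "A // Solution"}, {"bandName": "B"}]}')
buffer = io.StringIO()  # stands in for open(path, "w", encoding="utf-8")
write_one_document_per_line(data["bands"], buffer)
print(buffer.getvalue(), end="")
```

Every line of the output is a complete JSON object on its own, and the file ends with a line break.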
I need to implement a simple shell utility in Ruby which parses JSON from a file and return a particular field from it.
JSON examples to be parsed:
{"status": "fail", "messages": ["Out of capacity"]}
{"status": "success", "messages": [], "result": {"node": {"ip": "1.2.3.4", "description": "", "id": 974, "name": "VM#3"}}}
Idea is to create a CLI utility with two parameters: JSON file to read and field from JSON to extract:
./get_json_field.rb ~/tmp.XXXXXX 'result.node.ip'
./get_json_field.rb ~/tmp.XXXXXX 'messages.0'
I'm struggling with how to map the second parameter to the parsed JSON data structure in Ruby. I could certainly write an iterator that splits the string into an array using the dot as a separator and walks through it item by item, but this doesn't look like an elegant solution.
Any suggestions for a more elegant way?
There is nothing wrong with splitting the string and going through its parts:
require 'json'
data1 = JSON.load('{"status": "fail", "messages": ["Out of capacity"]}')
data2 = JSON.load('{"status": "success", "messages": [], "result": {"node": {"ip": "1.2.3.4", "description": "", "id": 974, "name": "VM#3"}}}')
def get_from_json(data, query)
  query.split('.').inject(data) do |memo, key|
    key = key.to_i if memo.is_a? Array
    memo.fetch(key)
  end
end
get_from_json(data1, 'messages.0') # => "Out of capacity"
get_from_json(data2, 'result.node.ip') # => "1.2.3.4"
Take a look at jq; it might already do what you are looking for.
jq '.messages[0]'
jq '.result.node.ip'
See http://stedolan.github.com/jq/