Invalid JSON format from the Facebook Graph API

I am using the Graph API to fetch ad audience information. Requesting https://graph.facebook.com/act_adaccountid/customaudiences?fields= directly works, but when I try it through a program I get an invalid JSON format.
from urllib2 import urlopen
from simplejson import loads

x = loads(urlopen('https://graph.facebook.com/act_adaccountid/customaudiences'
                  '?fields=<comma_separated_list_of_fields>'
                  '&access_token=XXXXXXXXXX').read())
output:
{'paging': {'cursors': {'after': 'NjAxMDE5ODE5NjgxMw==', 'before': 'NjAxNTAzNDkwOTAxMw=='}}, 'data': [{'account_id': 1377346239145180L, 'id': '6015034909013'}, {'account_id': 1377346239145180L, 'id': '6015034901213'}, {'account_id': 1377346239145180L, 'id': '6015034901013'}, {'account_id': 1377346239145180L, 'id': '6015034900413'}
{'data': []}
expected output:
http://pastebin.com/5265tJ8w
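One thing worth checking (my own observation, not confirmed in the post) is the query string: in the snippet above, the field list and access_token were joined with a second '?'. A minimal sketch that builds the URL with urllib so the parameters are separated and encoded correctly (the field list is a placeholder):

import urllib
from urllib2 import urlopen
from simplejson import loads

# Hypothetical field list; substitute the real comma-separated fields.
params = urllib.urlencode({
    'fields': 'id,account_id',
    'access_token': 'XXXXXXXXXX',
})
url = 'https://graph.facebook.com/act_adaccountid/customaudiences?' + params
x = loads(urlopen(url).read())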

Related

Django APIView data format from AJAX

I was trying to send an array of objects to views.py using APIView to insert multiple rows in one POST request. This is my JavaScript data format:
const data = {
  group_designation: [
    {id: 1},
    {id: 2},
    {id: 3},
  ]
}
I tested with the Insomnia app, and the endpoint only accepts this kind of format:
{
  "group_designation": [
    {"id": 1},
    {"id": 2},
  ]
}
However, if I send a POST request using the JavaScript format stated above, it gives me a Bad Request error (400). This is the payload in the network tab:
group_designation[0][id]: 1
group_designation[1][id]: 2
group_designation[2][id]: 3
In Django, this is the request.data result:
<QueryDict: {
'group_designation[0][id]': ['1'],
'group_designation[1][id]': ['2'],
'group_designation[2][id]': ['3']
}>
My code in Django:
def post(self, request):
    temp_objects = []
    new_data_format = {'group_designation': temp_objects}
    serializer = GroupSerializer(data=new_data_format, many=True)
    if serializer.is_valid(raise_exception=True):
        group_data_saved = serializer.save()
    return Response({
        "success": "success!!!"
    })
I was trying to rewrite the data format so it would save, but with no luck. Please help. Thank you!
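The payload in the network tab shows the array arrived form-encoded rather than as JSON, so request.data is a flat QueryDict. As a sketch (not from the original post, and assuming the keys always look like group_designation[i][id]), the nested list can be rebuilt on the Django side before validation; note that a DRF serializer with many=True expects a list of items, not a dict:

import re

GROUP_KEY = re.compile(r'^group_designation\[(\d+)\]\[id\]$')

def post(self, request):
    # Rebuild [{'id': 1}, {'id': 2}, ...] from the form-encoded keys.
    temp_objects = []
    for key, value in request.data.items():
        match = GROUP_KEY.match(key)
        if match:
            temp_objects.append({'id': int(value)})
    serializer = GroupSerializer(data=temp_objects, many=True)
    if serializer.is_valid(raise_exception=True):
        serializer.save()
    return Response({"success": "success!!!"})

Alternatively, sending the request with a Content-Type of application/json and a JSON-stringified body would make request.data arrive as a parsed dict, matching the format Insomnia accepts.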

How to save twitterscraper output as a JSON file

I read the documentation, but it only mentions saving the output as a .txt file. I tried to modify the code to save the output as JSON.
Save as .txt:
from twitterscraper import query_tweets

if __name__ == '__main__':
    list_of_tweets = query_tweets("Trump OR Clinton", 10)
    # print the retrieved tweets to the screen:
    for tweet in query_tweets("Trump OR Clinton", 10):
        print(tweet)
    # Or save the retrieved tweets to file:
    file = open("output.txt", "w")
    for tweet in query_tweets("Trump OR Clinton", 10):
        file.write(tweet.encode('utf-8'))
    file.close()
I tried to modify this to save as JSON:
import json

output = query_tweets("Trump OR Clinton", 10)
jsonfile = open("tweets.json", "w")
for tweet in output:
    json.dump(tweet, jsonfile)
jsonfile.close()
But this raises a type error:
TypeError: Object of type Tweet is not JSON serializable
How can I save the output as JSON? I know that typing the command in a terminal creates JSON, but I wanted to write a Python version.
We'll need to convert each tweet to a dict first, as Python class objects are not serializable as JSON. Looking at the first object we can see the available methods and attributes like this: help(list_of_tweets[0]). Accessing the __dict__ of the first object we see:
# print(list_of_tweets[0].__dict__)
{'user': 'foobar',
'fullname': 'foobar',
'id': '143846459132929',
'url': '/foobar/status/1438420459132929',
'timestamp': datetime.datetime(2011, 12, 5, 23, 59, 53),
'text': 'blah blah',
'replies': 0,
'retweets': 0,
'likes': 0,
'html': '<p class="TweetTextSize...'}
Before we can dump it to json we'll need to convert the datetime objects to strings.
tweets = [t.__dict__ for t in list_of_tweets]
for t in tweets:
    t['timestamp'] = t['timestamp'].isoformat()
Then we can use the json module to dump the data to a file.
import json

with open('data.json', 'w') as f:
    json.dump(tweets, f)
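As a variation (my own sketch, not from the original answer), json.dump's default= hook can convert the datetimes on the fly, so the dicts don't need to be mutated first:

import json
from datetime import datetime

def encode_tweet(obj):
    # Called only for objects the json module can't serialize itself.
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError("Object of type %s is not JSON serializable" % type(obj).__name__)

with open('data.json', 'w') as f:
    json.dump([t.__dict__ for t in list_of_tweets], f, default=encode_tweet)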

Skipping AttributeError while importing Twitter data into pandas

I have an almost 1 GB file storing about 0.2 million tweets, and a file that size inevitably carries some errors. The error shown is AttributeError: 'int' object has no attribute 'items', which occurs when I try to run this code.
import json

raw_data_path = input("Enter the path for raw data file: ")
tweet_data_path = raw_data_path
tweet_data = []
tweets_file = open(tweet_data_path, "r", encoding="utf-8")
for line in tweets_file:
    try:
        tweet = json.loads(line)
        tweet_data.append(tweet)
    except:
        continue

tweet_data2 = [tweet for tweet in tweet_data if isinstance(tweet, dict)]

from pandas.io.json import json_normalize
tweets = json_normalize(tweet_data2)[["text", "lang", "place.country",
                                      "created_at", "coordinates",
                                      "user.location", "id"]]
Is there a solution that skips the lines where this error occurs and continues with the rest?
The issue here is not with lines in the data but with tweet_data itself. If you check your tweet_data, you will find one or more elements of int type (json_normalize only accepts a dict or a list of dicts).
You may want to check your tweet data and remove any values that are not dictionaries.
I was able to reproduce this with the example below, based on the json_normalize documentation:
Working Example:
from pandas.io.json import json_normalize

data = [{'state': 'Florida',
         'shortname': 'FL',
         'info': {'governor': 'Rick Scott'},
         'counties': [{'name': 'Dade', 'population': 12345},
                      {'name': 'Broward', 'population': 40000},
                      {'name': 'Palm Beach', 'population': 60000}]},
        {'state': 'Ohio',
         'shortname': 'OH',
         'info': {'governor': 'John Kasich'},
         'counties': [{'name': 'Summit', 'population': 1234},
                      {'name': 'Cuyahoga', 'population': 1337}]},
        ]
json_normalize(data)
Output:
Displays a DataFrame
Reproducing Error:
from pandas.io.json import json_normalize

data = [{'state': 'Florida',
         'shortname': 'FL',
         'info': {'governor': 'Rick Scott'},
         'counties': [{'name': 'Dade', 'population': 12345},
                      {'name': 'Broward', 'population': 40000},
                      {'name': 'Palm Beach', 'population': 60000}]},
        {'state': 'Ohio',
         'shortname': 'OH',
         'info': {'governor': 'John Kasich'},
         'counties': [{'name': 'Summit', 'population': 1234},
                      {'name': 'Cuyahoga', 'population': 1337}]},
        1,  # added an integer to the list
        ]
result = json_normalize(data)
Error:
AttributeError: 'int' object has no attribute 'items'
How to prune "tweet_data" (not needed if you follow the update below): before normalizing, run:
tweet_data = [tweet for tweet in tweet_data if isinstance(tweet, dict)]
Update (for the for loop):
for line in tweets_file:
    try:
        tweet = json.loads(line)
        if isinstance(tweet, dict):
            tweet_data.append(tweet)
    except:
        continue
The final form of code looks like this:
tweet_data_path = raw_data_path
tweet_data = []
tweets_file = open(tweet_data_path, "r", encoding="utf-8")
for line in tweets_file:
    try:
        tweet = json.loads(line)
        if isinstance(tweet, dict):
            tweet_data.append(tweet)
    except:
        continue
This removes any possibility of the AttributeError interfering with the import into a pandas DataFrame.
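With the filtered list, the normalization step from the question should then run cleanly (a sketch reusing the question's column list):

from pandas.io.json import json_normalize

tweets = json_normalize(tweet_data)[["text", "lang", "place.country",
                                     "created_at", "coordinates",
                                     "user.location", "id"]]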

Unable to make dns-over-https with cloudflare and python requests

I'm trying to write a quick script that could do dns lookups using the new 1.1.1.1 DNS over HTTPS public DNS server from CloudFlare.
Looking at their docs here https://developers.cloudflare.com/1.1.1.1/dns-over-https/json-format/, I'm not sure what I'm doing wrong or why I'm getting a 415 status code (415 Unsupported Media Type).
Here is my script:
#!/usr/bin/env python
import requests
import json
from pprint import pprint
url = 'https://cloudflare-dns.com/dns-query'
client = requests.session()
json1 = {'name': 'example.com','type': 'A'}
ae = client.get(url, headers = {'Content-Type':'application/dns-json'}, json = json1)
print ae.raise_for_status()
print ae.status_code
print ae.json()
client.close()
Here is the output:
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 415 Client Error: Unsupported Media Type for url: https://cloudflare-dns.com/dns-query
and for the json response (expected I believe):
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
Using curl this works perfectly fine.
Many thanks
You should not send a JSON request body at all; only the response uses JSON.
Put the application/dns-json value in a ct parameter. From the Cloudflare docs:
JSON formatted queries are sent using a GET request. When making requests using GET, the DNS query is encoded into the URL. An additional URL parameter of ‘ct’ should indicate the MIME type (application/dns-json).
A GET request never has a body, so don't try to send JSON:
params = {
    'name': 'example.com',
    'type': 'A',
    'ct': 'application/dns-json',
}
ae = client.get(url, params=params)
Demo:
>>> import requests
>>> url = 'https://cloudflare-dns.com/dns-query'
>>> client = requests.session()
>>> params = {
... 'name': 'example.com',
... 'type': 'A',
... 'ct': 'application/dns-json',
... }
>>> ae = client.get(url, params=params)
>>> ae.status_code
200
>>> from pprint import pprint
>>> pprint(ae.json())
{'AD': True,
'Answer': [{'TTL': 2560,
'data': '93.184.216.34',
'name': 'example.com.',
'type': 1}],
'CD': False,
'Question': [{'name': 'example.com.', 'type': 1}],
'RA': True,
'RD': True,
'Status': 0,
'TC': False}

Validating trello board API responses in Python unittest

I am writing a unittest that queries the trello board API and want to assert that a particular card exists.
The first attempt was using the /1/boards/[board_id]/lists request, which gives results like:
[{'cards': [
{'id': 'id1', 'name': 'item1'},
{'id': 'id2', 'name': 'item2'},
{'id': 'id3', 'name': 'item3'},
{'id': 'id4', 'name': 'item4'},
{'id': 'id5', 'name': 'item5'},
{'id': 'id6', 'name': 'item6'}],
'id': 'id7',
'name': 'ABC'},
{'cards': [], 'id': 'id8', 'name': 'DEF'},
{'cards': [], 'id': 'id9', 'name': 'GHI'}]
I want to assert that 'item6' is indeed in the above-mentioned list, by loading the JSON and using assertTrue, like this:
element = [item for item in json_data if item['name'] == "item6"]
self.assertTrue(element)
but I receive an error: TypeError: the JSON object must be str, bytes or bytearray, not 'list'.
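That TypeError is raised by json.loads() when it is handed an already-parsed object; the client library had evidently decoded the response into a list already. A minimal illustration (my own, not from the post):

import json

raw = '[{"id": "id1", "name": "item1"}]'
parsed = json.loads(raw)   # str -> list: fine
json.loads(parsed)         # TypeError: the JSON object must be str, bytes or bytearray, not list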
Then I discovered that using the /1/boards/[board_id]/cards request gives a plain list of cards:
[
{'id': 'id1', 'name': 'item1'},
{'id': 'id2', 'name': 'item2'},
...
]
How should I write this unittest assertion?
The neatest option is to create a class that compares equal to the dict for the card you want to ensure is there, then use that in an assertion. For your example, with a list of cards returned over the api:
cards = board.get_cards()
self.assertIn(Card(name="item6"), cards)
Here's a reasonable implementation of the Card() helper class; it may look a little complex but is mostly straightforward:
class Card(object):
    """Matches a dict with card details from a json api response."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if isinstance(other, dict):
            return other.get("name", None) == self.name
        return NotImplemented

    def __repr__(self):
        return "{}(name={!r})".format(self.__class__.__name__, self.name)
You could add more fields to validate as needed.
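For example (my own extension of the answer's sketch, not from the original), an optional id check:

class Card(object):
    def __init__(self, name, id=None):
        self.name = name
        self.id = id

    def __eq__(self, other):
        if isinstance(other, dict):
            # Only compare id when one was given.
            if self.id is not None and other.get("id") != self.id:
                return False
            return other.get("name", None) == self.name
        return NotImplemented

self.assertIn(Card(name="item6", id="id6"), cards)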
One question worth touching on at this point is whether the unit test should be making real api queries. Generally a unit test would have test data to just focus on the function you control, but perhaps this is really an integration test for your trello deployment using the unittest module?
import unittest
from urllib.request import urlopen
import json

class Basic(unittest.TestCase):
    url = 'https://api.trello.com/1/boards/[my_id]/cards?fields=id,name,idList,url&key=[my_key]&token=[my_token]'
    response = urlopen(url)
    resp = response.read()
    json_ob = json.loads(resp)
    el_list = [item for item in json_ob if item['name'] == 'card6']

    def testBasic(self):
        self.assertTrue(self.el_list)

if __name__ == '__main__':
    unittest.main()
So, what I did wrong: I focused too much on the list itself, which I got from the following code:
import requests
from pprint import pprint
import json

url = "https://api.trello.com/1/boards/[my_id]/lists"
params = {"cards": "open", "card_fields": "name", "fields": "name",
          "key": "[my_key]", "token": "[my_token]"}
response = requests.get(url=url, params=params)
pprint(response.json())