Pull Data from TMX Using Python 3.6.8 - json

About two months ago I asked a question about pulling data from the CME in JSON format, and with your help I was able to pull the appropriate data.
I want to remind everyone that I am still pretty new to Python, so please bear with me if my question is relatively straightforward.
I am now trying to pull data in JSON format again, but from a different website, and things do not appear to be cooperating. In particular I am trying to pull the following data:
https://api.tmxmoney.com/marketactivity/candeal?ts=1567086212742
This is what I have tried.
import pandas as pd
import json
import requests
cadGovt = 'https://api.tmxmoney.com/marketactivity/candeal?ts=1567086212742'
sample_data = requests.get(cadGovt)
sample_data.encoding = 'utf-8'
test = sample_data.json()
print(test)
I would like to get the information as JSON (it's literally just a table with term, description, bid yield, ask yield, change, bid price, ask price, change).
Instead I am getting 'JSONDecodeError: Expecting value: line 1 column 1 (char 0)'.
If anyone has any guidance or advice that would be greatly appreciated.

That's because the page you're getting back is not JSON but an HTML page. So when you call
test = sample_data.json()
you're trying to parse HTML as JSON, which won't work. You can scrape the data off the page instead, though. Here's an example using bs4 you can try; it's a bit rough around the edges, but it should work.
import requests as r
from bs4 import BeautifulSoup

url = 'https://api.tmxmoney.com/marketactivity/candeal?ts=1567086212742'
response = r.get(url)
soup = BeautifulSoup(response.text, 'lxml')

for tr in soup.find_all('tr'):
    print(tr.text + "\n")
You can get at the individual TDs like this:
for tr in soup.find_all('tr'):
    tds = tr.find_all('td')
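To turn those rows into something tabular, you can collect the cell text per row into a list of lists. Here's a minimal sketch; the inline HTML snippet is a made-up stand-in for the real page (the actual column set may differ), and it uses the stdlib "html.parser" so it runs without lxml:

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the real TMX page.
html = """<table>
<tr><th>Term</th><th>Bid Yield</th></tr>
<tr><td>2Y</td><td>1.50</td></tr>
</table>"""

soup = BeautifulSoup(html, "html.parser")

# One list per <tr>, one string per <th>/<td> cell.
rows = []
for tr in soup.find_all("tr"):
    cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
    rows.append(cells)
```

From there, `rows[0]` is the header and the rest are data rows, which is easy to feed into a DataFrame or csv.writer.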

Related

JSON dataset features not being loaded properly?

I have this code:
import pandas as pd
import json
file = "/Users/mickelborg/Desktop/Dataset/2018/Carbon_Minoxide_(CO)_2018.json"
with open(file, 'r') as j:
    contents = json.loads(j.read())

oxide = pd.DataFrame.from_dict(contents, orient='index')
oxide
I'm trying to get a readout of the JSON dataset by the features/columns, but they don't seem to load properly.
Currently this is the output that I have:
LINK
As can be seen from the image, the data loads incorrectly. Each "county_code" should have its own row in the dataset, along with all the other features.
What am I doing wrong in this regard?
Thanks a lot for your help!
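For what it's worth, the likely culprit is the orient argument: with orient='index', pandas treats each top-level key of the dict as a row label rather than a column. A minimal sketch with a made-up dict (the real EPA file's layout is an assumption here):

```python
import pandas as pd

# Hypothetical column-oriented dict standing in for the real JSON file.
contents = {
    "county_code": ["001", "003"],
    "co_level": [0.4, 0.6],
}

# orient='index' makes each top-level key a ROW label...
by_index = pd.DataFrame.from_dict(contents, orient='index')

# ...while orient='columns' (the default) makes each key a COLUMN.
by_columns = pd.DataFrame.from_dict(contents, orient='columns')
```

If the features show up sideways, switching the orient (or transposing with `.T`) is usually the fix.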

How to extract all values from one key type of a json file?

I'm trying to learn how to do this (I can barely code); I'm not trying to get you, the wonderful and generous reader of my post, to do my job for me. Full solutions are welcome, of course, but my goal is to figure out the HOW so I can do this myself.
Project - Summary
Extract just the attachment file URLs from a massive JSON file (I believe the proper term is "parsing JSON").
Project - Wordy Explanation
I'm trying to get all the attachments from a .json file that is an export of the entire Trello board I have. It has a specific key field for these attachments at the end of a json tree like below:
TrelloBoard.json
> cards
>> 0
>>> attachments
>>>> 0
>>>>> url "https://trello-attachments.s3.amazonaws.com/###/####/#####/AttachedFile.pdf"
(The first 0 goes up to 300+, one entry per Trello card; the second 0 has never gone above 0, since it represents the number of attachments per card.)
I've looked up tutorials on how to parse JSON files, but I haven't been able to get anything to print out (or write) from those attempts. Since I have over 100 attachments per month to download, code is clearly the best way to do it, but I'm completely stumped on how, and am asking you, dear reader, to help point me in the right direction.
Code Attempt
Any programming language is fine (I'm new enough not to be attached to any), but I've tried the following in Python (among other attempts) with no luck at the command prompt.
import json

with open('G:\~WORK~\~Codes~\trello.json') as f:
    data = json.load(f)

# Output: {'cards': '0', 'attachments': '0', 'url': ['https://trello-attachments.s3.amazonaws.com']}
print(data)
The keys are nested, so a single lookup like data['url'] won't reach them. Walk the structure you described with ordinary dict/list indexing:
import json

with open('G:\~WORK~\~Codes~\trello.json') as f:
    data = json.load(f)

urls = [attachment['url']
        for card in data['cards']
        for attachment in card['attachments']]
print(urls)

Json Parsing from API With Dicts

I am writing a piece of code to retrieve certain information from the League of Legends api.
I have everything working fine and printing to my console, and I have even managed to access the data and print only the information I need. The only issue is that there are 299 values I would like printed, and I can only manage to print one at a time. That would obviously be the worst way to sort through it, since it would take forever to write out. I have spent over 3 days researching and watching videos with no success so far.
Below is the code I currently have (minus imports).
url = ('https://na1.api.riotgames.com/lol/league/v4/challengerleagues/by-queue/RANKED_SOLO_5x5?api_key=RGAPI-b5187110-2f16-48b4-8b0c-938ae5bddccb')
r = requests.get(url)
response_dict = r.json()
print(response_dict['entries'][0]['summonerName'])
print(response_dict['entries'][1]['summonerName'])
When I attempt to index entries with a slice like [0:299], I get the following error: list indices must be integers or slices, not str.
I would simply convert the list of dictionaries within entries into a DataFrame. You then have all the info nicely organised and can access specific items easily, including your summonerName column.
import requests
from bs4 import BeautifulSoup as bs
import json
import pandas as pd
#url = yourURL
res = requests.get(url, headers = {'user-agent' : 'Mozilla/5.0'})
soup = bs(res.content, 'lxml')
data = json.loads(soup.select_one('p').text)
df = pd.DataFrame(data['entries'])
print(df)
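The DataFrame step works because pandas accepts a list of dicts directly, mapping each dict key to a column. A self-contained sketch with made-up entries standing in for the real API response:

```python
import pandas as pd

# Hypothetical stand-in for response_dict['entries'] (assumption:
# each entry is a dict with a 'summonerName' key, as in the question).
entries = [
    {"summonerName": "PlayerA", "leaguePoints": 1200},
    {"summonerName": "PlayerB", "leaguePoints": 1100},
]

df = pd.DataFrame(entries)          # one row per dict, one column per key
names = df["summonerName"].tolist() # whole column at once, no loop needed
```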
You can loop over the index; that'll print them all out:
for i in range(300):
    print(response_dict['entries'][i]['summonerName'])
When you use response_dict['entries'][M:N], you create a new list of dictionaries, and each dictionary has to be extracted before you can reference ['summonerName'] on it directly.
If you print(response_dict['entries'][0:3]), you'll see what I mean.
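Put differently: a slice gives you a list of dicts, so you iterate over the slice and index each dict individually. A sketch with a synthetic response (the real API payload is assumed to have this shape):

```python
# Synthetic stand-in for the API response described above (assumption).
response_dict = {
    "entries": [
        {"summonerName": "PlayerA"},
        {"summonerName": "PlayerB"},
        {"summonerName": "PlayerC"},
    ]
}

# The slice is a list of dicts; index each element, not the slice itself.
names = [entry["summonerName"] for entry in response_dict["entries"][0:3]]
```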

Read data from ThingSpeak in Python using the urllib library

I wanted to read current data from the ThingSpeak website, so I used the urllib library to get the data via the channel's read URL. I used this Python code:
import urllib
from bs4 import BeautifulSoup

data = urllib.urlopen("https://api.thingspeak.com/channels/my_channel_no/feeds.json?results=2")
print data.read()

select = repr(data.read())
print select

sel = select[20:]
print sel
From the query URL https://api.thingspeak.com/channels/my_channel_no/feeds.json?results=2 I get this result:
'{"channel":{"id":my_channel_no,"name":"voltage","latitude":"0.0","longitude":"0.0","field1":"Field Label 1","field2":"Field Label 2","created_at":"2018-04-05T16:33:14Z","updated_at":"2018-04-09T15:39:43Z","last_entry_id":108},"feeds":[{"created_at":"2018-04-09T15:38:42Z","entry_id":107,"field1":"20.00","field2":"40.00"},{"created_at":"2018-04-09T15:39:44Z","entry_id":108,"field1":"20.00","field2":"40.00"}]}'
But when this line was executed:
select = repr(data.read())
the result was
"''"
and for
sel = select[20:]
the output was
''
(The [20:] slice is there to trim the front of the query result.)
Can anyone give me a sense of direction on what's happening, and a solution? Selecting the field values is the goal.
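The likely cause is that an HTTP response object is a stream: the first data.read() consumes the whole body, so the second call returns an empty string. The fix is to read once, keep the bytes, and parse those. A minimal sketch of the behaviour, using io.BytesIO to simulate the response (the payload here is made up):

```python
import io
import json

# A file-like HTTP response can only be read once; BytesIO simulates that.
body = io.BytesIO(b'{"channel": {"id": 1}, "feeds": []}')

first = body.read()   # the full payload
second = body.read()  # empty -- the stream is already exhausted

# Parse the saved bytes instead of slicing the repr() string.
data = json.loads(first)
```

Parsing with the json module also makes the [20:] slicing unnecessary; you can index fields directly, e.g. data["feeds"][0]["field1"].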

How to take results from json input and put into a csv output in python

I am trying to work out how to take results from Python regarding the sentiment polarity of tweets (original input from a JSON file) and turn them into a CSV I can export for use in R. I'm using Python 2.7.
I have tried a couple of approaches from similar Stack Overflow questions, but no success so far.
For example, using the pandas package:
tweet_polarity = []
for tweet in tweet_text:
    polarity = analyser.polarity_scores(tweet[1])
    tweet_polarity.append([tweet[0], tweet[1], polarity['compound'],
                           polarity['neg'], polarity['neu'], polarity['pos']])

import pandas
df = pandas.DataFrame(data={"tweet_polarity": tweet_polarity, "tweet_text": tweet_text,
                            "tweets": tweets})
df.to_csv("polarityRES.csv")
This creates a CSV file, but it seems to just repeat the same tweet over and over rather than producing a nice dataframe with the polarity scores.
I thought about using csv.writer, but haven't been able to find an example relevant to what I'm trying to do. Any suggestions, gang?
(Sorry for my terrible explanation; I'm still getting to grips with the basics while trying to do this, and typing one-handed with tendonitis!)
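One possible cause of the repetition: each element of tweet_polarity already holds a complete row, so passing it alongside tweet_text and tweets in a dict makes pandas nest whole lists into single cells. Building the frame from the row list alone, with explicit column names, avoids that. A sketch with hypothetical rows in the shape the loop above appends:

```python
import pandas as pd

# Hypothetical rows: [tweet_id, tweet_text, compound, neg, neu, pos]
# (made-up values standing in for the real analyser output).
tweet_polarity = [
    ["id1", "great day", 0.6, 0.0, 0.4, 0.6],
    ["id2", "awful traffic", -0.5, 0.7, 0.3, 0.0],
]

df = pd.DataFrame(tweet_polarity,
                  columns=["id", "text", "compound", "neg", "neu", "pos"])
df.to_csv("polarityRES.csv", index=False)
```

index=False keeps the row numbers out of the CSV, which makes the file cleaner to read back into R.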