How to convert Fantasy Premier League Data from JSON to CSV?

I am new to Python, and as part of my thesis work I am trying to convert JSON to CSV. I am able to download the data as JSON, but when I write it back out using dictionaries it does not produce a CSV with every column.
import pandas as pd
import statsmodels.formula.api as smf
import statsmodels.api as sm
import matplotlib.pyplot as plt
import numpy as np
import requests
from pprint import pprint
import csv
from time import sleep

s1 = 'https://fantasy.premierleague.com/drf/element-summary/'
print s1
players = []
for player_link in range(1, 450, 1):
    link = s1 + str(player_link)
    print link
    r = requests.get(link)
    print r
    player = r.json()
    players.append(player)
    sleep(1)

with open('C:\Users\dell\Downloads\players_new2.csv', 'w') as f:  # Just use 'w' mode in 3.x
    w = csv.DictWriter(f, player.keys())
    w.writeheader()
    for player in players:
        w.writerow(player)
I have uploaded the expected output (dec_15_expected.csv) and the program output as "player_new_wrong_output.csv":
https://drive.google.com/drive/folders/0BwKYmRU_0K6tZUljd3Q0aG1LT0U?usp=sharing
It would be a great help if someone could tell me what I am doing wrong.

Converting JSON to CSV is simple with pandas. Try this:
import pandas as pd
df=pd.read_json("input.json")
df.to_csv('output.csv')
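One thing to check in the original script: csv.DictWriter is given player.keys(), i.e. only the keys of the last player downloaded, so any field missing from that particular player never becomes a column. A minimal sketch of a fix, assuming each item in players is (or has been flattened to) a flat dict of scalar values:

import csv

# Assumes `players` is the list of r.json() dicts collected in the question's loop.
# Build the header from the union of keys across all players, not just the last one.
fieldnames = sorted({key for player in players for key in player.keys()})

with open('players_new2.csv', 'w') as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames, restval='')  # blank out missing fields
    writer.writeheader()
    for player in players:
        writer.writerow(player)

If the element-summary payloads still contain nested lists or dicts, it is usually easier to flatten each one first (for example with pandas' json_normalize) so that every nested field becomes its own column before writing.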

Related

I get a JSON error while trying to run a script for yFinance

import pandas as pd
import numpy as np
import datetime
import yfinance as yf
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

# To use interactive plotting we can also use cufflinks
import cufflinks as cf

# To enable offline mode
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
init_notebook_mode(connected=True)
cf.go_offline()

# The function downloads daily market data to a pandas DataFrame
def download_daily_data(ticker, start, end):
    data = yf.download(ticker, start, end)
    return data
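The question doesn't include the actual traceback, but JSON decode errors from yfinance are often down to an outdated library version hitting a changed Yahoo endpoint, so upgrading yfinance is worth trying first. Beyond that, a quick sanity check that the download helper itself works might look like this (the ticker and dates are just placeholders):

# Hypothetical usage of the helper above; the ticker and dates are placeholders.
prices = download_daily_data('AAPL', '2020-01-01', '2021-01-01')
print(prices.head())        # first rows of OHLC/Volume data
print(type(prices.index))   # expected to be a DatetimeIndex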

I am having trouble converting my nested JSON into a dataframe. I am getting the JSON from an API and want it in a dataframe.

This code uses the Sportradar API. The API outputs the data as JSON or XML; below is my attempt at taking the JSON and turning it into a dataframe.
import numpy as np
import pandas as pd
import http.client
import json
from pandas.io.json import json_normalize
#API Call including my key
conn = http.client.HTTPSConnection("api.sportradar.us")
conn.request("GET", "/nfl/official/trial/v5/en/players/0acdcd3b-5442-4311-a139-ae7c506faf88/profile.json?api_key=99s3ewmn5rrdrd9r3v5wrfgd")
#conn.request("GET", "/nfl/official/trial/v5/en/games/b7aeb58f-7987-4202-bc41-3ad9a5b83fa4/pbp.json?api_key=99s3ewmn5rrdrd9r3v5wrfgd")
#conn.request("GET", "/nfl/official/trial/v5/en/teams/0d855753-ea21-4953-89f9-0e20aff9eb73/full_roster.json?api_key=99s3ewmn5rrdrd9r3v5wrfgd")
#conn.request("GET", "/nfl/official/trial/v5/en/games/030d37cf-b896-4f10-b16e-2a5120fef6cf/pbp.json?api_key=99s3ewmn5rrdrd9r3v5wrfgd")
res = conn.getresponse()
data = res.read()
data_dec = data.decode("utf-8")
json_data = json.loads(data_dec)
flat_data = json_normalize(json_data)
print(json_data)
df = pd.DataFrame.from_records(flat_data)
df2 = pd.DataFrame.from_dict(json_data, orient='index')
df2.reset_index(level=0, inplace=True)
#The closest thing to a dataframe I can get
df.head()
Why not make use of a Python wrapper that is publicly available and maintained?
See link.
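If you would rather stay with json_normalize as in the question, the usual trick for nested JSON is to pass record_path and meta so that an inner list becomes rows instead of being squashed into a single wide record. A minimal, self-contained sketch; the field names below are invented for illustration and are not the real Sportradar schema:

import pandas as pd

# Invented example payload; the real profile JSON will have different keys.
json_data = {
    "id": "0acdcd3b-5442-4311-a139-ae7c506faf88",
    "name": "Example Player",
    "seasons": [
        {"year": 2018, "team": "AAA", "games_played": 15},
        {"year": 2019, "team": "BBB", "games_played": 16},
    ],
}

# One row per entry in the nested "seasons" list, with the player-level
# fields repeated on every row via `meta`.
df = pd.json_normalize(json_data, record_path="seasons", meta=["id", "name"])
print(df)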

How to fix "FileNotFoundError" when using proper code and file extension CSV?

I am trying to open a file with the extension .csv in Python; however, it keeps saying that the file is not found. I am copying the path from the sidebar, so I don't believe that's the problem.
I have tried inserting / and ./ before the path of the file, and r in front of the file name.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
bkgrnd = pd.read_csv('/Desktop/Sro/Natrium22.csv')
No matter what I've tried, it keeps saying FileNotFoundError
You can use the csv module if the file will always be a .csv:
import csv

with open(r'C:\Users\user\Desktop\Sro\Natrium22.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    line_count = 0
    for row in csv_reader:
        print(row)
        line_count += 1
Specifically on Windows, the pathname may need normalization; maybe that's the issue.
Try the following:
import os
import pandas as pd
cwd = os.getcwd()
filePath = 'C:/Users/user/Desktop/Sro/Natrium22.csv'
data = pd.read_csv(os.path.normcase(os.path.join(cwd, filePath)))
print(data)
You can also try:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
bkgrnd = pd.read_csv(r'C:\Users\user\Desktop\Sro\Natrium22.csv')
print(bkgrnd)
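One more debugging step, not covered in the answers above: check whether the path you built actually exists before handing it to read_csv, so you know whether the problem is the path or the read. A small sketch using pathlib; the path is the one from the question and may need adjusting:

from pathlib import Path
import pandas as pd

path = Path(r'C:\Users\user\Desktop\Sro\Natrium22.csv')
print(path.exists())   # False means the path itself is wrong, not pandas

if path.exists():
    bkgrnd = pd.read_csv(path)
    print(bkgrnd.head())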

Export JSON to CSV using Python

I wrote some code to extract information from a website. The output is in JSON, and I want to export it to CSV, so I tried to convert it to a pandas dataframe and then export that to CSV. I can print the results, but it still doesn't convert them to a pandas dataframe. Do you know what the problem with my code is?
# -*- coding: utf-8 -*-
# To create http request/session
import requests
import re, urllib
import pandas as pd
from BeautifulSoup import BeautifulSoup
url = "https://www.indeed.com/jobs?
q=construction%20manager&l=Houston&start=10"
# create session
s = requests.session()
html = s.get(url).text
# exctract job IDs
job_ids = ','.join(re.findall(r"jobKeysWithInfo\['(.+?)'\]", html))
ajax_url = 'https://www.indeed.com/rpc/jobdescs?jks=' + urllib.quote(job_ids)
# do Ajax request and convert the response to json
ajax_content = s.get(ajax_url).json()
print(ajax_content)
#Convert to pandas dataframe
df = pd.read_json(ajax_content)
#Export to CSV
df.to_csv("c:\\users\\Name\desktop\\newcsv.csv")
The error message is:
Traceback (most recent call last):
File "C:\Users\Mehrdad\Desktop\Indeed 06.py", line 21, in
df = pd.read_json(ajax_content)
File "c:\python27\lib\site-packages\pandas\io\json\json.py", line 408, in read_json
path_or_buf, encoding=encoding, compression=compression,
File "c:\python27\lib\site-packages\pandas\io\common.py", line 218, in get_filepath_or_buffer
raise ValueError(msg.format(_type=type(filepath_or_buffer)))
ValueError: Invalid file path or buffer object type:
The problem was that nothing was going into the dataframe when you called read_json(): read_json() expects a JSON string or a file path, while ajax_content is already a nested dict, hence the ValueError:
import requests
import re, urllib
import pandas as pd
from pandas.io.json import json_normalize
url = "https://www.indeed.com/jobs?q=construction%20manager&l=Houston&start=10"
s = requests.session()
html = s.get(url).text
job_ids = ','.join(re.findall(r"jobKeysWithInfo\['(.+?)'\]", html))
ajax_url = 'https://www.indeed.com/rpc/jobdescs?jks=' + urllib.quote(job_ids)
ajax_content= s.get(ajax_url).json()
df = json_normalize(ajax_content).transpose()
df.to_csv('your_output_file.csv')
Note that I called json_normalize() to collapse the nested columns from the JSON. I also called transpose() so that the rows were labelled with the job ID rather than columns. This will give you a dataframe that looks like this:
0079ccae458b4dcf <p><b>Company Environment: </b></p><p>Planet F...
0c1ab61fe31a5c62 <p><b>Commercial Construction Project Manager<...
0feac44386ddcf99 <div><div>Trendmaker Homes is currently seekin...
...
It's not really clear what your expected output is, though: what are you expecting the DataFrame/CSV file to look like? If you were actually looking for just a single row/Series with the job IDs as column labels, simply remove the call to transpose().
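If you do keep the transposed layout, it may also be worth giving the single column and the index readable names before exporting, so the CSV header is self-explanatory; the names below are just suggestions, applied to the df from the snippet above:

# Optional: label the transposed frame before exporting (names are suggestions).
df.columns = ['description']
df.index.name = 'job_id'
df.to_csv('your_output_file.csv')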

Can't load dataset into ipython. UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcd in position 1: invalid continuation byte

I'm fairly new to using IPython, so I still get confused quite easily. Here is my code so far. After loading, I have to display only the first 5 rows of the file.
# Import useful packages for data science
from IPython.display import display, HTML
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
# Load concerts.csv
path1 = 'C:\\Users\\Cathal\\Documents\\concerts.csv'
concerts = pd.read_csv(path1)
Thanks in advance for any help.
Try:
concerts = pd.read_csv(path1, encoding='utf8')
If that doesn't work, try:
concerts = pd.read_csv(path1, encoding="ISO-8859-1")
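If neither encoding works, a pragmatic fallback is to loop over a few likely candidates until read_csv succeeds. A small sketch, with the candidate list a guess rather than anything specific to this file:

import pandas as pd

path1 = 'C:\\Users\\Cathal\\Documents\\concerts.csv'

# Try a few common encodings until one decodes the file cleanly.
concerts = None
for enc in ('utf-8', 'cp1252', 'ISO-8859-1'):
    try:
        concerts = pd.read_csv(path1, encoding=enc)
        print('Loaded with encoding:', enc)
        break
    except UnicodeDecodeError:
        continue

if concerts is not None:
    print(concerts.head())   # first 5 rows, as the assignment asks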