I run an iPython Notebook server, and would like users to be able to download a pandas dataframe as a csv file so that they can use it in their own environment. There's no personal data, so if the solution involves writing the file at the server (which I can do) and then downloading that file, I'd be happy with that.
How about using the FileLinks class from IPython? I use this to provide access to data directly from Jupyter notebooks. Assuming your data is in pandas dataframe p_df:
from IPython.display import FileLink, FileLinks
p_df.to_csv('/path/to/data.csv', index=False)
p_df.to_excel('/path/to/data.xlsx', index=False)
FileLinks('/path/to/')
Run this as a notebook cell and the result will be a list of links to files downloadable directly from the notebook. '/path/to' needs to be accessible for the notebook user of course.
For tables that are not too large, you can use the following code:
import base64
import pandas as pd
from IPython.display import HTML
def create_download_link(df, title="Download CSV file", filename="data.csv"):
    csv = df.to_csv()
    b64 = base64.b64encode(csv.encode())
    payload = b64.decode()
    html = '<a download="{filename}" href="data:text/csv;base64,{payload}" target="_blank">{title}</a>'
    html = html.format(payload=payload, title=title, filename=filename)
    return HTML(html)
df = pd.DataFrame(data = [[1,2],[3,4]], columns=['Col 1', 'Col 2'])
create_download_link(df)
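The data-URI technique above does not depend on pandas itself. As a minimal sketch using only the standard library, and assuming the table is available as a list of rows, the same kind of link could be built like this (csv_data_uri_link and rows are hypothetical names, not part of the answer above):

```python
import base64
import csv
import html
import io


def csv_data_uri_link(rows, title="Download CSV file", filename="data.csv"):
    """Return an HTML anchor whose href embeds the CSV as a base64 data URI."""
    buffer = io.StringIO()
    # Serialize the rows to CSV text in memory; nothing is written to disk
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    payload = base64.b64encode(buffer.getvalue().encode("utf-8")).decode("ascii")
    # Escape the title and filename so they cannot break out of the markup
    return '<a download="{}" href="data:text/csv;base64,{}">{}</a>'.format(
        html.escape(filename, quote=True), payload, html.escape(title)
    )


link = csv_data_uri_link([["Col 1", "Col 2"], [1, 2], [3, 4]])
```

In a notebook the returned string would still need to be wrapped in IPython.display.HTML to be rendered. Because the whole file travels inside the page, this is only sensible for small tables.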
If you want to avoid storing CSVs on the server, you can use this JavaScript alternative that creates the CSV on the client side:
from IPython.display import Javascript
js_download = """
var csv = '%s';
var filename = 'results.csv';
var blob = new Blob([csv], { type: 'text/csv;charset=utf-8;' });
if (navigator.msSaveBlob) { // IE 10+
    navigator.msSaveBlob(blob, filename);
} else {
    var link = document.createElement("a");
    if (link.download !== undefined) { // feature detection
        // Browsers that support HTML5 download attribute
        var url = URL.createObjectURL(blob);
        link.setAttribute("href", url);
        link.setAttribute("download", filename);
        link.style.visibility = 'hidden';
        document.body.appendChild(link);
        link.click();
        document.body.removeChild(link);
    }
}
""" % data_in_dataframes.to_csv(index=False).replace('\n','\\n').replace("'","\\'")
Javascript(js_download)
Basically, it creates a CSV string in Python from the pandas dataframe and uses it in a small JS script that creates a CSV file on the client side and opens a save dialog so the user can save it on their computer. I tested it in my IPython environment and it works like a charm!
Note that I am escaping the \n. If I don't do so, the js script string will have the CSV variable written on multiple lines.
For example, print("var csv = '%s'" % industries_revenues.to_csv(index=False).replace('\n','\\n')) results in this:
var csv = 'Industry,sum_Amount\nBanking,65892584.0\n(...)Finance,20211917.0\n'
Without the \n escaping, print("var csv = '%s'" % industries_revenues.to_csv(index=False)) instead produces a multi-line, and therefore invalid, JavaScript string:
var csv = 'Industry,sum_Amount
Banking,65892584.0
(...)
Finance,20211917.0
'
I also escape the ' so that it does not break the string variable in JavaScript.
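As an alternative to escaping characters by hand, one option is to let json.dumps build the string literal, since it escapes newlines, quotes and backslashes in one step. This is a sketch rather than part of the answer above; js_csv_assignment and the sample CSV text are made up for illustration:

```python
import json


def js_csv_assignment(csv_text):
    """Return a JavaScript statement assigning csv_text to the variable `csv`."""
    # json.dumps emits a double-quoted literal with newlines, double quotes
    # and backslashes escaped, which JavaScript also accepts as a string.
    return "var csv = {};".format(json.dumps(csv_text))


statement = js_csv_assignment("Industry,sum_Amount\nO'Neil Banking,65892584.0\n")
print(statement)
```

The resulting statement stays on a single line, and the apostrophe needs no special handling because the literal is delimited by double quotes.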
A function that creates a csv download link, based on Coen Jonker's answer and similar to Yasin Zähringer's answer except that it uses IPython.display.FileLink so that there is no need to create html code.
The function has an optional delete prompt so you can delete the file after download to keep the notebook server clean.
# Import a module to create a data frame
import pandas
# Import a module to display a link to the file
from IPython.display import FileLink, display
# Import a module to delete the file
import os
# Create a download function
def csv_download_link(df, csv_file_name, delete_prompt=True):
    """Display a download link to load a data frame as csv within a Jupyter notebook
    Parameters
    ----------
    df : pandas data frame
    csv_file_name : str
    delete_prompt : bool
    """
    df.to_csv(csv_file_name, index=False)
    display(FileLink(csv_file_name))
    if delete_prompt:
        a = input('Press enter to delete the file after you have downloaded it.')
        os.remove(csv_file_name)
# Create an example data frame
df = pandas.DataFrame({'x':[1,2,3],'y':['a','b','c']})
# Use the function to display a download link
csv_download_link(df, 'file_name.csv')
This is mostly for people who use jupyter notebooks on their own machine. On a shared machine, the use of os.remove might be problematic depending on how you set up file write permissions.
You can use the fact that the notebook can display html for objects, and data urls, to make the content of a csv downloadable:
import urllib.parse
class CSV(object):
    def _repr_html_(self):
        html = []
        html.append("{},{},{}".format("user", "age", "city"))
        html.append("{},{},{}".format("Alice", "39", "New York"))
        html.append("{},{},{}".format("Bob", "30", "Denver"))
        html.append("{},{},{}".format("Carol", "27", "Tulsa"))
        export = '\n'.join(html)
        # Percent-encode the CSV text so it can be embedded in a data URL
        # (urllib.quote in Python 2 became urllib.parse.quote in Python 3)
        export = urllib.parse.quote(export)
        csvData = 'data:application/csv;charset=utf-8,' + export
        return "<a download='export.csv' href='{}' target='_blank'>csv file</a>".format(csvData)
CSV()
The simple method that I found was:
df.to_csv('~/Desktop/file_name.csv')
My simple approach to getting all the files out of the Jupyter notebook server is to use this command:
!tar cvfz my_compressed_file_name.tar.gz *
This bundles all the files in the current directory of the server, including the notebooks, into a single archive that you can then download.
If your server has multiple folders, you may want to use the following command instead. Write ../ before the * for every level you want to go up in the directory tree.
!tar cvfz zipname.tar.gz ../../*
Hope it helps..
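If the tar command is not available on the server, a similar archive can be produced from Python with the standard tarfile module. The following is a minimal sketch; archive_directory and the throwaway directory used in the demonstration are hypothetical, and the archive is written outside the source directory so that it does not include itself:

```python
import os
import tarfile
import tempfile


def archive_directory(source_dir, archive_path):
    """Write everything under source_dir into a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname="." stores paths relative to source_dir, not absolute paths
        tar.add(source_dir, arcname=".")
    return archive_path


# Demonstration on a throwaway directory containing a single file
work_dir = tempfile.mkdtemp()
with open(os.path.join(work_dir, "notes.txt"), "w") as handle:
    handle.write("hello")
archive = archive_directory(work_dir, os.path.join(tempfile.mkdtemp(), "backup.tar.gz"))
with tarfile.open(archive) as tar:
    names = tar.getnames()
```

In a notebook you would pass the directory you want to export as source_dir and then download the resulting .tar.gz file.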
Related
I am trying to make a GUI with an upload button; when I click it and choose a text file, that file should be saved in a MySQL database.
I am unable to save the text file into the database. I have tried many things without success, so please help. I am new to tkinter and am learning it for the first time.
I have tried the code below, but it does not do what I need, so any other ideas would be welcome.
from tkinter import *
from tkinter import filedialog
root = Tk()
import os
def UploadAction(event=None):
    filename = filedialog.askopenfilename()
    pathlabel.config(text=filename)
    file = open(filename)
    file_data = file.read()
    file.close()
    file_name = os.path.basename(filename)
    file1=open(file_name,'w+')
    file1.writelines(file_data)
button = Button(root, text='Upload', command=UploadAction)
button.pack()
pathlabel = Label(root)
pathlabel.pack()
root.mainloop()
This code shows the path, but I don't know how to save the file into the database.
My database name is: fileupload
My table name is: configfile
For some reason this code says it has downloaded my picture, but nothing shows up in the directory. I thought it might be because i.redd.it files can't be accessed where I live, so I used a proxy, but this did not fix the problem.
This is my code:
import json
import urllib.request
proxy = urllib.request.ProxyHandler({'http': '176.221.34.7'})
opener = urllib.request.build_opener(proxy)
urllib.request.install_opener(opener)
with open('/Users/eucar/Documents/Crawler/Crawler/Crawler/image_links.json') as images:
    images = json.load(images)
for idx, image_url in enumerate(images):
    try :
        image_url = image_url.strip()
        file_name = '/Users/eucar/Desktop/Instagrammemes/{}.{}'.format(idx, image_url.strip().split('.')[-1])
        print('About to download {} to file {}'.format(image_url, file_name))
        urllib.request.urlretrieve(image_url, file_name)
    except :
        print("All done")
This is the json file:
["https://i.redd.it/9q7r48kd2dh21.jpg",
"https://i.redd.it/yix3rq5t5dh21.jpg",
"https://i.redd.it/1vm3bd2vvch21.jpg",
"https://i.redd.it/wy7uszuigch21.jpg",
"https://i.redd.it/4gunzkkghch21.jpg",
"https://i.redd.it/4sd2hbe5sch21.jpg", "https://i.redd.it/bv3qior3ybh21.jpg"]
I know this question has been answered before, but I seem to have a different problem. Up until a few days ago, my querying of YouTube never had a problem. Now, however, every time I query data on any video the rows of actual video data come back as a single empty array.
Here is my code in full:
# -*- coding: utf-8 -*-
import os
import google.oauth2.credentials
import google_auth_oauthlib.flow
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError
from google_auth_oauthlib.flow import InstalledAppFlow
import pandas as pd
import csv
SCOPES = ['https://www.googleapis.com/auth/yt-analytics.readonly']
API_SERVICE_NAME = 'youtubeAnalytics'
API_VERSION = 'v2'
CLIENT_SECRETS_FILE = 'CLIENT_SECRET_FILE.json'
def get_service():
    flow = InstalledAppFlow.from_client_secrets_file(CLIENT_SECRETS_FILE, SCOPES)
    credentials = flow.run_console()
    #builds api-specific service
    return build(API_SERVICE_NAME, API_VERSION, credentials = credentials)
def execute_api_request(client_library_function, **kwargs):
    response = client_library_function(
        **kwargs
    ).execute()
    print(response)
    columnHeaders = []
    # create a CSV output for video list
    csvFile = open('video_result.csv','w')
    csvWriter = csv.writer(csvFile)
    csvWriter.writerow(["views","comments","likes","estimatedMinutesWatched","averageViewDuration"])
if __name__ == '__main__':
    # Disable OAuthlib's HTTPs verification when running locally.
    # *DO NOT* leave this option enabled when running in production.
    os.environ['OAUTHLIB_INSECURE_TRANSPORT'] = '1'
    youtubeAnalytics = get_service()
    execute_api_request(
        youtubeAnalytics.reports().query,
        ids='channel==UCU_N4jDOub9J8splDAPiMWA',
        #needs to be of form YYYY-MM-DD
        startDate='2018-01-01',
        endDate='2018-05-01',
        metrics='views,comments,likes,dislikes,estimatedMinutesWatched,averageViewDuration',
        dimensions='day',
        filters='video==ZeY6BKqIZGk,YKFWUX9w4eY,bDPdrWS-YUc'
    )
You can see in the Reports: Query front page that you need to use the new scope:
https://www.googleapis.com/auth/youtube.readonly
instead of the old one:
https://www.googleapis.com/auth/yt-analytics.readonly
After changing the scope, perform a re-authentication (delete the old credentials) for the new scope to take effect.
This is also confirmed in this forum.
One possible mishap is choosing the wrong account or accounts during OAuth2 authorisation. For instance, you may have to pick your main account on the first screen, but then on the second screen (during authorisation) pick the brand account rather than the main account from the first step, even though the main account also appears in the list on the second screen.
I had the same problem, and replacing the scope with https://www.googleapis.com/auth/youtube.readonly didn't work.
(Even making requests in the API webpage, it returns empty rows.)
Instead, using the https://www.googleapis.com/auth/youtube scope works fine in my case.
I'm trying to search a webpage (http://www.phillyhistory.org/historicstreets/). I think the relevant source HTML is this:
<input name="txtStreetName" type="text" id="txtStreetName">
You can see the rest of the source HTML at the website. I want to go to that text box, enter a street name, and download the output (i.e. enter 'Jefferson' in the search box of the page and see historic street names with Jefferson). I have tried using requests.post, and tried typing ?get=Jefferson in the URL to test if that works, with no luck. Does anyone have any ideas on how to get this page? Thanks,
Cameron
code that I currently tried (some imports are unused as I plan to parse etc):
import requests
from bs4 import BeautifulSoup
import csv
from string import ascii_lowercase
import codecs
import os.path
import time
arrayofstreets = []
arrayofstreets = ['Jefferson']
for each in arrayofstreets:
    url = 'http://www.phillyhistory.org/historicstreets/default.aspx'
    payload = {'txtStreetName': each}
    r = requests.post(url, data=payload).content
    outfile = "raw/" + each + ".html"
    with open(outfile, "w") as code:
        code.write(r)
    time.sleep(2)
This did not work and only gave me the default webpage (i.e. Jefferson was not entered in the search bar and no results were retrieved).
I'm guessing your reference to 'requests.post' relates to the requests module for python.
As you have not specified what you want to scrape from the search results I will simply give you a snippet to get the html for a given search query:
import requests
query = 'Jefferson'
url = 'http://www.phillyhistory.org/historicstreets/default.aspx'
post_data = {'txtStreetName': query}
html_result = requests.post(url, data=post_data).content
print(html_result)
If you need to further process the html file to extract some data, I suggest you use the Beautiful Soup module to do so.
UPDATED VERSION:
#!/usr/bin/python
import requests
from bs4 import BeautifulSoup
import csv
from string import ascii_lowercase
import codecs
import os.path
import time
def get_post_data(html_soup, query):
    # ASP.NET pages expect these hidden form fields to be posted back
    view_state = html_soup.find('input', {'name': '__VIEWSTATE'})['value']
    event_validation = html_soup.find('input', {'name': '__EVENTVALIDATION'})['value']
    textbox1 = ''
    btn_search = 'Find'
    return {'__VIEWSTATE': view_state,
            '__EVENTVALIDATION': event_validation,
            'Textbox1': '',
            'txtStreetName': query,
            'btnSearch': btn_search
            }
arrayofstreets = ['Jefferson']
url = 'http://www.phillyhistory.org/historicstreets/default.aspx'
html = requests.get(url).content
for each in arrayofstreets:
    payload = get_post_data(BeautifulSoup(html, 'lxml'), each)
    r = requests.post(url, data=payload).content
    outfile = "raw/" + each + ".html"
    # .content is bytes, so open the file in binary mode
    with open(outfile, "wb") as code:
        code.write(r)
    time.sleep(2)
The problem in my/your first version was that we weren't posting all the required parameters. To find out what you need to send, open the network monitor in your browser (Ctrl+Shift+Q in Firefox) and make that search as you would normally. If you select the POST request in the network log, on the right you should see a 'Parameters' tab listing the POST parameters your browser sent.
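The idea of re-sending every hidden form field along with the visible ones can also be sketched without BeautifulSoup, using the standard library's html.parser. HiddenFieldParser, build_payload and the sample HTML (including its field values) are invented for illustration; the real page's values have to be fetched from the site first:

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        input_type = (attributes.get("type") or "").lower()
        if tag == "input" and input_type == "hidden":
            name = attributes.get("name")
            if name:
                self.fields[name] = attributes.get("value") or ""


def build_payload(page_html, visible_fields):
    """Merge the page's hidden fields with the fields the user would fill in."""
    parser = HiddenFieldParser()
    parser.feed(page_html)
    payload = dict(parser.fields)
    payload.update(visible_fields)
    return payload


# Made-up stand-in for the form markup returned by the initial GET request
sample_html = """
<form>
  <input type="hidden" name="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__EVENTVALIDATION" value="xyz789" />
  <input name="txtStreetName" type="text" id="txtStreetName">
</form>
"""
payload = build_payload(sample_html, {"txtStreetName": "Jefferson", "btnSearch": "Find"})
```

The resulting dictionary could then be passed as the data argument of requests.post, in the same way as the payload in the updated version above.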
I have a JSON string that I am reading from a web form that I would like to create a temporary file out of and allow the file to be downloaded to the local client machine. In other words my app.route reads the string, writes the string to a file and then sends the file to the client:
@app.route('/sendFile', methods=['POST'])
def sendFile():
    content = str(request.form['jsonval'])
    with open('zones.geojson', 'w') as f:
        f.write(content)
    return send_file(f)
What's the best way to make this work?
From this answer all that is needed is to specify the correct Response header:
from flask import Response
@app.route('/sendFile', methods=['POST'])
def sendFile():
    content = str(request.form['jsonval'])
    # Send the string back as an attachment instead of writing a file first
    return Response(content,
                    mimetype='application/json',
                    headers={'Content-Disposition':'attachment;filename=zones.geojson'})