Auto renaming a download?

I have a webpage with a bunch of download links, with a name before each link. The format of the webpage looks like this:
File One's name
direct download link here
File Two's name
direct download link here
and so on.
Is it possible to write a program that mass-downloads all the links on the page, naming each file after the name above its respective link? What can I use to write something that can do this?

I would recommend writing a scraper in Python using the Scrapy framework. With it you can extract all of the links from the page and pass them to Scrapy's media pipeline, which lets you download and save the files under custom names (for example, the name above each link).
It's a lot to take in if you haven't programmed or used Python or Scrapy before, but there are loads of tutorials out there to help you.
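For a concrete starting point, here is a minimal sketch of that approach. The page URL, the spider and pipeline names, and the assumption that the visible name is the link's own text are all illustrative; adjust the selectors to your page's actual markup.
import scrapy
from scrapy.crawler import CrawlerProcess
from scrapy.pipelines.files import FilesPipeline

class NamedFilesPipeline(FilesPipeline):
    # save each file under the scraped name instead of the default SHA1-hash path
    def file_path(self, request, response=None, info=None, *, item=None):
        return item['name']

class DownloadSpider(scrapy.Spider):
    name = 'downloads'
    start_urls = ['http://example.com/downloads']  # assumed page URL
    custom_settings = {
        # the '__main__' path works because this file is run directly as a script
        'ITEM_PIPELINES': {'__main__.NamedFilesPipeline': 1},
        'FILES_STORE': 'downloads',  # directory the files are saved under
    }

    def parse(self, response):
        # assumes the name is the link text; adjust if it sits in a separate element
        for link in response.css('a'):
            yield {
                'name': link.css('::text').get(),
                'file_urls': [response.urljoin(link.attrib['href'])],
            }

if __name__ == '__main__':
    process = CrawlerProcess()
    process.crawl(DownloadSpider)
    process.start()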
Good luck!!

You can start by initialising a variable holding the name you want for the downloaded file:
$filename = "XYZ"; // file name (without the extension)
Then, in your download function, use that variable so it becomes the name of the downloaded file:
// header info for browser
header("Content-Type: application/vnd.ms-excel"); // the standard MIME type for .xls
header("Content-Disposition: attachment; filename=\"$filename.xls\"");
header("Pragma: no-cache");
header("Expires: 0");
readfile("$filename.xls"); // finally, output the file's bytes (adjust the path to wherever the file lives)
With this, the downloaded file will be named XYZ with the extension .xls.

Related

Is there a good alternative for embedding a PDF in HTML besides a local file path, an online file path, or the data as a base64 string?

I am building a web app and I would like to show PDF files to my users. My files are mainly stored as byte arrays in the database as they are generated in the backend. I am using the embed element and have found three ways to display a PDF:
Local file path in src attribute: Works, but I need to generate a file from the database byte array, which is not desirable as I have to manage routines to delete them once they are not needed anymore.
Online file path in src attribute: Not possible since my files may not be hosted anywhere but on the server. Also has the same issues as the previous method anyway.
Data as base64 string in src attribute: Current method, but I ran into a problem for larger files (>2MB). Edge and Chrome will not display a PDF when I convert a PDF of this size to a base64 string (no error, but the docs reveal that there is a limit on the data in the src attribute). It works on Firefox, but I cannot restrict my users to Firefox.
Is there any other way to transmit valid PDF data from a byte array out of the database without generating a file locally?
You have made the common mistake of thinking of URLs and file paths as the same thing, but a URL is just a string that's sent to the server, after which some content is sent back. Just as you wouldn't save an HTML file to disk for every dynamic page on the site, you don't have to write to the file system to display a dynamic PDF.
So the solution to this is to have a script on your server that takes the identifier of a PDF in your system, maybe does some access checking, and outputs it to the browser.
For example, if you were using PHP, you might write the HTML as <embed src="/loadpdf.php?id=42"> and then loadpdf.php would contain something like this:
// fetch the stored byte array for the requested id (load_pdf_from_database stands in for your own data-access code)
$pdfContent = load_pdf_from_database((int)$_GET['id']);
// tell the browser it is receiving a PDF
header('Content-Type: application/pdf');
echo $pdfContent;
Loading /loadpdf.php?id=42 directly in the browser would then render the PDF just the same as if it was a "real" file, and embedding it should work the same way too.
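The same pattern works in any server stack. For comparison, here is a minimal sketch of the same idea in Python using Flask; the framework choice and the load_pdf_from_database helper are illustrative assumptions, not part of the original answer.
from flask import Flask, Response

app = Flask(__name__)

def load_pdf_from_database(pdf_id):
    # hypothetical data-access helper: return the stored PDF byte array for this id
    raise NotImplementedError

@app.route('/loadpdf/<int:pdf_id>')
def load_pdf(pdf_id):
    pdf_bytes = load_pdf_from_database(pdf_id)
    # the Content-Type header is what lets the browser render the bytes as a PDF
    return Response(pdf_bytes, mimetype='application/pdf')
The HTML side would then be <embed src="/loadpdf/42">, exactly as in the PHP version.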

Can I have a link automatically download a file in an email?

I have an email with a link to a video file stored on a cloud hosting service (Backblaze).
I'm trying to make it so that when someone clicks the link, the file starts downloading. Right now I have this:
Download Here
That takes you to the video instead of downloading the file. I'd like to avoid that if possible and have the link prompt you to start downloading the file when you click on it.
Is this possible to do in an email when the file is stored on a server somewhere?
Thanks
I don't think you can do this in plain HTML.
Since you can't use JavaScript in email, the best option is to put a small PHP script on the server that does the job.
This example was taken from serverfault
PHP
<?php
// We'll be outputting a MP4
header('Content-type: video/mp4');
// It will be called downloaded.mp4
header('Content-Disposition: attachment; filename="downloaded.mp4"');
// Add your file source here
readfile('original.mp4');
?>

How to use a download link to download a file in Python

Basically I am trying to write a script which will grab certain files on a webpage and download them to specific folders.
I am able to complete this with most of the webpages using Python, Selenium, and FirefoxPreferences.
However, when I try to grab off of this specific webpage, due to credential rights, I can't parse the html.
Here is the question: I am able to grab the download link for the file, and I can open a browser and have the open/save widget pop up. I can't, however, click through or actually download the file any further. I have already set the Firefox preferences to not show this widget, to download automatically, and to use a specific folder. These are ignored for some reason, and I am still left staring at the open browser with the save/open widget.
How do I use the download link of a file to download it to a specific folder using Python, Selenium, or any other related CS tricks? I don't want to build a bot to click the save button for me. Too "hacky", and this is a company project.
Thanks!
You can try urllib:
urllib.urlretrieve(<url>, <filename_with_path>)
For example:
import urllib
urllib.urlretrieve("http://randomsite.com/file.gz", "file.gz")
This is a simple way to download a file with Python.
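Note that this is Python 2 code; on Python 3 the same function lives in urllib.request:
import urllib.request
urllib.request.urlretrieve("http://randomsite.com/file.gz", "file.gz")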

Saving web page as an html file on computer

I want to save the source code of my website's page to my computer. I know that I have to use an HTTP request to download the page's source as an HTML file. I want to run a diff to track changes between two HTML snapshots of the page. I am wondering how to implement a program that saves a web page as an HTML file on my computer; I want to solve the problem programmatically. I was researching this topic and found that HTTP GET requests and Selenium scripts can achieve this task, but I am struggling with the implementation. Any help is really appreciated.
On Linux you can just use wget:
wget http://google.com
That will save a file called index.html on your computer.
Programmatically, you can use Python:
import urllib2

# create a list of urls that you want to download
urls = ['http://example.com', 'http://google.com']

# loop over the urls
for url in urls:
    # make the request
    request = urllib2.urlopen(url)
    # make the filename valid (you can change this to suit your needs)
    filename = url.replace('http://', '')
    filename = filename.replace('/', '-')
    # write it to a file
    with open(filename + '.html', 'w') as f:
        f.write(request.read())
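Since the goal is to track changes between two saved snapshots, Python's standard difflib module can then produce the diff; a minimal sketch, where the file names old.html and new.html are assumptions:
import difflib

# compare two previously saved snapshots line by line
with open('old.html') as a, open('new.html') as b:
    diff = difflib.unified_diff(a.readlines(), b.readlines(),
                                fromfile='old.html', tofile='new.html')
    print(''.join(diff))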

Download links from a web page with renaming

I'm trying to find a way to automatically download all links from a web page, but I also want to rename them. For example:
<a href="fileName.txt">Name I want to have</a>
I want to be able to get a file named 'Name I want to have' (I don't worry about the extension).
I am aware that I could get the page source, then parse all the links, and download them all manually, but I'm wondering if there are any built-in tools for that.
lynx --dump "$url" | grep 'http://' | cut -d ' ' -f 4
will print all the links, which can then be batch-fetched with wget; but is there a way to rename the files on the fly?
I doubt anything does this out of the box. I suggest you write a script in Python or similar to download the page and parse the source (try the Beautiful Soup library for tolerant parsing). Then it's a simple matter of traversing the links to capture their attributes and text, and downloading the files with the names you want. With the exception of Beautiful Soup (if you need to parse sloppy HTML), everything you need is built into Python.
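A minimal sketch of that approach, using Beautiful Soup; the page URL is an assumption, and it assumes the name you want is the link text, as in the example above:
import urllib.parse
import urllib.request
from bs4 import BeautifulSoup

page_url = 'http://example.com/downloads'  # assumed URL
html = urllib.request.urlopen(page_url).read()
soup = BeautifulSoup(html, 'html.parser')

for a in soup.find_all('a', href=True):
    # use the link text as the file name, falling back to the href itself
    name = a.get_text(strip=True) or a['href']
    file_url = urllib.parse.urljoin(page_url, a['href'])
    urllib.request.urlretrieve(file_url, name)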
I solved the problem by converting the web page entirely to Unicode on a first pass (using Notepad++'s built-in conversion).
Then I wrote a small shell script that used cat, awk and wget to fetch all the data.
Unfortunately, I couldn't fully automate the process, since I didn't find any tool for Linux that would convert an entire page from KOI8-R to Unicode.