I am trying to fetch a zip from a URL and import it automatically into a Google Spreadsheet.
The zip contains one file of CSV data.
I know I can import the CSV data into a Google Spreadsheet but I would like to be able to cut out the step of me having to download the zip and extract the file first before uploading it into a Google Spreadsheet.
So my questions are: is it possible to import a zipped CSV file from a URL directly into a Google Spreadsheet? If not, how would this be done with a Google Apps Script?
To answer your first question: no, you cannot import a Zip file containing a CSV directly into a spreadsheet. To answer your second question:
This question is kind of broad and needs to be broken down into three bits.
Retrieving the Zip from the URL
Extracting the CSV from the Zip
Inserting the CSV into your sheet
I'll briefly cover each of these three areas to get you going in the right direction. I am not going to write out a full-fledged solution for you; the code examples are there to make it easier to get started, not to be an end-to-end solution.
Retrieving the Zip from the URL
You can do this with the UrlFetchApp service.
Something like:
var urlData = UrlFetchApp.fetch(link);
var zipBlob = urlData.getBlob();
var files = Utilities.unzip(zipBlob);
Extracting the CSV from the Zip
You need to get the contents of the ZIP file, find the CSV file in the ZIP, then parse it as a CSV into an array. In my example I use a regex to escape the CSV, as the Apps Script CSV parser is buggy.
function GetCSVFromZip(zipBlob){
var files = Utilities.unzip(zipBlob);
var csvAttachment = FindBlobByName(files, 'myPartialName');
if(csvAttachment !== -1){
var dataString = csvAttachment.getDataAsString();
var escapedString = dataString.replace(/(?=["'])(?:"[^"\\]*(?:\\[\s\S][^"\\]*)*"|'[^'\\]\r\n(?:\\[\s\S][^'\\]\r\n)*')/g, '\r\n'); //http://stackoverflow.com/a/29452781/3547347
var csv = Utilities.parseCsv(escapedString);
return csv; //2D array of rows
}
return null; //no matching CSV found in the zip
}
//Finds a blob by a partial name match, assumes no multiple matches
function FindBlobByName(blob, name){
for(var i = 0; i < blob.length; i++){
var blobName = blob[i].getName();
var regex = new RegExp(name, 'i');
var result = blobName.match(regex);
if(result){
return blob[i];
}
}
return -1;
}
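One thing to watch with a partial name match like the one above: new RegExp(name, 'i') treats the name as a pattern, so characters such as "(" or "." in a file name change what matches. A minimal sketch of one way around that, assuming you want a literal, case-insensitive match. The helper names escapeRegExp, findByPartialName and fakeBlob are made up for this illustration, and fakeBlob only stands in for a real blob so the snippet can be tried outside Apps Script:

```javascript
// Escapes characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Returns the first item whose getName() contains partialName
// (case-insensitive, literal match), or null when nothing matches.
function findByPartialName(items, partialName) {
  var pattern = new RegExp(escapeRegExp(partialName), 'i');
  for (var i = 0; i < items.length; i++) {
    if (pattern.test(items[i].getName())) {
      return items[i];
    }
  }
  return null;
}

// Hypothetical stand-in for a blob: only getName() is needed here.
function fakeBlob(name) {
  return { getName: function () { return name; } };
}

var sample = [fakeBlob('readme.txt'), fakeBlob('Report (2019).CSV')];
var hit = findByPartialName(sample, 'report (2019).csv');
```

Returning null instead of -1 is just a design choice in this sketch; either works as long as the caller checks for it.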
Inserting the CSV into your sheet
You need to use the SpreadsheetApp service for this. Get your spreadsheet, get a data range, and set its values to your CSV array. Something along these lines:
var sheet = SpreadsheetApp.openById(id).getSheetByName(name);
var range = sheet.getRange(1, 1, csv.length, csv[0].length);
range.setValues(csv);
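setValues expects a rectangular 2D array whose dimensions match the range. If your CSV might have rows of different lengths (that is an assumption about your data, not something I know), a small helper like this hypothetical padRows can square the array up before you size the range from it:

```javascript
// Pads every row with empty strings so all rows have the same length.
function padRows(rows) {
  var width = 0;
  for (var i = 0; i < rows.length; i++) {
    width = Math.max(width, rows[i].length);
  }
  return rows.map(function (row) {
    var padded = row.slice(); // copy, so the input is left untouched
    while (padded.length < width) {
      padded.push('');
    }
    return padded;
  });
}

var rectangular = padRows([['a', 'b', 'c'], ['d'], ['e', 'f']]);
```

You would then use rectangular.length and rectangular[0].length for the range size.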
Using Douglas's answer, I managed to simplify it and read the zip from the link directly into a spreadsheet. Here is my code:
function testzip(){
var url = "url goes here"
var zipblob = UrlFetchApp.fetch(url).getBlob();
var unzipblob = Utilities.unzip(zipblob);
var unzipstr=unzipblob[0].getDataAsString();
var csv = Utilities.parseCsv(unzipstr);
var ss = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('Sheet1');
ss.getRange(1, 1, csv.length, csv[0].length).setValues(csv);
}
I am trying to automate attaching files and sending emails using Google Apps Script. Each recipient has their own unique attachment, so I have a Google Sheet containing the details:
Column A contains the email addresses.
Column B contains the message.
Column D contains the filename of the attachment along with the file extension.
Here's my code:
function sendEmails() {
var sheet = SpreadsheetApp.getActiveSheet();
var rowEmails = sheet.getRange(1,1,3);
var rowMessage = sheet.getRange(1,2,3);
var rowAttachments = sheet.getRange(1,4,3);
var vEmails = rowEmails.getValues();
var vMessage = rowMessage.getValues();
var vAttachments = rowAttachments.getValues()
for (i in vEmails) {
var emailAddress = vEmails[1] + '';
var message = vMessage[1];
var subject = "Test"
var path = 'certs\\' + vAttachments[i]
var file = Drive.Files.get(path);
MailApp.sendEmail(emailAddress, subject, vMessage[1], {attachments: files});
}
}
Everything works well but when it comes to var file = Drive.Files.get(path), I get an error saying no file is found. I checked my drive and I'm sure it is there. I've also checked the Drive API. I don't know what is wrong.
Something like this might be a little closer to working:
function sendEmails() {
var sheet=SpreadsheetApp.getActiveSheet();
var Emails=sheet.getRange(1,1,3).getValues();
var Message=sheet.getRange(1,2,3).getValues();
var fileids=sheet.getRange(1,3,3).getValues();//you need to add file ids
Emails.forEach(function(r,i){
MailApp.sendEmail(Emails[i][0],"Test",Message[i][0],{attachments:[DriveApp.getFileById(fileids[i][0])]});
});
}
The code in the question has several flaws:
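path: Google Drive does not address files by "path". Drive.Files.get expects a file id, not a folder\file string.
for in / vAttachments[i]: for in yields the array indexes as strings, and getValues() returns a two-dimensional array, so vAttachments[i] is a one-element row rather than a file name (use vAttachments[i][0]). The loop also reads vEmails[1] and vMessage[1] on every iteration instead of using i.
The attachments property of the options should be an Array, and the code passes an undefined variable files instead of file.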
How to "fix" the script
The way to directly get files in Google Drive is by using file ids.
If you want to get files by "path", your script should first get the folder and then look for the file inside that folder, but bear in mind that getting a folder/file by name returns a folder/file iterator.
Instead of for in, use for of or a plain for loop (most of the scripts I have seen use for).
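To make the looping point concrete, here is a small sketch in plain JavaScript. The arrays are made-up sample data shaped like what getRange(1, 1, 3).getValues() returns (one inner array per row), and pairRows is a hypothetical helper, not part of any Google service:

```javascript
// Sample data shaped like getValues() output: one inner array per row.
var emails = [['a@example.com'], ['b@example.com'], ['c@example.com']];
var fileIds = [['id-1'], ['id-2'], ['id-3']];

// for...in yields the index keys as strings, not the values.
var keys = [];
for (var key in emails) {
  keys.push(key);
}

// A plain for loop gives numeric indexes; [0] unwraps the single-column row.
function pairRows(addresses, ids) {
  var jobs = [];
  for (var i = 0; i < addresses.length; i++) {
    jobs.push({ to: addresses[i][0], fileId: ids[i][0] });
  }
  return jobs;
}

var jobs = pairRows(emails, fileIds);
```

Each entry in jobs then has what a send call would need: the address and the id of the file to attach.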
When trying to use the IMPORTDATA function for this file:
https://www.kaggle.com/stefanoleone992/fifa-20-complete-player-dataset#players_20.csv
An unexpected error occurs that says it is impossible to import data into the spreadsheet. Is there any other way that I can bring this data to my spreadsheet?
This data would be very important to the work I'm doing. It would save me almost 3 months of typing and copying everything by hand and then filtering it according to my needs.
It would be very important to be able to import at least the basic info of all players; it is not necessary to import every column of info for each player. Whatever number of columns can be imported would already be perfect.
I would be grateful if there was any way.
You want to download the CSV file players_20.csv from https://www.kaggle.com/stefanoleone992/fifa-20-complete-player-dataset and put the CSV data into the Spreadsheet.
You want to achieve this using Google Apps Script.
If my understanding is correct, how about this answer? Please think of this as just one of several answers.
Issue and workaround:
Unfortunately, the CSV data cannot be downloaded directly from the URL https://www.kaggle.com/stefanoleone992/fifa-20-complete-player-dataset#players_20.csv. In order to download the CSV file, you are required to log in to Kaggle. Alternatively, you can download it using the API. In this answer, I used Kaggle's public API to download the CSV file.
Usage:
1. Retrieve token file:
Before you use the script, please register an account at https://www.kaggle.com and retrieve the token file. For how to retrieve the token file, see the official document.
In order to use the Kaggle’s public API, you must first authenticate using an API token. From the site header, click on your user profile picture, then on “My Account” from the dropdown menu. This will take you to your account settings at https://www.kaggle.com/account. Scroll down to the section of the page labelled API:
To create a new token, click on the “Create New API Token” button. This will download a fresh authentication token onto your machine.
In this script, the token object in the downloaded token file is used.
2. Run script:
Please copy and paste the following script into the container-bound script of the Spreadsheet, and set the variables csvFilename, path and tokenObject. For your case, I have already set csvFilename and path, so please set only your token object.
function myFunction() {
var csvFilename = "players_20.csv"; // Please set the CSV filename.
var path = "stefanoleone992/fifa-20-complete-player-dataset"; // Please set the path.
var tokenObject = {"username":"###","key":"###"}; // <--- Please set the token object.
var baseUrl = "https://www.kaggle.com/api/v1/datasets/download/";
var url = baseUrl + path;
var params = {headers: {Authorization: "Basic " + Utilities.base64Encode(tokenObject.username + ':' + tokenObject.key)}};
var blob = UrlFetchApp.fetch(url, params).getBlob();
var csvBlob = Utilities.unzip(blob).filter(function(b) {return b.getName() == csvFilename});
if (csvBlob.length == 1) {
var csvData = Utilities.parseCsv(csvBlob[0].getDataAsString());
var sheet = SpreadsheetApp.getActiveSheet();
sheet.getRange(1, 1, csvData.length, csvData[0].length).setValues(csvData);
} else {
throw new Error("CSV file of " + csvFilename + " was not found.");
}
}
Flow:
The flow of this script is as follows.
When the script is run, the equivalent of the kaggle command kaggle datasets download -d stefanoleone992/fifa-20-complete-player-dataset is performed with Google Apps Script through the API. This downloads the ZIP file.
Retrieve the CSV file of csvFilename from the downloaded ZIP file.
Parse the CSV data from the CSV file.
Put the CSV data to the active sheet.
In this script, all data is processed as blobs, so no file is created.
Note:
The CSV data seems to be large, so please wait until the script finishes.
In my environment, it took about 150 seconds until the CSV data was put into the Spreadsheet.
The CSV data of players_20.csv has 18279 rows and 104 columns.
If an error occurs at Utilities.unzip(blob), please try changing var blob = UrlFetchApp.fetch(url, params).getBlob() to var blob = UrlFetchApp.fetch(url, params).getBlob().setContentTypeFromExtension().
References:
Authentication of Kaggle's public API
kaggle-api
If I misunderstood your question and this was not the direction you want, I apologize.
Added 1:
If you want to select which columns to put, please modify the above sample script as follows.
From:
var csvData = Utilities.parseCsv(csvBlob[0].getDataAsString());
var sheet = SpreadsheetApp.getActiveSheet();
To:
var csvData = Utilities.parseCsv(csvBlob[0].getDataAsString());
var needColumns = [1, 2, 3];
csvData = csvData.map(function(row) {return needColumns.map(function(col) {return row[col]})});
var sheet = SpreadsheetApp.getActiveSheet();
In the above modification, as a test case, the columns at zero-based indexes 1, 2 and 3 (that is, the 2nd, 3rd and 4th columns) are put into the Spreadsheet.
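If you would rather pick columns by header name than by index, something like the following could work. This is only a sketch: selectColumnsByHeader is a hypothetical helper, it assumes the first row of the parsed CSV is the header row, and the header names in the sample are invented for the example rather than taken from players_20.csv.

```javascript
// Keeps only the named columns, looked up in the header row (row 0).
function selectColumnsByHeader(csvData, wantedHeaders) {
  var header = csvData[0];
  var indexes = wantedHeaders.map(function (name) {
    var index = header.indexOf(name);
    if (index === -1) {
      throw new Error('Column not found: ' + name);
    }
    return index;
  });
  return csvData.map(function (row) {
    return indexes.map(function (index) { return row[index]; });
  });
}

// Invented sample data, shaped like the output of Utilities.parseCsv.
var sampleCsv = [
  ['id', 'name', 'age', 'club'],
  ['1', 'Player A', '30', 'Club X'],
  ['2', 'Player B', '25', 'Club Y']
];
var picked = selectColumnsByHeader(sampleCsv, ['name', 'club']);
```

Throwing on an unknown header is deliberate here, so a typo in a column name fails loudly instead of writing a column of blanks.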
Added 2:
Based on the benchmark results for putting CSV data into a Spreadsheet, how about using the Sheets API to put the CSV data? For this, please modify the above sample script as follows. Before you run the script, please enable the Sheets API at Advanced Google services.
From:
var csvData = Utilities.parseCsv(csvBlob[0].getDataAsString());
var sheet = SpreadsheetApp.getActiveSheet();
sheet.getRange(1, 1, csvData.length, csvData[0].length).setValues(csvData);
To:
var ss = SpreadsheetApp.getActiveSpreadsheet();
var sheet = ss.getActiveSheet();
var resource = {requests: [{pasteData: {data: csvBlob[0].getDataAsString(), coordinate: {sheetId: sheet.getSheetId()}, delimiter: ","}}]};
Sheets.Spreadsheets.batchUpdate(resource, ss.getId());
In this case, it took about 50 seconds until the CSV data was put into the Spreadsheet.
Reference:
Benchmark: Importing CSV Data to Spreadsheet using Google Apps Script
Advanced Google services
I have been researching the site but can't find the solution to my question. I am writing a script to extract a list of one or more filenames from a sheet and use these filenames as input to move the actual files from one folder to another in Drive. My issue now is that I don't know how to handle the FileIterator value that is coming back in my script. As a result, the error I get when I run my script is "TypeError: Cannot find function makeCopy in object FileIterator".
I'm not sure whether I am missing something in how I use the makeCopy() method or in how I set up the variable that pulls values from the sheet.
Here is my code:
// Access Mailing List sheet in gdrive and get filename
var spreadsheet = SpreadsheetApp.openByUrl('spreadsheetURL');
var sheet = spreadsheet.getSheets()[0];
var value = sheet.getSheetValues(2,39,1,1);
Logger.log(value);
var folder = DriveApp.getFolderById("folderid1");
Logger.log(folder);
var files = DriveApp.getFilesByName(value);
Logger.log(files);
var destination = DriveApp.getFolderById("folderid2");
Logger.log(destination);
newfile = files.makeCopy("copy of"+files,destination);
Logger.log(newfile);
Please advise!
The problem occurs because getFilesByName(name) returns a FileIterator object but makeCopy is a File object method.
Example:
This shows how to use a file iterator.
var name = 'File name';
var destination = DriveApp.getFolderById("folderId");
var files = DriveApp.getFilesByName(name);
while (files.hasNext()) {
var file = files.next();
file.makeCopy("copy of " + name, destination);
}
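If you prefer to work with a normal array afterwards (for example to check how many files matched before copying anything), you can drain the iterator first. A minimal sketch: collectAll is a hypothetical helper that relies only on hasNext() and next(), and fakeIterator is a stand-in so the snippet can be tried outside Apps Script.

```javascript
// Drains any object with hasNext()/next() into a plain array.
function collectAll(iterator) {
  var items = [];
  while (iterator.hasNext()) {
    items.push(iterator.next());
  }
  return items;
}

// Hypothetical stand-in for a FileIterator, backed by a plain array.
function fakeIterator(values) {
  var position = 0;
  return {
    hasNext: function () { return position < values.length; },
    next: function () { return values[position++]; }
  };
}

var names = collectAll(fakeIterator(['a.txt', 'b.txt']));
var none = collectAll(fakeIterator([]));
```

In a real script you would pass the result of DriveApp.getFilesByName(name) to collectAll and then call makeCopy on each element of the returned array.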
I receive an email with a hyperlink that, when clicked, starts a download of a csv file. It's not an actual attachment. When I receive this email in my Gmail account (it has a unique subject line), I need a way to automatically add the contents of the downloaded .csv to a Google Sheet.
Trigger:
An email with a specific subject line is received to my gmail account
Action 1:
Download a .csv file from a hyperlink within the body of the email
Action 2:
Add the contents of the .csv file to a Google Sheet file
I need an already built service that does this or suggestions on how to approach it.
If I can get this Google script to run, I should be able to find a working solution. The problem is the script keeps giving me errors.
function downloadFile(fileURL,folder) {
var fileName = "";
var fileSize = 0;
var response = UrlFetchApp.fetch(fileURL, {muteHttpExceptions: true});
var rc = response.getResponseCode();
if (rc == 200) {
var fileBlob = response.getBlob()
var folder = DocsList.getFolder(folder);
if (folder != null) {
var file = folder.createFile(fileBlob);
fileName = file.getName();
fileSize = file.getSize();
}
}
var fileInfo = { "rc":rc, "fileName":fileName, "fileSize":fileSize };
return fileInfo;
}
This is something I recently tackled at work, fully automating data pulls from my emails to a database. I am not going to write a solution for you, but I will provide you with the information and links you need to do it yourself.
Note: Your question is very broad, and covers a large range of different problems, each of which should be tackled one at a time with their own question (Many of which already have multiple answers on StackOverflow). This is a process to follow with linked documentation, and a couple code snippets so you can do it yourself and tackle each problem along the way.
The Proposed Process:
Open the email with the GmailApp Service
Extract the link via the script below
Get the CSV from the link via the code linked below. This utilizes UrlFetchApp, the Blob datatype, and the parseCsv utility (for which you have to escape commas first, because it's buggy)
Modify the contents of the resulting array to your liking
Use the SpreadsheetApp Service to open a spreadsheet and get a range
Set the values of that range to your array of data.
Extract href link from email (assumes only 1 link):
//Retrieves a URL from a HTML string from an href. Only applicable if there is only one link in the string
function GetHrefURLsFromString(string){
var match = string.match(/href="([^"]*)/); //match is null when there is no href
if(match && match[1]){
return match[1];
} else {
throw "No URL Found"
}
}
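If the email body can contain more than one link, a global regex with exec in a loop collects all of them. This is a sketch under the same assumption as above (href values in double quotes); getAllHrefs is a hypothetical name and the sample HTML is made up.

```javascript
// Returns every href value found in double-quoted attributes, in document order.
function getAllHrefs(html) {
  var pattern = /href="([^"]*)"/g;
  var urls = [];
  var match;
  while ((match = pattern.exec(html)) !== null) {
    urls.push(match[1]);
  }
  return urls;
}

var body = '<p><a href="https://example.com/a.csv">A</a> ' +
           '<a href="https://example.com/b.csv">B</a></p>';
var links = getAllHrefs(body);
```

You can then filter the resulting array, for instance keeping only URLs that end in .csv, before fetching anything.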
Extract CSV from link:
//Gets a CSV from a provided link, and parses it.
function GetCSVFromLink(link){
var urlData = UrlFetchApp.fetch(link);
if(urlData.getBlob().getContentType() == 'text/csv'){ //getContentType returns a MIME type
var stringData = urlData.getContentText();
var escapedStringData = stringData.replace(/(?=["'])(?:"[^"\\]*(?:\\[\s\S][^"\\]*)*"|'[^'\\]\r\n(?:\\[\s\S][^'\\]\r\n)*')/g, '\r\n');
var CSV = Utilities.parseCsv(escapedStringData);
return CSV;
}
Logger.log('DataType Not CSV')
return null;
}
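A note on the content type check: the value is a MIME type and may carry parameters such as a charset, so an exact string comparison can be brittle. A slightly more forgiving check could look like this. isCsvContentType is a hypothetical helper, and whether your server actually labels the file as text/csv is an assumption you should verify (some servers send a generic type for downloads).

```javascript
// True when the MIME type (ignoring parameters such as charset) is text/csv.
function isCsvContentType(contentType) {
  if (!contentType) {
    return false;
  }
  var mimeType = String(contentType).split(';')[0].trim().toLowerCase();
  return mimeType === 'text/csv';
}
```

In GetCSVFromLink you would pass urlData.getBlob().getContentType() to this helper instead of comparing the string directly.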
I am trying to extract the text from each Google document in a folder in Drive and paste the text into the first column of a Google spreadsheet so that the contents of file 1 are in A1, the contents of file 2 in A2 etc. Ultimately I am trying to recreate a database of the information stored in all these files, so if the text can be split by field so much the better, but I think this should be trivial in Excel using Text to Columns.
I have used a few snippets online to have a stab at it but I'm now stumped.
Here is my script as it stands:
//Function to extract the body from each document in a folder and copy it to a spreadsheet
function extract() {
//Define the folder we're working with ("Communication Passports") and get the file list
var folder = DocsList.getFolder("Communication Passports");
var contents = folder.getFiles();
//Define the destination spreadsheet file (CP) and set up the sheet to receive the data
var ss = SpreadsheetApp.openById("0AicdFGdf-Cx5dHFTX1R3Wm1RTEFTZ2d5ZmxuSjJSOHc");
SpreadsheetApp.setActiveSpreadsheet(ss);
Logger.log('File name: ' + ss.getName());
var sheet = SpreadsheetApp.getActiveSheet();
sheet.clear();
sheet.appendRow(["Name", "Date", "Contents", "URL", "Download", "Description"]);
//Set up other variables
var file;
var data;
//Loop through and collect the data (I don't actually need this - just borrowed the code from a snippet online - but it is SO CLOSE!)
//Sadly, getBody doesn't work on files, only on documents
for (var i = 0; i < contents.length; i++) {
file = contents[i];
data = [
file.getName(),
file.getDateCreated(),
file.getViewers(),
file.getUrl(),
"https://docs.google.com/uc?export=download&confirm=no_antivirus&id=" + file.getId(),
file.getDescription()
];
sheet.appendRow(data);
//Extract the text from the file (this doesn't work at present, but is what I actually need)
var doc = DocumentApp.openById(file.getId());
var body = doc.getBody();
//Find a way to paste the extracted body text to the spreadsheet
}
};
Any help would be very gratefully received - I'm not a programmer, I'm a teacher and the information is about children's learning needs at our school (someone deleted the database over summer and our backups only go back a month!).
Thanks,
Simon
Try adding:
var doc = DocumentApp.openById(file.getId());
var body = doc.getBody().getText();
This returns the actual text contents of the document.
I wrote another function to parse the content into more usable chunks and then pass back to an entry in the data table and it worked fine.
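For anyone wondering what such a parsing step might look like, here is a rough sketch. It is not the function I used; it assumes (hypothetically) that each document is made of "Label: value" lines, and splitFields is just an illustrative name. Your documents will probably need different rules.

```javascript
// Splits text made of "Label: value" lines into an object keyed by label.
// Non-empty lines without a colon are appended to the previous field.
function splitFields(text) {
  var fields = {};
  var lastLabel = null;
  text.split(/\r?\n/).forEach(function (line) {
    var separator = line.indexOf(':');
    if (separator > 0) {
      lastLabel = line.slice(0, separator).trim();
      fields[lastLabel] = line.slice(separator + 1).trim();
    } else if (lastLabel !== null && line.trim() !== '') {
      fields[lastLabel] += ' ' + line.trim();
    }
  });
  return fields;
}

var parsed = splitFields('Name: Sam\nLikes: drawing,\nmusic\nNeeds: extra time');
```

Each value in the resulting object can then go into its own column of the row you append to the sheet.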