PDF real-time download and conversion - google-apps-script

I'm looking for a way to use Google Apps Script to download a PDF file and convert it into Google Sheets.
The reason is that the website only provides its data as a PDF, so I can't use an import function to get real-time updates.

It depends on how you download your PDF files.
Here is a simple example of how you can convert a PDF file from your Google Drive into a Google Document:
function pdfToDoc() {
  var fileBlob = DriveApp.getFilesByName('test.pdf').next().getBlob();
  var resource = {
    title: fileBlob.getName(),
    mimeType: fileBlob.getContentType()
  };
  var options = {
    ocr: true
  };
  var docFile = Drive.Files.insert(resource, fileBlob, options);
  Logger.log(docFile.alternateLink);
}
To make it work, you need to enable the Drive API advanced service.
Based on the answer: https://webapps.stackexchange.com/a/136948
As far as I can see, the only output is a Google Doc. You can probably extract the data from the Doc and put it into a spreadsheet with a script, but that depends on what exactly your data looks like.
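If the PDF is available at a direct URL, the download and the extraction step can be chained. The following is only a sketch, under several assumptions: the Drive advanced service is v2 (the version that provides Drive.Files.insert), the script is bound to the target spreadsheet, the URL is a placeholder, and splitting each OCR line on whitespace gives usable cells. The names textToRows, padRows and pdfUrlToSheet are hypothetical, and a real PDF will probably need parsing tailored to its layout.

```javascript
// Pure helper: turn OCR text into rows of cells.
// The whitespace separator is an assumption about the PDF layout.
function textToRows(text, separator) {
  var sep = separator || /\s+/;
  return text
    .split(/\r?\n/)
    .map(function (line) { return line.trim(); })
    .filter(function (line) { return line.length > 0; })
    .map(function (line) { return line.split(sep); });
}

// Pure helper: pad rows to equal width, because setValues needs a rectangle.
function padRows(rows) {
  var width = rows.reduce(function (max, row) {
    return Math.max(max, row.length);
  }, 0);
  return rows.map(function (row) {
    return row.concat(new Array(width - row.length).fill(''));
  });
}

// Apps Script entry point (hypothetical name).
function pdfUrlToSheet() {
  var pdfUrl = 'https://example.com/report.pdf'; // placeholder URL (assumption)
  var blob = UrlFetchApp.fetch(pdfUrl).getBlob();

  // Same conversion as pdfToDoc above: upload with OCR to get a Google Doc.
  var docFile = Drive.Files.insert(
    {title: 'pdf-import-temp', mimeType: MimeType.PDF}, blob, {ocr: true});

  var text = DocumentApp.openById(docFile.id).getBody().getText();
  var rows = padRows(textToRows(text));

  var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheets()[0];
  sheet.clearContents();
  if (rows.length > 0) {
    sheet.getRange(1, 1, rows.length, rows[0].length).setValues(rows);
  }

  Drive.Files.remove(docFile.id); // delete the temporary Doc
}
```

For periodic updates, pdfUrlToSheet could be run from a time-driven trigger.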

Related

Saving charts as images when converting xls to Google Sheets - Google Script

I have a standard script from the Internet that converts xls to Google Sheets. The charts in the output file are converted to images, which is a good thing: I need the charts saved as images because the original xls has specific formatting I would like to preserve. The problem is that they look awful: the colors are shaded, the font has dropped to a very small size, and the legend floats somewhere. Ultimately I need these images saved in Google Slides. So I was thinking of finding a way either (1) to take the images from the xls and save them directly in Google Slides, or (2) to save the images from the xls to Google Sheets while somehow preserving the original formatting, and then run another script that saves the images to the target Slides. Interestingly, I have not found anyone complaining about the loss of chart formatting.
function convertExceltoGoogleSpreadsheet2(fileName) {
  try {
    fileName = fileName || "name";
    var excelFile = DriveApp.getFilesByName(fileName).next();
    var fileId = excelFile.getId();
    var folderId = Drive.Files.get(fileId).parents[0].id;
    var blob = excelFile.getBlob();
    var resource = {
      // Escape the dot and anchor the pattern so only a trailing extension is removed.
      title: excelFile.getName().replace(/\.xlsx?$/, ""),
      key: fileId,
      // Create the converted file in the same folder as the original.
      parents: [{id: folderId}]
    };
    Drive.Files.insert(resource, blob, {
      convert: true
    });
  } catch (f) {
    Logger.log(f.toString());
  }
}

When trying to convert ppt to Google Slide receive converting error

In my Google Apps Script program, I am trying to iterate over a folder and convert all of the ppt files into Google Slides files.
function makeSlides(url) {
  slideUrls = [];
  var id = getId(url);
  var powerPoints = DriveApp.getFolderById(id).getFilesByType(MimeType.MICROSOFT_POWERPOINT);
  // turn ppt into slides
  while(powerPoints.hasNext()) {
    var powerPoint = powerPoints.next()
    try{
      var sheet = powerPoint.getBlob().getAs(MimeType.GOOGLE_SLIDES);
      DriveApp.getFolderById(url).createFile(sheet)
      Logger.log("OK " + powerPoint.getName());
    }catch(e) {
      Logger.log("ERROR: " + e)
    }
  }
}
After checking the logs, I get an error:
Exception: Converting from application/vnd.openxmlformats-officedocument.presentationml.presentation to application/vnd.google-apps.presentation is not supported.
I know that within the Google Drive UI you can open a ppt as a Google Slides file. Is there any workaround for this? Or am I doing it wrong?
I did find this, but it is the opposite of what I am trying to accomplish.
getAs() cannot convert from PowerPoint format to Google Slides. You can achieve this with the Drive API. In this modification, I used the Drive API through Advanced Google Services.
Before you use this script, please enable the Drive API in Advanced Google Services and in the API console. See the Advanced Google Services reference below.
Modified script:
Please modify as follows.
From:
var sheet = powerPoint.getBlob().getAs(MimeType.GOOGLE_SLIDES);
DriveApp.getFolderById(url).createFile(sheet)
To:
Drive.Files.insert({title: powerPoint.getName(), mimeType: MimeType.GOOGLE_SLIDES}, powerPoint.getBlob());
Note:
In this modified script, the converted file is created in the root folder. If you want to create it in a specific folder, please change {title: powerPoint.getName(), mimeType: MimeType.GOOGLE_SLIDES} to {title: powerPoint.getName(), mimeType: MimeType.GOOGLE_SLIDES, parents: [{id: folderId}]}.
If you want to retrieve the file ID of the converted file, please use var id = Drive.Files.insert({title: powerPoint.getName(), mimeType: MimeType.GOOGLE_SLIDES}, powerPoint.getBlob()).id.
References:
Advanced Google Services
Drive API
Drive.Files.insert
If I misunderstand your question, please tell me. I would like to modify it.
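Putting the modification and both notes together, the whole loop could look roughly like the sketch below. It assumes the Drive advanced service is v2 (Drive.Files.insert does not exist in v3) and that the caller already has the folder ID. The names buildSlidesResource and convertFolderToSlides are hypothetical.

```javascript
// MIME type of Google Slides, as shown in the error message above.
var GOOGLE_SLIDES_MIME = 'application/vnd.google-apps.presentation';

// Pure helper: build the Drive v2 file resource for the converted file.
// Without a folderId the file ends up in the root folder.
function buildSlidesResource(name, folderId) {
  var resource = {title: name, mimeType: GOOGLE_SLIDES_MIME};
  if (folderId) {
    resource.parents = [{id: folderId}];
  }
  return resource;
}

// Converts every PowerPoint file in the folder and returns the new file IDs.
function convertFolderToSlides(folderId) {
  var powerPoints = DriveApp.getFolderById(folderId)
      .getFilesByType(MimeType.MICROSOFT_POWERPOINT);
  var ids = [];
  while (powerPoints.hasNext()) {
    var powerPoint = powerPoints.next();
    try {
      var created = Drive.Files.insert(
        buildSlidesResource(powerPoint.getName(), folderId),
        powerPoint.getBlob());
      ids.push(created.id);
      Logger.log('OK ' + powerPoint.getName());
    } catch (e) {
      Logger.log('ERROR: ' + powerPoint.getName() + ': ' + e);
    }
  }
  return ids;
}
```

Keeping the resource in a small helper makes it easy to switch between the root folder and a specific destination folder.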

Batch convert jpg to Google Docs

I have a folder of jpgs in Google Drive that I would like to convert to Google Docs. Right now I can select each one manually and choose "Open in Google Docs" from the context menu. This creates a new document with the image at the top of the page and the OCR text below. I just want to do this with all my images.
There is a script here that converts gdoc to docx, which I ought to be able to adapt for my case, but I don't seem to be able to get it to work.
Here is my adapted script:
function convertJPGtoGoogleDocs() {
  var srcfolderId = "~~~~~~~~~Sv4qZuPdJgvEq1A~~~~~~~~~"; // <--- Please input folder ID.
  var dstfolderId = srcfolderId; // <--- If you want to change the destination folder, please modify this.
  var files = DriveApp.getFolderById(srcfolderId).getFilesByType(MimeType.JPG);
  while (files.hasNext()) {
    var file = files.next();
    DriveApp.getFolderById(dstfolderId).createFile(
      UrlFetchApp.fetch(
        "https://docs.google.com/document/d/" + file.getId() + "/export?format=gdoc",
        {
          "headers" : {Authorization: 'Bearer ' + ScriptApp.getOAuthToken()},
          "muteHttpExceptions" : true
        }
      ).getBlob().setName(file.getName() + ".docx")
    );
  }
}
Can anyone help?
Thanks.
You want to convert the JPEG files in a folder to Google Documents.
When a JPEG file is converted to a Google Document, you want to use OCR.
If my understanding is correct, how about this modification?
Modification points:
In the script you modified, MimeType.JPG is undefined, so the body of the while loop never runs.
Please use MimeType.JPEG.
The script from that answer exports a Google Document as Microsoft Word. Unfortunately, it cannot be used directly to convert a JPEG file to a Google Document.
If you want to modify that script, how about modifying it as follows?
When you use this script, please enable the Drive API in Advanced Google Services. Doing so automatically enables the API in the API console, because the specification of Google Apps Script projects changed on April 8, 2019.
Modified script:
function convertJPGtoGoogleDocs() {
  var srcfolderId = "~~~~~~~~~Sv4qZuPdJgvEq1A~~~~~~~~~"; // <--- Please input folder ID.
  var dstfolderId = srcfolderId; // <--- If you want to change the destination folder, please modify this.
  var files = DriveApp.getFolderById(srcfolderId).getFilesByType(MimeType.JPEG); // Modified
  while (files.hasNext()) {
    var file = files.next();
    Drive.Files.insert({title: file.getName(), parents: [{id: dstfolderId}]}, file.getBlob(), {ocr: true}); // Modified
  }
}
Note:
If there are a lot of files in the source folder, the script may exceed the runtime limit (6 min per execution).
References:
Enum MimeType
Drive.Files.insert
If I misunderstand your question, please tell me. I would like to modify it.
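Regarding the runtime limit mentioned in the note, one possible way around it is to stop before the limit and resume on the next run with a continuation token. This is only a sketch: the 5-minute budget, the property key jpegToken, the folder ID placeholder and the function names are all assumptions, and the Drive advanced service is assumed to be v2.

```javascript
// Stop well before the 6-minute limit (the 5-minute budget is an assumption).
var TIME_BUDGET_MS = 5 * 60 * 1000;

// Pure helper: true once the elapsed time has used up the budget.
function outOfTime(startMs, nowMs, budgetMs) {
  return nowMs - startMs >= budgetMs;
}

// Converts JPEGs with OCR; call it again (or from a trigger) until it finishes.
function convertJpegsResumable() {
  var srcFolderId = 'YOUR_FOLDER_ID'; // placeholder (assumption)
  var props = PropertiesService.getScriptProperties();
  var token = props.getProperty('jpegToken'); // hypothetical property key

  // Resume from the saved position if a previous run was interrupted.
  var files = token
      ? DriveApp.continueFileIterator(token)
      : DriveApp.getFolderById(srcFolderId).getFilesByType(MimeType.JPEG);

  var start = Date.now();
  while (files.hasNext()) {
    if (outOfTime(start, Date.now(), TIME_BUDGET_MS)) {
      // Save the position and let the next execution continue from here.
      props.setProperty('jpegToken', files.getContinuationToken());
      return;
    }
    var file = files.next();
    Drive.Files.insert(
      {title: file.getName(), parents: [{id: srcFolderId}]},
      file.getBlob(), {ocr: true});
  }
  props.deleteProperty('jpegToken'); // all files processed
}
```

The converted Documents are not JPEG files, so writing them to the source folder should not feed them back into the iterator.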

Is there a way to save html text to a file in google drive using apps script?

Is there a way to save html text to a file in google drive using apps script? My app sends back an html string, which I want to save as an .html file into drive, as if I'd uploaded an HTML file to drive. I then intend on opening this .html file as a google doc, which will convert it to a doc format. I've tried this procedure manually, and it works well. Just need to do it in a script.
More to the point, I'd love a direct way to convert HTML into a google doc.
var url = 'YOUR PAGE';
var p = SitesApp.getPageByUrl(url);
var html = p.getHtmlContent();
var blob = DriveApp.createFile('dummy',html, 'text/html').getBlob();
var resource = {
  title: 'YOUR FILE NAME',
  convert: true,
  mimeType: 'application/vnd.google-apps.document'
};
Drive.Files.insert(resource,blob);
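When the HTML is already available as a string, the intermediate file in Drive is probably not needed: a blob can be built in memory with Utilities.newBlob and passed straight to the Drive API. A minimal sketch, assuming the Drive advanced service v2 is enabled; buildDocResource and htmlToDoc are hypothetical names.

```javascript
// Pure helper: the target mimeType is what asks Drive to convert the upload.
function buildDocResource(title) {
  return {
    title: title,
    mimeType: 'application/vnd.google-apps.document'
  };
}

// Converts an HTML string to a Google Doc and returns the new file ID.
function htmlToDoc(html, title) {
  // Build the HTML blob in memory instead of creating a dummy file first.
  var blob = Utilities.newBlob(html, 'text/html', title + '.html');
  return Drive.Files.insert(buildDocResource(title), blob).id;
}
```

For example, htmlToDoc('<h1>Report</h1><p>Hello</p>', 'Report') would be expected to create a Doc named Report in the root folder.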

How to replace text within a pdf file?

Is it possible to replace text within a pdf file using Google Apps Script?
I am trying the following code without success on the replace, it seems like the string is encoded in a way I cannot understand.
var pdfFile = DocsList.getFileById("pdf-doc-id");
var asBlob = pdfFile.getBlob();
var asString = asBlob.getDataAsString();
var s2s = "old string";
var s2r = "new string";
var repString = asString.replace(s2s, s2r);
var repBlob = Utilities.newBlob(repString).setContentType("application/pdf").setName("Testing");
DocsList.createFile(repBlob);
EDIT1: I got an empty pdf back
Any ideas?
Thanks
The function getDataAsString() doesn't return the textual content of a PDF file, but instead a textual representation of the binary content of the file. That function works on any file, even those that don't have text (like images).
Unfortunately I don't think you can fully accomplish your goal with Apps Script. If you are able to import your PDF as a Google Document using the Drive UI, then you can use Apps Script's DocumentApp to modify the document and export it as a PDF.
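The import, modify and export route described above could be scripted roughly as follows. This is a sketch under assumptions: it uses the Drive advanced service v2 with OCR to do the import, so the original PDF layout will most likely not be preserved, and the names escapeForReplaceText and replaceTextInPdf are hypothetical.

```javascript
// Pure helper: Body.replaceText treats its first argument as a regular
// expression, so escape the metacharacters to match the text literally.
function escapeForReplaceText(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Imports the PDF as a Google Doc, replaces the text, and saves a new PDF.
function replaceTextInPdf(pdfFileId, oldText, newText) {
  var pdf = DriveApp.getFileById(pdfFileId);

  // Import: upload the PDF with OCR so Drive creates an editable Doc.
  var docId = Drive.Files.insert(
    {title: pdf.getName() + ' (editable)', mimeType: MimeType.PDF},
    pdf.getBlob(), {ocr: true}).id;

  // Modify: replace the text in the Doc body.
  var doc = DocumentApp.openById(docId);
  doc.getBody().replaceText(escapeForReplaceText(oldText), newText);
  doc.saveAndClose();

  // Export: convert the Doc back to PDF and store it as a new file.
  var out = DriveApp.getFileById(docId).getAs(MimeType.PDF).setName('Testing.pdf');
  return DriveApp.createFile(out).getId();
}
```

Calling doc.saveAndClose() before the export is intended to make sure the replacement is flushed before the Doc is converted back to PDF.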