Spreadsheet uploading appropriate for business/end-users in Foundry - palantir-foundry

Does Foundry have native support for uploading and appending spreadsheets (identical schema) to one dataset, with an interface appropriate for business/end-users?
I'm evaluating a user workflow that involves receiving tabular spreadsheets ad-hoc and appending them using regular programmatic methods. I'm trying to enable this workflow in Foundry, wherein users would upload these spreadsheets (identical schema) to a single dataset in Foundry, integrated into downstream pipelines. The workflow would look like this:
User navigates to spreadsheet upload page
Button for import or upload
Hit button, enables selection of spreadsheet to upload
Upload
File is appended to the dataset
OPTIONAL: Users have the ability to delete uploaded spreadsheets from the dataset.
I'm aware that users can upload multiple CSV / Excel files to a single dataset via APPEND options, but the interface is not suitable for end-users, i.e. it's possible to overwrite (SNAPSHOT) the entire dataset if the wrong option is selected. A prior discussion was raised here but never resolved: How to import excel/csv with "File Import" widget in Foundry's Slate?

Foundry Actions combined with Workshop are a robust way to enable ad-hoc uploads and appends of spreadsheets (or any other file format) to a specific dataset, with an interface appropriate for business/end-users.
Actions can be configured to support Attachment uploads: these are easy to configure, allow specific files to be added or deleted, and support single-file uploads of up to 200MB. A Workshop app can then provide the workflow in which the user uploads files via an Action. The uploaded files can be retrieved via API calls using the attachment RID, then parsed and appended to the dataset in a transform.
Actions: https://www.palantir.com/docs/foundry/action-types/overview/
Workshop: https://www.palantir.com/docs/foundry/workshop/overview/
API calls to attachment:
https://www.palantir.com/docs/foundry/api/ontology-resources/attachments/get-attachment-content/

Related

Disable Spreadsheet copy - Google Sheets

I would like to allow users to use my spreadsheet but not copy it as it contains intellectual property. I tried going to sharing settings and disabling:
Editors can change permissions and share
Viewers and commenters can see the option to download, print, and copy
But the sheet can still be copied. Ideas?
Unfortunately, it is not possible to disable copy / download for editors.
You can only do that for commenters and viewers.
As a workaround, I would advise you to keep your sensitive information in one master file and then use IMPORTRANGE, or copy via a script, to bring the shareable information into another file. So even if they copy or download the latter, your sensitive information won't be copied or downloaded.
Related questions:
How to disable copy/ download access for editors in google sheets
Prevent editors from downloading the file
Disable download & Copy to Option in Google Spreadsheet
I think the simplest solution would be to copy and paste from the master file the range of values you want to share with the other document. In this scenario the editors of the other document will have access to neither the code nor the full data of the master file, since the latter won't be shared with them.
The copy and paste part can be done automatically via a script and a trigger mechanism to update the data automatically so you won't have to do anything manually and the master file won't be exposed to any user.
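The automated copy could be sketched as the Apps Script function below. The file IDs, the sheet name `Data`, and the range are hypothetical placeholders, and passing the `SpreadsheetApp` service in as a parameter is an assumption made here so the logic can be exercised outside Apps Script. The script would live in the master file (which is not shared), with `syncJob` attached to a time-driven trigger.

```javascript
/**
 * Copies the values of one range from the private master file into the
 * shared file. Only values are copied, so formulas and scripts stay behind.
 * All IDs, the sheet name and the range are hypothetical placeholders.
 */
function syncSharedRange(app, masterId, sharedId, sheetName, rangeA1) {
  const values = app.openById(masterId)
    .getSheetByName(sheetName)
    .getRange(rangeA1)
    .getValues();

  // Write starting at A1 of the target sheet; the target range is sized
  // from the data so the dimensions passed to setValues() match.
  app.openById(sharedId)
    .getSheetByName(sheetName)
    .getRange(1, 1, values.length, values[0].length)
    .setValues(values);

  return values.length; // number of rows copied
}

// Entry point for a time-driven trigger inside Apps Script.
function syncJob() {
  syncSharedRange(SpreadsheetApp, 'MASTER_FILE_ID', 'SHARED_FILE_ID', 'Data', 'A1:D100');
}
```

Because only the shared file is visible to editors, copying or downloading it exposes just the values in the synced range.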
There isn't any sure way to hide your data. Once something is published on the internet, you should consider it saved on many devices all over the world. Consider some of the ways hidden spreadsheet data can be obtained.
Attack scenarios:
By far the easiest way is Ctrl+C and Ctrl+V (copy and paste)
Editor menu options: File->Copy and File->Export
Once your file id is visible, any editor or even viewer with access to the file can easily copy the file itself through
Url manipulation: Adding /copy at the end instead of /edit
google-drive-api: File:get and File:copy
google-sheets-api: Useful to directly get data as json
google-visualization-api: Can get data as HTML, CSV or JSON (Google query). See endpoints
Screenshot and use OCR (optical character recognition)
View source code in the browser and directly copy the table
web-scraping Simulate browser using selenium
Hiding data:
Data may be hidden from naive users. Data cannot be hidden from users, who know the basics of how the web works.
Add ?rm=minimal to the URL when sharing the Sheets file. This hides all menu options. See here
Frame the editor in an iframe on your own website and use CSS to hide the top portion of the web page.
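As a small illustration of the `?rm=minimal` tip, the helper below appends the parameter to a sharing link. It is plain JavaScript for Node.js or a browser (it relies on the standard `URL` class), and `FILE_ID` is a placeholder. The function name is made up for this sketch.

```javascript
/**
 * Returns the given Google Sheets URL with rm=minimal set, which hides
 * the menus in the editor UI. It does not restrict access to the data.
 */
function toMinimalUrl(sheetUrl) {
  const url = new URL(sheetUrl);
  url.searchParams.set('rm', 'minimal');
  return url.toString();
}

// Example with a placeholder file ID:
console.log(toMinimalUrl('https://docs.google.com/spreadsheets/d/FILE_ID/edit'));
```

Anyone can remove the parameter again by editing the address bar, which is why this only deters naive users.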
Hiding Logic:
You may still be able to hide logic of your code.
IMPORTRANGE: This is a very basic and easy way to hide your logic. But there are limitations and any editor can access any part of your master spreadsheet.
You can implement IMPORTRANGE-like logic using custom functions and web apps. This gives more control over the connector and secures your master spreadsheet much better than IMPORTRANGE. Here,
Two web apps are created, each associated with a spreadsheet(Master and client).
You use two KEYs to communicate between them. One for access and other for encryption.
Once access is verified, the data from master spreadsheet is encrypted and sent back to the custom function. Simultaneously the encryption key is posted to the client webapp.
The key here is the Master/Server webapp posts the encryption key only to the published client web app link. So, no other sheet or anything else can intercept the key or decrypt the data. Furthermore, a random key is generated for each access.
Another option is to let go of the spreadsheet completely and use a single web app to show the data. This hides the logic in server scripts and linked spreadsheets.
Comment thoughts:
Create a script onOpen to kill sheets if the file is wrong?
onOpen cannot post data anywhere without the new copy owner permission. It's not possible to kill sheets. But data can be erased.
/**
 * Deletes all sheets on the copy, if a copy is made.
 */
const onOpen = () => {
  const ss = SpreadsheetApp.getActive();
  // On the original file, do nothing (otherwise a blank sheet would be added on every open).
  if (ss.getId() === '###Original ID###') return;
  const sheets = ss.getSheets();
  // A spreadsheet must keep at least one sheet, so insert a blank one at the end first.
  ss.insertSheet(sheets.length);
  // Then delete every sheet that existed before the blank one was added.
  sheets.forEach(s => ss.deleteSheet(s));
};
But an editor can modify the original script before making a copy. And a revision of the original spreadsheet will still be available. The new owner can revert to the original version, or use the API endpoints mentioned above to get the data. Also, mobile apps don't support onOpen, so new owners can simply use the mobile versions to access the data.
Use a formula or web app to notify the file owner?
Possible, but the data is already copied and there's no specific information that can be used to accurately identify the new owner. You may be able to get locale information though.

Get BigQuery to read all CSV files in a Drive Folder (same schema)

Is it possible to have a BigQuery query on a Google Drive folder if all CSV files in the folder have the same schema? Is it possible for the query to be updated automatically whenever a file is added or deleted? Just wondering whether this would require some Apps script or can just be done within BigQuery somehow.
Option 1: PubSub detection of Google Drive changes
You can likely do this via PubSub.
Then have the PubSub subscription be a PUSH notification to an Apps Script web service HTTP endpoint.
Your doPost() handler in Apps Script (Pub/Sub push notifications arrive as HTTP POST requests, which doGet() does not receive) can then do your normal import actions on that CSV, based on the filename carried in the Pub/Sub push notification.
Option 2: Use BQ external tables
Link BQ to an external Google Drive source as an external table. This does not require "importing" data to BQ at all, it reads directly from CSV on Google Drive, etc.

Date based Folder creation in Google Cloud Storage

Use Case :
I have to store request/response objects in Google Cloud Storage on a daily basis, and want to create a folder per day (bucket/year/month/day format) that holds all of that day's objects.
My typical flow is as follows:
JSON message to PubSub --> Cloud Function (Python) --> Google Cloud Storage on a daily basis.
Query:
Since the Cloud Function can be triggered in parallel for each event in PubSub (millions of messages a day) and might create duplicate folders in GCS, is there any way to synchronise folder creation before creating an object in GCS for a given day?
In Google Cloud Storage, the file name includes the full path (flat namespace).
For example, the name of a hypothetical file is "your-bucket/abc/file.txt" instead of just "file.txt".
Having said that, folders don't exist in Cloud Storage, so you don't have to worry about creating them, or about creating them simultaneously. You only need to avoid creating files with the same name.
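A minimal sketch of that naming scheme follows. It is written in JavaScript to match the other code in this thread, but the same string formatting carries over to a Python Cloud Function. The function name, the `.json` suffix, the use of UTC dates, and the use of a unique `messageId` (for example the Pub/Sub message ID) to keep names distinct are all assumptions for illustration.

```javascript
/**
 * Builds a date-prefixed object name such as "2024/03/07/<id>.json".
 * The prefix is only part of the object name; nothing has to be
 * created beforehand, so parallel invocations cannot conflict on it.
 */
function dailyObjectName(date, messageId) {
  const yyyy = date.getUTCFullYear();
  const mm = String(date.getUTCMonth() + 1).padStart(2, '0'); // months are 0-based
  const dd = String(date.getUTCDate()).padStart(2, '0');
  return `${yyyy}/${mm}/${dd}/${messageId}.json`;
}

// Example: an object name for a message handled on 7 March 2024 (UTC).
console.log(dailyObjectName(new Date(Date.UTC(2024, 2, 7)), 'msg-123'));
```

Two invocations on the same day produce names that share the prefix, and the objects only collide if the `messageId` part is the same.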

Programmatically retrieve files uploaded to a Google Form

Looking to automate the processing of data in a spreadsheet generated by google forms; specifically, I want to attach the files uploaded to the form to outgoing emails as attachments. The files can be synced to a local folder for the program to access, but the google form only has a url for each uploaded file.
What would be the most efficient way to determine which of the files uploaded correspond with each form submission and its corresponding link?
Edit: my research indicates that I may be able to use an identifier from the URL to see which document it is through an Apps API. Another thought was to scrape the HTML from the linked webpage and then glean the file name from somewhere in the HTML, where it hopefully occurs with regularity.
Then I could use the filename to construct a filepath to the synced folder on my local machine or would it be better to use the drive api to manipulate the file into an email attachment?
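The identifier idea from the edit above could be sketched as follows. It assumes the response sheet stores each upload as a Drive link of the form `https://drive.google.com/open?id=FILE_ID`, with multiple uploads separated by commas in one cell; both function names are made up for this sketch.

```javascript
/**
 * Extracts the Drive file ID from a link such as
 *   https://drive.google.com/open?id=FILE_ID
 *   https://drive.google.com/file/d/FILE_ID/view
 * Returns null when no ID is found.
 */
function driveFileId(link) {
  const match = /[?&]id=([\w-]+)/.exec(link) || /\/d\/([\w-]+)/.exec(link);
  return match ? match[1] : null;
}

/**
 * Splits a response cell that holds one or more comma-separated links
 * and returns the file IDs in the order they appear.
 */
function fileIdsFromCell(cell) {
  return String(cell)
    .split(',')
    .map(s => driveFileId(s.trim()))
    .filter(id => id !== null);
}
```

Since each row of the response sheet is one submission, the IDs taken from that row's upload column tie the files to the submission directly. Inside Apps Script, `DriveApp.getFileById(id).getBlob()` should then give a blob that can be passed in the `attachments` option of `MailApp.sendEmail`, which avoids the local synced folder entirely.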

How to create Google Sheets file when CSV file is downloaded

When a user downloads a CSV file from an arbitrary site, I would like to be informed of this event and automatically upload the same file as a Google Sheets file. Is this possible or would this be violating a fundamental browser security concept?
The CSV file in question does not have a URL, but is created on the fly by the arbitrary web site, when the user clicks a button. An example would be the user's list of financial transactions at a bank web site.
I am not new to Google Apps/Drive but I am new to Google Apps Script.
There is no way to detect that kind of event.
I can't even imagine a function that would allow for automatic file upload of a local file... don't forget Google Apps Script is a server based environment.