Use case:
I have to store request/response objects in Google Cloud Storage on a daily basis. I want to create a folder per day (bucket/year/month/day format) and store all of that day's objects in it.
My typical flow is like below:
JSON message to Pub/Sub --> Cloud Function (Python) --> Google Cloud Storage, on a daily basis.
Query:
Since the Cloud Function can be triggered in parallel for each Pub/Sub event (millions of messages a day) and might create duplicate folders in GCS, is there any way to synchronise folder creation before creating an object in GCS for a given day?
In Google Cloud Storage, the object name includes the full path (the namespace is flat).
For example, the name of a hypothetical file is "your-bucket/abc/file.txt" instead of just "file.txt".
Having said that, folders don't exist in Cloud Storage, so you don't have to worry about creating folders or creating them simultaneously; you only need to avoid creating objects with the same name. Just write each object under a date-prefixed name such as <year>/<month>/<day>/<messageId>.json and the daily "folders" will appear automatically in the console.
Does Foundry have native support for uploading and appending spreadsheets (identical schema) to one dataset, with an interface appropriate for business/end-users?
I'm evaluating a user workflow that involves receiving tabular spreadsheets ad-hoc and appending them using regular programmatic methods. I'm trying to enable this workflow in Foundry, wherein users would upload these spreadsheets (identical schema) to a single dataset in Foundry, integrated into downstream pipelines. The workflow would look like this:
User navigates to spreadsheet upload page
Button for import or upload
Hit button, enables selection of spreadsheet to upload
Upload
File is appended to the dataset
OPTIONAL: Users have the ability to delete uploaded spreadsheets from the dataset.
I'm aware that users can upload multiple CSV / Excel files to a single dataset via APPEND options, but the interface is not suitable for end-users, i.e. it's possible to overwrite (SNAPSHOT) the entire dataset if the wrong option is selected. A prior discussion was raised here but never resolved: How to import excel/csv with "File Import" widget in Foundry's Slate?
There are a number of approaches that can enable this workflow. Foundry Actions combined with Workshop present a robust option for ad-hoc uploads and appends of spreadsheets (or any file format) to a specific dataset, with an interface appropriate for business/end-users.
Actions can be configured to support Attachment uploads; attachments are easy to configure, allow adding/deleting specific files, and support uploads of single files up to 200 MB. A Workshop app can be created to support a workflow where the user uploads the files to a dataset via the Action. The uploaded files can then be retrieved via API calls against the attachment RID, parsed, and appended to the dataset in a transform.
Actions: https://www.palantir.com/docs/foundry/action-types/overview/
Workshop: https://www.palantir.com/docs/foundry/workshop/overview/
API calls to attachment:
https://www.palantir.com/docs/foundry/api/ontology-resources/attachments/get-attachment-content/
Is it possible to have a BigQuery query over a Google Drive folder if all CSV files in the folder have the same schema? Is it possible for the query to be updated automatically whenever a file is added or deleted? Just wondering whether this would require some Apps Script or whether it can be done within BigQuery itself.
Option 1: Pub/Sub detection of Google Drive changes
You can likely do this via Pub/Sub.
Then have the Pub/Sub subscription push notifications to an Apps Script web app HTTP endpoint.
Your doPost() method in Apps Script can then run your normal import actions on that CSV, based on the file id/name carried in the Pub/Sub push message (push deliveries arrive as HTTP POST requests with a JSON body).
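A minimal sketch of what such a handler might look like, assuming the web app is deployed as an endpoint that Pub/Sub's POST can reach, and that whatever publishes to the topic puts the Drive file id in the message payload (the payload shape below is hypothetical):

```js
// Apps Script web app entry point for a Pub/Sub push subscription.
// Push deliveries arrive as HTTP POST requests whose JSON body wraps the
// message; the actual payload is base64-encoded in message.data.
function doPost(e) {
  var envelope = JSON.parse(e.postData.contents);
  var payloadText = Utilities.newBlob(
      Utilities.base64Decode(envelope.message.data)).getDataAsString();
  var payload = JSON.parse(payloadText); // e.g. {fileId: "...", name: "..."} -- hypothetical shape

  // ...run your normal import/refresh logic for the CSV identified by payload.fileId...

  // Return something so Pub/Sub receives a success response and treats the message as delivered.
  return ContentService.createTextOutput('ok');
}
```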
Option 2: Use BQ external tables
Link BigQuery to the external Google Drive source as an external table. This does not require "importing" the data into BigQuery at all; queries read directly from the CSV files on Google Drive.
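If you'd rather define the external table programmatically than through the console, a sketch using the Apps Script advanced BigQuery service could look like the following. The project/dataset/table ids and the Drive file URI are placeholders, and it assumes the advanced BigQuery service plus a Drive OAuth scope are enabled for the script:

```js
// Create a BigQuery table backed by a CSV file on Google Drive.
// Requires the BigQuery advanced service and a Drive scope in the manifest.
function createDriveExternalTable() {
  var projectId = 'your-project-id';   // placeholder
  var datasetId = 'your_dataset';      // placeholder
  var table = {
    tableReference: {
      projectId: projectId,
      datasetId: datasetId,
      tableId: 'drive_csv_external'    // placeholder
    },
    externalDataConfiguration: {
      sourceFormat: 'CSV',
      sourceUris: ['https://drive.google.com/open?id=YOUR_FILE_ID'], // placeholder file id
      autodetect: true
    }
  };
  BigQuery.Tables.insert(table, projectId, datasetId);
}
```

Note that, as far as I know, Drive source URIs have to point at individual files (wildcards are only supported for Cloud Storage), so adding or removing a file in the folder means updating sourceUris; that is where something like Option 1 can help keep the table definition in sync.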
I am writing a PowerShell script that creates an IoT Hub with a storage account and a Stream Analytics job.
To update the JSON file for the Stream Analytics job, I need to retrieve the name of the storage account that has just been created. Unfortunately, I couldn't find an AzureRM cmdlet that returns just the storage account name.
Any suggestions on how to do that?
My current script receives it as input from the user, but I want the script to be automated, with no need for user input.
Got it, I just used:
$storageAccountName = (Get-AzureRmStorageAccount -ResourceGroupName $IotHubResourceGroupName).storageAccountName
Ahoy!
How can I export all of a Google Spreadsheet's data to MySQL? I have the basics of an export script, but all of my spreadsheets have 1,500+ rows and there are 41 of them. My next question is: can I execute these scripts on all of the spreadsheet files at once, perhaps in a folder? I don't fancy trawling through all 41 and assigning a script to each.
Thanks in advance :)
How can I export all of a Google Spreadsheet's data to MySQL?
There are several ways you can do this. Which one to use depends on how your MySQL instance is configured.
If your MySQL instance is a closed, local-network-only instance, then you can't connect to it from outside your local network, so Google Apps Script will not be able to reach it. In this case your only option is to export your spreadsheet data as CSV files (e.g. using the File -> Download as -> Comma-separated values menu), then import those into your MySQL table. See the LOAD DATA INFILE MySQL statement syntax for details.
If your MySQL instance is a public-facing instance, accessible from outside your local network, you could use the Google Apps Script JDBC service to connect to your MySQL instance and insert/update data from your Google Sheets. Please read the "Setup for other databases" section of the JDBC guide for details on setting up your database for connections from Google Apps Script.
Can I execute these scripts on all of the spreadsheet files at once, perhaps in a folder?
In the second case (a public-facing MySQL instance) you can definitely automate this with a bit of scripting. You can have one script that loops through all spreadsheets in a given folder (or through a list of spreadsheet ids, if they are in different folders) and inserts the data from each into your MySQL database. The Drive Service and Spreadsheet Service will be your friends here. However, keep in mind that the maximum execution time for a Google Apps Script run is around 6 minutes, so if your sheets contain a lot of data and/or your connection to the database is slow, such a script may run into a timeout. You may have to implement some back-off/resume functionality so the script knows where the previous run finished and picks up from there on the next run.
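A minimal sketch of that loop for the public-facing case, assuming the JDBC service can reach your instance; the folder id, connection string, credentials, table and column names below are placeholders, and the back-off/resume logic is left out:

```js
// Loop over every Google Sheet in a Drive folder and append its rows
// to a MySQL table through the Apps Script JDBC service.
function exportFolderToMySql() {
  var folder = DriveApp.getFolderById('FOLDER_ID');                    // placeholder
  var files = folder.getFilesByType(MimeType.GOOGLE_SHEETS);
  var conn = Jdbc.getConnection('jdbc:mysql://your-host:3306/your_db', // placeholders
                                'db_user', 'db_password');
  conn.setAutoCommit(false);
  var stmt = conn.prepareStatement(
      'INSERT INTO your_table (col_a, col_b, col_c) VALUES (?, ?, ?)'); // placeholder schema
  while (files.hasNext()) {
    var file = files.next();
    var rows = SpreadsheetApp.open(file).getSheets()[0].getDataRange().getValues();
    for (var i = 1; i < rows.length; i++) {          // skip the header row
      stmt.setString(1, String(rows[i][0]));
      stmt.setString(2, String(rows[i][1]));
      stmt.setString(3, String(rows[i][2]));
      stmt.addBatch();
    }
    stmt.executeBatch();                             // one batch per spreadsheet
    conn.commit();
  }
  stmt.close();
  conn.close();
}
```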
We have successfully deployed Google Apps Script based web applications to teams of users; those applications allow the users to log their daily activities via a simple GUI. The log data is collected per team using Google spreadsheets or ScriptDBs, depending on the size of the team.
We now want to go one step further and do analysis and reporting on the user activity data across the teams. Given the amount of data, BigQuery looks like a good technology for that. We are thinking about using Google Apps Script to push the data automatically, on a regular (e.g. daily) basis, to the BigQuery table(s). We are wondering what the best practices are for doing that with data originating from Google spreadsheets and ScriptDBs.
Unlike in previous cases, simply reading through the BigQuery API documentation and code snippets did not make the recommended approach obvious to us.
The hint we have found so far:
Write Data from Google Spreadsheets to a BigQuery Table
We don't have any examples for it, but you can use Apps Script's built-in BigQuery service to upload data directly.
BigQuery.Jobs.insert(resource, mediaData, optionalArgs)
The mediaData parameter was recently added, and allows you to pass in the CSV data as a blob which gets sent directly to BigQuery.
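Building on that hint, a minimal sketch of a sheet-to-BigQuery load via the Apps Script advanced BigQuery service might look like the following. The project/dataset/table ids are placeholders, the sheet's columns are assumed to match the table schema, and note that current samples of the advanced service pass the project id as the second argument to Jobs.insert rather than the order quoted above:

```js
// Serialise the first sheet of the active spreadsheet as CSV and push it
// to BigQuery as a load job. Requires the BigQuery advanced service.
function loadSheetToBigQuery() {
  var projectId = 'your-project-id';                                   // placeholder
  var values = SpreadsheetApp.getActiveSpreadsheet().getSheets()[0]
      .getDataRange().getValues();

  // Naive CSV serialisation: fields containing commas or quotes would
  // need proper escaping before this goes to production.
  var csv = values.map(function(row) { return row.join(','); }).join('\n');
  var data = Utilities.newBlob(csv, 'application/octet-stream');

  var job = {
    configuration: {
      load: {
        destinationTable: {
          projectId: projectId,
          datasetId: 'your_dataset',                                   // placeholder
          tableId: 'activity_log'                                      // placeholder
        },
        skipLeadingRows: 1,
        writeDisposition: 'WRITE_APPEND'
      }
    }
  };
  job = BigQuery.Jobs.insert(job, projectId, data);
  Logger.log('Started load job: %s', job.jobReference.jobId);
}
```

The same pattern should work for the ScriptDB data once it is serialised to CSV (or to newline-delimited JSON, with sourceFormat set to 'NEWLINE_DELIMITED_JSON' in the load configuration).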