How to download files from AWS S3 storage on a schedule using GitHub Actions?

I have some small files to download from AWS S3 storage on a specific schedule. The limitation is that I need to use GitHub Actions to get them, so no external scripts. I realise it's possible to send a file from GitHub to AWS S3 using a GitHub Action, but what about the other way around?
The files are just monthly generated metadata.json files, nothing massive. Each month a new one is created and I need to grab the 3 most recent for reporting. Ultimately I'd also want to automatically convert them to .csv and possibly join them into one larger .json file, but I don't need to do anything else special with them.
I can find dozens of examples of sending files to S3 on a schedule, but not of receiving them.
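For what it's worth, the retrieval itself can be a small script that a workflow runs from an on: schedule cron trigger. Below is a minimal sketch in Python using boto3, under the assumption that the bucket name, key prefix, and file layout shown are placeholders and that AWS credentials are exposed to the job as repository secrets; it grabs the three most recently modified metadata.json files and flattens them into one CSV.

```python
# Sketch only: fetch the three most recent metadata.json files from S3
# and flatten them into one CSV. Bucket and prefix names are placeholders;
# credentials are expected in the environment (e.g. GitHub Actions secrets
# mapped to AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY).
import csv
import json

import boto3

BUCKET = "my-metadata-bucket"   # placeholder
PREFIX = "reports/"             # placeholder

s3 = boto3.client("s3")

# List objects under the prefix and keep the 3 most recently modified .json files.
objects = s3.list_objects_v2(Bucket=BUCKET, Prefix=PREFIX).get("Contents", [])
json_objects = [o for o in objects if o["Key"].endswith(".json")]
latest = sorted(json_objects, key=lambda o: o["LastModified"], reverse=True)[:3]

records = []
for obj in latest:
    body = s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"].read()
    data = json.loads(body)
    # Assumes each metadata.json is a flat dict or a list of flat dicts.
    records.extend(data if isinstance(data, list) else [data])

# Write the combined records out as CSV.
if records:
    with open("metadata_combined.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=sorted({k for r in records for k in r}))
        writer.writeheader()
        writer.writerows(records)
```

The same could be done with the AWS CLI and jq inside a run step if you prefer to stay shell-only; the point is just that a scheduled workflow can pull from S3 as easily as it can push.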

Related

How to get updated server files from Elastic Beanstalk?

I am hosting the server for my website on AWS Elastic Beanstalk. On the server I store files that get uploaded by end users (or myself) in an "Images" folder, so new files are created in that folder every time an image is uploaded to the website.
How can I download the latest files from my server on EB, including these new images? I can download the original zip file I uploaded, but it doesn't have the new data in the folders.
Thanks
You should not be storing any files of value on the EB node itself.
If a user uploads content, you should in turn be uploading it to S3, your database, or some other kind of file storage. That is usually decided during architecture.
So while the actual answer is "this should never have happened in the first place", I must point out that the main reason is that auto-scaling can kill your nodes without you knowing, which would destroy the uploads, or bring up new nodes, spreading your content across multiple nodes.
I also understand this answer might not help you if you have already made that mistake and now have content that needs to be transferred off the node. In that case I would:
disable autoscaling
enable termination protection on the node
transfer the data off via rsync, scp, or an upload to S3 (see the sketch after this list)
automate that transfer so it can be repeated later if needed
implement a new upload method for the future of your app
deploy the new method so no new content is uploaded to the previous storage location
re-transfer your data from the old to the new storage location
disable termination protection and re-enable autoscaling
make sure the new nodes are receiving traffic, and then why not kill that previous node
remember: servers are cattle, not pets
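For the transfer step above, a rough sketch of pushing the stranded uploads from the node into S3 with boto3 (bucket name and paths are placeholders; rsync or scp over SSH would work just as well if you are pulling the files instead):

```python
# Sketch: copy everything under the node's local "Images" folder to S3.
# Bucket name and local path are placeholders; run this on the EB instance,
# or adapt it to rsync/scp if you prefer to pull the files off the node.
import os

import boto3

LOCAL_DIR = "/var/app/current/Images"   # placeholder path on the node
BUCKET = "my-user-uploads"              # placeholder bucket
PREFIX = "rescued-uploads/"

s3 = boto3.client("s3")

for root, _dirs, files in os.walk(LOCAL_DIR):
    for name in files:
        local_path = os.path.join(root, name)
        # Preserve the directory structure under the prefix.
        key = PREFIX + os.path.relpath(local_path, LOCAL_DIR)
        s3.upload_file(local_path, BUCKET, key)
        print(f"uploaded {local_path} -> s3://{BUCKET}/{key}")
```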

Google Drive API Retrieve Revision Difference

I am trying to get the differences between two distinct revisions of a file. I have tried listing both the "changes" and the "revisions" resources provided by the API. However, these resources do not seem to contain any information on the differences between two revisions; they simply return the specified revision's file content.
The main goal here is to avoid the overhead of downloading the current state of the file and instead fetch just the differences (relative to the local version that I have) so that they can be applied to the local version of that file.
Is there any way to fetch just the differences?
Is there any way to fetch just the differences?
No.
The workaround you suggested (download both file versions, then diff them locally) is the only way available so far.
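To illustrate that workaround, here is a rough Python sketch; the authorized Drive service object, the file ID, and the revision IDs are assumed placeholders, and it only makes sense for text content:

```python
# Sketch of the workaround: download two revisions of a Drive file and
# diff them locally. `service` must be an authorized Drive API client,
# e.g. googleapiclient.discovery.build("drive", "v3", credentials=creds);
# the file ID and revision IDs are placeholders, and the file is assumed
# to be plain text.
import difflib

def diff_revisions(service, file_id, old_rev, new_rev):
    def fetch(revision_id):
        # revisions().get_media downloads the raw content of one revision.
        data = service.revisions().get_media(
            fileId=file_id, revisionId=revision_id
        ).execute()
        return data.decode("utf-8").splitlines()

    return "\n".join(difflib.unified_diff(fetch(old_rev), fetch(new_rev), lineterm=""))

# Example (placeholders): print(diff_revisions(service, "your-file-id", "rev-1", "rev-2"))
```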

Monitor and automatically upload local files to Google Cloud Bucket

My goal is to make a website (hosted on Google's App Engine through a Bucket) that includes an upload button much like
<p>Directory: <input type="file" webkitdirectory mozdirectory /></p>
that prompts users to select a main directory.
Inside the main directory, the machine software will first generate a subfolder and write discrete files into it every few seconds, up to ~4000 per subfolder, at which point it will create another subfolder and continue, and so on.
I want Google Bucket to automatically create a Bucket folder based on metadata (e.g. user login ID and time) in the background, and the website should monitor the main directory and subfolders, and automatically upload every file, sequentially from the time they are finished being written locally, into the Cloud Bucket folder. Each 'session' is expected to run for ~2-5 days.
Creating separate Cloud folders is meant to separate user data in case of multiple parallel users.
Does anyone know how this can be achieved? Would be good if there's sample code to adapt into existing HTML.
Thanks in advance!
As per #JohnHanely, this is not really feasible using a web application. I also do not understand the use case entirely, but I can provide some insight into monitoring Cloud Storage buckets.
GCP provides Cloud Functions:
Respond to change notifications emerging from Google Cloud Storage. These notifications can be configured to trigger in response to various events inside a bucket—object creation, deletion, archiving and metadata updates.
The Cloud Storage Triggers will help you avoid having to monitor the buckets yourself and can instead leave that to GCF.
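For illustration, here is a minimal first-generation background Cloud Function in Python that fires whenever an object is finalized in a bucket; the function name, bucket, and deploy command are placeholders to adapt:

```python
# Sketch of a background Cloud Function triggered by Cloud Storage.
# Deployed with something like:
#   gcloud functions deploy on_upload \
#       --runtime python310 --trigger-resource YOUR_BUCKET \
#       --trigger-event google.storage.object.finalize
def on_upload(event, context):
    """Runs every time a new object is finalized in the bucket."""
    bucket = event["bucket"]
    name = event["name"]
    print(f"New object gs://{bucket}/{name} (created {event.get('timeCreated')})")
    # From here you could record the upload, move the object into a
    # per-user prefix, fan out further processing, etc.
```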
Maybe you could expand on what you are trying to achieve with that many folders? Are you trying to create ~4,000 sub-folders per user? There may be a better path forward if we know more about the intended use of the data. It seems you want to hold data, and perhaps a DB is better suited?
- Application
  |-- Accounts
  |---- User1
  |------ Metadata
  |---- User2
  |------ Metadata

Using extract.autodesk.io to automatically download bubbles to our local server

I'm trying to use and modify extract.autodesk.io (thanks to Cyrille Fauvel) but have not been successful yet. In a nutshell, this is what I want to do:
user drag-and-drops the design file (I'm OK with this part)
I've removed the submit button, so right after uploading, extraction should begin on Autodesk's server (I've added a .done to trigger the auto-extraction: uploadFile(uri).done(function(){ SubmitProjectDirect(); }); )
no need to load a temp viewer for viewing/testing
automatically download the bubble as a zip file into our local server folder
delete the uploaded model right away, as our projects are mostly strictly confidential
I'm encountering a 405 'Method Not Allowed' on the 'api/file' sub-folder, which I believe should be Autodesk's folder on the server.
Can anyone point me to the root URN of api/file?
I seem to be stuck on item 2 above due to the 405 error, but even if I get past that one, I still need to solve 3, 4 and 5.
Appreciate any help...
In light of the additional comment above, the issue is a bit more complicated than I originally thought. In order to upload a file to Autodesk cloud storage, you need to use specific endpoints, with a PUT verb, and provide an OAuth access token.
It should be possible to set up Flow.js to use all of the above, but since it is a JavaScript library running on your client, anyone could steal your access token and use it illegitimately, either to access your data or to consume your cloud credits to perform actions on your behalf.
Another issue is that the OSS minimum chunk size is 5 MB (see this article), so you need to control this as well as provide OSS with the byte-range assembly information.
I would not recommend uploading to OSS directly from the client for security reasons, but if you do not want to use your server as temporary storage, we can either proxy the Flow.js upload to the OSS storage or pipe each uploaded chunk to the Autodesk cloud storage. Both solutions are secure, with no storage on your server, but traffic will continue to go via your server. I will create a branch on the GitHub repo in a few days to demonstrate both approaches.
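To give a rough idea of the proxy approach, here is a much simplified sketch of the server-side half: a single, non-chunked PUT to the (legacy) OSS object endpoint. The bucket key, object name, and the way the two-legged token is obtained are placeholders, and a real Flow.js proxy would forward chunks and respect the 5 MB minimum with a resumable upload instead:

```python
# Sketch: server-side proxy upload of a received file to Autodesk OSS,
# so the access token never reaches the browser. Bucket key, object name
# and token retrieval are placeholders; chunked/resumable uploads must
# respect the ~5 MB minimum chunk size mentioned above.
import requests

ACCESS_TOKEN = "2-legged-oauth-token"   # obtained server-side, never sent to the client
BUCKET_KEY = "my-extraction-bucket"     # placeholder
OBJECT_NAME = "model.rvt"               # placeholder

def upload_to_oss(file_bytes):
    url = (
        "https://developer.api.autodesk.com/oss/v2/"
        f"buckets/{BUCKET_KEY}/objects/{OBJECT_NAME}"
    )
    resp = requests.put(
        url,
        data=file_bytes,
        headers={
            "Authorization": f"Bearer {ACCESS_TOKEN}",
            "Content-Type": "application/octet-stream",
        },
    )
    resp.raise_for_status()
    return resp.json()  # contains the object ID / URN of the uploaded object

# Example: upload_to_oss(open("model.rvt", "rb").read())
```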

How can I get multiple files from Google Drive through the Google Drive API?

I would like to know how I can obtain multiple files from Google Drive. I searched the reference for this but did not find the information. I'm building a web application that will talk to Drive and retrieve a link to a zip file to download (a zip of the files).
I'm using PHP with API v2.
That is currently not possible with the Drive API; you have to send multiple requests to retrieve multiple files.
I've been faced with a similar problem and while there's currently no way of doing this through Drive (to my knowledge), this is the solution I came up with.
I'm serving up hundreds of thousands of files to clients using a Drive folder as the storage with a custom built front-end built with the Drive API. With that many files, it's ridiculously tedious for users to download files one at a time. So the idea was to let the users select which files they'd like to download and then present them with a zip file containing the files.
To do that, first you'll need to get an array of the Drive files that you want to download whether that's some you generate programmatically or through check-boxes on the front-end. Then you'll loop through that array and grab the 'downloadURL' value for each file and perform a cURL to the url provided. Depending on how many files you're planning on handling per request you can either keep them all in memory or temporarily store them on the disk or in a database. Regardless, once you have all of the files, you can then zip them up using any number of zip libs that are out there. Then just send the resulting zip file to the user.
In our case we ended up sticking with individual file downloads because of the potentially massive amount of resources and bandwidth this can eat but it's a potential solution if you're not serving large numbers of files.
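For reference, here is the same loop-and-zip flow sketched in Python rather than PHP; it uses the v3-style files().get_media call instead of the v2 downloadURL field, and assumes an already-authorized service object and a list of selected file IDs:

```python
# Sketch of the loop-and-zip approach: download each selected Drive file
# and bundle them into one zip. Assumes `service` is an authorized Drive
# client and `file_ids` is the list of selected file IDs (both placeholders).
import io
import zipfile

from googleapiclient.http import MediaIoBaseDownload

def zip_drive_files(service, file_ids, out_path="files.zip"):
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for file_id in file_ids:
            meta = service.files().get(fileId=file_id, fields="name").execute()
            buf = io.BytesIO()
            # Note: native Google Docs/Sheets need files().export_media instead.
            downloader = MediaIoBaseDownload(buf, service.files().get_media(fileId=file_id))
            done = False
            while not done:
                _status, done = downloader.next_chunk()
            zf.writestr(meta["name"], buf.getvalue())
    return out_path
```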
Assuming I am answering the correct query: if you place the files in a folder on Google Drive, then as far as I know it is possible to download them as a complete group of files.