Can you give me an example (HTML or Java) of how to implement uploading a file in chunks? How can I upload a file if its length is unknown at the time of the request? I found this approach https://developers.google.com/drive/manage-uploads#resumable but I can't understand how to tell the server that the stream has terminated.
Thank you in advance.
aGO!
I can't give you an example with the Google Drive SDK, but I can tell you how it's done with raw HTTP requests.
First you have to initiate a resumable upload session, after which you can make partial uploads of unknown total size by setting the Content-Range header to something like bytes 0-262143/* for the first 256 KB, or whatever your chunk size is (intermediate chunks have to be multiples of 256 KB as per the documentation).
You keep making requests like this to upload intermediate chunks until you reach the final chunk, which should be smaller than the fixed intermediate ones. This last chunk completes the upload because its Content-Range header is set to bytes <byte interval of the chunk>/<final size of file>.
The final size of the file can be calculated just before the last request: number of intermediate chunks already sent * 256 KB + size of the final chunk.
The algorithm is illustrated in more detail here: http://pragmaticjoe.blogspot.ro/2015/10/uploading-files-with-unknown-size-using.html
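For what it's worth, here is a minimal sketch of that flow in Python with the requests library rather than the Drive SDK; session_uri (the URI returned when the resumable session was initiated) and the stream argument are placeholders of mine:

import requests

CHUNK = 256 * 1024  # intermediate chunks must be multiples of 256 KB

def upload_unknown_length(session_uri, stream):
    # session_uri is the location returned when the resumable session was initiated
    offset = 0
    current = stream.read(CHUNK)
    resp = None
    while current:
        nxt = stream.read(CHUNK)  # look ahead so we know when we are on the last chunk
        end = offset + len(current) - 1
        # "*" means the total size is still unknown; the final chunk declares the real size
        total = "*" if nxt else str(offset + len(current))
        headers = {
            "Content-Length": str(len(current)),
            "Content-Range": f"bytes {offset}-{end}/{total}",
        }
        resp = requests.put(session_uri, data=current, headers=headers)
        # expect 308 (Resume Incomplete) for intermediate chunks, 200/201 when done
        offset += len(current)
        current = nxt
    return resp

The look-ahead read is what lets you "tell the server the stream is terminated": as soon as there is no next chunk, you send the range with a concrete total size instead of *.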
I have the following problem.
I am working on a small machine with little memory (1 GB).
My program downloads a huge gzip file from some URL, and I need to decompress it into a dict; I know for sure that the file is in JSON format.
My problem is that after I run the following command I get a memory error:
data = zlib.decompress(url, 16 + zlib.MAX_WBITS).decode('utf8')
results_list.append(json.loads(data))
Now, for small files this works fine, but for large ones I get the error.
My intuition tells me that I should split the file into chunks, but then, because I am expecting a JSON file, I won't be able to restore the chunks back to JSON (because each part won't be a valid JSON string).
What should I do?
Thanks a lot!
Create a decompression object using z=zlib.decompressobj(), and then do z.decompress(some_compressed_data, max), which will return no more than max bytes of uncompressed data. You then call again with z.decompress(z.unconsumed_tail, max) until the rest of some_compressed_data is consumed, and then feed it more compressed data.
You will then need to process the resulting uncompressed data a chunk at a time.
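A minimal sketch of that loop, assuming the compressed file arrives as an iterable of byte blocks (compressed_blocks) and that handle_bytes is whatever you do with each piece of output; both names are placeholders:

import zlib

OUT_CHUNK = 1024 * 1024  # produce at most 1 MB of uncompressed data per call (arbitrary)

def decompress_in_chunks(compressed_blocks, handle_bytes):
    # 16 + zlib.MAX_WBITS makes zlib accept the gzip header, as in the original call
    z = zlib.decompressobj(16 + zlib.MAX_WBITS)
    for block in compressed_blocks:
        handle_bytes(z.decompress(block, OUT_CHUNK))
        # unconsumed_tail holds compressed input that was held back by the size limit
        while z.unconsumed_tail:
            handle_bytes(z.decompress(z.unconsumed_tail, OUT_CHUNK))
    handle_bytes(z.flush())  # emit whatever is still buffered internally

Since no single chunk is valid JSON on its own, you would still feed these pieces to an incremental JSON parser (for example the third-party ijson library) or write them to disk and parse the file in a streaming fashion, rather than calling json.loads on each chunk.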
I have uploaded a .avro file of about 100 MB to Google Cloud Storage. It was converted from an 800 MB .csv file.
When trying to create a table from this file in the BigQuery web interface, I get the following error after a few seconds:
script: Resources exceeded during query execution: UDF out of memory. (error code: resourcesExceeded)
Job ID audiboxes:bquijob_4462680b_15607de51b9
I checked the BigQuery Quota Policy and I think my file does not exceed it.
Is there a workaround, or do I need to split my original .csv in order to get multiple, smaller .avro files?
Thanks in advance!
This error means that the parser used more memory than allowed. We are working on fixing this issue. In the meantime, if you used compression in the Avro files, try removing it. Using a smaller data block size will also help.
And yes, splitting the data into smaller Avro files of 10 MB or less will help too, but the two approaches above are easier if they work for you.
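If you generate the Avro files yourself in Python, this is a rough sketch of writing them uncompressed and with smaller data blocks using the fastavro library; the schema and records are placeholders, and parameter names/defaults may differ between fastavro versions:

import fastavro

schema = {
    "type": "record",
    "name": "Row",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "value", "type": "string"},
    ],
}

def write_uncompressed(path, records):
    with open(path, "wb") as out:
        # codec="null" disables compression; a smaller sync_interval (in bytes)
        # should yield smaller data blocks inside the Avro file
        fastavro.writer(out, schema, records, codec="null", sync_interval=64 * 1024)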
For the Google Drive create-file API I can see a maximum file size of 5120 GB.
So my question is: is this the maximum file size that can be uploaded in a single request? Or can a file be up to 5120 GB as long as it is uploaded in chunks of some XXX size (let's say a 2 GB maximum per chunk in a single request)?
It depends on your connection speed. It is possible to upload a 5 TB file; however, the chance that the upload eventually fails is pretty high. It is much wiser to create .rar compressed archives and upload those.
I am creating an application which requires slicing a video file (mp4 format) into chunks. Our server caps upload_max_filesize at 2 MB, but we have files that are hundreds of MB in size which need to be uploaded. So far I slice the file (into 1 MB chunks) using the HTML5 FileReader() and then upload each chunk with AJAX. (Here is part of the function I have written.)
reader.onload = function() {
    $.ajax(/* send the current blob to the server by method POST */);
};
blob = window.videoFile.slice(byteStart, byteEnd);
reader.readAsBinaryString(blob);
Here's the question: will concatenating the files (in order) on the backend and then simply setting the Content-Type like so:
header('Content-Type: video/mp4');
before saving the file actually reproduce the video file (i.e., perfectly, not as some choppy second-rate facsimile), or am I missing something here? This will take some time to implement, and the faster option may be to beg our server admin to alter the php.ini file to allow a much larger upload_max_filesize.
When I try to upload a large CSV file to the CKAN DataStore it fails and shows the following message:
Error: Resource too large to download: 5158278929 > max (10485760).
I changed the maximum size (in megabytes) of an uploaded resource to
ckan.max_resource_size = 5120
in
/etc/ckan/production.ini
What else do I need to change to upload a large CSV to CKAN?
That error message comes from the DataPusher, not from CKAN itself: https://github.com/ckan/datapusher/blob/master/datapusher/jobs.py#L250. Unfortunately it looks like the DataPusher's maximum file size is hard-coded to 10MB: https://github.com/ckan/datapusher/blob/master/datapusher/jobs.py#L28. Pushing larger files into the DataStore is not supported.
Two possible workarounds might be:
Use the DataStore API to add the data yourself (see the sketch after this list).
Change MAX_CONTENT_LENGTH, on the line of the DataPusher source code linked above, to something bigger.
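For the first workaround, here is a rough sketch of pushing the CSV into the DataStore through the action API with Python's requests library; the CKAN URL, API key, resource ID and batch size are placeholders, and it assumes the DataStore table for that resource already exists (e.g. created with datastore_create):

import csv
import json
import requests

CKAN_URL = "http://your-ckan-instance"    # placeholder
API_KEY = "your-api-key"                  # placeholder
RESOURCE_ID = "your-resource-id"          # placeholder

def upsert_batch(records):
    payload = {
        "resource_id": RESOURCE_ID,
        "records": records,
        "method": "insert",
        "force": True,
    }
    resp = requests.post(
        CKAN_URL + "/api/3/action/datastore_upsert",
        data=json.dumps(payload),
        headers={"Authorization": API_KEY, "Content-Type": "application/json"},
    )
    resp.raise_for_status()

def push_csv(path, batch_size=10000):
    # stream the CSV and send it in batches so the whole file never sits in memory
    with open(path, newline="") as f:
        batch = []
        for row in csv.DictReader(f):
            batch.append(row)
            if len(batch) >= batch_size:
                upsert_batch(batch)
                batch = []
        if batch:
            upsert_batch(batch)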