I know how to extract table data to Cloud Storage using the bq extract command, but I would rather like to know whether there are any options to extract a BigQuery table as newline-delimited JSON to my local machine.
I can extract table data to GCS via the CLI and also download JSON data from the web UI, but I am looking for a solution using the bq CLI to download table data as JSON to my local machine. Is that even possible?
You need to use Google Cloud Storage for your export job. Exporting data from BigQuery is explained here; also check the variants for the different path syntaxes.
Then you can download the files from GCS to your local storage; the gsutil tool can help you download them from GCS to your local machine.
So you first need to export to GCS, then transfer the files to your local machine.
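If you want to script those two steps instead of running bq extract and gsutil cp by hand, here is a minimal sketch using the Python client libraries (project, dataset, table, and bucket names are placeholders):

from google.cloud import bigquery, storage  # client-library equivalents of bq and gsutil

bq_client = bigquery.Client()

# Step 1: export the table to GCS as newline-delimited JSON (what bq extract does)
job_config = bigquery.ExtractJobConfig(
    destination_format=bigquery.DestinationFormat.NEWLINE_DELIMITED_JSON)
extract_job = bq_client.extract_table(
    "my-project.my_dataset.my_table",       # placeholder table
    "gs://my-bucket/export/table-*.json",   # placeholder GCS path (wildcard for sharded output)
    job_config=job_config)
extract_job.result()  # wait for the export job to finish

# Step 2: download the exported file(s) to the local machine (what gsutil cp does)
for blob in storage.Client().list_blobs("my-bucket", prefix="export/"):
    blob.download_to_filename(blob.name.split("/")[-1])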
If you use the bq CLI tool, you can set the output format to JSON and redirect the output to a file. This way you can achieve some level of export locally, but it has certain limits (for example, on the number of rows you can pull down this way, since it goes through a query rather than an export job).
This exports the first 1000 rows as JSON:
bq --format=prettyjson query --n=1000 "SELECT * from publicdata:samples.shakespeare" > export.json
It's possible to extract data directly to your local machine, without using GCS, by using the bq CLI.
Please see my other answer for details: BigQuery Table Data Export
Related
I've got 70,000+ CSV files in an S3 bucket. They all have the same headers. I would like to combine the files into one CSV, which I want to download onto my machine.
Using AWS Athena, I seem to be most of the way there. I have created a database from the S3 bucket. I can then run queries like this:
select * from my_table_name limit 100
And see the results of the query (which in my case is combining many CSVs from S3) in the Athena console.
However, when I go to "Download results" for that query, I can't open the CSV in Excel (or a text editor).
Doing
file -b my_table_name.csv
returns "data", i.e. the file type is not recognized.
I'm confused because I can visually see the results of my Athena query but can't download them in a usable file format. Am I missing something obvious about how to download this data? Why isn't it giving me a normal (perhaps UTF-8) CSV?
In the Athena settings, I had encryption of query results turned on, so the downloaded file was not readable as a plain CSV. Turning encryption off solved it.
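For reference, here is a rough sketch of the same flow done outside the console with boto3, assuming query-result encryption is off so the output object is a plain CSV (database, table, and bucket names are placeholders):

import time
import boto3

athena = boto3.client("athena")
s3 = boto3.client("s3")

# Run the combining query; results land in the configured S3 output location
qid = athena.start_query_execution(
    QueryString="SELECT * FROM my_table_name",
    QueryExecutionContext={"Database": "my_database"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)["QueryExecutionId"]

# Wait until the query finishes
while True:
    state = athena.get_query_execution(QueryExecutionId=qid)["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(2)

# Athena writes the result as <OutputLocation>/<QueryExecutionId>.csv
if state == "SUCCEEDED":
    s3.download_file("my-athena-results", qid + ".csv", "combined.csv")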
I am using Spark on a Google Dataproc cluster. I have created a dictionary in a Jupyter notebook which I want to dump into my GCS bucket. However, it seems the usual way of dumping to JSON using fopen() does not work in the case of GCS. So how can I write my dictionary as a .json file to GCS? Or is there any other way to get the dictionary into GCS?
It's funny: I can write a Spark dataframe to GCS without any hassle, but apparently I can't put JSON on GCS unless I have it on my local system first!
Please help!
Thank you.
The file in GCS is not in your local file system, which is why you cannot call "fopen" on it. You can either save to GCS directly by using a GCS client (for example, as shown in this tutorial), or treat the GCS location as an HDFS-style destination (for example, saveAsTextFile("gs://...")).
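For the first option, a minimal sketch assuming the google-cloud-storage client library is available on the cluster (bucket and object names are placeholders):

import json
from google.cloud import storage

my_dict = {"name": "example", "value": 42}   # the dictionary to dump

client = storage.Client()                    # picks up the Dataproc VM's default credentials
blob = client.bucket("my-bucket").blob("output/my_dict.json")   # placeholder destination

# Serialize the dict and upload it straight to GCS, no local file needed
blob.upload_from_string(json.dumps(my_dict), content_type="application/json")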
I have an Excel file that people use to edit data outside Azure Data Explorer (Kusto). What is the Kusto code I would use to ingest this data into Kusto as needed?
So far it seems I need to use:
.create table TableName (Name:type, Name:type)
to create a table.
If my CSV file is stored in OneDrive, what is the syntax to fill the table? Assume the file name is Sample.csv.
A OneDrive location is not directly supported by Azure Data Explorer. However, there are other options:
Using ingestion commands - you will need to place the files in Azure Storage first.
One Click Ingestion - a feature of the Web Explorer tool that can also create the table for you. You can either download the files to your local computer or place them in Azure Storage.
Using the "Import data from local file" feature of Kusto Explorer (the Windows client); this only works for local files.
Besides streaming a CSV file yourself and painstakingly executing an insert for each line of data, is it possible to use the Google Cloud SDK to import an entire CSV file in bulk from inside a Cloud Function? I know that in the GCP console you can go to the Import tab, select a file from storage, and just import it. How can I emulate this programmatically?
In general, one has to parse the .csv and generate SQL from it; one line in the .csv would be represented by one INSERT statement. First, you would have to upload the file - or pull it into the Cloud Function's temporary storage, e.g. with gcs.bucket.file(filePath).download.
The easiest route might be to use a library, e.g. csv2sql-lite - with the big downside that you do not have full control over the import - while e.g. csv-parse would provide a little more control (e.g. checking for possible duplicates, skipping some columns, importing into different tables, whatever).
... and in order to connect to Cloud SQL, see Connecting from Cloud Functions.
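The libraries mentioned above are Node packages; purely as an illustration, here is the same parse-and-insert idea sketched in Python (bucket, file, table, and connection details are placeholders, and the Cloud SQL instance is assumed to be MySQL reachable over the /cloudsql unix socket):

import csv
import pymysql                      # assumes a MySQL Cloud SQL instance
from google.cloud import storage

def import_csv(request):
    # Pull the CSV from GCS into the function's temporary storage
    storage.Client().bucket("my-bucket").blob("data.csv").download_to_filename("/tmp/data.csv")

    # Connect to Cloud SQL through the unix socket exposed to Cloud Functions
    conn = pymysql.connect(
        unix_socket="/cloudsql/my-project:us-central1:my-instance",
        user="dbuser", password="dbpass", database="mydb")

    with open("/tmp/data.csv", newline="") as f, conn.cursor() as cur:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        # One INSERT per CSV line, sent as a batch
        cur.executemany("INSERT INTO my_table (col_a, col_b) VALUES (%s, %s)", list(reader))

    conn.commit()
    conn.close()
    return "done"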
How can I export one database table from parse.com into a *.csv file that is stored on Parse online?
I once received a file in the following format and now I need to produce one like that on my own:
http://files.parsetfss.com/f0e70754-45fe-43c2-5555-6a8a0795454f/tfss-63214f6e-1f09-481c-83a2-21a70d52091f-STUDENT.csv
So, the question is: how can I do this? I have not found a dashboard function for it yet.
Thank you very much
You can create a job in Cloud Code that queries through all the rows in the table and generates CSV data for each. This data can then be saved to a Parse file for access by URL.
If you are looking to simply export a class every once in a while and you are on a Mac, check out ParseToCSV on the Mac App Store. It works very well.