In BigQuery, I want to create a table and then load it from a CSV file on my local drive, in a single query.
I know the statements below are not correct; I'm looking for an example of how to do it.
I can create the table, but I am not able to insert. Or is there another method (upsert, merge?)?
CREATE OR REPLACE TABLE Project1.DataSet_Creation.tbl_Store_List_Full
( Store_Nbr string(255),Sister_Store string(255))
,
INSERT INTO Project1.DataSet_Creation.tbl_Store_List_Full (Store_Nbr,Sister_Store)
FROM C:\GCP_Transition\tbl_Store_List_Full.csv
AFAIK, for this purpose you need to use the BigQuery web UI: in a project tab, click "Create table", choose the CSV file with the upload method, enable auto detect if it is disabled, and set header rows to skip to 1 so that BigQuery picks up your column names from the CSV's title row instead of treating it as data, as the docs suggest.
https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-csv#loading_csv_data_into_a_table
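If you'd rather not click through the UI, a rough equivalent (not a single SQL query, but a single command) is the bq command-line tool, along the lines of: bq load --source_format=CSV --autodetect --skip_leading_rows=1 DataSet_Creation.tbl_Store_List_Full C:\GCP_Transition\tbl_Store_List_Full.csv (the dataset, table and path here just mirror your example; bq load creates the table if it does not exist yet).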
I have a BigQuery table where I added a new column and am not sure how I can append data to its rows.
(Screenshots of the BigQuery table and of the csv/excel file were attached here.) I did try to upload the csv directly as a new table but had errors, and am now trying to update the column named 'Max_Takeoff_kg'; it's the last column in the csv. How do I write a query within BigQuery to update the rows with the data in the last column of the csv?
If you're only loading your data this one time, I'd recommend that you save your XLS as CSV and try to create a new table again.
Anyway, you can update your table using BigQuery DML, as documented in the BigQuery DML reference.
It's important to remember that, in your case, for this approach to work correctly you must have a way to identify your rows uniquely.
Example:
UPDATE your_db.your_table
SET your_field = <value>
WHERE <condition_to_identify_row_uniquely>
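In your specific case, a sketch of that (assuming you load the CSV into a separate staging table first, and that some column, here a made-up Aircraft_Name, identifies each row uniquely) could look like:
UPDATE your_db.your_table t
SET Max_Takeoff_kg = s.Max_Takeoff_kg
FROM your_db.your_staging_table s
WHERE t.Aircraft_Name = s.Aircraft_Name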
I hope it helps
I think that what I want to do is not feasible at the moment, but I want to clarify.
I have a bucket, say bucketA, with files served to the public, and a bucket, say bucketB, where access logs of bucketA are stored in a specific CSV format.
What I want to do is run SQL queries against these access logs. The problem I have is that the logs are stored in different CSVs (one per hour, I think). I tried to import them through the BigQuery UI, but it seems that there is a one-to-one CSV-to-table mapping. When you define the input location, the placeholder and documentation ask you to put a gs://<bucket_name>/<path_to_input_file>.
Based on the above, my question is: Is it possible to upload all the files in a bucket to a single BigQuery table, with something like an "*" asterisk operator?
Once the table is constructed, what happens when more files with data get stored in the bucket? Do I need to re-run, or is there a scheduler?
Is it possible to upload all the files in a bucket to a single BigQuery table, with something like an "*" asterisk operator?
You can query them directly in GCS (federated source), or load them all into a native table; you can use * in both cases:
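For the external (federated) option, the DDL version of that looks roughly like this (bucket path, dataset and table names are placeholders, and this assumes the CSVs share the same layout with one header row):
CREATE EXTERNAL TABLE your_dataset.access_logs_ext
OPTIONS (
  format = 'CSV',
  uris = ['gs://bucketB/*.csv'],
  skip_leading_rows = 1
);
For a native table, the same gs://bucketB/*.csv wildcard URI works as the source of an ordinary load job.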
Once the table is constructed, what happens when more files with data get stored in the bucket? Do I need to re-run, or is there a scheduler?
If you leave it as an external table, then each time you query, BigQuery will scan all the files, so you'll get the new files/data. If you load it as a native table, then you'll need to schedule a job yourself to append each new file to your table.
Using the BigQuery web UI, I have created a new table plus some initial data with the standard upload-CSV method.
For quick testing, how do I use the BigQuery web UI to insert more new data into the existing table?
I realized I CANNOT copy and paste multiple INSERT statements into the Query editor textbox.
INSERT INTO dataset.myschema VALUES ('new value1', 'more value1');
INSERT INTO dataset.myschema VALUES ('new value2', 'more value2');
Wow, then it would be tedious to insert new rows of data one by one.
Luckily, BigQuery supports INSERT statements that use the VALUES syntax to insert multiple rows.
INSERT INTO dataset.myschema VALUES ('new value1', 'more value1'),
('new value2', 'more value2');
I created a table in BigQuery in the cloud application. By mistake I uploaded two CSV files into the BigQuery table.
How do I delete the data from either one or both CSV files from the BigQuery table?
Thanks
Arvind
Unfortunately, there is currently no way to remove data from a BigQuery table. Your best option is to re-import the data in a new table. (If you no longer have the original CSV, you can export the table and then remove the duplicates before re-importing.)
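If the mistake was loading the same file twice (so every row now appears twice), one sketch of the "new table" route in today's standard SQL (table names are placeholders) is simply:
CREATE OR REPLACE TABLE my_dataset.my_table_dedup AS
SELECT DISTINCT * FROM my_dataset.my_table;
That only works for exact duplicates and plain column types, of course; otherwise export, clean up, and re-import as described above.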
I wonder if there is a (native) possibility to create a MySQL table from an .xls or .xlsx spreadsheet. Note that I do not want to import a file into an existing table with LOAD DATA INFILE or INSERT INTO, but to create the table from scratch, i.e. using the header row as column names (with some default field type, e.g. INT) and then insert the data in one step.
So far I have used a Python script to build a CREATE statement and imported the file afterwards, but somehow that approach feels clumsy.
There is no native MySQL tool that does this, but MySQL's PROCEDURE ANALYSE might help suggest the correct column types.
With a VB Script you could do that. At my client we have a script which takes the worksheet name, the heading names and the field formats and generates a SQL script containing a CREATE TABLE and the INSERT INTO statements. We use Oracle, but for MySQL the principle is the same.
Of course you could do it in an even more sophisticated way by accessing MySQL from Excel via ODBC and posting the CREATE TABLE and INSERT INTO statements that way.
I cannot provide you with the script, as it is the property of my client, but I can answer your questions on how to write such a script if you want to write one.
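To give an idea of what such a generated script contains, it is just plain SQL along these lines (the table and columns here are invented for illustration):
CREATE TABLE my_sheet (
  id INT,
  name VARCHAR(255),
  amount INT
);
INSERT INTO my_sheet (id, name, amount) VALUES
  (1, 'Alice', 100),
  (2, 'Bob', 200);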
As the title says: I've got a bunch of tab-separated text files containing data.
I know that if I use 'CREATE TABLE' statements to set up all the tables manually, I can then import them into the waiting tables, using 'load data' or 'mysqlimport'.
But is there any way in MySQL to create tables automatically based on the tab files? Seems like there ought to be. (I know that MySQL might have to guess the data type of each column, but you could specify that in the first row of the tab files.)
No, there isn't. You need to CREATE a TABLE first in any case.
Automatically creating tables and guessing field types is not part of the DBMS's job. That is a task best left to an external tool or application (That then creates the necessary CREATE statements).
If you're willing to type the data types in the first row, why not type a proper CREATE TABLE statement?
Then you can export the Excel data as a txt file and use
LOAD DATA INFILE 'path/file.txt' INTO TABLE your_table;
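Tab is already MySQL's default field separator, so the bare statement above works for tab-separated files; if your first line holds column names or type hints rather than data, a slightly fuller version (still generic, with placeholder names) would be:
LOAD DATA INFILE 'path/file.txt'
INTO TABLE your_table
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;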