I am trying to join two input sources on Google Cloud Platform: one from BigQuery and the other a .csv file in Google Cloud Storage. Using a Joiner looks like the obvious option.
But I am curious whether the same can be achieved using the table lookup: column 'table' directive. The input records will come from BigQuery, and the 'table' will refer to the .csv file in Google Cloud Storage. Is it possible to achieve this with Wrangler alone, without using a Joiner?
Yes, absolutely: you can use Wrangler instead of a Joiner to connect two data sources, apply basic transformations, and export the result to a sink in Google Cloud Platform.
For your specific scenario (input records from BigQuery, the lookup 'table' from a .csv file in Google Cloud Storage), please check this tutorial, which contains the specific steps on how to achieve it.
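As a rough illustration of what such a lookup accomplishes (each input record enriched from a small reference table, i.e. a left join), here is a minimal Python sketch. The field names and in-memory data are hypothetical; in Data Fusion the same step would be a single Wrangler directive or a Joiner plugin.

```python
import csv
import io

# Hypothetical lookup table, standing in for the .csv file in Cloud Storage.
lookup_csv = io.StringIO("code,region\nUS,Americas\nDE,Europe\n")
lookup = {row["code"]: row["region"] for row in csv.DictReader(lookup_csv)}

# Hypothetical input records, standing in for the BigQuery source.
records = [{"id": 1, "code": "US"}, {"id": 2, "code": "DE"}]

# Enrich each record with the looked-up value (a left join on 'code';
# records with no match get None rather than being dropped).
for rec in records:
    rec["region"] = lookup.get(rec["code"])

print(records)
```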
Related
Can I query using the wildcard feature in BigQuery from external tables stored as CSVs on Google Cloud Storage?
The CSV files live in a Google Cloud Storage bucket, and each file holds a different partition / chunk of the data, like this:
org_score_p1
org_score_p2
...
org_score_p99
Also, I expect the number of files in the bucket to keep growing, so new files will be added with the same naming scheme.
Yes. However, you need to make sure that:
your Google Cloud Storage bucket is configured as multi-regional
your bucket's multi-regional location is set to the same place as the one where you are running your BigQuery jobs.
Otherwise you will get an error / exception similar to this one:
Cannot read and write in different locations: source: US-EAST4, destination: US
Is it possible to stream/download CSV data files from a live Plotly chart through API access in real time? I am setting up a system that accepts data from various sensors and plots it online using Plotly, with the following requirements:
the data has to be accessible from multiple locations / by multiple users
API-level access to the actual data (numbers in CSV or similar) in real time (a few seconds' lag is acceptable)
Would Plotly be the right tool for this? I couldn't find any resources on downloading data in real time through API-level access on their site.
Thanks in advance!
Unfortunately, there is no API for retrieving stream data through Plotly: the streaming service is intended for displaying data, and the data is not persisted.
If you are plotting data that you are collecting in real time, you should be able to aggregate it and save it to a database or a CSV file at the same time you stream it to Plotly for display.
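A minimal sketch of that pattern in Python: append each incoming reading to a local CSV as it is sent to the chart. The `send_to_plotly` stub stands in for whatever streaming call you use; it is not a real Plotly API, and the file name is hypothetical.

```python
import csv
import time

def send_to_plotly(reading):
    # Stub: in a real setup this would push the point to your streaming chart.
    pass

def handle_reading(reading, csv_path="sensor_log.csv"):
    """Persist a sensor reading to CSV while also streaming it for display."""
    with open(csv_path, "a", newline="") as f:
        csv.writer(f).writerow([reading["ts"], reading["sensor"], reading["value"]])
    send_to_plotly(reading)

# Example: log two hypothetical readings as they arrive.
handle_reading({"ts": time.time(), "sensor": "temp1", "value": 21.5})
handle_reading({"ts": time.time(), "sensor": "temp1", "value": 21.7})
```

Since the CSV is written server-side, any number of users or locations can then read it back (or you can point the writer at a shared database instead of a local file).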
I see that Google supports both standard files and shortcuts. I don't understand which format I should create for storing my model permanently, or whether I can do that at all. Can I use Google Realtime as simple cloud storage (what previous generations of programmers would simply have called a database)?
If you have binary contents, use a standard file. Otherwise, use a shortcut file.
If you are using the Realtime API and plan to store all data with it, then you can use a shortcut file and associate the Realtime document with it.
I'm not sure what you mean by simple cloud storage, but you could use the Realtime API to store arbitrary key-value pairs.
Is it possible to use Google BigQuery to channel Google Documents into Tableau? Is there another way to do this without downloading the documents from drive onto a local machine?
If you are using Tableau 8.0, there is a data connection item for Google BigQuery, so I would say yes.
What kind of database can be created to handle data in Google Drive?
I want to create a Google Drive app where I have to save some data (text) for future manipulation.
I don't exactly understand your question, but you can save any file to Google Drive. So if you want to back up a file-based database like SQLite3, you can.
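For example, SQLite keeps the whole database in one ordinary file, so backing it up to Drive is just uploading that file. A small sketch (the Drive upload step is omitted; `app_data.db` and the table name are hypothetical):

```python
import sqlite3

# Create (or open) a single-file database -- this one file is what you
# would upload to Google Drive with the Drive API to back it up.
conn = sqlite3.connect("app_data.db")
conn.execute("CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO notes (body) VALUES (?)", ("some text to keep",))
conn.commit()

rows = conn.execute("SELECT body FROM notes").fetchall()
print(rows)
conn.close()
```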