As with the bq command-line tool, can the BigQuery googleSheetsOptions range (externalDataConfiguration.googleSheetsOptions.range) be used when defining BigQuery queries in Apps Script?
https://cloud.google.com/bigquery/docs/reference/rest/v2/tables
For example, given a table definition like google_sheets_tabeledef.json below, can I pass this table definition to a BigQuery method in Apps Script?
{
"autodetect": false,
"sourceFormat": "GOOGLE_SHEETS",
"sourceUris": [
"https://docs.google.com/spreadsheets/d/xxxxxxxxxxxxxx"
],
"maxBadRecords": 1,
"googleSheetsOptions":
{
"range": "test_sheet!A1:B20",
"skipLeadingRows": 0
},
"schema" : {
"fields": [
{
"name": "col1",
"type": "string"
},
{
"name": "col2",
"type": "int64"
}
]
}
}
Related question:
bigQuery Google Drive query multiple sheets with googleSheetsOptions range
I believe this has now been added to the table definition in BigQuery (answer written in August 2019). When defining the table via the web page, there is now a range field, as shown in the screenshot below:
[Screenshot: specify range in BigQuery Google Sheets table connector]
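As a rough sketch of how the table definition from the question lines up with the tables REST resource linked there: the definition sits under the externalDataConfiguration field, next to a tableReference. The project, dataset, and table IDs below are hypothetical placeholders. That a body of this shape is what the Apps Script BigQuery advanced service expects for a table insert is an assumption, not something confirmed in this thread.

```python
import json

# Table definition copied from the question (trailing comma removed so it parses).
TABLE_DEF = {
    "autodetect": False,
    "sourceFormat": "GOOGLE_SHEETS",
    "sourceUris": ["https://docs.google.com/spreadsheets/d/xxxxxxxxxxxxxx"],
    "maxBadRecords": 1,
    "googleSheetsOptions": {"range": "test_sheet!A1:B20", "skipLeadingRows": 0},
    "schema": {
        "fields": [
            {"name": "col1", "type": "string"},
            {"name": "col2", "type": "int64"},
        ]
    },
}


def build_table_resource(table_def, project_id, dataset_id, table_id):
    """Wrap a bq-style table definition in a REST-style table resource body."""
    return {
        "tableReference": {
            "projectId": project_id,
            "datasetId": dataset_id,
            "tableId": table_id,
        },
        # The whole definition, including googleSheetsOptions.range, goes here.
        "externalDataConfiguration": dict(table_def),
    }


# Hypothetical identifiers, for illustration only.
resource = build_table_resource(TABLE_DEF, "my-project", "my_dataset", "sheet_range_table")
print(json.dumps(resource, indent=2))
```

Keeping the definition as plain data means the same dictionary can be serialized for a REST call or pasted into a script as an object literal.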
I have a Slate dashboard that uses a Fusion sheet for backend data, and the Fusion service API / Fusion queries to retrieve data from Fusion sheets.
I want to write data into a Fusion sheet where one of the columns holds array-type data.
Does anyone know how to write an array data type back into a Fusion sheet using a Fusion query?
What I tried
Giving this payload to the Fusion query results in an Invalid argument error:
data = {
rows: [
"columnID" : {
"type": "untypedString",
"untypedString": ['a','b']
}
]}
Giving this payload to the Fusion query writes the data as-is (I thought it would be interpreted as an array in the Fusion sheet cell):
data = {
rows: [
"columnID" : {
"type": "untypedString",
"untypedString": "=array('a','b')"
}
]}
What I want
To write array data into a Fusion sheet.
Could you try this:
{
"rows": [
{
"columnID": {
"type": "cellValue",
"cellValue": {
"type": "stringArray",
"stringArray": [
"A", "B"
]
}
}
}
]
}
Instead of this:
{
"rows": [
{
"columnID": {
"type": "untypedString",
"untypedString": "test"
}
}
]
}
For reference, you can find this in the dev docs by looking at the specific query you are using and the objects it accepts as arguments.
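If the payload is built in code, a small helper can keep the nesting straight. This is only a sketch of the structure shown above: the helper names are made up for illustration, and "columnID" is the same placeholder used in the question.

```python
import json


def string_array_cell(values):
    """Build a cell object of the cellValue / stringArray shape shown above."""
    return {
        "type": "cellValue",
        "cellValue": {"type": "stringArray", "stringArray": list(values)},
    }


def build_payload(rows):
    """Wrap row dictionaries (column ID -> cell object) in the top-level object."""
    return {"rows": list(rows)}


# "columnID" is a placeholder, as in the question.
payload = build_payload([{"columnID": string_array_cell(["A", "B"])}])
print(json.dumps(payload, indent=2))
```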
I'm setting up a project in Node.js, and for testing I get the data from a spreadsheet in JSON format using the following page: https://developers.google.com/sheets/api/reference/rest/v4/spreadsheets.values/batchGet
To get the data for a sheet, I put the spreadsheet ID in the spreadsheetId field and the sheet name in the range field.
This is the JSON I get:
{
"spreadsheetId": "{spreadsheetId}",
"valueRanges": [
{
"range": "Parameters!A1:Z1000",
"majorDimension": "ROWS",
"values": [
[
"Country",
"COLOMBIA"
]
]
}
]
}
What I want is to retrieve the spreadsheet's properties, specifically the title and the creation and modification dates. To do this, I put the string title in the fields parameter, but I get the following error:
{
"error": {
"code": 400,
"message": "Request contains an invalid argument.",
"status": "INVALID_ARGUMENT",
"details": [
{
"@type": "type.googleapis.com/google.rpc.BadRequest",
"fieldViolations": [
{
"field": "title",
"description": "Error expanding 'fields' parameter. Cannot find matching fields for path 'title'."
}
]
}
]
}
}
I have tried putting * but I get the same JSON.
My problem: How can I get the date of creation and the date of last modification of the spreadsheet?
You must use this endpoint to get a spreadsheet's properties: https://developers.google.com/sheets/api/reference/rest/v4/spreadsheets/get
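A short sketch of what the request could look like. The fields path title fails on batchGet because that response only contains spreadsheetId and valueRanges; on spreadsheets.get the title sits under properties, so the path is properties.title. As far as I know, the spreadsheet resource carries no creation or modification timestamps; those are file metadata exposed by the Drive API (createdTime and modifiedTime on files.get), so the second helper is an addition beyond the endpoint named above. The helper names and the example ID are hypothetical, and authentication is left out.

```python
from urllib.parse import quote, urlencode


def sheets_get_url(spreadsheet_id, fields="properties.title"):
    """URL for spreadsheets.get, limited to the requested fields."""
    query = urlencode({"fields": fields}, safe=",")
    return f"https://sheets.googleapis.com/v4/spreadsheets/{quote(spreadsheet_id, safe='')}?{query}"


def drive_file_url(file_id, fields="createdTime,modifiedTime"):
    """URL for Drive files.get; a spreadsheet's ID is also its Drive file ID."""
    query = urlencode({"fields": fields}, safe=",")
    return f"https://www.googleapis.com/drive/v3/files/{quote(file_id, safe='')}?{query}"


# "abc123" is a made-up ID for illustration.
print(sheets_get_url("abc123"))
print(drive_file_url("abc123"))
```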
I'm currently trying to extract data out of Log Analytics through its REST API. I have been successful at using a Copy Data activity to store the response in an Azure Data Lake Gen 2 account.
The format is roughly similar to the example from the Log Analytics API Reference Page.
{
"tables": [
{
"name": "PrimaryResult",
"columns": [
{
"name": "Category",
"type": "string"
},
{
"name": "count_",
"type": "long"
}
],
"rows": [
[
"Administrative",
20839
],
[
"Recommendation",
122
],
[
"Alert",
64
],
[
"ServiceHealth",
11
]
]
}
]
}
My dataset is much larger, with more columns and more values, but the principles are the same.
What I am trying to do is generate a new JSON file that holds the table as multiple documents in the same file, e.g.:
[{
"Category": "Administrative",
"count_": 20839
},
{
"Category": "Recommendation",
"count_": 122
},
{
"Category": "Alert",
"count_": 64
},
{
"Category": "ServiceHealth",
"count_": 11
}]
The output of this would be stored back into the data lake and then ideally could be used as a source for a copy activity to go into an Azure SQL Database.
I have tried to accomplish this with the Flatten transformation in Data Flows, but without success so far: when I try to map the column names, it doesn't see the individual column names, only the level of the document where the column names are defined.
How would I go about flattening the dataset so it appears as desired? Is this an unrealistic expectation of Data flows or is this task more suitable for something like Azure Databricks?
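For what it's worth, the reshaping itself is small if it can run in a general-purpose language (in a Databricks notebook, for example). The sketch below is not a Data Flows solution. It assumes the response has already been read into a Python dictionary, and the function name is made up for illustration. It pairs each row's values with the column names by position.

```python
import json


def tables_to_records(response, table_name="PrimaryResult"):
    """Turn one table of a Log Analytics style response into a list of dicts."""
    for table in response["tables"]:
        if table["name"] == table_name:
            names = [column["name"] for column in table["columns"]]
            # Row values are positional, so zip them with the column names.
            return [dict(zip(names, row)) for row in table["rows"]]
    raise KeyError(f"table {table_name!r} not found")


# Sample data taken from the question.
sample = {
    "tables": [
        {
            "name": "PrimaryResult",
            "columns": [
                {"name": "Category", "type": "string"},
                {"name": "count_", "type": "long"},
            ],
            "rows": [
                ["Administrative", 20839],
                ["Recommendation", 122],
                ["Alert", 64],
                ["ServiceHealth", 11],
            ],
        }
    ]
}

records = tables_to_records(sample)
print(json.dumps(records, indent=2))
```

The result has the shape of the desired output above and can be written back to the data lake as a single JSON array.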
I'm trying to load a public dataset from Google Cloud into BigQuery (quickdraw_dataset). The data is in JSON format as below:
{
"key_id":"5891796615823360",
"word":"nose",
"countrycode":"AE",
"timestamp":"2017-03-01 20:41:36.70725 UTC",
"recognized":true,
"drawing":[[[129,128,129,129,130,130,131,132,132,133,133,133,133,...]]]
}
The issue that I'm running into is that the "drawing" field is a nested array. I gather from reading other posts that you can't read nested arrays into BigQuery? This post suggests that one way around this issue is to read in the array as a string. But when I use the following schema, I get the error shown below it:
[
{
"name": "key_id",
"type": "STRING"
},
{
"name": "word",
"type": "STRING"
},
{
"name": "countrycode",
"type": "STRING"
},
{
"name": "timestamp",
"type": "STRING"
},
{
"name": "recognized",
"type": "BOOLEAN"
},
{
"name": "drawing",
"type": "STRING"
}
]
Error while reading data, error message: JSON parsing error in row starting at position 0: Array specified for non-repeated field: drawing.
Is there a way to read this dataset into BigQuery?
Thanks in advance!
Load each line as a single-column CSV row (using a tab as the field delimiter, so the whole JSON line stays in one column), then parse it inside BigQuery.
Load:
bq load --F \\t temp.eraser gs://quickdraw_dataset/full/simplified/eraser.ndjson row
Query:
SELECT JSON_EXTRACT_SCALAR(row, '$.countrycode') a
, JSON_EXTRACT_SCALAR(row, '$.word') b
, JSON_EXTRACT_ARRAY(row, '$.drawing')[OFFSET(0)] c
FROM temp.eraser
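A different route, closer to the "read the array as a string" idea from the question, is to rewrite the file before loading so that drawing really is a string. This is only a sketch and assumes you can preprocess the NDJSON yourself. The function name is hypothetical, and this is not what the bq command above does.

```python
import json


def stringify_drawing(line):
    """Re-encode one NDJSON line so the nested 'drawing' array becomes a JSON string."""
    record = json.loads(line)
    # Compact separators keep the embedded string short.
    record["drawing"] = json.dumps(record["drawing"], separators=(",", ":"))
    return json.dumps(record)


# Shortened example line; the real drawings are much longer.
example = '{"key_id":"5891796615823360","word":"nose","drawing":[[[129,128],[130,131]]]}'
converted = stringify_drawing(example)
print(converted)
```

With every line converted this way, the STRING type for drawing in the schema from the question matches the data, and the string can still be parsed inside BigQuery with the JSON functions used in the query above.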
I am reverse engineering an app that sends queries to
SOMESERVERNAME.analysis.windows.net/public/reports/querydata via an HTTP POST of a JSON-structured query.
Some initial lines of a sample query are at the end of this message.
I can't find any documentation on this anywhere, and I don't know whether this is some secret API. Ultimately I would like to ignore the aggregations altogether and just dump the raw data, which seems to sit in some flat-file type container on the back end. Without API documentation, though, I'm stuck re-running the handful of very basic queries I've been able to intercept.
Note: this app is an embedded analytics page created with Power BI, but the only REST API I can find for Power BI has nothing to do with querying; it only covers basic object management.
Thanks!
{
"version": "1.0.0",
"queries": [
{
"Query": {
"Commands": [
{
"SemanticQueryDataShapeCommand": {
"Query": {
"Version": 2,
"From": [
{
"Name": "s",
"Entity": "Sheet1"
}
],
"Select": [
{
"Aggregation": {
"Expression": {
"Column": {
"Expression": {
"SourceRef": {
"Source": "s"
}
},
"Property": "Total"
}
},
"Function": 0
},
"Name": "Sum(Sheet1.Total)"
}
],
"Where": [
{
"Condition": {
"In": {
"Expressions": [
{
"Column": {
"Expression": {
"SourceRef": {
"Source": "s"
}
},
"Property": "Year"
}
}
],
"Values": [
[
{
"Literal": {
"Value": "'2018'"
}
}
]
]
}
}
},
............
I have built a client that scrapes data off a specific Power BI report using the same API, but probably you'll be able to adapt it to your use case. Maybe we can even abstract the code into a more generalized Power BI client!
Having tinkered with the API for two days, I realised that there are many ways the data can be formatted:
"nested"/multidimensional data can be unflattened, flattened by 1 degree, etc.
a primary "table" of a result dataset (in data.PH) can reference others (in data.SH)
The basics are as follows:
A dataset is structured like a multidimensional table, with cells containing values.
In a set of cells, the first always has a field S that contains the schema of its and all subsequent cells.
The schema maps a field of each cell's object with a selection from your query, e.g. the G0 field with the queried column age.
My client seems to work only with a specific type of query (SemanticQueryDataShapeCommand), a specific number of dimensions, and a specific column marked as primary (via Binding.Primary). But maybe it helps! https://github.com/derhuerst/fetch-bvg-occupancy/blob/1ebb864b1ff7130f9d2f0ab031c6d78bcabdd633/lib/parse-dataset.js
The only documented way to use this API is through the ADOMD.NET or OleDb provider.
If you want to send a DAX/MDX query and retrieve data programmatically, there's a sample of how to front-end the service with a simple REST API here.