While obtaining a record from Firebase in App Inventor, the data is returned in the following format:
{ a=1 , b=2, c=3}
What is this format? Is there any way it can be converted into a standard format like JSON?
P.S. Replacing '=' with ':' does not work either.
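The {a=1, b=2, c=3} form looks like the toString() output of a Java Map, which is plausibly what the App Inventor Firebase component hands back; it is not JSON because neither the keys nor the values are quoted, which is also why a plain '=' to ':' replacement fails. A minimal Python sketch of the conversion (the helper name is made up), assuming flat scalar values with no nested braces and no commas or '=' inside the values:

import json

def curly_equals_to_json(s: str) -> str:
    """Convert a '{a=1, b=2, c=3}' style string into a JSON string.
    Assumes flat scalar values; everything comes out as a string."""
    inner = s.strip().strip("{}")
    pairs = (item.split("=", 1) for item in inner.split(","))
    obj = {key.strip(): value.strip() for key, value in pairs}
    return json.dumps(obj)

print(curly_equals_to_json("{ a=1 , b=2, c=3}"))  # {"a": "1", "b": "2", "c": "3"}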
I was trying to access the data inside an object. The table is a PySpark DataFrame, but some of the values are in JSON format. I need to access the date from it and convert it into a meaningful format. Any help or idea would be a great relief.
This is what I'm working with:
I was able to extract the data that is in array format using:
data_df = df_deidentifieddocuments_tst.select(
    "_id", explode("annotationId").alias("annotationId")
).select("_id", "annotationId.*")
The above code doesn't work for the date field, as it throws a type mismatch error:
AnalysisException: cannot resolve 'explode(createdAt)' due to data type mismatch: input to function explode should be array or map type, not struct<$date:bigint>;
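Since createdAt is a struct rather than an array, explode() doesn't apply; the inner field can be selected directly instead. A minimal sketch, assuming the struct<$date:bigint> holds epoch milliseconds (the usual MongoDB export layout) and reusing the DataFrame name from above; the backticks guard the $ in the field name:

from pyspark.sql.functions import col

# Select the struct's inner field directly and turn the epoch-millisecond
# value into a proper timestamp (the cast interprets seconds, hence /1000).
date_df = df_deidentifieddocuments_tst.select(
    "_id",
    (col("createdAt.`$date`") / 1000).cast("timestamp").alias("createdAt"),
)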
Athena (Trino SQL): parsing a JSON document (a table column called document1 in Athena) using fields (dot notation)
If the underlying JSON (the table column document1 in Athena) is in the form of {a={b ...
I can parse it in Athena (Trino SQL) using
document1.a.b
However, if the JSON contains {a={"text": value1 ...
the quote marks will not parse correctly.
Is there a way to do JSON parsing of a 'field' with quotes?
If not, is there an elegant way of parsing the "text" and obtaining the string in value1? [Please see my comment below.]
I cannot change the quotes in the JSON or its Athena "table", so I would need something that works in Trino SQL syntax.
The error message is in the form of: SQL Error [100071] [HY000]: [Simba][AthenaJDBC](100071) An error has been thrown from the AWS Athena client. SYNTAX_ERROR: Expression [redacted] is not of type ROW
NOTE: This is not a duplicate of Oracle Dot Notation Question
Dot notation works only for columns of struct<…> types. You can use it for JSON data, but judging from the error and your description, this does not seem to be the case here. I assume your column is of type string.
If you have JSON data in a string column, you can use the JSON functions to parse it and extract parts of it with JSONPath expressions.
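For example, a minimal sketch, assuming document1 is a string column containing JSON like {"a": {"text": "value1"}} (the table name your_table is hypothetical):

SELECT json_extract_scalar(document1, '$.a.text') AS text_value
FROM your_table;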
Given the following code:
import csv
import io

# Serialize the DataFrame to CSV in an in-memory buffer, stream it to
# BigQuery, and on failure log the CSV that was sent before re-raising.
with io.StringIO() as buf:
    buf.write(df_data.to_csv(header=True, index=False, quoting=csv.QUOTE_NONNUMERIC))
    buf.seek(0)
    try:
        job = self.client.load_table_from_file(buf, dest_table)
        job.result()
    except Exception:
        buf.seek(0)
        LOG.error("Failed to upload dataframe as csv: \n\n%s\n", buf.read())
        raise
I am trying to load a pandas DataFrame into a BigQuery table by converting it to a CSV first. The problem I am faced with is that the BigQuery API fails with
google.api_core.exceptions.BadRequest: 400 Error while reading data, error message: Could not parse 'date_key' as DATE for field date_key (position 3) starting at location 0 with message 'Unable to parse'
I looked at this other issue, and there seems to be a limitation on the accepted formats for DATEs when loading a CSV file.
That being said, the output printed by the except block above is the following:
ERROR utils.database._bigquery:_bigquery.py:255 Failed to upload dataframe as csv:
"clinic_key","schedule_template_time_interval_key","schedule_template_key","date_key","schedule_owner_key","schedule_template_schedule_track_key","schedule_content_label_key","start_time_key","end_time_key","priority"
"clitest11111111111111111111111","1","1","2021-01-01","1","1","1","19:00:00","21:00:00",1
"clitest11111111111111111111111","1","1","2021-01-01","1","1","2","20:00:00","20:30:00",2
"clitest11111111111111111111111","1","1","2021-01-01","1","1","3","20:20:00","20:30:00",3
Which to me seems to be a clearly well-formatted CSV file.
So my question is: how can I make BigQuery accept my CSV? What do I have to change?
N.B.: I know there's a load_table_from_dataframe method on the bigquery.client.Client object, but I faced another issue that forced me to attempt the CSV method instead. See the link to the other issue here.
You need to drop the header row.
The location 0 in the error suggests BigQuery dislikes the first row: it is trying to parse the header text 'date_key' as a DATE value.
Your other date values look correct (YYYY-MM-DD).
Because column order is significant in CSV, BigQuery maps the fields to its table columns by position, so the header row is unnecessary.
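A minimal sketch of the two ways to apply this, reusing the names from your snippet (df_data, buf, dest_table); the job_config route keeps the header but tells BigQuery to skip it:

import csv

from google.cloud import bigquery

# Option 1: write the CSV without the header row in the first place.
csv_body = df_data.to_csv(header=False, index=False, quoting=csv.QUOTE_NONNUMERIC)

# Option 2: keep the header, but tell the load job to skip it.
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,  # ignore the header row
)
buf.seek(0)
job = self.client.load_table_from_file(buf, dest_table, job_config=job_config)
job.result()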
I have an S3 bucket with many JSON files.
JSON file example:
{"id":"x109pri", "import_date":"2017-11-06"}
The "import_date" field is DATE type in standard format YYYY-MM-DD.
I am creating a database in Athena to link all these JSON files.
However, when I create a new table in Athena and specify this field's type as DATE, I get "Internal error" with no other explanation provided. To clarify, the table gets created just fine, but if I try to preview or query it, I get this error.
However, when I specify this field as STRING, it works fine.
So the question is: is this a bug, or what is the correct value for the Athena DATE format?
The date column type does not work with certain combinations of SerDe and/or data source.
For example, using a DATE column with org.openx.data.jsonserde.JsonSerDe fails, while org.apache.hive.hcatalog.data.JsonSerDe works.
So with the following table definition, querying your JSON will work.
CREATE EXTERNAL TABLE datetest (
  id string,
  import_date date
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION 's3://bucket/datetest'
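As a quick sanity check (the table and the literal date mirror the sample file from the question):

SELECT id, import_date
FROM datetest
WHERE import_date = DATE '2017-11-06';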
I am using Spark Jobserver https://github.com/spark-jobserver/spark-jobserver and Apache Spark for some analytic processing.
I am receiving back the following structure from the jobserver when a job finishes:
"status": "OK",
"result": [
"[17799.91015625,null,hello there how areyou?]",
"[50000.0,null,Hi, im fine]",
"[0.0,null,All good]"
]
The result doesn't contain valid JSON, as explained here:
https://github.com/spark-jobserver/spark-jobserver/issues/176
So I'm trying to convert the returned structure into a JSON structure. However, I can't simply insert single quotes into the result string based on the comma delimiter, as the result sometimes contains commas itself.
How can I convert a Spark SQL Row into a JSON object in the above situation?
I actually found a better way in the end: from Spark 1.3.0 onwards you can use .toJSON on a DataFrame to convert it to JSON:
df.toJSON.collect()
To output a DataFrame's schema as JSON you can use:
df.schema.json
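The two snippets above are Scala; here is a minimal PySpark sketch of the same idea, with a made-up one-row DataFrame standing in for the job result. Note that toJSON omits fields that are null:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("tojson-demo").getOrCreate()

# A tiny stand-in DataFrame; the explicit schema string pins the type of
# the all-null column so Spark's type inference doesn't fail.
df = spark.createDataFrame(
    [(17799.91015625, None, "hello there how are you?")],
    schema="amount double, flag string, message string",
)

print(df.toJSON().collect())
# ['{"amount":17799.91015625,"message":"hello there how are you?"}']

print(df.schema.json())  # the schema serialized as a JSON string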