Azure Stream Analytics - JSON - json

I am trying to pass the JSON below through Azure Stream Analytics to an Azure SQL server. The data comes from Azure IoT Hub and is arriving without problems.
"nodes": {
"SN0013A20041E23697": {
"firmware_version": 5,
"transmission_count": 42,
"reserve_byte": 0,
"battery_level": 3.29406,
"type": 32,
"node_id": 0,
"rssi": 9,
"mass_concentration_pm_1_0": 0.88,
"mass_concentration_pm_2_5": 1.04,
"mass_concentration_pm_4_0": 1.13,
"mass_concentration_pm_10_0": 1.17,
"number_concentration_pm_0_5": 5.73,
"number_concentration_pm_1_0": 6.92,
"number_concentration_pm_2_5": 7.07,
"number_concentration_pm_4_0": 7.09,
"number_concentration_pm_10_0": 7.09,
"typical_particle_size": 0.48,
"humidity": 45.35,
"temperature": 20.84
},
"SN0013A20041E2367B": {
"firmware_version": 5,
"transmission_count": 43,
"reserve_byte": 0,
"battery_level": 2.99782,
"type": 32,
"node_id": 0,
"rssi": 16,
"mass_concentration_pm_1_0": 1.35,
"mass_concentration_pm_2_5": 1.43,
"mass_concentration_pm_4_0": 1.43,
"mass_concentration_pm_10_0": 1.43,
"number_concentration_pm_0_5": 9.13,
"number_concentration_pm_1_0": 10.77,
"number_concentration_pm_2_5": 10.83,
"number_concentration_pm_4_0": 10.83,
"number_concentration_pm_10_0": 10.83,
"typical_particle_size": 0.41,
"humidity": 45.72,
"temperature": 20.2
}
}
I can use a query like this, and it passes through one of the devices but not the other.
SELECT
"nodes"."SN0013A20041E23697"."temperature" as Temperature
, "nodes"."SN0013A20041E23697"."humidity" as Humidity
FROM input
Is there a way to pass through both devices in the same query?
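A minimal Python sketch of the reshaping being asked for, written outside Stream Analytics: walk every key under "nodes" instead of naming one serial number, and emit one row per device. The function name rows_per_node and the DeviceId column are made up for illustration. Inside a Stream Analytics query the equivalent would, as far as I know, be built with CROSS APPLY and GetRecordProperties rather than Python.

```python
def rows_per_node(event):
    """Yield one flat row per device found under event["nodes"]."""
    for serial, readings in event.get("nodes", {}).items():
        yield {
            "DeviceId": serial,  # hypothetical column name
            "Temperature": readings.get("temperature"),
            "Humidity": readings.get("humidity"),
        }


# Trimmed-down copy of the payload from the question.
event = {
    "nodes": {
        "SN0013A20041E23697": {"humidity": 45.35, "temperature": 20.84},
        "SN0013A20041E2367B": {"humidity": 45.72, "temperature": 20.2},
    }
}

rows = list(rows_per_node(event))
for row in rows:
    print(row)
```

Because the device serial becomes a value rather than a column path, the same logic keeps working when more devices appear in the payload.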

Related

pandas json_normalize columns created as dtype object

I have a json object served from an api as follows:
{
"workouts": [
{
"id": 92527291,
"starts": "2021-06-28T15:42:44.000Z",
"minutes": 30,
"name": "Indoor Cycling",
"created_at": "2021-06-28T16:12:57.000Z",
"updated_at": "2021-06-28T16:12:57.000Z",
"plan_id": null,
"workout_token": "ELEMNT BOLT A1B3:59",
"workout_type_id": 12,
"workout_summary": {
"id": 87540207,
"heart_rate_avg": "152.0",
"calories_accum": "332.0",
"created_at": "2021-06-28T16:12:58.000Z",
"updated_at": "2021-06-28T16:12:58.000Z",
"power_avg": "185.0",
"distance_accum": "17520.21",
"cadence_avg": "87.0",
"ascent_accum": "0.0",
"duration_active_accum": "1801.0",
"duration_paused_accum": "0.0",
"duration_total_accum": "1801.0",
"power_bike_np_last": "186.0",
"power_bike_tss_last": "27.6",
"speed_avg": "9.73",
"work_accum": "332109.0",
"file": {
"url": "https://cdn.wahooligan.com/wahoo-cloud/production/uploads/workout_file/file/FPoJBPZo17BvTmSomq5Y_Q/2021-06-28-154244-ELEMNT_BOLT_A1B3-59-0.fit"
}
}
}
],
"total": 55,
"page": 1,
"per_page": 1,
"order": "descending",
"sort": "starts"
}
I want to get the data into a dataframe. However, lots of the columns seem to have a dtype of object. I assume that this is because some of the numeric values in the JSON are double-quoted. What is the best and most efficient way to avoid this (the JSON potentially has many workouts elements)?
Should I fix the returned JSON, or iterate through the dataframe columns and convert the objects to floats?
Thank you
Martyn
IIUC, you can try:
df = pd.json_normalize(
    json_data, record_path='workouts',
    meta=['total', 'page', 'per_page', 'order', 'sort'],
).convert_dtypes()
Try using pandas.to_numeric. Here are the docs:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_numeric.html
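A rough sketch of the pandas.to_numeric suggestion, assuming the goal is to convert every column whose values all parse as numbers and leave the rest (names, timestamps, URLs) alone. As far as I can tell, convert_dtypes() on its own does not parse quoted numbers, so the conversion is done column by column here. The sample payload is a trimmed stand-in for the API response, and the example.invalid URL is a placeholder.

```python
import pandas as pd

# Trimmed-down stand-in for the API response in the question.
json_data = {
    "workouts": [
        {
            "id": 92527291,
            "minutes": 30,
            "name": "Indoor Cycling",
            "workout_summary": {
                "id": 87540207,
                "heart_rate_avg": "152.0",
                "power_avg": "185.0",
                "file": {"url": "https://example.invalid/ride.fit"},
            },
        }
    ],
    "total": 55,
    "page": 1,
}

df = pd.json_normalize(json_data, record_path="workouts", meta=["total", "page"])

# Try each column in turn; keep the original values when parsing fails.
for col in df.columns:
    try:
        df[col] = pd.to_numeric(df[col])
    except (ValueError, TypeError):
        pass  # not a numeric column (e.g. name, URL)

print(df.dtypes)
```

This touches each column once, so the cost should grow with the number of columns rather than with the number of workouts elements.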

How can I insert following JSON in MongoDB collection as different documents

I need to insert data into MongoDB, but the JSON I am getting has multiple values in every field, and I don't know how to split them so they can be inserted as different documents.
I want to insert the array data as separate objects in MongoDB:
{
"activity_template_id": [
1,
2,
3,
4,
5,
7
],
"done_date": [
"2019-08-10",
"2019-08-10",
"2019-08-10",
"0000-01-01",
"0000-01-01",
"0000-01-01"
],
"is_prescribed": [
"N",
"N",
"N",
"N",
"N",
"Y"
],
"material_id": [
1,
5,
21,
10,
14,
0
],
"qty": [
"1",
"1",
"1",
"0",
"0",
"0"
],
"unit_id": [
1,
1,
25,
0,
0,
0
]
}
(As far as I know) there is no feature in MongoDB itself that would process input data like that. You would do that in application code before calling MongoDB.
If there is no separate application, you can use standard JavaScript functions within the mongo shell to do that.
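The application-side split could look roughly like this in Python. It assumes every array has the same length and that position i in each array belongs to the same document. The helper name split_into_documents is made up for illustration, and the actual insert is only mentioned in a comment because it needs a running server.

```python
payload = {
    "activity_template_id": [1, 2, 3, 4, 5, 7],
    "done_date": ["2019-08-10", "2019-08-10", "2019-08-10",
                  "0000-01-01", "0000-01-01", "0000-01-01"],
    "is_prescribed": ["N", "N", "N", "N", "N", "Y"],
    "material_id": [1, 5, 21, 10, 14, 0],
    "qty": ["1", "1", "1", "0", "0", "0"],
    "unit_id": [1, 1, 25, 0, 0, 0],
}


def split_into_documents(columns):
    """Turn {"field": [v0, v1, ...]} into [{"field": v0}, {"field": v1}, ...]."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) not in (0, 1):
        raise ValueError("all arrays must have the same length")
    keys = list(columns)
    return [dict(zip(keys, row)) for row in zip(*columns.values())]


docs = split_into_documents(payload)
print(docs[0])

# With pymongo (an assumed client library, not mentioned in the question),
# collection.insert_many(docs) would then store each dict as its own document.
```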

reading and editing json file

I am new to Python and trying to edit a .json file that looks like the dictionary below:
{
"name": "MB_NDE_AX667_ECU[500-2000]",
"physical_quantity": "acceleration",
"unit": "m/s2",
"okrangelow": 0,
"okrangehigh": 10,
"input": "ch2_ECU1.rms",
"history": {
"ds": 30,
"timer": 24,
"files": 30
},
"reg": 44
},
My goal is to get another JSON file in which the dictionary is restructured into a single-line dictionary, e.g.:
{"name":"MB_NDE_AX667_ECU[500-2000]","physical_quantity":"acceleration",......}
How could I do this?
Thanks
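One possible approach, sketched with only the standard library: load the file with json, then write it back with compact separators so the whole object lands on one line. The file names and the helper minify_json_file are placeholders, and the sample content is a shortened version of the dictionary above. Note that the trailing comma after the closing brace in the snippet would have to go (or the file would need to be a list) before json can parse it.

```python
import json
import tempfile
from pathlib import Path


def minify_json_file(src, dst):
    """Read JSON from src and write it to dst on a single line."""
    data = json.loads(Path(src).read_text(encoding="utf-8"))
    # separators=(",", ":") drops the spaces json.dumps adds by default.
    Path(dst).write_text(json.dumps(data, separators=(",", ":")), encoding="utf-8")


# Demo on a temporary file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "input.json"
    dst = Path(tmp) / "output.json"
    src.write_text(
        '{\n  "name": "MB_NDE_AX667_ECU[500-2000]",\n  "unit": "m/s2",\n'
        '  "history": {"ds": 30}\n}',
        encoding="utf-8",
    )
    minify_json_file(src, dst)
    result = dst.read_text(encoding="utf-8")

print(result)
```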

Different REST representations of the same resource

I have the following situation when designing a REST API.
For example, I have a list of daily prices:
[
{"id": 50,
"date": "2018-01-05"},
{"id": 60,
"date": "2018-01-06"},
{"id": 70,
"date": "2018-01-10"}
]
First I want to get all the prices in certain period, for example in January with GET /prices/?startDate=2018-01-01&endDate=2018-01-31 and it would return the results as seen above.
Secondly, I want to get prices for the same period but with price=0 where no price exists, such as:
[
{"price": 50,
"date": "2018-01-05"},
{"price": 60,
"date": "2018-01-06"},
{"price": 0,
"date": "2018-01-07"},
{"price": 0,
"date": "2018-01-08"},
{"price": 0,
"date": "2018-01-09"},
{"price": 70,
"date": "2018-01-10"},
{"price": 0,
"date": "2018-01-11"}
]
Could I go with a new endpoint for this, such as /prices/in-range/?startDate=2018-01-01&endDate=2018-01-31?
Would that be misleading? Should a REST API return data that does not exist, such as price=0, or should it be left to the client to manipulate the original data?
Is there a naming convention in REST for "derived" data from one resource to another? I am not choosing what data to display, I am basically creating new data here.
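For comparison, if the gap filling were left to the client, it might look roughly like the Python sketch below. It assumes the entries carry a "price" key (the first listing above uses "id") and that dates are ISO strings. The function name fill_missing_prices is made up for illustration.

```python
from datetime import date, timedelta


def fill_missing_prices(prices, start, end, default=0):
    """Return one entry per day from start to end (inclusive), using default where no price exists."""
    by_date = {p["date"]: p["price"] for p in prices}
    filled = []
    for offset in range((end - start).days + 1):
        key = (start + timedelta(days=offset)).isoformat()
        filled.append({"price": by_date.get(key, default), "date": key})
    return filled


prices = [
    {"price": 50, "date": "2018-01-05"},
    {"price": 60, "date": "2018-01-06"},
    {"price": 70, "date": "2018-01-10"},
]

filled = fill_missing_prices(prices, date(2018, 1, 5), date(2018, 1, 11))
for entry in filled:
    print(entry)
```

The same function could just as well live on the server behind a query parameter or a separate endpoint; the sketch only shows how small the derivation is, not where it should run.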

JSON Formatting error

I am getting this error while trying to import this JSON into a Google BigQuery table:
file-00000000: JSON table encountered too many errors, giving up. Rows: 1; errors: 1. (error code: invalid)
JSON parsing error in row starting at position 0 at file: file-00000000. Start of array encountered without start of object. (error code: invalid)
This is the JSON
[{'instrument_token': 11192834, 'average_price': 8463.45, 'last_price': 8471.1, 'last_quantity': 75, 'buy_quantity': 1065150, 'volume': 5545950, 'depth': {'buy': [{'price': 8471.1, 'quantity': 300, 'orders': 131072}, {'price': 8471.0, 'quantity': 300, 'orders': 65536}, {'price': 8470.95, 'quantity': 150, 'orders': 65536}, {'price': 8470.85, 'quantity': 75, 'orders': 65536}, {'price': 8470.7, 'quantity': 225, 'orders': 65536}], 'sell': [{'price': 8471.5, 'quantity': 150, 'orders': 131072}, {'price': 8471.55, 'quantity': 375, 'orders': 327680}, {'price': 8471.8, 'quantity': 1050, 'orders': 65536}, {'price': 8472.0, 'quantity': 1050, 'orders': 327680}, {'price': 8472.1, 'quantity': 150, 'orders': 65536}]}, 'ohlc': {'high': 8484.1, 'close': 8336.45, 'low': 8422.35, 'open': 8432.75}, 'mode': 'quote', 'sell_quantity': 998475, 'tradeable': True, 'change': 1.6151959167271395}]
http://jsonformatter.org/ also gives a parse error for this JSON block. I need help understanding where the formatting is wrong. This is the JSON from a REST API.
This is not valid JSON. JSON uses double quotes, not single quotes. Also, True should be true.
If I had to guess, I would guess that this is Python code being passed off as JSON. :-)
I suspect that even once this is made into correct JSON, it's not the format Google BigQuery is expecting. From https://cloud.google.com/bigquery/data-formats#json_format, it looks like you should have a text file with one JSON object per line. Try just this:
{"mode": "quote", "tradeable": true, "last_quantity": 75, "buy_quantity": 1065150, "depth": {"buy": [{"quantity": 300, "orders": 131072, "price": 8471.1}, {"quantity": 300, "orders": 65536, "price": 8471.0}, {"quantity": 150, "orders": 65536, "price": 8470.95}, {"quantity": 75, "orders": 65536, "price": 8470.85}, {"quantity": 225, "orders": 65536, "price": 8470.7}], "sell": [{"quantity": 150, "orders": 131072, "price": 8471.5}, {"quantity": 375, "orders": 327680, "price": 8471.55}, {"quantity": 1050, "orders": 65536, "price": 8471.8}, {"quantity": 1050, "orders": 327680, "price": 8472.0}, {"quantity": 150, "orders": 65536, "price": 8472.1}]}, "change": 1.6151959167271395, "average_price": 8463.45, "ohlc": {"close": 8336.45, "high": 8484.1, "open": 8432.75, "low": 8422.35}, "instrument_token": 11192834, "last_price": 8471.1, "sell_quantity": 998475, "volume": 5545950}
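If the text really is a Python repr, as guessed above, a conversion along these lines might produce the one-object-per-line layout. This is a sketch under that assumption: ast.literal_eval only accepts plain Python literals, the sample string is a shortened version of the record in the question, and the helper name python_literal_to_ndjson is made up.

```python
import ast
import json


def python_literal_to_ndjson(text):
    """Parse a Python-literal dict or list of dicts and return newline-delimited JSON."""
    records = ast.literal_eval(text)
    if isinstance(records, dict):
        records = [records]
    # json.dumps emits double quotes and lowercase true/false.
    return "\n".join(json.dumps(record) for record in records)


raw = (
    "[{'instrument_token': 11192834, 'last_price': 8471.1, "
    "'tradeable': True, 'ohlc': {'high': 8484.1, 'low': 8422.35}}]"
)

ndjson = python_literal_to_ndjson(raw)
print(ndjson)
```

Each element of the outer list becomes its own line, so the leading [ that triggered "Start of array encountered without start of object" is gone.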
Even as a valid JSON record, OP's data would not work with BigQuery, and here's why:
Google BigQuery supports JSON objects {}, one object per line.
This basically means that you cannot supply a list [] as JSON records and expect BigQuery to detect it. You must always have one JSON object per line.
Finally, I highly recommend reading json.org for more information on the different forms of JSON structures.