How can I split a JSON array into its own rows?

I am trying to split the NPI values into their own rows. I can get the NPI values as a comma-separated string, but I need each value on its own row. I can get all the other fields because each one is a named property with a single value, whereas npi lists multiple values under a single property name.
declare @json NVARCHAR(MAX)
SET @json = N'{
"reporting_entity_name": "ABC",
"reporting_entity_type": "Third Party Administrator",
"last_updated_on": "2022-10-05",
"version": "1.0.0",
"provider_references": [
{
"provider_group_id": 19463,
"provider_groups": [
{
"npi": [
1811971955,
1013223874,
1588677066
],
"tin": {
"type": "ein",
"value": "000000000"
}
},
{
"npi": [
1245794387,
1437585882,
1932631751,
1932482296,
1508376864,
1033654181,
1093166530,
1609300672
],
"tin": {
"type": "ein",
"value": "461621659"
}
},
{
"npi": [
1245573369,
1528219359,
1083076897
],
"tin": {
"type": "ein",
"value": "132655001"
}
},
{
"npi": [
1134452170
],
"tin": {
"type": "ein",
"value": "472304826"
}
},
{
"npi": [
1194250274
],
"tin": {
"type": "ein",
"value": "113511743"
}
},
{
"npi": [
1427558378
],
"tin": {
"type": "ein",
"value": "824264835"
}
},
{
"npi": [
1972681484,
1508846932
],
"tin": {
"type": "ein",
"value": "134009634"
}
},
{
"npi": [
1578235743,
1770726788
],
"tin": {
"type": "ein",
"value": "872533474"
}
},
{
"npi": [
1619166899,
1871648949
],
"tin": {
"type": "ein",
"value": "113531019"
}
}
]
}
]
}';
drop table if exists newtable;
select provider_group_id,tin.type,tin.value,npi
into newtable
from openjson (@json)
with
(
-- reporting_entity_name nvarchar(5),
provider_references nvarchar(max) as json
) as topinfo
cross apply openjson (topinfo.provider_references)
with
(
provider_group_id varchar(10),
provider_groups nvarchar(max) as json
) as provider_references
cross apply openjson (provider_references.provider_groups)
with
( npi nvarchar(max) as json,
tin nvarchar(max) as json
) as provider_groups
cross apply openjson (provider_groups.tin)
with
( [type] varchar(3),
[value] varchar(10)
) as tin
cross apply openjson (provider_groups.npi) as npi
select * from newtable
The output is as follows

You're almost there. In this expression
select provider_group_id,tin.type,tin.value,npi
the symbol npi binds to provider_groups.npi, which is a JSON array. You want the exploded values from the table aliased as npi, which is npi.value. So try
select provider_group_id,tin.type,tin.value tin, npi.value npi
into newtable
. . .
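For comparison, the same flattening can be sketched outside SQL. This is a minimal Python illustration, assuming a trimmed-down sample with the same shape as the question's document; the helper name flatten_npi is hypothetical and not part of the original answer. Each nested loop plays the role of one cross apply openjson step.

```python
import json

# Trimmed-down sample with the same shape as the question's document.
doc = json.loads("""
{
  "provider_references": [
    {
      "provider_group_id": 19463,
      "provider_groups": [
        {"npi": [1811971955, 1013223874], "tin": {"type": "ein", "value": "000000000"}},
        {"npi": [1134452170], "tin": {"type": "ein", "value": "472304826"}}
      ]
    }
  ]
}
""")


def flatten_npi(document):
    """Yield one (provider_group_id, tin_type, tin_value, npi) row per NPI."""
    for ref in document["provider_references"]:
        for group in ref["provider_groups"]:
            tin = group["tin"]
            # The innermost loop explodes the npi array into separate rows.
            for npi in group["npi"]:
                yield (ref["provider_group_id"], tin["type"], tin["value"], npi)


rows = list(flatten_npi(doc))
for row in rows:
    print(row)
```

The tin fields repeat on every row of a group, which mirrors what the corrected select produces.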

Related

How to project column name and value from JSON in KQL?

I have the following 'SetOfSignals' in KQL (using mv-expand):
"SetOfSignals": {
"name": "CompanyName",
"signals": [
{
"name": "AmbientAirTemperature",
"unit": "C",
"dataType": "Float32",
"values": [
"11.5"
]
},
{
"name": "AverageEnergyConsumption",
"unit": "W",
"dataType": "Float32",
"values": [
"780.0"
]
}
]
}
and now I want to project the signal names with corresponding values.
I want it to look like this:
...  AmbientAirTemperature  AverageEnergyConsumption  ...
     11.5                   780.0
but using something like | extend AmbientAirTemperature = signals.name doesn't work, since there are multiple objects within "signals" that have a "name" property.
Thanks.
datatable(SetOfSignals:dynamic)
[
dynamic
(
{
"name": "CompanyName",
"signals": [
{
"name": "AmbientAirTemperature",
"unit": "C",
"dataType": "Float32",
"values": [
"11.5"
]
},
{
"name": "AverageEnergyConsumption",
"unit": "W",
"dataType": "Float32",
"values": [
"780.0"
]
}
]
}
)
]
| mv-apply signal = SetOfSignals.signals on
(
summarize make_bag(bag_pack(tostring(signal.name), signal.values[0]))
)
| project-away SetOfSignals
| evaluate bag_unpack(bag_)
AmbientAirTemperature  AverageEnergyConsumption
11.5                   780.0
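The pivot performed by make_bag and bag_unpack can be sketched in Python to make the idea concrete. This is only an illustration under the assumption that each signal carries at least one entry in values; pivot_signals is a hypothetical helper name.

```python
# Same sample data as the datatable in the query above.
set_of_signals = {
    "name": "CompanyName",
    "signals": [
        {"name": "AmbientAirTemperature", "unit": "C",
         "dataType": "Float32", "values": ["11.5"]},
        {"name": "AverageEnergyConsumption", "unit": "W",
         "dataType": "Float32", "values": ["780.0"]},
    ],
}


def pivot_signals(record):
    """Map each signal name to its first value (one column per signal)."""
    return {
        signal["name"]: signal["values"][0]
        for signal in record["signals"]
        if signal["values"]  # skip signals with an empty values list
    }


row = pivot_signals(set_of_signals)
print(row)
```

The resulting dictionary has one key per signal name, which corresponds to the one-column-per-signal row that bag_unpack emits.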

Search an XML attribute and group it

I need to search for an XML attribute value, group the matches, and form a corresponding JSON element.
For example, Label is my attribute ID and I have 2 attributes with Label.
(In real data it can occur 'n' times.)
The 2 Labels need to be grouped into a single JSON element.
VendorClass, VendorDivision and VendorDept should all go inside the vendor JSON attribute.
The VendorClass attribute value should be mapped to classCode inside vendor, and the other 2 codes should be null.
Likewise, the VendorDept attribute value should be mapped to DeptCode and the VendorDivision attribute value to DivisionCode.
Output JSON (mapping):
type - should always be in upper case - mapped to attributeID,
name - in lower case - mapped to "attributeId_attributeValue",
id/code - is the attribute value
Below is my input XML:
<CurrentRule Currency="USD"
CurrentStatus="ACTIVE"
Priority="0"
RuleCategory="Current"
RuleType="COMBINATION"
>
<CurrentRuleTargetAttributeValueList/>
<CurrentRuleAttributeValueList>
<CurrentRuleAttributeValue
TriggerAttributeID="Label"
TriggerAttributeValue="10"/>
<CurrentRuleAttributeValue
TriggerAttributeID="Label"
TriggerAttributeValue="1003"/>
<CurrentRuleAttributeValue
TriggerAttributeID="ABCDCode"
TriggerAttributeValue="AC"/>
<CurrentRuleAttributeValue
TriggerAttributeID="ABCDCode"
TriggerAttributeValue="FD"/>
<CurrentRuleAttributeValue
TriggerAttributeID="VendorClass"
TriggerAttributeValue="00N"/>
<CurrentRuleAttributeValue
TriggerAttributeID="VendorDept"
TriggerAttributeValue="100"/>
<CurrentRuleAttributeValue
TriggerAttributeID="VendorDivision"
TriggerAttributeValue="10"/>
<CurrentRuleAttributeValue
TriggerAttributeID="VendorMarket"
TriggerAttributeValue="QVC"/>
<CurrentRuleAttributeValue
TriggerAttributeID="PriceCode"
TriggerAttributeValue="FP"/>
<CurrentRuleAttributeValue
TriggerAttributeID="ProductNumber"
TriggerAttributeValue="A0000"/>
<CurrentRuleAttributeValue
TriggerAttributeID="Trader"
TriggerAttributeValue="1010"/>
<CurrentRuleAttributeValue
TriggerAttributeID="Trader"
TriggerAttributeValue="1046"/>
</CurrentRuleAttributeValueList>
</CurrentRule>
Expected JSON o/p:
{
"header": {
"type": "ORDER",
"name": "merch-attribute-example",
"createUser": "admin"
},
"filters": {
"Labels": [
{
"type": "LABEL",
"name": "Label_10",
"LabelId": 10
},
{
"type": "LABEL",
"name": "Label_1003",
"LabelId": 1003
}
],
"vendor": [
{
"type": "VENDOR",
"name": "vendorclass_00n",
"divisionCode": null,
"departmentCode": null,
"classCode": "00N"
},
{
"type": "VENDOR",
"name": "vendordept_100",
"divisionCode": null,
"departmentCode": "100",
"classCode": null
},
{
"type": "VENDOR",
"name": "vendordivision_10",
"divisionCode": "10",
"departmentCode": null,
"classCode": null
}
],
"abcd": [
{
"type": "ABCD",
"name": "abcdcode_ac",
"abcdCode": "AC"
},
{
"type": "ABCD",
"name": "abcdcode_fd",
"abcdCode": "FD"
}
],
"priceCodes": [
{
"type": "PRICE_CODE",
"name": "pricecode_fp",
"code": "FP"
}
],
"products": [
{
"type": "PRODUCT",
"name": "productnumber_a0000",
"productNumbers": [
"A0000"
]
}
],
"traders": [
{
"type": "TRADER",
"name": "trader_1010",
"vendorCode": "1010"
},
{
"type": "TRADER",
"name": "trader_1046",
"vendorCode": "1046"
}
]
}
}
Based on the rules described, this is the best approximation. I'll leave it to you to decide whether the field is an id or a code, and to handle the rest of the output that is not clearly explained.
%dw 2.0
output application/json
---
{
header: {
"type": "ORDER",
"name": "merch-attribute-example",
"createUser": "admin"
},
filters: payload.CurrentRule.CurrentRuleAttributeValueList
mapObject ((value, key, index) ->
{
attr: key.@
}
)
pluck (($$):$)
groupBy ((item, index) -> item.attr.TriggerAttributeID)
mapObject ((value1, key1, index1) ->
(key1): value1 map
{
"type": upper(key1),
name: lower(key1) ++ "_" ++ $.attr.TriggerAttributeValue,
id: $.attr.TriggerAttributeValue
}
)
}
Output:
{
"header": {
"type": "ORDER",
"name": "merch-attribute-example",
"createUser": "admin"
},
"filters": {
"Label": [
{
"type": "LABEL",
"name": "label_10",
"id": "10"
},
{
"type": "LABEL",
"name": "label_1003",
"id": "1003"
}
],
"ABCDCode": [
{
"type": "ABCDCODE",
"name": "abcdcode_AC",
"id": "AC"
},
{
"type": "ABCDCODE",
"name": "abcdcode_FD",
"id": "FD"
}
],
"VendorClass": [
{
"type": "VENDORCLASS",
"name": "vendorclass_00N",
"id": "00N"
}
],
"VendorDept": [
{
"type": "VENDORDEPT",
"name": "vendordept_100",
"id": "100"
}
],
"VendorDivision": [
{
"type": "VENDORDIVISION",
"name": "vendordivision_10",
"id": "10"
}
],
"VendorMarket": [
{
"type": "VENDORMARKET",
"name": "vendormarket_QVC",
"id": "QVC"
}
],
"PriceCode": [
{
"type": "PRICECODE",
"name": "pricecode_FP",
"id": "FP"
}
],
"ProductNumber": [
{
"type": "PRODUCTNUMBER",
"name": "productnumber_A0000",
"id": "A0000"
}
],
"Trader": [
{
"type": "TRADER",
"name": "trader_1010",
"id": "1010"
},
{
"type": "TRADER",
"name": "trader_1046",
"id": "1046"
}
]
}
}
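The same group-by-attribute idea can be sketched in Python with the standard library, for readers who want to check the grouping logic outside DataWeave. This is a minimal sketch on a shortened version of the input XML; group_attributes is a hypothetical helper, and like the script above it emits a generic id field rather than the per-group field names from the expected output.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Shortened version of the question's input XML.
xml_text = """
<CurrentRule Currency="USD">
  <CurrentRuleAttributeValueList>
    <CurrentRuleAttributeValue TriggerAttributeID="Label" TriggerAttributeValue="10"/>
    <CurrentRuleAttributeValue TriggerAttributeID="Label" TriggerAttributeValue="1003"/>
    <CurrentRuleAttributeValue TriggerAttributeID="PriceCode" TriggerAttributeValue="FP"/>
  </CurrentRuleAttributeValueList>
</CurrentRule>
"""


def group_attributes(text):
    """Group CurrentRuleAttributeValue elements by TriggerAttributeID."""
    root = ET.fromstring(text)
    groups = defaultdict(list)
    for node in root.iter("CurrentRuleAttributeValue"):
        attr_id = node.get("TriggerAttributeID")
        attr_value = node.get("TriggerAttributeValue")
        groups[attr_id].append({
            "type": attr_id.upper(),
            "name": f"{attr_id.lower()}_{attr_value}",
            "id": attr_value,
        })
    return dict(groups)


filters = group_attributes(xml_text)
print(filters)
```

Merging VendorClass, VendorDept and VendorDivision under a single vendor key would need an extra mapping step that is left out here, as it is in the answer.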

AVRO schema for JSON

I have JSON that is generated like this, and I want to know what the Avro schema for it would be. The number of key-value entries in the array is not fixed. There are related posts, but they reference keys that do not change. In my case the names of the variable keys keep changing.
"fixedKey": [
{
"variableKey1": 2
},
{
"variableKey2": 1
},
{
"variableKey3": 3
},
.....
{
"variableKeyN" : 10
}
]
The schema should be something like this:
{
"type": "record",
"name": "test",
"fields": [
{
"name": "fixedKey",
"type": {
"type": "array",
"items": [
{"type": "map", "values": "int"}
]
}
}
]
}
Here's an example of serializing and deserializing your example data:
from io import BytesIO
from fastavro import writer, reader
schema = {
"type": "record",
"name": "test",
"fields": [
{
"name": "fixedKey",
"type": {
"type": "array",
"items": [
{"type": "map", "values": "int"},
],
},
}
],
}
records = [
{
"fixedKey": [
{
"variableKey1": 1,
},
{
"variableKey2": 2,
},
{
"variableKey3": 3,
},
]
}
]
bio = BytesIO()
writer(bio, schema, records)
bio.seek(0)
for record in reader(bio):
    print(record)

Azure Cost Management API does not allow me to select columns

I tried to use the Azure Cost Management - Query Usage API to get details (certain columns) on all costs for a given subscription. The body I use for the request is
{
"type": "Usage",
"timeframe": "BillingMonthToDate",
"dataset": {
"granularity": "Daily",
"configuration": {
"columns": [
"MeterCategory",
"CostInBillingCurrency",
"ResourceGroup"
]
}
}
}
But the response I get back is this:
{
"id": "xxxx",
"name": "xxxx",
"type": "Microsoft.CostManagement/query",
"location": null,
"sku": null,
"eTag": null,
"properties": {
"nextLink": null,
"columns": [
{
"name": "UsageDate",
"type": "Number"
},
{
"name": "Currency",
"type": "String"
} ],
"rows": [
[
20201101,
"EUR"
],
[
20201102,
"EUR"
],
[
20201103,
"EUR"
],
...
]
}
}
The JSON continues listing all the dates with the currency.
When I use the dataset.aggregation or dataset.grouping clauses in the JSON, I do get costs returned in my JSON but then I don't get the detailed column information that I want. And of course it is not possible to combine these 2 clauses with the dataset.columns clause. Anyone have any idea what I'm doing wrong?
I found a solution without using the dataset.columns clause (which might just be a faulty clause?). By grouping the data according to the columns I want, I can also get the data for those column values:
{
"type": "Usage",
"timeframe": "BillingMonthToDate",
"dataset": {
"granularity": "Daily",
"aggregation": {
"totalCost": {
"name": "PreTaxCost",
"function": "Sum"
}
},
"grouping": [
{
"type": "Dimension",
"name": "SubscriptionName"
},
{
"type": "Dimension",
"name": "ResourceGroupName"
},
{
"type": "Dimension",
"name": "meterSubCategory"
},
{
"type": "Dimension",
"name": "MeterCategory"
}
]
}
}
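A small Python sketch may help when building this body for different dimensions and when reading the columns/rows response format shown in the question. The helper names build_query_body and rows_as_dicts are hypothetical, the sample response is the fragment from the question, and no HTTP call is made here.

```python
import json


def build_query_body(dimensions, timeframe="BillingMonthToDate"):
    """Build a request body that groups costs by the given dimension names."""
    return {
        "type": "Usage",
        "timeframe": timeframe,
        "dataset": {
            "granularity": "Daily",
            "aggregation": {
                "totalCost": {"name": "PreTaxCost", "function": "Sum"}
            },
            "grouping": [{"type": "Dimension", "name": d} for d in dimensions],
        },
    }


def rows_as_dicts(properties):
    """Pair every row with the column names from the response properties."""
    names = [column["name"] for column in properties["columns"]]
    return [dict(zip(names, row)) for row in properties["rows"]]


body = build_query_body(["ResourceGroupName", "MeterCategory"])
print(json.dumps(body, indent=2))

# Response fragment taken from the question.
sample_properties = {
    "columns": [
        {"name": "UsageDate", "type": "Number"},
        {"name": "Currency", "type": "String"},
    ],
    "rows": [[20201101, "EUR"], [20201102, "EUR"]],
}
records = rows_as_dicts(sample_properties)
print(records)
```

Because the response keeps column names separate from the row values, zipping them together gives one dictionary per row, which is easier to filter by the grouped dimensions.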

Invalid response of JSON data from Django QuerySet

With this query:
def high_hazard(request):
    reference_high = FloodHazard.objects.filter(hazard='High')
    ids_high = reference_high.values_list('id', flat=True)
    flood_hazard = []
    djf = Django.Django(geodjango='geom', properties=['bldg_name', 'bldg_type'])
    geoj = GeoJSON.GeoJSON()
    for myid in ids_high:
        getgeom = FloodHazard.objects.get(id=myid).geom
        response_high = BuildingStructure.objects.filter(geom__intersects=getgeom)
        get_hazard = geoj.encode(djf.decode(response_high.transform(900913)))
        flood_hazard.append(get_hazard)
    return HttpResponse(flood_hazard, content_type='application/json')
I was able to filter the BuildingStructure model by FloodHazard type, in this case the "high" value. Although the view returns JSON data, the output is malformed. I guess this is because the loop tests every geometry from the FloodHazard model, so it returns several empty FeatureCollections alongside the populated ones, all concatenated into one response, which makes the JSON invalid. The output of the query above looks like this:
{
"crs": null,
"type": "FeatureCollection",
"features": [
]
}{
"crs": null,
"type": "FeatureCollection",
"features": [
]
}{
"crs": null,
"type": "FeatureCollection",
"features": [
{
"geometry": {
"type": "MultiPoint",
"coordinates": [
[
13974390.863509608,
1020340.6129766875
]
]
},
"type": "Feature",
"id": 3350,
"properties": {
"bldg_name": "",
"bldg_type": ""
}
},
{
"geometry": {
"type": "MultiPoint",
"coordinates": [
[
13974400.312472697,
1020356.5477410051
]
]
},
"type": "Feature",
"id": 3351,
"properties": {
"bldg_name": "",
"bldg_type": ""
}
}
]
}
When I test it with a JSON validator, it is invalid. Is there a way to restructure this JSON (using underscore.js or jQuery) to produce the output below, or do I need to change my query?
{
"crs": null,
"type": "FeatureCollection",
"features": [
{
"geometry": {
"type": "MultiPoint",
"coordinates": [
[
13974390.863509608,
1020340.6129766875
]
]
},
"type": "Feature",
"id": 3350,
"properties": {
"bldg_name": "",
"bldg_type": ""
}
},
{
"geometry": {
"type": "MultiPoint",
"coordinates": [
[
13974400.312472697,
1020356.5477410051
]
]
},
"type": "Feature",
"id": 3351,
"properties": {
"bldg_name": "",
"bldg_type": ""
}
}
]
}
In other words, ignore or remove every FeatureCollection without values and group all those with values. Here is the result of the query above for reference.
Instead of
return HttpResponse(flood_hazard, content_type='application/json')
Try
return HttpResponse(json.dumps(flood_hazard), content_type='application/json')
You will have to import json at the top.
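If a single FeatureCollection is wanted, as in the question's desired output, the collections can be merged before the response is built. The following is a minimal sketch, assuming each element of flood_hazard is a GeoJSON string as returned by geoj.encode; merge_feature_collections is a hypothetical helper and the sample strings are shortened stand-ins for the real features.

```python
import json


def merge_feature_collections(encoded_collections):
    """Combine several GeoJSON FeatureCollection strings into one dict."""
    features = []
    for encoded in encoded_collections:
        collection = json.loads(encoded)
        # Empty collections contribute nothing, so they drop out naturally.
        features.extend(collection.get("features", []))
    return {"crs": None, "type": "FeatureCollection", "features": features}


# Shortened stand-ins for the strings collected in flood_hazard.
samples = [
    '{"crs": null, "type": "FeatureCollection", "features": []}',
    '{"crs": null, "type": "FeatureCollection", "features": [{"type": "Feature", "id": 3350}]}',
    '{"crs": null, "type": "FeatureCollection", "features": [{"type": "Feature", "id": 3351}]}',
]
merged = merge_feature_collections(samples)
print(json.dumps(merged))
```

In the view, the merged dictionary would then be passed through json.dumps before being handed to HttpResponse, so the response body is one valid JSON document.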