How to group data by two values with some conditions - json

I have a use case in which a few transactions need to be grouped first by date and then by type. We don't want to group every type, only specific ones, and we also want to calculate the total amount for each group.
For example, we need to group the data only if the type is buy or sell; otherwise we don't want to group the data by type.
Current JSON response:
{
"transactions": [
{
"date": "2022-06-16",
"type": "withdrawal",
"amount": 30
},
{
"date": "2022-06-16",
"type": "sell",
"amount": 30
},
{
"date": "2022-06-16",
"type": "sell",
"amount": 30
},
{
"date": "2022-06-16",
"type": "withdrawal",
"amount": 30
}
]
}
Expected JSON response:
{
"groupedTransactions": {
"2022-06-16": [
{
"type": "sell",
"aggregatedAmount": 60,
"transactions": [
{
"date": "2022-06-16",
"type": "sell",
"amount": 30
},
{
"date": "2022-06-16",
"type": "sell",
"amount": 30
}
]
},
{
"type": "withdrawal",
"amount": 30
},
{
"type": "withdrawal",
"amount": 30
}
]
}
}
I was thinking of first grouping all the data by date and then adding a condition to group by type, but that is not giving the expected response. Please help or give me some pointers to explore further.
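The two-step idea described above (group by date, then group only selected types) can be sketched as follows. This is a minimal Python sketch, not a definitive implementation: the names `GROUPED_TYPES` and `group_transactions` are made up for illustration, and placing the grouped entries before the ungrouped ones is an assumption taken from the expected response shown in the question.

```python
# Types that should be collapsed into one aggregated entry per date.
# Assumption: only "buy" and "sell" are grouped, as stated in the question.
GROUPED_TYPES = {"buy", "sell"}


def group_transactions(transactions, grouped_types=GROUPED_TYPES):
    """Group by date, then aggregate only the types in grouped_types."""
    # Step 1: bucket transactions by date, keeping first-seen date order.
    by_date = {}
    for tx in transactions:
        by_date.setdefault(tx["date"], []).append(tx)

    result = {}
    for date, day_txs in by_date.items():
        groups = {}      # type -> aggregated entry
        ungrouped = []   # entries for types that are left as they are
        for tx in day_txs:
            tx_type = tx["type"]
            if tx_type in grouped_types:
                # Step 2a: accumulate into the group for this type.
                group = groups.setdefault(
                    tx_type,
                    {"type": tx_type, "aggregatedAmount": 0, "transactions": []},
                )
                group["aggregatedAmount"] += tx["amount"]
                group["transactions"].append(tx)
            else:
                # Step 2b: keep other types as individual entries.
                ungrouped.append({"type": tx_type, "amount": tx["amount"]})
        # Assumption: grouped entries come first, as in the expected response.
        result[date] = list(groups.values()) + ungrouped

    return {"groupedTransactions": result}
```

Keeping the grouped and ungrouped entries in separate collections avoids a special case inside the aggregation loop; if the original ordering of entries matters, the two lists could be merged differently.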

Related

How to order the json response by type in kotlin

JSON response:
{
"transactions": [
{
"date": "2022-06-16",
"type": "buy"
},
{
"date": "2022-06-16",
"type": "sell"
},
{
"date": "2022-06-16",
"type": "withdrawal"
}
]
}
I want to order this data by type, i.e. (withdrawal, sell, buy, other), but I don't know how to compare these values.
Expected JSON response:
{
"transactions": [
{
"date": "2022-06-16",
"type": "withdrawal"
},
{
"date": "2022-06-16",
"type": "sell"
},
{
"date": "2022-06-16",
"type": "buy"
},
{
"date": "2022-06-16",
"type": "dividend"
}
]
}
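One common way to compare values that have no natural ordering is to map each one to a rank and sort on the rank. The sketch below is in Python rather than Kotlin and only illustrates the idea; `TYPE_ORDER` and `sort_by_type` are hypothetical names, and sending every unlisted type (such as dividend) to the end is an assumption based on the "other" bucket in the question.

```python
# Desired order of known types; anything not listed is treated as "other".
TYPE_ORDER = ["withdrawal", "sell", "buy"]


def sort_by_type(transactions, order=TYPE_ORDER):
    """Return the transactions sorted by the position of their type in order."""
    rank = {tx_type: index for index, tx_type in enumerate(order)}
    # Unknown types get a rank past the end of the list, so they sort last.
    # sorted() is stable, so entries with equal rank keep their input order.
    return sorted(transactions, key=lambda tx: rank.get(tx["type"], len(order)))
```

The same rank lookup can serve as the key selector for a sort in other languages, including Kotlin's `sortedBy`.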

Azure Cost Management API does not allow me to select columns

I tried to use the Azure Cost Management - Query Usage API to get details (certain columns) on all costs for a given subscription. The body I use for the request is
{
"type": "Usage",
"timeframe": " BillingMonthToDate ",
"dataset": {
"granularity": "Daily",
"configuration": {
"columns": [
"MeterCategory",
"CostInBillingCurrency",
"ResourceGroup"
]
}
}
}
But the response I get back is this:
{
"id": "xxxx",
"name": "xxxx",
"type": "Microsoft.CostManagement/query",
"location": null,
"sku": null,
"eTag": null,
"properties": {
"nextLink": null,
"columns": [
{
"name": "UsageDate",
"type": "Number"
},
{
"name": "Currency",
"type": "String"
} ],
"rows": [
[
20201101,
"EUR"
],
[
20201102,
"EUR"
],
[
20201103,
"EUR"
],
...
]
}
}
The JSON continues listing all the dates with the currency.
When I use the dataset.aggregation or dataset.grouping clauses in the JSON, I do get costs returned, but then I don't get the detailed column information that I want. And of course it is not possible to combine these two clauses with the dataset.columns clause. Does anyone have any idea what I'm doing wrong?
I found a solution without using the dataset.columns clause (which might just be a faulty clause?). By grouping the data according to the columns I want, I can also get the data for those column values:
{
"type": "Usage",
"timeframe": "BillingMonthToDate",
"dataset": {
"granularity": "Daily",
"aggregation": {
"totalCost": {
"name": "PreTaxCost",
"function": "Sum"
}
},
"grouping": [
{
"type": "Dimension",
"name": "SubscriptionName"
},
{
"type": "Dimension",
"name": "ResourceGroupName"
},
{
"type": "Dimension",
"name": "meterSubCategory"
},
{
"type": "Dimension",
"name": "MeterCategory"
}
]
}
}
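The grouping workaround above can be generated from a list of dimension names, and the columns/rows layout of the response can be turned into one record per row. This is a rough Python sketch that only builds and reshapes JSON and does not call the API; `build_grouped_query` and `rows_to_records` are hypothetical helper names, and the response layout is assumed to be the one shown in the question.

```python
def build_grouped_query(dimensions, timeframe="BillingMonthToDate",
                        granularity="Daily"):
    """Build a request body that groups by each of the given dimensions."""
    return {
        "type": "Usage",
        "timeframe": timeframe,
        "dataset": {
            "granularity": granularity,
            # Same aggregation as in the body above: summed PreTaxCost.
            "aggregation": {
                "totalCost": {"name": "PreTaxCost", "function": "Sum"}
            },
            # One grouping entry per requested dimension.
            "grouping": [
                {"type": "Dimension", "name": name} for name in dimensions
            ],
        },
    }


def rows_to_records(properties):
    """Pair each row's values with the column names from the response."""
    names = [column["name"] for column in properties["columns"]]
    return [dict(zip(names, row)) for row in properties["rows"]]
```

For example, `build_grouped_query(["SubscriptionName", "ResourceGroupName"])` yields a body with two grouping entries, and `rows_to_records` applied to the `properties` object of a response gives dictionaries keyed by column name.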

Azure Data Factory Copy Activity

I have been working on this for a couple of days and cannot get past this error. I have 2 activities in this pipeline. The first activity copies data from an ODBC connection to an Azure database, which is successful. The 2nd activity transfers the data from one Azure table to another Azure table and keeps failing.
The error message is:
Copy activity met invalid parameters: 'UnknownParameterName', Detailed message: An item with the same key has already been added..
I do not see any invalid parameters or unknown parameter names. I have rewritten this multiple times, both using their add-activity code template and by hand, and I do not receive any errors when deploying, only when it is running. Below is the JSON pipeline code.
Only the 2nd activity is receiving an error.
Thanks.
Source Data set
{
"name": "AnalyticsDB-SHIPUPS_06shp-01src_AZ-915PM",
"properties": {
"structure": [
{
"name": "UPSD_BOL",
"type": "String"
},
{
"name": "UPSD_ORDN",
"type": "String"
}
],
"published": false,
"type": "AzureSqlTable",
"linkedServiceName": "Source-SQLAzure",
"typeProperties": {},
"availability": {
"frequency": "Day",
"interval": 1,
"offset": "04:15:00"
},
"external": true,
"policy": {}
}
}
Destination Data set
{
"name": "AnalyticsDB-SHIPUPS_06shp-02dst_AZ-915PM",
"properties": {
"structure": [
{
"name": "SHIP_SYS_TRACK_NUM",
"type": "String"
},
{
"name": "SHIP_TRACK_NUM",
"type": "String"
}
],
"published": false,
"type": "AzureSqlTable",
"linkedServiceName": "Destination-Azure-AnalyticsDB",
"typeProperties": {
"tableName": "[olcm].[SHIP_Tracking]"
},
"availability": {
"frequency": "Day",
"interval": 1,
"offset": "04:15:00"
},
"external": false,
"policy": {}
}
}
Pipeline
{
"name": "SHIPUPS_FC_COPY-915PM",
"properties": {
"description": "copy shipments ",
"activities": [
{
"type": "Copy",
"typeProperties": {
"source": {
"type": "RelationalSource",
"query": "$$Text.Format('SELECT COMPANY, UPSD_ORDN, UPSD_BOL FROM \"orupsd - UPS interface Dtl\" WHERE COMPANY = \\'01\\'', WindowStart, WindowEnd)"
},
"sink": {
"type": "SqlSink",
"sqlWriterCleanupScript": "$$Text.Format('delete imp_fc.SHIP_UPS_IntDtl_Tracking', WindowStart, WindowEnd)",
"writeBatchSize": 0,
"writeBatchTimeout": "00:00:00"
},
"translator": {
"type": "TabularTranslator",
"columnMappings": "COMPANY:COMPANY, UPSD_ORDN:UPSD_ORDN, UPSD_BOL:UPSD_BOL"
}
},
"inputs": [
{
"name": "AnalyticsDB-SHIPUPS_03shp-01src_FC-915PM"
}
],
"outputs": [
{
"name": "AnalyticsDB-SHIPUPS_03shp-02dst_AZ-915PM"
}
],
"policy": {
"timeout": "1.00:00:00",
"concurrency": 1,
"executionPriorityOrder": "NewestFirst",
"style": "StartOfInterval",
"retry": 3,
"longRetry": 0,
"longRetryInterval": "00:00:00"
},
"scheduler": {
"frequency": "Day",
"interval": 1,
"offset": "04:15:00"
},
"name": "915PM-SHIPUPS-fc-copy->[imp_fc]_[SHIP_UPS_IntDtl_Tracking]"
},
{
"type": "Copy",
"typeProperties": {
"source": {
"type": "SqlSource",
"sqlReaderQuery": "$$Text.Format('select distinct ups.UPSD_BOL, ups.UPSD_BOL from imp_fc.SHIP_UPS_IntDtl_Tracking ups LEFT JOIN olcm.SHIP_Tracking st ON ups.UPSD_BOL = st.SHIP_SYS_TRACK_NUM WHERE st.SHIP_SYS_TRACK_NUM IS NULL', WindowStart, WindowEnd)"
},
"sink": {
"type": "SqlSink",
"writeBatchSize": 0,
"writeBatchTimeout": "00:00:00"
},
"translator": {
"type": "TabularTranslator",
"columnMappings": "UPSD_BOL:SHIP_SYS_TRACK_NUM, UPSD_BOL:SHIP_TRACK_NUM"
}
},
"inputs": [
{
"name": "AnalyticsDB-SHIPUPS_06shp-01src_AZ-915PM"
}
],
"outputs": [
{
"name": "AnalyticsDB-SHIPUPS_06shp-02dst_AZ-915PM"
}
],
"policy": {
"timeout": "1.00:00:00",
"concurrency": 1,
"executionPriorityOrder": "NewestFirst",
"style": "StartOfInterval",
"retry": 3,
"longRetryInterval": "00:00:00"
},
"scheduler": {
"frequency": "Day",
"interval": 1,
"offset": "04:15:00"
},
"name": "915PM-SHIPUPS-AZ-update->[olcm]_[SHIP_Tracking]"
}
],
"start": "2017-08-22T03:00:00Z",
"end": "2099-12-31T08:00:00Z",
"isPaused": false,
"hubName": "adf-tm-prod-01_hub",
"pipelineMode": "Scheduled"
}
}
Have you seen this link?
They get the same error message and suggest using AzureTableSink instead of SqlSink
"sink": {
"type": "AzureTableSink",
"writeBatchSize": 0,
"writeBatchTimeout": "00:00:00"
}
It would make sense for you too, since your 2nd copy activity is Azure to Azure.
It could be a red herring, but I'm pretty sure "tableName" is a required entry in the typeProperties for a SqlSource. Yours is missing this for the input dataset. I appreciate that you have a join in the sqlReaderQuery, so it is probably best to put a dummy (but real) table name in there.
By the way, it is not clear why you are using $$Text.Format and WindowStart/WindowEnd in your queries if you're not substituting these values into the query; you could just put the query between double quotes.
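The text "An item with the same key has already been added" is the standard .NET message for inserting a duplicate key into a dictionary. One further thing that may be worth checking, offered as a guess rather than a confirmed diagnosis, is whether the columnMappings string of the failing activity repeats a column name. The Python sketch below uses hypothetical helper names and simply reports repeated source or sink columns in such a string.

```python
from collections import Counter


def parse_column_mappings(mappings):
    """Split 'SRC:DST, SRC:DST' into a list of (source, sink) pairs."""
    pairs = []
    for item in mappings.split(","):
        source, _, sink = item.strip().partition(":")
        pairs.append((source.strip(), sink.strip()))
    return pairs


def duplicate_columns(mappings):
    """Return the source and sink column names that appear more than once."""
    pairs = parse_column_mappings(mappings)
    source_counts = Counter(source for source, _ in pairs)
    sink_counts = Counter(sink for _, sink in pairs)
    return {
        "source": sorted(n for n, c in source_counts.items() if c > 1),
        "sink": sorted(n for n, c in sink_counts.items() if c > 1),
    }
```

Run against the mappings in the pipeline above, the first activity's string has no repeats, while the second activity's string lists UPSD_BOL twice on the source side. Whether that is what triggers the error is an assumption that would need to be verified, for instance by aliasing the second column in the sqlReaderQuery and mapping the alias instead.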

Fetch data based on key

{ "Labels": [ {"Test": 99.25341796875, "Name": "Skateboard" }, { "Test": 9.25341796875, "Name": "Sport" }, { "Test": 49.24723052978516, "Name": "People" }]}
I need to rename the Test key based on the conditions below:
if the Test value is > 50, replace the key Test with High
if the Test value is < 50, replace the key Test with Low
So the requested output looks like this:
{ "Labels": [ {"High": 99.25341796875, "Name": "Skateboard" }, { "Low": 9.25341796875, "Name": "Sport" }, { "Low": 49.24723052978516, "Name": "People" }]}
jq solution:
jq '.Labels |= map(.[(if .Test > 50 then "High" else "Low" end)] = .Test | del(.Test))' yourfile.json
The output:
{
"Labels": [
{
"Name": "Skateboard",
"High": 99.25341796875
},
{
"Name": "Sport",
"Low": 9.25341796875
},
{
"Name": "People",
"Low": 49.24723052978516
}
]
}
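For readers who do not use jq, the same key rename can be sketched in Python. The function name `relabel` and its parameters are made up for illustration. As in the jq filter, a value of exactly 50 falls into the Low branch, which is an assumption because the question does not say how to treat that case.

```python
def relabel(doc, key="Test", threshold=50):
    """Replace key with "High" or "Low" in every label, keeping the value."""
    labels = []
    for label in doc["Labels"]:
        # Copy everything except the key being renamed.
        renamed = {k: v for k, v in label.items() if k != key}
        # Values above the threshold become "High", all others "Low".
        new_key = "High" if label[key] > threshold else "Low"
        renamed[new_key] = label[key]
        labels.append(renamed)
    # Return a new document so the input is left unmodified.
    return {**doc, "Labels": labels}
```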

Schema to load json data to google big query

I have a question for the project that we are doing...
I tried to load this JSON into Google BigQuery and am not able to get the fields of the votes object from the JSON input. I tried both the "record" and the "string" types in the schema.
{
"votes": {
"funny": 10,
"useful": 10,
"cool": 10
},
"user_id": "OlMjqqzWZUv2-62CSqKq_A",
"review_id": "LMy8UOKOeh0b9qrz-s1fQA",
"stars": 4,
"date": "2008-07-02",
"text": "This is what this 4-star bar is all about.",
"type": "review",
"business_id": "81IjU5L-t-QQwsE38C63hQ"
}
I am also not able to get the tables populated from the JSON below for the categories and neighborhoods arrays. What should my schema be for these inputs? The docs unfortunately didn't help much in this case, or maybe I am not looking in the right place.
{
"business_id": "Iu-oeVzv8ZgP18NIB0UMqg",
"full_address": "3320 S Hill St\nSouth East LA\nLos Angeles, CA 90007",
"schools": [
"University of Southern California"
],
"open": true,
"categories": [
"Medical Centers",
"Health and Medical"
],
"neighborhoods": [
"South East LA"
]
}
I am able to get the regular fields, but that's about it. Any help is appreciated!
For business it seems you want schools to be a repeated field (the same mode applies to the categories and neighborhoods arrays). Your schema should be:
"schema": {
"fields": [
{
"name": "business_id",
"type": "string"
},
{
"name": "full_address",
"type": "string"
},
{
"name": "schools",
"type": "string",
"mode": "repeated"
},
{
"name": "open",
"type": "boolean"
}
]
}
For votes it seems you want record. Your schema should be:
"schema": {
"fields": [
{
"name": "name",
"type": "string"
},
{
"name": "votes",
"type": "record",
"fields": [
{
"name": "funny",
"type": "integer"
},
{
"name": "useful",
"type": "integer"
},
{
"name": "cool",
"type": "integer"
}
]
}
]
}
Source
I was also stuck on this problem. In my case the issue was that one has to remember to flag the mode as repeated for the records (source).
Also please note that these cannot have a null value (source).
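The mapping described in the answers (nested object to record, array to repeated) can be illustrated by deriving a schema from one sample record. This is only a Python sketch under stated assumptions: `infer_field` and `infer_schema` are hypothetical helpers, not part of any BigQuery client library; arrays are assumed to be homogeneous; and an empty array is assumed to hold strings.

```python
def infer_field(name, value):
    """Derive one schema field from a sample JSON value."""
    if isinstance(value, list):
        # Arrays become repeated fields typed after their first element.
        # Assumption: an empty array holds strings.
        field = infer_field(name, value[0]) if value else {"name": name, "type": "string"}
        field["mode"] = "repeated"
        return field
    if isinstance(value, dict):
        # Nested objects become records with their own sub-fields.
        return {
            "name": name,
            "type": "record",
            "fields": [infer_field(k, v) for k, v in value.items()],
        }
    if isinstance(value, bool):  # checked before int: bool is a subclass of int
        return {"name": name, "type": "boolean"}
    if isinstance(value, int):
        return {"name": name, "type": "integer"}
    if isinstance(value, float):
        return {"name": name, "type": "float"}
    return {"name": name, "type": "string"}


def infer_schema(record):
    """Build a schema dictionary from a single sample record."""
    return {"fields": [infer_field(k, v) for k, v in record.items()]}
```

Applied to the review record above, votes comes out as a record with three integer sub-fields; applied to the business record, schools, categories and neighborhoods come out as repeated strings. Date-like strings are left as plain strings in this sketch.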