I have a JSON file that contains transactions, and each transaction contains multiple items. I need to flatten the data into SQL Server tables using T-SQL.
I have tried different options to flatten it, but it looks like I am missing something. Anyone who has worked on a similar structure have any ideas how this could be accomplished?
DECLARE #json NVARCHAR(MAX);
SELECT #json = JsonPath
FROM [dbo].[stg_transactionsJson] b;
SELECT
CONVERT(NVARCHAR(50), JSON_VALUE(c.Value, '$.orderId')) AS orderId,
CONVERT(DATETIME, JSON_VALUE(c.Value, '$.openTime')) AS openTime,
CONVERT(DATETIME, JSON_VALUE(c.Value, '$.closeTime')) AS closeTime,
CONVERT(NVARCHAR(50), JSON_VALUE(c.Value, '$.operatorId')) AS operatorId,
CONVERT(NVARCHAR(50), JSON_VALUE(c.Value, '$.terminalId')) AS terminalId,
CONVERT(NVARCHAR(50), JSON_VALUE(c.Value, '$.sessionId')) AS sessionId,
CONVERT(NVARCHAR(50), JSON_VALUE(p.Value, '$.productGroupId')) AS productGroupId,
CONVERT(NVARCHAR(150), JSON_VALUE(p.Value, '$.productId')) AS productId,
CONVERT(NVARCHAR(50), JSON_VALUE(p.Value, '$.quantity')) AS quantity,
CONVERT(NVARCHAR(150), JSON_VALUE(p.Value, '$.taxValue')) AS taxValue,
CONVERT(NVARCHAR(50), JSON_VALUE(p.Value, '$.value')) AS ProductValue,
CONVERT(NVARCHAR(150), JSON_VALUE(p.Value, '$.priceBandId')) AS priceBandId,
GETDATE() AS DateUpdated
FROM
OPENJSON(#json) AS c
OUTER APPLY
OPENJSON(c.Value, '$."products"') AS p;
And the sample JSON as follows
{
"orderId": 431,
"openTime": "2022-10-31T13:12:28",
"closeTime": "2022-10-31T13:12:32",
"operatorId": 7,
"terminalId": 4,
"sessionId": 1,
"products": [
{
"productId": 2632,
"productGroupId": 162,
"quantity": 1,
"taxValue": 0.58,
"value": 3.5,
"priceBandId": 2
},
{
"productId": 3224,
"productGroupId": 164,
"quantity": 1,
"taxValue": 0.08,
"value": 0.5,
"priceBandId": 2
}
],
"tenders": [
{
"tenderId": 2,
"value": 4.0
}
],
"type": 1,
"memberId": 1
}
You can Do it like this
First declare the values in an with clause and then cross apply the products
DECLARE #json NVARCHAR(MAX);
SET #json = '{
"orderId": 431,
"openTime": "2022-10-31T13:12:28",
"closeTime": "2022-10-31T13:12:32",
"operatorId": 7,
"terminalId": 4,
"sessionId": 1,
"products": [
{
"productId": 2632,
"productGroupId": 162,
"quantity": 1,
"taxValue": 0.58,
"value": 3.5,
"priceBandId": 2
},
{
"productId": 3224,
"productGroupId": 164,
"quantity": 1,
"taxValue": 0.08,
"value": 0.5,
"priceBandId": 2
}
],
"tenders": [
{
"tenderId": 2,
"value": 4.0
}
],
"type": 1,
"memberId": 1
}
';
SELECT c.orderId
,c.openTime
,c.closeTime
,c.operatorId
,c.terminalId
,c.sessionId
, JSON_VALUE(p.Value, '$.productGroupId') productGroupId
, JSON_VALUE(p.Value, '$.productId') productId
, JSON_VALUE(p.Value, '$.quantity') quantity
, JSON_VALUE(p.Value, '$.taxValue') taxValue
, JSON_VALUE(p.Value, '$.value') value
, JSON_VALUE(p.Value, '$.priceBandId') priceBandId
,GETDATE() AS DateUpdated
FROM
OPENJSON(#json) WITH (
orderId int '$.orderId',
openTime date '$.openTime',
closeTime date '$.closeTime',
operatorId int '$.operatorId',
terminalId int '$.terminalId',
sessionId int '$.sessionId',
[products] NVARCHAR(MAX) as JSON) c
CROSS APPLY OPENJSON(c.products) p
orderId
openTime
closeTime
operatorId
terminalId
sessionId
productGroupId
productId
quantity
taxValue
value
priceBandId
DateUpdated
431
2022-10-31
2022-10-31
7
4
1
162
2632
1
0.58
3.5
2
2023-01-11 13:57:47.607
431
2022-10-31
2022-10-31
7
4
1
164
3224
1
0.08
0.5
2
2023-01-11 13:57:47.607
fiddle
The expected output defines the exact statement, but if you need to parse nested JSON content you may try a combination of:
OPENJSON() with explicit schema and the AS JSON modifier for nested JSON
additional APPLY operators for each nested level.
T-SQL:
SELECT
j1.orderId, j1.openTime, j1.closeTime, j1.operatorId, j1.terminalId, j1.sessionId, j1.type, j1.memberId,
j2.productGroupId, j2.productId, j2.quantity, j2.taxValue, j2.productValue, j2.priceBandId,
j3.tenderId, j3.tenderValue
FROM stg_transactionsJson s
OUTER APPLY OPENJSON(s.stg_transactionsJson) WITH (
orderId NVARCHAR(50) '$.orderId',
openTime DATETIME '$.openTime',
closeTime DATETIME '$.closeTime',
operatorId NVARCHAR(50) '$.operatorId',
terminalId NVARCHAR(50) '$.terminalId',
sessionId NVARCHAR(50) '$.sessionId',
products NVARCHAR(MAX) '$.products' AS JSON,
tenders NVARCHAR(MAX) '$.tenders' AS JSON,
type int '$.type',
memberId int '$.memberId'
) j1
OUTER APPLY OPENJSON(j1.products) WITH (
productGroupId NVARCHAR(50) '$.productGroupId',
productId NVARCHAR(150) '$.productId',
quantity NVARCHAR(50) '$.quantity',
taxValue NVARCHAR(150) '$.taxValue',
productValue NVARCHAR(50) '$.value',
priceBandId NVARCHAR(150)'$.priceBandId'
) j2
OUTER APPLY OPENJSON(j1.tenders) WITH (
tenderId NVARCHAR(150) '$.tenderId',
tenderValue NVARCHAR(50) '$.value'
) j3
Related
I'm parsing a JSON dataset within SQL Server and it works great with a single object, but falls down when multiple data objects are presented. I assume it's because of the CROSS APPLY statements.
Within the JSON dataset, there is only 4 records, but my current sql is returning 16 (4 duplicate sets, as there are 4 cross apply statements), but I'm not sure how to get around this?
json
{
"type": "test",
"user": {
"last_update": "2022-06-19T14:13:07.707502+00:00",
"user_id": "12345"
},
"data": [
{
"metadata": {
"start_time": "2022-06-19T00:00:00+01:00",
"end_time": "2022-06-20T00:00:00+01:00"
},
"distance_data": {
"steps": 9299,
"distance_meters": 7704.0
}
},
{
"metadata": {
"start_time": "2022-06-17T00:00:00+01:00",
"end_time": "2022-06-18T00:00:00+01:00"
},
"distance_data": {
"steps": 2546,
"distance_meters": 2143.0
}
},
{
"metadata": {
"start_time": "2022-06-16T00:00:00+01:00",
"end_time": "2022-06-17T00:00:00+01:00"
},
"distance_data": {
"steps": 4969,
"distance_meters": 4192.0
}
},
{
"metadata": {
"start_time": "2022-06-18T00:00:00+01:00",
"end_time": "2022-06-19T00:00:00+01:00"
},
"distance_data": {
"steps": 6769,
"distance_meters": 5698.0
}
}
]
}
SQL statement
SELECT
distance_meters, steps, cast(left(start_time,10) as date) startDate
FROM
OPENJSON ( #json )
WITH (
jType nvarchar(50) N'$.type',
jUser char(36) N'$.user.user_id',
data nvarchar(max) as JSON
) as a
CROSS APPLY
OPENJSON(a.data)
WITH
(
distance_data nvarchar(max) as json
) as b
CROSS APPLY
OPENJSON (b.distance_data)
WITH
(
distance_meters float,
steps int
) as c
CROSS APPLY
OPENJSON (a.data)
WITH
(
metadata nvarchar(max) as json
) as d
CROSS APPLY
OPENJSON (d.metadata)
WITH
(
start_time nvarchar(25),
end_time nvarchar(25)
) as e
ORDER BY startDate ASC;
I think you need a single APPLY operator:
SELECT j1.jType, j1.jUser, j2.*
FROM OPENJSON(#json) WITH (
jType nvarchar(50) N'$.type',
jUser char(36) N'$.user.user_id',
data nvarchar(max) as JSON
) AS j1
CROSS APPLY OPENJSON(j1.data) WITH (
start_time nvarchar(25) '$.metadata.start_time',
end_time nvarchar(25) '$.metadata.end_time',
steps numeric(10, 0) '$.distance_data.steps',
distance_meters numeric(10, 1) '$.distance_data.distance_meters'
) j2
Result:
jType
jUser
start_time
end_time
steps
distance_meters
test
12345
2022-06-19T00:00:00+01:00
2022-06-20T00:00:00+01:00
9299
7704.0
test
12345
2022-06-17T00:00:00+01:00
2022-06-18T00:00:00+01:00
2546
2143.0
test
12345
2022-06-16T00:00:00+01:00
2022-06-17T00:00:00+01:00
4969
4192.0
test
12345
2022-06-18T00:00:00+01:00
2022-06-19T00:00:00+01:00
6769
5698.0
I'm trying to parse the 'custinfo' array to rows, rather than specific columns how I have in my query (there can be none or many values in the array)
DECLARE #json NVARCHAR(MAX) ='{
"customer": [
{
"id": "123",
"history": [
{
"id": "a123",
"dates": [
{
"date": "2022-03-19",
"details": {
"custinfo": [
"male",
"married"
],
"age": 40
}}]}]}]}'
SELECT
JSON_VALUE ( j.[value], '$.id' ) AS CustId,
JSON_VALUE ( m.[value], '$.id' ) AS CustId_Hist,
JSON_VALUE ( a1.[value], '$.date' ) AS date,
JSON_VALUE ( a1.[value], '$.details.age' ) AS age,
JSON_VALUE ( a1.[value], '$.details.custinfo[0]' ) AS custinfo0,
JSON_VALUE ( a1.[value], '$.details.custinfo[1]' ) AS custinfo1
FROM OPENJSON( #json, '$."customer"' ) j
CROSS APPLY OPENJSON ( j.[value], '$."history"' ) AS m
CROSS APPLY OPENJSON ( m.[value], '$."dates"' ) AS a1
Desired results:
Like I mentioned in the comments, I would switch to using WITH clauses and defining your columns and their data types. You can then also get the values, into 2 separate rows you want, with the following. Note tbhe extra OPENJSON at the end, which treats the custinfo as the array it is; returning 2 rows (1 for each value in the array):
DECLARE #json NVARCHAR(MAX) ='{
"customer": [
{
"id": "123",
"history": [
{
"id": "a123",
"dates": [
{
"date": "2022-03-19",
"details": {
"custinfo": [
"male",
"married"
],
"age": 40
}}]}]}]}';
SELECT c.id AS CustId,
h.id AS CustId_Hist,
d.date AS date,
d.age AS age,
ci.[value] AS custinfo
FROM OPENJSON( #json,'$.customer')
WITH (id int,
history nvarchar(MAX) AS JSON) c
CROSS APPLY OPENJSON (c.history)
WITH (id varchar(10),
dates nvarchar(MAX) AS JSON) h
CROSS APPLY OPENJSON (h.dates)
WITH(date date,
details nvarchar(MAX) AS JSON,
age int '$.details.age') d
CROSS APPLY OPENJSON(d.details,'$.custinfo') ci;
I have this JSON
{
"OrderHeader": [[
{
"OrderID": 100,
"CustomerID": 2000,
"OrderDetail":
{
"ProductID":2000,
"UnitPrice":350
}
}
]]
}
After the OrderHeader there is two [[.
I am not able to parse this JSON using the OPENJSON TSQL Command.
If it was just one [ then i would parse like
DECLARE #JSONData varchar(MAX) = '
{
"OrderHeader": [
{
"OrderID": 100,
"CustomerID": 2000,
"OrderDetail":
{
"ProductID": 2000,
"UnitPrice": 350
}
}
]
}
'
SELECT
OrderID, CustomerID, ProductID, UnitPrice
FROM OPENJSON (#JSONData)
WITH (
OrderHeader nvarchar(MAX) AS JSON
) AS Orders
CROSS APPLY OPENJSON(Orders.OrderHeader)
WITH
(
OrderID INT,
CustomerID INT,
OrderDetail nvarchar(MAX) AS JSON
) AS OrderDetail
CROSS APPLY OPENJSON(OrderDetail.OrderDetail)
WITH
(
ProductID INT,
UnitPrice INT
) ;
Please help to Parse with its with TWO [ as in the first sample
Thank you.
One method would be to double up the OPENJSONs, though I would personally suggest, if you can, fixing the JSON:
DECLARE #JSONData nvarchar(MAX) = N'
{
"OrderHeader": [[
{
"OrderID": 100,
"CustomerID": 2000,
"OrderDetail":
{
"ProductID": 2000,
"UnitPrice": 350
}
}
]]
}';
SELECT V.OrderID,
V.CustomerID,
OD.ProductID,
OD.UnitPrice
FROM OPENJSON(#JSONData)
WITH (OrderHeader nvarchar(MAX) AS JSON) OJ
CROSS APPLY OPENJSON(OJ.OrderHeader) OH
CROSS APPLY OPENJSON(OH.[value])
WITH (OrderID int,
CustomerID int,
OrderDetail nvarchar(MAX) AS JSON) V
CROSS APPLY OPENJSON(V.OrderDetail)
WITH (ProductID int,
UnitPrice int) OD;
The statement depends on the structure of the parsed JSON ( ... I hope I understand it correctly) and you may simplify your statement using only two OPENJSON() calls:
JSON (with one additional item in the nested JSON array):
DECLARE #JSONData varchar(MAX) = '
{
"OrderHeader": [[
{
"OrderID": 100,
"CustomerID": 2000,
"OrderDetail":
{
"ProductID":2000,
"UnitPrice":350
}
},
{
"OrderID": 200,
"CustomerID": 3000,
"OrderDetail":
{
"ProductID":2000,
"UnitPrice":350
}
}
]]
}'
Statement:
SELECT j2.*
FROM OPENJSON(#JSONData, '$.OrderHeader') j1
CROSS APPLY OPENJSON(j1.[value]) WITH (
OrderID INT '$.OrderID',
CustomerID INT '$.CustomerID',
ProductID INT '$.OrderDetail.ProductID',
UnitPrice INT' $.OrderDetail.UnitPrice'
) j2
Result:
OrderID CustomerID ProductID UnitPrice
100 2000 2000 350
200 3000 2000 350
Finally, if the $.OrderHeader JSON array always has a single item, the following statement is also an option:
SELECT *
FROM OPENJSON(#JSONData, '$.OrderHeader[0]') WITH (
OrderID INT '$.OrderID',
CustomerID INT '$.CustomerID',
ProductID INT '$.OrderDetail.ProductID',
UnitPrice INT' $.OrderDetail.UnitPrice'
)
So, I have a simple view that looks like this:
Name | Type | Product | QuantitySold
------------------------------------------------------
Walmart | Big Store | Gummy Bears | 10
Walmart | Big Store | Toothbrush | 6
Target | Small Store | Toothbrush | 2
Without using nested queries, using sql's FOR JSON clause, can this be easily converted to this json.
[
{
"Type": "Big Store",
"Stores": [
{
"Name": "Walmart",
"Products": [
{
"Name": "Gummy Bears",
"QuantitySold": 10
},
{
"Name": "Toothbrush",
"QuantitySold": 6
}
]
}
]
},
{
"Type": "Smaller Store",
"Stores": [
{
"Name": "Target",
"Products": [
{
"Name": "Toothbrush",
"QuantitySold": 2
}
]
}
]
}
]
Essentially Group by Type, Store then, line items. My attempt so far below. Not sure how to properly group the rows.
SELECT Type, (
SELECT Store,
(SELECT Product,QuantitySold from MyTable m3 where m3.id=m2.id for json path) as Products
FROM MyTable m2 where m1.ID = m2.ID for json path) as Stores
) as Types FROM MyTable m1
You can try something like this:
DECLARE #Data TABLE (
Name VARCHAR(20), Type VARCHAR(20), Product VARCHAR(20), QuantitySold INT
);
INSERT INTO #Data ( Name, Type, Product, QuantitySold ) VALUES
( 'Walmart', 'Big Store', 'Gummy Bears', 10 ),
( 'Walmart', 'Big Store', 'Toothbrush', 6 ),
( 'Target', 'Small Store', 'Toothbrush', 2 );
SELECT DISTINCT
t.[Type],
Stores
FROM #Data AS t
OUTER APPLY (
SELECT (
SELECT DISTINCT [Name], Products FROM #Data x
OUTER APPLY (
SELECT (
SELECT Product AS [Name], QuantitySold FROM #Data n WHERE n.[Name] = x.[Name]
FOR JSON PATH
) AS Products
) AS p
WHERE x.[Type] = t.[Type]
FOR JSON PATH
) AS Stores
) AS Stores
ORDER BY [Type]
FOR JSON PATH;
Returns
[{
"Type": "Big Store",
"Stores": [{
"Name": "Walmart",
"Products": [{
"Name": "Gummy Bears",
"QuantitySold": 10
}, {
"Name": "Toothbrush",
"QuantitySold": 6
}]
}]
}, {
"Type": "Small Store",
"Stores": [{
"Name": "Target",
"Products": [{
"Name": "Toothbrush",
"QuantitySold": 2
}]
}]
}]
If you had normalized data structure you could use a another approach.
--Let's assume that Types are stored like this
DECLARE #Types TABLE (
id int,
Type nvarchar(20)
);
INSERT INTO #Types VALUES (1, N'Big Store'), (2, N'Small Store');
--Stores in separate table
DECLARE #Stores TABLE (
id int,
Name nvarchar(10),
TypeId int
);
INSERT INTO #Stores VALUES (1, N'Walmart', 1), (2, N'Target', 2),
(3, N'Tesco', 2); -- I added one more just for fun
--Products table
DECLARE #Products TABLE (
id int,
Name nvarchar(20)
);
INSERT INTO #Products VALUES (1, N'Gummy Bears'), (2, N'Toothbrush'),
(3, N'Milk'), (4, N'Ball') -- Added some here
-- And here comes the sales
DECLARE #Sales TABLE (
StoreId int,
ProductId int,
QuantitySold int
);
INSERT INTO #Sales VALUES (1, 1, 10), (1, 2, 6), (2, 2, 2),
(3, 4, 15), (3, 3, 7); -- I added few more
Now we can join the tables a get result that you need
SELECT Type = Type.Type,
Name = [Stores].Name,
Name = Products.Product,
QuantitySold = Products.QuantitySold
FROM (
SELECT s.StoreId,
p.Name Product,
s.QuantitySold
FROM #Sales s
INNER JOIN #Products p
ON p.id = s.ProductId
) Products
INNER JOIN #Stores Stores
ON Stores.Id = Products.StoreId
INNER JOIN #Types [Type]
ON Stores.TypeId = [Type].id
ORDER BY Type.Type, [Stores].Name
FOR JSON AUTO;
Output:
[
{
"Type": "Big Store",
"Stores": [
{
"Name": "Walmart",
"Products": [
{
"Name": "Gummy Bears",
"QuantitySold": 10
},
{
"Name": "Toothbrush",
"QuantitySold": 6
}
]
}
]
},
{
"Type": "Small Store",
"Stores": [
{
"Name": "Target",
"Products": [
{
"Name": "Toothbrush",
"QuantitySold": 2
}
]
},
{
"Name": "Tesco",
"Products": [
{
"Name": "Ball",
"QuantitySold": 15
},
{
"Name": "Milk",
"QuantitySold": 7
}
]
}
]
}
]
I want to display the parameter ST and NextTime from table #json. The parameters id and Timestamp appear normally. I try the following but does not show any effect.Any possible answers?
My Json
{
"PCol": [{
"Id": 15,
"TimeStamp": "2018-02-1",
"Val": {
"States": [{
"Numbers": {
"Number": [5, 8]
},
"CS": {
"ST": "25"
},
"Changes": [{
"NextTime": 1
}]
}]
}
}]
}
My Sql Query
SELECT * FROM OPENJSON((select * from #json),N'$.PCol')
WITH (
[Id] INT N'$.Id ',
[TimeStamp] NVARCHAR(MAX) N'$.TimeStamp',
**[ST] INT N'$.Val.States.CS.ST'**
)
Thanks
Not going to lie, my OPENJSON knowledge is poor (so this might be able to be more succinct), however, this works:
DECLARE #JSON nvarchar(MAX) =
N' {
"PCol": [{
"Id": 15,
"TimeStamp": "2018-02-1",
"Val": {
"States": [{
"Numbers": {
"Number": [5, 8]
},
"CS": {
"ST": "25"
}
}]
}
}]
} ';
SELECT P.Id,
P.TimeStamp,
V.ST
FROM OPENJSON(#JSON,N'$.PCol') WITH ([Id] INT N'$.Id ',
[TimeStamp] NVARCHAR(MAX) N'$.TimeStamp',
[Vals] nvarchar(MAX) N'$.Val' AS JSON) P
CROSS APPLY OPENJSON(P.Vals,N'$.States') WITH (ST int N'$.CS.ST') V
For the latest JSON, this works:
SELECT P.Id,
P.TimeStamp,
V.ST,
C.NextTime
FROM OPENJSON(#JSON,N'$.PCol') WITH ([Id] INT N'$.Id ',
[TimeStamp] NVARCHAR(MAX) N'$.TimeStamp',
[Vals] nvarchar(MAX) N'$.Val' AS JSON) P
CROSS APPLY OPENJSON(P.Vals,N'$.States') WITH (ST int N'$.CS.ST',
[Changes] nvarchar(MAX) N'$.Changes' AS JSON) V
CROSS APPLY OPENJSON(V.[Changes]) WITH (NextTime int '$.NextTime') C