How to parse JSON into relational format in SQL Server 2016?

I have some JSON stored in a SQL Server 2016 table as under (partial):
{
  "AFP": [
    {
      "AGREEMENTID": "29040400001330",
      "LoanAccounts": {
        "Product": "OD003",
        "BUCKET": 0,
        "ZONE": "MUMBAI ZONE",
        "Region": "MUMBAI METRO-CENTRAL REGION",
        "STATE": "GOA",
        "Year": 2017,
        "Month": 10,
        "Day": 13
      },
      "FeedbackInfo": {
        "FeedbackDate": "2017-10-13T12:07:44.2317198",
        "DispositionDate": "2017-10-13T12:07:44.2317198",
        "DispositionCode": "PR"
      },
      "PaymentInfo": {
        "ReceiptNo": "2000000170",
        "ReceiptDate": "2017-10-13T12:07:42.1218299",
        "PaymentMode": "Cheque",
        "Amount": 200,
        "PaymentStatus": "CollectionBatchCreated"
      }
    }
  ]
}
The table schema is as under:
create table tblHistoricalDataDemo(
    AGREEMENTID nvarchar(40)
    ,Year_Json nvarchar(4000)
)
I would like to fetch the records from the JSON into relational format, as:
AgreementID Product Bucket .... PaymentStatus
I tried the query below, but I am doing something wrong and am not able to get the result:
SELECT AGREEMENTID,
JSON_VALUE(Year_Json, '$.LoanAccounts') AS records
FROM tblHistoricalDataDemo

Use the OPENJSON built-in table-valued function:
SELECT *
FROM tblHistoricalDataDemo
CROSS APPLY
    OPENJSON(Year_Json, '$.AFP') WITH
    (
        -- You don't have to specify the json path
        -- if the column name is the same as the json name
        AGREEMENTID bigint
    ) AS afp
CROSS APPLY
    OPENJSON(Year_Json, '$.AFP') WITH
    (
        Product varchar(10) '$.LoanAccounts.Product',
        bucket  int         '$.LoanAccounts.BUCKET'
    ) AS LoanAccounts
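Since both CROSS APPLY calls parse the same '$.AFP' array, a single OPENJSON ... WITH that lists every needed column is usually enough. A minimal sketch (column names and sizes are illustrative, not from the original post):

SELECT afp.*
FROM tblHistoricalDataDemo
CROSS APPLY
    OPENJSON(Year_Json, '$.AFP') WITH
    (
        -- AGREEMENTID matches the JSON name, so no path is needed
        AGREEMENTID   bigint,
        Product       varchar(10) '$.LoanAccounts.Product',
        Bucket        int         '$.LoanAccounts.BUCKET',
        PaymentStatus varchar(50) '$.PaymentInfo.PaymentStatus'
    ) AS afp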

In case the array in the JSON has a fixed number of elements, use
$.P1[x]
If AFP has only one element:
SELECT t.AGREEMENTID,
       JSON_VALUE(Year_Json, '$.AFP[0].LoanAccounts.Product') AS Product,
       JSON_VALUE(Year_Json, '$.AFP[0].LoanAccounts.BUCKET') AS Bucket,
       JSON_VALUE(Year_Json, '$.AFP[0].PaymentInfo.PaymentStatus') AS PaymentStatus
FROM tblHistoricalDataDemo t
Run it in SQL Fiddle (thanks, Jacob H.).

Related

Full text search in a specific node in JSON

I have a table "Product" with two columns:
Id - Bigint primary key
data - Jsonb
Here is an example of the JSON:
{
  "availability": [
    {
      "qty": 10,
      "price": 42511,
      "store": {
        "name": "my_best_store",
        "hours": null,
        "title": {
          "en": null
        },
        "coords": null,
        "address": null,
I insert the JSON into the "data" column.
Here is the SQL to find "my_best_store":
select *
from product
where to_tsvector(product.data) @@ to_tsquery('my_best_store')
Nice. It works fine.
But I need to find "my_best_store" only in the "availability" section.
I tried this, but the result is empty:
select *
from product
where to_tsvector(product.data) @@ to_tsquery('availability & my_best_store')
Assuming you want to search in the name attribute, you can do the following:
select p.*
from product p
where exists (select *
              from jsonb_array_elements(p.data -> 'availability') as t(item)
              where to_tsvector(t.item -> 'store' ->> 'name') @@ to_tsquery('my_best_store'))
With Postgres 12, you can simplify that to:
select p.*
from product p
where to_tsvector(jsonb_path_query_array(data, '$.availability[*].store.name')) @@ to_tsquery('my_best_store')
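If this search has to be fast, the Postgres 12 expression can also be indexed. A sketch (not from the original answer), using the two-argument to_tsvector with an explicit configuration so that the expression is immutable; the query then has to use the same two-argument form to hit the index:

-- hypothetical index name; 'english' is an assumed text search configuration
create index product_store_name_fts_idx
    on product
 using gin (to_tsvector('english',
            jsonb_path_query_array(data, '$.availability[*].store.name')));

select p.*
from product p
where to_tsvector('english', jsonb_path_query_array(p.data, '$.availability[*].store.name'))
      @@ to_tsquery('english', 'my_best_store');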

Convert MySQL query result to JSON

The project I'm working on requires saving all of the DB operations. So when a new user is added, I have to log the date, the operation type ('INSERT', 'UPDATE', 'DELETE'), and all of the user data. The project is in the development phase, so the columns in the Users table keep changing.
What I plan to do is select the new user's data from the Users table and insert it into the UserLog table as a JSON column.
Is it possible to convert SELECT * FROM table_name to JSON format?
I know that there is a possibility to convert separate columns with the JSON_OBJECT function, but as I mentioned above, the columns keep changing, so I would be forced to change the JSON_OBJECT names each time I change anything in the main table. And there are a lot of tables!
It should work like this:
CREATE TABLE Users (
    id INT(1) UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    firstName VARCHAR(30) NOT NULL,
    lastName VARCHAR(30) NOT NULL,
    email VARCHAR(50)
)
The query:
SELECT * FROM Users;
Should return:
[
  {
    "id": 1,
    "firstName": "Lucas",
    "lastName": "Smith",
    "email": "lucas@def.com"
  },
  {
    "id": 2,
    "firstName": "Ben",
    "lastName": "Kovalsky",
    "email": "ben@def.com"
  },
  ...
]
Is there a simple solution to solve this problem? If not, what is your strategy for logging DB operations?
I'm not up to date with MySQL, as I switched over to PostgreSQL, but I found that recent MySQL, from version 8, supports JSON:
SELECT JSON_ARRAYAGG(
           JSON_OBJECT(
               'id', `id`,
               'firstName', `firstName`,
               'lastName', `lastName`,
               'email', `email`
           )
       )
FROM Users;
should work.
Edit, sources:
https://dev.mysql.com/doc/refman/8.0/en/json.html#json-values
https://dev.mysql.com/doc/refman/8.0/en/group-by-functions.html#function_json-arrayagg
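For the logging use case in the question, the same aggregation can be restricted to the newly inserted row. A sketch, assuming a UserLog table with (logDate, operation, data) columns (which is not part of the original question) and that the insert happened in the same session:

-- Log only the user that was just inserted (UserLog schema is assumed).
INSERT INTO UserLog (logDate, operation, data)
SELECT NOW(), 'INSERT',
       JSON_OBJECT('id', `id`, 'firstName', `firstName`,
                   'lastName', `lastName`, 'email', `email`)
FROM Users
WHERE id = LAST_INSERT_ID();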
I know this is an old thread, but for anyone still facing this issue, there is a way to convert the result set into JSON without knowing the column names. The key is to get the names of the columns into a string like 'column_1', column_1, 'column_2', column_2, ... and then use this string in a prepared query.
SET @column_name_string_for_query = "";

SHOW COLUMNS
FROM your_table_name
WHERE @column_name_string_for_query := TRIM(", " FROM CONCAT("'", Field, "', ", Field, ", ", @column_name_string_for_query));

SET @query_string = CONCAT("
    SELECT JSON_ARRAYAGG(JSON_OBJECT(", @column_name_string_for_query, "))
    FROM your_table_name"
);

PREPARE statement FROM @query_string;
EXECUTE statement;
DEALLOCATE PREPARE statement;
You could also get the column names from INFORMATION_SCHEMA.COLUMNS, but that only works for tables that are not temporary tables. The solution above works for both temporary tables and normal tables.
You could also save this as a stored procedure for ease of use.
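A sketch of such a stored procedure, built on INFORMATION_SCHEMA instead of SHOW COLUMNS (so, as noted above, it will not cover temporary tables; the procedure name and details are illustrative):

DELIMITER //
CREATE PROCEDURE table_to_json(IN tbl VARCHAR(64))
BEGIN
    -- Build the 'name', `name`, ... list from the data dictionary.
    -- Note: GROUP_CONCAT output is limited by group_concat_max_len.
    SELECT GROUP_CONCAT(CONCAT("'", COLUMN_NAME, "', `", COLUMN_NAME, "`"))
      INTO @cols
      FROM INFORMATION_SCHEMA.COLUMNS
     WHERE TABLE_SCHEMA = DATABASE()
       AND TABLE_NAME = tbl;

    SET @query_string = CONCAT(
        "SELECT JSON_ARRAYAGG(JSON_OBJECT(", @cols, ")) FROM `", tbl, "`");

    PREPARE statement FROM @query_string;
    EXECUTE statement;
    DEALLOCATE PREPARE statement;
END//
DELIMITER ;

Usage would then be: CALL table_to_json('Users');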
Normally, converting the output to JSON or any other format is a job for the programming language or your MySQL IDE, but there is also a way to do it from MySQL itself (via MySQL Shell):
https://dev.mysql.com/doc/mysql-shell/8.0/en/mysql-shell-json-output.html
https://dev.mysql.com/doc/mysql-shell/8.0/en/mysql-shell-json-wrapping.html
Directly from the documentation:
MySQL localhost:33060+ ssl world_x JS > shell.options.set('resultFormat','json')
MySQL localhost:33060+ ssl world_x JS > session.sql("select * from city where countrycode='AUT'")
{
    "ID": 1523,
    "Name": "Wien",
    "CountryCode": "AUT",
    "District": "Wien",
    "Info": {
        "Population": 1608144
    }
}
{
    "ID": 1524,
    "Name": "Graz",
    "CountryCode": "AUT",
    "District": "Steiermark",
    "Info": {
        "Population": 240967
    }
}
{
    "ID": 1525,
    "Name": "Linz",
    "CountryCode": "AUT",
    "District": "North Austria",
    "Info": {
        "Population": 188022
    }
}
{
    "ID": 1526,
    "Name": "Salzburg",
    "CountryCode": "AUT",
    "District": "Salzburg",
    "Info": {
        "Population": 144247
    }
}
{
    "ID": 1527,
    "Name": "Innsbruck",
    "CountryCode": "AUT",
    "District": "Tiroli",
    "Info": {
        "Population": 111752
    }
}
{
    "ID": 1528,
    "Name": "Klagenfurt",
    "CountryCode": "AUT",
    "District": "Kärnten",
    "Info": {
        "Population": 91141
    }
}
6 rows in set (0.0031 sec)
Also, JSON_OBJECT is available from 5.7+; see the answer here.
mysql> SELECT JSON_OBJECT('id', 87, 'name', 'carrot');
+-----------------------------------------+
| JSON_OBJECT('id', 87, 'name', 'carrot') |
+-----------------------------------------+
| {"id": 87, "name": "carrot"} |
+-----------------------------------------

Using T-SQL to retrieve results from a JSON file and not iterating multiple sub-objects

Following the MS documentation, I can get a simple example of loading a JSON file into SQL results working. The problems occur when I have more than one sub-object. This code will traverse all elements if they are at root level. Because I have two objects under "Purchase", I have to explicitly reference them. Is there an easier way to return results for all sub-objects? In this case I would like two rows of Order info.
I also have to hard-code the filename in OPENROWSET instead of using @file. Any ideas on the syntax to pass in a variable for the file?
Code
USE TempDB

DECLARE @json AS NVARCHAR(MAX)
DECLARE @file AS NVARCHAR(MAX)

SET @file = 'c:\temp\test.json';

SELECT @json = BulkColumn FROM OPENROWSET (BULK 'c:\temp\test2.json', SINGLE_CLOB) AS j

SELECT *
FROM OPENJSON ( @json )
WITH (
    Number   varchar(200) '$.Purchase[0].Order.Number',
    Date     datetime     '$.Purchase[0].Order.Date',
    Customer varchar(200) '$.Purchase[0].AccountNumber',
    Quantity int          '$.Purchase[0].Item.Quantity'
)
File contents:
{
  "Purchase": [
    {
      "Order": {
        "Number": "SO43659",
        "Date": "2011-05-31T00:00:00"
      },
      "AccountNumber": "AW29825",
      "Item": {
        "Price": 2024.9940,
        "Quantity": 1
      }
    },
    {
      "Order": {
        "Number": "SO43661",
        "Date": "2011-06-01T00:00:00"
      },
      "AccountNumber": "AW73565",
      "Item": {
        "Price": 2024.9940,
        "Quantity": 3
      }
    }
  ]
}
Reference:
https://learn.microsoft.com/en-us/sql/relational-databases/json/convert-json-data-to-rows-and-columns-with-openjson-sql-server?view=sql-server-2017#option-2---openjson-output-with-an-explicit-structure
Thanks,
Bill
To get both rows, you need to use the second argument of the OPENJSON function, like this:
SELECT *
FROM OPENJSON ( @json, '$.Purchase' )
WITH (
    Number   varchar(200) '$.Order.Number',
    Date     datetime     '$.Order.Date',
    Customer varchar(200) '$.AccountNumber',
    Quantity int          '$.Item.Quantity'
)
This way you are telling SQL Server that you want all the nodes under the '$.Purchase' path (and it finds two rows). Without that, you would get all the nodes under root (and it finds just one row, the Purchase node).
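For the second part of the question (passing the file name as a variable): OPENROWSET(BULK ...) only accepts a literal path, so the usual workaround, not shown in the original answers, is to build the statement dynamically, along these lines:

DECLARE @file NVARCHAR(260) = N'c:\temp\test.json';
DECLARE @json NVARCHAR(MAX);
DECLARE @sql  NVARCHAR(MAX) =
    N'SELECT @json = BulkColumn FROM OPENROWSET (BULK ''' + @file + N''', SINGLE_CLOB) AS j;';

-- Run the dynamic statement and capture the file contents into @json.
EXEC sys.sp_executesql @sql, N'@json NVARCHAR(MAX) OUTPUT', @json = @json OUTPUT;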

TSQL Select basic hierarchical data from JSON

This is a ridiculously simple question, but I cannot find a single working example anywhere. MSDN hints that it is possible (here and here) but misses the actual example, and Google presents a myriad of examples outputting JSON from T-SQL, whereas I need the reverse.
Taking a most basic JSON structure:
DECLARE @json nvarchar(max) = N'[{
    "Id": 1,
    "name": "John",
    "skills": [
        {"title": "Azure" },
        {"title": "VB" },
        {"title": "JavaScript" }]
}, {
    "Id": 2,
    "name": "Jane",
    "skills": [
        {"title": "Azure" },
        {"title": "SQL" },
        {"title": "C#" }]
}]';
I figured how to get the highest-level values, such as Id and name:
SELECT *
FROM OPENJSON(@json) WITH (
    ID   int          '$.Id',
    name nvarchar(50) '$.name'
);
What I'd like is to output PersonId, and the respective skill titles for each, e.g.
PersonId    SkillTitle
-----------------------
1           Azure
1           VB
1           JavaScript
2           Azure
2           SQL
2           C#
Google only provides me with the reverse logic. My badly-broken attempt based on what I can find is here:
SELECT *
FROM OPENJSON(@json, '$.skills') WITH (
    PersonId   int          './Id',
    SkillTitle nvarchar(50) '$.title'
);
The code snippet below will give you the required results:
SELECT
    JSON_VALUE(c.value, '$.Id') AS ID,
    JSON_VALUE(p.value, '$.title') AS SkillTitle
FROM OPENJSON(@json) AS c
CROSS APPLY OPENJSON(c.value, '$.skills') AS p
This is implemented by CROSS APPLYing the JSON child node to the parent node and using the JSON_VALUE() function.
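An alternative sketch that stays in the WITH-clause style: return the skills array AS JSON from the outer OPENJSON, then expand it with a second OPENJSON (column aliases here follow the desired output, not an official pattern):

SELECT p.Id AS PersonId, s.title AS SkillTitle
FROM OPENJSON(@json) WITH (
    Id     int           '$.Id',
    skills nvarchar(max) '$.skills' AS JSON
) AS p
CROSS APPLY OPENJSON(p.skills) WITH (
    title nvarchar(50) '$.title'
) AS s;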

Issue in JOIN query in apache drill

File stored in Hive:
[
  {
    "occupation": "guitarist",
    "fav_game": "football",
    "name": "d1"
  },
  {
    "occupation": "dancer",
    "fav_game": "chess",
    "name": "k1"
  },
  {
    "occupation": "traveller",
    "fav_game": "cricket",
    "name": "p1"
  },
  {
    "occupation": "drummer",
    "fav_game": "archery",
    "name": "d2"
  },
  {
    "occupation": "farmer",
    "fav_game": "cricket",
    "name": "k2"
  },
  {
    "occupation": "singer",
    "fav_game": "football",
    "name": "s1"
  }
]
CSV file in hadoop:
name,age,city
d1,23,delhi
k1,23,indore
p1,23,blore
d2,25,delhi
k2,30,delhi
s1,25,delhi
I queried them individually and they work fine. Then I tried a join query:
select * from hdfs.`/demo/distribution.csv` d join hive.demo.`user_details` u on d.name = u.name
I got the following issue:
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: DrillRuntimeException: Join only supports implicit casts between 1. Numeric data 2. Varchar, Varbinary data 3. Date, Timestamp data Left type: INT, Right type: VARCHAR. Add explicit casts to avoid this error Fragment 0:0 [Error Id: b01db9c8-fb35-4ef8-a1c0-31b68ff7ae8d on IMPETUS-DSRV03.IMPETUS.CO.IN:31010]
Please refer to this: https://drill.apache.org/docs/data-type-conversion/
We need to do explicit typecasting to deal with such a scenario.
Consider that we have a JSON file employee.json and a CSV file sample.csv. In order to query both at the same time, in one query, we need to do type casting.
0: jdbc:drill:zk=local> select emp.employee_id, dept.department_description, phy.columns[2], phy.columns[3] FROM cp.`employee.json` emp , cp.`department.json` dept, dfs.`/tmp/sample.csv` phy where CAST(emp.employee_id AS INT) = CAST(phy.columns[0] AS INT) and emp.department_id = dept.department_id;
Here we are typecasting CAST(emp.employee_id AS INT) = CAST(phy.columns[0] AS INT) so that the equality does not fail.
Refer to this for more detail: http://www.devinline.com/2015/11/apache-drill-setup-and-SQL-query-execution.html#multiple_src
You need to cast even though by default it is taken as varchar. Try this:
select * from hdfs.`/demo/distribution.csv` d join hive.demo.`user_details` u on cast(d.name as VARCHAR) = cast(u.name as VARCHAR)
But you cannot refer to the column name directly from the CSV; you need to use columns[0] for name.
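Putting the two points together, a query along these lines might work (a sketch only; it assumes header extraction is disabled for the CSV storage format, so the file is exposed through the columns array and the header row has to be filtered out):

select u.name, d.columns[1] as age, d.columns[2] as city
from hdfs.`/demo/distribution.csv` d
join hive.demo.`user_details` u
  on cast(d.columns[0] as VARCHAR) = cast(u.name as VARCHAR)
where d.columns[0] <> 'name';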