SQL Server dynamic JSON using within Analysis Services? - json

I am trying to get my head around which direction to even start with the following.
Imagine a dynamic form (JSON) that I store in SQL Server 2016+. So far I have seen / tried a couple of dynamic queries that take the dynamic JSON and flatten it out as columns.
Given the "dynamic" nature, it is hard to "store" that flattened-out data. I have been looking at temporary/temporal/memory tables to hold that dynamic flattened data for a relatively short period of time (say an hour or two).
I have also been asked if it is possible to use the dynamic JSON data in building a cube within Analysis Services. Again, given the dynamic nature of this, would something like this even be possible?
I guess my question is two-fold:
Pointers to flatten out dynamic JSON within SQL Server
Is it possible to take dynamic JSON, flatten out to columns and somehow use within Analysis Services? i.e. ultimately to use within a cube?
Realise the above is a bit vague, but any pointers to get me going in the correct direction would be appreciated!
Many thanks.

Dynamically converting JSON into columns can get tricky, especially if you are NOT certain of the structure. That said, have you considered converting the JSON into a hierarchy via a recursive CTE?
Example
declare @json varchar(max)='
[
  {
    "url": "https://www.google.com",
    "image-url": "https://www.google.com/imghp",
    "labels": [
      {
        "source": "Bob, Inc",
        "name": "Whips",
        "info": "Ouch"
      },
      {
        "source": "Weezles of Oregon",
        "name": "Chains",
        "info": "Let me go"
      }
    ],
    "Fact": "Fictional"
  }
]';
;with cte0 as (
    Select *
          ,[Level] = 1
          ,[Path]  = convert(varchar(max),row_number() over(order by (select null)))
    From OpenJSON(@json,'$')
    Union All
    Select R.*
          ,[Level] = p.[Level]+1
          ,[Path]  = concat(p.[Path],'\',row_number() over(order by (select null)))
    From cte0 p
    Cross Apply OpenJSON(p.value,'$') R
    Where p.[Type]>3  -- recurse only into arrays (type 4) and objects (type 5)
)
Select [Level]
      ,[Path]
      ,Title = replicate('|---',[Level]-1)+[Key]
      ,Item  = [Key]
      ,Value = case when [type]<4 then Value else null end
From cte0
Order By [Path]
Returns
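Against the sample JSON above, the output comes back roughly like this (row order and the exact Path values depend on the arbitrary row numbering, so treat this as an illustration):

Level  Title                 Item       Value
1      0                     0          NULL
2      |---url               url        https://www.google.com
2      |---image-url         image-url  https://www.google.com/imghp
2      |---labels            labels     NULL
3      |---|---0             0          NULL
4      |---|---|---source    source     Bob, Inc
4      |---|---|---name      name       Whips
4      |---|---|---info      info       Ouch
3      |---|---1             1          NULL
4      |---|---|---source    source     Weezles of Oregon
4      |---|---|---name      name       Chains
4      |---|---|---info      info       Let me go
2      |---Fact              Fact       Fictional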

Related

Using json_extract in sqlite to pull data from parent and child objects

I'm starting to explore the JSON1 library for sqlite and have been so far successful in the basic queries I've created. I'm now looking to create a more complicated query that pulls data from multiple levels.
Here's the example JSON object I'm starting with (and most of the data is very similar).
{
  "height": 140.0,
  "id": "cp",
  "label": {
    "bind": "cp_label"
  },
  "type": "color_picker",
  "user_data": {
    "my_property": 2
  },
  "uuid": "948cb959-74df-4af8-9e9c-c3cb53ac9915",
  "value": {
    "bind": "cp_color"
  },
  "width": 200.0
}
This json object is buried about seven levels deep in a json structure and I pulled it from the larger json construct using an SQL statement like this:
SELECT value FROM forms, json_tree(forms.formJSON, '$.root')
WHERE type = 'object'
AND json_extract(value, '$.id') = @sControlID
(In this example, @sControlID is a parameter that represents the `id` value we're looking for, which is 'cp'.)
But what I really need to pull from this object are the following:
the value from key type ("color_picker" in this example)
the values from keys bind ("cp_color" and "cp_label" in this example)
the keys value and label (which have values of {"bind":"<string>"} in this example)
For that last item, the key name (value and label in this case) can be any number of keywords, but no matter the keyword, the value will be an object of the form {"bind":"<some_string>"}. Also, there could be multiple keys that have a bind object associated with them, and I'd need to return all of them.
For the first two items, the keywords will always be type and bind.
With the json example above, I'd ideally like to retrieve two rows:
type          key    value
------------  -----  --------
color_picker  value  cp_color
color_picker  label  cp_label
When I use json_extract methods, I end up retrieving the object {"bind":"cp_color"} from the json_tree table, but I also need to retrieve the data from the parent object. I feel like I need to do some kind of union, but my attempts have so far been unsuccessful. Any ideas here?
Note: if the {"bind":"<string>"} object doesn't exist as a child of the parent object, I don't want any rows returned.
Well, I was on the right track and eventually figured it out. I created a separate query for each of the items I was looking for, then INNER JOINed all the json_tree tables from each of the queries to have all the required fields available. Then I json_extracted the required data from each of the json fields I needed data from. In the end, it gave me exactly what I was looking for, though I'm sure it could be written more efficiently.
For anyone interested, this is what the final query ended up looking like:
SELECT IFNULL(json_extract(parent.value, '$.type'), '_window_'),
       child.key,
       json_extract(child.value, '$.bind')
FROM
    (SELECT json_tree.* FROM nui_forms, json_tree(nui_forms.formJSON, '$')
     WHERE type = 'object'
       AND json_extract(nui_forms.formJSON, '$.id') = @sWindowID) parent
INNER JOIN
    (SELECT json_tree.* FROM nui_forms, json_tree(nui_forms.formJSON, '$')
     WHERE type = 'object'
       AND json_extract(value, '$.bind') != 'NULL'
       AND json_extract(nui_forms.formJSON, '$.id') = @sWindowID) child
    ON child.parent = parent.id;
If you have any tips on reducing its complexity, feel free to comment!
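One possible simplification, offered as an untested sketch: since json_tree already exposes parent pointers, you can scan the same document twice in one FROM clause instead of wrapping each scan in a subquery, and use IS NOT NULL instead of comparing against the string 'NULL':
-- Sketch only: assumes the same nui_forms table and @sWindowID parameter as above.
SELECT IFNULL(json_extract(parent.value, '$.type'), '_window_') AS type,
       child.key,
       json_extract(child.value, '$.bind') AS bind
FROM nui_forms,
     json_tree(nui_forms.formJSON, '$') AS child,
     json_tree(nui_forms.formJSON, '$') AS parent
WHERE child.parent = parent.id
  AND parent.type = 'object'
  AND child.type = 'object'
  AND json_extract(child.value, '$.bind') IS NOT NULL
  AND json_extract(nui_forms.formJSON, '$.id') = @sWindowID;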

T-SQL - search in filtered JSON array

SQL Server 2017.
Table OrderData has column DataProperties where JSON is stored. JSON example stored there:
{
  "Input": {
    "OrderId": "abc",
    "Data": [
      {
        "Key": "Files",
        "Value": [
          "test.txt",
          "whatever.jpg"
        ]
      },
      {
        "Key": "Other",
        "Value": [
          "a"
        ]
      }
    ]
  }
}
So, it's an object with an Input object, which has a Data array of key/value pairs - objects with a Key string and a Value array of strings.
And my problem - I need to query for rows based on the values under the Files key in the example JSON - a simple LIKE that matches %text%.
This query works:
SELECT TOP 10 *
FROM OrderData CROSS APPLY OPENJSON(DataProperties,'$.Input.Data') dat
WHERE JSON_VALUE(dat.value, '$.Key') = 'Files' and dat.[key] = 0
AND JSON_QUERY(dat.value, '$.Value') LIKE '%2%'
Problem is that this query is very slow, unsurprisingly.
How to make it faster?
I cannot create a computed column with JSON_VALUE, because I need to filter inside an array.
I cannot create a computed column with JSON_QUERY on "$.Input.Data" or "$.Input.Data[0].Value" - because I need the specific array item in this array whose Key == "Files".
I've searched, but it seems that you cannot create a computed column that also filters data, as in this attempt:
ALTER TABLE OrderData
ADD aaaTest AS (SELECT JSON_QUERY(dat.value, '$.Value')
                FROM OPENJSON(DataProperties,'$.Input.Data') dat
                WHERE JSON_VALUE(dat.value, '$.Key') = 'Files' AND dat.[key] = 0);
Error: Subqueries are not allowed in this context. Only scalar expressions are allowed.
What are my options?
1. Add a Files column with an index and use INSERT/UPDATE triggers that populate this column on inserts/updates?
2. Create a view that "computes" this column? Can't add an index to it, so it will still be slow.
So far only option 1 has some merit, but I don't like triggers - maybe there's another option?
You might try something along these lines:
Attention: I've added a 2 to the text2 to fulfill your filter, and I renamed both keys to the plural "Values":
DECLARE @mockupTable TABLE(ID INT IDENTITY, DataProperties NVARCHAR(MAX));
INSERT INTO @mockupTable VALUES
(N'{
  "Input": {
    "OrderId": "abc",
    "Data": [
      {
        "Key": "Files",
        "Values": [
          "test2.txt",
          "whatever.jpg"
        ]
      },
      {
        "Key": "Other",
        "Values": [
          "a"
        ]
      }
    ]
  }
}');
The query
SELECT TOP 10 *
FROM @mockupTable t
CROSS APPLY OPENJSON(t.DataProperties,'$.Input.Data')
            WITH([Key]    NVARCHAR(100)
                ,[Values] NVARCHAR(MAX) AS JSON) dat
WHERE dat.[Key] = 'Files'
  AND dat.[Values] LIKE '%2%';
The main difference is the WITH clause, which is used to return the properties inside an object in a typed way and side-by-side (similar to a naked OPENJSON with a PIVOT for all columns - but much better). This avoids expensive JSON methods in your WHERE clause...
Hint: As we return the Values column with NVARCHAR(MAX) AS JSON, we can continue with the nested array and might proceed with something like this:
SELECT TOP 10 *
FROM @mockupTable t
CROSS APPLY OPENJSON(t.DataProperties,'$.Input.Data')
            WITH([Key]    NVARCHAR(100)
                ,[Values] NVARCHAR(MAX) AS JSON) dat
WHERE dat.[Key] = 'Files'
  --we read the array again with OPENJSON:
  AND 'test2.txt' IN (SELECT [Value] FROM OPENJSON(dat.[Values]));
You might use one more CROSS APPLY to expose the array's values and filter them in the WHERE directly.
SELECT TOP 10 *
FROM @mockupTable t
CROSS APPLY OPENJSON(t.DataProperties,'$.Input.Data')
            WITH([Key]    NVARCHAR(100)
                ,[Values] NVARCHAR(MAX) AS JSON) dat
CROSS APPLY OPENJSON(dat.[Values]) vals
WHERE dat.[Key] = 'Files'
  AND vals.[Value] = 'test2.txt';
Just check it out...
This is an old question, but I would like to revisit it. There isn't any mention of how the source table is actually constructed in terms of indexing. If the original author is still around, can you confirm/deny what indexing strategy you used? For performant JSON document queries, I've found that a table using the COLUMNSTORE indexing strategy yields very performant JSON queries even with large amounts of data.
https://learn.microsoft.com/en-us/sql/relational-databases/json/store-json-documents-in-sql-tables?view=sql-server-ver15 has an example of different indexing techniques. For my personal solution I've been using COLUMNSTORE, albeit on a limited NVARCHAR document size. It's fast enough for any purposes I have, even under millions of rows of decently sized JSON documents.
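As a rough sketch of that setup (the table, column, and index names here are illustrative; the NVARCHAR(4000) cap mirrors the limited-document-size caveat above, and the inline-index syntax follows the linked Microsoft example):
CREATE TABLE OrderDataCS
(
    ID BIGINT IDENTITY,
    DataProperties NVARCHAR(4000),  -- capped so documents compress well in the columnstore
    INDEX cci_OrderDataCS CLUSTERED COLUMNSTORE
);

-- JSON-shredding scans then run as compressed, batch-mode scans
-- instead of row-by-row reads:
SELECT d.ID
FROM OrderDataCS d
CROSS APPLY OPENJSON(d.DataProperties,'$.Input.Data')
            WITH([Key] NVARCHAR(100), [Values] NVARCHAR(MAX) AS JSON) dat
WHERE dat.[Key] = 'Files';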

Constructing nested json arrays in T-SQL

Given the table:
C1 C2 C3
----------------
1 'v1' 1.1
2 'v2' 2.2
3 'v3' 3.3
Is there any "easy" way to return JSON in this format:
{
  "columns": [ "C1", "C2", "C3" ],
  "rows": [
    [ 1, "v1", 1.1 ],
    [ 2, "v2", 2.2 ],
    [ 3, "v3", 3.3 ]
  ]
}
To generate an array with single values from a table there is a neat trick like this:
SELECT JSON_QUERY(REPLACE(REPLACE(
(
SELECT id
FROM table a
WHERE pk in (1,2)
FOR JSON PATH
), '{"id":',''),'}','')) 'ids'
Which generates
"ids": [1,2]
But to construct the nested array above, the replacing gets really tedious - does anyone know a good way to achieve this?
Well, you ask for an easy way, but the following will not be easy :-)
The tricky part is to know which values need to be quoted and which can remain naked.
This needs generic type analysis to find which values are strings.
The only way I know to get at metadata (besides building dynamic SQL using meta views like INFORMATION_SCHEMA.COLUMNS) is XML together with an AUTO-schema.
This XML is actually very close to your needs. There is a list of columns at the beginning, followed by a list of rows. But it is not JSON of course...
Try this out:
--This is a mockup table with the values you provided.
DECLARE @mockup TABLE(C1 INT,C2 VARCHAR(100),C3 DECIMAL(4,2));
INSERT INTO @mockup VALUES
 (1,'v1',1.1)
,(2,'v2',2.2)
,(3,'v3',3.3);

--Now we create an XML out of this
DECLARE @xml XML =
(
    SELECT *
    FROM @mockup t
    FOR XML RAW,XMLSCHEMA,TYPE
);
--Check the XML's content with SELECT @xml to see how it looks internally
--Now the real query can start:
SELECT '{"columns":[' +
       STUFF(@xml.query('declare namespace xsd="http://www.w3.org/2001/XMLSchema";
                         for $col in /xsd:schema/xsd:element//xsd:attribute
                         return
                         <x>,{concat("""",xs:string($col/@name),"""")}</x>
                        ').value('.','nvarchar(max)'),1,1,'') +
       '],"rows":[' +
       STUFF(
       (
           SELECT
           ',[' + STUFF(b.query('declare namespace xsd="http://www.w3.org/2001/XMLSchema";
                                 for $attr in ./@*
                                 return
                                 <x>,{if(/xsd:schema/xsd:element//xsd:attribute[@name=local-name($attr)]//xsd:restriction/@base="sqltypes:varchar") then
                                          concat("""",$attr,"""")
                                      else
                                          xs:string($attr)
                                     }
                                 </x>
                                ').value('.','nvarchar(max)'),1,1,'') + ']'
           FROM @xml.nodes('/*:row') B(b)
           FOR XML PATH(''),TYPE
       ).value('.','nvarchar(max)'),1,1,'') +
       ']}';
The result
{"columns":["C1","C2","C3"],"rows":[[3,"v3",3.30],[1,"v1",1.10],[2,"v2",2.20]]}
Some explanation:
The first part uses XQuery to find all columns (xsd:attribute within the XML schema) and creates the array of column names.
The second part again uses XQuery to run through all rows and write their column values into a concatenated string. Each value can refer to its type within the schema. Whenever this type is sqltypes:varchar, the value will be quoted. All other values remain naked.
This will not solve each and every case generically...
To be honest, this was more for my own curiosity :-) I wanted to find out how one can solve this.
Quite probably the best answer is: use another tool. SQL Server is not the best choice here ;-)
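That said, if the column list is known up front, the generic type analysis can be skipped entirely. Here is a much simpler, non-generic sketch using STRING_AGG (SQL Server 2017+); the column names and which column gets quoted are hard-coded, which is exactly the knowledge the XML approach above avoids assuming:
DECLARE @mockup TABLE(C1 INT, C2 VARCHAR(100), C3 DECIMAL(4,2));
INSERT INTO @mockup VALUES (1,'v1',1.1),(2,'v2',2.2),(3,'v3',3.3);

-- Build each row as [C1,"C2",C3], aggregate with commas, and wrap
-- in the fixed columns/rows envelope:
SELECT '{"columns":["C1","C2","C3"],"rows":['
     + STRING_AGG(CONCAT('[', C1, ',"', STRING_ESCAPE(C2,'json'), '",', C3, ']'), ',')
     + ']}'
FROM @mockup;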

Select and insert JSON file into SQL Server table

I'm trying to import the entirety of a JSON file into a table of mine in SQL Server.
The JSON data looks like this:
{
  "category": "General Knowledge",
  "type": "multiple",
  "difficulty": "hard",
  "question": "Electronic music producer Kygo's popularity skyrocketed after a certain remix. Which song did he remix?",
  "correct_answer": "Ed Sheeran - I See Fire",
  "incorrect_answers": [
    "Marvin Gaye - Sexual Healing",
    "Coldplay - Midnight",
    "a-ha - Take On Me"
  ]
},
With multiple entries like this.
I'm attempting to use OPENROWSET and OPENJSON to accomplish this using the following query:
SELECT value
FROM OPENROWSET (BULK 'C:\Users\USERNAME\Desktop\general_questions.json', SINGLE_CLOB) as j
CROSS APPLY OPENJSON(BulkColumn)
However, the output I'm getting only shows the first question object in the file. I have a two-part question:
How can I get my query to select ALL of the objects in the file and then insert all of those objects into a table in my SQL Server db?
If I understood you right, is it this:
SELECT value
FROM OPENROWSET (BULK 'C:\Users\USERNAME\Desktop\general_questions.json', SINGLE_CLOB) as j
CROSS APPLY OPENJSON(BulkColumn)
WITH
(
category VARCHAR(MAX),
...
) AS JSON_TABLE
Also, I am not sure what you mean by "However, the output I'm getting only shows the first question object in the file." Do you mean the question object has multiple attributes?
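For what it's worth, one possible reason only the first object shows up is that a file of comma-separated objects is not one valid JSON document. Assuming that is the file's layout (an assumption based on the trailing comma in the sample), one sketch is to wrap the bulk column in [ ] before parsing, then insert the typed columns; the target table dbo.Questions here is hypothetical:
-- Hypothetical target table:
CREATE TABLE dbo.Questions
(
    category          VARCHAR(100),
    [type]            VARCHAR(50),
    difficulty        VARCHAR(50),
    question          NVARCHAR(MAX),
    correct_answer    NVARCHAR(400),
    incorrect_answers NVARCHAR(MAX)  -- kept as a raw JSON array string
);

INSERT INTO dbo.Questions
SELECT j.*
FROM OPENROWSET (BULK 'C:\Users\USERNAME\Desktop\general_questions.json', SINGLE_CLOB) AS src
-- wrap in [ ] so multiple top-level objects parse as one array:
CROSS APPLY OPENJSON('[' + src.BulkColumn + ']')
WITH
(
    category          VARCHAR(100)  '$.category',
    [type]            VARCHAR(50)   '$.type',
    difficulty        VARCHAR(50)   '$.difficulty',
    question          NVARCHAR(MAX) '$.question',
    correct_answer    NVARCHAR(400) '$.correct_answer',
    incorrect_answers NVARCHAR(MAX) '$.incorrect_answers' AS JSON
) AS j;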

Using N1QL with document keys

I'm fairly new to couchbase and have tried to find the answer to a particular query I'm trying to create with not much success so far.
I've debated between using a view or N1QL for this particular case and settled on N1QL, but haven't managed to get it to work, so maybe a view is better after all.
Basically I have the document key (Group_1) for the following document:
Group_1
{
  "cbType": "group",
  "ID": 1,
  "Name": "Group Atlas 3",
  "StoreList": [
    2,
    4,
    6
  ]
}
I also have 'store' documents; their keys are listed in this document's StoreList (Store_2, Store_4, Store_6, and they have a storeID value of 2, 4 and 6 respectively). I basically want to obtain all 3 documents listed.
What I do have that works is I obtain this document with its id by doing:
var result = CouchbaseManager.Bucket.Get<dynamic>(couchbaseKey);
mygroup = JsonConvert.DeserializeObject<Group> (result.ToString());
I can then loop through its StoreList and obtain all its stores in the same manner, but I don't need anything else from the group; all I want are the stores, and I would have preferred to do this in a single operation.
Does anyone know how to run a N1QL query directly against a specified document's values?
Something like (and this is total imaginary non working code I'm just trying to clearly illustrate what I'm trying to get at):
SELECT * FROM mycouchbase WHERE documentkey IN
Group_1.StoreList
Thanks
UPDATE:
So Nic's solution does not work;
This is the closest I get to what I need atm:
SELECT b from DataBoard c USE KEYS ["Group_X"] UNNEST c.StoreList b;
"results":[{"b":2},{"b":4},{"b":6}]
Which returns the list of IDs of the Stores I want for any given group (Group_X) - I haven't found a way to get the full Stores instead of just the ID in the same statement yet.
Once I have, I'll post the full solution as well as all the speed bumps I've encountered in the process.
I apologize if I have a misunderstanding of your question, but I'm going to give it my best shot. If I misunderstood, please let me know and we'll work from there.
Let's use the following scenario:
group_1
{
  "cbType": "group",
  "ID": 1,
  "Name": "Group Atlas 3",
  "StoreList": [
    2,
    4,
    6
  ]
}
store_2
{
  "cbType": "store",
  "ID": 2,
  "name": "some store name"
}
store_4
{
  "cbType": "store",
  "ID": 4,
  "name": "another store name"
}
store_6
{
  "cbType": "store",
  "ID": 6,
  "name": "last store name"
}
Now let's say you want to query the stores of a particular group (group_1), but include no other information about the group. You essentially want to use N1QL's UNNEST and JOIN operators.
This might leave you with a query like so:
SELECT
stores.name
FROM `bucket-name-here` AS groups
UNNEST groups.StoreList AS groupstore
JOIN `bucket-name-here` AS stores ON KEYS ("store_" || groupstore.ID)
WHERE
META(groups).id = 'group_1';
A few assumptions are made in this. Both your documents exist in the same bucket and you only want to select from group_1. Of course you could use a LIKE and switch the group id to a percent wildcard.
Let me know if something doesn't make sense.
Best,
Try this query:
select b.name
from bucketname a join bucketname b ON KEYS a.StoreList
where a.Name="Group Atlas 3"
Based on your update, you can do the following:
SELECT b, s
FROM DataBoard c USE KEYS ["Group_X"]
UNNEST c.StoreList b
JOIN store_bucket s ON KEYS "Store_" || TO_STRING(b);
I have a similar requirement and I got what I needed with a query like this:
SELECT store
FROM `bucket-name-here` `group`
JOIN `bucket-name-here` store ON KEYS `group`.StoreList
WHERE `group`.cbType = 'group'
AND `group`.ID = 1