I'm trying to construct a JSON-serialized list of key/value pair items from my SQL database (compat level 140). The trick is that the values can be anything: numbers, strings, null, or other JSON objects.
It should look something like this:
[{"key":"key1","value":"A String"},{"key":"key2","value":{"InnerKey":"InnerValue"}}]
However, SQL seems to be forcing me to select either a string or an object.
SELECT
    [key] = kvp.[key],
    [value] = CASE
        WHEN ISJSON(kvp.[value]) = 1 THEN JSON_QUERY(kvp.[value])
        ELSE '"' + kvp.[value] + '"' -- See note below
    END
FROM (VALUES
    ('key1', 'This value is a string')
    ,('key2', '{"description":"This value is an object"}')
    ,('key3', '["This","value","is","an","array","of","strings"]')
    ,('key4', NULL)
    -- Without these lines, the above 4 work fine; with either of them, even those 4 are broken
    --,('key5', (SELECT [description] = 'This value is a dynamic object' FOR JSON PATH, WITHOUT_ARRAY_WRAPPER))
    --,('key6', JSON_QUERY((SELECT [description] = 'This value is a dynamic object' FOR JSON PATH, WITHOUT_ARRAY_WRAPPER)))
) AS kvp([key], [value])
FOR JSON PATH
Am I trying to do something that SQL can't support, or am I just missing the proper syntax for making this work?
*Note that the addition of the double-quotes seems like it shouldn't be necessary. But without those, SQL fails to wrap the string and generates bad JSON:
[{"key":"key1","value":This value is a string},...
If your query is modified to this, it works:
SELECT
    [key] = kvp.[key],
    [value] = ISNULL(
        JSON_QUERY(CASE WHEN ISJSON(kvp.[value]) = 1 THEN kvp.[value] END),
        '"' + STRING_ESCAPE(kvp.[value], 'json') + '"'
    )
FROM (VALUES
    ('key1', 'This value is a "string"')
    ,('key2', '{"description":"This value is an object"}')
    ,('key3', '["This","value","is","an","array","of","strings"]')
    ,('key4', NULL)
    -- These now work
    ,('key5', (SELECT [description] = 'This value is a dynamic object' FOR JSON PATH, WITHOUT_ARRAY_WRAPPER))
    ,('key6', JSON_QUERY((SELECT [description] = 'This value is a dynamic object' FOR JSON PATH, WITHOUT_ARRAY_WRAPPER)))
) AS kvp([key], [value])
FOR JSON PATH, INCLUDE_NULL_VALUES
Of course, this wouldn't be sufficient if value were an int. Also, I can't really explain why yours doesn't work.
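For what it's worth, one way to extend this to numeric values stored in the varchar column is to let anything that parses as a number pass through unquoted. A minimal sketch, assuming TRY_CONVERT(float, ...) is an acceptable numeric test and that the ISNULL/JSON_QUERY passthrough still behaves the same with the extra CASE (worth verifying); the 'key7' row is a hypothetical example:
SELECT
    [key] = kvp.[key],
    [value] = ISNULL(
        JSON_QUERY(CASE WHEN ISJSON(kvp.[value]) = 1 THEN kvp.[value] END),
        CASE
            WHEN TRY_CONVERT(float, kvp.[value]) IS NOT NULL
                THEN kvp.[value] -- looks numeric: emit unquoted (values like '007' would need extra care)
            ELSE '"' + STRING_ESCAPE(kvp.[value], 'json') + '"'
        END
    )
FROM (VALUES
    ('key1', 'A plain string')
    ,('key7', '42') -- hypothetical row with a numeric value stored as varchar
) AS kvp([key], [value])
FOR JSON PATH, INCLUDE_NULL_VALUES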
I have a dimension table which has a JSON column. I am using SCD Type 1 to update values in it.
string1={ "Name":"Suneel","Age":23,}
String2={"Name":"Suneel Kumar","Age":23,"City":"Banglore"}
I need JSON which is as below:
{"Name":"Suneel Kumar","Age":23,"City":"Banglore"}
Note: since it is part of a dynamic stored procedure, the properties present in the JSON may vary.
There is no JSON merge function in SQL Server.
You can hack it by breaking open the JSONs using OPENJSON and full-joining them, but it is made significantly more complicated by the fact that there is no JSON_OBJ_AGG function.
SELECT *
FROM Original o
JOIN NewData nd ON someCondition
CROSS APPLY (
    SELECT '{' +
        STRING_AGG(
            CONCAT(
                '"',
                ISNULL(j2.[key], j1.[key]),  -- prefer the new value's key, fall back to the original's
                '":',
                CASE WHEN ISNULL(j2.type, j1.type) = 1 THEN '"' END,  -- type 1 = string, which needs quotes
                CASE WHEN ISNULL(j2.type, j1.type) = 0 THEN 'null' ELSE ISNULL(j2.value, j1.value) END,
                CASE WHEN ISNULL(j2.type, j1.type) = 1 THEN '"' END  -- note: string values are not re-escaped here
            ),
            ','
        ) + '}'
    FROM OPENJSON(o.Json) j1
    FULL JOIN OPENJSON(nd.Json) j2 ON j2.[key] = j1.[key]
) merged(Json);
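For example, a self-contained sketch (assuming SQL Server 2017+ for STRING_AGG) applied to the two sample strings from the question, with string1's trailing comma removed since OPENJSON requires valid JSON:
DECLARE @old nvarchar(max) = N'{"Name":"Suneel","Age":23}';
DECLARE @new nvarchar(max) = N'{"Name":"Suneel Kumar","Age":23,"City":"Banglore"}';

SELECT '{' +
    STRING_AGG(
        CONCAT(
            '"', ISNULL(j2.[key], j1.[key]), '":',
            CASE WHEN ISNULL(j2.type, j1.type) = 1 THEN '"' END,
            CASE WHEN ISNULL(j2.type, j1.type) = 0 THEN 'null' ELSE ISNULL(j2.value, j1.value) END,
            CASE WHEN ISNULL(j2.type, j1.type) = 1 THEN '"' END
        ),
        ','
    ) + '}' AS MergedJson
FROM OPENJSON(@old) j1
FULL JOIN OPENJSON(@new) j2 ON j2.[key] = j1.[key];
-- Returns (key order not guaranteed): {"Name":"Suneel Kumar","Age":23,"City":"Banglore"}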
Is there an option to turn the event_params nested BQ field into a JSON field?
My desired output should look like this:
{"sessionId":123456789,"version":"1.005"}
Consider below
select *, (
  select '{' || string_agg(format('%s:%s',
    json_extract(kv, '$.key'),
    json_extract(kv, '$.string_value')
  )) || '}'
  from unnest(json_extract_array(to_json_string(event_params))) kv
) json
from `project.dataset.table`
If applied to the sample data in your question, it produces the expected output (screenshot omitted).
Update: I realized you changed/fixed the data sample, so see the updated query below:
select *, (
  select '{' || string_agg(format('%s:%s',
    json_extract(kv, '$.key'),
    json_extract(kv, '$.value.string_value')
  )) || '}'
  from unnest(json_extract_array(to_json_string(event_params))) kv
) json
from `project.dataset.table`
with output (screenshot omitted).
I made a version where number fields in the JSON object keep a proper numeric format, and where you can filter which keys end up in the JSON object:
with t as (
  -- fake example data with same format
  select * from unnest([
    struct([
      struct('session_id' as key, struct('123' as string_value) as value),
      ('timestamp', struct('1234567')),
      ('version', struct('2.23.65'))
    ] as event_params)
    ,struct([struct('session_id', struct('645')), ('timestamp', struct('7653365')), ('version', struct('3.675.34'))])
  ])
)
-- actual query
select
  event_params, -- original data for comparison
  format('{ %s }', -- for each row create one json object:
    (select -- string_agg will return one string with all key-value pairs comma-separated
      string_agg( -- within aggregation create key-value pairs
        if(key in ('timestamp','session_id'), -- if number fields
          format('"%s" : %s', key, value.string_value), -- then number format
          format('"%s" : "%s"', key, value.string_value)) -- else string format
        , ', ')
      from unnest(event_params) -- unnest turns array into a little table per row, so we can run SQL on it
      where key in ('session_id','version') -- filter for certain keys
    ) -- subquery end
  ) as json
from t
I have a table with about 900,000 rows. One of the columns has a 7 character value that I need to search for client-side validation of user input. I'm currently using ajax as the user types, but most of the users can out-run the ajax round trips and end up having to wait until all the validation calls return. So I want to shift the wait time to the initial load of the app and take advantage of browser caching. So I'll bundle, minify, and gzip the json file with webpack. I'll probably make it an entry that I can then require/ensure as the app loads.
To make the validation super fast on the client side, I want to produce a json file containing a single json structure keyed by the first two characters of the 7 character column, where each key's value is an array of all values that start with those two characters (see example below). I can then use indexOf to find the value within this segmented list and it will be very quick.
As mentioned above, I'm currently using ajax as the user types.
I'm not going to show my code because it's too complex. But I basically keep track of the pending ajax requests, and when the request that started with the last value the user entered (the value currently sitting in the text box) returns, I can show the user whether the entry exists or not. That way, if the requests return out of order, I'm not giving false positives.
I'm using SQL Server 2016, so I want to use FOR JSON to produce my desired output. Here's what I want to produce:
{
"00": [ "0000001", "0000002", ... ],
"10": [ "1000000", "1000001", ... ],
//...
"99": [ "9900000", "9900001", ...]
}
So far I have been unable to figure out how to use substring( mySevenDigitCol, 1, 2 ) as the key in the json object.
I'm not sure if this can be done using FOR JSON AUTO (["0000001", "0000002", ... ] is the difficult part), but the following approach, based on string manipulation, is one possible solution to your problem:
Input:
CREATE TABLE #Data (
    SevenDigitColumn varchar(7)
)

INSERT INTO #Data (SevenDigitColumn)
VALUES
    ('0000001'),
    ('0000002'),
    ('0000003'),
    ('0000004'),
    ('0000005'),
    ('1000001'),
    ('1000002'),
    ('9900001'),
    ('9900002')
T-SQL:
;WITH JsonData AS (
    SELECT
        SUBSTRING(dat.SevenDigitColumn, 1, 2) AS [Key],
        agg.StringAgg AS [Values]
    FROM #Data dat
    CROSS APPLY (
        SELECT STUFF(
            (
                SELECT CONCAT(',"', SevenDigitColumn, '"')
                FROM #Data
                WHERE SUBSTRING(SevenDigitColumn, 1, 2) = SUBSTRING(dat.SevenDigitColumn, 1, 2)
                FOR XML PATH('')
            ), 1, 1, '') AS StringAgg
    ) agg
    GROUP BY SUBSTRING(dat.SevenDigitColumn, 1, 2), agg.StringAgg
)
SELECT CONCAT(
    '{',
    STUFF(
        (
            SELECT CONCAT(',"', [Key], '": [', [Values], ']')
            FROM JsonData
            FOR XML PATH('')
        ), 1, 1, ''),
    '}')
Output:
{"00": ["0000001","0000002","0000003","0000004","0000005"],"10": ["1000001","1000002"],"99": ["9900001","9900002"]}
Notes:
With SQL Server 2017+, you can use the STRING_AGG() function:
SELECT CONCAT(
    '{',
    STRING_AGG(KeyValue, ','),
    '}'
)
FROM (
    SELECT CONCAT(
        '"',
        SUBSTRING(dat.SevenDigitColumn, 1, 2),
        '": [',
        STRING_AGG('"' + SevenDigitColumn + '"', ','),
        ']'
    ) AS KeyValue
    FROM #Data dat
    GROUP BY SUBSTRING(dat.SevenDigitColumn, 1, 2)
) JsonData
Notes:
If your column data is not only digits, you should use STRING_ESCAPE() with 'json' as the second parameter to escape special characters.
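For example, a quick sketch of what STRING_ESCAPE does to a made-up value containing special characters:
SELECT STRING_ESCAPE('ab"cd\ef', 'json') AS Escaped
-- Returns: ab\"cd\\ef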
Say I have two JSON strings as follows:
[{"RowId":102787,"UserId":1,"Activity":"This is another test","Timestamp":"2017-11-25T14:37:30.3700000"}]
[{"RowId":102787,"UserId":2,"Activity":"Testing the Update function","Timestamp":"2017-11-25T14:37:30.3700000"}]
Both have the same properties but two of the properties in the second string have different values than the first (UserId and Activity). Is it possible, in Azure SQL Database T-SQL, to generate a third JSON string that contains the values in the second string that are different from the first? In other words, I'd like a string returned that looks like this:
[{"UserId":2,"Activity":"Testing the Update function"}]
Also, the solution should assume that the properties in the JSON strings are not known. I need this to be a generic solution for any two JSON strings.
I have not tried this on Azure, but it seems to work on SQL Server 2017.
There is probably a more elegant way to get to the final JSON string other than through string manipulation; perhaps we can update the answer as better ways are found.
-- Expected : [{"UserId":2,"Activity":"Testing the Update function"}]
DECLARE @jsonA NVARCHAR(MAX) = '[{"RowId":102787,"UserId":1,"Activity":"This is another test","Timestamp":"2017-11-25T14:37:30.3700000"}]'
    ,@jsonB NVARCHAR(MAX) = '[{"RowId":102787,"UserId":2,"Activity":"Testing the Update function","Timestamp":"2017-11-25T14:37:30.3700000"}]'
    ,@result NVARCHAR(MAX) = ''

-- strip the array brackets so OPENJSON shreds the single object directly
SELECT @jsonA = REPLACE(REPLACE(@jsonA, ']', ''), '[', '')
    ,@jsonB = REPLACE(REPLACE(@jsonB, ']', ''), '[', '')

;WITH DSA AS
(
    SELECT *
    FROM OPENJSON(@jsonA)
)
,DSB AS
(
    SELECT *
    FROM OPENJSON(@jsonB)
)
SELECT @result += CONCAT (
    '"', B.[key], '":'
    ,IIF(B.[type] = 2, B.[value], CONCAT('"', B.[value], '"')) -- type 2 = number (no quotes); haven't checked types other than 1 and 2 (3 is the boolean type)
    ,','
)
FROM DSA A
JOIN DSB B ON A.[key] = B.[key]
WHERE A.[value] != B.[value]

SELECT CONCAT('[{', LEFT(@result, LEN(@result) - 1), '}]')
I've looked through a few different posts trying to find a solution for this. I have a column that contains descriptions that follow this format:
String<Numeric>
However, the column isn't limited to one instance of the previously mentioned format; it could be something like
UNI<01> JPG<84>
JPG<84> UNI<01>
JPG<84>
UNI<01>
And other variations without any controlled pattern.
What I am needing to do is extract the number between <> into a separate column in another table, based on the string before the <>. So UNI would qualify the following numeric to go to a certain table.column, while JPG would qualify it to go to another table, etc. I have seen functions to extract the numeric, but none that qualify it first, pulling the numeric only if it is prefaced with a given qualifier string.
Based on the scope limitation mentioned in the question's comments that only one type of token (Foo, Bar, Blat, etc.) needs to be found at a time: you could use an expression in a Derived Column to find the token of interest and then extract the value between the arrows.
For example:
(FINDSTRING([InputColumn], @[User::SearchToken] + "<", 1) == 0) ?
    NULL(DT_WSTR, 1) :
    SUBSTRING([InputColumn],
        FINDSTRING([InputColumn], @[User::SearchToken] + "<", 1)
            + LEN(@[User::SearchToken]) + 1,
        FINDSTRING(
            SUBSTRING([InputColumn],
                FINDSTRING([InputColumn], @[User::SearchToken] + "<", 1)
                    + LEN(@[User::SearchToken]) + 1,
                LEN([InputColumn])
            ), ">", 1) - 1
    )
First, the expression checks whether the token specified in @[User::SearchToken] is used in the current row. If it is, SUBSTRING is used to output the value between the arrows. If not, NULL is returned.
The assumption is made that no token's name will end with text matching the name of another token. Searching for token Bar will match Bar<123> and FooBar<123>. Accommodating Bar and FooBar as distinct tokens is possible but the requisite expression will be much more complex.
You could use an asynchronous Script Component that outputs a row with type and value columns for each type<value> token contained in the input string. Pass the output of this component through a Conditional Split to direct each type to the correct destination (e.g. table).
Pro: This approach gives you the option of using one data flow to process all tag types simultaneously vs. requiring one data flow per tag type.
Con: A Script Component is involved, which it sounds like you'd prefer to avoid.
Sample Script Component Code
private readonly string pattern = @"(?<type>\w+)<(?<value>\d+)>";

public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    // Emit one output row per type<value> token found in the input string
    foreach (Match match in Regex.Matches(Row.Data, pattern, RegexOptions.ExplicitCapture))
    {
        Output0Buffer.AddRow();
        Output0Buffer.Type = match.Groups["type"].Value;
        Output0Buffer.Value = match.Groups["value"].Value;
    }
}
Note: the Script Component will need an output created with two columns (perhaps named Type and Value), with the output's SynchronousInputID property set to None.
I ended up writing a CTE for a view to handle the data manipulation and then handled the joins and other data pieces in the SSIS package.
;WITH RCTE (Status_Code, lft, rgt, idx)
AS (
    SELECT a.Status_code
        ,LEFT(a.Description, CASE WHEN CHARINDEX(' ', a.Description) = 0 THEN LEN(a.Description) ELSE CHARINDEX(' ', a.Description) - 1 END)
        ,SUBSTRING(a.Description, CASE WHEN CHARINDEX(' ', a.Description) = 0 THEN LEN(a.Description) ELSE CHARINDEX(' ', a.Description) - 1 END + 1, DATALENGTH(a.Description))
        ,0
    FROM [disp] a WHERE NOT (Description IS NULL OR Description = '')

    UNION ALL

    SELECT r.Status_Code
        ,CASE WHEN CHARINDEX(' ', r.rgt) = 0 THEN r.rgt ELSE LEFT(r.rgt, CHARINDEX(' ', r.rgt) - 1) END
        ,CASE WHEN CHARINDEX(' ', r.rgt) > 0 THEN SUBSTRING(r.rgt, CHARINDEX(' ', r.rgt) + 1, DATALENGTH(r.rgt)) ELSE '' END
        ,idx + 1
    FROM RCTE r
    WHERE DATALENGTH(r.rgt) > 0
)
SELECT Status_Code
    -- ,lft, rgt -- Uncomment to see what's going on
    ,SUBSTRING(lft, 0, CHARINDEX('<', lft)) AS [Description]
    ,CASE WHEN ISNUMERIC(SUBSTRING(lft, CHARINDEX('<', lft) + 1, LEN(lft) - CHARINDEX('<', lft) - 1)) > 0
        THEN CAST(SUBSTRING(lft, CHARINDEX('<', lft) + 1, LEN(lft) - CHARINDEX('<', lft) - 1) AS INT) ELSE NULL END AS Value
FROM RCTE
WHERE lft <> ''
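For illustration, a hypothetical [disp] setup matching the sample descriptions from the question (the real table, column types, and status codes are assumptions), and the rows the view would then return:
CREATE TABLE [disp] (Status_Code varchar(10), Description varchar(100))
INSERT INTO [disp] (Status_Code, Description)
VALUES ('A1', 'UNI<01> JPG<84>'), ('B2', 'JPG<84>')

-- The CTE then yields one row per token (row order not guaranteed):
-- Status_Code  Description  Value
-- A1           UNI          1
-- A1           JPG          84
-- B2           JPG          84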