Merging JSON objects in SQL Server

We have a base table which has a custom values column (Custom_value).
Table structure:
User_Dimension:
Userid,username,addresss,Custom_value
Userid is the primary key, and the customer can map the fields present in the file using our UI.
If any of the columns present in the files don't fit the columns present in our base tables, we create a custom column and store the values as JSON.
Userid,username,addresss,Custom_value
234,AK4140,BANGLORE,{"Pin":"522413","State":"Maharastra"}
The data is stored as shown above in a staging table.
Note: for the User_Dimension table there can be data from multiple files, so the custom values are different for each file, and that information is stored in a metadata table.
We are using SCD Type 1 for dimension tables.
The problem is merging the JSON column.
Consider this scenario:
User_Dimension
Userid,username,addresss,Service_Type,User_Type,Custom_value
234,ak4140,banglore,null,null,{"Pin":"522413","State":"Maharastra"}
The above entry was loaded into User_Dimension from File1.
Now I need to push the value below into my table from File2:
Userid,username,addresss,Service_Type,User_Type,Custom_value
234,NULL,NULL,Customer,DVV,{"Birthdate":"19-09-1995","State":"Karnataka"}
I am merging both values based on the Userid.
The problem is the Custom_value column. From the above entries I need to update it as shown here:
Userid,username,addresss,Service_Type,User_Type,Custom_value
234,ak4140,banglore,Customer,DVV,{"Pin":"522413","State":"Karnataka","Birthdate":"19-09-1995"}

I wrote a function which can perhaps be used to merge two JSON objects:
create or alter function dbo.FN_JSON_MERGE(@pJson1 NVARCHAR(MAX), @pJson2 NVARCHAR(MAX))
RETURNS NVARCHAR(MAX)
AS
BEGIN
    -- Get keys and values from the first json
    declare @t table ([key] nvarchar(max) collate database_default, [value] nvarchar(max) collate database_default, row_id int identity)

    insert into @t
    select [key], [value]
    from OPENJSON(@pJson1)

    -- Overwrite values of matching keys with the values from @pJson2
    update t
    set value = oj.value
    from @t t
    inner join OPENJSON(@pJson2) oj
        ON oj.[key] collate database_default = t.[key] collate database_default

    -- Append keys that exist only in @pJson2
    insert into @t
    select [key], [value]
    from OPENJSON(@pJson2) o
    where not exists(
        select 1
        from @t t2
        where t2.[key] collate database_default = o.[key] collate database_default
    )

    -- Finally generate new json...
    set @pJson2 = ''

    select @pJson2 = @pJson2 + ',' + '"' + [key] + '": "' + (value) + '"'
    from @t
    order by [row_id]

    return '{' + stuff(@pJson2, 1, 1, '') + '}'
END
Test code:
select dbo.FN_JSON_MERGE('{"Pin":"522413","State":"Maharastra"}'
, '{"Birthdate":"19-09-1995","State":"Karnataka"}')
-- returns {"Pin": "522413","State": "Karnataka","Birthdate": "19-09-1995"}
But there are a lot of BUTs. It might not handle very long strings, strings containing quotes, or other weirder JSON.
Also, it doesn't always keep the same attribute order as the original JSON.
It's likely to be very slow.
Finally, it doesn't directly handle merging data from three files, although you can nest the calls, as sketched below.
Right now, the second argument's values always overwrite the first's.
But maybe it can be of some use. You can always turn this into a procedure for better performance.
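For example, nesting the calls merges three JSON objects, and the function can be dropped into the staging-to-dimension update. The staging table name and the UPDATE below are assumptions based on the question, not tested against the real schema:
-- Merging custom values from three files by nesting the calls
-- (later arguments still win on duplicate keys)
select dbo.FN_JSON_MERGE(
           dbo.FN_JSON_MERGE(N'{"Pin":"522413"}', N'{"State":"Karnataka"}'),
           N'{"Birthdate":"19-09-1995"}')

-- A rough SCD Type 1 update from staging; dbo.User_Staging and the column list
-- are assumptions based on the question
update d
set d.username     = isnull(s.username, d.username),
    d.addresss     = isnull(s.addresss, d.addresss),
    d.Service_Type = isnull(s.Service_Type, d.Service_Type),
    d.User_Type    = isnull(s.User_Type, d.User_Type),
    d.Custom_value = dbo.FN_JSON_MERGE(d.Custom_value, s.Custom_value)
from dbo.User_Dimension d
inner join dbo.User_Staging s on s.Userid = d.Userid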

Related

Remove params from JSON in SQL Server

There is a JSON column in SQL Server tables with data like:
["1","2","3","4"]
and I want to delete "3" or ("2","4") (for example) from it.
Can I do it with Json_Modify or anything else?
JSON_MODIFY works by path; since you don't have any key to target and just have a simple list like that, you can do this:
DECLARE @JsonList NVARCHAR(1000) = N'["1","2","3","4"]';
DECLARE @NewList NVARCHAR(1000);

SET @NewList =
(
    SELECT CONCAT('[', STRING_AGG(CONCAT('"', oj.Value, '"'), ','), ']')
    FROM OPENJSON(@JsonList) AS oj
    WHERE oj.Value NOT IN ('2', '4')
);

PRINT @NewList
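With the sample list above, the PRINT output is ["1","3"].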

Json to table without explicit key names

I have a table with a VARCHAR(MAX) column which stores JSON key-value pairs.
Each JSON document is simply a flat set of key-value pairs: a varying number of them, with no nesting and no arrays.
I wish to build a query which gives back the JSON in a tabular format,
which is easy with named elements (see the WITH clause below):
DECLARE @MYJSONTABLE TABLE (ID INT IDENTITY NOT NULL PRIMARY KEY, MYDATA NVARCHAR(max) null)

INSERT INTO @MYJSONTABLE
(
    MYDATA
)
VALUES
(N'{"id": 2, "info": "some info", "age": 25}'),
(N'{"id": 5, "info": "other info", "dob": "2005-11-04T12:00:00"}')

SELECT p.ID, MYDATA.*
FROM @MYJSONTABLE p
CROSS APPLY
    OPENJSON(p.MYDATA)
    WITH (
        id INT 'strict $.id',
        info NVARCHAR(50) '$.info',
        age INT,
        dateOfBirth DATETIME2 '$.dob'
    ) AS MYDATA
While the output is exactly what I want,
my issue with the above solution is that I don't know the key names in the JSON documents, nor how many there are, but I still wish to return them all in the same tabular format.
If I omit the WITH clause above, the query does return all key-value pairs, but the output goes "vertical" and each key in the JSON generates a new row.
Could the above query be modified to be dynamic and return all key-value pairs without explicitly specifying the JSON key names?
Perhaps something like this will work for you.
This uses a CTE to get the DISTINCT keys from your JSON, then string aggregation to build a dynamic statement, which you can inspect via the PRINT statement.
Note that for your original sample data the dob column is not returned, because it falls outside the initial JSON object as posted; if that first stray closing brace (}) is removed, the column appears.
DECLARE @SQL nvarchar(MAX),
        @CRLF nchar(2) = NCHAR(13) + NCHAR(10);
DECLARE @Delimiter nvarchar(50) = N',' + @CRLF + N'          ';

WITH Keys AS(
    SELECT DISTINCT J.[key]
    FROM dbo.YourTable YT
         CROSS APPLY OPENJSON(YT.JsonColumn) J)
SELECT @SQL = N'SELECT YT.ID,' + @CRLF +
              N'       J.*' + @CRLF +
              N'FROM dbo.YourTable YT' + @CRLF +
              N'     CROSS APPLY OPENJSON(YT.JsonColumn)' + @CRLF +
              N'     WITH(' +
              STRING_AGG(QUOTENAME(K.[key]) + N' nvarchar(100)', @Delimiter) + N') J;'
FROM Keys K;

PRINT @SQL;
EXEC sys.sp_executesql @SQL;
Note, this will not work with a table variable, unless you create a table type and then pass the TYPE as a parameter to sys.sp_executesql. This is why the above assumes a real table.
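For illustration only: with a hypothetical dbo.YourTable containing the two sample JSON documents above, the PRINT output would look roughly like this (the key order coming out of DISTINCT is not guaranteed):
SELECT YT.ID,
       J.*
FROM dbo.YourTable YT
     CROSS APPLY OPENJSON(YT.JsonColumn)
     WITH([age] nvarchar(100),
          [dob] nvarchar(100),
          [id] nvarchar(100),
          [info] nvarchar(100)) J;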

SQL server Json with single array

I am storing ids in a comma-separated string.
e.g
1,2,3,4
How can I store this as JSON in the column, while still being able to insert or delete any particular value?
Thanks
Part of the following answer comes from here, so all credits go there: https://stackoverflow.com/a/37844117/2695832
Here's a solution that enables you to store your string values in a JSON array in a table column. However, the "should be able to insert or delete any particular value" part of your question is not totally clear to me.
DECLARE @source VARCHAR(20);
SET @source = '1,2,3,4';

DECLARE @values TABLE
(
    [Id] VARCHAR(20)
);

INSERT INTO @values
(
    [Id]
)
SELECT
    value
FROM STRING_SPLIT(@source, ',')
WHERE RTRIM(value) <> '';

INSERT INTO @values ([Id]) VALUES ('5')
DELETE FROM @values WHERE Id = '2'

SELECT
    JSON_QUERY('[' + STUFF(( SELECT ',' + '"' + Id + '"'
                             FROM @values FOR XML PATH('')), 1, 1, '') + ']') ids
FOR JSON PATH, WITHOUT_ARRAY_WRAPPER
This produces the following JSON object:
{"ids":["1","3","4","5"]}
The code might need some tweaking to completely match your needs, since you're probably not using a table variable and may also want to use a numeric data type for your id values.
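If you do keep the values as a JSON array in the column, appending is straightforward with JSON_MODIFY's append modifier, while removing a specific element still needs an OPENJSON rebuild like the one above. A minimal sketch:
DECLARE @ids NVARCHAR(MAX) = N'["1","3","4"]';
-- append "5" to the end of the array
SET @ids = JSON_MODIFY(@ids, 'append $', '5');
SELECT @ids; -- ["1","3","4","5"]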

How to copy data from one table to another "EXCEPT" one field

How can I INSERT into another table, excluding a specific field?
e.g
TABLE A
ID(auto_inc) CODE NAME
1 001 TEST1
2 002 TEST2
I want to insert CODE and NAME into another table, in this case TABLE B, but not ID because it is auto-increment.
Note: I don't want to use "INSERT INTO TABLE B SELECT CODE, NAME FROM TABLE A", because the actual table has around 50 fields and I don't want to write them one by one.
Thanks for any suggestions and replies.
This can't be done without specifying the columns (excluding the primary key).
This question might help you. Copy data into another table
You can get all the columns using information_schema.columns:
select group_concat(column_name separator ', ')
from information_schema.columns c
where table_name = 'tableA' and
column_name <> 'id';
This gives you the list. Then paste the list into your code. You can also use a prepared statement for this, but that might be overkill.
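The snippet above is MySQL syntax (group_concat); a SQL Server flavoured sketch of the same idea, assuming the table is dbo.TableA and the identity column is id, would be:
-- build the comma-separated column list, skipping the identity column
SELECT STRING_AGG(QUOTENAME(c.name), ', ')
FROM sys.columns c
WHERE c.object_id = OBJECT_ID('dbo.TableA')
  AND c.is_identity = 0;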
Is this a one-time thing?
If yes, copy the whole table across (INSERT INTO the new table with SELECT * from the source),
then ALTER the new table to drop the column you don't need.
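In SQL Server, one way to read that one-time suggestion is a SELECT ... INTO followed by dropping the column; the table and column names here are just placeholders:
-- copy everything into a brand-new table, then drop the unwanted column
SELECT * INTO dbo.TableB FROM dbo.TableA;
ALTER TABLE dbo.TableB DROP COLUMN ID;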
I tried to copy from one table to another that has one extra field.
The source table is TERRITORY_t.
* The principle is to create a temp table identical to the source table, adjust the columns of the temp table, and copy the content of the temp table to the destination table.
This is what I did:
Create a temp table called TERRITORY_temp.
Generate the SQL by running an export:
CREATE TABLE IF NOT EXISTS TERRITORY_temp (
Territory_Id int(11) NOT NULL,
Territory_Name varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (Territory_Id)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
copy over with
INSERT INTO TERRITORY_temp (Territory_Id, Territory_Name) VALUES
(1, 'SouthEast'),
(2, 'SouthWest'),
(3, 'NorthEast'),
(4, 'NorthWest'),
(5, 'Central');
or
INSERT INTO TERRITORY_temp
SELECT * from TERRITORY_t
Add the extra field(s) to the temp table so it matches the new table.
Copy from the temp table to the destination table:
INSERT INTO TERRITORY_new
SELECT * from TERRITORY_temp
Please provide feedback.
Step 1. Create stored procedure
CREATE PROCEDURE CopyDataTable
    @SourceTable varchar(255),
    @TargetTable varchar(255),
    @SourceFilter nvarchar(max) = ''
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @SourceColumns VARCHAR(MAX) = ''
    DECLARE @TargetColumns VARCHAR(MAX) = ''
    DECLARE @Query VARCHAR(MAX) = ''

    -- Build the column list of the source table, skipping identity columns
    SELECT
        @SourceColumns = ISNULL(@SourceColumns + ',', '') + T.COLUMN_NAME
    FROM
    (
        select name as COLUMN_NAME from sys.all_columns
        where object_id = (select object_id from sys.tables where name = @SourceTable)
        and is_identity = 0
    ) T

    -- Build the column list of the target table, skipping identity columns
    SELECT
        @TargetColumns = ISNULL(@TargetColumns + ',', '') + T.COLUMN_NAME
    FROM
    (
        select name as COLUMN_NAME from sys.all_columns
        where object_id = (select object_id from sys.tables where name = @TargetTable)
        and is_identity = 0
    ) T

    -- Assemble the INSERT ... SELECT statement (leading commas stripped) and print it
    set @Query = 'INSERT INTO ' + @TargetTable + ' (' + SUBSTRING(@TargetColumns, 2, 9999) + ') SELECT ' + SUBSTRING(@SourceColumns, 2, 9999) + ' FROM ' + @SourceTable + ' ' + @SourceFilter;

    PRINT @Query
    --EXEC(@Query)
END
GO
GO
Step 2. Run stored procedure
use YourDatabaseName
exec dbo.CopyDataTable 'SourceTable','TargetTable'
Explanations
a) dbo.CopyDataTable will transfer all data from SourceTable to TargetTable, except identity fields
b) You can apply a filter when calling the stored procedure, in order to transfer only the rows that match a criterion:
exec dbo.CopyDataTable 'SourceTable','TargetTable', 'WHERE FieldName=3'
exec dbo.CopyDataTable 'SourceTable','TargetTable', 'WHERE FieldName=''TextValue'''
c) Remove the -- from --EXEC(@Query) when you have finished testing

How to check for a character in a string and replace that character before insert

Ok, this question involves one part of a complicated stored procedure which inserts new entities into several tables.
The part that I'm currently having difficulty with needs to work like so:
insert entity with original name
check if name of new entity contains any special characters listed in table A 'Characters'
if yes, then replace that character with a 'replacement character' from table A
EDIT: I've gotten this to partially work, but it's still not finished. I'm still having a problem generating each combination of character replacements. Also, when a replacement character occurs more than once, such as the '.', the substitutions need to happen independently of one another.
ex: #www.test&aol.com -> #wwwtest&aol.com, #www.test&aolcom
Here's a rough start; I know parts of this aren't going to work, but I thought it was a decent starting point:
declare @test varchar(50)
set @test = '#www.test&aol.com'
declare @len int, @ctr int
set @len = LEN(@test)
set @ctr = 1
declare @newName varchar(50)
declare @matchedChar table(match varchar(10), replaceChar varchar(10), processed int default(0))
declare @alternateEntities table(name varchar(50))
declare @repChar varchar(10)
declare @selectedChar varchar(1)

while @ctr <= @len
begin
    --Insert matching characters and replacement characters into the table variable;
    --this is necessary for the # character, which has multiple replacement characters
    insert into @matchedChar (match, replaceChar)
    select Character, ReplacementCharacter
    from tblTransliterations
    where Character = SUBSTRING(@test, @ctr, 1)
    --loop
    while (select COUNT(*) from @matchedChar where processed = 0) > 0
    begin
        --get the top character from the table variable
        set @selectedChar = (select top 1 match from @matchedChar where processed = 0)
        --get the replacement character
        set @repChar = (select top 1 replaceChar from @matchedChar where processed = 0)
        --replace the character in the name string
        --set @newName = (select Replace(@test, @selectedChar, @repChar))
        set @newName = (select STUFF(@test, CHARINDEX(@selectedChar, @test), 1, @repChar))
        --update the table variable to move on to the next character
        update @matchedChar set processed = 1 where @repChar = replaceChar
        --add the name with the replaced character to the alternate entities table
        insert into @alternateEntities (name) values (@newName)
    end
    set @ctr = @ctr + 1
    set @len = LEN(@test)
end

select * from @alternateEntities
Instead of looping, use a set-based approach:
Create a temp table with a column 'Words' of type NVARCHAR(100); call the temp table Invalid_Words.
Create a column on Invalid_Words for each token and make the column type bit.
Update the temp table's bit columns, through a series of UPDATE statements, wherever a word contains the token.
You have now defined which tokens were matched for each word (a sketch of these flagging steps follows).
The next part is to replace.
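A minimal sketch of the flagging steps, with hypothetical table, column, and token names (the replacement step would then operate only on the flagged rows):
-- hypothetical temp table and token columns; a sketch of the set-based flagging
CREATE TABLE #Invalid_Words
(
    Words  NVARCHAR(100),
    HasDot BIT NOT NULL DEFAULT 0,
    HasAmp BIT NOT NULL DEFAULT 0
);

INSERT INTO #Invalid_Words (Words)
VALUES (N'#www.test&aol.com'), (N'plainname');

-- one UPDATE per token: flag every word containing that token
UPDATE #Invalid_Words SET HasDot = 1 WHERE CHARINDEX('.', Words) > 0;
UPDATE #Invalid_Words SET HasAmp = 1 WHERE CHARINDEX('&', Words) > 0;

-- the bit columns now show which tokens matched each word;
-- the replacement step works from these flags
SELECT * FROM #Invalid_Words;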