Minimize MySQL JSON output

MySQL has a JSON_PRETTY() function to print human-readable JSON. I'm looking for the opposite: minimizing JSON columns by removing unnecessary whitespace, but I haven't been able to find anything that does that. Is it possible to accomplish this with some combination of MySQL functions? I need to do this in SQL, not in application code.
INSERT INTO MY_TABLE(jsonPara) VALUES ('{"lamp": "hello world", "chair": "5"}');
select jsonPara, json_pretty(jsonPara) from MY_TABLE;
+-----------------------------------------+-------------------------------------------------+
|jsonPara |json_pretty(jsonPara) |
+-----------------------------------------+-------------------------------------------------+
|{"lamp": "hello world", "chair": "5"} |{ |
| |"lamp": "hello world", |
| |"chair": "5" |
| |} |
+-----------------------------------------+-------------------------------------------------+
I would like a result like {"lamp":"hello world","chair":"5"} (no spaces/new lines between keys and values and key/value pairs, array elements, etc.)

The best I can suggest is CAST(<expr> AS JSON). This reduces whitespace to a "normal" amount: one space after : and ,.
Here's your example document:
mysql> set @j = '{"lamp": "hello world", "chair": "5"}';
You know JSON_PRETTY() adds newlines and indentation:
mysql> select json_pretty(@j) as j;
+---------------------------------------------+
| j |
+---------------------------------------------+
| {
"lamp": "hello world",
"chair": "5"
} |
+---------------------------------------------+
Casting that expression back to JSON removes the extra whitespace:
mysql> select cast(json_pretty(@j) as json) as j;
+---------------------------------------+
| j |
+---------------------------------------+
| {"lamp": "hello world", "chair": "5"} |
+---------------------------------------+

Related

SQL json_extract returns null

I am attempting to extract from my json object
hits = [{“title”: “Facebook”,
“domain”: “facebook.com”},
{“title”: “Linkedin”,
“domain”: “linkedin.com”}]
When I use:
json_extract(hits,'$.title') as title,
nothing is returned. I would like the result to be: [Facebook, Linkedin].
However, when I extract by a scalar value, ex.:
json_extract_scalar(hits,'$[0].title') as title,
it works and Facebook is returned.
hits contains a lot of values, so I need to use json_extract in order to get all of them, so I can't do each scalar individually. Any suggestions to fix this would be greatly appreciated.
With $.**.title (double stars) I get INVALID_FUNCTION_ARGUMENT: Invalid JSON path: '$.**.title'. When I try unnest, I get INVALID_FUNCTION_ARGUMENT: Cannot unnest type: varchar and INVALID_FUNCTION_ARGUMENT: Cannot unnest type: json. When I try double quotes, I get SYNTAX_ERROR: line 26:19: Column '$.title' cannot be resolved.
The correct JSON path to extract all titles is $.[*].title (or $.*.title), though it is not supported by Athena. One option is to cast your JSON to an array of JSON and use transform on it:
WITH dataset AS (
SELECT * FROM (VALUES
(JSON '[{"title": "Facebook",
"domain": "facebook.com"},
{"title": "Linkedin",
"domain": "linkedin.com"}]')
) AS t (json_string))
SELECT transform(cast(json_string as ARRAY(JSON)), js -> json_extract_scalar(js, '$.title'))
FROM dataset
Output:
_col0
[Facebook, Linkedin]
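To make the logic of the transform call easier to follow, here is a rough Python sketch of the same idea: parse the array, then pull the title out of each element. It only illustrates the logic and is not Athena code; the function name extract_titles is made up for this example.

```python
import json


def extract_titles(hits_json: str) -> list:
    """Return the "title" of every object in a JSON array (None if absent)."""
    # Parsing into a list plays the role of CAST(... AS ARRAY(JSON)).
    elements = json.loads(hits_json)
    # The comprehension plays the role of transform(..., js -> ...).
    return [element.get("title") for element in elements]


hits = (
    '[{"title": "Facebook", "domain": "facebook.com"},'
    ' {"title": "Linkedin", "domain": "linkedin.com"}]'
)
print(extract_titles(hits))
```

The point is that a path such as $.title is applied to one element at a time, never to the array as a whole.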
First, you have an array, so $.title doesn't exist; see below.
Second, your JSON is not valid: it must use straight double quotes ("), as the example shows.
SET @a := '[{
"title": "Facebook",
"domain": "facebook.com"
},
{
"title": "Linkedin",
"domain": "linkedin.com"
}
]'
SELECT json_extract(@a,'$[0]') as title
| title |
| :---------------------------------------------- |
| {"title": "Facebook", "domain": "facebook.com"} |
SELECT JSON_EXTRACT(@a, "$[0].title") AS 'from'
| from |
| :--------- |
| "Facebook" |
SELECT @a
| @a |
| :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [{<br> "title": "Facebook",<br> "domain": "facebook.com"<br> },<br> {<br><br> "title": "Linkedin",<br> "domain": "linkedin.com"<br> }<br>] |

Is there an opposite of MySQL's JSON_ARRAY_APPEND?

I used UPDATE table SET col = JSON_ARRAY_APPEND(col, '$', 'BAZ') to add a value to my json column:
Before: ["FOO", "BAR"]
After: ["FOO", "BAR", "BAZ"]
How can I now remove the value, i.e. perform the reverse of JSON_ARRAY_APPEND? I've tried the following but it doesn't seem to pick up the value.
UPDATE table SET col = JSON_REMOVE(col, '$.BAZ')
You can remove array elements by position, not by value.
select json_remove('["FOO", "BAR", "BAZ"]', '$[2]') as array;
+----------------+
| array |
+----------------+
| ["FOO", "BAR"] |
+----------------+
But you can find the position with JSON_SEARCH():
select json_search('["FOO", "BAR", "BAZ"]', 'one', 'BAZ') as path;
+--------+
| path |
+--------+
| "$[2]" |
+--------+
You can see it strangely puts JSON double-quotes around that path. So you have to unquote it:
select json_unquote(json_search('["FOO", "BAR", "BAZ"]', 'one', 'BAZ')) as path;
+------+
| path |
+------+
| $[2] |
+------+
Then put it all together:
select json_remove('["FOO", "BAR", "BAZ"]', json_unquote(json_search('["FOO", "BAR", "BAZ"]', 'one', 'BAZ'))) as array;
+----------------+
| array |
+----------------+
| ["FOO", "BAR"] |
+----------------+
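As a rough illustration of the search-then-remove logic above, here is a minimal Python sketch. It is not MySQL code, and the function name remove_first is made up for this example. Note the guard for a missing value: in MySQL, JSON_SEARCH() returns NULL when nothing matches, and JSON_REMOVE() returns NULL if any argument is NULL, so an UPDATE built this way would probably need a WHERE clause to skip rows that don't contain the value.

```python
import json


def remove_first(array_json: str, value) -> str:
    """Remove the first occurrence of value from a JSON array, if present."""
    items = json.loads(array_json)
    if value in items:
        position = items.index(value)  # like JSON_SEARCH(..., 'one', value)
        del items[position]            # like JSON_REMOVE(..., '$[position]')
    return json.dumps(items)


print(remove_first('["FOO", "BAR", "BAZ"]', "BAZ"))  # ["FOO", "BAR"]
```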
This would be a lot easier if you didn't use JSON arrays. Instead of using an array, put multi-valued attributes in a child table, with one value per row. Then you can delete using traditional SQL:
DELETE FROM child_table WHERE somvalue = 'BAZ';
I have answered a bunch of questions about using JSON in MySQL here on Stack Overflow, and I have yet to see an instance where using JSON is easier than using normalized tables.

Parsing JSON data from SQL Server table column

I am trying to parse JSON data from a table in SQL Server 2017. I have a view that returns this data:
| Debrief Name | Version | Answer Question | Answer Options |
+-------------------+-----------+--------------------------+--------------------------------------------------------------------------------------------------------------------------+
| Observer Report | 7 | Division: | {"Options":[{"Display":"Domestic","Value":"Domestic"},{"Display":"International","Value":"International"}]} |
| Observer Report | 7 | Are you on reserve? | {"Options":[{"Display":"Yes - Long Call Line","Value":"Yes"},{"Display":"No","Value":"No"}]} |
| Observer Report | 11 | Crew Position: | {"Options":[{"Display":"CA","Value":"CA"},{"Display":"RC","Value":"RC"},{"Display":"FO","Value":"FO"}]} |
| Observer Report | 11 | Domicile: | {"VisibleLines":2,"Options":[{"Display":"BOS","Value":"BOS"},{"Display":"CLT","Value":"CLT"}]} |
| Training Debrief | 12 | TRAINING CREW POSITION | {"VisibleLines":2,"Options":[{"Display":"CA","Value":"CA"},{"Display":"FO","Value":"FO"}]} |
| Training Debrief | 12 | AIRCRAFT | {"VisibleLines":2,"Options":[{"Display":"777","Value":"777"},{"Display":"767","Value":"767"}]} |
| Security Debrief | 9 | Aircraft Type | {"Options":[{"Display":"MD-80","Value":"MD-80"},{"Display":"777","Value":"777"},{"Display":"767/757","Value":"767/757"}]}|
| News Digest | 2 | Do you read Digest? | {"Options":[{"Display":"Yes","Value":"Yes"},{"Display":"No","Value":"No"}]} |
The view can contain multiple rows for the same Debrief Name and Version, and each debrief has multiple versions. For each Debrief Name and Version combination there is a set of Answer Questions with related Answer Options. The Answer Options column contains a JSON document that I need to parse.
My initial query is something like this:
SELECT *
FROM [dbo].<MY VIEW>
WHERE [Debrief Name] = 'Observer Report' AND Version = 11
which would return the data below:
| Debrief Name | Version | Answer Question | Answer Options |
+---------------------+--------------+-----------------------+-----------------------------------------------------------------------------------------------------------------+
| Observer Report | 11 | Crew Position: | {"Options":[{"Display":"CA","Value":"CA"},{"Display":"RC","Value":"RC"}]} |
| Observer Report | 11 | Domicile: | {"VisibleLines":2,"Options":[{"Display":"BOS","Value":"BOS"},{"Display":"CLT","Value":"CLT"}]} |
| Observer Report | 11 | Fleet: | {"Options":[{"Display":"330","Value":"330"},{"Display":"320","Value":"320"}]} |
| Observer Report | 11 | Division: | {"Options":[{"Display":"Domestic","Value":"Domestic"},{"Display":"International","Value":"International"}]} |
| Observer Report | 11 | Are you on reserve? | {"Options":[{"Display":"Yes - Long Call Line","Value":"Yes - Long Call Line"},{"Display":"No","Value":"No"}]} |
Now, from this result, for each Answer Question I need to parse the related Answer Options JSON and extract the Value field of every option. For example, the JSON string in Answer Options for the question "Are you on reserve?" looks like this:
"Options":[
{
"Display":"330",
"Value":"330",
"Selected":false
},
{
"Display":"320",
"Value":"320",
"Selected":false
},
{
"Display":"S80",
"Value":"S80",
"Selected":false
}
]
So I need to extract the "Value" fields and return something like an array with the values {330, 320, S80}.
In conclusion I want to construct a query where when I provide the Debrief Name and VersionNumber, it returns me the Answer Question and all the Answer Option values.
I am thinking of using a stored procedure like below:
CREATE PROCEDURE myProc
@DebriefName NVARCHAR(255),
@Version INT
AS
SELECT *
FROM [dbo].[myView]
WHERE [Debrief Name] = @DebriefName
AND Version = @Version
GO
And then have another stored procedure that will capture this result from myProc and then do the JSON parsing:
CREATE PROCEDURE parseJSON
@DebriefName NVARCHAR(255),
@Version INT
AS
EXEC myProc @DebriefName, @Version; -- Need to capture the result data in a temp table or something
-- Parse the JSON data for each question item in temp table
GO
I am not an expert in SQL, so I'm not sure how to do this. I read about JSON parsing in SQL here and feel like I can use that, but I'm not sure how to in my context.
If you want to parse the JSON data in the Answer Options column and extract the Value field, you can try the following approach, using OPENJSON() and STRING_AGG():
DECLARE @json nvarchar(max)
SET @json = N'{
"Options": [
{
"Display": "330",
"Value": "330",
"Selected": false
},
{
"Display": "320",
"Value": "320",
"Selected": false
},
{
"Display": "195",
"Value": "195",
"Selected": false
}
]
}'
SELECT STRING_AGG(x.[value], ', ') AS [Values]
FROM OPENJSON(@json, '$.Options') j
CROSS APPLY (SELECT * FROM OPENJSON(j.[value])) x
WHERE x.[key] = 'Value'
Output:
Values
330, 320, 195
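For readers less familiar with OPENJSON(), the following Python sketch shows roughly what the query above computes: walk the Options array, keep each Value, and join the results with ", ". It is only an illustration of the logic, not T-SQL, and the function name option_values is made up for this example.

```python
import json


def option_values(answer_options: str) -> str:
    """Join the "Value" of every entry under "Options" with ', '."""
    document = json.loads(answer_options)
    values = [
        str(option["Value"])                    # like x.[value]
        for option in document.get("Options", [])  # like OPENJSON(..., '$.Options')
        if "Value" in option                    # like WHERE x.[key] = 'Value'
    ]
    return ", ".join(values)                    # like STRING_AGG(..., ', ')


sample = (
    '{"Options":['
    '{"Display":"330","Value":"330","Selected":false},'
    '{"Display":"320","Value":"320","Selected":false},'
    '{"Display":"195","Value":"195","Selected":false}]}'
)
print(option_values(sample))  # 330, 320, 195
```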
If you want to build your statement using a stored procedure, use this approach:
CREATE TABLE myTable (
DebriefName nvarchar(100),
Version int,
AnswerQuestion nvarchar(1000),
AnswerOptions nvarchar(max)
)
INSERT INTO myTable
(DebriefName, Version, AnswerQuestion, AnswerOptions)
VALUES
(N'Observer Report', 7, N'Division:' , N'{"Options":[{"Display":"Domestic","Value":"Domestic"},{"Display":"International","Value":"International"}]}'),
(N'Observer Report', 7, N'Are you on reserve?' , N'{"Options":[{"Display":"Yes - Long Call Line","Value":"Yes"},{"Display":"No","Value":"No"}]}'),
(N'Observer Report', 11, N'Crew Position:' , N'{"Options":[{"Display":"CA","Value":"CA"},{"Display":"RC","Value":"RC"},{"Display":"FO","Value":"FO"}]}'),
(N'Observer Report', 11, N'Domicile:' , N'{"VisibleLines":2,"Options":[{"Display":"BOS","Value":"BOS"},{"Display":"CLT","Value":"CLT"}]}'),
(N'Training Debrief', 12, N'TRAINING CREW POSITION', N'{"VisibleLines":2,"Options":[{"Display":"CA","Value":"CA"},{"Display":"FO","Value":"FO"}]}'),
(N'Training Debrief', 12, N'AIRCRAFT' , N'{"VisibleLines":2,"Options":[{"Display":"777","Value":"777"},{"Display":"767","Value":"767"}]}'),
(N'Security Debrief', 9, N'Aircraft Type' , N'{"Options":[{"Display":"MD-80","Value":"MD-80"},{"Display":"777","Value":"777"},{"Display":"767/757","Value":"767/757"}]}'),
(N'News Digest', 2, N'Do you read Digest?' , N'{"Options":[{"Display":"Yes","Value":"Yes"},{"Display":"No","Value":"No"}]}')
SELECT
t.AnswerQuestion,
STRING_AGG(x.[value], ', ') AS [Values]
FROM myTable t
CROSS APPLY (SELECT * FROM OPENJSON(t.AnswerOptions, '$.Options')) j
CROSS APPLY (SELECT * FROM OPENJSON(j.[value])) x
WHERE
DebriefName = N'Observer Report' AND
t.Version = 11 AND
x.[key] = 'Value'
GROUP BY
t.DebriefName,
t.Version,
t.AnswerQuestion
Output:
AnswerQuestion Values
Crew Position: CA, RC, FO
Domicile: BOS, CLT

Pad JSON array with JQ to obtain rectangular result

I have json that looks like this (jq play in the link), and I want to build csv in the end looking like this (reproducible sample at the bottom).
"SO302993",items1,item2,item3.1,item3.2,item3.3, item3.4,...
"SO302994",items1,item2,item3.1,item3.2, , ,...
"SO302995",items1,item2,item3.1,item3.2,item3.3, ,...
item3 elements are in an array and my current solution:
.[] | [.number, .item1, .item2, .item3[]?]
gives me this:
"SO302993",items1,item2,item3.1,item3.2,item3.3, item3.4,...
"SO302994",items1,item2,item3.1,item3.2,...
"SO302995",items1,item2,item3.1,item3.2,item3.3,...
which will create an uneven number of columns in the csv.
I tried adding .item3[:]? in a Python flavor-style, but it didn't work.
Any help would be much appreciated! And if I wasn't clear do ask to clarify! My snippet and toy data are in the link above.
{
"items": [
{
"name": "Mr Simon Mackin",
"country_of_residence": "Scotland",
"natures_of_control": [
"voting-rights-25-to-50-percent-limited-liability-partnership",
"significant-influence-or-control-limited-liability-partnership"
],
"premises": "4"
}
]
}
{
"items": [
{
"name": "Mrs Simonne Mackinni",
"country_of_residence": "France",
"natures_of_control": [
"significant-influence-or-control-limited-liability-partnership"
],
"premises": "4"
}
]
}
with this query:
.items[] | [.name, .country_of_residence, .natures_of_control[]?, .premises] | @csv
I get this result:
"Mr Simon Mackin","Scotland","voting-rights","significant-influence","4"
"Mrs Simonne Mackinni","France","significant-influence","4"
But I'd like to get this (the second line has an extra comma after "significant-influence"):
"Mr Simon Mackin","Scotland","voting-rights","significant-influence","4"
"Mrs Simonne Mackinni","France","significant-influence",,"4"
Since you want a rectangular result, you will have to "pad" the "natures_of_control" array. Based on the sample input, you will need to "slurp" the input in order to obtain a global maximum.
To pad the array, you could use the helper function:
# emit a stream of exactly $n items
def pad($n): range(0;$n) as $i | .[$i];
The solution to the problem as posted on jqplay then becomes:
([.[] | .items[] | .natures_of_control | length] | max) as $mx
| .[]
| (.active_count) as $active_count
| (.ceased_count) as $ceased_count
| (.links.self | split("/")[2]) as $companyCode
| .items[]
| [$companyCode, $active_count, $ceased_count, .name, .country_of_residence, .nationality, .notified_on, (.natures_of_control | pad($mx))]
| @csv
Invocation
The appropriate invocation would look like this:
jq -sr -f program.jq input.json
Handling missing data
To ignore objects that have no "items" you could tweak the above, e.g. as follows:
([.[] | .items[]? | .natures_of_control | length] | max) as $mx
| .[]
| select(.items)
| (.active_count) as $active_count
| (.ceased_count) as $ceased_count
| (.links.self | split("/")[2]) as $companyCode
| .items[]
| [$companyCode, $active_count, $ceased_count, .name, .country_of_residence, .nationality, .notified_on, (.natures_of_control | pad($mx))]
| @csv
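The padding idea can also be sketched outside jq. The Python below is a minimal illustration under the assumption that all input documents are already loaded into a list (the equivalent of slurping with -s); the names pad and to_rectangular_csv are made up for this example, and it uses only the fields from the small sample in the question. Unlike jq's @csv, Python's csv module quotes a field only when necessary.

```python
import csv
import io


def pad(values, n):
    """Return exactly n items: values truncated or right-padded with None."""
    return [values[i] if i < len(values) else None for i in range(n)]


def to_rectangular_csv(docs):
    """Write one CSV row per item, padding natures_of_control to a global width."""
    items = [item for doc in docs for item in doc.get("items", [])]
    # The global maximum is why every document must be read before any row is written.
    width = max((len(i.get("natures_of_control", [])) for i in items), default=0)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for i in items:
        writer.writerow(
            [i.get("name"), i.get("country_of_residence")]
            + pad(i.get("natures_of_control", []), width)  # None becomes an empty field
            + [i.get("premises")]
        )
    return out.getvalue()


docs = [
    {"items": [{"name": "A", "country_of_residence": "Scotland",
                "natures_of_control": ["voting-rights", "significant-influence"],
                "premises": "4"}]},
    {"items": [{"name": "B", "country_of_residence": "France",
                "natures_of_control": ["significant-influence"],
                "premises": "4"}]},
]
print(to_rectangular_csv(docs), end="")
```

Documents without an "items" key simply contribute no rows, which corresponds to the select(.items) variant above.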

mysql - avoid escaping double quotes in json functions

When I issue...
select JSON_REPLACE('{"tbl" : "cnf"}', '$', '{"tbl":"cnf4"}');
I get the following :
+--------------------------------------------------------+
| JSON_REPLACE('{"tbl" : "cnf"}', '$', '{"tbl":"cnf4"}') |
+--------------------------------------------------------+
| "{\"tbl\":\"cnf4\"}" |
+--------------------------------------------------------+
And it gets stored in my database the same way, with backslashes. I want to have no backslashes in my database. How can I achieve that?
I expect a response like:
{"tbl":"cnf4"}
Wrap in JSON_UNQUOTE
select JSON_UNQUOTE(JSON_REPLACE('{"tbl" : "cnf"}', '$', '{"tbl":"cnf4"}'));
+----------------------------------------------------------------------+
| JSON_UNQUOTE(JSON_REPLACE('{"tbl" : "cnf"}', '$', '{"tbl":"cnf4"}')) |
+----------------------------------------------------------------------+
| {"tbl":"cnf4"} |
+----------------------------------------------------------------------+
1 row in set (0.0005 sec)
This helped me avoid the escaping when I pushed a new object onto an existing array:
json_array_append(data, '$', cast(? as json))
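The backslashes appear because the replacement value is passed as a plain string, so it is stored as a JSON string that happens to contain JSON text, rather than as a JSON object. The Python sketch below illustrates that distinction only; it is not MySQL code, and the variable names are made up for this example.

```python
import json

replacement = '{"tbl":"cnf4"}'

# Serializing the text itself yields a JSON string scalar with escaped quotes,
# which is what happens when the value is passed as a plain string.
as_string = json.dumps(replacement)

# Parsing the text first yields an object, which is what cast(? as json) achieves.
as_object = json.dumps(json.loads(replacement), separators=(",", ":"))

print(as_string)  # "{\"tbl\":\"cnf4\"}"
print(as_object)  # {"tbl":"cnf4"}
```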