I have a table with the following schema in a SQL Azure DB (2019 compat level):
CREATE TABLE dbo.Properties
(
PropertyId int,
PropertyName nvarchar(100),
PropertyValue nvarchar(1000)
)
I'd like to take the data within this table and turn it into JSON, using the value in the PropertyName column as the name of the JSON property and the value in the PropertyValue column as the JSON property value.
EDIT 12/10/2021:
Importantly, the values within the PropertyName column will not be predictable ahead of time.
For example, consider this data in the table (3 rows):
1, "Color", "Blue"
1, "Name", "John"
1, "Cost", 5
The above would be turned into the following JSON:
{"Color":"Blue", "Name":"John", "Cost":5}
I'm obviously able to do this with a STRING_AGG function like the following:
SELECT '{' + STRING_AGG('"' + p.PropertyName + '":"' + p.PropertyValue + '"', ',')
             WITHIN GROUP (ORDER BY p.PropertyName) + '}' AS MyJson
FROM dbo.Properties p
GROUP BY p.PropertyId
But I was hoping to use one of the built-in JSON functions rather than hack together a big string.
FOR JSON AUTO works from the column names, so one method to get your desired result would be to PIVOT the property names into columns. For example:
SELECT Color, [Name], Cost
FROM dbo.Properties
PIVOT ( MAX( PropertyValue ) For PropertyName In ( [Color], [Name], Cost ) ) pvt
FOR JSON AUTO;
My results: [{"Color":"Blue","Name":"John","Cost":"5"}]
Of course, this is only convenient if your JSON attributes / column names are always known, and this is a simple example. For more complex cases you are probably looking at a dynamic pivot or dynamic SQL, and your STRING_AGG example isn't so bad.
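For what it's worth, here is a rough dynamic-SQL sketch of that idea (untested; the variable names are mine): build the pivot column list from the distinct PropertyName values, then run the same PIVOT / FOR JSON AUTO query dynamically.
DECLARE @cols nvarchar(max), @sql nvarchar(max);

-- Build a bracketed column list such as [Color], [Cost], [Name]
SELECT @cols = STRING_AGG(QUOTENAME(PropertyName), ', ')
FROM (SELECT DISTINCT PropertyName FROM dbo.Properties) AS p;

-- Assemble and execute the PIVOT ... FOR JSON AUTO query dynamically
SET @sql = N'SELECT ' + @cols + N'
FROM dbo.Properties
PIVOT ( MAX( PropertyValue ) FOR PropertyName IN ( ' + @cols + N' ) ) pvt
FOR JSON AUTO;';

EXEC sys.sp_executesql @sql;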
I have a table which contains a column "owners", which has json data in it like this:
[
{
"first":"bob",
"last":"boblast"
},
{
"first":"mary",
"last": "marylast"
}
]
I would like to write a query that, for each row containing data like this, returns a column with all of the first names concatenated with commas.
i.e.
id owners
----------------------------
1 bob,mary
2 frank,tom
Not on mysql 8.0 yet.
You can get the values as a JSON array:
SELECT JSON_EXTRACT(owners, '$[*].first') AS owners ...
But that returns in JSON array format:
+-----------------+
| owners |
+-----------------+
| ["bob", "mary"] |
+-----------------+
JSON_UNQUOTE() won't take the brackets and double-quotes out of that. You'd have to use REPLACE() as I show in a recent answer here:
MYSQL JSON search returns results in square brackets
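A minimal sketch of that REPLACE() idea (assuming the table is called mytable and the first names contain no commas, brackets, or double quotes) might look like:
SELECT id,
       REPLACE(REPLACE(REPLACE(REPLACE(
           JSON_EXTRACT(owners, '$[*].first'),
           '", "', ','), '[', ''), ']', ''), '"', '') AS owners
FROM mytable;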
You should think about not storing data in JSON format if it doesn't support the way you need to query them.
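If restructuring is an option, a minimal sketch of what that could look like (the owner table and column names here are made up): store each owner as its own row, and the comma-separated list then comes from a plain GROUP_CONCAT.
CREATE TABLE owner (
  parent_id INT NOT NULL,          -- the id of the row that currently holds the JSON
  first     VARCHAR(100) NOT NULL,
  last      VARCHAR(100) NOT NULL
);

SELECT parent_id AS id, GROUP_CONCAT(first) AS owners
FROM owner
GROUP BY parent_id;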
Here is another option: build a helper table with running numbers up to the maximum JSON array length, extract the values by individual index, and then GROUP_CONCAT them, something like this:
SELECT g.id, GROUP_CONCAT(g.name)
FROM (
  SELECT a.id,
         JSON_UNQUOTE(JSON_EXTRACT(a.owners, CONCAT('$[', n.idx, '].first'))) AS name
  FROM running_numbers n
  CROSS JOIN mytable a
) g
GROUP BY g.id
https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=d7453c9edf89f79ca4ab2f63578b320c
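The running_numbers helper table is not shown above; a minimal sketch of it (the name and idx column are taken from the query, the row count is an assumption that just has to cover the longest owners array) could be:
CREATE TABLE running_numbers (idx INT PRIMARY KEY);
INSERT INTO running_numbers (idx) VALUES (0), (1), (2), (3), (4);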
I have a db table containing JSON formatted strings:
CREATE TABLE `template` (
`Id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`TemplateData` longtext NOT NULL,
PRIMARY KEY (`Id`)
);
INSERT INTO template (Id, TemplateData) VALUES
(1, '[]'),
(2, '[{"type":"template","id":1}]'),
(3, '[{"type":"other", "id":1}]'),
(4, '[{"type":"template","id":3},{"type":"template","id":1}]'),
(5, '[{"type":"template","id":2}]');
http://sqlfiddle.com/#!9/739f3a
For background: these records are templates for a frontend to build dynamic views. Every template is able to include another template. So, based on the above data, record #2 is a template that uses template #1 inside it. Think of them as reusable parts.
Inside the JSON I have an array containing multiple types of objects. In my example there are two different variants: {type: "template", id: number} and {"type": "other", "id": number}.
Server-Architecture
Production:
MySQL Server Version 8.0.21.
Development:
MariaDB Server Version 10.4.11
What I want to retrieve with SELECT
I need a list of all templates that use a specific other template. I want to select all records that contain an object with $[*].type='template' AND $[*].id=1.
Based on the given records, I want to retrieve rows #2 and #4, because both contain an object matching both arguments. The complication is with #4, where the matching object is at array index 1.
I don't want #1 because there is no element inside the array
I don't want #3 because $[0].type is not template
What I already tried
I made some attempts using JSON_SEARCH() and JSON_EXTRACT(), but could not manage to get my expected rows:
SELECT
Id,
JSON_EXTRACT(TemplateData,
JSON_UNQUOTE(
REPLACE(JSON_SEARCH(TemplateData,
'all',
'template'),
'.type"',
'.id"'))) AS includedTemplateId
FROM template
HAVING includedTemplateId = 1
This returns only the record with Id 2 but not the record with Id 4, because JSON_SEARCH with 'all' delivers an array of paths, and JSON_EXTRACT does not allow the path to be an array.
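To make that concrete, a small illustration of the behaviour described (the expected result is shown as a comment):
SELECT JSON_SEARCH(TemplateData, 'all', 'template') AS paths
FROM template
WHERE Id = 4;
-- -> ["$[0].type", "$[1].type"]   (an array of paths, not a single path)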
What is not possible
I also tried using a simple LIKE expression, but ran into the problem that if the order of the object's keys differs (e.g. {id: number, type: "template"}), or a space or different quotes are used, the LIKE does not match.
Additional goal
It would be the perfect result if I also got record #5 for a search for template id #1, because #5 uses #2, which uses #1. But this would be next level.
The solution for MySQL 8.0.21:
SELECT template.id
FROM template
CROSS JOIN JSON_TABLE( template.TemplateData,
"$[*]" COLUMNS( type VARCHAR(254) PATH "$.type",
id INT PATH "$.id" )
) AS jsontable
WHERE jsontable.type = 'template'
AND jsontable.id = 1;
fiddle
If a template object may be duplicated within a single value, then add DISTINCT.
Any suggestions regarding MariaDB?
Draft solution applicable to MariaDB.
WITH RECURSIVE
cte1 AS ( SELECT MAX(LENGTH(TemplateData) - LENGTH(REPLACE(TemplateData, '{', ''))) max_obj_count
FROM template ),
cte2 AS ( SELECT 1 num
UNION ALL
SELECT num + 1
FROM cte2
WHERE num < ( SELECT max_obj_count
FROM cte1 ) )
SELECT DISTINCT
template.id
FROM template
CROSS JOIN cte2
WHERE LOCATE('"type":"template"', SUBSTRING_INDEX(SUBSTRING_INDEX(template.TemplateData, '}', cte2.num), '{', -1))
  AND LOCATE('"id":1', SUBSTRING_INDEX(SUBSTRING_INDEX(template.TemplateData, '}', cte2.num), '{', -1))
The problem: this code searches for the '"type":"template"' and '"id":1' substrings literally, i.e. it will not find rows where the value is written as, for example, '"type" : "template"' (extra space chars) or '"id":"1"' (the value is quoted).
If you want to eliminate this problem, then you must pull the SUBSTRING_INDEX(SUBSTRING_INDEX(template.TemplateData, '}', cte2.num), '{', -1) expression into one more CTE, clear it of all []{} chars, then wrap it with {} and process this value in the WHERE clause as a JSON object.
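A minimal sketch of that refinement (not tested against the fiddle; the objs CTE and the doc column name are mine, and it assumes the array objects contain no nested braces): each {...} fragment is wrapped back into an object, filtered with JSON_VALID, and compared with JSON_VALUE.
WITH RECURSIVE
cte1 AS ( SELECT MAX(LENGTH(TemplateData) - LENGTH(REPLACE(TemplateData, '{', ''))) max_obj_count
          FROM template ),
cte2 AS ( SELECT 1 num
          UNION ALL
          SELECT num + 1
          FROM cte2
          WHERE num < ( SELECT max_obj_count FROM cte1 ) ),
objs AS ( SELECT template.Id,
                 -- one candidate '{...}' fragment per (row, num) pair
                 CONCAT('{', SUBSTRING_INDEX(SUBSTRING_INDEX(template.TemplateData, '}', cte2.num), '{', -1), '}') doc
          FROM template
          CROSS JOIN cte2 )
SELECT DISTINCT Id
FROM objs
WHERE JSON_VALID(doc)
  AND JSON_VALUE(doc, '$.type') = 'template'
  AND JSON_VALUE(doc, '$.id') = 1;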
Solution for MySQL 5.7 (and I think MariaDB too):
select tp.id,tp.TemplateData
from template tp
where json_contains( tp.TemplateData ,json_object('type','template','id',1))
;
(This is an extension to this question, but my reputation is too low to comment or ask more questions on that topic...)
We work on BigQuery, hence we are limited in importing packages or using other languages. And, as per the link above, JS is a solution, but not what I'm looking for here; I implemented it in JS and it was too slow for our needs.
Suppose one of our columns is a string that look like this (array of json):
[{"location":[22.99902,66.000],"t":1},{"location":[55.32168,140.556],"t":2},{"location":[85.0002,20.0055],"t":3}]
I want to extract from the column the JSON element for which "t":2
Where:
some rows don't have any element with "t":2
some rows have several elements with "t":2
the number of JSON elements in each string can change
the element with "t":2 is not always in second position.
I don't know regexp well enough for this. We tried REGEXP_EXTRACT with this pattern: r'(\{.*?\"t\":2.*?\})', but that doesn't work: it extracts everything that precedes "t":2, including the JSON for "t":2. We only want the JSON of the element with "t":2.
Could you advise a regexp pattern that would work?
EDIT:
I have a preference for a solution that gives me 1 match. Suppose I have this string:
[{"location":[22.99902,66.000],"t":1},{"location":[55.32168,140.556],"t":2},{"location":[55.33,141.785],"t":2}],
I would prefer receiving only 1 answer, the first one.
In that case perhaps regexp is less appropriate, but I'm really not sure?
How about this:
(?<=\{)(?=.*?\"t\"\s*:\s*2).*?(?=\})
As seen here
There is another solution, but it is not regexp-based (as I had originally asked), so it should not count as the final answer to my own question; nonetheless it could be useful.
It is based on splitting the string into an array and then choosing the element in the array that satisfies my needs.
Steps:
transform the string into something better for splits (using '|' as separator):
replace(replace(replace(my_field,'},{','}|{'),'[{','{'),'}]','}')
split it using split(), which yields an array of strings (each one a json element)
find the relevant element ("t":2) - in my case, the first one is good enough, so I limit the query to 1: array( select data from unnest(split(replace(replace(replace(my_field,'},{','}|{'),'[{','{'),'}]','}'),'|')) as data where data like '%"t":2%' limit 1)
Convert that into a usable string with array_to_string() and use json_extract on that string to extract the relevant info from the element that I need (say, for example, location coordinate x).
So putting it all together:
round(
  safe_cast(
    json_extract(
      array_to_string(
        array(
          select data
          from unnest(split(replace(replace(replace(my_field,'},{','}|{'),'[{','{'),'}]','}'),'|')) as data
          where data like '%"t":2%'
          limit 1
        ), ''),
      '$.location[0]') as float64),
  3) loc_x
May 1st, 2020 Update
A new function, JSON_EXTRACT_ARRAY, has just been added to the list of JSON functions. This function allows you to extract the contents of a JSON document as a string array.
So, below, you can replace the json2array UDF with the built-in JSON_EXTRACT_ARRAY function, as in this example:
#standardSQL
SELECT id,
(
SELECT x
FROM UNNEST(JSON_EXTRACT_ARRAY(json, '$')) x
WHERE JSON_EXTRACT_SCALAR(x, '$.t') = '2'
) extracted
FROM `project.dataset.table`
==============
Below is for BigQuery Standard SQL
#standardSQL
CREATE TEMP FUNCTION json2array(json STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
return JSON.parse(json).map(x=>JSON.stringify(x));
""";
SELECT id,
(
SELECT x
FROM UNNEST(json2array(JSON_EXTRACT(json, '$'))) x
WHERE JSON_EXTRACT_SCALAR(x, '$.t') = '2'
) extracted
FROM `project.dataset.table`
You can test and play with the above using dummy data, as in the example below:
#standardSQL
CREATE TEMP FUNCTION json2array(json STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
return JSON.parse(json).map(x=>JSON.stringify(x));
""";
WITH `project.dataset.table` AS (
SELECT 1 id, '[{"location":[22.99902,66.000],"t":1},{"location":[55.32168,140.556],"t":2},{"location":[85.0002,20.0055],"t":3}]' json UNION ALL
SELECT 2, '[{"location":[22.99902,66.000],"t":11},{"location":[85.0002,20.0055],"t":13}]'
)
SELECT id,
(
SELECT x
FROM UNNEST(json2array(JSON_EXTRACT(json, '$'))) x
WHERE JSON_EXTRACT_SCALAR(x, '$.t') = '2'
) extracted
FROM `project.dataset.table`
with output
Row id extracted
1 1 {"location":[55.32168,140.556],"t":2}
2 2 null
The above assumes that there is no more than one element with "t":2 in the json column. If there can be more than one, you should use ARRAY as below:
SELECT id,
ARRAY(
SELECT x
FROM UNNEST(json2array(JSON_EXTRACT(json, '$'))) x
WHERE JSON_EXTRACT_SCALAR(x, '$.t') = '2'
) extracted
FROM `project.dataset.table`
Even though you have posted a workaround for your issue, I believe this answer will be informative. You mentioned that one of the answers selected more than what you needed, so I wrote the query below to reproduce your case and achieve the aimed output.
WITH
data AS (
SELECT
" [{ \"location\":[22.99902,66.000]\"t\":1},{\"location\":[55.32168,140.556],\"t\":2},{\"location\":[85.0002,20.0055],\"t\":3}] " AS string_j
UNION ALL
SELECT
" [{ \"location\":[22.99902,66.000]\"t\":1},{\"location\":[55.32168,140.556],\"t\":3},{\"location\":[85.0002,20.0055],\"t\":3}] " AS string_j
UNION ALL
SELECT
" [{ \"location\":[22.99902,66.000]\"t\":1},{\"location\":[55.32168,140.556],\"t\":3},{\"location\":[85.0002,20.0055],\"t\":3}] " AS string_j
UNION ALL
SELECT
" [{ \"location\":[22.99902,66.000]\"t\":1},{\"location\":[55.32168,140.556],\"t\":3},{\"location\":[85.0002,20.0055],\"t\":3}] " AS string_j ),
refined_data AS (
SELECT
REGEXP_EXTRACT(string_j, r"\{\"\w*\"\:\[\d*\.\d*\,\d*\.\d*\]\,\"t\"\:2\}") AS desired_field
FROM
data )
SELECT
*
FROM
refined_data
WHERE
desired_field IS NOT NULL
Notice that I have used the dummy data described above, populated inside the WITH clause.
Afterwards, in the refined_data table, I used REGEXP_EXTRACT to extract the desired string from the column. Observe that for the rows where the expression does not match, the output is null.
As you can see, now only a simple WHERE filter is needed to obtain the desired output, which is done in the last SELECT.
In addition, you can find information about the regex expression I used here.
Below is the table of animals on various floors.
ID,FLOOR_LEVEL,ANIMAL [column names]
01,A,CAT
02,A,DOG
03,B,DOG
04,B,CAT
05,B,CAT
06,C,CAT
I want to label the types of animal (i.e. CAT will be labelled 1, DOG will be labelled 2, and so on), as shown below, by creating a new column LABEL.
ID,FLOOR_LEVEL,ANIMAL,LABEL [column names]
01,A,CAT,1
02,A,DOG,2
03,B,DOG,2
04,B,CAT,1
05,B,CAT,1
06,C,CAT,1
It can be done by writing a query such as
INSERT INTO table_name (LABEL)
VALUES (1,2,2,1,1,1);
But how can this be generalised in MySQL for a huge number of different types of animals? Please help.
Your INSERT statement makes no sense:
INSERT INTO table_name (LABEL) VALUES (1,2,2,1,1,1);
It tries to insert a single row with 6 values into the one listed column (LABEL), so it will fail with a column-count error, and in any case it would not do what you want.
Instead make a new table to store the animal id and animal name:
CREATE TABLE animals ( id int, animal_name VARCHAR(50));
INSERT INTO animals VALUES (1, 'cat'),(2, 'dog'),(3,'tardigrade'),(4,'liger');
And then join that into your query:
SELECT t1.floor_level, t1.animal, t2.id
FROM table t1
INNER JOIN animals t2 ON
t1.animal = t2.animal_name;
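Since the question mentions a huge number of different animal types, the lookup table could instead be defined with AUTO_INCREMENT and populated automatically from the data; a sketch, assuming the source table is called table_name:
CREATE TABLE animals (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  animal_name VARCHAR(50) NOT NULL UNIQUE
);

INSERT INTO animals (animal_name)
SELECT DISTINCT animal
FROM table_name;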
Optionally you could use a case statement to do this within the query. It will get a little laborious if you have a ton of animals though. And you will have to rewrite it every time you query this table.
SELECT floor_level,
animal,
CASE WHEN animal = 'cat' THEN 1
WHEN animal = 'dog' THEN 2
WHEN animal = 'tardigrade' THEN 3
WHEN animal = 'liger' THEN 4
END as animal_id
FROM table;
If your requirement is not a numeric label, try using MD5() to generate a hash label for the animals. Its value will be the same for animals with the same name (case sensitive).
SELECT ID, FLOOR_LEVEL, ANIMAL, MD5(UPPER(ANIMAL))
FROM table;
Please note that MD5() strings are case sensitive, i.e. DOG and Dog would generate different output, hence I have used the UPPER() function.
Another option is to use the expression below, which converts the alphanumeric MD5() value to a long type. However, the numeric values will not start from 1, 2, etc.
cast(conv(substring(md5(upper(animal)), 1, 16), 16, 10) as unsigned integer)
EDIT - Based on OP comment
But is there any way to reduce the length of generated values?
To reduce the length of the generated value, use the RAND() function carefully. The only concern is that you may not want collisions between generated values.
round(cast(conv(substring(md5(upper(animal)), 1, 16), 16, 10) as unsigned integer) * (rand() * rand()))
A more sophisticated, optimized and non-colliding value may be generated using a Hive UDF.
The simplest way to solve your problem is with a DECODE
SELECT
DECODE(
ANIMAL,
'CAT',1,
'DOG',2,
'OTHER'
)
FROM ...
Here is more info:
https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions040.htm
Good Luck!
The command you are looking for is CASE
SELECT ID, FLOOR_LEVEL, ANIMAL,
CASE
WHEN ANIMAL = 'CAT' THEN 1
WHEN ANIMAL = 'DOG' THEN 2
END
FROM table;
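If you actually want a stored LABEL column (as the question describes), here is a sketch along the same lines, assuming your table is called table_name:
ALTER TABLE table_name ADD COLUMN LABEL INT;

UPDATE table_name
SET LABEL = CASE
              WHEN ANIMAL = 'CAT' THEN 1
              WHEN ANIMAL = 'DOG' THEN 2
            END;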