How to search for a number in a JSON array in MySQL?

I have a JSON array in MySQL like [1,2,3].
How can I find the position of 1 in it?
I tried to use this:
SELECT JSON_SEARCH('[1,2,3]','one',1);
But the result of this is NULL.
I expect the output to be $[0].

Your JSON array elements should be in double quotes:
WITH yourTable AS (
SELECT '["1","2","3"]' AS json
)
SELECT
json,
JSON_SEARCH(json, 'one', '1')
FROM yourTable;
This returns "$[0]" for the match of '1' against the JSON text.
As to why it doesn't work with number literals (which should in fact be valid literal JSON values), it seems that the third parameter to JSON_SEARCH is a search string, and it only works against actual text, not numbers.
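If the numbers have to stay unquoted, one workaround is to do the lookup outside JSON_SEARCH. The following is a minimal Python sketch of that lookup, assuming the array has already been fetched as text; json_search_number is a hypothetical helper, not a MySQL function.

```python
import json

def json_search_number(doc, target):
    """Return the JSON path of the first array element equal to target, or None."""
    for index, element in enumerate(json.loads(doc)):
        # Compare the type as well, so the number 1 does not match the string "1".
        if type(element) is type(target) and element == target:
            return f"$[{index}]"
    return None

print(json_search_number('[1,2,3]', 1))        # $[0]
print(json_search_number('["1","2","3"]', 1))  # None
```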


Bigquery: Extract data from an array of json

(This is an extension to this question, but my reputation is too low to comment or ask more questions on that topic...)
We work on BigQuery, so we are limited in importing packages or using other languages. As per the link above, JS is a solution, but not the one I'm looking for here: I implemented it in JS, and it was too slow for our needs.
Suppose one of our columns is a string that looks like this (array of json):
[{"location":[22.99902,66.000],"t":1},{"location":[55.32168,140.556],"t":2},{"location":[85.0002,20.0055],"t":3}]
I want to extract from the column the json for which "t":2.
Where:
Some rows have no element with "t":2
Some rows have several elements with "t":2
The number of json elements in each string can change
The element with "t":2 is not always in second position.
I don't know regexp well enough for this. We tried regexp_extract with this pattern: r'(\{.*?\"t\":2.*?\})', but that doesn't work. It extracts everything from the first { up to and including the json for "t":2. We only want the json of element "t":2.
Could you advise a regexp pattern that would work?
EDIT:
I have a preference for a solution that gives me 1 match. Suppose I have this string:
[{"location":[22.99902,66.000],"t":1},{"location":[55.32168,140.556],"t":2},{"location":[55.33,141.785],"t":2}],
I would prefer receiving only 1 answer, the first one.
In that case perhaps regexp is less appropriate, but I'm really not sure?
How about this:
(?<=\{)(?=.*?\"t\"\s*:\s*2).*?(?=\})
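To see why the lazy pattern from the question over-matches, here is a small sketch using Python's re module rather than BigQuery's regex engine, so treat it as an illustration only. The second pattern is an assumed alternative, not taken from the thread: it relies on the elements containing no nested braces.

```python
import re

text = ('[{"location":[22.99902,66.000],"t":1},'
        '{"location":[55.32168,140.556],"t":2},'
        '{"location":[55.33,141.785],"t":2}]')

# The lazy pattern from the question: .*? may cross object boundaries,
# so the match starts at the first "{" in the string.
lazy = re.search(r'\{.*?"t":2.*?\}', text).group(0)

# Assumed alternative: [^{}]* cannot cross a brace, so the match stays
# inside a single object. re.search returns the first such object.
bounded = re.search(r'\{[^{}]*"t"\s*:\s*2\b[^{}]*\}', text).group(0)

print(lazy)
print(bounded)
```

The first print shows two objects glued together; the second shows only the first element with "t":2.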
There is another solution, but it is not regexp based (as I had originally asked). So this should not count as the final answer to my own question; nonetheless, it could be useful.
It is based on splitting the string into an array and then choosing the element in the array that satisfies my needs.
Steps:
transform the string into something better for splits (using '|' as separator):
replace(replace(replace(my_field,'},{','}|{'),'[{','{'),'}]','}')
split it using split(), which yields an array of strings (each one a json element)
find the relevant element ("t":2) - in my case, the first one is good enough, so I limit the query to 1: array( select data from unnest(split(replace(replace(replace(my_field,'},{','}|{'),'[{','{'),'}]','}'),'|')) as data where data like '%"t":2%' limit 1)
Convert that into a useable string with array_to_string() and use json_extract on that string to extract the relevant info from the element that I need (say for example, location coordinate x).
So putting it all together:
round(safe_cast(json_extract(array_to_string(array( select data from unnest(split(replace(replace(replace(my_field,'},{','}|{'),'[{','{'),'}]','}'),'|')) as data where data like '%"t":2%' limit 1),''),'$.location[0]') as float64),3) loc_x
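The same steps can be traced outside SQL. This is a rough Python sketch of the split approach, assuming the field has the shape shown in the question; first_loc_x and its marker parameter are hypothetical names used only for illustration.

```python
import json

def first_loc_x(my_field, marker='"t":2'):
    # Steps 1-2: turn "},{" into "}|{", strip the outer brackets, split on "|".
    flat = my_field.replace('},{', '}|{').replace('[{', '{').replace('}]', '}')
    elements = flat.split('|')
    # Step 3: keep the first element containing the marker (the "limit 1" step).
    match = next((e for e in elements if marker in e), None)
    if match is None:
        return None
    # Step 4: parse that element and read location[0], rounded to 3 decimals.
    return round(json.loads(match)['location'][0], 3)

sample = ('[{"location":[22.99902,66.000],"t":1},'
          '{"location":[55.32168,140.556],"t":2},'
          '{"location":[85.0002,20.0055],"t":3}]')
print(first_loc_x(sample))  # 55.322
```

Like the LIKE filter in the query, the substring test in this sketch would also accept an element with "t":20.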
May 1st, 2020 Update
A new function, JSON_EXTRACT_ARRAY, has just been added to the list of JSON functions. This function allows you to extract the contents of a JSON document as a string array.
So you can replace the json2array UDF used below with the built-in function JSON_EXTRACT_ARRAY, as in this example:
#standardSQL
SELECT id,
(
SELECT x
FROM UNNEST(JSON_EXTRACT_ARRAY(json, '$')) x
WHERE JSON_EXTRACT_SCALAR(x, '$.t') = '2'
) extracted
FROM `project.dataset.table`
==============
Below is for BigQuery Standard SQL
#standardSQL
CREATE TEMP FUNCTION json2array(json STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
return JSON.parse(json).map(x=>JSON.stringify(x));
""";
SELECT id,
(
SELECT x
FROM UNNEST(json2array(JSON_EXTRACT(json, '$'))) x
WHERE JSON_EXTRACT_SCALAR(x, '$.t') = '2'
) extracted
FROM `project.dataset.table`
You can test and play with the above using dummy data, as in the example below
#standardSQL
CREATE TEMP FUNCTION json2array(json STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
return JSON.parse(json).map(x=>JSON.stringify(x));
""";
WITH `project.dataset.table` AS (
SELECT 1 id, '[{"location":[22.99902,66.000],"t":1},{"location":[55.32168,140.556],"t":2},{"location":[85.0002,20.0055],"t":3}]' json UNION ALL
SELECT 2, '[{"location":[22.99902,66.000],"t":11},{"location":[85.0002,20.0055],"t":13}]'
)
SELECT id,
(
SELECT x
FROM UNNEST(json2array(JSON_EXTRACT(json, '$'))) x
WHERE JSON_EXTRACT_SCALAR(x, '$.t') = '2'
) extracted
FROM `project.dataset.table`
with output
Row id extracted
1 1 {"location":[55.32168,140.556],"t":2}
2 2 null
The above assumes that there is no more than one element with "t":2 in the json column. If there can be more than one, you should add ARRAY as below
SELECT id,
ARRAY(
SELECT x
FROM UNNEST(json2array(JSON_EXTRACT(json, '$'))) x
WHERE JSON_EXTRACT_SCALAR(x, '$.t') = '2'
) extracted
FROM `project.dataset.table`
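For readers who want to check the logic outside BigQuery, the filter can be mirrored in a few lines of Python. This is only a sketch of the same idea (parse the array, keep elements whose "t" is 2); extract_elements is a hypothetical name, and the list it returns corresponds to the ARRAY(...) form of the query.

```python
import json

def extract_elements(json_text, t_value=2):
    """Return every array element whose "t" equals t_value, as compact JSON text."""
    return [json.dumps(element, separators=(',', ':'))
            for element in json.loads(json_text)
            if element.get('t') == t_value]

row1 = ('[{"location":[22.99902,66.000],"t":1},'
        '{"location":[55.32168,140.556],"t":2},'
        '{"location":[85.0002,20.0055],"t":3}]')
row2 = '[{"location":[22.99902,66.000],"t":11},{"location":[85.0002,20.0055],"t":13}]'

print(extract_elements(row1))
print(extract_elements(row2))
```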
Even though you have already posted a workaround for your issue, I believe this answer will be informative. You mentioned that one of the answers selected more than what you needed, so I wrote the query below to reproduce your case and achieve the intended output.
WITH
data AS (
SELECT
" [{ \"location\":[22.99902,66.000],\"t\":1},{\"location\":[55.32168,140.556],\"t\":2},{\"location\":[85.0002,20.0055],\"t\":3}] " AS string_j
UNION ALL
SELECT
" [{ \"location\":[22.99902,66.000],\"t\":1},{\"location\":[55.32168,140.556],\"t\":3},{\"location\":[85.0002,20.0055],\"t\":3}] " AS string_j
UNION ALL
SELECT
" [{ \"location\":[22.99902,66.000],\"t\":1},{\"location\":[55.32168,140.556],\"t\":3},{\"location\":[85.0002,20.0055],\"t\":3}] " AS string_j
UNION ALL
SELECT
" [{ \"location\":[22.99902,66.000],\"t\":1},{\"location\":[55.32168,140.556],\"t\":3},{\"location\":[85.0002,20.0055],\"t\":3}] " AS string_j ),
refined_data AS (
SELECT
REGEXP_EXTRACT(string_j, r"\{\"\w*\"\:\[\d*\.\d*\,\d*\.\d*\]\,\"t\"\:2\}") AS desired_field
FROM
data )
SELECT
*
FROM
refined_data
WHERE
desired_field IS NOT NULL
Notice that I used the dummy data defined in the data table, populated inside the WITH clause.
Afterwards, in the table refined_data, I used REGEXP_EXTRACT to extract the desired string from the column. Observe that for the rows where the expression does not match, the output is null.
As you can see, all that is then needed is a simple WHERE filter to obtain the desired output, which is done in the last select.

Avoid escaping characters when converting from tabular data to json

I have some problems converting tabular data to JSON using the FOR JSON PATH syntax.
If I run a standard query:
SELECT b.Name FROM dbo
I get results of the form: 12/5-A-1. I need this converted to JSON data without escaping the forward slash character. However, when I convert it to JSON:
SELECT b.Name FROM dbo FOR JSON PATH, WITHOUT_ARRAY_WRAPPER
the result is of the form: {"Name": "12\/5-A-1"}
How can I do this transformation without escaping the forward slash and get the result {"Name": "12/5-A-1"}?
One option is to use a common table expression to generate the json, and then simply use replace when selecting from the common table expression.
First, create and populate sample data (Please save us this step in your future questions):
DECLARE @T AS TABLE
(
[Name] nvarchar(10)
)
INSERT INTO @T ([Name]) VALUES ('12/5-A-1');
The cte:
WITH CTE(Escaped) AS
(
SELECT [Name]
FROM @T
FOR JSON PATH, WITHOUT_ARRAY_WRAPPER
)
The final select:
SELECT REPLACE(Escaped, '\/','/') As Result
FROM CTE
Result:
{"Name":"12/5-A-1"}
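It may help to know that "\/" is a valid JSON escape for "/", so a consumer that parses the JSON sees the same value either way; the replace only matters when the raw text is compared or displayed. A short Python sketch of that point, using the sample value from the question:

```python
import json

escaped = r'{"Name":"12\/5-A-1"}'        # text as produced by FOR JSON PATH
unescaped = escaped.replace(r'\/', '/')  # same effect as the REPLACE call

# Both spellings decode to the same value once parsed.
assert json.loads(escaped) == json.loads(unescaped) == {"Name": "12/5-A-1"}
print(unescaped)  # {"Name":"12/5-A-1"}
```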

How to query something from an array using WHERE in sql

I tried the code below, but it does not work.
spark.sql("""SELECT categories, business_id
FROM business_data
WHERE categories = 'Ice Cream'
""").show(150, truncate=False)
It seems like there is a different way to query an array, but I can't figure it out.
This is what my data looks like.
Sample data:
Thank you
MySQL Specific:
FIND_IN_SET(str,strlist)
FROM DOCS:
Returns a value in the range of 1 to N if the string str is in the string list strlist consisting of N substrings. A string list is a string composed of substrings separated by , characters. If the first argument is a constant string and the second is a column of type SET, the FIND_IN_SET() function is optimized to use bit arithmetic. Returns 0 if str is not in strlist or if strlist is the empty string. Returns NULL if either argument is NULL. This function does not work properly if the first argument contains a comma (,) character.
mysql> SELECT FIND_IN_SET('b','a,b,c,d');
So in your case...
spark.sql("""SELECT categories, business_id
FROM business_data
WHERE Find_In_set('Ice Cream',categories) > 0
""").show(150, truncate=False)
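The return value is a 1-based position, so a result of 1 means the first item matched, and a membership test has to compare against 0. The following Python sketch re-implements the behaviour described in the quoted documentation to make that visible; it is an illustration, not the database function itself, and the category strings are made up.

```python
def find_in_set(needle, strlist):
    """Mimic the documented behaviour: 1-based position, 0 if absent, None for NULL."""
    if needle is None or strlist is None:
        return None
    if strlist == '':
        return 0
    items = strlist.split(',')
    return items.index(needle) + 1 if needle in items else 0

print(find_in_set('b', 'a,b,c,d'))                      # 2
print(find_in_set('Ice Cream', 'Ice Cream,Fast Food'))  # 1
print(find_in_set('Pizza', 'Ice Cream,Fast Food'))      # 0
```

In this sketch items are compared exactly, so a list written with ", " as separator leaves a leading space on every item after the first.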
Normally if you want to query something out of an Array you would use array_contains, as such:
SELECT business_id, categories
FROM business_data
WHERE array_contains(categories,'Ice Cream & Frozen Yogurt')

Querying a JSON String in SQL

I have a table with a single column that holds a JSON string. The JSON has multiple key-value pairs, and is stored as a string because it is the raw table.
One of the keys is "Ticket" and has dollar amount values. I am not certain if prices are in __.__ format, or just ____. I want to query the column to return me the entire string if this "Ticket" ends in a 6, as in 96 cents, or 66 cents, etc.
This is my query:
SELECT json FROM tablename
WHERE json RLIKE '%"TICKET": "___6",%'
OR json RLIKE '%"TICKET": "__._6",%'
This currently returns as blank.
How can I get the entire string if the dollar amount ends in a 6 (as in 6 cents)?
The search strings you are using are what you would use for LIKE, not for RLIKE.
So you could use LIKE:
select * from tablename
where (json LIKE '%"TICKET": "___6"%' or json LIKE '%"TICKET": "__._6"%')
Or a RLIKE with a regex:
select * from tablename
where json RLIKE '"TICKET":[ ]*"[0-9.]+6"'
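The regex can be tried out quickly outside the database. This sketch runs the same pattern through Python's re module against a few made-up rows (the row contents are assumptions, since the question does not show real data):

```python
import re

# Same pattern as in the RLIKE query above.
TICKET_ENDS_IN_6 = re.compile(r'"TICKET":[ ]*"[0-9.]+6"')

rows = [
    '{"TICKET": "12.96", "ID": 1}',
    '{"TICKET": "1266", "ID": 2}',
    '{"TICKET": "12.95", "ID": 3}',
]

# search() looks for the pattern anywhere in the string, like RLIKE does.
matches = [row for row in rows if TICKET_ENDS_IN_6.search(row)]
print(matches)
```

Note that [0-9.]+ needs at least one character before the final 6, so a ticket value of just "6" would not match.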

How do I get the type of a variable in MySQL?

I'm trying to change a table field that contains decimal numbers from varchar(255) to decimal(12,2). And before I do that, I'd like to find out if there is information that would get deleted in the process: are there any rows where this field contains something other than a decimal(12,2).
I'm stumped how to do this. Apparently there isn't a string function like is_numeric() in PHP. I already tried casting the field to decimal and then comparing it with the original string, but this returns TRUE even for obvious cases where it should not:
select ('abc' = convert('abc', decimal(12,2)));
returns 1
Any help? How do I find out if a string contains something other than a decimal in MySQL? Thanks.
Stupid me, I have to cast twice (to decimal and back to char), which makes it work:
select ('abc' = convert(convert('abc', decimal(12,2)), char(255)));
returns 0
Thanks.
If you want to check whether the strings are actually decimal numbers, you could also use a regular expression. The following regex can help :)
SELECT '31.23' REGEXP '^[[:digit:]]+([.][[:digit:]]+)?$'; # returns 1
SELECT '31' REGEXP '^[[:digit:]]+([.][[:digit:]]+)?$'; # returns 1
SELECT 'hey' REGEXP '^[[:digit:]]+([.][[:digit:]]+)?$'; # returns 0
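The same check can be prototyped in Python before running it against the table. The first pattern mirrors the idea of the regex above (digits, optionally followed by a period and more digits); the second is an assumed, stricter variant for decimal(12,2), allowing at most 10 integer digits and 2 decimals. Both function names are hypothetical.

```python
import re

DECIMAL_RE = re.compile(r'[0-9]+([.][0-9]+)?')
DECIMAL_12_2_RE = re.compile(r'[0-9]{1,10}([.][0-9]{1,2})?')

def looks_decimal(value):
    # fullmatch anchors the pattern at both ends of the string.
    return DECIMAL_RE.fullmatch(value) is not None

def fits_decimal_12_2(value):
    # Assumed stricter rule: up to 10 digits before the period, up to 2 after.
    return DECIMAL_12_2_RE.fullmatch(value) is not None

for candidate in ['31.23', '31', 'hey', '31.234']:
    print(candidate, looks_decimal(candidate), fits_decimal_12_2(candidate))
```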