How to loop through an array in SQL where clause? - mysql

I have a MySQL table which has the following columns and records:
| Name | Total | GivenBy |
| ---- | -------- | ------------ |
| Z | 200 | ['A','B','C'] |
| X | 240 | ['A','D','C'] |
I would like to extract Record No. 1 on the basis of the 3rd column, where the SQL query would be something like:
SELECT * FROM mytable WHERE GivenBy='B';
Is there a way I can loop through the list in the third column and pick out the respective string as required in the SQL WHERE clause, in a single query?
Please note that I cannot add more columns in the table.
If you can provide a MySQL-compatible query, I would really appreciate it.

The "array" you show isn't quite valid JSON, but if you use double-quotes instead of single-quotes, you can use JSON_TABLE() to do this:
CREATE TABLE MyTable
(
Name CHAR(1) PRIMARY KEY,
Total INT NOT NULL,
GivenBy JSON NOT NULL
);
INSERT INTO MyTable VALUES
('Z', 200, '["A","B","C"]'),
('X', 240, '["A","D","C"]');
SELECT Name, Total, g.Value
FROM MyTable
CROSS JOIN JSON_TABLE(GivenBy, '$[*]' COLUMNS(Value CHAR(1) PATH '$')) AS g;
+------+-------+-------+
| name | total | value |
+------+-------+-------+
| X | 240 | A |
| X | 240 | D |
| X | 240 | C |
| Z | 200 | A |
| Z | 200 | B |
| Z | 200 | C |
+------+-------+-------+
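To answer the original question directly (rows whose GivenBy array contains 'B'), you can either filter the exploded rows or test membership without exploding; a minimal sketch against the table above:
SELECT Name, Total
FROM MyTable
CROSS JOIN JSON_TABLE(GivenBy, '$[*]' COLUMNS(Value CHAR(1) PATH '$')) AS g
WHERE g.Value = 'B';
-- or, without exploding the array:
SELECT Name, Total
FROM MyTable
WHERE JSON_CONTAINS(GivenBy, '"B"');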
But the best choice is not to store "arrays" in MySQL. Store the values one per row in a second table.

You can use the "like" keyword with regex to match your requirements in the third column.
select * from table where givenBy like "%B%";
Something similar would work.
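Note that %B% would also match any other value containing the letter B. If the column really stores the literal text ['A','B','C'] (single quotes included), matching the quotes as well narrows it down; a sketch under that assumption:
select * from mytable where GivenBy like "%'B'%";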

You need to run a script:
Retrieve the list of unique values in the GivenBy column using the following query:
SELECT DISTINCT JSON_EXTRACT(GivenBy, '$[*]') AS GivenByValues
FROM mytable;
Loop through the list of unique values, and for each value, run a query that uses that value in the WHERE clause:
SELECT *
FROM mytable
WHERE JSON_SEARCH(GivenBy, 'one', [current_value_from_loop]) IS NOT NULL;
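If you only need one known value rather than a loop, the same JSON_SEARCH test works in a single query; a sketch, assuming GivenBy holds valid JSON such as ["A","B","C"]:
SELECT *
FROM mytable
WHERE JSON_SEARCH(GivenBy, 'one', 'B') IS NOT NULL;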

Related

Looping through an array in SQL column

I have a SQL table that looks something like this:
| ID | Value |
| --- | ----------------------------------------------------- |
| 1 | {"name":"joe", "lastname":"doe", "age":"34"} |
| 2 | {"name":"jane", "lastname":"doe", "age":"29"} |
| 3 | {"name":"michael", "lastname":"dumplings", "age":"40"}|
How can I, using an SQL SELECT statement, select only the rows where "age" (in the Value column) is above 30?
Thank you.
The column Value, as it is, contains valid JSON data.
You can use the function JSON_EXTRACT() to get the age and convert it to a numeric value by adding 0:
SELECT *
FROM tablename
WHERE JSON_EXTRACT(Value, "$.age") + 0 > 30;
See the demo.
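An alternative that may read more clearly on MySQL 5.7+ is the unquoting extraction operator ->> with an explicit cast; a sketch, not part of the answer above:
SELECT *
FROM tablename
WHERE CAST(Value->>'$.age' AS UNSIGNED) > 30;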

How to find data based on comma separated parameter in comma separated data in my SQL query

We have the data below:
plant table
----------------------------
| name | classification |
| A | 1,4,7 |
| B | 2,3,7 |
| C | 3,4,9,8 |
| D | 1,5,6,9 |
Now, from the front-end side, they will send multiple parameters like "4,9",
and the expected output should be like this:
plant table
---------------------------
| name | classification |
| A | 1,4,7 |
| C | 3,4,9,8 |
| D | 1,5,6,9 |
I have already tried FIND_IN_SET, but it is only able to fetch with one parameter:
select * from plant o where find_in_set('4',classification ) <> 0
Another solution is to run multiple queries; for example, if the parameter is "4,9", we loop the query two times with parameters 4 and 9. But that solution would consume too many resources, since the data is around 10,000+ rows and there can actually be more than 5 parameters.
If the table design is bad practice then OK, but we are unable to change it since the table belongs to a third party.
Any solution or any insight will be appreciated,
Thank you
Schema (MySQL v8.0)
CREATE TABLE broken_table (name CHAR(12) PRIMARY KEY,classification VARCHAR(12));
INSERT INTO broken_table VALUES
('A','1,4,7'),
('B','2,3,7'),
('C','3,4,9,8'),
('D','1,5,6,9');
Query #1
WITH RECURSIVE cte (n) AS
(
SELECT 1
UNION ALL
SELECT n + 1 FROM cte WHERE n < 5
)
SELECT DISTINCT x.name, x.classification FROM broken_table x JOIN cte
WHERE SUBSTRING_INDEX(SUBSTRING_INDEX(classification,',',n),',',-1) IN (4,9);
| name | classification |
| A    | 1,4,7          |
| C    | 3,4,9,8        |
| D    | 1,5,6,9        |
View on DB Fiddle
EDIT:
or, for older versions...
SELECT DISTINCT x.name, x.classification FROM broken_table x JOIN
(
SELECT 1 n UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5
) cte
WHERE SUBSTRING_INDEX(SUBSTRING_INDEX(classification,',',n),',',-1) IN (4,9)
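As a side note, the FIND_IN_SET approach from the question can be extended to several parameters by OR-ing one call per value, provided the parameter count is small and known when the query is built; a sketch against the same schema:
SELECT *
FROM broken_table
WHERE FIND_IN_SET('4', classification)
OR FIND_IN_SET('9', classification);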
Let's just avoid the CSV altogether and fix your table design:
plant table
----------------------------
| name | classification |
| A | 1 |
| A | 4 |
| A | 7 |
| B | 2 |
| B | 3 |
| B | 7 |
| ... | ... |
Now with this design, you may use the following statement:
SELECT *
FROM plant
WHERE classification IN (?);
To the ? placeholder, you may bind your collection of values to match (e.g. (4,9)).
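As an illustrative sketch of that redesign (the table and column names below are hypothetical, not from the original schema):
-- One classification value per row.
CREATE TABLE plant_classification (
name CHAR(12) NOT NULL,
classification INT NOT NULL,
PRIMARY KEY (name, classification)
);
INSERT INTO plant_classification VALUES
('A',1),('A',4),('A',7),
('B',2),('B',3),('B',7),
('C',3),('C',4),('C',9),('C',8),
('D',1),('D',5),('D',6),('D',9);
-- The front-end parameters expand directly into the IN list:
SELECT DISTINCT name
FROM plant_classification
WHERE classification IN (4, 9);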
You want an OR across the parameters, so you can use regular expressions. If everything were one digit:
where classification regexp replace('4,9', ',', '|')
However, this would match 42 and 19, which I'm guessing you do not want. So, make this a little more complicated so you have comma delimiters:
where classification regexp concat('(,|^)(', replace('4,9', ',', '|'), ')(,|$)')

Exotic GROUP BY In MySQL

Consider a typical GROUP BY statement in SQL: you have a table like
+------+-------+
| Name | Value |
+------+-------+
| A | 1 |
| B | 2 |
| A | 3 |
| B | 4 |
+------+-------+
And you ask for
SELECT Name, SUM(Value) as Value
FROM table
GROUP BY Name
You'll receive
+------+-------+
| Name | Value |
+------+-------+
| A | 4 |
| B | 6 |
+------+-------+
In your head, you can imagine that SQL generates an intermediate sorted table like
+------+-------+
| Name | Value |
+------+-------+
| A | 1 |
| A | 3 |
| B | 2 |
| B | 4 |
+------+-------+
and then aggregates together successive rows: the "Value" column has been given an aggregator (in this case SUM), so it's easy to aggregate. The "Name" column has been given no aggregator, and thus uses what you might call the "trivial partial aggregator": given two things that are the same (e.g. A and A), it aggregates them into a single copy of one of the inputs (in this case A). Given any other input it doesn't know what to do and is forced to begin aggregating anew (this time with the "Name" column equal to B).
I want to do a more exotic kind of aggregation. My table looks like
+------+-------+
| Name | Value |
+------+-------+
| A | 1 |
| BC | 2 |
| AY | 3 |
| AZ | 4 |
| B | 5 |
| BCR | 6 |
+------+-------+
And the intended output is
+------+-------+
| Name | Value |
+------+-------+
| A | 8 |
| B | 13 |
+------+-------+
Where does this come from? A and B are the "minimal prefixes" for this set of names: they occur in the data set and every Name has exactly one of them as a prefix. I want to aggregate data by grouping rows together when their Names have the same minimal prefix (and add the Values, of course).
In the toy grouping model from before, the intermediate sorted table would be
+------+-------+
| Name | Value |
+------+-------+
| A | 1 |
| AY | 3 |
| AZ | 4 |
| B | 5 |
| BC | 2 |
| BCR | 6 |
+------+-------+
Instead of using the "trivial partial aggregator" for Names, we would use one that can aggregate X and Y together iff X is a prefix of Y; in that case it returns X. So the first three rows would be aggregated together into a row with (Name, Value) = (A, 8), then the aggregator would see that A and B couldn't be aggregated and would move on to a new "block" of rows to aggregate.
The tricky thing is that the value we're grouping by is "non-local": if A were not a name in the dataset, then AY and AZ would each be a minimal prefix. It turns out that the AY and AZ rows are aggregated into the same row in the final output, but you couldn't know that just by looking at them in isolation.
Miraculously, in my use case the minimal prefix of a string can be determined without reference to anything else in the dataset. (Imagine that each of my names is one of the strings "hello", "world", and "bar", followed by any number of z's. I want to group all of the Names with the same "base" word together.)
As I see it I have two options:
1) The simple option: compute the prefix for each row and group by that value directly. Unfortunately I have an index on the Name, and computing the minimal prefix (whose length depends on the Name itself) prevents me from using that index. This forces a full table scan, which is prohibitively slow.
2) The complicated option: somehow convince MySQL to use the "partial prefix aggregator" for Name. This runs into the "non-locality" problem above, but that's fine as long as we scan the table according to my index on Name, since then every minimal prefix will be encountered before any of the other strings it is a prefix of; we would never try to aggregate AY and AZ together if A were in the dataset.
In a procedural programming language #2 would be rather easy: extract rows one at a time, in alphabetical order, keeping track of the current prefix. If your new row's Name has that as a prefix, it goes in the bucket you're currently using. Otherwise, start a new bucket with that as your prefix. In MySQL I am lost as to how to do it. Note that the set of minimal prefixes is not known beforehand.
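To make option 1 concrete for the toy "base word plus z's" case (using the answers' table1 naming), a sketch in which the TRIM expression stands in for whatever per-row prefix computation applies to the real data:
-- The computed expression cannot use the index on Name, hence the full table scan described in option 1.
SELECT TRIM(TRAILING 'z' FROM Name) AS prefix,
SUM(Value) AS Value
FROM table1
GROUP BY prefix;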
Edit 2
It occurred to me that if the table is ordered by Name, this would be a lot easier (and faster). Since I don't know whether your data is sorted, I've included a sort in this query, but if the data is sorted, you can strip out (SELECT * FROM table1 ORDER BY Name) t1 and just use FROM table1:
SELECT prefix, SUM(`Value`)
FROM (SELECT Name, Value, @prefix := IF(Name NOT LIKE CONCAT(@prefix, '_%'), Name, @prefix) AS prefix
FROM (SELECT * FROM table1 ORDER BY Name) t1
JOIN (SELECT @prefix := '~') p
) t2
GROUP BY prefix
Updated SQLFiddle
Edit
Having slept on the problem, I realised that there is no need to do the IN, it's enough to just have a WHERE NOT EXISTS clause on the JOINed table:
SELECT t1.Name, SUM(t2.Value) AS `Value`
FROM table1 t1
JOIN table1 t2 ON t2.Name LIKE CONCAT(t1.Name, '%')
WHERE NOT EXISTS (SELECT *
FROM table1 t3
WHERE t1.Name LIKE CONCAT(t3.Name, '_%')
)
GROUP BY t1.Name
Updated Explain (Name changed to UNIQUE key from PRIMARY)
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY t1 index Name Name 11 NULL 6 Using where; Using index; Using temporary; Using filesort
1 PRIMARY t2 ALL NULL NULL NULL NULL 6 Using where; Using join buffer (Block Nested Loop)
3 DEPENDENT SUBQUERY t3 index NULL Name 11 NULL 6 Using where; Using index
Updated SQLFiddle
Original Answer
Here is one way you could do it. First, you need to find all the unique prefixes in your table. You can do that by looking for all values of Name where it does not look like another value of Name with other characters on the end. This can be done with this query:
SELECT Name
FROM table1 t1
WHERE NOT EXISTS (SELECT *
FROM table1 t2
WHERE t1.Name LIKE CONCAT(t2.Name, '_%')
)
For your sample data, that will give
Name
A
B
Now you can sum all the values where the Name starts with one of those prefixes. Note we change the LIKE pattern in this query so that it also matches the prefix, otherwise we wouldn't count the values for A and B in your example:
SELECT t1.Name, SUM(t2.Value) AS `Value`
FROM table1 t1
JOIN table1 t2 ON t2.Name LIKE CONCAT(t1.Name, '%')
WHERE t1.Name IN (SELECT Name
FROM table1 t3
WHERE NOT EXISTS (SELECT *
FROM table1 t4
WHERE t3.Name LIKE CONCAT(t4.Name, '_%')
)
)
GROUP BY t1.Name
Output:
Name Value
A 8
B 13
An EXPLAIN says that both of these queries use the index on Name, so they should be reasonably efficient. Here is the result of the EXPLAIN on my MySQL 5.6 server:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY t1 index PRIMARY PRIMARY 11 NULL 6 Using index; Using temporary; Using filesort
1 PRIMARY t3 eq_ref PRIMARY PRIMARY 11 test.t1.Name 1 Using where; Using index
1 PRIMARY t2 ALL NULL NULL NULL NULL 6 Using where; Using join buffer (Block Nested Loop)
3 DEPENDENT SUBQUERY t4 index NULL PRIMARY 11 NULL 6 Using where; Using index
SQLFiddle Demo
Here are some hints on how to do the task. This locates any prefixes that are useful. That's not what you asked for, but the flow of the query and the usage of @variables, plus the need for 2 (actually 3) levels of nesting, might help you.
SELECT DISTINCT `Prev`
FROM
(
SELECT @prev := @next AS 'Prev',
@next := IF(LEFT(city, LENGTH(@prev)) = @prev, @next, city) AS 'Next'
FROM ( SELECT @next := ' ' ) AS init
JOIN ( SELECT DISTINCT city FROM us ) AS dedup
ORDER BY city
) x
WHERE `Prev` = `Next` ;
Partial output:
+----------------+
| Prev |
+----------------+
| Alamo |
| Allen |
| Altamont |
| Ames |
| Amherst |
| Anderson |
| Arlington |
| Arroyo |
| Auburn |
| Austin |
| Avon |
| Baker |
Check the Al% cities:
mysql> SELECT DISTINCT city FROM us WHERE city LIKE 'Al%' ORDER BY city;
+-------------------+
| city |
+-------------------+
| Alabaster |
| Alameda |
| Alamo | <--
| Alamogordo | <--
| Alamosa |
| Albany |
| Albemarle |
...
| Alhambra |
| Alice |
| Aliquippa |
| Aliso Viejo |
| Allen | <--
| Allen Park | <--
| Allentown | <--
| Alliance |
| Allouez |
| Alma |
| Aloha |
| Alondra Park |
| Alpena |
| Alpharetta |
| Alpine |
| Alsip |
| Altadena |
| Altamont | <--
| Altamonte Springs | <--
| Alton |
| Altoona |
| Altus |
| Alvin |
+-------------------+
40 rows in set (0.01 sec)

SQL query in MySQL containing mathematical comparison

I need an SQL query that finds values from table B using (randomized) values from table A in a comparative manner. The values in table A have been produced in a randomized manner. The values in table B are ordered as a cumulative distribution function. What is needed is that the SQL gets the first row from table B which satisfies the criterion.
Table A:
+----+-------+
| ID | value |
+----+-------+
| 1 | 0.1234|
| 2 | 0.8923|
| 3 | 0.5221|
+----+-------+
Table B:
+----+-------+------+
| ID | value | name |
+----+-------+------+
| 1 | 0.2000| Alpha|
| 2 | 0.5000| Beta |
| 3 | 0.7500| Gamma|
| 4 | 1.0000| Delta|
+----+-------+------+
Result should be:
+----+-------+------+
| ID | value | name |
+----+-------+------+
| 1 | 0.1234| Alpha|
| 2 | 0.8923| Delta|
| 3 | 0.5221| Gamma|
+----+-------+------+
Value 0.1234 is smaller than all the values of B, and Alpha has the smallest value.
Value 0.8923 is smaller than only 1.0000 --> Delta.
Value 0.5221 is smaller than both 0.7500 and 1.0000, but 0.7500 is the smaller --> Gamma.
This query works only if table A has one value:
select value, name
from B
where (select value from A) < value;
Any ideas how to get this to work with the full table A?
You can use a subquery to get the data you need:
SELECT a.ID, a.value,
(SELECT b.name FROM TableB b WHERE a.value < b.value ORDER BY b.ID ASC LIMIT 1) as name
FROM TableA a
In this case, for each row in table A you find the first record in table B that has a larger number in the value column. Depending on your requirements, the operator < might need to be updated to <=.
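If the IDs in table B are not guaranteed to follow the value ordering, ordering the subquery by value instead of ID is a safer variant; a sketch under that assumption:
SELECT a.ID, a.value,
(SELECT b.name
FROM TableB b
WHERE b.value > a.value
ORDER BY b.value ASC
LIMIT 1) AS name
FROM TableA a;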

Dynamically display contents of a table without duplicates

How do I display all the contents of a table without duplicates?
Example:
Table 1
______________________
| ID | VALUE |
| 0 | A |
| 1 | A |
| 2 | B |
| 3 | C |
Expected output:
A
B
C
What about
SELECT DISTINCT VALUE FROM table
or use a GROUP BY statement with the VALUE column.
There are not any duplicate rows; you can use the DISTINCT keyword only on the values, like:
SELECT DISTINCT VALUE FROM XYZ;
You can do the same using GROUP BY:
SELECT VALUE FROM XYZ GROUP BY VALUE;
but this is bad practice; a good developer will not use it.