Select multiple distinct columns from same table separately - mysql

Let's have this sample data:
+-------+---------+
| col1 | col2 |
+-------+---------+
| 111 | hello |
| 222 | hello |
| 111 | blabla |
| 444 | there |
| 555 | blabla |
| 555 | there |
+-------+---------+
I need a SQL query returning the distinct values of each column separately (as a feed for dropdown values used for filtering).
Thus the result should be:
+-------+---------+
| col1 | col2 |
+-------+---------+
| 111 | hello |
| 222 | blabla |
| 444 | there |
| 555 | |
+-------+---------+
The results need not be in this format; it's more important that I have the distinct values for easy access and iteration.
The closest I got is from here:
https://stackoverflow.com/a/12188117/169252
select (SELECT group_concat(DISTINCT col1) FROM testtable) as col1, (SELECT group_concat(DISTINCT col2) FROM testtable) as col2;
This returns:
+-----------------+-------------------+
| col1 | col2 |
+-----------------+-------------------+
| 111,222,444,555 | velo,hallo,blabla |
+-----------------+-------------------+
That's pretty close, and I'll choose this one if no better solution comes up; it's not optimal as values are comma separated and I need to split the results.
I also tried:
SELECT DISTINCT col1, col2 FROM testtable;
This returns the distinct combinations of BOTH columns, which is not what I want.
Also:
select col1,col2 from testtable group by col1,col2;
which has been suggested elsewhere, doesn't return what I need either: it returns every distinct (col1, col2) pair, so each column on its own still contains duplicates :)

One problem with what you're asking is that your expected result set doesn't really make sense from a relational database standpoint. Every column in a row of data should have a relationship with the other columns in the row.
The best way to approach this, IMO, is to return two result sets and process each one for each of your drop down boxes:
SELECT DISTINCT column_1 FROM My_Table
and
SELECT DISTINCT column_2 FROM My_Table
I'd also look into why you have that data in the same table to begin with if the two columns are not related. If they are related and you're trying to have a drop down for one column that then filters the items in the second drop down list then you really should return the full set of rows and let your front end application handle the filtering (and displaying unique results). Most drop down widgets should allow this kind of linking.
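As a rough illustration of the two-result-set approach, here is a minimal sketch that runs one SELECT DISTINCT per column and collects the values per dropdown. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained); the helper name distinct_values is hypothetical.

```python
import sqlite3

def distinct_values(conn, table, column):
    # Identifiers cannot be bound as parameters, so only pass trusted names here.
    rows = conn.execute(f"SELECT DISTINCT {column} FROM {table} ORDER BY {column}")
    return [row[0] for row in rows]

# In-memory stand-in for the MySQL table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE testtable (col1 INTEGER, col2 TEXT)")
conn.executemany(
    "INSERT INTO testtable VALUES (?, ?)",
    [(111, "hello"), (222, "hello"), (111, "blabla"),
     (444, "there"), (555, "blabla"), (555, "there")],
)

# One independent list per dropdown; the lists may have different lengths.
dropdowns = {col: distinct_values(conn, "testtable", col) for col in ("col1", "col2")}
print(dropdowns)
```

Keeping the two lists separate avoids padding the shorter column with empty cells, which is what the single-table layout in the question would require.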

Try
Select (SELECT DISTINCT col1 from testtable) AS col1, (SELECT DISTINCT col2 from testtable) AS col2;
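I think the group_concat is putting the commas into the returned table. Be aware that a scalar subquery may return only one row, so MySQL rejects this query with "Subquery returns more than 1 row" as soon as a column has more than one distinct value.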

Related

How to find data based on comma separated parameter in comma separated data in my SQL query

We have below data,
plant table
----------------------------
| name | classification |
| A | 1,4,7 |
| B | 2,3,7 |
| C | 3,4,9,8 |
| D | 1,5,6,9 |
The front end sends multiple parameters, such as "4,9",
and the desired output is:
plant table
---------------------------
| name | classification |
| A | 1,4,7 |
| C | 3,4,9,8 |
| D | 1,5,6,9 |
I already tried FIND_IN_SET, but it only lets me search for one parameter at a time:
select * from plant o where find_in_set('4',classification ) <> 0
Another option is to run one query per parameter: for "4,9" we would loop twice, once with 4 and once with 9. That would consume too many resources, since the table has 10000+ rows and there can be more than 5 parameters.
If the table design is bad practice, so be it; we cannot change it because the table belongs to a third party.
Any solution or insight would be appreciated.
Thank you
Schema (MySQL v8.0)
CREATE TABLE broken_table (name CHAR(12) PRIMARY KEY,classification VARCHAR(12));
INSERT INTO broken_table VALUES
('A','1,4,7'),
('B','2,3,7'),
('C','3,4,9,8'),
('D','1,5,6,9');
Query #1
WITH RECURSIVE cte (n) AS
(
SELECT 1
UNION ALL
SELECT n + 1 FROM cte WHERE n < 5
)
SELECT DISTINCT x.name, x.classification FROM broken_table x JOIN cte
WHERE SUBSTRING_INDEX(SUBSTRING_INDEX(classification,',',n),',',-1) IN (4,9);
| name | classification |
| A | 1,4,7 |
| C | 3,4,9,8 |
| D | 1,5,6,9 |
EDIT:
or, for older versions...
SELECT DISTINCT x.name, x.classification FROM broken_table x JOIN
(
SELECT 1 n UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5
) cte
WHERE SUBSTRING_INDEX(SUBSTRING_INDEX(classification,',',n),',',-1) IN (4,9)
Let's just avoid the CSV altogether and fix your table design:
plant table
----------------------------
| name | classification |
| A | 1 |
| A | 4 |
| A | 7 |
| B | 2 |
| B | 3 |
| B | 7 |
| ... | ... |
Now with this design, you may use the following statement:
SELECT *
FROM plant
WHERE classification IN (?);
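In place of the ? placeholder, bind the values to match (e.g. 4 and 9). Note that many client libraries bind exactly one value per placeholder, in which case the list needs one ? per value, as in IN (?, ?).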
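A minimal sketch of querying the normalized design, assuming one placeholder per value. It uses Python's sqlite3 module as a stand-in for MySQL so that it runs on its own; the function name plants_matching is hypothetical, and the normalized rows are derived from the CSV data in the question.

```python
import sqlite3

def plants_matching(conn, wanted):
    # An empty IN () list is not valid in MySQL, so short-circuit here.
    if not wanted:
        return []
    # One "?" per value: a single placeholder cannot stand for a whole list.
    placeholders = ", ".join("?" for _ in wanted)
    sql = (
        "SELECT DISTINCT name FROM plant "
        f"WHERE classification IN ({placeholders}) ORDER BY name"
    )
    return [row[0] for row in conn.execute(sql, list(wanted))]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plant (name TEXT, classification INTEGER)")

# Split each CSV string from the original table into one row per value.
source = {"A": "1,4,7", "B": "2,3,7", "C": "3,4,9,8", "D": "1,5,6,9"}
conn.executemany(
    "INSERT INTO plant VALUES (?, ?)",
    [(name, int(c)) for name, csv in source.items() for c in csv.split(",")],
)

matches = plants_matching(conn, [4, 9])
print(matches)
```

SELECT DISTINCT name is used here because a plant such as C matches both 4 and 9 and would otherwise appear twice.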
You want OR logic, so you can use regular expressions. If every value were a single digit:
where classification regexp replace('4,9', ',', '|')
However, this would match 42 and 19, which I'm guessing you do not want. So, make this a little more complicated so you have comma delimiters:
where classification regexp concat('(^|,)(', replace('4,9', ',', '|'), ')(,|$)')
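To see how a delimiter-aware pattern behaves, here is a small sketch that builds one from the parameter string and checks it against the sample rows. It uses Python's re module rather than MySQL's REGEXP (an assumption for the sake of a runnable example); the function name csv_membership_pattern and the extra row E are hypothetical, the latter added only to show that 42 and 19 are not matched.

```python
import re

def csv_membership_pattern(params):
    # "4,9" -> "(^|,)(4|9)(,|$)"; re.escape guards against regex metacharacters.
    alternatives = "|".join(re.escape(p) for p in params.split(","))
    return f"(^|,)({alternatives})(,|$)"

pattern = csv_membership_pattern("4,9")

rows = {
    "A": "1,4,7",
    "B": "2,3,7",
    "C": "3,4,9,8",
    "D": "1,5,6,9",
    "E": "42,19",  # hypothetical row: contains the digits 4 and 9 but not the values
}

# A row matches when any wanted value appears as a whole list element.
matched = [name for name, csv in rows.items() if re.search(pattern, csv)]
print(pattern, matched)
```

Grouping the alternatives in their own parentheses matters: without it, the alternation splits the whole pattern, and a wanted value at the very start or end of the list can be missed.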

Exotic GROUP BY In MySQL

Consider a typical GROUP BY statement in SQL: you have a table like
+------+-------+
| Name | Value |
+------+-------+
| A | 1 |
| B | 2 |
| A | 3 |
| B | 4 |
+------+-------+
And you ask for
SELECT Name, SUM(Value) as Value
FROM table
GROUP BY Name
You'll receive
+------+-------+
| Name | Value |
+------+-------+
| A | 4 |
| B | 6 |
+------+-------+
In your head, you can imagine that SQL generates an intermediate sorted table like
+------+-------+
| Name | Value |
+------+-------+
| A | 1 |
| A | 3 |
| B | 2 |
| B | 4 |
+------+-------+
and then aggregates together successive rows: the "Value" column has been given an aggregator (in this case SUM), so it's easy to aggregate. The "Name" column has been given no aggregator, and thus uses what you might call the "trivial partial aggregator": given two things that are the same (e.g. A and A), it aggregates them into a single copy of one of the inputs (in this case A). Given any other input it doesn't know what to do and is forced to begin aggregating anew (this time with the "Name" column equal to B).
I want to do a more exotic kind of aggregation. My table looks like
+------+-------+
| Name | Value |
+------+-------+
| A | 1 |
| BC | 2 |
| AY | 3 |
| AZ | 4 |
| B | 5 |
| BCR | 6 |
+------+-------+
And the intended output is
+------+-------+
| Name | Value |
+------+-------+
| A | 8 |
| B | 13 |
+------+-------+
Where does this come from? A and B are the "minimal prefixes" for this set of names: they occur in the data set and every Name has exactly one of them as a prefix. I want to aggregate data by grouping rows together when their Names have the same minimal prefix (and add the Values, of course).
In the toy grouping model from before, the intermediate sorted table would be
+------+-------+
| Name | Value |
+------+-------+
| A | 1 |
| AY | 3 |
| AZ | 4 |
| B | 5 |
| BC | 2 |
| BCR | 6 |
+------+-------+
Instead of using the "trivial partial aggregator" for Names, we would use one that can aggregate X and Y together iff X is a prefix of Y; in that case it returns X. So the first three rows would be aggregated together into a row with (Name, Value) = (A, 8), then the aggregator would see that A and B couldn't be aggregated and would move on to a new "block" of rows to aggregate.
The tricky thing is that the value we're grouping by is "non-local": if A were not a name in the dataset, then AY and AZ would each be a minimal prefix. It turns out that the AY and AZ rows are aggregated into the same row in the final output, but you couldn't know that just by looking at them in isolation.
Miraculously, in my use case the minimal prefix of a string can be determined without reference to anything else in the dataset. (Imagine that each of my names is one of the strings "hello", "world", and "bar", followed by any number of z's. I want to group all of the Names with the same "base" word together.)
As I see it I have two options:
1) The simple option: compute the prefix for each row and group by that value directly. Unfortunately I have an index on the Name, and computing the minimal prefix (whose length depends on the Name itself) prevents me from using that index. This forces a full table scan, which is prohibitively slow.
2) The complicated option: somehow convince MySQL to use the "partial prefix aggregator" for Name. This runs into the "non-locality" problem above, but that's fine as long as we scan the table according to my index on Name, since then every minimal prefix will be encountered before any of the other strings it is a prefix of; we would never try to aggregate AY and AZ together if A were in the dataset.
In an imperative programming language #2 would be rather easy: extract rows one at a time, in alphabetical order, keeping track of the current prefix. If your new row's Name has that as a prefix, it goes in the bucket you're currently using. Otherwise, start a new bucket with the new Name as your prefix. In MySQL I am lost as to how to do it. Note that the set of minimal prefixes is not known beforehand.
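For reference, the row-by-row procedure described above can be sketched in a few lines of Python. This is only an illustration of the intended semantics, not a MySQL solution; the function name sum_by_minimal_prefix is hypothetical, and Python's default string ordering is assumed to agree with the index order on Name.

```python
def sum_by_minimal_prefix(rows):
    """Sum values per minimal prefix, scanning names in sorted order."""
    totals = {}
    prefix = None
    for name, value in sorted(rows):
        # A name that does not extend the current prefix starts a new bucket.
        if prefix is None or not name.startswith(prefix):
            prefix = name
            totals[prefix] = 0
        totals[prefix] += value
    return totals

rows = [("A", 1), ("BC", 2), ("AY", 3), ("AZ", 4), ("B", 5), ("BCR", 6)]
result = sum_by_minimal_prefix(rows)
print(result)
```

Because the rows are visited in sorted order, a minimal prefix is always seen before any longer name that starts with it, which is what makes the single pass sufficient.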
Edit 2
It occurred to me that if the table is ordered by Name, this would be a lot easier (and faster). Since I don't know if your data is sorted, I've included a sort in this query, but if the data is sorted, you can strip out (SELECT * FROM table1 ORDER BY Name) t1 and just use FROM table1
SELECT prefix, SUM(`Value`)
FROM (SELECT Name, Value, @prefix:=IF(Name NOT LIKE CONCAT(@prefix, '_%'), Name, @prefix) AS prefix
FROM (SELECT * FROM table1 ORDER BY Name) t1
JOIN (SELECT @prefix := '~') p
) t2
GROUP BY prefix
Edit
Having slept on the problem, I realised that there is no need to do the IN, it's enough to just have a WHERE NOT EXISTS clause on the JOINed table:
SELECT t1.Name, SUM(t2.Value) AS `Value`
FROM table1 t1
JOIN table1 t2 ON t2.Name LIKE CONCAT(t1.Name, '%')
WHERE NOT EXISTS (SELECT *
FROM table1 t3
WHERE t1.Name LIKE CONCAT(t3.Name, '_%')
)
GROUP BY t1.Name
Updated Explain (Name changed to UNIQUE key from PRIMARY)
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
| 1 | PRIMARY | t1 | index | Name | Name | 11 | NULL | 6 | Using where; Using index; Using temporary; Using filesort |
| 1 | PRIMARY | t2 | ALL | NULL | NULL | NULL | NULL | 6 | Using where; Using join buffer (Block Nested Loop) |
| 3 | DEPENDENT SUBQUERY | t3 | index | NULL | Name | 11 | NULL | 6 | Using where; Using index |
Original Answer
Here is one way you could do it. First, you need to find all the unique prefixes in your table. You can do that by looking for all values of Name where it does not look like another value of Name with other characters on the end. This can be done with this query:
SELECT Name
FROM table1 t1
WHERE NOT EXISTS (SELECT *
FROM table1 t2
WHERE t1.Name LIKE CONCAT(t2.Name, '_%')
)
For your sample data, that will give
Name
A
B
Now you can sum all the values where the Name starts with one of those prefixes. Note we change the LIKE pattern in this query so that it also matches the prefix, otherwise we wouldn't count the values for A and B in your example:
SELECT t1.Name, SUM(t2.Value) AS `Value`
FROM table1 t1
JOIN table1 t2 ON t2.Name LIKE CONCAT(t1.Name, '%')
WHERE t1.Name IN (SELECT Name
FROM table1 t3
WHERE NOT EXISTS (SELECT *
FROM table1 t4
WHERE t3.Name LIKE CONCAT(t4.Name, '_%')
)
)
GROUP BY t1.Name
Output:
Name Value
A 8
B 13
An EXPLAIN says that both of these queries use the index on Name, so should be reasonably efficient. Here is the result of the explain on my MySQL 5.6 server:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
| 1 | PRIMARY | t1 | index | PRIMARY | PRIMARY | 11 | NULL | 6 | Using index; Using temporary; Using filesort |
| 1 | PRIMARY | t3 | eq_ref | PRIMARY | PRIMARY | 11 | test.t1.Name | 1 | Using where; Using index |
| 1 | PRIMARY | t2 | ALL | NULL | NULL | NULL | NULL | 6 | Using where; Using join buffer (Block Nested Loop) |
| 3 | DEPENDENT SUBQUERY | t4 | index | NULL | PRIMARY | 11 | NULL | 6 | Using where; Using index |
Here are some hints on how to do the task. This locates any prefixes that are useful. That's not what you asked for, but the flow of the query and the usage of @variables, plus the need for 2 (actually 3) levels of nesting, might help you.
SELECT DISTINCT `Prev`
FROM
(
SELECT @prev := @next AS 'Prev',
@next := IF(LEFT(city, LENGTH(@prev)) = @prev, @next, city) AS 'Next'
FROM ( SELECT @next := ' ' ) AS init
JOIN ( SELECT DISTINCT city FROM us ) AS dedup
ORDER BY city
) x
WHERE `Prev` = `Next` ;
Partial output:
+----------------+
| Prev |
+----------------+
| Alamo |
| Allen |
| Altamont |
| Ames |
| Amherst |
| Anderson |
| Arlington |
| Arroyo |
| Auburn |
| Austin |
| Avon |
| Baker |
Check the Al% cities:
mysql> SELECT DISTINCT city FROM us WHERE city LIKE 'Al%' ORDER BY city;
+-------------------+
| city |
+-------------------+
| Alabaster |
| Alameda |
| Alamo | <--
| Alamogordo | <--
| Alamosa |
| Albany |
| Albemarle |
...
| Alhambra |
| Alice |
| Aliquippa |
| Aliso Viejo |
| Allen | <--
| Allen Park | <--
| Allentown | <--
| Alliance |
| Allouez |
| Alma |
| Aloha |
| Alondra Park |
| Alpena |
| Alpharetta |
| Alpine |
| Alsip |
| Altadena |
| Altamont | <--
| Altamonte Springs | <--
| Alton |
| Altoona |
| Altus |
| Alvin |
+-------------------+
40 rows in set (0.01 sec)

How to run alternative where condition if row is zero

Suppose I have a table:
table
+-------------+
| id | name |
+-------------+
| 1 | xabx |
| 2 | abxd |
| 3 | axcd |
| 4 | azyx |
| 5 | atyl |
| 6 | aksd |
| 7 | baabc|
| 8 | aabcd|
+-------------+
First I need to fetch the rows whose name starts with the given characters. For example,
if the search term is aab,
I run select * from table where name like 'aab%'
which returns
+-------------+
| 8 | aabcd |
+-------------+
which is exactly what I want.
But if the search term is abc,
the query above returns 0 rows,
so I have to search anywhere in the name instead:
select * from table where name like '%abc%'
which returns the alternative result:
+-------------+
| 7 | baabc|
| 8 | aabcd|
+-------------+
I don't know much about MySQL. Is there a query that runs an alternative WHERE condition only if the first WHERE condition returns no rows?
I have tried this, but it doesn't work as I want:
select * from table where name like 'abc%' or name like '%abc%'
thanks in advance
This comes close to your desired result:
select * from t
where (case
when name like 'abc' then 1
when name like 'bc%' then 1
when name like '%bc' then 1
when name like '%bc%' then 1
else null
end)
order by name
limit 1;
I just listed all the pattern combinations as conditions.
You can change their order or remove any condition you don't need.
limit 1 restricts the output to a single row, whichever condition it satisfied.
Here is the answer from your fiddle. Check it out
Hope it helps!
This is a possible solution:
Left joining the table on itself, where the table is initially filtered by the more inclusive %bx% and then the join is filtered by the more restrictive bx%.
This allows you to use the joined name if it exists, but revert to the original if not:
SELECT t1.id, IF(t2.name IS NULL, t1.name, t2.name) name
FROM test t1
LEFT JOIN test t2 ON t2.id = t1.id AND t1.name like 'bx%'
WHERE t1.name LIKE '%bx%'
This may or may not be ideal, depending on the size of your dataset.
COUNT checking may work
select *
from table
where name like 'abc%' or
((select count(*) from table where name like 'abc%') = 0 and name like '%abc%')
I guess that it would be a good idea to compute the count value into a variable first, however, the optimizer may recognize independent subquery anyway and run it once.
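A minimal sketch of the same fallback idea, written with NOT EXISTS instead of a COUNT comparison (a swapped-in but equivalent check). It uses Python's sqlite3 module as a stand-in for MySQL; the table name names and the function search_names are hypothetical, chosen because table is a reserved word.

```python
import sqlite3

def search_names(conn, term):
    # Prefix matches win; substring matches are returned only when
    # no row at all matches the prefix pattern.
    # Note: "%" or "_" inside term would act as LIKE wildcards here.
    sql = """
        SELECT id, name FROM names
        WHERE name LIKE :prefix
           OR (name LIKE :anywhere
               AND NOT EXISTS (SELECT 1 FROM names WHERE name LIKE :prefix))
        ORDER BY id
    """
    params = {"prefix": term + "%", "anywhere": "%" + term + "%"}
    return conn.execute(sql, params).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO names VALUES (?, ?)",
    [(1, "xabx"), (2, "abxd"), (3, "axcd"), (4, "azyx"),
     (5, "atyl"), (6, "aksd"), (7, "baabc"), (8, "aabcd")],
)

print(search_names(conn, "aab"))
print(search_names(conn, "abc"))
```

With the term aab only the prefix match is returned, even though baabc also contains aab; with abc no prefix match exists, so the substring matches are returned instead.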

Deleting almost duplicate rows in MySQL?

I have seen a few different answers to this question, but none does exactly what I need to do in MySQL.
I did find a thread for MS SQL that covers exactly what I need, but nothing for MySQL.
Data Example
+--------+----------+--------+
| Col1 | Col2 | UniqueID |
+--------+----------+--------+
| Peaches| Outdoor | 1 |
| Peaches| Outdoor | 2 |
| Apples | Indoor | 3 |
| Apples | Indoor | 4 |
+--------+----------+--------+
Desired Output
+--------+----------+--------+
| Col1 | Col2 | UniqueID |
+--------+----------+--------+
| Peaches| Outdoor | 1 |
| Apples | Indoor | 3 |
+--------+----------+--------+
Your way is OK. You only forgot the KEYWORD TABLE
CREATE TABLE NewTable AS SELECT Col1,Col2 ,MAX(col3) FROM t GROUP BY Col1,col2
However, the structure of the new table can differ from the original table.
To keep the same structure, do it this way:
CREATE TABLE NewTable LIKE t;
then add a unique key:
ALTER TABLE NewTable ADD UNIQUE KEY (Col1, Col2);
and now copy the old data into the new table with ON DUPLICATE KEY UPDATE:
INSERT INTO NewTable
SELECT *
from t
ON DUPLICATE KEY UPDATE Col3=GREATEST(Col3,VALUES(Col3));
This copies every row; when a (Col1, Col2) pair is already present, the existing row keeps the larger Col3 value.
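For comparison, here is a minimal sketch of the GROUP BY variant applied to the sample data from the question. It assumes the row with the lowest UniqueID should survive, as in the desired output, so it uses MIN rather than MAX. Python's sqlite3 module stands in for MySQL so that the example runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (Col1 TEXT, Col2 TEXT, UniqueID INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("Peaches", "Outdoor", 1), ("Peaches", "Outdoor", 2),
     ("Apples", "Indoor", 3), ("Apples", "Indoor", 4)],
)

# One row per (Col1, Col2) pair, keeping the smallest UniqueID of each group.
conn.execute("""
    CREATE TABLE NewTable AS
    SELECT Col1, Col2, MIN(UniqueID) AS UniqueID
    FROM t
    GROUP BY Col1, Col2
""")

deduplicated = conn.execute(
    "SELECT Col1, Col2, UniqueID FROM NewTable ORDER BY UniqueID"
).fetchall()
print(deduplicated)
```

As noted above, a table created this way does not inherit keys or indexes from the original, so those would need to be added separately.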
I'm going to post the answer based on the answer provided above so it's clear... it is just one simple query:
CREATE NewTable AS SELECT Col1,Col2 ,MAX(col3) FROM t GROUP BY Col1,col2
Just querying max was the trick...so simple.
Thank you!

Mysql select IN return null if id not exists

I have a table like this:
+----+---------+---------+
| Id | column1 | column2 |
+----+---------+---------+
| 1 | a | b |
| 2 | a | b |
+----+---------+---------+
and a query like this SELECT * FROM table WHERE id IN (1,2,3)
What query do I need to get a result like this (I need null values for nonexistent ids)?
+----+---------+---------+
| Id | column1 | column2 |
+----+---------+---------+
| 1 | a | b |
| 2 | a | b |
| 3 | null | null |
+----+---------+---------+
EDIT
Thanks for the responses so far.
Is there a more 'dynamic' way to do this? The query above is just an example.
In reality I need to check around 1000 ids!
You could use something like this:
SELECT ids.ID, your_table.column1, your_table.column2
FROM
(SELECT 1 as ID
UNION ALL SELECT 2
UNION ALL SELECT 3) ids left join your_table
on ids.ID = your_table.ID
First subquery returns each value you need in a different row. Then you can try to join each row with your_table. If you use a left join, all values from the first subquery are shown, and if there's a match with your_table, values from your_table are shown, otherwise you will get nulls.
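For a long list of ids, one variant of the same left-join idea is to load the requested ids into a temporary table instead of writing a long UNION ALL chain. The sketch below shows that variant; it is an alternative to the query above, not the same statement. It uses Python's sqlite3 module as a stand-in for MySQL, and the names wanted_ids and fetch_with_missing are hypothetical.

```python
import sqlite3

def fetch_with_missing(conn, ids):
    # Hold the requested ids in a temporary table so they can drive a LEFT JOIN.
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS wanted_ids (ID INTEGER PRIMARY KEY)")
    conn.execute("DELETE FROM wanted_ids")
    conn.executemany(
        "INSERT INTO wanted_ids VALUES (?)",
        [(i,) for i in sorted(set(ids))],  # de-duplicate to respect the primary key
    )
    # Every requested id appears once; unmatched ids get NULL columns.
    return conn.execute("""
        SELECT w.ID, t.column1, t.column2
        FROM wanted_ids w
        LEFT JOIN your_table t ON t.Id = w.ID
        ORDER BY w.ID
    """).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (Id INTEGER PRIMARY KEY, column1 TEXT, column2 TEXT)")
conn.executemany("INSERT INTO your_table VALUES (?, ?, ?)", [(1, "a", "b"), (2, "a", "b")])

result = fetch_with_missing(conn, [1, 2, 3])
print(result)
```

The NULLs from the left join arrive in Python as None, so the caller can tell requested-but-missing ids apart from existing rows.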
That is not the way SQL works, unfortunately. I would think it would be pretty trivial for your application to determine the differences between the ids it asked for and the ids returned.
So rather than using a hack or some weird query to mock up your result, why not have your application handle it?
I still can't understand, though, what the use case might be for querying rows in the database by ids that may or may not exist.