How do I obtain the rows that contain a specific minimum value?

I have a table with 3 rows and 3 columns. For all the rows with the same name, I want to retrieve the one that has the minimum value in the position column. In this example, the result should be (apple, red, 3) and (melon, big, null).
A null value in the 'position' column means that the fruit is not in the list.
name   category  position
apple  fruit     5
apple  red       3
melon  big       null

The null makes this tricky. I'm not sure if it should be considered "high" or "low". Let me assume "low":
select t.*
from t
where coalesce(t.position, -1) = (select min(coalesce(t2.position, -1))
                                  from t t2
                                  where t2.name = t.name
                                 );
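To try the "NULL counts as low" idea without a MySQL server, here is a minimal sketch that runs the same correlated subquery against an in-memory SQLite database from Python. SQLite is only a stand-in here (an assumption of this sketch), and the table name t and the sample rows are taken from the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, category TEXT, position INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("apple", "fruit", 5), ("apple", "red", 3), ("melon", "big", None)],
)

# NULL positions are mapped to -1 so that they compare as the lowest value.
rows = conn.execute(
    """
    SELECT t.*
    FROM t
    WHERE COALESCE(t.position, -1) = (
        SELECT MIN(COALESCE(t2.position, -1))
        FROM t t2
        WHERE t2.name = t.name
    )
    ORDER BY t.name
    """
).fetchall()

print(rows)
```

If NULL should count as "high" instead, the -1 would be replaced by a value larger than any real position.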

SELECT f.*
FROM (
    SELECT name, MIN(IFNULL(position, 0)) AS min_position
    FROM fruits
    GROUP BY name
) tmp
LEFT JOIN fruits f
    ON f.name = tmp.name
   AND IFNULL(f.position, 0) = tmp.min_position
-- GROUP BY name
-- optional if multiple (name, position) are possible, for example
-- [apple,fruit,5], [apple,red,5]

Related

SQL- preserve order from 'IN' clause and return null for non matching clauses

Let's say I have a table Person with columns id, name, and phone. I want to fetch all records matching a list of pairs of names and phone numbers while preserving the order from the 'IN' clause and returning null or any default value for the mismatching clause.
For instance, if the Person table has the following records:
id  name   phone
1   Name1  1234
2   Name2  2345
3   Name3  4532
I want the query to return the ids of people matching pairs of names and phone numbers.
When queried with ('Name2', 2345), ('NonExistingName', 34543), ('Name1', 1234), it should return the list [2, <null or a default value>, 1].
I am aware that I can use an IN clause to find the matching rows:
SELECT id
FROM Person
WHERE (name, phone) in (('Name2', 2345),
('NonExistingName', 34543),
('Name1', 1234));
However, this alone doesn't do what I want: the rows returned do not preserve the order, and I cannot supply a default value for non-existing ids.

Relational databases explicitly disclaim any responsibility to ever preserve order unless you specify an ORDER BY clause. Therefore you will need to include the order information as part of the data in a way where you can reference it in the ORDER BY clause.
For example:
WITH source AS (
SELECT 'Name2' Name, 2345 Phone, 0 Ordinal
UNION
SELECT 'NonExistingName', 34543, 1
UNION
SELECT 'Name1', 1234, 2
)
SELECT p.id
FROM source s
LEFT JOIN Person p ON s.Name = p.Name and s.Phone = p.Phone
ORDER BY s.Ordinal
Or:
SELECT p.id
FROM (VALUES
ROW ('Name2', 2345, 0),
ROW ('NonExistingName', 34543, 1),
ROW ('Name1', 1234, 2)
) s
LEFT JOIN Person p ON s.column_0 = p.Name and s.column_1 = p.Phone
ORDER BY s.column_2
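From application code, the ordinal column can be generated while building the query. The sketch below does that in Python against an in-memory SQLite database (a stand-in for MySQL, which is an assumption of this sketch). The helper name lookup_ids is hypothetical, and the Person data comes from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (id INTEGER PRIMARY KEY, name TEXT, phone INTEGER)")
conn.executemany(
    "INSERT INTO Person VALUES (?, ?, ?)",
    [(1, "Name1", 1234), (2, "Name2", 2345), (3, "Name3", 4532)],
)


def lookup_ids(conn, pairs):
    """Return one id (or None) per (name, phone) pair, in input order."""
    if not pairs:
        return []
    # One "SELECT ?, ?, ?" per pair; the third placeholder carries the ordinal.
    selects = " UNION ALL ".join(
        "SELECT ? AS name, ? AS phone, ? AS ordinal" for _ in pairs
    )
    params = []
    for ordinal, (name, phone) in enumerate(pairs):
        params.extend([name, phone, ordinal])
    sql = f"""
        WITH source AS ({selects})
        SELECT p.id
        FROM source s
        LEFT JOIN Person p ON s.name = p.name AND s.phone = p.phone
        ORDER BY s.ordinal
    """
    return [row[0] for row in conn.execute(sql, params)]


ids = lookup_ids(
    conn, [("Name2", 2345), ("NonExistingName", 34543), ("Name1", 1234)]
)
print(ids)
```

Only placeholders are interpolated into the SQL text, so the values themselves stay parameterised. If a pair matches more than one person, that pair yields more than one row.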

Count group by enum including possible enum values that have 0 count

I have a table of items. One of the fields is a category (represented by an enum). Some categories have zero items.
So I did this:
select category, count(*) as total from items group by category;
+------------+-------+
| category | total |
+------------+-------+
| one | 6675 |
+------------+-------+
I want to generate a table like this (where two is the other possible enum value):
+------------+-------+
| category | total |
+------------+-------+
| one | 6675 |
+------------+-------+
| two | 0 |
+------------+-------+
How do I do this with a MySQL query?

The Enum datatype is generally preferred when the possible options (values) are few (preferably <= 10) and you are not going to add new options in the future (at least not very frequently). So, a good use case for Enum is gender: (m, f, n). In your case, it would generally be better to have a master table of all possible categories instead of using an Enum for them. Then it is easy to do a LEFT JOIN from the master table.
However, as you asked for:
A solution that uses the enum type to generate the table, and includes 0 entries
Works for all MySQL/MariaDB versions:
We will need to get the list of all possible Enum values from INFORMATION_SCHEMA.COLUMNS:
SELECT
SUBSTRING(COLUMN_TYPE, 6, CHAR_LENGTH(COLUMN_TYPE) - 6) AS enum_values
FROM
information_schema.COLUMNS
WHERE
TABLE_NAME = 'items' -- your table name
AND
COLUMN_NAME = 'category' -- name of the column
AND
TABLE_SCHEMA = 'your_db' -- name of the database (schema)
This query gives you all the enum values in a single comma-separated string, like this:
'one','two','three','four'
Now we need to convert this string into multiple rows. To do that, we can use a sequence (number series) table. You can define a permanent table in your database storing integers from 1 to 100 (you may find this table helpful in many other cases as well). Another approach is to use a derived table; see https://stackoverflow.com/a/58052199/2469308 for the idea.
CREATE TABLE seq (n tinyint(3) UNSIGNED NOT NULL, PRIMARY KEY(n));
INSERT INTO seq (n) VALUES (1), (2), ...... , (99), (100);
Now we JOIN the "enum values string" to the seq table, based on the position of the separators, to extract the enum values into different rows. Note that instead of splitting on just , (comma), we split on ',' (quote, comma, quote) to avoid problems when a value itself contains a comma. String functions such as Substring_Index(), Trim() and Char_Length() are used to extract the enum values.
Schema
CREATE TABLE items
(id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
category ENUM('one','two','three','four'),
item_id INT UNSIGNED) ENGINE=InnoDB;
INSERT INTO items (category, item_id)
VALUES ('one', 1),
('two', 2),
('one', 2),
('one', 3);
CREATE TABLE seq (n tinyint(3) UNSIGNED NOT NULL,
PRIMARY KEY(n));
INSERT INTO seq (n) VALUES (1),(2),(3),(4),(5);
Query #1
SELECT Trim(BOTH '\'' FROM Substring_index(Substring_index(e.enum_values,
'\',\'',
seq.n),
'\',\'', -1)) AS cat
FROM (SELECT Substring(column_type, 6, Char_length(column_type) - 6) AS
enum_values
FROM information_schema.columns
WHERE table_name = 'items'
AND column_name = 'category'
AND table_schema = 'test') AS e
JOIN seq
ON ( Char_length(e.enum_values) - Char_length(REPLACE(e.enum_values,
'\',\'',
''))
) / 3 >= seq.n - 1
| cat |
| ----- |
| one |
| two |
| three |
| four |
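If the list of enum values is needed in application code rather than in SQL, the same string surgery can be done after fetching COLUMN_TYPE. This is only a sketch: parse_enum_values is a hypothetical helper, and it assumes the enum(...) text format shown above with no quote characters inside the values.

```python
def parse_enum_values(column_type):
    """Split a COLUMN_TYPE string such as "enum('one','two')" into its values."""
    # Drop the leading "enum(" (5 characters) and the trailing ")",
    # which is what SUBSTRING(COLUMN_TYPE, 6, CHAR_LENGTH(COLUMN_TYPE) - 6) does.
    inner = column_type[5:-1]
    if not inner:
        return []
    # Strip the outermost quotes, then split on the three-character separator ','
    # so that a plain comma inside a value does not split it.
    return inner[1:-1].split("','")


values = parse_enum_values("enum('one','two','three','four')")
print(values)
```

As in the SQL version, splitting on ',' rather than on a bare comma keeps a value such as 'a,b' in one piece.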
Now the hard part is done. All we need to do is LEFT JOIN from this subquery (which has all the category enum values) to your items table to get the count per category.
The final query follows:
SELECT all_cat.cat AS category,
Count(i.item_id) AS total
FROM (SELECT Trim(BOTH '\'' FROM Substring_index(
Substring_index(e.enum_values,
'\',\'',
seq.n),
'\',\'', -1)) AS cat
FROM (SELECT Substring(column_type, 6, Char_length(column_type) - 6)
AS
enum_values
FROM information_schema.columns
WHERE table_name = 'items'
AND column_name = 'category'
AND table_schema = 'test') AS e
JOIN seq
ON ( Char_length(e.enum_values) - Char_length(
REPLACE(e.enum_values,
'\',\'',
''))
) / 3 >= seq.n - 1) AS all_cat
LEFT JOIN items AS i
ON i.category = all_cat.cat
GROUP BY all_cat.cat
ORDER BY total DESC;
Result
| category | total |
| -------- | ----- |
| one | 3 |
| two | 1 |
| three | 0 |
| four | 0 |

Here is some fun with MySQL 8.0 and JSON_TABLE():
select c.category, count(i.category) as total
from information_schema.COLUMNS s
join json_table(
replace(replace(replace(trim('enum' from s.COLUMN_TYPE),'(','['),')',']'),'''','"'),
'$[*]' columns (category varchar(50) path '$')
) c
left join items i on i.category = c.category
where s.TABLE_SCHEMA = 'test' -- replace with your db/schema name
and s.TABLE_NAME = 'items'
and s.COLUMN_NAME = 'category'
group by c.category
It converts the ENUM type definition from information_schema to a JSON array, which is then converted by JSON_TABLE() to a table, which you then can use for a LEFT JOIN.
Note: The categories should not contain any characters from ()[]'".
But seriously – Just create the categories table. There are more reasons to do that. For example you might want to render a drop-down menu with all possible categories. That would be simple with
select category from categories

I would say that it's basically bad practice to encode your enumerations into the script. Therefore, create a table with the enumerations present (and their relative keys), then it's a simple case of grouping a left joined query...
SELECT
cat.enum_name,
COUNT(data.id) AS total
FROM
category_table cat
LEFT JOIN
data_table data
ON cat.cate_id = data.cat_id
GROUP BY
cat.enum_name

Using a correlated subquery in the SELECT list:
select cat.categoryname,
(
select count(*) -- count total
from items as i
where i.category = cat.category -- connect
) as totalcount
from cat
order by cat.categoryname

You can build a derived table of the different categories and LEFT JOIN your original table to it, as shown below. Count a column from the joined table (not *) and group by the derived table's column, so that categories without items get 0.
SELECT A.Category, COUNT(B.Category) AS total FROM
(SELECT 'one' AS Category
UNION ALL
SELECT 'two' AS Category) A
LEFT JOIN items B
ON A.Category=B.Category
GROUP BY A.Category;
If you would prefer to get the list of categories dynamically, save them in another table (say All_category_table) and join to it as shown below:
SELECT A.Category, COUNT(B.Category) AS total FROM
(SELECT Category FROM All_category_table) A
LEFT JOIN items B
ON A.Category=B.Category
GROUP BY A.Category;
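One detail that is easy to miss with this pattern: after a LEFT JOIN, COUNT(*) counts the NULL-extended row of an unmatched category as 1, while COUNT(column) on the joined table gives 0. The sketch below shows both side by side in an in-memory SQLite database (a stand-in for MySQL, assumed for this sketch); the column aliases star_total and col_total are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, category TEXT)")
# Three items in category 'one', none in category 'two'.
conn.executemany("INSERT INTO items (category) VALUES (?)", [("one",)] * 3)

totals = conn.execute(
    """
    SELECT A.category,
           COUNT(*)          AS star_total,  -- counts the NULL-extended row too
           COUNT(B.category) AS col_total    -- counts only matched rows
    FROM (SELECT 'one' AS category
          UNION ALL
          SELECT 'two' AS category) A
    LEFT JOIN items B ON A.category = B.category
    GROUP BY A.category
    ORDER BY A.category
    """
).fetchall()

print(totals)
```

Grouping by the derived table's column (A.category) also matters: grouping by B.category would merge every unmatched category into a single NULL group.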

This answer is applicable for when you do not have another table holding the possible category values.
Let's say you have a table called real_table with a not null & value constrained column category. In this column you know you can theoretically encounter 5 different values: 'CATEGORY_0', 'CATEGORY_1', 'CATEGORY_2', 'CATEGORY_3', 'CATEGORY_4':
CREATE TABLE real_table
(
id VARCHAR(255) NOT NULL
PRIMARY KEY,
category VARCHAR(255) NOT NULL
CONSTRAINT category_in CHECK (
category in ('CATEGORY_0',
'CATEGORY_1',
'CATEGORY_2',
'CATEGORY_3',
'CATEGORY_4')
)
);
But your actual data set in the table does not include any row with value 'CATEGORY_0'. So when you run a query such as:
SELECT real_table.category AS category, COUNT(*) AS cnt
FROM real_table
GROUP BY real_table.category;
you will see that you get a result like this:
category    cnt
CATEGORY_1  150
CATEGORY_2  20
CATEGORY_3  12
CATEGORY_4  1
Hmm, the 'CATEGORY_0' is omitted. Not good.
Since your categories are not backed by another table, you must create an artificial dataset of the possible categories, as below:
SELECT 'CATEGORY_0' AS category_entry
UNION ALL
SELECT 'CATEGORY_1' AS category_entry
UNION ALL
SELECT 'CATEGORY_2' AS category_entry
UNION ALL
SELECT 'CATEGORY_3' AS category_entry
UNION ALL
SELECT 'CATEGORY_4' AS category_entry;
You can use this in your original query as a table to do a right join on:
SELECT all_categories.category_entry AS category,
COUNT(real_table.id) AS cnt -- important to count some non-null value, such as PK of the real_table
FROM real_table
RIGHT JOIN
(SELECT 'CATEGORY_0' AS category_entry -- not present in any row in table 'real_table'
UNION ALL
SELECT 'CATEGORY_1' AS category_entry
UNION ALL
SELECT 'CATEGORY_2' AS category_entry
UNION ALL
SELECT 'CATEGORY_3' AS category_entry
UNION ALL
SELECT 'CATEGORY_4' AS category_entry) all_categories
ON real_table.category = all_categories.category_entry
GROUP BY all_categories.category_entry;
Now when you run the query, you should get the desired output:
category    cnt
CATEGORY_0  0
CATEGORY_1  150
CATEGORY_2  20
CATEGORY_3  12
CATEGORY_4  1
The 'CATEGORY_0' is now included with zero cnt. Nice.
Now let's say that the category column is nullable (no NOT NULL constraint) and can also include other, unexpected category values (e.g. 'CATEGORY_66'):
CREATE TABLE real_table
(
id VARCHAR(255) NOT NULL
PRIMARY KEY,
category VARCHAR(255) -- nullable and no constraint for valid values
);
We would like to include these null and unexpected category counts in the result set as well.
Then we must prepare the artificial dataset of the possible categories differently:
SELECT DISTINCT all_categories.category_entry
FROM (SELECT 'CATEGORY_0' AS category_entry -- not present in any row in table 'real_table'
UNION ALL
SELECT 'CATEGORY_1' AS category_entry
UNION ALL
SELECT 'CATEGORY_2' AS category_entry
UNION ALL
SELECT 'CATEGORY_3' AS category_entry
UNION ALL
SELECT 'CATEGORY_4' AS category_entry
UNION ALL
SELECT DISTINCT category AS category_entry
FROM real_table) all_categories;
and use it as before:
SELECT distinct_categories.category_entry AS category,
COUNT(real_table.id) AS cnt -- important to count some non-null value, such as PK of the real_table
FROM real_table
RIGHT JOIN
(SELECT DISTINCT all_categories.category_entry
FROM (SELECT 'CATEGORY_0' AS category_entry -- not present in any row in table 'real_table'
UNION ALL
SELECT 'CATEGORY_1' AS category_entry
UNION ALL
SELECT 'CATEGORY_2' AS category_entry
UNION ALL
SELECT 'CATEGORY_3' AS category_entry
UNION ALL
SELECT 'CATEGORY_4' AS category_entry
UNION ALL
SELECT DISTINCT category AS category_entry
FROM real_table) all_categories) distinct_categories
ON real_table.category <=> distinct_categories.category_entry -- null-safe comparison, so NULL categories match too
GROUP BY distinct_categories.category_entry;
Now when you run the query, the output should also include counts for additional categories and null categories
category     cnt
CATEGORY_0   0
CATEGORY_1   150
CATEGORY_2   20
CATEGORY_3   12
CATEGORY_4   1
CATEGORY_66  13
(NULL)       10
Both unexpected 'CATEGORY_66' (with 13 entries) as well as null category (with 10 entries) are now included in the result set
I cannot vouch for the performance of the provided queries - somebody more experienced might weigh in on that?
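A caveat worth checking for the NULL row: a plain = never evaluates to true when both sides are NULL, so whether the NULL category gets its real count depends on using a null-safe comparison in the join (<=> in MySQL). The sketch below shows the difference in an in-memory SQLite database, where the null-safe operator is spelled IS. SQLite is only a stand-in, the helper count_null_category is hypothetical, and the three sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE real_table (id TEXT PRIMARY KEY, category TEXT)")
conn.executemany(
    "INSERT INTO real_table VALUES (?, ?)",
    [("1", "CATEGORY_1"), ("2", None), ("3", None)],
)


def count_null_category(comparison):
    """Count rows joined to a NULL category entry using the given operator."""
    sql = f"""
        SELECT COUNT(r.id)
        FROM (SELECT NULL AS category_entry) c
        LEFT JOIN real_table r ON r.category {comparison} c.category_entry
    """
    return conn.execute(sql).fetchone()[0]


plain = count_null_category("=")      # NULL = NULL is not true, nothing joins
null_safe = count_null_category("IS")  # SQLite's null-safe equality

print(plain, null_safe)
```

With the plain = the NULL category is reported with a count of 0; with the null-safe operator the two NULL rows are counted.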

MySQL query returning no rows

I have two MySQL tables. What I am trying to do is to export the information where Value 1 is 1 less than Value 2 AND where ID_1 does not have its Value 1 and Value 2 equal.
Note:
Fields Value 1 and 2 are just integers.
Each distinct ID_A has the same Value_2
If there are two Value_1s that are one less than Value_2, look to Value_3 and select one that is higher
The reason why I have two tables here is because I am going to output information from both tables
We can write a script for this, but I need to do this in a single command for bonus points (which my instructor declared is possible)... I haven't even started a script for this, as I don't really know how to do that...
tableA looks like this:
ID_1 ID_2
A A
A B
B A
B B
C A
C B
C C
tableB looks like this:
ID_1 ID_2 Value_1 Value_2 Value_3
A A 2 3 NULL
A B 3 3 NULL
B A 4 5 NULL
B B 7 5 NULL
C A 7 8 98
C B 3 8 NULL
C C 7 8 56
The query should return this:
ID_1 ID_2
B A
C A
Here is what I have so far. It keeps returning no rows, which confuses me. I believe the AND clause after the first WHERE condition is what I need to fix:
SELECT CONCAT(...)
INTO OUTFILE '/tmp/outfile.tab'
FIELDS TERMINATED BY '\t'
ESCAPED BY ''
FROM tableA
INNER
JOIN tableB
ON tableA.ID_1 = tableB.ID_1
AND tableA.ID_2 = tableB.ID_2
WHERE tableB.Value_1 - 1 = tableB.Value_2
AND tableA.ID_1 !=
( SELECT DISTINCT
ID_1
FROM tableB
WHERE ID_1 = tableA.ID_1
AND Value_1 = Value_2
)
;
One final note: we issue all commands through PuTTY, from which we can access MySQL.

To be honest, I still don't understand exactly what you're trying to do, but I can explain why your query is returning no rows.
Look at this clause:
AND tableA.ID_1 !=
( SELECT DISTINCT
ID_1
FROM tableB
WHERE ID_1 = tableA.ID_1
AND Value_1 = Value_2
)
The subquery will necessarily always return either tableA.ID_1 or NULL. (Do you see why?) So the comparison is never "true"; it's always either "false" (because tableA.ID_1 != tableA.ID_1 is necessarily "false") or "null/indeterminate" (because tableA.ID_1 != NULL is "null/indeterminate"). Therefore, this clause filters out all results from your query — regardless of what the rest of your query might say.
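The three-valued logic behind this can be checked directly. The sketch below uses an in-memory SQLite database (a stand-in for MySQL, assumed for this sketch) with the tableB rows from the question: it first evaluates the two possible comparisons, then counts how many rows survive the clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The clause can only ever be one of these two comparisons.
not_equal_self, not_equal_null = conn.execute(
    "SELECT 'A' != 'A', 'A' != NULL"
).fetchone()

conn.execute(
    "CREATE TABLE tableB (ID_1 TEXT, ID_2 TEXT, "
    "Value_1 INTEGER, Value_2 INTEGER, Value_3 INTEGER)"
)
conn.executemany(
    "INSERT INTO tableB VALUES (?, ?, ?, ?, ?)",
    [
        ("A", "A", 2, 3, None), ("A", "B", 3, 3, None),
        ("B", "A", 4, 5, None), ("B", "B", 7, 5, None),
        ("C", "A", 7, 8, 98), ("C", "B", 3, 8, None), ("C", "C", 7, 8, 56),
    ],
)

# Same shape as the clause in the question, applied on its own.
surviving = conn.execute(
    """
    SELECT COUNT(*)
    FROM tableB b
    WHERE b.ID_1 != (SELECT DISTINCT ID_1
                     FROM tableB
                     WHERE ID_1 = b.ID_1
                       AND Value_1 = Value_2)
    """
).fetchone()[0]

print(not_equal_self, not_equal_null, surviving)
```

A WHERE clause keeps a row only when the condition is true, so both the false case and the NULL case discard it.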

I am not 100% sure of the question, but if I get it right, the first row of tableB (Line 25 in http://imgur.com/a/r3Qy5#1) should NOT be selected, because ID_1=A has Value_1=3 in the second row (Line 26 in http://imgur.com/a/r3Qy5#1), which is the same as Value_2 of the first row (Line 25 in http://imgur.com/a/r3Qy5#1).
So you could start with something like
SELECT .... FROM
tableA NATURAL JOIN tableB
WHERE Value_1=Value_2-1
AND Value_2 NOT IN (SELECT tb.Value_1 from tableB AS tb WHERE tb.ID_1=tableB.ID_1)
which fulfills requirements #1 and #2. For requirement #3 (if there are two rows for an ID_1, choose the one with the highest Value_3), we need to sort on Value_3 and wrap the query in an outer query for grouping:
SELECT .... FROM (
SELECT * FROM
tableA NATURAL JOIN tableB
WHERE Value_1=Value_2-1
AND Value_2 NOT IN (SELECT tb.Value_1 from tableB AS tb WHERE tb.ID_1=tableB.ID_1)
ORDER BY Value_3 DESC
) AS innerview
GROUP BY Value_1,Value_2
which gives the correct answer for the test data in your example.

You'll FIRST have to apply the "Value_3" test, grouped by ID_1, Value_1 and Value_2. Because the WHERE clause is applied in this inner query, its result holds only the qualifying rows, together with the highest Value_3 in each group. That result is then joined to TableB AGAIN (as "tb2") to pick up the matching entries. Since COALESCE() changes any NULL Value_3 to 0 in the first result set, the JOIN condition must apply the same COALESCE(). The "A" and "B" groups have no Value_3 at all, so they match on 0; the "C" group does have valid values, and its maximum is 98. Re-joining to "tb2" on that value returns the proper ID_2 of "A" for that group.
select
MaxQualified.ID_1,
tb2.ID_2,
MaxQualified.Value_1,
MaxQualified.Value_2,
tb2.Value_3
from
( select
tb.ID_1,
tb.Value_1,
tb.Value_2,
MAX( COALESCE( tb.Value_3, 0 ) ) as HighestVal3
from
TableB tb
where
tb.Value_1 +1 = tb.Value_2
group by
tb.ID_1,
tb.Value_1,
tb.Value_2 ) MaxQualified
JOIN TableB tb2
on MaxQualified.ID_1 = tb2.ID_1
AND MaxQualified.Value_1 = tb2.Value_1
AND MaxQualified.Value_2 = tb2.Value_2
AND MaxQualified.HighestVal3 = COALESCE( tb2.Value_3, 0 )
That being said (and since this is homework): this COULD fail or give multiple answers if several rows share the same ID_1, Value_1, Value_2 and Value_3. It would return every ID_2 that matches those common criteria. You would need one more level of nesting to remove the duplicates.
Your answer should ALSO return "A", "A", 2, 3

Is it possible to add conditions to a MAX() call in an aggregated query?

Background
My typical use case:
# Table
id category dataUID
---------------------------
0 A (NULL)
1 B (NULL)
2 C text1
3 C text1
4 D text2
5 D text3
# Query
SELECT MAX(`id`) AS `id` FROM `table`
GROUP BY `category`
This is fine; it will strip out any "duplicate categories" in the recordset that's being worked on, giving me the "highest" ID for each category.
I can then go on use this ID to pull out all the data again:
# Query
SELECT * FROM `table` JOIN (
SELECT MAX(`id`) AS `id` FROM `table`
GROUP BY `category`
) _ USING(`id`)
# Result
id category dataUID
---------------------------
0 A (NULL)
1 B (NULL)
3 C text1
5 D text3
Note that this is not the same as:
SELECT MAX(`id`) AS `id`, `category`, `dataUID` FROM `table`
GROUP BY `category`
Per the documentation:
In standard SQL, a query that includes a GROUP BY clause cannot refer
to nonaggregated columns in the select list that are not named in the
GROUP BY clause. For example, this query is illegal in standard SQL
because the name column in the select list does not appear in the
GROUP BY:
SELECT o.custid, c.name, MAX(o.payment) FROM orders AS o, customers
AS c WHERE o.custid = c.custid GROUP BY o.custid;
For the query to be legal, the name column must be omitted from the
select list or named in the GROUP BY clause.
MySQL extends the use of GROUP BY so that the select list can refer to
nonaggregated columns not named in the GROUP BY clause. This means
that the preceding query is legal in MySQL. You can use this feature
to get better performance by avoiding unnecessary column sorting and
grouping. However, this is useful primarily when all values in each
nonaggregated column not named in the GROUP BY are the same for each
group.
[..]
This extension assumes that the nongrouped columns will have the same group-wise values. Otherwise, the result is indeterminate.
So I'd get an unspecified value for dataUID — as an example, either text2 or text3 for result with id 5.
This is actually a problem for other fields in my real case; as it happens, for the dataUID column specifically, generally I don't really care which value I get.
Problem
However!
If any of the rows for a given category has a NULL dataUID, and at least one other row has a non-NULL dataUID, I'd like MAX to ignore the NULL ones.
So:
id category dataUID
---------------------------
4 D text2
5 D (NULL)
At present, since I pick out the row with the maximum ID, I get:
5 D (NULL)
But, because the dataUID is NULL, instead I want:
4 D text2
How can I get this? How can I add conditional logic to the use of aggregate MAX?
I thought of maybe handing MAX a tuple and pulling the id out from it afterwards:
GET_SECOND_PART_SOMEHOW(MAX((IF(`dataUID` NOT NULL, 1, 0), `id`))) AS `id`
But I don't think MAX will accept arbitrary expressions like that, let alone tuples, and I don't know how I'd retrieve the second part of the tuple after-the-fact.

A slight tweak to @ypercube's answer. To get the ids you can use
SELECT COALESCE(MAX(CASE
WHEN dataUID IS NOT NULL THEN id
END), MAX(id)) AS id
FROM `table`
GROUP BY category
and then plug that into a join.
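Here is a minimal sketch of the full round trip, COALESCE over a conditional MAX plus the join back to the table, run from Python against an in-memory SQLite database (a stand-in for MySQL, assumed for this sketch). The table name records is hypothetical, chosen to avoid quoting the reserved word table; the rows follow the question, with id 5 given a NULL dataUID as in the problem statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (id INTEGER PRIMARY KEY, category TEXT, dataUID TEXT)"
)
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [
        (0, "A", None), (1, "B", None),
        (2, "C", "text1"), (3, "C", "text1"),
        (4, "D", "text2"), (5, "D", None),
    ],
)

picked = conn.execute(
    """
    SELECT r.id, r.category, r.dataUID
    FROM records r
    JOIN (
        -- Highest id with a non-NULL dataUID, else the highest id overall.
        SELECT COALESCE(MAX(CASE WHEN dataUID IS NOT NULL THEN id END),
                        MAX(id)) AS id
        FROM records
        GROUP BY category
    ) best ON best.id = r.id
    ORDER BY r.id
    """
).fetchall()

print(picked)
```

The CASE without an ELSE yields NULL for rows whose dataUID is NULL, and MAX ignores NULLs, which is what makes the fallback to MAX(id) work.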

This was easier than I thought, in the end, because it turns out MySQL will accept an arbitrary expression inside MAX.
I can get the ordering I want by injecting a leading character into id to serve as an ordering hint:
SUBSTRING(MAX(IF (`dataUID` IS NULL, CONCAT('a',`id`), CONCAT('b',`id`))) FROM 2)
Walk-through:
id category dataUID IF (`dataUID` IS NULL, CONCAT('a',`id`), CONCAT('b',`id`))
--------------------------------------------------------------------------------------
0 A (NULL) a0
1 B (NULL) a1
2 C text1 b2
3 C text1 b3
4 D text2 b4
5 D (NULL) a5
So:
SELECT
`category`, MAX(IF (`dataUID` IS NULL, CONCAT('a',`id`), CONCAT('b',`id`))) AS `max_id_with_hint`
FROM `table`
GROUP BY `category`
category max_id_with_hint
------------------------------
A a0
B a1
C b3
D b4
It's then a simple matter to chop the ordering hint off again.
Thanks in particular to @JlStone for setting me, via COALESCE, on the path to embedding expressions inside the call to MAX and directly manipulating the values supplied to MAX.
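One thing to watch with the hint-prefix trick: the values handed to MAX are strings, so they compare character by character and 'b9' sorts above 'b10'. Zero-padding the id to a fixed width avoids that. The sketch below demonstrates both variants in an in-memory SQLite database (a stand-in for MySQL, assumed for this sketch); in MySQL the padding could be done with LPAD(id, 10, '0'), while here SQLite's printf is used. The table name records, the helper best_id and the sample ids are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (id INTEGER PRIMARY KEY, category TEXT, dataUID TEXT)"
)
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [(9, "D", "text2"), (10, "D", "text3"), (11, "D", None)],
)


def best_id(id_expr):
    """Pick the id via the prefix trick, using id_expr as the text form of id."""
    sql = f"""
        SELECT CAST(SUBSTR(MAX(CASE WHEN dataUID IS NULL
                                    THEN 'a' || {id_expr}
                                    ELSE 'b' || {id_expr} END), 2) AS INTEGER)
        FROM records
        GROUP BY category
    """
    return conn.execute(sql).fetchone()[0]


unpadded = best_id("id")                   # compares 'b9' with 'b10' as text
padded = best_id("printf('%010d', id)")    # compares fixed-width digit strings

print(unpadded, padded)
```

With single-digit ids, as in the walk-through, the unpadded form happens to work; the difference only shows once ids have different lengths.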

From what I can remember, you can use COALESCE inside aggregate expressions. For example:
SELECT MAX(COALESCE(`id`,1)) ...
Hm, it seems I read too quickly the first time. I think maybe you want something like this?
SELECT * FROM `table` JOIN (
SELECT MAX(`id`) AS `id` FROM `table`
WHERE `dataUID` IS NOT NULL
GROUP BY `category`
) _ USING(`id`)
or perhaps
SELECT MAX(`id`) AS `id`,
COALESCE (`dataUID`, 0) as `dataUID`
FROM `table`
GROUP BY `category`

select *
from t1
join (
select max(id) as id,
max(if(dataUID is NULL, NULL, id)) as fallbackid,
category
from t1 group by category) as ids
on if(ids.id = ids.fallbackid or ids.fallbackid is null, ids.id, ids.fallbackid) = t1.id;

SELECT t.*
FROM `table` AS t
JOIN
( SELECT DISTINCT category
FROM `table`
) AS tdc
ON t.id =
COALESCE(
( SELECT MAX(id) AS id
FROM `table`
WHERE category = tdc.category
AND dataUID IS NOT NULL
)
, ( SELECT MAX(id) AS id
FROM `table`
WHERE category = tdc.category
AND dataUID IS NULL
)
)

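You need a window function with an OVER clause (available from MySQL 8.0):
SELECT id, category, dataUID
FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY category ORDER BY (dataUID IS NULL), id DESC) rn,
id, category, dataUID FROM `table`
) q
WHERE rn=1
Ordering by (dataUID IS NULL) first puts the rows with a non-NULL dataUID ahead of the NULL ones; within each of those, the highest id comes first.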
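For the window-function route the choice of sort keys decides which row gets rn = 1. The sketch below sorts on whether dataUID is NULL before sorting on id, so a non-NULL row wins even when a NULL row has a higher id. It runs against an in-memory SQLite database (a stand-in for MySQL 8.0, assumed for this sketch, and it needs an SQLite build with window functions, 3.25 or later). The table name records is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (id INTEGER PRIMARY KEY, category TEXT, dataUID TEXT)"
)
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [
        (0, "A", None), (1, "B", None),
        (2, "C", "text1"), (3, "C", "text1"),
        (4, "D", "text2"), (5, "D", None),
    ],
)

ranked = conn.execute(
    """
    SELECT id, category, dataUID
    FROM (
        SELECT ROW_NUMBER() OVER (
                   PARTITION BY category
                   -- 0 for non-NULL dataUID, 1 for NULL: non-NULL rows rank first
                   ORDER BY (dataUID IS NULL), id DESC
               ) AS rn,
               id, category, dataUID
        FROM records
    ) q
    WHERE rn = 1
    ORDER BY id
    """
).fetchall()

print(ranked)
```

Sorting on id DESC alone as the first key would pick id 5 for category D, because the id ordering is decided before dataUID is ever looked at.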

GROUP BY - do not group NULL

I'm trying to figure out a way to return results using GROUP BY.
GROUP BY is working as expected, but my question is: is it possible to have GROUP BY ignore NULL values, so that it does not group NULLs together? I still need all the rows where the specified field is NULL.
SELECT `table1`.*,
GROUP_CONCAT(id SEPARATOR ',') AS `children_ids`
FROM `table1`
WHERE (enabled = 1)
GROUP BY `ancestor`
So if I have 5 rows where the ancestor field is NULL, it returns 1 row, but I want all 5.

Perhaps you should add something to the null columns to make them unique and group on that? I was looking for some sort of sequence to use instead of UUID() but this might work just as well.
SELECT `table1`.*,
IFNULL(ancestor,UUID()) as unq_ancestor,
GROUP_CONCAT(id SEPARATOR ',') AS `children_ids`
FROM `table1`
WHERE (enabled = 1)
GROUP BY unq_ancestor

When grouping by column Y, all rows for which the value in Y is NULL are grouped together.
This behaviour is defined by the SQL-2003 standard, though it's slightly surprising because NULL is not equal to NULL.
You can work around it by grouping on a different value, some function (mathematically speaking) of the data in your grouping column.
If you have a unique column X then this is easy.
Input
X Y
-------------
1 a
2 a
3 b
4 b
5 c
6 (NULL)
7 (NULL)
8 d
Without fix
SELECT GROUP_CONCAT(`X`)
FROM `tbl`
GROUP BY `Y`;
Result:
GROUP_CONCAT(`X`)
-------------------
6,7
1,2
3,4
5
8
With fix
SELECT GROUP_CONCAT(`X`)
FROM `tbl`
GROUP BY IFNULL(`Y`, `X`);
Result:
GROUP_CONCAT(`X`)
-------------------
6
7
1,2
3,4
5
8
Let's take a closer look at how this is working
SELECT GROUP_CONCAT(`X`), IFNULL(`Y`, `X`) AS `grp`
FROM `tbl`
GROUP BY `grp`;
Result:
GROUP_CONCAT(`X`) `grp`
-----------------------------
6 6
7 7
1,2 a
3,4 b
5 c
8 d
If you don't have a unique column that you can use, you can try to generate a unique placeholder value instead. I'll leave this as an exercise to the reader.
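The effect of grouping on IFNULL(Y, X) can be reproduced with the sample data above. This is a minimal sketch against an in-memory SQLite database (a stand-in for MySQL, assumed for this sketch); the helper grouped is hypothetical, and it normalises each group to a sorted tuple because the order of values inside GROUP_CONCAT is not guaranteed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (X INTEGER PRIMARY KEY, Y TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [(1, "a"), (2, "a"), (3, "b"), (4, "b"),
     (5, "c"), (6, None), (7, None), (8, "d")],
)


def grouped(group_expr):
    """Return the groups of X values produced by GROUP BY group_expr."""
    sql = f"SELECT GROUP_CONCAT(X) FROM tbl GROUP BY {group_expr}"
    groups = [
        tuple(sorted(int(x) for x in row[0].split(",")))
        for row in conn.execute(sql)
    ]
    return sorted(groups)


without_fix = grouped("Y")            # the two NULL rows collapse into one group
with_fix = grouped("IFNULL(Y, X)")    # each NULL row falls back to its unique X

print(without_fix)
print(with_fix)
```

The trick relies on the fallback values not colliding with real Y values; if a Y could equal some X, a different placeholder would be needed.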

GROUP BY IFNULL(required_field, id)

SELECT table1.*,
GROUP_CONCAT(id SEPARATOR ',') AS children_ids
FROM table1
WHERE (enabled = 1)
GROUP BY ancestor
, CASE WHEN ancestor IS NULL
THEN table1.id
ELSE 0
END

A possibly faster version of the previous solution, in case you have a unique identifier in table1 (let's suppose it is table1.id):
SELECT `table1`.*,
GROUP_CONCAT(id SEPARATOR ',') AS `children_ids`,
IF(ISNULL(ancestor),table1.id,NULL) as `do_not_group_on_null_ancestor`
FROM `table1`
WHERE (enabled = 1)
GROUP BY `ancestor`, `do_not_group_on_null_ancestor`

To union multiple tables, GROUP_CONCAT one column and SUM another per key (a unique primary or foreign key), and display the values in the same row:
select column1,column2,column3,GROUP_CONCAT(if(column4='', null, column4)) as
column4,sum(column5) as column5
from (
select column1,group_concat(column2) as column2,sum(column3 ) as column3,'' as
column4,'' as column5
from table1
group by column1
union all
select column1,'' as column2,'' as column3,group_concat(column4) as
column4,sum(column5) as column5
from table2
group by column1
) as t
group by column1
