I have a database with 1 table with the following rows:
id name date
-----------------------
1 Mike 2012-04-21
2 Mike 2012-04-25
3 Jack 2012-03-21
4 Jack 2012-02-12
I want to extract only distinct values, so that I will only get Mike and Jack once.
I have this code for a search script:
SELECT DISTINCT name FROM table WHERE name LIKE '%$string%' ORDER BY id DESC
But it doesn't work. It outputs Mike, Mike, Jack, Jack.
Why?
Because of the ORDER BY id DESC clause, the query is treated rather as if it was written:
SELECT DISTINCT name, id
FROM table
ORDER BY id DESC;
except that the id columns are not returned to the user (you). The result set has to include the id to be able to order by it. Obviously, this result set has four rows, so that's what is returned. (Moral: don't order by hidden columns — unless you know what it is going to do to your query.)
Try:
SELECT DISTINCT name
FROM table
ORDER BY name;
(with or without DESC according to whim). That will return just the two rows.
If you need to know an id for each name, consider:
SELECT name, MIN(id)
FROM table
GROUP BY name
ORDER BY MIN(id) DESC;
You could use MAX to equally good effect.
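The three queries above can be compared side by side. This is a minimal sketch, assuming SQLite (through Python's built-in sqlite3 module) as a stand-in for MySQL; the table name people is hypothetical, since the question only calls it table.

```python
import sqlite3

# In-memory database holding the four rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO people (id, name, date) VALUES (?, ?, ?)",
    [
        (1, "Mike", "2012-04-21"),
        (2, "Mike", "2012-04-25"),
        (3, "Jack", "2012-03-21"),
        (4, "Jack", "2012-02-12"),
    ],
)

# DISTINCT over (name, id): every pair is unique, so all four rows survive.
pairs = conn.execute(
    "SELECT DISTINCT name, id FROM people ORDER BY id DESC"
).fetchall()

# DISTINCT over name only, ordered by a column that is in the select list.
names = [row[0] for row in conn.execute(
    "SELECT DISTINCT name FROM people ORDER BY name"
)]

# One row per name, each carrying a well-defined id (the smallest one).
first_ids = conn.execute(
    "SELECT name, MIN(id) FROM people GROUP BY name ORDER BY MIN(id) DESC"
).fetchall()

print(len(pairs), names, first_ids)
```

The first query shows why ordering by id defeats DISTINCT: once id takes part in the comparison, no two rows are duplicates.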
All of this applies to all SQL databases, including MySQL. MySQL has some rules which allow you to omit columns from the GROUP BY clause, with somewhat non-deterministic results. I recommend against exploiting the feature.
For a long time (maybe even now) the SQL standard did not allow you to order by columns that were not in the select-list, precisely to avoid confusions such as this. When the result set does not include the ordering data, the ordering of the result set is called 'essential ordering'; if the ordering columns all appear in the result set, it is 'inessential ordering' because you have enough data to order the data yourself.
I have a table emp with following structure and data:
name dept salary
----- ----- -----
Jack a 2
Jill a 1
Tom b 2
Fred b 1
When I execute the following SQL:
SELECT * FROM emp GROUP BY dept
I get the following result:
name dept salary
----- ----- -----
Jill a 1
Fred b 1
On what basis did the server decide to return Jill and Fred and exclude Jack and Tom?
I am running this query in MySQL.
Note 1: I know the query doesn't make sense on its own. I am trying to debug a problem with a 'GROUP BY' scenario. I am trying to understand the default behavior for this purpose.
Note 2: I am used to writing the SELECT clause same as the GROUP BY clause (minus the aggregate fields). When I came across the behavior described above, I started wondering if I can rely on this for scenarios such as:
select the rows from emp table where the salary is the lowest/highest in the dept.
E.g.: SQL statements like this work on MySQL:
SELECT A.*, MIN(A.salary) AS min_salary FROM emp AS A GROUP BY A.dept
I didn't find any material describing why such SQL works, more importantly if I can rely on such behavior consistently. If this is a reliable behavior then I can avoid queries like:
SELECT A.* FROM emp AS A WHERE A.salary = (
SELECT MAX(B.salary) FROM emp B WHERE B.dept = A.dept)
Read MySQL documentation on this particular point.
In a nutshell, MySQL allows omitting some columns from the GROUP BY for performance purposes. However, this works only if the omitted columns all have the same value within each group; otherwise, the values returned by the query are indeed indeterminate, as properly guessed by others in this post. To be clear, adding an ORDER BY clause would not re-introduce any form of deterministic behavior.
Although not at the core of the issue, this example shows how using * rather than an explicit enumeration of desired columns is often a bad idea.
Excerpt from MySQL 5.0 documentation:
When using this feature, all rows in each group should have the same values
for the columns that are omitted from the GROUP BY part. The server is free
to return any value from the group, so the results are indeterminate unless
all values are the same.
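Because the omitted columns are indeterminate, a query that must return the lowest-paid row per department has to state that requirement explicitly. A minimal sketch of the correlated-subquery form from the question (with MIN instead of MAX), assuming SQLite through Python's sqlite3 module as a stand-in for MySQL:

```python
import sqlite3

# In-memory copy of the emp table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp (name, dept, salary) VALUES (?, ?, ?)",
    [("Jack", "a", 2), ("Jill", "a", 1), ("Tom", "b", 2), ("Fred", "b", 1)],
)

# Keep only the rows whose salary equals the minimum salary of their own dept.
# Every selected column comes from a concrete row, so nothing is indeterminate.
lowest_paid = conn.execute(
    """
    SELECT A.name, A.dept, A.salary
    FROM emp AS A
    WHERE A.salary = (SELECT MIN(B.salary) FROM emp AS B WHERE B.dept = A.dept)
    ORDER BY A.dept, A.name
    """
).fetchall()

print(lowest_paid)
```

If two employees in a department tie for the lowest salary, this form returns both of them.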
This is a bit late, but I'll put this up for future reference.
In practice, GROUP BY tends to take the first row it encounters for each group and discard any later rows that match it, although MySQL does not guarantee this. So if Jack and Tom have the same department, whoever appears first in a normal SELECT will usually be the resulting row in the GROUP BY.
If you want to control what appears first in the list, you need to do an ORDER BY. However, SQL does not allow ORDER BY to come before GROUP BY, as it will throw an exception. The best workaround for this issue is to do the ORDER BY in a subquery and then a GROUP BY in the outer query. Here's an example:
SELECT * FROM (SELECT * FROM emp ORDER BY name) as foo GROUP BY dept
This is the best performing technique I've found. I hope this helps someone out.
As far as I know, for your purposes the specific rows returned can be considered to be random.
Ordering only takes place after GROUP BY is done
You can put a:
SET GLOBAL sql_mode=(SELECT REPLACE(@@sql_mode,'ONLY_FULL_GROUP_BY',''));
before your query to turn off ONLY_FULL_GROUP_BY, which is the mode that enforces SQL standard GROUP BY behavior
I find that the best thing to do is to consider this type of query unsupported. In most other database systems, you can't include columns that aren't either in the GROUP BY clause or in an aggregate function in the HAVING, SELECT or ORDER BY clauses.
Instead, consider that your query reads:
SELECT ANY(name), dept, ANY(salary)
FROM emp
GROUP BY dept;
...since this is what's going on.
Hope this helps....
I think ANSI SQL requires that the select includes only fields from the GROUP BY clause, plus aggregate functions.
With this behaviour, MySQL appears to return some row, possibly the last one the server read or any row it had at hand, but don't rely on that.
This would select the most recent row for each person:
SELECT * FROM emp
WHERE ID IN
(
SELECT
MAX(ID) AS ID
FROM
emp
GROUP BY
name
)
If you are grouping by department, does the other data matter? I know SQL Server will not even allow this query. If you do depend on the other columns here, it sounds like there might be other issues.
Try using ORDER BY to pick the row that you want.
SELECT * FROM emp GROUP BY dept ORDER BY name ASC;
Will return the following:
name dept salary
----- ----- -----
jack a 2
fred b 1
Let's say I have a simple table like this:
name id
tom 1
jerry 2
... ...
From outside, I get a list containing the names (tom, jerry, kettie...)
I am trying to use WHERE IN clause to retrieve the id based on the name list.
I can do
SELECT id FROM mySimpleTable where name in ('tom','jerry','kettie');
So I just iterate over the name list and generate the contents of the parentheses.
This works, but the results are not in the input order. For example, if the input is tom, jerry, kettie, the expected result is 1, 2, 3, but the output could actually be in any order.
How can I modify the SQL statement so that the output matches the input order and I can process it accordingly? I heard JOIN may help in this situation.
SELECT id
FROM mySimpleTable
where name in ('tom','jerry','kettie')
order by field(name, 'tom','jerry','kettie')
I heard JOIN may help in this situation.
Yes it can help:
SELECT m.id
FROM mySimpleTable m
JOIN (
SELECT 'tom' AS name, 1 AS orderNum
UNION ALL
SELECT 'jerry' AS name, 2 AS orderNum
UNION ALL
SELECT 'kettie' AS name, 3 AS orderNum
) AS sub
ON m.name = sub.name
ORDER BY sub.orderNum ASC;
This solution can be also used in different RDBMS. field is MySQL specific.
How it works:
Create derived table/subquery with values you need to check and ordering column
JOIN will return only rows that correspond each other based on name
ORDER BY column you've added in subquery
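The three steps above can be generated from the input list rather than written by hand. A minimal sketch, assuming SQLite through Python's sqlite3 module as a stand-in for MySQL; the helper name ids_in_input_order and the extra row spike are hypothetical.

```python
import sqlite3


def ids_in_input_order(conn, names):
    """Return the ids for names, in the same order as the input list."""
    if not names:
        return []
    # One "SELECT ?, ?" per input name, glued together with UNION ALL.
    first = "SELECT ? AS name, ? AS orderNum"
    rest = " UNION ALL SELECT ?, ?" * (len(names) - 1)
    sql = (
        "SELECT m.id FROM mySimpleTable AS m "
        "JOIN (" + first + rest + ") AS sub ON m.name = sub.name "
        "ORDER BY sub.orderNum ASC"
    )
    # Bind each name together with its 1-based position in the input.
    params = []
    for position, name in enumerate(names, start=1):
        params.extend([name, position])
    return [row[0] for row in conn.execute(sql, params)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mySimpleTable (name TEXT, id INTEGER)")
conn.executemany(
    "INSERT INTO mySimpleTable (name, id) VALUES (?, ?)",
    [("jerry", 2), ("kettie", 3), ("tom", 1), ("spike", 4)],
)

ordered_ids = ids_in_input_order(conn, ["kettie", "tom", "jerry"])
print(ordered_ids)
```

Binding the names as parameters, instead of concatenating them into the SQL text, also avoids quoting problems in the input. Names that are missing from the table simply produce no row.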
Just select id, name from table_a where name in ('tom','jerry','happy'); you will then have each input name paired with its output id.
this entirely depends on where you're getting the list for your "in" clause.
if it's from somewhere on the outside, you probably should first turn the list into a temp table, adding an id column that indicates the order (see this answer for a start on how to do that) - and then do an inner join with it.
I ran your SQL query and, for one, did get the output in the same order as the input. Still, there is no guarantee that it will happen the same way every time, so the best way to arrange your output in a particular order is to use the ORDER BY clause. The syntax would be:
SELECT column_name
FROM table_name
WHERE conditions
ORDER BY column_name;
So in your case, the query would read as:
SELECT id
FROM mysimpletable
WHERE name
IN('tom','jerry','kettie'....)
ORDER BY id;
Select
id
from mySimpleTable
where name in ('tom','jerry','kettie')
Order by id
I searched for an answer here and didn't find one closer to my question.
I have the following situation: I need to display a person first and then show the rest in ascending order. All the people from the same table. I tried UNION but after that, the SQL seems to mix everything again.
I have tried this:
select name from people where name = 'John'
UNION
select name from people order by name
I used UNION because it does not return duplicate values. But in the end, it mixed up the results and did not show them in the correct order, which should be:
John
Ana
Bruce
What am I doing wrong?
You need to use order by to get what you want. In MySQL, this is pretty easy:
select name
from people
order by (name = 'John') desc, name
Result sets (like tables) represent unordered sets in SQL. The only way to impose an order is to use order by. The order by at the end of a union/union all query applies to the entire query.
As an aside, your code would come close to working if you used union all -- which is much preferred over union. The union operation does additional work to remove duplicates. In this case, that reorders the results, a convenient reminder that you can only depend on the order of results when you use order by.
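The ordering trick above relies on the comparison name = 'John' evaluating to 1 or 0. A minimal sketch, assuming SQLite through Python's sqlite3 module as a stand-in for MySQL (SQLite also evaluates comparisons to 1 or 0); the CASE variant is an addition for databases without that behavior.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")
conn.executemany(
    "INSERT INTO people (name) VALUES (?)",
    [("Bruce",), ("John",), ("Ana",)],
)

# The comparison yields 1 for John and 0 for everyone else, so sorting it
# in descending order puts John first; the second key sorts the rest.
boolean_sort = [row[0] for row in conn.execute(
    "SELECT name FROM people ORDER BY (name = 'John') DESC, name"
)]

# Portable variant: an explicit CASE expression as the first sort key.
case_sort = [row[0] for row in conn.execute(
    "SELECT name FROM people "
    "ORDER BY CASE WHEN name = ? THEN 0 ELSE 1 END, name",
    ("John",),
)]

print(boolean_sort, case_sort)
```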
Also you can use UNION ALL in a derived table
SELECT name
FROM
(
SELECT 1 AS Row_Id, name
FROM people
WHERE name = 'John'
UNION ALL
SELECT 2 AS Row_Id, name
FROM people
) t
ORDER BY Row_Id
I have a table with 2 columns 'id' and 'name'. id is regular autoincrement index, name is just varchar.
id name
1 john
2 mary
3 pop
4 mary
5 john
6 michael
7 john
8 will
I would like to sort search results like this
8 will
7 john
5 john
1 john
6 michael
4 mary
2 mary
3 pop
The ids are in DESC order, but all duplicate names are grouped together. It's OK if ASC is easier to achieve. I can reverse the result in PHP if necessary.
Regular ORDER BY name,id DESC first sorts all names, and then sorts id's. That's not what I want. I am not interested in alphabetical orders of names.
I want id in DESC order, grouped by duplicate names if any.
P.S. I apologize if this is a duplicate question.
If I understand you correctly, you will need a derived table (a subquery in the FROM clause) for this.
SELECT TableName.id, TableName.Name FROM TableName
INNER JOIN
(
SELECT Name, max(id) as maxid FROM TableName GROUP BY Name
) as tmp on tmp.Name = TableName.Name
ORDER BY maxid DESC, id DESC
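The join above can be checked against the sample data from the question. A minimal sketch, assuming SQLite through Python's sqlite3 module as a stand-in for MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableName (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO TableName (id, name) VALUES (?, ?)",
    [
        (1, "john"), (2, "mary"), (3, "pop"), (4, "mary"),
        (5, "john"), (6, "michael"), (7, "john"), (8, "will"),
    ],
)

# Attach each row to the highest id of its name, then sort by that value
# first (this keeps equal names together) and by the row's own id second.
rows = conn.execute(
    """
    SELECT t.id, t.name
    FROM TableName AS t
    INNER JOIN (
        SELECT name, MAX(id) AS maxid FROM TableName GROUP BY name
    ) AS tmp ON tmp.name = t.name
    ORDER BY tmp.maxid DESC, t.id DESC
    """
).fetchall()

for row in rows:
    print(row)
```

The result matches the order requested in the question: will, the three john rows, michael, the two mary rows, then pop.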
In mysql there is a group by clause, see select, and search for group by on the page.
There are examples on the page
so try this...
SELECT id, name FROM TableName
group BY name;
As your ID is acting as your primary key you probably don't need to tell it to sort ascending, but you could add the
order by id
if you desire.
of course the group by clause enables the ASC and DESC clauses to be added also, so you should get them in alphabetical order if you desire.
Although, that said just using an
order by name desc
will likely give you a similar result, you'll need to test it on your dataset.
be aware that this could be a very expensive select, as it tries to sort everything out into your required ordering.
As such if you are expecting to use it as the base of other selects you may decide to create a temp table, that you can then select against directly, otherwise the grouping may be re-occurring each time you use the primary table.
This will probably be something you want to look into doing once your data set becomes fairly large though.
David
I have a database that has the following columns:
-------------------
id|domain|hit_count
-------------------
And I would like to perform this query on it:
SELECT id,MIN(hit_count)
FROM table WHERE domain='$domain'
GROUP BY domain ORDER BY MIN(hit_count)
I would like this query to give me the id of the row that had the smallest hit_count for $domain. The only problem is that if I have two rows that have the same domain, say www.bestbuy.com, the query will just group by whichever one came first, and then although I will get the correct lowest hit_count, the id may or may not be the id of the row that has the lowest hit_count.
Does anyone know of a way for me to perform this query and to get the id that matches up with MIN(hit_count)? Thanks!
Try this:
SELECT id,MIN(hit_count),domain FROM table GROUP BY domain HAVING domain='$domain'
See, when you're using aggregates, either via aggregate functions (and min() is such a function) or via GROUP BY or HAVING operators, your data is being grouped. In your case it is grouped by domain. You have 2 fields in your select list, id and min(hit_count).
Now, for each group database knows which hit_count to pick, as you've specified this explicitly via the aggregate function. But what about id — which one should be included?
MySQL (when ONLY_FULL_GROUP_BY is not enabled) silently returns an arbitrary value from the group for such fields, which I find an error-prone approach. In most other RDBMSes you will get an error for such a query.
The rule is: if you use aggregates, then all columns should be either arguments of aggregate functions or arguments of GROUP BY operator.
To achieve the desired result, you need a subquery:
SELECT id, domain, hit_count
FROM `table`
WHERE domain = '$domain'
AND hit_count = (SELECT min(hit_count) FROM `table` WHERE domain = '$domain');
I've used backticks, as table is a reserved word in SQL.
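A minimal sketch of the subquery approach, assuming SQLite through Python's sqlite3 module as a stand-in for MySQL. The table name hits, the helper name ids_with_lowest_hit_count and the sample rows are hypothetical; the domain is bound as a parameter instead of being interpolated as '$domain'.

```python
import sqlite3


def ids_with_lowest_hit_count(conn, domain):
    """Return (id, hit_count) for every row of domain that has the minimum hit_count."""
    sql = """
        SELECT id, hit_count
        FROM hits
        WHERE domain = ?
          AND hit_count = (SELECT MIN(hit_count) FROM hits WHERE domain = ?)
        ORDER BY id
    """
    return conn.execute(sql, (domain, domain)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hits (id INTEGER PRIMARY KEY, domain TEXT, hit_count INTEGER)"
)
conn.executemany(
    "INSERT INTO hits (id, domain, hit_count) VALUES (?, ?, ?)",
    [
        (1, "www.bestbuy.com", 10),
        (2, "www.bestbuy.com", 3),
        (3, "www.example.com", 1),
        (4, "www.bestbuy.com", 3),
    ],
)

lowest = ids_with_lowest_hit_count(conn, "www.bestbuy.com")
print(lowest)
```

When several rows tie for the minimum, all of them are returned; add LIMIT 1 if only one id is needed.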
SELECT
id,
hit_count
FROM
table
WHERE
domain='$domain'
AND hit_count = (SELECT MIN(hit_count) FROM table WHERE domain='$domain')
Try this:
SELECT id,hit_count
FROM table WHERE domain='$domain'
GROUP BY domain ORDER BY hit_count ASC;
This should also work:
select id, MIN(hit_count) from table where domain="$domain";
I had same question. Please see that question below.
min(column) is not returning me correct data of other columns
You are using a GROUP BY, which means each row in the result represents a group of values.
One of those values is the group name (the value of the field you grouped by). The rest are arbitrary values from within that group.
For example the following table:
F1 | F2
1 aa
1 bb
1 cc
2 gg
2 hh
If you group by F1: SELECT F1,F2 from T GROUP BY F1
You will get two rows:
1 and one value from (aa,bb,cc)
2 and one value from (gg,hh)
If you want a deterministic result set, you need to tell the software what algorithm to apply to the group. Several examples:
MIN
MAX
COUNT
SUM
etc etc
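Applying a few of these aggregates to the example table makes every output column well defined. A minimal sketch, assuming SQLite through Python's sqlite3 module as a stand-in for MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (F1 INTEGER, F2 TEXT)")
conn.executemany(
    "INSERT INTO T (F1, F2) VALUES (?, ?)",
    [(1, "aa"), (1, "bb"), (1, "cc"), (2, "gg"), (2, "hh")],
)

# Every selected column is either the grouping column or an aggregate,
# so each group produces exactly one predictable row.
summary = conn.execute(
    "SELECT F1, MIN(F2), MAX(F2), COUNT(*) FROM T GROUP BY F1 ORDER BY F1"
).fetchall()

print(summary)
```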
There is a simpler way: your query is OK, just modify it by adding the DESC keyword after GROUP BY domain
SELECT
id,
MIN(hit_count)
FROM table
WHERE domain = '$domain'
GROUP BY domain DESC
ORDER BY MIN(hit_count)
Explanation:
When you use GROUP BY with an aggregate function, it selects the first record of each group; if you add the DESC keyword, it selects the last record of that group instead. (MySQL does not guarantee which row supplies the non-aggregated columns, so test this on your data.)
For testing purposes, use this query, which only has group_concat added.
SELECT
group_concat(id),
MIN(hit_count)
FROM table
WHERE domain = '$domain'
GROUP BY domain DESC
ORDER BY MIN(hit_count)
If you can have duplicated domains group by id:
SELECT id,MIN(hit_count)
FROM table WHERE domain='$domain'
GROUP BY id ORDER BY MIN(hit_count)