Deselect duplicates in sql - mysql

Database ::
id subid
1 1
2 1
3 2
4 3
The id column is an auto-incremented primary key;
subid can be any number.
I want to sort by subid in decreasing order, drop the rows that have a
duplicate subid, and get the remaining rows.
Result I want ::
id subid
4 3
3 2
2 1

select max(id), subid from table group by subid order by subid desc

Use
order by (col) ASC or DESC
to sort in ascending or descending order.
Also use
group by (col)
or
distinct
to drop duplicate rows.

An explanation for @Geri of the use of GROUP BY.
With the accepted answer of:-
SELECT * FROM table GROUP BY subid ORDER BY subid DESC
the query selects all columns grouped by subid. In this case there is just 1 other column (although in other queries there could be many other columns). GROUP BY is designed to work with aggregate functions, for example MAX. With MAX(id) MySQL will find all the rows for each subid and then find the maximum value of id for each subid and return that.
So for subid 2 there is only a single id of 3 so that is obviously the max value of the id for that subid and will be returned, similarly for subid of 3 the only id is 4. For subid of 1 the possible values of id are 1 and 2. As 2 is the highest value it will be returned.
Without an aggregate function MySQL is free to choose any value of id; which one it chooses is not defined and may change.
So for subid of 1 it might return the id of 1 or the id of 2.
Most flavours of SQL will issue an error when a column in the SELECT clause is neither the result of an aggregate function nor mentioned in the GROUP BY clause. By default MySQL (before 5.7.5, where ONLY_FULL_GROUP_BY became part of the default sql_mode) does not error in this situation, and there are logical reasons for allowing it: for example, extra columns that are entirely dependent on a column in the GROUP BY clause, such as grouping by a unique user id and pulling back the user's name. This behaviour is defined in the SQL standards, although implementations are patchy (and of course it becomes a non-issue if these directly related columns are added to the GROUP BY clause).
MySQL can be set up to error in this situation, bringing it more into line with other flavours of SQL, by running with only_full_group_by enabled:-
http://dev.mysql.com/doc/refman/5.0/en/sql-mode.html#sqlmode_only_full_group_by
The answer by @franglais of:-
select max(id), subid from table group by subid order by subid desc
avoids the issue of the returned id being undefined, because it specifies that for each subid the maximum id value for that subid is returned.

SELECT MAX(id), subid FROM table GROUP BY subid ORDER BY subid DESC
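
For reference, a minimal sketch of the MAX(id) approach run against the sample rows from the question. It assumes Python's built-in sqlite3 module as a stand-in for MySQL and a hypothetical table name t (the question never names the table, and table itself is a reserved word). The query is plain standard SQL, so MySQL should return the same rows.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; "t" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY AUTOINCREMENT, subid INTEGER)")
conn.executemany("INSERT INTO t (subid) VALUES (?)", [(1,), (1,), (2,), (3,)])

# One row per subid, carrying the highest id found in that group.
rows = conn.execute(
    "SELECT MAX(id), subid FROM t GROUP BY subid ORDER BY subid DESC"
).fetchall()
print(rows)  # [(4, 3), (3, 2), (2, 1)]
```

Because id goes through MAX and subid is in the GROUP BY, no column is left for the server to pick arbitrarily.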

Related

MySQL MAX Function mixes rows

I have the query SELECT id, MAX(value) FROM table1 and it returns the correct value, but it takes the first id of the table instead of the one corresponding to the value returned (id is primary key).
I've already seen solutions, but they all needed a WHERE clause, which I can't use in my case.
I believe what you're trying to do is return the id of the row with the max value. Is that right?
I'm curious why you can't use a WHERE clause?
But OK, even with that constraint this can be solved. I'm going to assume that your table is unique on id (if not, you should really talk to whoever built it and ask why).
SELECT id, value
FROM table1
ORDER BY value DESC
LIMIT 1
This will sort your table, by value descending (greatest -> least), and then only show the first row (ie, the row with the largest "value").
If your table is not unique on id, you can still group by id and get the same result:
SELECT id, max(value) as max_value
FROM table1
GROUP BY id
ORDER BY max_value DESC
LIMIT 1
First, to answer why your query is behaving in the way you observe: I suspect you are running without sql_mode = only_full_group_by as your query would likely generate an error otherwise. As you've noticed, this can lead to somewhat odd results.
If ONLY_FULL_GROUP_BY is disabled, a MySQL extension to the standard SQL use of GROUP BY permits the select list, HAVING condition, or ORDER BY list to refer to nonaggregated columns even if the columns are not functionally dependent on GROUP BY columns. This causes MySQL to accept the preceding query. In this case, the server is free to choose any value from each group, so unless they are the same, the values chosen are nondeterministic, which is probably not what you want.
In this case, since you have no GROUP BY clause, the entire table is effectively the group.
To get one id associated with the largest value in the table, you can select all the rows, order by the value (descending), and then limit to the first result; there is no need for the aggregation operator (or a WHERE clause):
SELECT id, value FROM table1 ORDER BY value DESC LIMIT 1
Note that if there are multiple ids with the (same) max value, this only returns one of them. In the comments, @RaymondNijland points out that this may give a different id each time you run it (the value will always be the maximum), and you can make it deterministic by ordering by id as well:
SELECT id, value FROM table1 ORDER BY value DESC, id ASC LIMIT 1
Likewise, if there are for some reason multiple values for the same ID, it will still return that ID if one of its rows happens to be the max value -- thankfully this doesn't apply in this case, as you mentioned that id is the primary key.
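
A small sketch of the tie-breaking ORDER BY, in case it helps. The question gives no data, so the rows below are made up for illustration, and Python's sqlite3 is assumed as a stand-in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, value INTEGER)")
# Hypothetical rows: ids 2 and 3 tie for the maximum value.
conn.executemany(
    "INSERT INTO table1 (id, value) VALUES (?, ?)",
    [(1, 10), (2, 50), (3, 50), (4, 20)],
)

# Sort by value first, then by id so the tie is always broken the same way.
row = conn.execute(
    "SELECT id, value FROM table1 ORDER BY value DESC, id ASC LIMIT 1"
).fetchone()
print(row)  # (2, 50)
```

Without the id ASC tie-breaker, either of the two tied rows could legitimately come back.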
I think you forgot a group by clause :
SELECT id, MAX(value) FROM table1 GROUP BY id
EDIT : To answer your need you could do
SELECT id, MAX(value)
FROM table1
GROUP BY id
HAVING MAX(value) = (SELECT MAX(value) FROM table1)
This could give you multiple results if you have multiple ids with the max value. In this case you could add "LIMIT 1" to get only one result but that would be quite strange and random.

mysql return different result when using view and group by

I get a wrong result when using view and group by in mysql.
A simple table test
id name value
1 a 200
2 a 100
3 b 150
4 b NULL
5 c 120
when using normal syntax as
select * from (select * from test order by name asc, value asc ) as test group by test.name;
it returns
id name value
2 a 100
4 b NULL
5 c 120
However, if I replace the subquery with a view,
it shows different results.
create view test_view as select * from test order by name asc, value asc;
select * from test_view as test group by test.name;
it returns
id name value
1 a 200
3 b 150
5 c 120
it really bothers me, please someone give me some hint. Thanks.
Group first and then order the result. Try this; it is simpler and gives the same result:
select * from test group by name order by name asc, value asc
If you really need to make a subquery, it's the same, first group by:
select * from (select * from test group by name) as test order by test.name asc, test.value asc
http://dev.mysql.com/doc/refman/5.5/en/group-by-hidden-columns.html
"MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause. Sorting of the result set occurs after values have been chosen, and ORDER BY does not affect which values the server chooses."
There is nothing that suggests that your subquery trick should make the difference and assure the deterministic result you are hoping for.
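
If a deterministic result is what you are after, one option is to state the per-group choice explicitly instead of relying on row order. The sketch below assumes the goal is the row with the smallest non-NULL value for each name (that intent is a guess, the question does not say so) and uses Python's sqlite3 as a stand-in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, name TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [(1, "a", 200), (2, "a", 100), (3, "b", 150), (4, "b", None), (5, "c", 120)],
)

# Compute MIN(value) per name, then join back to fetch the matching full row.
# MIN() skips NULLs, so the (4, 'b', NULL) row is never selected.
rows = conn.execute("""
    SELECT t.id, t.name, t.value
    FROM test t
    JOIN (SELECT name, MIN(value) AS min_value
          FROM test
          GROUP BY name) m
      ON m.name = t.name AND m.min_value = t.value
    ORDER BY t.name
""").fetchall()
print(rows)  # [(2, 'a', 100), (3, 'b', 150), (5, 'c', 120)]
```

This returns the same rows whether the source is a table, a subquery, or a view, because nothing is left for the server to choose.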

MYSQL to order before grouping by

I have the following:
user_id date_created project_id
3 10/10/2013 1
3 09/10/2013 1
5 10/10/2013 1
8 10/10/2013 1
10 10/10/2013 1
3 08/10/2013 1
The end result i want is:
user_id date_created project_id
3 10/10/2013 1
5 10/10/2013 1
8 10/10/2013 1
10 10/10/2013 1
Context:
I have this thing called an influence, and a user can have many influences for a project.
I want to get a list of the latest influence from a user on a project.
I have tried:
select * from influences
where project_id = 1
group by user_id
ORDER BY created_at DESC
but of course this does not first order each user's rows by created_at and then order the full list. It simply squashes each user's rows together and orders the final list.
THE LARAVEL - Eloquent FOR THE ANSWER PROVIDED IS THIS:
return Influence::select( "user_id", "influence", DB::raw( "MAX(created_at) as created_at" ) )
->where( "project_id", "=", $projectID )
->groupBy( "user_id", "project_id" )->get();
You don't want to order before group by, because given the structure of your query, it won't necessarily do what you want.
If you want the most recently created influence, then get it explicitly:
select i.*
from influences i join
(select user_id, max(created_at) as maxca
from influences i
where project_id = 1
group by user_id
) iu
on iu.user_id = i.user_id and iu.maxca = i.created_at
where i.project_id = 1;
Your intention is to use a MySQL extension that the documentation explicitly warns against using. You want to include columns in the select that are not in the group by. As the documentation says:
MySQL extends the use of GROUP BY so that the select list can refer to
nonaggregated columns not named in the GROUP BY clause. This means
that the preceding query is legal in MySQL. You can use this feature
to get better performance by avoiding unnecessary column sorting and
grouping. However, this is useful primarily when all values in each
nonaggregated column not named in the GROUP BY are the same for each
group. The server is free to choose any value from each group, so
unless they are the same, the values chosen are indeterminate.
Furthermore, the selection of values from each group cannot be
influenced by adding an ORDER BY clause. Sorting of the result set
occurs after values have been chosen, and ORDER BY does not affect
which values within each group the server chooses.
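
The join against the per-user MAX(created_at) from this answer can be tried out as follows. This is only a sketch: it assumes Python's sqlite3 as a stand-in for MySQL, a column named created_at as in the asker's query, and ISO-formatted date strings so that MAX compares them chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE influences (user_id INTEGER, created_at TEXT, project_id INTEGER)"
)
# Sample rows from the question, with dates rewritten as YYYY-MM-DD.
conn.executemany(
    "INSERT INTO influences VALUES (?, ?, ?)",
    [(3, "2013-10-10", 1), (3, "2013-10-09", 1), (5, "2013-10-10", 1),
     (8, "2013-10-10", 1), (10, "2013-10-10", 1), (3, "2013-10-08", 1)],
)

# Find each user's latest created_at, then join back to get the whole row.
latest = conn.execute("""
    SELECT i.*
    FROM influences i
    JOIN (SELECT user_id, MAX(created_at) AS maxca
          FROM influences
          WHERE project_id = 1
          GROUP BY user_id) iu
      ON iu.user_id = i.user_id AND iu.maxca = i.created_at
    WHERE i.project_id = 1
    ORDER BY i.user_id
""").fetchall()
print(latest)
```

The ORDER BY i.user_id is only there to make the printed order stable. It plays no part in choosing which row represents each user.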
Use this:
SELECT user_id, project_id, MAX(date_created) as latest
FROM influences
WHERE project_id = 1
GROUP BY user_id, project_id
How it works: MySQL selects all the rows that match the WHERE conditions and sorts them by user_id and then, within each user_id, by project_id. From each set of rows having the same user_id and project_id it will produce a single row in the final result set.
You can use in the SELECT clause the columns used in the GROUP BY clause (user_id and project_id); their values are unambiguous: all the rows from each group have the same user_id and project_id.
You can also use aggregate functions. Each of them uses one column from all the rows in the group to compute a single value. The most recent created_at is, of course, MAX(created_at).
If you select a column that is neither included in the GROUP BY clause, nor passed to an aggregate function (like created_at you have in your query), MySQL has no hint how to compute that value. The standard SQL forbids it (the query is not valid) but MySQL allows it. It will simply pick a value from that column but there is no way to make it pick it from a specific row because this is, in fact, undefined behaviour.
You can omit project_id from the GROUP BY clause because the WHERE clause ensures that all the rows have the same project_id. This coincidentally makes the result correct even though project_id does not appear in the GROUP BY clause and is not computed using an aggregate function.
I recommend keeping project_id in the GROUP BY clause. It doesn't affect the result or the query speed, and it allows you to loosen the filtering conditions (e.g. use WHERE project_id IN (1, 2)) and still get the correct result (which doesn't happen if you remove it from GROUP BY).
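
To illustrate the point about loosening the filter, here is a sketch with a second project added. The extra project 2 row is invented for the example, the column is assumed to be called created_at, and Python's sqlite3 stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE influences (user_id INTEGER, created_at TEXT, project_id INTEGER)"
)
# Hypothetical data: user 3 has influences on two different projects.
conn.executemany(
    "INSERT INTO influences VALUES (?, ?, ?)",
    [(3, "2013-10-10", 1), (3, "2013-10-09", 1),
     (3, "2013-10-11", 2), (5, "2013-10-10", 1)],
)

# Grouping by both columns keeps one row per (user, project) pair.
rows = conn.execute("""
    SELECT user_id, project_id, MAX(created_at) AS latest
    FROM influences
    WHERE project_id IN (1, 2)
    GROUP BY user_id, project_id
    ORDER BY user_id, project_id
""").fetchall()
print(rows)  # [(3, 1, '2013-10-10'), (3, 2, '2013-10-11'), (5, 1, '2013-10-10')]
```

Grouping by user_id alone would fold user 3's two projects into a single row, and the project_id shown for it would no longer be well defined.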

MySQL select distinct doesn't work

I have a database with 1 table with the following rows:
id name date
-----------------------
1 Mike 2012-04-21
2 Mike 2012-04-25
3 Jack 2012-03-21
4 Jack 2012-02-12
I want to extract only distinct values, so that I will only get Mike and Jack once.
I have this code for a search script:
SELECT DISTINCT name FROM table WHERE name LIKE '%$string%' ORDER BY id DESC
But it doesn't work. It outputs Mike, Mike, Jack, Jack.
Why?
Because of the ORDER BY id DESC clause, the query is treated rather as if it was written:
SELECT DISTINCT name, id
FROM table
ORDER BY id DESC;
except that the id columns are not returned to the user (you). The result set has to include the id to be able to order by it. Obviously, this result set has four rows, so that's what is returned. (Moral: don't order by hidden columns — unless you know what it is going to do to your query.)
Try:
SELECT DISTINCT name
FROM table
ORDER BY name;
(with or without DESC according to whim). That will return just the two rows.
If you need to know an id for each name, consider:
SELECT name, MIN(id)
FROM table
GROUP BY name
ORDER BY MIN(id) DESC;
You could use MAX to equally good effect.
All of this applies to all SQL databases, including MySQL. MySQL has some rules which allow you to omit GROUP BY clauses with somewhat non-deterministic results. I recommend against exploiting the feature.
For a long time (maybe even now) the SQL standard did not allow you to order by columns that were not in the select-list, precisely to avoid confusions such as this. When the result set does not include the ordering data, the ordering of the result set is called 'essential ordering'; if the ordering columns all appear in the result set, it is 'inessential ordering' because you have enough data to order the data yourself.
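
Both suggested queries can be checked against the sample rows with a short sketch. It assumes Python's sqlite3 as a stand-in for MySQL and a hypothetical table name people, since the question only calls it table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1, "Mike", "2012-04-21"), (2, "Mike", "2012-04-25"),
     (3, "Jack", "2012-03-21"), (4, "Jack", "2012-02-12")],
)

# DISTINCT over the only selected column, ordered by that same column.
names = [r[0] for r in conn.execute("SELECT DISTINCT name FROM people ORDER BY name")]
print(names)  # ['Jack', 'Mike']

# One row per name with an explicit id, ordered by that aggregated id.
with_ids = conn.execute(
    "SELECT name, MIN(id) FROM people GROUP BY name ORDER BY MIN(id) DESC"
).fetchall()
print(with_ids)  # [('Jack', 3), ('Mike', 1)]
```

In both queries every ORDER BY expression also appears in the select list, so there is no hidden column involved.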

Select Random Max in MySQL

I have this table:
id name bid
1 Test1 5.50
2 Test2 5.50
3 Test3 5.49
I want to select the row with the highest bid. If the highest bid is equal on another row, then it should randomly select one of the highest bid rows.
I tried:
SELECT name,max(bid) FROM table ORDER BY rand()
The output:
id name bid
1 Test1 5.50
My problem is that id "2" is never displayed because for some reason my query is only selecting id "1"
SELECTing name and MAX(bid) in the same query makes no sense: you are asking for the highest bid aggregated across all the rows, plus a name that's not aggregated, so it's not at all clear which row's name you'll be picking. MySQL typically picks the “right” answer you meant (one of the rows that owned the maximum bid) but it's not guaranteed, fails in all other databases, and is invalid in ANSI SQL.
To get a highest-bid row, order by bid and pick only the first result. If you want to ensure you get a random highest-bid row rather than just an arbitrary one, add a random factor to the order clause:
SELECT name, bid
FROM table
ORDER BY bid DESC, RAND()
LIMIT 1
SELECT name,bid
FROM table
WHERE bid=(SELECT max(bid) FROM table)
ORDER BY RAND()
LIMIT 1
should do the trick. Waiting for more optimized request ^^
That's because you're using an aggregate function, which collapses everything into a single row. You need a sub-select:
SELECT *
FROM table
WHERE bid = (SELECT MAX(bid) FROM table)
ORDER BY rand()
LIMIT 1;
But also be aware that ORDER BY RAND() can be slow on large tables, because a random value has to be generated for every candidate row before sorting. If you have only a few results, though, the performance implications may not be significant enough to bother changing.
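
A quick way to convince yourself that the sub-select version only ever returns a top-bid row is to run it repeatedly. The sketch assumes Python's sqlite3 as a stand-in for MySQL and a hypothetical table name bids. Note that SQLite spells the function RANDOM(), where MySQL uses RAND().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bids (id INTEGER PRIMARY KEY, name TEXT, bid REAL)")
conn.executemany(
    "INSERT INTO bids VALUES (?, ?, ?)",
    [(1, "Test1", 5.50), (2, "Test2", 5.50), (3, "Test3", 5.49)],
)

# Keep only rows holding the maximum bid, then pick one of them at random.
query = """
    SELECT id, name, bid
    FROM bids
    WHERE bid = (SELECT MAX(bid) FROM bids)
    ORDER BY RANDOM()  -- RAND() in MySQL
    LIMIT 1
"""
picked = {conn.execute(query).fetchone()[0] for _ in range(200)}
print(picked)  # only ids 1 and 2 can ever appear here
```

Since the WHERE clause narrows the rows to the tied winners before sorting, the random sort only touches those few rows.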