SQL unexpected return - mysql

So basically, I have a simple Database with only one table (it's a test DB).
The table has 4 columns:
ID
Name
OralGrade
WrittenGrade
What I'm trying to do is pretty simple (which is why I'm asking for your help): I want to get the name and the average of the student with the highest average.
What I tried:
SELECT nom, MAX(avg)
FROM (
SELECT nom, (noteOrale + noteEcrit)/2 as avg
FROM etudiant
GROUP BY nom) AS Table;
This query returned a name and an average, but the average doesn't correspond to the name.
Can someone give me pointers or explain what went wrong? Thanks

Use order by and limit. No subquery is necessary:
SELECT nom, (noteOrale + noteEcrit)/2 as avg
FROM etudiant
ORDER BY avg DESC
LIMIT 1;
No GROUP BY appears to be needed either, because both grades for a student are on the same row.
If a student's grades were spread over multiple rows, you would need GROUP BY.
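As a rough illustration of the ORDER BY plus LIMIT approach, here is a minimal sketch that runs the query against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for MySQL here, and the sample students, the REAL column types and the alias moyenne (used instead of avg to avoid any confusion with the AVG function) are assumptions made for the example.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL test DB (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE etudiant ("
    "id INTEGER PRIMARY KEY, nom TEXT, noteOrale REAL, noteEcrit REAL)"
)
# Hypothetical sample rows: one row per student, both grades on the same row.
conn.executemany(
    "INSERT INTO etudiant (nom, noteOrale, noteEcrit) VALUES (?, ?, ?)",
    [("Alice", 12.0, 15.0), ("Bob", 17.0, 16.0), ("Chloe", 9.0, 18.0)],
)

# No aggregate and no GROUP BY: compute the average per row, sort, keep one row.
top = conn.execute(
    """
    SELECT nom, (noteOrale + noteEcrit) / 2 AS moyenne
    FROM etudiant
    ORDER BY moyenne DESC
    LIMIT 1
    """
).fetchone()

print(top)  # ('Bob', 16.5)
```

Because no aggregation happens, the name and the average always come from the same row.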

I would just use limit for this:
SELECT nom, avg
FROM (
SELECT nom, (noteOrale + noteEcrit)/2 as avg
FROM etudiant
GROUP BY nom
) t
ORDER BY avg DESC
LIMIT 1
MySQL allows you to use aggregation without including all non-aggregated columns in the GROUP BY clause, so your query is just returning an arbitrary value for nom.

What's wrong is that the value returned for nom is indeterminate. The MAX() aggregate causes the query to collapse all of the rows into one and pick out the highest value of avg. That part is working.
But the value returned for nom is from some row in the collapsed group, not necessarily the row that has the highest value of avg. The query basically told MySQL to return a value of nom from any row in the group.
Other databases would throw an error with the query, with an error message along the lines of "non-aggregate in SELECT list not in GROUP BY".
A non-standard MySQL-specific extension allows the query to run. (It is possible to get MySQL to follow the standard more closely, like other databases, by including ONLY_FULL_GROUP_BY in sql_mode. With the extension disabled, MySQL would return an error for this query.)
That's the reason for the "unexpected" behavior you observe.
(This answers the question you asked... explaining what went wrong.)

Just list the rows where the sum of the two fields is equal to the highest sum of the two fields across all rows. You don't need to divide by two; that's just scaling.
SELECT * FROM etudiant
WHERE noteOrale + noteEcrit =
(SELECT MAX(noteOrale + noteEcrit)
FROM etudiant)
If you also want the computed average in the output, add it to the SELECT clause.
SELECT nom, noteOrale, noteEcrit,
(noteOrale + noteEcrit)/2 AS avg
FROM etudiant
WHERE noteOrale + noteEcrit =
(SELECT MAX(noteOrale + noteEcrit)
FROM etudiant)
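One difference from the LIMIT 1 approach is that the subquery version returns every student tied for the top score. The sketch below shows this with Python's sqlite3 module against the etudiant table from the question; SQLite stands in for MySQL, and the sample rows and the alias moyenne are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE etudiant (nom TEXT, noteOrale REAL, noteEcrit REAL)")
# Hypothetical data: Bob and Dana are tied with a sum of 33.
conn.executemany(
    "INSERT INTO etudiant VALUES (?, ?, ?)",
    [("Alice", 12.0, 15.0), ("Bob", 17.0, 16.0), ("Dana", 16.0, 17.0)],
)

# Keep every row whose sum equals the highest sum in the table.
best = conn.execute(
    """
    SELECT nom, (noteOrale + noteEcrit) / 2 AS moyenne
    FROM etudiant
    WHERE noteOrale + noteEcrit =
          (SELECT MAX(noteOrale + noteEcrit) FROM etudiant)
    ORDER BY nom
    """
).fetchall()

print(best)  # [('Bob', 16.5), ('Dana', 16.5)]
```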

Related

MySQL MAX Function mixes rows

I have the query SELECT id, MAX(value) FROM table1 and it returns the correct value, but it takes the first id of the table instead of the one corresponding to the value returned (id is primary key).
I've already seen solutions, but they all needed a WHERE clause, which I can't use in my case.

I believe what you're trying to do is return the id of the row with the max value. Is that right?
I'm curious why you can't use a WHERE clause?
But ok, given that constraint, this can be solved. I'm going to assume that your table is unique on id (if not, you should really talk to whoever built it and ask why).
SELECT id, value
FROM table1
ORDER BY value DESC
LIMIT 1
This will sort your table by value descending (greatest -> least) and then show only the first row (i.e., the row with the largest "value").
If your table is not unique on id, you can still group by id and get the same result:
SELECT id, max(value) as max_value
FROM table1
GROUP BY id
ORDER BY max_value DESC
LIMIT 1

First, to answer why your query is behaving in the way you observe: I suspect you are running without sql_mode = only_full_group_by as your query would likely generate an error otherwise. As you've noticed, this can lead to somewhat odd results.
If ONLY_FULL_GROUP_BY is disabled, a MySQL extension to the standard SQL use of GROUP BY permits the select list, HAVING condition, or ORDER BY list to refer to nonaggregated columns even if the columns are not functionally dependent on GROUP BY columns. This causes MySQL to accept the preceding query. In this case, the server is free to choose any value from each group, so unless they are the same, the values chosen are nondeterministic, which is probably not what you want.
In this case, since you have no GROUP BY clause, the entire table is effectively the group.
To get one id associated with the largest value in the table, you can select all the rows, order by the value (descending), and then limit to the first result; there is no need for the aggregation operator (or a WHERE clause):
SELECT id, value FROM table1 ORDER BY value DESC LIMIT 1
Note that if there are multiple ids with the (same) max value, this only returns one of them. In the comments, @RaymondNijland points out that this may give different results each time you run it (for id; the value will always be the maximum), and you can make it deterministic by ordering by id as well:
SELECT id, value FROM table1 ORDER BY value DESC, id ASC LIMIT 1
Likewise, if there are for some reason multiple values for the same ID, it will still return that ID if one of its rows happens to be the max value -- thankfully this doesn't apply in this case, as you mentioned that id is the primary key.
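The tie-breaking behaviour can be checked with a small sketch. It uses Python's sqlite3 module with an in-memory SQLite database as a stand-in for MySQL, and the rows in table1 are made up so that two ids share the maximum value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, value INTEGER)")
# Hypothetical rows: ids 2 and 3 both hold the maximum value 42.
conn.executemany(
    "INSERT INTO table1 (id, value) VALUES (?, ?)",
    [(1, 10), (2, 42), (3, 42), (4, 7)],
)

# The secondary sort key (id ASC) decides which of the tied rows is returned.
row = conn.execute(
    "SELECT id, value FROM table1 ORDER BY value DESC, id ASC LIMIT 1"
).fetchone()

print(row)  # (2, 42)
```

Without the second sort key, either id 2 or id 3 would be an acceptable answer for the database to return.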

I think you forgot a GROUP BY clause:
SELECT id, MAX(value) FROM table1 GROUP BY id
EDIT: To get what you need, you could do:
SELECT id, MAX(value)
FROM table1
GROUP BY id
HAVING MAX(value) = (SELECT MAX(value) FROM table1)
This could give you multiple results if you have multiple ids with the max value. In this case you could add "LIMIT 1" to get only one result but that would be quite strange and random.

Mysql DISTINCT with more than one column (remove duplicates)

My table is called training_session.
I'm trying to print out some information from my data, but I do not want any duplicates. I don't quite get it; can someone tell me what I'm doing wrong?
SELECT DISTINCT athlete_id AND duration FROM training_session
SELECT DISTINCT athlete_id, duration FROM training_session
It works perfectly if I use only one column, but when I add another, it does not work.

I think you misunderstood the use of DISTINCT.
There is a big difference between DISTINCT and GROUP BY.
They have a similar goal, but they serve different purposes.
Use DISTINCT if you want to show a set of columns without repeating any combination of values. That means you don't care about calculations or aggregate functions. DISTINCT will show different results as you add more columns to your SELECT (if the table has many columns).
Use GROUP BY if you want one row per distinct value of the selected columns and you want to use aggregate functions to calculate data related to each group. In short, use GROUP BY when you need aggregate functions.
Please check group functions you can use in this link.
https://dev.mysql.com/doc/refman/8.0/en/group-by-functions.html
EDIT 1:
It seems you are trying to get the latest duration for each athlete. Assuming the table has a column that can be sorted to find the latest row, here is an alternative solution:
SELECT a.athlete_id,
       ( SELECT b.duration
         FROM training_session AS b
         WHERE b.athlete_id = a.athlete_id -- connect
         ORDER BY [latest column to sort] DESC
         LIMIT 1
       ) last_duration
FROM training_session AS a
GROUP BY a.athlete_id
ORDER BY a.athlete_id
This is a correlated scalar subquery in the select list. With the help of LIMIT 1, it returns only the topmost record. A subquery used this way must return at most one row, or else the query fails with an error.
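To make the subquery approach concrete, here is a minimal sketch using Python's sqlite3 module with SQLite standing in for MySQL. The session_date column is hypothetical: the original answer leaves the sort column unspecified, so both the column name and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# session_date is a hypothetical column used to decide which row is "latest".
conn.execute(
    "CREATE TABLE training_session ("
    "athlete_id INTEGER, duration INTEGER, session_date TEXT)"
)
conn.executemany(
    "INSERT INTO training_session VALUES (?, ?, ?)",
    [
        (1, 30, "2024-01-01"),
        (1, 45, "2024-02-01"),
        (2, 60, "2024-01-15"),
        (2, 20, "2024-01-10"),
    ],
)

# One output row per athlete; the subquery picks that athlete's latest duration.
latest = conn.execute(
    """
    SELECT a.athlete_id,
           (SELECT b.duration
            FROM training_session AS b
            WHERE b.athlete_id = a.athlete_id
            ORDER BY b.session_date DESC
            LIMIT 1) AS last_duration
    FROM training_session AS a
    GROUP BY a.athlete_id
    ORDER BY a.athlete_id
    """
).fetchall()

print(latest)  # [(1, 45), (2, 60)]
```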

MySQL's DISTINCT clause is used to filter out duplicate rows from the result set.
If your query was SELECT DISTINCT athlete_id FROM training_session then your output would be:
athlete_id
----------
1
2
3
4
5
6
As soon as you add another column to your query (in your example, the column called duration), DISTINCT applies to the combination of both columns, so every athlete_id/duration pair that occurs is returned once, hence the results you're getting. In other words, the query is working correctly.
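A short sketch can show DISTINCT applying to the whole row. It uses Python's sqlite3 module with SQLite as a stand-in for MySQL, and the training_session rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE training_session (athlete_id INTEGER, duration INTEGER)")
# Hypothetical rows with some exact duplicates.
conn.executemany(
    "INSERT INTO training_session VALUES (?, ?)",
    [(1, 30), (1, 30), (1, 45), (2, 60), (2, 60)],
)

# DISTINCT over one column: one row per athlete.
one_col = conn.execute(
    "SELECT DISTINCT athlete_id FROM training_session ORDER BY athlete_id"
).fetchall()

# DISTINCT over two columns: one row per (athlete_id, duration) combination.
two_cols = conn.execute(
    "SELECT DISTINCT athlete_id, duration FROM training_session "
    "ORDER BY athlete_id, duration"
).fetchall()

print(one_col)   # [(1,), (2,)]
print(two_cols)  # [(1, 30), (1, 45), (2, 60)]
```

Athlete 1 appears twice in the second result because its two durations differ, even though the exact duplicate rows were removed.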

How do I use MAX() to return the row that has the max value?

I have a table orders with fields id, customer_id and amt.
I want to get the customer_id with the largest amt, along with the value of that amt.
I wrote this query:
SELECT customer_id, MAX(amt) FROM orders;
But the result of this query contained an incorrect value of customer_id.
Then I wrote this query:
SELECT customer_id, MAX(amt) AS maximum FROM orders GROUP BY customer_id ORDER BY maximum DESC LIMIT 1;
and got the correct result.
But I do not understand why my first query did not work properly. What am I doing wrong?
And is it possible to change my second query to get the information I need in a simpler, more proper way?

MySQL will allow you to mix an aggregate with a non-aggregated column without a GROUP BY clause, returning the MAX(amt) of the entire table together with an arbitrary customer_id. Most other RDBMSs reject such a query unless the non-aggregated column appears in GROUP BY.
I don't see anything wrong with your 2nd query -- there are other ways to do it, but yours will work fine.

Some versions of SQL give you a warning or error when you select a field, have an aggregate operator like MAX or SUM, and the field you are selecting does not appear in GROUP BY.
You need a more complicated query to fetch the customer_id corresponding to the max amt. Unfortunately SQL is not as naive as you think. One such way to do this is:
select customer_id from orders where amt = ( select max(amt) from orders);
A solution using joins may be more performant, though.
To understand why what you were trying to do doesn't make sense, replace MAX with SUM. From the stance of how aggregate operators are interpreted, it's a mere coincidence that MAX returns something that corresponds to an actual row. SUM does not have this property, for instance.
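The MAX versus SUM point can be seen in a small sketch. It uses Python's sqlite3 module with SQLite standing in for MySQL, and the orders rows are assumed sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amt INTEGER)"
)
# Hypothetical orders.
conn.executemany(
    "INSERT INTO orders (customer_id, amt) VALUES (?, ?)",
    [(10, 50), (20, 300), (30, 120)],
)

# MAX(amt) happens to equal the amt of a real row, so the filter finds a customer.
with_max = conn.execute(
    "SELECT customer_id FROM orders WHERE amt = (SELECT MAX(amt) FROM orders)"
).fetchall()

# SUM(amt) is 470, which no single row holds, so there is no customer to report.
with_sum = conn.execute(
    "SELECT customer_id FROM orders WHERE amt = (SELECT SUM(amt) FROM orders)"
).fetchall()

print(with_max)  # [(20,)]
print(with_sum)  # []
```

An aggregate is a value computed over the group rather than a pointer to one of its rows, which is why the row has to be looked up explicitly.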

Practically your first query can be seen as if it were GROUP BY-ed into a big single group.
Also, MySQL is free to choose each output value from different source rows from the same group.
http://dev.mysql.com/doc/refman/5.7/en/group-by-extensions.html
MySQL extends the use of GROUP BY so that the select list can refer to
nonaggregated columns not named in the GROUP BY clause.
The server is free to choose any value from each group, so
unless they are the same, the values chosen are indeterminate.
Furthermore, the selection of values from each group cannot be
influenced by adding an ORDER BY clause. Sorting of the result set
occurs after values have been chosen, and ORDER BY does not affect
which values within each group the server chooses.

The problem with MAX() is that it selects the highest value of the specified field, considering that field alone. The other values in the same row are not considered at all. MySQL will usually return the value from the first row of the group (in this case the group is the entire table, since no GROUP BY was specified), but this is not guaranteed, and the information from the other rows is dropped during the aggregation.
To solve this, you could do the following:
SELECT customer_id, amt FROM orders ORDER BY amt DESC LIMIT 1
It should return the customer_id and the highest amt while preserving the relation between the two, because no aggregation is performed.
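As a sanity check, the sketch below runs the asker's grouped query next to the simpler ORDER BY plus LIMIT query and compares the results. It uses Python's sqlite3 module with SQLite as a stand-in for MySQL, and the orders rows are assumed sample data in which one customer has several orders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amt INTEGER)"
)
# Hypothetical orders; customer 20 has two of them.
conn.executemany(
    "INSERT INTO orders (customer_id, amt) VALUES (?, ?)",
    [(10, 50), (20, 300), (20, 20), (30, 120)],
)

# The asker's second query: maximum per customer, then the largest of those.
grouped = conn.execute(
    """
    SELECT customer_id, MAX(amt) AS maximum
    FROM orders
    GROUP BY customer_id
    ORDER BY maximum DESC
    LIMIT 1
    """
).fetchone()

# The simpler query: sort the rows themselves and keep the first one.
simple = conn.execute(
    "SELECT customer_id, amt FROM orders ORDER BY amt DESC LIMIT 1"
).fetchone()

print(grouped, simple)  # (20, 300) (20, 300)
```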

How do I use distinct for a column along with a where clause in sql server 2008?

I want to get the distinct values of a particular column; however, duplicates are not handled properly if more than 3 columns are selected.
The query is:
SELECT DISTINCT
ShoppingSessionId, userid
FROM
dbo.tbl_ShoppingCart
GROUP BY
ShoppingSessionId, userid
HAVING
userid = 7
This query produces the correct result, but if we add another column then the result is wrong.
Please help: I want ShoppingSessionId to be distinct while still selecting all the columns from the table, together with the WHERE clause.
How can I do that?

The DISTINCT keyword applies to the entire row, never to a column.
Presently DISTINCT is not needed at all, because your script already makes sure that ShoppingSessionId is distinct: by specifying the column in GROUP BY and filtering on the other grouping column (userid).
When you add a third column to GROUP BY and it results in duplicated ShoppingSessionId values, it means that some ShoppingSessionId values are associated with several different values of the added column.
If you want ShoppingSessionId to remain distinct after including that third column, you should decide which values of the added column should be left in the output and which should be discarded. This is called aggregating. You could apply the MAX() function to that column, or MIN(), or any other suitable aggregate function. Note that the column should not be included in GROUP BY in this case.
Here's an illustration of what I'm talking about:
SELECT
ShoppingSessionId,
userid,
MAX(YourThirdColumn) AS YourThirdColumn
FROM dbo.tbl_ShoppingCart
GROUP BY
ShoppingSessionId,
userid
HAVING userid = 7
There's one more note on your query. The HAVING clause is typically used for filtering on aggregated columns. If your filter does not involve aggregated columns, you'll be better off using the WHERE clause instead:
SELECT
ShoppingSessionId,
userid,
MAX(YourThirdColumn) AS YourThirdColumn
FROM dbo.tbl_ShoppingCart
WHERE userid = 7
GROUP BY
ShoppingSessionId,
userid
Although both queries produce identical results, their efficiency can differ. The first query has to pull all rows, group and aggregate them, and then discard every group except userid = 7; the second discards rows first and only then groups and aggregates the rest, which is much more efficient.
You could go even further and exclude the userid column from GROUP BY and pull its value with an aggregate function:
SELECT
ShoppingSessionId,
MAX(userid) AS userid,
MAX(YourThirdColumn) AS YourThirdColumn
FROM dbo.tbl_ShoppingCart
WHERE userid = 7
GROUP BY
ShoppingSessionId
Since all userid values in your output are supposed to be 7 (because that's in your filter), you can just pick the maximum value for every ShoppingSessionId, knowing that it'll always be 7.
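The claim that the HAVING and WHERE variants return the same rows can be checked with a small sketch. It uses Python's sqlite3 module with SQLite as a stand-in for SQL Server, so the dbo schema prefix is dropped. ProductId is a hypothetical third column and the cart rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# ProductId is a hypothetical third column; SQLite has no dbo schema prefix.
conn.execute(
    "CREATE TABLE tbl_ShoppingCart ("
    "ShoppingSessionId TEXT, userid INTEGER, ProductId INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl_ShoppingCart VALUES (?, ?, ?)",
    [("s1", 7, 100), ("s1", 7, 105), ("s2", 7, 101), ("s3", 8, 200)],
)

# Variant 1: group everything, then filter the groups with HAVING.
having_rows = conn.execute(
    """
    SELECT ShoppingSessionId, userid, MAX(ProductId) AS ProductId
    FROM tbl_ShoppingCart
    GROUP BY ShoppingSessionId, userid
    HAVING userid = 7
    ORDER BY ShoppingSessionId
    """
).fetchall()

# Variant 2: filter the rows with WHERE first, then group what is left.
where_rows = conn.execute(
    """
    SELECT ShoppingSessionId, userid, MAX(ProductId) AS ProductId
    FROM tbl_ShoppingCart
    WHERE userid = 7
    GROUP BY ShoppingSessionId, userid
    ORDER BY ShoppingSessionId
    """
).fetchall()

print(having_rows)  # [('s1', 7, 105), ('s2', 7, 101)]
print(where_rows)   # [('s1', 7, 105), ('s2', 7, 101)]
```

Each ShoppingSessionId appears once because the third column is aggregated instead of being added to GROUP BY.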

Why does ORDER BY + MAX() return the maximum value when I group?

I have this part of a query :
GROUP BY trackid
ORDER BY MAX(history.date) DESC
And I see that, when I group, it returns the row with the maximum date for each group. Why this behaviour? ORDER BY should work on whole rows, not on the grouping?
EDIT (Example)
This is my whole query :
SELECT tracklist.value, history.date
FROM history JOIN tracklist ON history.trackid=tracklist.trackid
ORDER BY history.date DESC
The result is :
tracklist3 2011-04-27 15:40:36
tracklist2 2011-04-27 13:15:43
tracklist2 2011-04-02 00:30:02
tracklist2 2011-04-01 14:20:12
tracklist1 2011-03-02 14:13:58
tracklist1 2011-03-01 12:11:50
As you can see, all lines are correctly ordered by history.date.
Now I'd like to group them, keeping for each group the row with the MAX history.date.
So the output should be :
tracklist3 2011-04-27 15:40:36
tracklist2 2011-04-27 13:15:43
tracklist1 2011-03-02 14:13:58
I see that :
GROUP BY trackid
ORDER BY MAX(history.date) DESC
works correctly, but I really don't understand why it works :) ORDER BY is for whole rows, not for the grouping...

When you say SELECT trackid, MAX(history.date) ... GROUP BY trackid ORDER BY MAX(history.date) DESC you are really saying: "Show me for each tracklist the most recent history entry and please show me the tracklist first whose most recent history entry is (overall) most recent."
The ORDER BY is applied after the grouping (that's why it comes after the GROUP BY in the SQL).
Note that in your example, you have SELECT tracklist.value, history.date instead of SELECT tracklist.value, MAX(history.date). That is just wrong, and unfortunately MySQL does not give a warning but chooses an arbitrary history.date at its discretion.
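Here is a minimal sketch of the corrected query, run with Python's sqlite3 module against an in-memory SQLite database standing in for MySQL. The history dates are the ones from the question. The table definitions and trackid values are assumptions, and tracklist.value is added to GROUP BY so that every selected column is either grouped or aggregated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed table definitions; only the column names come from the question.
conn.execute("CREATE TABLE tracklist (trackid INTEGER PRIMARY KEY, value TEXT)")
conn.execute("CREATE TABLE history (trackid INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO tracklist VALUES (?, ?)",
    [(1, "tracklist1"), (2, "tracklist2"), (3, "tracklist3")],
)
conn.executemany(
    "INSERT INTO history VALUES (?, ?)",
    [
        (3, "2011-04-27 15:40:36"),
        (2, "2011-04-27 13:15:43"),
        (2, "2011-04-02 00:30:02"),
        (2, "2011-04-01 14:20:12"),
        (1, "2011-03-02 14:13:58"),
        (1, "2011-03-01 12:11:50"),
    ],
)

# Grouping produces one row per track; ORDER BY then sorts those grouped rows.
rows = conn.execute(
    """
    SELECT tracklist.value, MAX(history.date) AS last_date
    FROM history
    JOIN tracklist ON history.trackid = tracklist.trackid
    GROUP BY tracklist.trackid, tracklist.value
    ORDER BY MAX(history.date) DESC
    """
).fetchall()

for row in rows:
    print(row)
```

Selecting MAX(history.date) explicitly is what guarantees that the date shown is the latest one for each track; the ORDER BY only decides the order of the grouped rows.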

ORDER BY MAX(history.date) DESC is somewhat redundant if all you want to do is order the result set. Ordering applies to the result set.
Consider the results if you remove the ORDER BY clause. Without that, your query would only be grouping on the trackid column, so it wouldn't be valid--you would need to add an aggregate function to the date column or add the date column to the GROUP BY clause. By adding the aggregate function to the ORDER BY clause, you are essentially telling the SQL engine that for each group of trackid, get the maximum date. This seems to be what you want.

It seems you do not fully understand the GROUP BY statement. I would recommend looking up a tutorial on it.
But essentially, the GROUP BY statement combines a number of rows into one. The columns you GROUP BY determine how unique the new combined rows will be. SQL doesn't know how to handle the non-grouped columns, since each combined row pulls data from a number of rows that contain different values in these columns. That's why you use aggregate functions on the non-grouped columns in the SELECT statement. The aggregate function MAX() looks at all of the values in the history.date column for the rows that are being combined and returns only one of them. Likewise, once the rows are grouped, the ORDER BY clause can only refer to the grouping columns or to aggregates, which is why ORDER BY uses an aggregate function here.
