Get AVG() of values from table with different names - mysql

I have a table :
CREATE TABLE data
(
value integer,
name varchar(100)
)
In my table, name can contain duplicates, each with a different value. I want to get each DISTINCT name together with the avg() of its values from the table data.
I am able to get the DISTINCT names, but not the avg() of their values.
With the following query I get the avg() of all the data:
select floor(avg(value)) from data
I know this is incorrect, but I am new to SQL. I want select floor(avg(value)) computed for each distinct name.
Data :
insert into data values(10, 'mnciitbhu');
insert into data values(20, 'mnciitbhu');
insert into data values(40, 'mafiya69');
insert into data values(20, 'mafiya69');
insert into data values(0, 'mafiya69');
Output :
mnciitbhu 15
mafiya69 20

Adding this because the other answers, while accurate, are not detailed.
What you want to do here is use the grouping and aggregation features of SQL.
Grouping your results by particular fields divides the result set into discrete sections, which you can operate on with aggregate functions to get averages, sums, counts, etc. per group.
For a full list of aggregate functions and other information about group by, you can read 12.16.1 GROUP BY (Aggregate) Functions.
In your instance, since you want the average per name, you need to group by name. This gives the following query:
select name, avg(value)
from `data`
group by name; -- this is the important line
This query calculates the average of value for each group of names in your table, returning one row per group.
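As a minimal sketch, the query can be checked against the sample rows from the question. The floor() wrapper comes from the question's own attempt, the ORDER BY is added only to make the row order predictable, and the script assumes a scratch database in which no table named data exists yet:

```sql
-- Table and rows as given in the question.
CREATE TABLE data (
  value INTEGER,
  name  VARCHAR(100)
);

INSERT INTO data VALUES (10, 'mnciitbhu');
INSERT INTO data VALUES (20, 'mnciitbhu');
INSERT INTO data VALUES (40, 'mafiya69');
INSERT INTO data VALUES (20, 'mafiya69');
INSERT INTO data VALUES (0,  'mafiya69');

-- One row per name; FLOOR() drops the decimals that AVG() returns.
SELECT name, FLOOR(AVG(value)) AS average
FROM data
GROUP BY name
ORDER BY name;
-- mafiya69   20
-- mnciitbhu  15
```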
One very important consideration when using group by is that every field in the select list must either appear in the group by clause or be used in an aggregate function. If you refer to a field that isn't covered by this, you may end up with undesired, indeterminate results.
From the manual 12.16.3 MySQL Handling of GROUP BY
MySQL extends the use of GROUP BY so that the select list can refer to
nonaggregated columns not named in the GROUP BY clause. This means
that the preceding query is legal in MySQL. You can use this feature
to get better performance by avoiding unnecessary column sorting and
grouping. However, this is useful primarily when all values in each
nonaggregated column not named in the GROUP BY are the same for each
group. The server is free to choose any value from each group, so
unless they are the same, the values chosen are indeterminate.
The importance of that paragraph cannot be overstated. It is very easy to misunderstand how this works and arrive at a query that seems to give the desired result but occasionally returns incorrect or undesired data.
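If you would rather have MySQL reject such queries than return indeterminate values, the ONLY_FULL_GROUP_BY SQL mode does that (it is enabled by default from MySQL 5.7.5). A rough sketch using the data table from the question; note that the SET statement below replaces all other SQL modes for the current session, which may not be what you want on a real server:

```sql
-- Same table as in the question; created here only so the sketch is self-contained.
CREATE TABLE IF NOT EXISTS data (
  value INTEGER,
  name  VARCHAR(100)
);

-- Assumption: it is acceptable to overwrite the session's other SQL modes.
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';

-- With the mode enabled, a query like the next one is rejected, because
-- value is neither in the GROUP BY clause nor inside an aggregate function:
-- SELECT name, value FROM data GROUP BY name;

-- This one stays valid: name is grouped, value is aggregated.
SELECT name, AVG(value)
FROM data
GROUP BY name;
```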

Use this code:
select name,AVG(value) as Average from data
group by name
order by name desc
OUTPUT:
name Average
mnciitbhu 15
mafiya69 20

Try this
select name,avg(value) from data group by name

Related

in mysql my distinct query not working?

I'm using
SELECT DISTINCT(name),date,reporting,leaving from attendance where date='2016-09-01'
and I'm still getting repeating names. Why?
When using DISTINCT, MySQL uses all selected columns as the grouping factor. If you want to group by only one column and still get the corresponding values of the other columns, use GROUP BY instead:
SELECT name, date, reporting, leaving FROM attendance WHERE ... GROUP BY name
Actually, all your rows have distinct data apart from the Name column. If you want only distinct names, you can get them with the help of aggregate functions; you can use MIN or MAX as per your business requirement:
SELECT Name,MAX(date),MAX(reporting),MAX(leaving)
FROM attendance
WHERE date='2016-09-01'
GROUP BY Name
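To see the difference on a small scale, here is a sketch with made-up rows. The column types (DATE, TIME) and the sample names are assumptions, since the question does not show the table definition:

```sql
-- Hypothetical definition; only the column names come from the question.
CREATE TABLE attendance (
  name      VARCHAR(50),
  date      DATE,
  reporting TIME,
  leaving   TIME
);

INSERT INTO attendance VALUES ('alice', '2016-09-01', '09:00:00', '12:00:00');
INSERT INTO attendance VALUES ('alice', '2016-09-01', '13:00:00', '18:00:00');
INSERT INTO attendance VALUES ('bob',   '2016-09-01', '10:00:00', '17:00:00');

-- DISTINCT compares whole rows, so all three rows come back:
-- the two 'alice' rows differ in reporting and leaving.
SELECT DISTINCT name, date, reporting, leaving
FROM attendance
WHERE date = '2016-09-01';

-- Grouping by name and aggregating the rest gives one row per name.
SELECT name,
       MIN(reporting) AS first_reporting,
       MAX(leaving)   AS last_leaving
FROM attendance
WHERE date = '2016-09-01'
GROUP BY name;
```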

Understanding correlation in mysql

I have a table with duplicate IDs representing a person who has placed an order. Each of these orders has a date. Each order has a status code from 1 - 4. 4 means a cancelled order. I am using the following query:
SELECT
personID, MAX(date), status
FROM
orders
WHERE
status = 4
GROUP BY
personID
The problem is, while this DOES return a unique record for each person with their most recent order date, it does NOT give me the correct status. In other words, I assumed that the status would be correctly correlated to the MAX(date) and it is not. It simply pulls, seemingly at random, one of the statuses from one of the orders. Can I add specificity to say, in basic terms, give me the EXACT status from the same record as whatever the MAX(date) is.
Unfortunately, there is no simple way to get what you want. Most other RDBMS vendors don't even consider queries using aggregate functions valid unless all non-aggregated result fields are in the GROUP BY. The general solution for these kinds of questions usually involves a subquery to get the "last" records, which is then joined to the original table to get those rows.
Depending on the structure of your data this may or may not be possible. For instance, if you have multiple rows with the same personID and date there is no way to determine from those alone which one's status should be used.
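The subquery-plus-join approach described above might look like the following sketch. The column types and sample rows are assumptions, and the names latest and last_date are made up for illustration:

```sql
-- Hypothetical definition; only the column names come from the question.
CREATE TABLE orders (
  personID INT,
  date     DATE,
  status   INT
);

INSERT INTO orders VALUES (1, '2016-01-01', 1);
INSERT INTO orders VALUES (1, '2016-02-01', 4);
INSERT INTO orders VALUES (2, '2016-03-01', 2);

-- Step 1 (derived table): the most recent date per person.
-- Step 2 (join): fetch the full row that carries that date,
-- so status comes from the same record as MAX(date).
SELECT o.personID, o.date, o.status
FROM orders AS o
JOIN (
  SELECT personID, MAX(date) AS last_date
  FROM orders
  GROUP BY personID
) AS latest
  ON latest.personID = o.personID
 AND latest.last_date = o.date;
-- With the rows above: (1, 2016-02-01, 4) and (2, 2016-03-01, 2).
```

If a person has two orders on the same latest date, both rows are returned, which is the ambiguity mentioned above.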
To get the result you want, you could use:
SELECT personId, date, status
FROM orders
WHERE (personID,date) IN (SELECT personID, MAX(date)
FROM orders
-- WHERE status = 4
GROUP BY personID);
As for:
It simply pulls, seemingly at random, one of the statuses from one of the orders.
It works as intended:
MySQL extends the use of GROUP BY so that the select list can refer to
nonaggregated columns not named in the GROUP BY clause. This means
that the preceding query is legal in MySQL. You can use this feature
to get better performance by avoiding unnecessary column sorting and
grouping. However, this is useful primarily when all values in each
nonaggregated column not named in the GROUP BY are the same for each
group. The server is free to choose any value from each group, so
unless they are the same, the values chosen are indeterminate
Related: Group by clause in mySQL and postgreSQL, why the error in postgreSQL?

How do I use MAX() to return the row that has the max value?

I have table orders with fields id, customer_id and amt:
SQL Fiddle
And I want to get the customer_id with the largest amt, along with the value of that amt.
I wrote the query:
SELECT customer_id, MAX(amt) FROM orders;
But the result of this query contained an incorrect value of customer_id.
Then I built this query:
SELECT customer_id, MAX(amt) AS maximum FROM orders GROUP BY customer_id ORDER BY maximum DESC LIMIT 1;
and got the correct result.
But I do not understand why my first query did not work properly. What am I doing wrong?
And is it possible to change my second query to obtain the information I need in a simpler and more competent way?
MySQL will allow you to leave GROUP BY off of a query, thus returning the MAX(amt) in the entire table with an arbitrary customer_id. Most other RDBMS require the GROUP BY clause when using an aggregate.
I don't see anything wrong with your 2nd query -- there are other ways to do it, but yours will work fine.
Some versions of SQL give you a warning or error when you select a field, have an aggregate operator like MAX or SUM, and the field you are selecting does not appear in GROUP BY.
You need a more complicated query to fetch the customer_id corresponding to the max amt. Unfortunately SQL is not as naive as you think. One such way to do this is:
select customer_id from orders where amt = ( select max(amt) from orders);
Although a solution using joins is likely more performant.
To understand why what you were trying to do doesn't make sense, replace MAX with SUM. From the stance of how aggregate operators are interpreted, it's a mere coincidence that MAX returns something that corresponds to an actual row. SUM does not have this property, for instance.
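A small sketch of that point, using invented rows (the column types are assumptions; the question only names the fields id, customer_id and amt):

```sql
-- Hypothetical definition and data for illustration.
CREATE TABLE orders (
  id          INT,
  customer_id INT,
  amt         INT
);

INSERT INTO orders VALUES (1, 10, 100);
INSERT INTO orders VALUES (2, 20, 300);
INSERT INTO orders VALUES (3, 10, 250);

-- SUM(amt) is 650 here, a value that belongs to no single row,
-- so there is no "matching" customer_id for an aggregate to return.
SELECT SUM(amt) FROM orders;

-- To tie the maximum back to a row, filter on it explicitly.
SELECT customer_id, amt
FROM orders
WHERE amt = (SELECT MAX(amt) FROM orders);
-- With the rows above: (20, 300).
```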
Practically your first query can be seen as if it were GROUP BY-ed into a big single group.
Also, MySQL is free to choose each output value from different source rows from the same group.
http://dev.mysql.com/doc/refman/5.7/en/group-by-extensions.html
MySQL extends the use of GROUP BY so that the select list can refer to
nonaggregated columns not named in the GROUP BY clause.
The server is free to choose any value from each group, so
unless they are the same, the values chosen are indeterminate.
Furthermore, the selection of values from each group cannot be
influenced by adding an ORDER BY clause. Sorting of the result set
occurs after values have been chosen, and ORDER BY does not affect
which values within each group the server chooses.
The problem with MAX() is that it selects the highest value of the specified field, considering that field alone. The other values in the same row are not considered or given any preference in the result. MySQL will usually return the values from the first row of the group (in this case the group is the entire table, since no GROUP BY was specified), although this is not guaranteed, and the information from the other rows is dropped during the aggregation.
To solve this, you could do this:
SELECT customer_id, amt FROM orders ORDER BY amt DESC LIMIT 1
It should return the customer_id and the highest amt while preserving the relation between the two, because no aggregation is performed.

Mysql group by clause with multiple selects

I have 2 columns in my product table, name and brand. Given is the data:
NAME BRAND
'Ruby Axe Guitar', 'Guitar''s & Co'
'TV' , 'LG'
When I tried this query, it worked fine:
select name,brand, sum(1000) as sum,'Test' as name1
from products
group by name,brand
but I was surprised that even when I don't include brand in the group by clause, the query works fine:
select name,brand, sum(1000) as sum,'Test' as name1
from products
group by name
Can someone explain?
You cannot reliably select an ungrouped column without an aggregate function; MySQL will give you an arbitrary value. I guess you got lucky with the second query.
Because NAME is already unique in your data, GROUP BY NAME is the same as GROUP BY NAME, OTHER_FIELD.
If NAME is unique, then its combination with any other column is unique too.
MySQL is a lot less strict than it should be IMHO. According to the actual SQL specification, any non-grouped column needs an aggregate function in a query containing a GROUP BY clause.
MySQL will allow retrieving non-grouped columns without such aggregate functions, returning an arbitrary value. They have an explanation of this choice in their documentation:
MySQL extends the use of GROUP BY so that the select list can refer to
nonaggregated columns not named in the GROUP BY clause. This means
that the preceding query is legal in MySQL. You can use this feature
to get better performance by avoiding unnecessary column sorting and
grouping. However, this is useful primarily when all values in each
nonaggregated column not named in the GROUP BY are the same for each
group. The server is free to choose any value from each group, so
unless they are the same, the values chosen are indeterminate.
I believe it's intended as an ignored oversight. Typically, you would be required to include any non-aggregate column in the "GROUP BY" clause. However, MySQL basically grabs the first entry it encounters for any column that is not part of the aggregate.
This can be OK, for example when querying columns that you know won't vary no matter how many records fall in the corresponding group. For example, say you want a list of customers and their total orders. The orders table has a customer ID that joins to the customer table. So you can do a SUM( Orders.Amount ) and still get the customer ID, Name, Address, Phone. Since the join is on the customer ID, the corresponding name, address, etc. will never change within a group and thus need not appear in the group by. Just group by the customer ID.
So, MySQL won't choke on you if you inadvertently leave out a column...
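A sketch of that customers/orders case, with hypothetical table and column names (the answer does not give a schema). Declaring customers.id as the primary key is also what lets MySQL 5.7.5 and later accept the query with ONLY_FULL_GROUP_BY enabled, since name is then functionally dependent on the grouped column:

```sql
-- Hypothetical schema for illustration.
CREATE TABLE customers (
  id   INT PRIMARY KEY,
  name VARCHAR(100)
);

CREATE TABLE orders (
  id          INT PRIMARY KEY,
  customer_id INT,
  amount      DECIMAL(10, 2)
);

INSERT INTO customers VALUES (1, 'Ann'), (2, 'Ben');
INSERT INTO orders VALUES (1, 1, 10.00), (2, 1, 15.00), (3, 2, 7.50);

-- name is not in the GROUP BY, but it cannot vary within a group,
-- because every row in a group shares the same customers.id.
SELECT c.id, c.name, SUM(o.amount) AS total
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.id
GROUP BY c.id;
-- With the rows above: (1, 'Ann', 25.00) and (2, 'Ben', 7.50).
```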

MySQL GROUP BY returns only first row

I have a table named forms with the following structure-
GROUP | FORM | FILEPATH
====================================
SomeGroup | SomeForm1 | SomePath1
SomeGroup | SomeForm2 | SomePath2
------------------------------------
I use the following query-
SELECT * FROM forms GROUP BY 'GROUP'
It returns only the first row-
GROUP | FORM | FILEPATH
====================================
SomeGroup | SomeForm1 | SomePath1
------------------------------------
Shouldn't it return both (or all of it)? Or am I (possibly) wrong?
As the manual states:
In standard SQL, a query that includes a GROUP BY clause cannot refer to nonaggregated columns in the select list that are not named in the GROUP BY clause. For example, this query is illegal in standard SQL because the name column in the select list does not appear in the GROUP BY:
SELECT o.custid, c.name, MAX(o.payment)
FROM orders AS o, customers AS c
WHERE o.custid = c.custid
GROUP BY o.custid;
For the query to be legal, the name column must be omitted from the select list or named in the GROUP BY clause.
MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate.
In your case, MySQL is correctly performing the grouping operation, but (since you select all columns, including those by which you are not grouping) it gives you one indeterminate record from each group.
It only returns one row, because the values of your GROUP column are the same ... that's basically how GROUP BY works.
Btw, when using GROUP BY it's good form to use aggregate functions for the other columns, such as COUNT(), MIN(), MAX(). In MySQL it usually returns the first row of each group if you just specify the column names; other databases will not like that though.
Your code:
SELECT * FROM forms GROUP BY 'GROUP'
isn't very "good" SQL; MySQL lets you get away with it and returns only one value for each column not mentioned in the group by clause. Almost any other database would not perform this query. As a rule, any column that is not part of the grouping condition must be used with an aggregate function.
As far as MySQL is concerned, I just solved my problem by trial and error.
I had the same problem 10 minutes ago. I was using a MySQL statement something like this:
SELECT * FROM forms GROUP BY 'ID'; -- returns only one row
However, using a statement like the following would yield the same result:
SELECT ID FROM forms GROUP BY 'ID'; -- returns only one row
The following was my solution:
SELECT ID FROM forms GROUP BY ID; -- returns more than one row (with one column of field "ID") grouped by ID
or
SELECT * FROM forms GROUP BY ID; -- returns more than one row (with columns of all fields) grouped by ID
or
SELECT * FROM forms GROUP BY `ID`; -- returns more than one row (with columns of all fields) grouped by ID
Lesson: Do not put single quotes around the column name; I believe MySQL then treats it as a string rather than as a column. Remove the quotes from the column name and it will group by its value. However, you can use backtick escapes, e.g. `ID`.
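A minimal sketch of the quoting difference, with an invented forms table (the columns and rows are assumptions for illustration). A single-quoted 'ID' is a string literal, so every row gets the same grouping key; an unquoted or backtick-quoted ID refers to the column:

```sql
-- Hypothetical table and data.
CREATE TABLE forms (
  ID   INT,
  FORM VARCHAR(50)
);

INSERT INTO forms VALUES (1, 'a'), (2, 'b'), (2, 'c');

-- Groups by the constant string 'ID': all rows fall into one group.
SELECT COUNT(*) AS rows_in_group
FROM forms
GROUP BY 'ID';

-- Groups by the column ID: one group per distinct ID value.
SELECT `ID`, COUNT(*) AS rows_in_group
FROM forms
GROUP BY `ID`;
-- With the rows above: (1, 1) and (2, 2).
```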
Thank you everyone for pointing out the obvious mistake I was too blind to see. I finally replaced GROUP BY with ORDER BY and included a WHERE clause to get my desired result. That is what I was intending to use all along. Silly me.
My final query becomes this-
SELECT * FROM forms WHERE `GROUP`='SomeGroup' ORDER BY `GROUP`
SELECT * FROM forms GROUP BY `GROUP`
it's strange that your query does work
The above result is kind of correct, but not quite.
All columns you select that are not part of the GROUP BY clause have to be aggregated by some function (see the list of aggregate functions in the MySQL documentation). Most often they are used together with numeric columns.
Besides this, your query returns one output row for every distinct (combination of) values in the columns referenced in the GROUP BY clause. In your case there is just one distinct value in the GROUP column, namely "SomeGroup", so the output contains only one row for this value.
A GROUP BY clause should only be required if you have group functions, say max, min, avg, sum, etc., applied in the query expressions. Your query does not show any such functions, meaning you do not actually require a GROUP BY clause. And if you still use such a clause, you will receive only the first record from each group of results.
Hence the output of your query is perfect.
Query result is perfect; it will return only one row.