Getting average of selected values in mysql - mysql

I have a table structure
id group name points
1 1 a 10
2 1 b 9
3 2 c 7
and so on..
I am writing a query which gives me an array of names and the avg of all the points for the selected rows where group matches the value.
For group_list = [1] I want a result like this: [name: ['a','b'], median: [9.5]]
I have tried like this
$group_list = [1];
createQueryBuilder()
->select('x.name as name, x.AVG(points) as median')
->from('myTable', 'x')
->where('x.group IN(:groupList)')
->setParameter('groupList', $group_list)
->getQuery()
->getResult();
Need some help with this

You are combining 2 distinct requirements into a single sql statement and this causes the problem.
The average of points is a single value per group or per all records, while the names are a list. You can combine the 2 into a single query by repeating the averages across the names, however, it just generates an overhead.
I would simply run a query to get the list of usernames and a separate one to get the average points (either grouped by groups or across all groups, this is not clear from the question).
This solution is so simple, that I do not think I need to provide any code.
Alternatively, you can use MySQL's group_concat() function to get the list of names per group as a single comma-separated value (you can use any other separator in place of the comma). In this case it is more worthwhile to combine the 2 in a single query:
select group_concat(`name`) as names, avg(`points`) as median
from mytable
where `group` in (...)
If you want names from more than one group, then add the group field to the select and group by lists:
select `group`, group_concat(`name`) as names, avg(`points`) as median
from mytable
where `group` in (...)
group by `group`
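A minimal sketch of the grouped query above, run from Python against an in-memory SQLite database as a stand-in for MySQL (an assumption made only so the example is runnable; group_concat() and avg() exist in both). The helper name names_and_average is hypothetical, and the rows are the three shown in the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; "group" is quoted because it is a keyword.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE mytable (id INTEGER, "group" INTEGER, name TEXT, points INTEGER)')
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [(1, 1, "a", 10), (2, 1, "b", 9), (3, 2, "c", 7)],
)

def names_and_average(group_list):
    """Return {group: (sorted names, average points)} for the requested groups."""
    placeholders = ", ".join("?" for _ in group_list)
    rows = conn.execute(
        f'SELECT "group", group_concat(name), avg(points) '
        f'FROM mytable WHERE "group" IN ({placeholders}) GROUP BY "group"',
        list(group_list),
    ).fetchall()
    # group_concat() returns one comma-separated string per group; split it back into a list.
    return {g: (sorted(names.split(",")), average) for g, names, average in rows}

result = names_and_average([1])
print(result)
```

The names come back as one string per group, so the caller still has to split them, which is the price of fetching the list and the average in one round trip.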

You should add a group by
->groupBy('x.`group`')

Related

Group by clause and list

GROUP BY: If you want, you can take the rows that remain after WHERE and put them in groups or buckets, where each group contains the same value for the GROUP BY expression (and all the other rows are put in a list for that group). In Java, you would get something like: Map<String, List<Row>>. If you do specify a GROUP BY clause, then your actual rows contain only the group columns, no longer the remaining columns, which are now in that list. Those columns in the list are only visible to aggregate functions that can operate upon that list.
This above paragraph was taken from: https://blog.jooq.org/a-beginners-guide-to-the-true-order-of-sql-operations/
We have a table named student that has the following fields :
Student_id
Student_name
Student_marks
Student_branch
I write a query as:
select sum(student_marks) from student
group by student_branch;
So according to the paragraph, I am grouping the rows by student_branch.
So what group is created?
Is there a group that contains all of the student_branch values?
Also I couldn't get the meaning of this sentence '(and all the other rows are put in a list for that group).'
Can anyone please explain how does group by actually work and then how do the aggregate functions work on those groups.
In your example you would have a separate group for each distinct value in the column student_branch. So if you have student_branch = A for 5 students and student_branch = B for 3 students, then you would get 2 groups:
A with 5 records
B with 3 records
When you use aggregate functions, they will operate on all records within one group. So SUM(student_marks) will add all student marks of the students in group A separately, and also in group B separately.
In your sample query you will get 2 aggregated result rows, with only the sums of the marks. The result would be more meaningful, if you include the student_branch in the SELECT clause, like:
SELECT student_branch, SUM(student_marks) AS sum_of_marks FROM student
GROUP BY student_branch;
Then the result would look like:
student_branch
sum_of_marks
A
15 (random number, = sum of marks for A students)
B
8 (random number, = sum of marks for B students)
The GROUP BY will only group the records which remain after filtering the data with the WHERE clause. In your example case there is no WHERE clause, so it will group the records of the whole table by student branch.
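To make the grouping concrete, here is a small sketch using Python's sqlite3 module as a stand-in for your database. The student rows and marks are made up for illustration (5 students in branch A, 3 in branch B), chosen so the sums match the example result above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER, student_name TEXT, "
    "student_marks INTEGER, student_branch TEXT)"
)
# Hypothetical data: 5 students in branch A, 3 in branch B.
students = [
    (1, "s1", 1, "A"), (2, "s2", 2, "A"), (3, "s3", 3, "A"),
    (4, "s4", 4, "A"), (5, "s5", 5, "A"),
    (6, "s6", 2, "B"), (7, "s7", 3, "B"), (8, "s8", 3, "B"),
]
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?)", students)

# One result row per distinct student_branch; the aggregates run inside each group.
branch_sums = conn.execute(
    "SELECT student_branch, COUNT(*), SUM(student_marks) "
    "FROM student GROUP BY student_branch ORDER BY student_branch"
).fetchall()

# WHERE filters rows first, so only the remaining rows are grouped.
filtered_sums = conn.execute(
    "SELECT student_branch, COUNT(*), SUM(student_marks) "
    "FROM student WHERE student_marks >= 3 "
    "GROUP BY student_branch ORDER BY student_branch"
).fetchall()

print(branch_sums)
print(filtered_sums)
```

The second query shows the point about WHERE: the groups are built only from the rows that survive the filter, so both the counts and the sums shrink.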

Mysql DISTINCT with more than one column (remove duplicates)

My database is called: (training_session)
I am trying to print out some information from my data, but I do not want any duplicates. I still get them somehow; can someone tell me what I am doing wrong?
SELECT DISTINCT athlete_id AND duration FROM training_session
SELECT DISTINCT athlete_id, duration FROM training_session
It works perfectly if I use only one column, but when I add another, it does not work.
I think you misunderstood the use of DISTINCT.
There is big difference between using DISTINCT and GROUP BY.
Both have a similar goal, but they serve different purposes.
You use DISTINCT if you want to show a set of columns without repeating any row. That means you don't care about calculations or aggregate functions. DISTINCT will show different results as you add more columns to your SELECT, because it applies to the combination of all selected columns.
You use GROUP BY if you want one row per distinct value of certain selected columns and you use group functions to calculate data related to them. Therefore you use GROUP BY if you want to use group functions.
Please check group functions you can use in this link.
https://dev.mysql.com/doc/refman/8.0/en/group-by-functions.html
EDIT 1:
It seems like you are trying to get the "latest" of a certain athlete, I'll assume the current scenario if there is no ID.
Here is my alternate solution:
SELECT a.athlete_id ,
( SELECT b.duration
FROM training_session as b
WHERE b.athlete_id = a.athlete_id -- connect
ORDER BY [latest column to sort] DESC
LIMIT 1
) last_duration
FROM training_session as a
GROUP BY a.athlete_id
ORDER BY a.athlete_id
This is a correlated scalar subquery in the SELECT list. With the help of LIMIT 1, it returns only the topmost record. A subquery used this way must return at most one row, or else MySQL raises an error.
MySQL's DISTINCT clause is used to filter out duplicate rows from the result set.
If your query was SELECT DISTINCT athlete_id FROM training_session then your output would be:
athlete_id
----------
1
2
3
4
5
6
As soon as you add another column to your query (in your example, the column called duration), DISTINCT applies to the combination of both columns, so every distinct (athlete_id, duration) pair is returned. Hence the results you're getting. In other words, the query is working correctly.
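A quick way to see this is to run both queries on a few rows. The sketch below uses Python's sqlite3 module as a stand-in for MySQL, and the four rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE training_session (athlete_id INTEGER, duration INTEGER)")
# Hypothetical rows: athlete 1 has two sessions of 30 and one of 45.
conn.executemany(
    "INSERT INTO training_session VALUES (?, ?)",
    [(1, 30), (1, 30), (1, 45), (2, 30)],
)

# DISTINCT on one column: one row per athlete.
one_column = conn.execute(
    "SELECT DISTINCT athlete_id FROM training_session ORDER BY athlete_id"
).fetchall()

# DISTINCT on two columns: one row per (athlete_id, duration) pair.
two_columns = conn.execute(
    "SELECT DISTINCT athlete_id, duration FROM training_session "
    "ORDER BY athlete_id, duration"
).fetchall()

print(one_column)
print(two_columns)
```

Athlete 1 appears twice in the second result because (1, 30) and (1, 45) are different pairs; only the exact duplicate (1, 30) is removed.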

Multiple counting in 1 sql statement

Let's say I have a table with a column of ages.
Here is the list of ages
1
2
3
1
1
3
I want the SQL to count how many of age 1s, how many of 2s and 3s.
The code:
Select count(age) as age1 where age = '1';
Select count(age) as age2 where age = '2';
Select count(age) as age3 where age = '3';
Should work but would there be a way to just display it all using only 1 line of code?
This is an instance where the GROUP BY clause really shines:
SELECT age, COUNT(age)
FROM table_name
GROUP BY age
Just an additional tip:
You shouldn't use single quotes here in your query:
WHERE age = '1';
This is because age is an INT data type, and numeric literals are written without single quotes. MySQL will implicitly convert the quoted value to the correct data type for you, and it's a negligible amount of overhead here. But imagine if you were doing a JOIN of two tables with millions of rows; then the overhead introduced would be something to consider.
Try this if the count is limited to three ages. Using aggregate functions without grouping results in a single row. You can use SUM() with a condition, which evaluates to a boolean (1 or 0 in MySQL), to get the count for each of your criteria:
Select SUM(age = '1') as age1,
SUM(age = '2') as age2,
SUM(age = '3') as age3
from table
SELECT SUM(CASE WHEN age = 1 THEN 1 ELSE 0 END) AS age1,
SUM(CASE WHEN age = 2 THEN 1 ELSE 0 END) AS age2,
SUM(CASE WHEN age = 3 THEN 1 ELSE 0 END) AS age3
FROM YourTable
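A small runnable sketch of the SUM(CASE ...) approach, using Python's sqlite3 module as a stand-in for MySQL and the six ages from the question. The table name ages is an assumption for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "ages" is a hypothetical table name; the question does not give one.
conn.execute("CREATE TABLE ages (age INTEGER)")
conn.executemany("INSERT INTO ages VALUES (?)", [(a,) for a in [1, 2, 3, 1, 1, 3]])

# No GROUP BY: the aggregates run over the whole table and yield a single row.
age_counts = conn.execute(
    "SELECT SUM(CASE WHEN age = 1 THEN 1 ELSE 0 END) AS age1, "
    "       SUM(CASE WHEN age = 2 THEN 1 ELSE 0 END) AS age2, "
    "       SUM(CASE WHEN age = 3 THEN 1 ELSE 0 END) AS age3 "
    "FROM ages"
).fetchone()

print(age_counts)
```

This gives one row with one column per age, which is handy when the set of ages is small and fixed; for an open-ended set of ages, GROUP BY scales better.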
If your query should return only one column (age in this case, you can use Count+groupby):
SELECT age, Count(1) as qty
FROM [yourTable]
GROUP BY age
Remember you must include any additional column in your group by condition.
Select age as Age_Group, count(age) as Total_count from table1 group by age;
select age, count(age) from SomeTable group by age
http://sqlfiddle.com/#!2/b40da/2
The group by clause works like this:
When using aggregate functions, like the count function, without a group by clause, the function will apply to the entire dataset determined by the from and where clauses. A count will for instance count the number of rows in the result set, and a sum over a specific column will sum all the rows in the result set.
What the group by clause allows us to do is to divide the result set determined by the from and where clauses into partitions, so that the aggregate functions no longer apply to the result set as a whole, but rather within each partition of the result set.
When you specify a column to group by, what you are saying is something like "for each distinct value of column x in the result set, create a partition containing any row in the result set with this particular value in column x". Then, instead of yielding one result covering the entire resultset, aggregate functions will yield one result for each distinct value of column x in the result set.
With your example input of:
1
2
3
1
1
3
let's analyze the above query. As always, we should look at the from clause and the where clause first. The from clause tells us that we are selecting from SomeTable and only this, and the lack of a where clause tells us that we are selecting from the full contents of SomeTable.
Next, we'll look at the group by clause. It's present, and it groups by the age column, which is the only column in our example. The presence of the group by clause changes our dataset completely! Instead of selecting from the entire row set of SomeTable, we are now selecting from a set of partitions, one for each distinct value of the age-column in our original result set (which was every row in SomeTable).
At last, we'll look at the select-clause. Now, since we are selecting from partitions and not regular rows, the select-clause has fewer options for what it can contain, actually it only has 2: The column that it is grouped by, or an aggregate function.
Now, in our example we only have one column, but consider that we had another column, like here:
http://sqlfiddle.com/#!2/d5479/2
Now, imagine that in our data set we have two rows, both with age='1', but with different values in the other column. If we were to include this other column in a query that is grouped by the age-column (which we now know will return one row for each partition over the age-column), which value should be presented in the result? It makes no sense to include any column other than the one you grouped by. (I'll leave multiple columns in the group by clause out of this; in my experience one usually just wants one.)
But back to our select-clause: knowing our dataset has the distinct values {1, 2, 3} in the age-column, we should expect to get 3 rows in our result set. The first thing to be selected is the age-column, which will yield the values [1, 2, 3] in the three rows. Next in the select-list is an aggregate function, count(age), which we now know will count the number of rows in each partition. So, for the row in the result where age='1', it will count the number of rows with age='1', for the row where age='2' it will count the number of rows where age='2', and so on.
The result would look something like this:
age count(age)
1 3
2 1
3 2
(of course you are free to override the name of the second column in the result, with the as-operator..)
And that concludes today's lesson.
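For readers who prefer to see the partitioning spelled out, here is a rough sketch in Python. It first builds the partitions by hand with a dictionary, then runs the same group by query through the sqlite3 module (used here as a stand-in for MySQL) and shows that both give the same counts.

```python
import sqlite3
from collections import defaultdict

ages = [1, 2, 3, 1, 1, 3]

# Step 1: split the rows into one partition per distinct age.
partitions = defaultdict(list)
for age in ages:
    partitions[age].append(age)

# Step 2: apply the aggregate (here a count) inside each partition.
manual_counts = sorted((age, len(rows)) for age, rows in partitions.items())

# The same thing expressed as SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SomeTable (age INTEGER)")
conn.executemany("INSERT INTO SomeTable VALUES (?)", [(a,) for a in ages])
sql_counts = conn.execute(
    "SELECT age, count(age) FROM SomeTable GROUP BY age ORDER BY age"
).fetchall()

print(manual_counts)
print(sql_counts)
```

The dictionary is only a mental model of what group by does; a real database is free to compute the groups in any way that gives the same result.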

mysql ORDER BY MIN() not matching up with id

I have a database that has the following columns:
-------------------
id|domain|hit_count
-------------------
And I would like to perform this query on it:
SELECT id,MIN(hit_count)
FROM table WHERE domain='$domain'
GROUP BY domain ORDER BY MIN(hit_count)
I would like this query to give me the id of the row that had the smallest hit_count for $domain. The only problem is that if I have two rows that have the same domain, say www.bestbuy.com, the query will just group by whichever one came first, and then although I will get the correct lowest hit_count, the id may or may not be the id of the row that has the lowest hit_count.
Does anyone know of a way for me to perform this query and to get the id that matches up with MIN(hit_count)? Thanks!
Try this:
SELECT id,MIN(hit_count),domain FROM table GROUP BY domain HAVING domain='$domain'
See, when you're using aggregates, either via aggregate functions (and min() is such a function) or via GROUP BY or HAVING operators, your data is being grouped. In your case it is grouped by domain. You have 2 fields in your select list, id and min(hit_count).
Now, for each group database knows which hit_count to pick, as you've specified this explicitly via the aggregate function. But what about id — which one should be included?
MySQL (when ONLY_FULL_GROUP_BY is disabled) is free to pick any value from the group for such fields, which I find an error-prone approach. Most other RDBMSes will give you an error for such a query.
The rule is: if you use aggregates, then all columns should be either arguments of aggregate functions or arguments of GROUP BY operator.
To achieve the desired result, you need a subquery:
SELECT id, domain, hit_count
FROM `table`
WHERE domain = '$domain'
AND hit_count = (SELECT min(hit_count) FROM `table` WHERE domain = '$domain');
I've used backticks, as table is a reserved word in SQL.
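Here is a minimal sketch of the subquery approach, run through Python's sqlite3 module as a stand-in for MySQL. The table name hits, the helper rows_with_lowest_hit_count, and the sample rows are all assumptions for illustration; bound parameters replace the '$domain' string interpolation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "hits" is a hypothetical table name; the question's table is literally called `table`.
conn.execute("CREATE TABLE hits (id INTEGER, domain TEXT, hit_count INTEGER)")
conn.executemany(
    "INSERT INTO hits VALUES (?, ?, ?)",
    [
        (1, "www.bestbuy.com", 12),
        (2, "www.bestbuy.com", 4),
        (3, "www.example.com", 1),
    ],
)

def rows_with_lowest_hit_count(domain):
    """Return every (id, domain, hit_count) row holding the minimum hit_count for the domain."""
    return conn.execute(
        "SELECT id, domain, hit_count FROM hits "
        "WHERE domain = ? "
        "AND hit_count = (SELECT MIN(hit_count) FROM hits WHERE domain = ?) "
        "ORDER BY id",
        (domain, domain),
    ).fetchall()

lowest = rows_with_lowest_hit_count("www.bestbuy.com")
print(lowest)
```

Note that if two rows of the same domain tie for the lowest hit_count, this returns both of them; add LIMIT 1 if you need exactly one.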
SELECT
id,
hit_count
FROM
table
WHERE
domain='$domain'
AND hit_count = (SELECT MIN(hit_count) FROM table WHERE domain='$domain')
Try this:
SELECT id,hit_count
FROM table WHERE domain='$domain'
GROUP BY domain ORDER BY hit_count ASC;
This should also work:
select id, MIN(hit_count) from table where domain="$domain";
I had the same question. Please see that question below.
min(column) is not returning me correct data of other columns
You are using a GROUP BY, which means each row in the result represents a group of values.
One of those values is the group name (the value of the field you grouped by). The rest are arbitrary values from within that group.
For example the following table:
F1 | F2
1 aa
1 bb
1 cc
2 gg
2 hh
If you group by F1: SELECT F1,F2 from T GROUP BY F1
You will get two rows:
1 and one value from (aa,bb,cc)
2 and one value from (gg,hh)
If you want a deterministic result set, you need to tell the software what algorithm to apply to the group. Several examples:
MIN
MAX
COUNT
SUM
etc etc
There is a simpler way: your query is OK, just modify it with the DESC keyword after GROUP BY domain
SELECT
id,
MIN(hit_count)
FROM table
WHERE domain = '$domain'
GROUP BY domain DESC
ORDER BY MIN(hit_count)
Explanation:
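When you use GROUP BY with a non-aggregated column, older MySQL versions tend to return the first record of the group they encounter, and with the DESC keyword that is often the last record of the group instead. This behaviour is not guaranteed, and the GROUP BY ... DESC syntax was removed in MySQL 8.0.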
For testing purposes, use this query, which only has group_concat added.
SELECT
group_concat(id),
MIN(hit_count)
FROM table
WHERE domain = '$domain'
GROUP BY domain DESC
ORDER BY MIN(hit_count)
If you can have duplicated domains group by id:
SELECT id,MIN(hit_count)
FROM domain WHERE domain='$domain'
GROUP BY id ORDER BY MIN(hit_count)

MS Access 2003 - ordering the string values for a listbox not alphabetical

Here is a silly question. Let's say I have a query that produces values for a list box, and it produces values for three stores
Store A 18
Store B 32
Store C 54
Now if I ORDER BY in the sql statement the only thing it will do is descending or ascending alphabetically but I want a certain order (only because THEY WANT A CERTAIN ORDER) .....so is there a way for me to add something to the SQL to get
Store B
Store C
Store A
i.e. basically row by row what i want. thanks!
Add a numeric field, sequencer, to the table which contains the store names. Use the sequencer values to determine your sort order.
SELECT sequencer, store_name FROM YourTable ORDER BY sequencer;
In the list box, set the column width = 0 for the sequencer column.
Or simply, as @dscarr suggested, don't include sequencer in the SELECT field list, but just include it in the ORDER BY ...
SELECT store_name FROM YourTable ORDER BY sequencer;
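As a rough illustration of the sequencer idea, the sketch below uses Python's sqlite3 module rather than Access (an assumption made only so the example is runnable). The val column and the sequencer values are made up to produce the order asked for in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# sequencer holds the desired display position; val is a hypothetical value column.
conn.execute("CREATE TABLE YourTable (store_name TEXT, val INTEGER, sequencer INTEGER)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [("Store A", 18, 3), ("Store B", 32, 1), ("Store C", 54, 2)],
)

# The sort column does not have to appear in the SELECT list.
ordered_names = [
    row[0]
    for row in conn.execute("SELECT store_name FROM YourTable ORDER BY sequencer")
]

print(ordered_names)
```

Changing the display order later is then a data change (update the sequencer values), not a change to the query behind the list box.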
You can do 1 of 2 things.
Either use a Switch expression, something like
SELECT Table1.Store,
Table1.Val,
Switch([Store]="StoreB",1,[Store]="StoreC",2,[Store]="StoreA",3) AS Expr1
FROM Table1
ORDER BY Switch([Store]="StoreB",1,[Store]="StoreC",2,[Store]="StoreA",3);
Or use a secondary order table, that stores the values of the store names, and an order by value.