How can I group by both date and another column? [duplicate] - mysql

I understand the point of GROUP BY x.
But how does GROUP BY x, y work, and what does it mean?

Group By X means put all those with the same value for X in the one group.
Group By X, Y means put all those with the same values for both X and Y in the one group.
To illustrate using an example, let's say we have the following table, to do with who is attending what subject at a university:
Table: Subject_Selection
+---------+----------+----------+
| Subject | Semester | Attendee |
+---------+----------+----------+
| ITB001 | 1 | John |
| ITB001 | 1 | Bob |
| ITB001 | 1 | Mickey |
| ITB001 | 2 | Jenny |
| ITB001 | 2 | James |
| MKB114 | 1 | John |
| MKB114 | 1 | Erica |
+---------+----------+----------+
When you use a group by on the subject column only; say:
select Subject, Count(*)
from Subject_Selection
group by Subject
You will get something like:
+---------+-------+
| Subject | Count |
+---------+-------+
| ITB001 | 5 |
| MKB114 | 2 |
+---------+-------+
...because there are 5 entries for ITB001, and 2 for MKB114
If we were to group by two columns:
select Subject, Semester, Count(*)
from Subject_Selection
group by Subject, Semester
we would get this:
+---------+----------+-------+
| Subject | Semester | Count |
+---------+----------+-------+
| ITB001 | 1 | 3 |
| ITB001 | 2 | 2 |
| MKB114 | 1 | 2 |
+---------+----------+-------+
This is because, when we group by two columns, it is saying "Group them so that all of those with the same Subject and Semester are in the same group, and then calculate all the aggregate functions (Count, Sum, Average, etc.) for each of those groups". In this example, this is demonstrated by the fact that, when we count them, there are three people doing ITB001 in semester 1, and two doing it in semester 2. Both of the people doing MKB114 are in semester 1, so there is no row for semester 2 (no data fits into the group "MKB114, Semester 2")
Hopefully that makes sense.

Here I am going to explain not only the GROUP clause use, but also the Aggregate functions use.
The GROUP BY clause is used in conjunction with the aggregate functions to group the result-set by one or more columns. e.g.:
-- GROUP BY with one parameter:
SELECT column_name, AGGREGATE_FUNCTION(column_name)
FROM table_name
WHERE column_name operator value
GROUP BY column_name;
-- GROUP BY with two parameters:
SELECT
column_name1,
column_name2,
AGGREGATE_FUNCTION(column_name3)
FROM
table_name
GROUP BY
column_name1,
column_name2;
Remember this order:
SELECT (is used to select data from a database)
FROM (clause is used to list the tables)
WHERE (clause is used to filter records)
GROUP BY (clause can be used in a SELECT statement to collect data
across multiple records and group the results by one or more columns)
HAVING (clause is used in combination with the GROUP BY clause to
restrict the groups of returned rows to only those whose the condition
is TRUE)
ORDER BY (keyword is used to sort the result-set)
You can use all of these if you are using aggregate functions, and this is the order that they must be set, otherwise you can get an error.
Aggregate Functions are:
MIN() returns the smallest value in a given column
MAX() returns the maximum value in a given column.
SUM() returns the sum of the numeric values in a given column
AVG() returns the average value of a given column
COUNT() returns the total number of values in a given column
COUNT(*) returns the number of rows in a table
SQL script examples about using aggregate functions:
Let's say we need to find the sale orders whose total sale is greater than $950. We combine the HAVING clause and the GROUP BY clause to accomplish this:
SELECT
orderId, SUM(unitPrice * qty) Total
FROM
OrderDetails
GROUP BY orderId
HAVING Total > 950;
Counting all orders and grouping them customerID and sorting the result ascendant. We combine the COUNT function and the GROUP BY, ORDER BY clauses and ASC:
SELECT
customerId, COUNT(*)
FROM
Orders
GROUP BY customerId
ORDER BY COUNT(*) ASC;
Retrieve the category that has an average Unit Price greater than $10, using AVG function combine with GROUP BY and HAVING clauses:
SELECT
categoryName, AVG(unitPrice)
FROM
Products p
INNER JOIN
Categories c ON c.categoryId = p.categoryId
GROUP BY categoryName
HAVING AVG(unitPrice) > 10;
Getting the less expensive product by each category, using the MIN function in a subquery:
SELECT categoryId,
productId,
productName,
unitPrice
FROM Products p1
WHERE unitPrice = (
SELECT MIN(unitPrice)
FROM Products p2
WHERE p2.categoryId = p1.categoryId)
The following will show you how to select the most recent date item "productDate", using MAX function in a subquery:
SELECT categoryId,
productId,
productName,
unitPrice,
productDate
FROM Products p1
WHERE productDate= (
SELECT MAX(productDate)
FROM Products p2
WHERE p2.categoryId = p1.categoryId)
The following statement groups rows with the same values in both categoryId and productId columns:
SELECT
categoryId, categoryName, productId, SUM(unitPrice)
FROM
Products p
INNER JOIN
Categories c ON c.categoryId = p.categoryId
GROUP BY categoryId, productId

In simple English from GROUP BY with two parameters what we are doing is looking for similar value pairs and get the count to a 3rd column.
Look at the following example for reference. Here I'm using International football results from 1872 to 2020
+----------+----------------+--------+---+---+--------+---------+-------------------+-----+
| _c0| _c1| _c2|_c3|_c4| _c5| _c6| _c7| _c8|
+----------+----------------+--------+---+---+--------+---------+-------------------+-----+
|1872-11-30| Scotland| England| 0| 0|Friendly| Glasgow| Scotland|FALSE|
|1873-03-08| England|Scotland| 4| 2|Friendly| London| England|FALSE|
|1874-03-07| Scotland| England| 2| 1|Friendly| Glasgow| Scotland|FALSE|
|1875-03-06| England|Scotland| 2| 2|Friendly| London| England|FALSE|
|1876-03-04| Scotland| England| 3| 0|Friendly| Glasgow| Scotland|FALSE|
|1876-03-25| Scotland| Wales| 4| 0|Friendly| Glasgow| Scotland|FALSE|
|1877-03-03| England|Scotland| 1| 3|Friendly| London| England|FALSE|
|1877-03-05| Wales|Scotland| 0| 2|Friendly| Wrexham| Wales|FALSE|
|1878-03-02| Scotland| England| 7| 2|Friendly| Glasgow| Scotland|FALSE|
|1878-03-23| Scotland| Wales| 9| 0|Friendly| Glasgow| Scotland|FALSE|
|1879-01-18| England| Wales| 2| 1|Friendly| London| England|FALSE|
|1879-04-05| England|Scotland| 5| 4|Friendly| London| England|FALSE|
|1879-04-07| Wales|Scotland| 0| 3|Friendly| Wrexham| Wales|FALSE|
|1880-03-13| Scotland| England| 5| 4|Friendly| Glasgow| Scotland|FALSE|
|1880-03-15| Wales| England| 2| 3|Friendly| Wrexham| Wales|FALSE|
|1880-03-27| Scotland| Wales| 5| 1|Friendly| Glasgow| Scotland|FALSE|
|1881-02-26| England| Wales| 0| 1|Friendly|Blackburn| England|FALSE|
|1881-03-12| England|Scotland| 1| 6|Friendly| London| England|FALSE|
|1881-03-14| Wales|Scotland| 1| 5|Friendly| Wrexham| Wales|FALSE|
|1882-02-18|Northern Ireland| England| 0| 13|Friendly| Belfast|Republic of Ireland|FALSE|
+----------+----------------+--------+---+---+--------+---------+-------------------+-----+
And now I'm going to group by similar country(column _c7) and tournament(_c5) value pairs by GROUP BY operation,
SELECT `_c5`,`_c7`,count(*) FROM res GROUP BY `_c5`,`_c7`
+--------------------+-------------------+--------+
| _c5| _c7|count(1)|
+--------------------+-------------------+--------+
| Friendly| Southern Rhodesia| 11|
| Friendly| Ecuador| 68|
|African Cup of Na...| Ethiopia| 41|
|Gold Cup qualific...|Trinidad and Tobago| 9|
|AFC Asian Cup qua...| Bhutan| 7|
|African Nations C...| Gabon| 2|
| Friendly| China PR| 170|
|FIFA World Cup qu...| Israel| 59|
|FIFA World Cup qu...| Japan| 61|
|UEFA Euro qualifi...| Romania| 62|
|AFC Asian Cup qua...| Macau| 9|
| Friendly| South Sudan| 1|
|CONCACAF Nations ...| Suriname| 3|
| Copa Newton| Argentina| 12|
| Friendly| Philippines| 38|
|FIFA World Cup qu...| Chile| 68|
|African Cup of Na...| Madagascar| 29|
|FIFA World Cup qu...| Burkina Faso| 30|
| UEFA Nations League| Denmark| 4|
| Atlantic Cup| Paraguay| 2|
+--------------------+-------------------+--------+
Explanation: The meaning of the first row is there were 11 Friendly tournaments held on Southern Rhodesia in total.
Note: Here it's mandatory to use a counter column in this case.

Related

MySQL GROUP_CONCAT with SUM() and multiple JOINs inside subquery

I'm very average with MySQL, but usually I can write all the needed queries after reading documentation and searching for examples. Now, I'm in the situation where I spent 3 days re-searching and re-writing queries, but I can't get it to work the exact way I need. Here's the deal:
1st table (mpt_companies) contains companies:
| company_id | company_title |
------------------------------
| 1 | Company A |
| 2 | Company B |
2nd table (mpt_payment_methods) contains payment methods:
| payment_method_id | payment_method_title |
--------------------------------------------
| 1 | Cash |
| 2 | PayPal |
| 3 | Wire |
3rd table (mpt_payments) contains payments for each company:
| payment_id | company_id | payment_method_id | payment_amount |
----------------------------------------------------------------
| 1 | 1 | 1 | 10.00 |
| 2 | 2 | 3 | 15.00 |
| 3 | 1 | 1 | 20.00 |
| 4 | 1 | 2 | 10.00 |
I need to list each company along with many stats. One of stats is the sum of payments in each payment method. In other words, the result should be:
| company_id | company_title | payment_data |
--------------------------------------------------------
| 1 | Company A | Cash:30.00,PayPal:10.00 |
| 2 | Company B | Wire:15.00 |
Obviously, I need to:
Select all the companies;
Join payments for each company;
Join payment methods for each payment;
Calculate sum of payments in each method;
GROUP_CONCAT payment methods and sums;
Unfortunately, SUM() doesn't work with GROUP_CONCAT. Some solutions I found on this site suggest using CONCAT, but that doesn't produce the list I need. Other solutions suggest using CAST(), but maybe I do something wrong because it doesn't work too. This is the closest query I wrote, which returns each company, and unique list of payment methods used by each company, but doesn't return the sum of payments:
SELECT *,
(some other sub-queries I need...),
(SELECT GROUP_CONCAT(DISTINCT(mpt_payment_methods.payment_method_title))
FROM mpt_payments
JOIN mpt_payment_methods
ON mpt_payments.payment_method_id=mpt_payment_methods.payment_method_id
WHERE mpt_payments.company_id=mpt_companies.company_id
ORDER BY mpt_payment_methods.payment_method_title) AS payment_data
FROM mpt_companies
Then I tried:
SELECT *,
(some other sub-queries I need...),
(SELECT GROUP_CONCAT(DISTINCT(mpt_payment_methods.payment_method_title), ':', CAST(SUM(mpt_payments.payment_amount) AS CHAR))
FROM mpt_payments
JOIN mpt_payment_methods
ON mpt_payments.payment_method_id=mpt_payment_methods.payment_method_id
WHERE mpt_payments.company_id=mpt_companies.company_id
ORDER BY mpt_payment_methods.payment_method_title) AS payment_data
FROM mpt_companies
...and many other variations, but all of them either returned query errors, either didn't return/format data I need.
The closest answer I could find was MySQL one to many relationship: GROUP_CONCAT or JOIN or both? but after spending 2 hours re-writing the provided query to work with my data, I couldn't do it.
Could anyone give me a suggestion, please?
You can do that by aggregating twice. First for the sum of payments per method and company and then to concatenate the sums for each company.
SELECT x.company_id,
x.company_title,
group_concat(payment_amount_and_method) payment_data
FROM (SELECT c.company_id,
c.company_title,
concat(pm.payment_method_title, ':', sum(p.payment_amount)) payment_amount_and_method
FROM mpt_companies c
INNER JOIN mpt_payments p
ON p.company_id = c.company_id
INNER JOIN mpt_payment_methods pm
ON pm.payment_method_id = p.payment_method_id
GROUP BY c.company_id,
c.company_title,
pm.payment_method_id,
pm.payment_method_title) x
GROUP BY x.company_id,
x.company_title;
db<>fiddle
Here you go
SELECT company_id,
company_title,
GROUP_CONCAT(
CONCAT(payment_method_title, ':', payment_amount)
) AS payment_data
FROM (
SELECT c.company_id, c.company_title, pm.payment_method_id, pm.payment_method_title, SUM(p.payment_amount) AS payment_amount
FROM mpt_payments p
JOIN mpt_companies c ON p.company_id = c.company_id
JOIN mpt_payment_methods pm ON pm.payment_method_id = p.payment_method_id
GROUP BY p.company_id, p.payment_method_id
) distinct_company_payments
GROUP BY distinct_company_payments.company_id
;

Combining MySQL querys

This SQL tells me how much when the maximum occurred in the last hour, and is easily modified to show the same for the minimum.
SELECT
mt.mB as Hr_mB_Max,
mt.UTC as Hr_mB_Max_when
FROM
thundersense mt
WHERE
mt.mB =(
SELECT
MAX(mB)
FROM
thundersense mt2
WHERE
mt2.UTC >(UNIX_TIMESTAMP() -3600))
ORDER BY
utc
DESC
LIMIT 1
How do I modify it so it returns both maximum & minimum and their respective times?
Yours Simon M.
Based on my understanding of your question, you are looking to create a 4 column and 1 row answer where it looks like:
+-------+-----------------+----------+-----------------+
| event | time_it_occured | event | time_it_occured |
+-------+-----------------+----------+-----------------+
| fun | 90000 | homework | 12000 |
+-------+-----------------+----------+-----------------+
Below is a similar situation/queries you can adapt for your situation.
So, given a table called 'people' that looks like:
+----+------+--------+
| ID | name | salary |
+----+------+--------+
| 1 | bob | 40000 |
| 2 | cat | 12000 |
| 3 | dude | 50000 |
+----+------+--------+
You can use this query:
SELECT * FROM
(SELECT name, salary FROM people WHERE salary = (SELECT MAX(salary) FROM people)) t JOIN
(SELECT name, salary FROM people WHERE salary = (SELECT MIN(salary) FROM people)) a;
to generate:
+------+--------+------+--------+
| name | salary | name | salary |
+------+--------+------+--------+
| bob | 40000 | cat | 12000 |
+------+--------+------+--------+
Some things to note:
you can change the WHERE clauses to be the ones you have mentioned in question (for MAX and MIN).
Please be careful with the above query, here I am using a cartesian join (cross join in MYSQL) in order to get the 4 columns. To be honest, it doesn't make sense for me to get back data in one row but you said that's what you're looking for.
Here is what I would work with instead, getting two tuples/rows back:
+----------+--------+
| name | salary |
+----------+--------+
| dude | 95000 |
| Cat | 12000 |
+----------+--------+
And to generate this, you would use:
(SELECT name, salary FROM instructor WHERE salary = (SELECT MAX(salary) FROM instructor))
UNION
(SELECT name, salary FROM instructor WHERE salary = (SELECT MIN(salary) FROM instructor));
Also: A JOIN without a ON clause is just a CROSS JOIN.
How to use mysql JOIN without ON condition?
One method uses a join:
SELECT mt.mB as Hr_mB_Max, mt.UTC as Hr_mB_Max_when
FROM thundersense mt JOIN
(SELECT MAX(mB) as max_mb, MIN(mb) as min_mb
FROM thundersense mt2
WHERE mt2.UTC >(UNIX_TIMESTAMP() - 3600)
) mm
ON mt.mB IN (mm.max_mb, mm.min_mb)
ORDER BY utc DESC;
My only concern is your limit 1. Presumably, the mBs should be unique. If not, there is a bit of a challenge. One possibility would be to use an auto-incremented id rather than mB.

Mysql return 0 when i use count

I have this table:
Notifications
+-------+--------+--------+
| id |category| notify |
+-------+--------+--------+
| 1 |products| yes|
| 2 |web | no |
| 3 |clients | yes|
| 4 |clients | yes|
+-------+--------+--------+
I need this result:
Category | Count |
products | 1|
web | 0|
clients | 2|
I tried
SELECT count(*), category
FROM Notifications
WHERE notify = "yes"
GROUP BY category
but I get
Category | Count |
products | 1|
clients | 2|
Thanks for any help.
One method to get all categories in Notifications is to use conditional aggregation:
SELECT n.category, sum(n.notify = 'no')
FROM Notifications n
GROUP BY n.category;
If you have a separate table of categories, then you can get all categories (even those without notifications) using left join:
select c.category, count(n.category)
from categories c left join
notifications n
on c.category = n.category and n.notify = 'no'
group by c.category;
And, you can also use the ubiquitous case statement to perform conditional aggregation.
Select category,
Sum(case when Notify = 'Yes' then 1 else 0 end) count,
FROM Notifications
GROUP BY category
SELECT count(*) AS no_category FROM Notifications
WHERE notify="no" GROUP BY category;
With this the count value will be stored inside no_category. Using PHP, you can
echo $row["no_category"];
And on the MySQL console, the query will generate a table in the form
no_category
1

How to calculate rows in sql

I have a table:
+---------+-----------+--------------+
| acc_id | dns_id | mail_name |
+---------+-----------+--------------+
| 1| 5 | myac1 |
| 2| 5 | myac2 |
+---------+-----------+--------------+-
I showed to you two records. I have a lot of acc_id, but I have the same dns_id for a lot of this acc_id. How can I calculate how many acc_id in the same dns_id. I have not only dns_id = 5, I have other values. How can I calculate for each dns_id number of `acc_id'?
According to my table, I have two acc_id in one dns_id.
You can use count(acc_id) in the select statement and a group by with the column dns_id. It would look something like this::
SELECT dns_id, count(acc_id) FROM [TABLE_NAME]
GROUP BY dns_id
Try GROUP BY and count():
SELECT dns_id, COUNT(*) FROM table GROUP BY dns_id;
It groups the results by different dns_id and counts the number of rows that were grouped.

SQL sorting does not follow group by statement, always uses primary key

I have a SQL database with a table called staff, having following columns:
workerID (Prim.key), name, department, salary
I am supposed to find the workers with the highest salary per department and used the following statement:
select staff.workerID, staff.name, staff.department, max(staff.salary) AS biggest
from staff
group by staff.department
I get one worker shown from each department, but they are NOT the workers with the highest salary, BUT the biggest salary value is shown, even though the worker does not get that salary.
The person shown is the worker with the "lowest" workerID per department.
So, there is some sorting going on using the primary key, even though it is not mentioned in the group by statement.
Can someone explain, what is going on and maybe how to sort correctly.
Explanation for what is going on:
You are performing a GROUP BY on staff.department, however your SELECT list contains 2 non-grouping columns staff.workerID, staff.name. In standard sql this is a syntax error, however MySql allows it so the query writers have to make sure that they handle such situations themselves.
Reference: http://dev.mysql.com/doc/refman/5.0/en/group-by-handling.html
In standard SQL, a query that includes a GROUP BY clause cannot refer to nonaggregated columns in the select list that are not named in the GROUP BY clause.
MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause.
The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate.
Starting with MySQL 5.1 the non-standard feature can be disabled by setting the ONLY_FULL_GROUP_BY flag in sql_mode: http://dev.mysql.com/doc/refman/5.6/en/sql-mode.html#sqlmode_only_full_group_by
How to fix:
select staff.workerID, staff.name, staff.department, staff.salary
from staff
join (
select staff.department, max(staff.salary) AS biggest
from staff
group by staff.department
) t
on t.department = staff.department and t.biggest = staff.salary
In the inner query, fetch department and its highest salary using GROUP BY. Then in the outer query join those results with the main table which would give you the desired results.
This is the usual case group by with a aggregate function does not guarantee proper row corresponding to the aggregate function. Now there are many ways to do it and the usual practice is a sub-query and join. But if the table is big then performance wise it kills, so the other approach is to use left join
So lets say we have the table
+----------+------+-------------+--------+
| workerid | name | department | salary |
+----------+------+-------------+--------+
| 1 | abc | computer | 400 |
| 2 | cdf | electronics | 200 |
| 3 | gfd | computer | 400 |
| 4 | wer | physics | 300 |
| 5 | hgt | computer | 700 |
| 6 | juy | electronics | 100 |
| 7 | wer | physics | 400 |
| 8 | qwe | computer | 200 |
| 9 | iop | electronics | 800 |
| 10 | kli | physics | 800 |
| 11 | qsq | computer | 600 |
| 12 | asd | electronics | 300 |
+----------+------+-------------+--------+
SO we can get the data as
select st.* from staff st
left join staff st1 on st1.department = st.department
and st.salary < st1.salary
where
st1.workerid is null
The above will give you as
+----------+------+-------------+--------+
| workerid | name | department | salary |
+----------+------+-------------+--------+
| 5 | hgt | computer | 700 |
| 9 | iop | electronics | 800 |
| 10 | kli | physics | 800 |
+----------+------+-------------+--------+
My favorite solution to this problem uses LEFT JOIN:
SELECT m.workerID, m.name, m.department, m.salary
FROM staff m # 'm' from 'maximum'
LEFT JOIN staff o # 'o' from 'other'
ON m.department = o.department # match rows by department
AND m.salary < o.salary # match each row in `m` with the rows from `o` having bigger salary
WHERE o.salary IS NULL # no bigger salary exists in `o`, i.e. `m`.`salary` is the maximum of its dept.
;
This query selects all the workers that have the biggest salary from their department; i.e. if two or more workers have the same salary and it is the bigger in their department then all these workers are selected.
Try this:
SELECT s.workerID, s.name, s.department, s.salary
FROM staff s
INNER JOIN (SELECT s.department, MAX(s.salary) AS biggest
FROM staff s GROUP BY s.department
) AS B ON s.department = B.department AND s.salary = B.biggest;
OR
SELECT s.workerID, s.name, s.department, s.salary
FROM (SELECT s.workerID, s.name, s.department, s.salary
FROM staff s
ORDER BY s.department, s.salary DESC
) AS s
GROUP BY s.department;