MySQL MAX() with GROUP BY - mysql

Considering these entries:
INSERT INTO `schedule_hours` (`id`, `weekday`, `start_hour`) VALUES
(1, 1, '09:00:00'),
(2, 2, '09:00:00'),
(3, 3, '09:00:00'),
(4, 4, '09:00:00'),
(5, 5, '09:00:00'),
(6, 6, NULL),
(7, 7, NULL),
(8, 1, '12:00:00');
I'm running the following query:
SELECT MAX(id), weekday, start_hour
FROM schedule_hours
GROUP BY weekday
ORDER BY weekday
The objective is to get a whole week (weekday 1 = Monday, 2 = Tuesday, etc.) but return only the most recent entry for each day.
My table now has 2 entries for Monday and 1 entry for each of the other days. I only want to return the latest ones (id is an auto-increment field), so the right result should be:
8 1 12:00:00
2 2 09:00:00
3 3 09:00:00
4 4 09:00:00
5 5 09:00:00
6 6 NULL
7 7 NULL
What I'm currently getting:
8 1 09:00:00 < wrong
2 2 09:00:00
3 3 09:00:00
4 4 09:00:00
5 5 09:00:00
6 6 NULL
7 7 NULL
The id and weekday columns are correct, but the first row is showing a wrong result for the start_hour column!

You should try this query:
SELECT id, weekday, start_hour
FROM schedule_hours
WHERE id IN (
SELECT MAX(id)
FROM schedule_hours
GROUP BY weekday
)
ORDER BY weekday
In your query, start_hour appears in the SELECT list but is neither aggregated nor named in the GROUP BY clause. Standard SQL rejects such a query with an error. MySQL, however, extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause, and it is then free to take start_hour from any row of the group. That is why you get no error but also not the output you expect. (Since MySQL 5.7.5 the ONLY_FULL_GROUP_BY SQL mode is enabled by default and rejects this query as well.) For more details, you may read MySQL Extensions to GROUP BY.
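To see the error that standard SQL would raise here, one option is to switch on the ONLY_FULL_GROUP_BY SQL mode for the current session and rerun the original query. This is only a sketch: it assumes you may change the session's modes for a test, and the SET statement below replaces every other mode that was active for the session.

```sql
-- Inspect the modes currently active for this session.
SELECT @@SESSION.sql_mode;

-- Assumption: replacing the session's modes is acceptable for this test.
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';

-- With the mode enabled this is rejected with error 1055, because
-- start_hour is neither aggregated nor listed in GROUP BY.
SELECT MAX(id), weekday, start_hour
FROM schedule_hours
GROUP BY weekday
ORDER BY weekday;
```

Reconnecting, or setting sql_mode back to the value printed by the first statement, restores the previous behaviour.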

An alternative which avoids relying on MySQL allowing a column in the SELECT list that is not in the GROUP BY clause:
SELECT schedule_hours.id, schedule_hours.weekday, schedule_hours.start_hour
FROM schedule_hours
INNER JOIN
(
SELECT weekday, MAX(id) AS MaxId
FROM schedule_hours
GROUP BY weekday
)Sub1
ON schedule_hours.id = Sub1.MaxId
AND schedule_hours.weekday = Sub1.weekday
ORDER BY schedule_hours.weekday
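If you are on MySQL 8.0 or later, a window function is another way to pick the newest row per weekday. The following is a sketch rather than a drop-in answer; the derived-table alias ranked and the column rn are names made up for this example.

```sql
-- Assumes MySQL 8.0+ (window functions are not available in 5.7).
SELECT id, weekday, start_hour
FROM (
    SELECT id, weekday, start_hour,
           -- Number the rows of each weekday, newest id first.
           ROW_NUMBER() OVER (PARTITION BY weekday ORDER BY id DESC) AS rn
    FROM schedule_hours
) AS ranked
WHERE rn = 1          -- keep only the newest row of each weekday
ORDER BY weekday;
```

With the sample rows above, Monday comes back as id 8 with start_hour 12:00:00, and every selected column comes from the same row.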

Related

query to calculate sum of sales based on a historical related records

I have a sample sales table that contains multiple stores' daily sales figures. I am trying to write a same-store sales query. That means I want to sum the total daily sales, over the date range I have, of only those stores that already existed a year earlier.
Here is the sample table and data:
CREATE TABLE `sales` (
`store_id` int(11) NOT NULL,
`date` date NOT NULL,
`total_sales` double NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
INSERT INTO `sales` (`store_id`, `date`, `total_sales`) VALUES
(1, '2022-03-01', 100),
(1, '2022-03-02', 100),
(1, '2022-03-03', 100),
(1, '2021-03-01', 50),
(1, '2021-03-03', 50),
(2, '2022-03-01', 30),
(2, '2022-03-02', 30),
(2, '2022-03-03', 30),
(2, '2021-03-01', 15),
(2, '2021-03-02', 15),
(2, '2021-03-03', 15),
(3, '2022-03-01', 500),
(3, '2022-03-02', 500),
(3, '2022-03-03', 500);
In this case, I want to be able to select the sum of daily sales from 1 march 2022 until 2 march 2022 for all the stores that had sales between 1 march 2021 and 2 march 2021 grouped by date. The challenge is, if a store didn’t have sales on a specific day last year, the corresponding sales record of this year should not be added to the sum total. Based on the sample table above, I expect the output of the desired query to be:
SELECT SALES BETWEEN 01-03-2022 AND 02-03-2022
+-----------------------------+
| Date | Sales |
+-----------------------------+
| 01-03-2022 | 130 |
| 02-03-2022 | 30 |
+-----------------------------+
So for 01-03-2022, we check which stores had sales on 01-03-2021. Those are stores 1 and 2 (not 3), so the total sales of stores 1 and 2 for 01-03-2022 is: 100 + 30 = 130
For 02-03-2022, we check last year's sales. Only store 2 had sales on 02-03-2021, so the sum of sales for 2022 is the sales of store 2 on 02-03-2022, which is 30
How can I accomplish that with a MySQL query?
First you need a calendar table that returns all the dates in the date range that you want, or you can generate the dates with a recursive CTE (available in MySQL 8.0 and later).
Then LEFT JOIN two copies of the sales table to the CTE, one for the previous year's dates and the other for the dates of the range (matched on the same store), and finally aggregate:
WITH RECURSIVE dates(date) AS (
SELECT '2022-03-01'
UNION ALL
SELECT date + INTERVAL 1 day
FROM dates
WHERE date < '2022-03-02'
)
SELECT d.date,
COALESCE(SUM(s2.total_sales), 0) total_sales
FROM dates d
LEFT JOIN sales s1 ON s1.date = d.date - INTERVAL 1 year
LEFT JOIN sales s2 ON s2.date = d.date AND s2.store_id = s1.store_id
GROUP BY d.date;
See the demo.
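If you do not need a row for dates on which no store qualifies, a plain self join may be enough and also runs on versions without recursive CTEs. This is a sketch under one assumption that the sample data satisfies: there is at most one row per store and date. The aliases cur and prev are made up for the example.

```sql
SELECT cur.`date`,
       SUM(cur.total_sales) AS total_sales
FROM sales AS cur
-- Keep a sale only if the same store also has a row exactly one year earlier.
JOIN sales AS prev
  ON prev.store_id = cur.store_id
 AND prev.`date` = cur.`date` - INTERVAL 1 YEAR
WHERE cur.`date` BETWEEN '2022-03-01' AND '2022-03-02'
GROUP BY cur.`date`
ORDER BY cur.`date`;
```

On the sample data this gives 130 for 2022-03-01 and 30 for 2022-03-02. Unlike the CTE version, a date with no matching store is simply missing from the result instead of showing 0.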

Delete all SQL rows except one for a Group

I have a table like this:
Schema (MySQL v5.7)
CREATE TABLE likethis
(`id` int, `userid` int, `date` DATE)
;
INSERT INTO likethis
(`id`, `userid`, `date`)
VALUES
(1, 1, "2021-11-15"),
(2, 2, "2021-11-15"),
(3, 1, "2021-11-13"),
(4, 3, "2021-10-13"),
(5, 3, "2021-09-13"),
(6, 2, "2021-09-13");
id userid date
1 1 2021-11-15
2 2 2021-11-15
3 1 2021-11-13
4 3 2021-10-13
5 3 2021-09-13
6 2 2021-09-13
I want to delete all records which are older than 14 days, EXCEPT if the user only has records which are older; then keep the "newest" (biggest "id") row for this user.
The desired result after that action is:
id userid date
1 1 2021-11-15
2 2 2021-11-15
3 1 2021-11-13
4 3 2021-10-13
I.e.: User ID 1 only has records within the last 14 days: keep all of them. User ID 2 has a record within the last 14 days, so delete ALL of that user's records which are older than 14 days. User ID 3 has only "old" records, i.e. older than 14 days, so keep only the newest of those records, even though it is older than 14 days.
I thought of something like a self join with a subquery where I group by user-id ... but can't really get to it ...
This query could work
DELETE b
FROM likethis a
JOIN likethis b ON a.`userid` = b.`userid` AND a.`date` > b.`date`
WHERE b.`date` < NOW() - INTERVAL 14 DAY
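Before running a DELETE like this, it can help to preview the rows it would remove by turning the same join into a SELECT. The sketch below makes one assumption so that the result is reproducible: "today" is fixed at 2021-11-20, a date chosen only for illustration. Replace the literal with NOW() or CURDATE() for real use.

```sql
-- Same join as the DELETE, but read-only.
-- Assumption: "today" is 2021-11-20, so the cutoff is 2021-11-06.
SELECT DISTINCT b.*
FROM likethis a
JOIN likethis b
  ON a.`userid` = b.`userid`
 AND a.`date` > b.`date`      -- a newer row exists for the same user
WHERE b.`date` < DATE '2021-11-20' - INTERVAL 14 DAY
ORDER BY b.`id`;
```

With the sample data this lists ids 5 and 6, which leaves ids 1 to 4 in place, matching the desired result. Note that "newest" is decided by date here, not by id.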
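I believe you can use the CASE function in MySQL to flag the rows first. Note that CASE only labels rows inside a SELECT; it does not delete anything by itself.
For example:
SELECT id, userid, `date`,
CASE
WHEN `date` < CURDATE() - INTERVAL 14 DAY THEN 'older than 14 days'
ELSE 'within the last 14 days'
END AS age_status
FROM likethis;
Suggested links: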
https://www.w3schools.com/sql/func_mysql_case.asp
https://dev.mysql.com/doc/refman/5.7/en/case.html
Hope this helps...

Group by last occurrence

I am trying to get the last rows where rec_p_id = 4, sorted by the timestamp. Since I do not want all the results WHERE rec_p_id = 4, I am using GROUP BY to group them by send_p_id.
My SQL query looks like this:
SELECT *
FROM
( SELECT *
FROM chat
WHERE rec_p_id= "4"
ORDER
BY timestamp DESC) as sub
GROUP
BY send_p_id
My table looks like this:
Table chat
c_id send_p_id rec_p_id timestamp
1 3 4 2020-05-01 14:46:00
2 3 4 2020-05-01 14:49:00
3 3 4 2020-05-01 14:50:00
4 7 4 2020-05-01 12:00:00
5 4 7 2020-05-01 12:10:00
6 7 4 2020-05-01 12:20:00
7 9 4 2020-05-01 16:50:00
8 9 4 2020-05-01 17:00:00
I want to get the last occurrences:
c_id send_p_id rec_p_id timestamp
3 3 4 2020-05-01 14:50:00
6 7 4 2020-05-01 12:20:00
8 9 4 2020-05-01 17:00:00
But instead i get all the first ones:
c_id send_p_id rec_p_id timestamp
1 3 4 2020-05-01 14:46:00
4 7 4 2020-05-01 12:00:00
7 9 4 2020-05-01 16:50:00
I saw the query I am using in this question: ORDER BY date and time BEFORE GROUP BY name in mysql
It seems to work for everyone there. What am I doing wrong with my query?
Thanks in advance.
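Judging by your expected result, you seem to be looking for:
select max(c_id) c_id, send_p_id, max(timestamp) timestamp
from chat WHERE rec_p_id= "4"
group by send_p_id
ORDER BY c_id
This returns the last message per sender only as long as the highest c_id also carries the latest timestamp, as it does in your sample data.
GROUP BY is meant for aggregated results.
Using it without an aggregate function can produce unpredictable results, and in versions later than 5.6 it can produce an error.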
I used this answer and built this setup for you.
The code below is a copy of it, so that you can run it yourself later.
For the solution, I adapted the example from the official manual.
CREATE TABLE chat
(
c_id INT PRIMARY KEY
, send_p_id INT
, rec_p_id INT
, timestamp DATETIME
);
INSERT INTO chat VALUES
(1, 3, 4, '2020-05-01 14:46:00')
, (2, 3, 4, '2020-05-01 14:49:00')
, (3, 3, 4, '2020-05-01 14:50:00')
, (4, 7, 4, '2020-05-01 12:00:00')
, (5, 4, 7, '2020-05-01 12:10:00')
, (6, 7, 4, '2020-05-01 12:20:00')
, (7, 9, 4, '2020-05-01 16:50:00')
, (8, 9, 4, '2020-05-01 17:00:00');
Solution:
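SELECT c_id,
send_p_id,
rec_p_id,
timestamp
FROM chat AS c
-- only messages received by participant 4
WHERE rec_p_id = 4
-- keep the row whose timestamp is the latest for this sender and recipient
AND timestamp=(SELECT MAX(c1.timestamp)
FROM chat AS c1
WHERE c.send_p_id = c1.send_p_id
AND c.rec_p_id = c1.rec_p_id)
ORDER BY timestamp;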
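A possible alternative, assuming MySQL 8.0 or later, is to number the messages per sender with a window function instead of relying on an ORDER BY inside a derived table, which is not guaranteed to carry over into the outer GROUP BY. The alias ranked and the column rn are names made up for this sketch.

```sql
-- Assumes MySQL 8.0+ and the chat table defined above.
SELECT c_id, send_p_id, rec_p_id, `timestamp`
FROM (
    SELECT c.*,
           -- 1 = most recent message of each sender to recipient 4
           ROW_NUMBER() OVER (PARTITION BY send_p_id
                              ORDER BY `timestamp` DESC) AS rn
    FROM chat AS c
    WHERE rec_p_id = 4
) AS ranked
WHERE rn = 1
ORDER BY `timestamp`;
```

With the sample rows this returns c_id 6, 3 and 8, in that order, because the result is sorted by timestamp.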

Select record by month between 12 and 2 of new Year

I have this record:
[id | dateofit ]
[ 1 | 2017-12-1]
Which I want to select using this Query:
SELECT id FROM records WHERE MONTH(dateofit) BETWEEN 12 AND 2
The problem is that 12 is from year 2017 and 2 from year 2018, So I don't get any results,
I tried to replace the Query like this MONTH(dateofit) BETWEEN MONTH(12) AND MONTH(1) But still the same problem
What I want to do is to select records has a month of [12, 1(newyear), 2(newyear)]
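Why not just provide the time range? Note that the upper bound has to be the end of February, otherwise rows from March 1st are included as well.
SELECT id FROM records WHERE dateofit BETWEEN "2017-12-01 00:00:00" AND "2018-02-28 23:59:59";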
Here are two possible approaches:
If you only have data from March 2017 to November 2018:
SELECT
id
FROM
records
WHERE
MONTH(dateofit) IN (12, 1, 2)
If you have multiple years of data:
SELECT
id
FROM
records
WHERE
dateofit BETWEEN '2017-12-01' AND '2018-02-28'
Try this:
SELECT id FROM records
WHERE ( YEAR(dateofit) = 2017 AND MONTH(dateofit) = 12) OR
( YEAR(dateofit) = 2018 AND MONTH(dateofit) IN (1, 2));
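As background on why the original query returns nothing: BETWEEN 12 AND 2 is shorthand for >= 12 AND <= 2, which no month number can satisfy, regardless of the year. The sketch below shows that, followed by a wrap-around condition that is equivalent to the IN (12, 1, 2) form. It assumes the records table and dateofit column from the question.

```sql
-- BETWEEN low AND high means (x >= low AND x <= high),
-- so with 12 and 2 both checks below evaluate to 0 (false).
SELECT 12 BETWEEN 12 AND 2 AS december_matches,
       1  BETWEEN 12 AND 2 AS january_matches;

-- The wrap-around written out explicitly. This matches December,
-- January and February of every year present in the table.
SELECT id
FROM records
WHERE MONTH(dateofit) >= 12
   OR MONTH(dateofit) <= 2;
```

Because MONTH() is applied to the column, this form generally cannot use an index on dateofit; the explicit date range shown earlier can.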

Select all rows containing duplicate values in one of two columns from within distinct groups of related records

I'm trying to create a MySQL query that will return all individual rows (not grouped) containing duplicate values from within a group of related records. By 'groups of related records' I mean those with the same account number (per the sample below).
Basically, within each group of related records that share the same distinct account number, select just those rows whose values for the date or amount columns are the same as another row's values within that account's group of records. Values should only be considered duplicate from within that account's group. The sample table and ideal output details below should clear things up.
Also, I'm not concerned with any records with a status of X being returned, even if they have duplicate values.
Small sample table with relevant data:
id account invoice date amount status
1 1 1 2012-04-01 0 X
2 1 2 2012-04-01 120 P
3 1 2 2012-05-01 120 U
4 1 3 2012-05-01 117 U
5 2 4 2012-04-01 82 X
6 2 4 2012-05-01 82 U
7 2 5 2012-03-01 81 P
8 2 6 2012-05-01 80 U
9 3 7 2012-03-01 80 P
10 3 8 2012-04-01 79 U
11 3 9 2012-04-01 78 U
Ideal output returned from desired SQL query:
id account invoice date amount status
2 1 2 2012-04-01 120 P
3 1 2 2012-05-01 120 U
4 1 3 2012-05-01 117 U
6 2 4 2012-05-01 82 U
8 2 6 2012-05-01 80 U
10 3 8 2012-04-01 79 U
11 3 9 2012-04-01 78 U
Thus, the pairs 7/9 and 8/9 should not count as duplicates, because their matching values belong to different accounts. However, row 8 should be returned because it shares a duplicate value with row 6.
Later, I may want to further hone the selection by grabbing only duplicate rows that have matching statuses, thus row 2 would be excluded because it doesn't match the other two found within that account's group of records. How much more difficult would that make the query? Would it just be a matter of adding a WHERE or HAVING clause, or is it more complicated than that?
I hope my explanation of what I'm trying to accomplish makes sense. I've tried using INNER JOIN but that returns each desired row more than once. I don't want duplicates of duplicates.
Table Structure and Sample Values:
CREATE TABLE payment (
id int(11) NOT NULL auto_increment,
account int(10) NOT NULL default '0',
invoice int(10) NOT NULL default '0',
date date NOT NULL default '0000-00-00',
amount int(10) NOT NULL default '0',
status char(1) NOT NULL default '',
PRIMARY KEY (id)
);
INSERT INTO payment VALUES (1, 1, 1, '2012-04-01', 0, 'X');
INSERT INTO payment VALUES (2, 1, 2, '2012-04-01', 120, 'P');
INSERT INTO payment VALUES (3, 1, 2, '2012-05-01', 120, 'U');
INSERT INTO payment VALUES (4, 1, 3, '2012-05-01', 117, 'U');
INSERT INTO payment VALUES (5, 2, 4, '2012-04-01', 82, 'X');
INSERT INTO payment VALUES (6, 2, 4, '2012-05-01', 82, 'U');
INSERT INTO payment VALUES (7, 2, 5, '2012-03-01', 81, 'p');
INSERT INTO payment VALUES (8, 2, 6, '2012-05-01', 80, 'U');
INSERT INTO payment VALUES (9, 3, 7, '2012-03-01', 80, 'U');
INSERT INTO payment VALUES (10, 3, 8, '2012-04-01', 79, 'U');
INSERT INTO payment VALUES (11, 3, 9, '2012-04-01', 78, 'U');
This type of query can be implemented as a semi join.
Semijoins are used to select rows from one of the tables in the join.
For example:
select distinct l.*
from payment l
inner join payment r
on
l.id != r.id and l.account = r.account and
(l.date = r.date or l.amount = r.amount)
where l.status != 'X' and r.status != 'X'
order by l.id asc;
Note the use of distinct, and that I'm only selecting columns from the left table. This ensures that there are no duplicates.
The join condition checks that:
it's not joining a row to itself (l.id != r.id)
rows are in the same account (l.account = r.account)
and either the date or the amount is the same (l.date = r.date or l.amount = r.amount)
For the second part of your question, you would need to add the status comparison to the on clause in the query, for example: and l.status = r.status.
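The same semi join can also be written with EXISTS, which returns each row once without needing distinct. The following sketch includes the status match from the second part of the question; the aliases p and d are made up for the example, and it assumes the payment table and INSERT statements from the question.

```sql
SELECT p.*
FROM payment AS p
WHERE p.status <> 'X'
  AND EXISTS (
        SELECT 1
        FROM payment AS d
        WHERE d.id <> p.id                 -- not the row itself
          AND d.account = p.account        -- same account
          AND d.status = p.status          -- same status (second part)
          AND (d.`date` = p.`date` OR d.amount = p.amount)
      )
ORDER BY p.id;
```

With the INSERT statements from the question this returns ids 3, 4, 6, 8, 10 and 11. Dropping the d.status = p.status line and adding d.status <> 'X' in its place brings row 2 back and reproduces the first result set.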
This seems to work (selecting only the p1 columns, so each row comes back once and without the p2 columns):
select p1.* from payment p1
join payment p2 on
(p1.id != p2.id
and p1.status != 'X'
and p1.account = p2.account
and (p1.amount = p2.amount or p1.date = p2.date))
group by p1.id
http://sqlfiddle.com/#!2/a50e9/3