use group by max() twice in another column - mysql

I am using MySQL.
I want to get the amount of the highest salary and its date for each mid.
Can I use max() twice, as below?
The result comes out exactly as I want.
But as far as I know, you have to use only one max().
Was I mistaken?
http://sqlfiddle.com/#!9/8db0c17/4
create table test ( mid bigint , sal bigint , dt date);
insert into test values( 1, 100, '2020-01-01 00:00:00'),
( 1, 200, '2020-02-01 00:00:00'),
( 2, 100, '2020-03-01 00:00:00'),
( 2, 200, '2020-04-01 00:00:00'),
( 2, 300, '2020-05-01 00:00:00'),
( 3, 500, '2020-10-01 00:00:00');
select mid, max(sal), max(dt) from test group by mid;
mid max(sal) max(dt)
1 200 2020-02-01
2 300 2020-05-01
3 500 2020-10-01

You can use max() several times in one query, but in your case it will not always give you what you want.
If we change your data like this:
1, 100, 2020-01-01
1, 200, 2020-02-01
2, 100, 2020-02-01
2, 300, 2020-03-01
2, 200, 2020-04-01
2, 300, 2020-05-01
3, 500, 2020-10-01
3, 300, 2020-11-01
Your result will be:
1, 200, 2020-02-01
2, 300, 2020-05-01
3, 500, 2020-11-01
As you can see, for mid 3 we get the maximum of sal and the maximum of dt, but each is computed separately: the salary of 500 was paid on 2020-10-01, not on 2020-11-01.
We can use something like this to get the right result:
select
t.mid, max(t.dt), tmp.sal_max
from test t
join (
select t1.mid, max(t1.sal) as sal_max
from test t1
group by t1.mid) tmp on tmp.mid = t.mid AND tmp.sal_max = t.sal
group by t.mid, tmp.sal_max;
Result:
1, 2020-02-01, 200
2, 2020-05-01, 300
3, 2020-10-01, 500
I think it is not the simplest option, but it will work.
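The difference between the two queries can be checked with a small script. This is only a sketch under assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL (both queries are plain SQL that should behave the same way on either), loaded with the modified data from this answer.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption, for easy testing).
conn = sqlite3.connect(":memory:")
conn.execute("create table test (mid integer, sal integer, dt text)")
conn.executemany(
    "insert into test values (?, ?, ?)",
    [
        (1, 100, "2020-01-01"),
        (1, 200, "2020-02-01"),
        (2, 100, "2020-02-01"),
        (2, 300, "2020-03-01"),
        (2, 200, "2020-04-01"),
        (2, 300, "2020-05-01"),
        (3, 500, "2020-10-01"),
        (3, 300, "2020-11-01"),
    ],
)

# Two independent aggregates: each max() is computed over its own column.
independent = conn.execute(
    "select mid, max(sal), max(dt) from test group by mid order by mid"
).fetchall()

# Join back to the per-mid maximum salary, then take the latest date among
# the rows that actually carry that salary.
paired = conn.execute(
    """
    select t.mid, max(t.dt), tmp.sal_max
    from test t
    join (select mid, max(sal) as sal_max from test group by mid) tmp
      on tmp.mid = t.mid and tmp.sal_max = t.sal
    group by t.mid, tmp.sal_max
    order by t.mid
    """
).fetchall()

print(independent)  # mid 3 pairs salary 500 with the unrelated date 2020-11-01
print(paired)       # mid 3 pairs salary 500 with its own date 2020-10-01
```

Only mid 3 differs between the two results, because it is the only group where the highest salary and the latest date sit on different rows.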

Yes, you can use max() several times in one query, as long as each call is on a different column.
Every other selected column should be in the GROUP BY.

Related

How to compare sum with the amount?

The table looks like this:
id, price, amount, transactionid
1, 5, 10, abc
2, 5, 10, abc
3, 20, 40, def
4, 20, 40, def
5, 15, 40, xyz
6, 20, 40, xyz
I want to compare the sum of the prices with the amount, and select only the transactions where they are not equal.
So in the example: 15 + 20 != 40
SELECT sum(price), transactionid FROM payment group by transactionid
Now I need to check the sum against the amount from one of the rows, and show the transaction only if the two are unequal.
Set the conditions in the HAVING clause:
SELECT transactionid,
SUM(price) total_price,
MAX(amount) amount
FROM payment
GROUP BY transactionid
HAVING total_price <> amount;
See the demo.
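A quick way to try the HAVING approach on the sample rows is sketched below. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, and it repeats the aggregates in HAVING rather than using the aliases; that variation is not from the answer, it just sidesteps any ambiguity between the alias amount and the column amount.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption, for easy testing).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table payment (id integer, price integer, amount integer, transactionid text)"
)
conn.executemany(
    "insert into payment values (?, ?, ?, ?)",
    [
        (1, 5, 10, "abc"),
        (2, 5, 10, "abc"),
        (3, 20, 40, "def"),
        (4, 20, 40, "def"),
        (5, 15, 40, "xyz"),
        (6, 20, 40, "xyz"),
    ],
)

# HAVING filters whole groups after aggregation, so it can compare
# the summed price of a transaction with that transaction's amount.
mismatched = conn.execute(
    """
    select transactionid, sum(price) as total_price, max(amount) as amount
    from payment
    group by transactionid
    having sum(price) <> max(amount)
    """
).fetchall()

print(mismatched)
```

Only the xyz transaction is returned, since 15 + 20 = 35 does not equal 40.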

Cumulative sum with max date group by month year and id

Now I need to make a similar query, but with several criteria.
Here is my table:
`transaksi` (`transid`, `idpinj`, `tanggal`, `sisapokok`, `sisajasa`, `status`)
(1, 1, '2018-01-01', 1000, 100, 1),
(2, 1, '2018-01-05', 1000, 100, 3),
(3, 2, '2018-02-04', 1000, 100, 4),
(4, 2, '2018-02-08', 1000, 100, 5),
(5, 1, '2018-02-19', 1000, 100, 3),
(6, 3, '2018-02-22', 1000, 100, 2),
(7, 2, '2018-03-09', 1000, 100, 3),
(8, 3, '2018-03-10', 1000, 100, 3),
(9, 3, '2018-03-12', 1000, 100, 4),
(10, 1, '2018-03-17', 1000, 100, 4),
(11, 4, '2018-03-19', 1000, 100, 3),
(12, 2, '2018-03-20', 1000, 100, 4)
DB Fiddle table
From the table above i need to get output as follow
Month sisapokok sisajasa
Jan-2018 1000 100 ->row2
Feb-2018 4000 400 ->+ row3+5
Mar-2018 12000 1200 ->+ row9+10+11+12
First, for each idpinj I need sum(sisapokok) and sum(sisajasa) from the row where the date is max(tanggal) within the month and status is between 3 and 4. These values are then summed into a total per month.
Then I need a cumulative sum of the monthly totals for the last 12 months.
I tried this query, but it takes max(tanggal) over all records rather than per month and per idpinj.
SELECT a.idpinj,a.sisapokok
FROM transaksi a
INNER JOIN
(
SELECT idpinj, MAX(tanggal) tgl
FROM transaksi
GROUP BY idpinj
) b ON a.idpinj = b.idpinj
AND a.tanggal = b.tgl
ORDER BY `a`.`idpinj` ASC
Not sure exactly what you are asking for but see if this helps:
select monthyear, sum(sisapokok)sisapokok, sum(sisajasa)sisajasa from (
select cast(month(tanggal) as varchar)+'-'+cast(year(tanggal) as varchar) monthyear, sum(sisapokok)sisapokok, sum(sisajasa)sisajasa
from #transaksi
group by cast(month(tanggal) as varchar)+'-'+cast(year(tanggal) as varchar) , tanggal) a
group by monthyear
Based on the fiddle data
select yyyy,mm,
@s:=@s+sisapokok sisapokok,
@t:=@t+sisajasa sisajasa
from
(
select yyyy,mm,sum(sisapokok) sisapokok,sum(sisajasa) sisajasa
from
(
select year(tanggal) yyyy,month(tanggal) mm, sisapokok,sisajasa
from transaksi t
join
(
select year(tanggal) yyyy,month(tanggal) mm,idpinj,max(transid) maxid
from `transaksi`
where status in(3,4)
group by year(tanggal),month(tanggal),idpinj
) s on s.maxid = transid
) t
group by yyyy,mm
) u
,(select @s:=0,@t:=0) r
order by yyyy,mm
+------+------+-----------+----------+
| yyyy | mm | sisapokok | sisajasa |
+------+------+-----------+----------+
| 2018 | 1 | 2000 | 2003 |
| 2018 | 2 | 5000 | 2303 |
| 2018 | 3 | 13000 | 3103 |
+------+------+-----------+----------+
3 rows in set (0.00 sec)
Note the inner query finds the last relevant id and the code progresses outward to use variables to calculate running totals.
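The same inside-out structure can also be written with a window function instead of user variables. The sketch below is an alternative, not the answer's method, and rests on several assumptions: it runs on Python's built-in sqlite3 module rather than MySQL, it loads the rows from the question (not the fiddle data, so the totals differ from the output above), and it uses SQLite's strftime where MySQL would use something like DATE_FORMAT(tanggal, '%Y-%m'). CTEs and window functions need MySQL 8.0 or later.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption, for easy testing).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table transaksi (transid integer, idpinj integer, tanggal text,"
    " sisapokok integer, sisajasa integer, status integer)"
)
conn.executemany(
    "insert into transaksi values (?, ?, ?, ?, ?, ?)",
    [
        (1, 1, "2018-01-01", 1000, 100, 1), (2, 1, "2018-01-05", 1000, 100, 3),
        (3, 2, "2018-02-04", 1000, 100, 4), (4, 2, "2018-02-08", 1000, 100, 5),
        (5, 1, "2018-02-19", 1000, 100, 3), (6, 3, "2018-02-22", 1000, 100, 2),
        (7, 2, "2018-03-09", 1000, 100, 3), (8, 3, "2018-03-10", 1000, 100, 3),
        (9, 3, "2018-03-12", 1000, 100, 4), (10, 1, "2018-03-17", 1000, 100, 4),
        (11, 4, "2018-03-19", 1000, 100, 3), (12, 2, "2018-03-20", 1000, 100, 4),
    ],
)

query = """
with last_ids as (
    -- last relevant transaction per month and idpinj
    -- (like the answer, this assumes transid grows with tanggal)
    select max(transid) as maxid
    from transaksi
    where status in (3, 4)
    group by strftime('%Y-%m', tanggal), idpinj
),
monthly as (
    -- monthly totals over those last transactions
    select strftime('%Y-%m', t.tanggal) as ym,
           sum(t.sisapokok) as sisapokok,
           sum(t.sisajasa) as sisajasa
    from transaksi t
    join last_ids s on s.maxid = t.transid
    group by ym
)
-- running totals across months
select ym,
       sum(sisapokok) over (order by ym) as sisapokok,
       sum(sisajasa) over (order by ym) as sisajasa
from monthly
order by ym
"""
running_totals = conn.execute(query).fetchall()

for row in running_totals:
    print(row)
```

The window function makes the running total part of the query result itself, so it does not depend on the order in which the server happens to evaluate variable assignments.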

Query for adding totals sum

Table1
ID, ANumber, Type, Amount, Date
1, 00010, 400, 10, 2016-11-16
2, 00011, 600, 20, 2016-11-12
3, 00012, 600, 10, 2016-11-13
4, 00013, 500, 30, 2016-11-17
5, 00014, 400, 40, 2016-11-19
Results:
400, 50
600, 30
500, 30
totals, 110
I want to add the totals. This is an existing table that I can only SELECT from.
This is my query. I don't know how to add the totals.
SELECT Type, SUM(Amount)
FROM table1
GROUP BY Type
You are looking for with rollup:
select type, sum(amount)
from t
group by type with rollup;
Note: The final group will have NULL for the type rather than totals. You can use coalesce() to get whatever value you want.
You can always sum the initial values you returned in your initial query to generate a total:
SELECT SUM(sums.`sum`) AS 'total' FROM (SELECT SUM(`Amount`) AS 'sum' FROM `table1` GROUP BY `Type`) sums
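Where WITH ROLLUP is not available, the grouped rows and the grand total can be combined with UNION ALL. The sketch below shows that emulation, not ROLLUP itself, and assumes Python's built-in sqlite3 module as a stand-in database (SQLite has no WITH ROLLUP). The label and is_total column names are made up for illustration.

```python
import sqlite3

# In-memory SQLite database (an assumption, for easy testing).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table table1 (ID integer, ANumber text, Type integer, Amount integer, Date text)"
)
conn.executemany(
    "insert into table1 values (?, ?, ?, ?, ?)",
    [
        (1, "00010", 400, 10, "2016-11-16"),
        (2, "00011", 600, 20, "2016-11-12"),
        (3, "00012", 600, 10, "2016-11-13"),
        (4, "00013", 500, 30, "2016-11-17"),
        (5, "00014", 400, 40, "2016-11-19"),
    ],
)

# One row per Type, followed by a single grand-total row.
# is_total exists only to keep the total row last when sorting.
with_totals = conn.execute(
    """
    select cast(Type as text) as label, sum(Amount) as total, 0 as is_total
    from table1
    group by Type
    union all
    select 'totals', sum(Amount), 1
    from table1
    order by is_total, label
    """
).fetchall()

for label, total, _ in with_totals:
    print(label, total)
```

On MySQL itself, the WITH ROLLUP query from the first answer combined with coalesce(type, 'totals') should give the same rows in a single pass over the table.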

Mysql select query count until reach the condition with condition

I have a list of users with their points and game ids. I need to find the rank of a specified user within a game, ordered by max(lb_point).
I have already written the query for the rank within an individual game, as follows.
select count(*) AS user_rank
from (
select distinct user_id
from leader_board
where lb_point >= (select max( lb_point )
from leader_board
where user_id = 1
and game_id = 2 )
and game_id = 2
) t
But I need to find the rank across all games. For example, I have 3 different games (1, 2, 3). Given a user_id, I need to find that user's overall rank across all three games. Can you please help me with this?
lb_id user_id game_id lb_point
------------------------------------------------
1 1 2 670
2 1 1 200
3 1 2 650
4 1 1 400
5 3 2 700
6 4 2 450
7 2 1 550
8 2 1 100
9 1 1 200
10 2 1 100
11 1 1 200
12 2 1 100
13 1 1 200
14 2 1 100
15 1 1 200
16 2 1 100
17 1 1 200
18 2 1 100
19 1 1 200
20 2 1 100
21 1 1 200
22 2 1 800
use sandbox;
/*create table t (lb_id int, user_id int, game_id int, lb_point int);
truncate table t;
insert into t values
(1 , 1, 2, 670),
(2 , 1, 1, 200),
(3 , 1, 2, 650),
(4 , 1, 1, 400),
(5 , 3, 2, 700),
(6 , 4, 2, 450),
(7 , 2, 1, 550),
(8 , 2, 1, 100),
(9 , 1, 1, 200),
(10, 2, 1, 100),
(11, 1, 1, 200),
(12, 2, 1, 100),
(13, 1, 1, 200),
(14, 2, 1, 100),
(15, 1, 1, 200),
(16, 2, 1, 100),
(17, 1, 1, 200),
(18, 2, 1, 100),
(19, 1, 1, 200),
(20, 2, 1, 100),
(21, 1, 1, 200),
(22, 2, 1, 800);
*/
select t.*
from
(
select s.*,@rn:=@rn+1 as `rank`
from
(
select user_id, sum(lb_point) points
from t
where lb_id = (select t1.lb_id from t t1 where t1.user_id = t.user_id and t1.game_id = t.game_id order by t1.lb_point desc limit 1)
group by user_id
order by points desc
) s
,(select @rn:=0) rn
) t
where t.user_id = 1
The innermost query grabs the highest score per game per user and sums it.
The next query assigns a rank based on the aggregated score per user.
The outermost query selects the user.
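The three layers described above can also be expressed with CTEs and a window function. The sketch below is an alternative to the user-variable query, not the answer's own code, and it assumes Python's built-in sqlite3 module as a stand-in for MySQL (MySQL needs 8.0 or later for CTEs and rank()). Note that rank() gives tied users the same rank, which the @rn counter does not.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption, for easy testing).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (lb_id integer, user_id integer, game_id integer, lb_point integer)"
)
conn.executemany(
    "insert into t values (?, ?, ?, ?)",
    [
        (1, 1, 2, 670), (2, 1, 1, 200), (3, 1, 2, 650), (4, 1, 1, 400),
        (5, 3, 2, 700), (6, 4, 2, 450), (7, 2, 1, 550), (8, 2, 1, 100),
        (9, 1, 1, 200), (10, 2, 1, 100), (11, 1, 1, 200), (12, 2, 1, 100),
        (13, 1, 1, 200), (14, 2, 1, 100), (15, 1, 1, 200), (16, 2, 1, 100),
        (17, 1, 1, 200), (18, 2, 1, 100), (19, 1, 1, 200), (20, 2, 1, 100),
        (21, 1, 1, 200), (22, 2, 1, 800),
    ],
)

query = """
with best as (
    -- highest score per user per game
    select user_id, game_id, max(lb_point) as best_point
    from t
    group by user_id, game_id
),
totals as (
    -- sum of those best scores per user
    select user_id, sum(best_point) as points
    from best
    group by user_id
)
-- rank users by their summed best scores
select user_id, points, rank() over (order by points desc) as user_rank
from totals
order by user_rank, user_id
"""
ranking = conn.execute(query).fetchall()

# Pick out the requested user, as the outermost query does.
rank_of_user_1 = next(rank for user_id, _, rank in ranking if user_id == 1)
print(ranking)
print(rank_of_user_1)
```

User 1 scores 670 in game 2 and 400 in game 1, so the combined 1070 points put that user first overall.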

Aligning timestamps when not quite synchronized

I have 3 processes A, B and C as defined in the following series of tables:
http://sqlfiddle.com/#!2/48f54
CREATE TABLE processA
(date_time datetime, valueA int);
INSERT INTO processA
(date_time, valueA)
VALUES
('2013-1-8 22:10:00', 100),
('2013-1-8 22:15:00', 100),
('2013-1-8 22:30:00', 100),
('2013-1-8 22:35:00', 100),
('2013-1-8 22:40:00', 100),
('2013-1-8 22:45:00', 100),
('2013-1-8 22:50:00', 100),
('2013-1-8 23:05:00', 100),
('2013-1-8 23:10:00', 100),
('2013-1-8 23:20:00', 100),
('2013-1-8 23:25:00', 100),
('2013-1-8 23:35:00', 100),
('2013-1-8 23:40:00', 100),
('2013-1-9 00:05:00', 100),
('2013-1-9 00:10:00', 100);
CREATE TABLE processB
(date_time datetime, valueB decimal(4,2));
INSERT INTO processB
(date_time, valueB)
VALUES
('2013-1-08 21:46:00', 3),
('2013-1-08 22:11:00', 4),
('2013-1-08 22:31:00', 5),
('2013-1-08 22:36:00', 6),
('2013-1-08 22:41:00', 7),
('2013-1-08 23:06:00', 8),
('2013-1-08 23:20:00', 2),
('2013-1-08 23:46:00', 3),
('2013-1-09 00:34:00', 9);
CREATE TABLE processC
(date_time datetime, status varchar(4));
INSERT INTO processC
VALUES
('2013-1-08 18:00:00', 'yes'),
('2013-1-08 19:00:00', 'yes'),
('2013-1-08 20:00:00', 'yes'),
('2013-1-08 21:00:00', 'yes'),
('2013-1-08 22:00:00', 'yes'),
('2013-1-08 23:00:00', 'no'),
('2013-1-09 00:00:00', 'no'),
('2013-1-09 01:00:00', 'no');
As you can see the time at which readings occur for each of the processes is not the same.
ProcessA, IF it occurs, does so at 5 minute intervals
ProcessB, readings occur at unpredictable times but usually occur multiple times within the hour
ProcessC will always have an hourly value (yes or no).
Firstly, I want to convert processB so that there is a reading at every 5-minute interval, so that the data aligns with processA; that lets me do a simple join of both tables at the 5-minute marks. For the conversion, the value at each 5-minute mark should be set to the nearest processB observation within a [-30, 30) minute window. If two observations are equidistant, take their average. If none is available within the 30-minute window, set the value to null.
Once I have that, I can do a simple join on %Y%m%d%H with ProcessC using something like the following to get a final table with all data aligned at the 5 minute interval mark:
date_format(date_time, '%Y%m%d%H') = date_format(date_time, '%Y%m%d%H')
If anyone has any pointers or guidance, I would appreciate it.
Sample output:
'2013-1-8 22:10:00', 100, 4, yes <--- closer to 22:11 than 21:46
'2013-1-8 22:15:00', 100, 4, yes <--- closer to 22:11 than 22:31
'2013-1-8 22:30:00', 100, 5, yes <--- closer to 22:31 than 22:11
'2013-1-8 22:35:00', 100, 6, yes <--- closer to 22:36 than 22:31
'2013-1-8 22:40:00', 100, 7, yes <--- closer to 22:41 than 22:36
'2013-1-8 22:45:00', 100, 7, yes <--- closer to 22:41 than 23:06
'2013-1-8 22:50:00', 100, 7, yes <--- closer to 22:41 than 23:06
'2013-1-8 23:05:00', 100, 8, yes <--- closer to 23:06 than 22:41
'2013-1-8 23:10:00', 100, 8, no <--- closer to 23:06 than 23:20
'2013-1-8 23:20:00', 100, 2, no <--- closer to 23:20 than 23:06
'2013-1-8 23:25:00', 100, 2, no <--- closer to 23:20 than 23:46
'2013-1-8 23:35:00', 100, 3, no <--- closer to 23:46 than 23:20
'2013-1-9 00:05:00', 100, 3, no <--- closer to 23:46 than 00:34
'2013-1-9 00:10:00', 100, 6, no <--- takes the avg of 3 and 9
The tricky part of this is the retrieval of the appropriate row or rows from processB that correspond to each row of processA as you figured out.
Let's take it step by step.
First, we need to be able to join processA and processB to retrieve the candidate timestamp pairs. Let's do it like this:
SELECT a.date_time a,
TIMESTAMPDIFF(SECOND, a.date_time, b.date_time) timediff
FROM processA a
JOIN processB b
ON TIMESTAMPDIFF(SECOND, a.date_time, b.date_time) >= -1800
AND TIMESTAMPDIFF(SECOND, a.date_time, b.date_time) < 1800
This gets us the a and b times meeting the [-30, 30) criterion. There are a lot of rows in this result; but we can inspect it to make sure we've done the range comparison correctly. http://sqlfiddle.com/#!2/48f54/47/0
Now, for each a record, we need the size of the time window in which to search for its one or more matching b records. Like so:
SELECT a,
MIN(ABS(timediff)) windowsize
FROM (
SELECT a.date_time a,
TIMESTAMPDIFF(SECOND, a.date_time, b.date_time) timediff
FROM processA a
JOIN processB b
ON TIMESTAMPDIFF(SECOND, a.date_time, b.date_time) >= -1800
AND TIMESTAMPDIFF(SECOND, a.date_time, b.date_time) < 1800
) d
GROUP BY a
This yields two columns: the first is the timestamp from a, and the second is the time range of the nearest b timestamp (or timestamps, if more than one are to be averaged) that are in range. This resultset doesn't have any row for a records that don't have b records near enough to consider. http://sqlfiddle.com/#!2/48f54/46/0
Finally, we need to retrieve and average the b record values for each a record. Here it is.
SELECT processA.date_time date_time,
processA.valueA valueA,
AVG(processB.valueB) valueB
FROM processA
LEFT JOIN (
SELECT a,
MIN(ABS(timediff)) windowsize
FROM (
SELECT a.date_time a,
TIMESTAMPDIFF(SECOND, a.date_time, b.date_time) timediff
FROM processA a
JOIN processB b
ON TIMESTAMPDIFF(SECOND, a.date_time, b.date_time) >= -1800
AND TIMESTAMPDIFF(SECOND, a.date_time, b.date_time) < 1800
) d
GROUP BY a
) j ON processA.date_time = j.a
LEFT JOIN processB ON ( processB.date_time >= j.a - INTERVAL j.windowsize SECOND
AND processB.date_time <= j.a + INTERVAL j.windowsize SECOND
AND processB.date_time < j.a + INTERVAL 1800 SECOND)
GROUP BY processA.date_time, processA.valueA
Notice there are a couple of open ranges here (< operators instead of <= operators). Those are there to accommodate your half-open [-30, 30) range. Here's the query. http://sqlfiddle.com/#!2/48f54/45/0
This final query joins together three tables: processA, our virtual table showing the search range for each timestamp, and processB. The last ON clause performs the actual range search. It's made slightly more complicated by the half-open range.
See how this goes? It's helpful to construct the query from the inside out.
Don't forget to put an index on processB.date_time.
I am taking the liberty of leaving the join of processC to this virtual table to you.
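To check the SQL against the expected output, the same matching rule can be sketched outside the database. This is a minimal Python sketch, not part of the queries above: the function name nearest_b is made up for illustration, and the timestamps are zero-padded copies of the processB rows from the question.

```python
from datetime import datetime, timedelta


def parse(ts):
    """Parse a 'YYYY-MM-DD HH:MM:SS' string into a datetime."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")


# processB rows from the question (zero-padded dates).
process_b = [
    (parse("2013-01-08 21:46:00"), 3),
    (parse("2013-01-08 22:11:00"), 4),
    (parse("2013-01-08 22:31:00"), 5),
    (parse("2013-01-08 22:36:00"), 6),
    (parse("2013-01-08 22:41:00"), 7),
    (parse("2013-01-08 23:06:00"), 8),
    (parse("2013-01-08 23:20:00"), 2),
    (parse("2013-01-08 23:46:00"), 3),
    (parse("2013-01-09 00:34:00"), 9),
]


def nearest_b(a_time, b_rows, window=timedelta(minutes=30)):
    """Average of the processB values closest to a_time within the
    half-open window [-30, 30) minutes, or None if there is none."""
    # Step 1: candidate pairs inside the window (the first query above).
    candidates = [
        (abs(b_time - a_time), value)
        for b_time, value in b_rows
        if -window <= b_time - a_time < window
    ]
    if not candidates:
        return None
    # Step 2: the smallest distance (the MIN(ABS(timediff)) query).
    smallest = min(diff for diff, _ in candidates)
    # Step 3: average every value at that distance (the AVG in the final query).
    ties = [value for diff, value in candidates if diff == smallest]
    return sum(ties) / len(ties)


print(nearest_b(parse("2013-01-08 22:10:00"), process_b))  # nearest is 22:11
print(nearest_b(parse("2013-01-09 00:10:00"), process_b))  # 23:46 and 00:34 tie
print(nearest_b(parse("2013-01-10 12:00:00"), process_b))  # nothing in range
```

The three steps in the function mirror the three queries, which is also a convenient way to test each SQL stage separately.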