What's the most efficient way to generate this report? - mysql

Given a table (daily_sales) with say 100k rows of the following data/columns:
id rep sales date
1 a 123 12/15/2011
2 b 153 12/15/2011
3 a 11 12/14/2011
4 a 300 12/13/2011
5 a 120 12/12/2011
6 b 161 11/15/2011
7 a 3 11/14/2011
8 c 13 11/14/2011
9 c 44 11/13/2011
What would be the most efficient way to write a report (completely in SQL) showing the two most recent entries (rep, sales, date) for each rep, so the output would be:
a 123 12/15/2011
a 11 12/14/2011
b 153 12/15/2011
b 161 11/15/2011
c 13 11/14/2011
c 44 11/13/2011
Thanks!

FYI, your example uses mostly reserved words, which makes it horrid for us to program against. If you've got the real table columns, give those to us. This is Postgres:
select name,value, max(date)
from the_table_name_you_neglect_to_give_us
group by 1,2
That'll give you a list of first name,value,max(date)...though I gotta ask why give us a column called value if it doesn't change in the example?
Let's say you do have an id column...we'll be consistent with your scheme and call it 'ID'...
select b.id from
(select name,value, max(date) date
from the_table_name_you_neglect_to_give_us
group by 1,2) a
inner join the_table_name_you_neglect_to_give_us b on a.name=b.name and a.value=b.value and a.date = b.date
This gives a list of all ID's that are the max...put it together:
select name,value, max(date)
from the_table_name_you_neglect_to_give_us
group by 1,2
union all
select name,value, max(date)
from the_table_name_you_neglect_to_give_us
where id not in
(select b.id from
(select name,value, max(date) date
from the_table_name_you_neglect_to_give_us
group by 1,2) a
inner join the_table_name_you_neglect_to_give_us b on a.name=b.name and a.value=b.value and a.date = b.date)
group by 1,2
Hoping my syntax is right...should be close at any rate. I'd put a bracket around that entire thing then select * from (above query) order by name...gives you the order you want.

For MySQL, as explained in @Quassnoi's blog, create an index on (name, date) and use this:
SELECT t.*
FROM (
SELECT name,
COALESCE(
(
SELECT date
FROM tableX ti
WHERE ti.name = dto.name
ORDER BY
ti.name, ti.date DESC
LIMIT 1
OFFSET 1 -- this is set to 2-1
), CAST('1000-01-01' AS DATE)) AS mdate
FROM (
SELECT DISTINCT name
FROM tableX dt
) dto
) tg
, tableX t
WHERE t.name >= tg.name
AND t.name <= tg.name
AND t.date >= tg.mdate

If I understand what you mean, then this MIGHT be helpful:
SELECT main.name, main.value, main.date
FROM tablename AS main
LEFT OUTER JOIN tablename AS ctr
ON main.name = ctr.name
AND main.date <= ctr.date
GROUP BY main.name, main.date
HAVING COUNT(*) <= 2
ORDER BY main.name ASC, main.date DESC
I know the SQL is shorter than the other posts, but just give it a try first..
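To sanity-check the counting self-join against the question's sample rows, here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in for MySQL. The table layout (daily_sales with id, rep, sales, date), the ISO date strings, and the helper name top_two_per_rep are assumptions made for illustration, and the join is written against the question's columns rather than name/value.

```python
import sqlite3

# Sample rows from the question; dates are rewritten as ISO strings
# (an assumption) so that string comparison matches date order.
ROWS = [
    (1, "a", 123, "2011-12-15"),
    (2, "b", 153, "2011-12-15"),
    (3, "a", 11, "2011-12-14"),
    (4, "a", 300, "2011-12-13"),
    (5, "a", 120, "2011-12-12"),
    (6, "b", 161, "2011-11-15"),
    (7, "a", 3, "2011-11-14"),
    (8, "c", 13, "2011-11-14"),
    (9, "c", 44, "2011-11-13"),
]


def top_two_per_rep(rows=ROWS):
    """Return the two most recent (rep, sales, date) rows for each rep."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE daily_sales "
        "(id INTEGER PRIMARY KEY, rep TEXT, sales INTEGER, date TEXT)"
    )
    conn.executemany("INSERT INTO daily_sales VALUES (?, ?, ?, ?)", rows)
    # For each row, count the rows of the same rep dated on or after it
    # (the row itself included). A count of 1 or 2 means top two.
    query = """
        SELECT main.rep, main.sales, main.date
        FROM daily_sales AS main
        JOIN daily_sales AS ctr
          ON main.rep = ctr.rep
         AND main.date <= ctr.date
        GROUP BY main.id, main.rep, main.sales, main.date
        HAVING COUNT(*) <= 2
        ORDER BY main.rep ASC, main.date DESC
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result
```

One thing to watch: rows that tie on date within one rep are counted together, so a tie at the second-latest date drops both tied rows. Add a tiebreaker such as id if that can happen in your data.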

Related

Get distinct values from groups in MySQL

I want to get the id of the row with the lowest points for each team (the team field).
My query works, but I need to make sure it is good enough for a large table.
I need simplification and optimization.
Query:
SELECT T.id from teams as T
INNER JOIN (
SELECT MIN(T1.points) AS P FROM teams AS T1
GROUP BY T1.team LIMIT 5
) TJOIN ON T.points IN (TJOIN.P)
GROUP BY T.team
ORDER BY T.points ASC LIMIT 5
Table teams
id team (foreign_key) points (indexed)
1 a 100
2 a 101
3 b 106
4 c 105
5 c 102
Result
id
1
5
3
I believe the query you are looking for is:
SELECT MIN(T.id) AS id
FROM teams AS T
INNER JOIN (
SELECT team, MIN(points) AS min_points
FROM teams
GROUP BY team
) TJOIN
ON T.team = TJOIN.team
AND T.points = TJOIN.min_points
GROUP BY T.team, TJOIN.min_points
ORDER BY TJOIN.min_points ASC
LIMIT 5
You need to join based on both the column being grouped by and the min value. Consider the result of your query if multiple teams had a score of 100.
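To see why the team has to be part of the join condition, here is a small sketch in Python with the standard-library sqlite3 module standing in for MySQL. The rows are hypothetical (teams a and b both have a minimum of 100, and team c's minimum equals a non-minimum score of team a), and the helper name ids_matching_minimum is made up for this example.

```python
import sqlite3

# Hypothetical rows (id, team, points) with tied scores across teams.
ROWS = [
    (1, "a", 100),
    (2, "a", 101),
    (3, "b", 100),
    (4, "b", 106),
    (5, "c", 101),
]


def ids_matching_minimum(join_on_team, rows=ROWS):
    """Return ids whose points match a per-team minimum.

    With join_on_team=False the join compares points only, so a row can
    match the minimum of a different team.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE teams (id INTEGER, team TEXT, points INTEGER)")
    conn.executemany("INSERT INTO teams VALUES (?, ?, ?)", rows)
    condition = "T.points = M.min_points"
    if join_on_team:
        condition += " AND T.team = M.team"
    query = f"""
        SELECT DISTINCT T.id
        FROM teams AS T
        JOIN (
            SELECT team, MIN(points) AS min_points
            FROM teams
            GROUP BY team
        ) AS M
          ON {condition}
        ORDER BY T.id
    """
    ids = [row[0] for row in conn.execute(query)]
    conn.close()
    return ids
```

With these rows, joining on points alone also returns id 2, which is not the lowest score of its own team. Joining on team and points returns only ids 1, 3 and 5.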
Another way of doing this is to use ROW_NUMBER():
SELECT id
FROM (
SELECT id, points, ROW_NUMBER() OVER (PARTITION BY team ORDER BY points ASC, id ASC) rn
FROM teams
) t
WHERE rn = 1
ORDER BY points ASC
LIMIT 5

I want to calculate the sum of last transaction for A&B

Let's say the table looks like this:
user_id date amount
123 2022/11/01 5
456 2022/11/02 6
789 2022/11/03 8
123 2022/11/02 9
456 2022/11/04 6
789 2022/11/05 8
I want to calculate the sum of the very last transaction (only one per user) for users A and B. FYI, I'm using Redash and I'm a beginner, so I'm not sure what other info you would need. I tried MAX but was not sure how to apply it to more than one specific user.
Get the sum of Amount where user is A or B and date is the most recent date for each user
SELECT SUM(AMOUNT) AS total
FROM (
SELECT AMOUNT, ROW_NUMBER() OVER (PARTITION BY USERID ORDER BY DATE DESC) AS RN
FROM tableyoudidnotname
WHERE userid in ('A','B')
) X
WHERE X.RN = 1
You can try this, where we first calculate the maximum date by user in a common-table expression, then join that result-set to the table to sum the associated values.
WITH dat
AS
(
SELECT user_id, MAX(date) AS max_date
FROM credit.card
WHERE user_id IN ('A','B','ETC')
GROUP BY user_id
)
SELECT SUM(value) AS sum_on_max_dates
FROM credit.card t
INNER JOIN dat d ON t.user_id = d.user_id AND t.date = d.max_date;
You can try this, which joins the table to the subquery shown below.
SELECT
SUM(t1.amount) AS count
FROM
transaction t1
JOIN
(SELECT
user_id, MAX(date) AS max_date
FROM
transaction
WHERE
user_id IN ('A', 'B')
GROUP BY user_id) t2 ON t1.user_id = t2.user_id
AND t2.max_date = t1.date;
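As a rough check of the join-to-subquery approach on the question's rows, here is a sketch in Python using the standard-library sqlite3 module in place of MySQL. The table name transactions, the column names, and the helper sum_of_last_transactions are assumptions, and the numeric ids 123 and 456 stand in for users "A" and "B".

```python
import sqlite3

# Rows from the question as (user_id, date, amount). The YYYY/MM/DD
# strings sort correctly as text, which MAX(date) relies on here.
ROWS = [
    (123, "2022/11/01", 5),
    (456, "2022/11/02", 6),
    (789, "2022/11/03", 8),
    (123, "2022/11/02", 9),
    (456, "2022/11/04", 6),
    (789, "2022/11/05", 8),
]


def sum_of_last_transactions(user_ids, rows=ROWS):
    """Sum the amount of each listed user's most recent transaction."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transactions (user_id INTEGER, date TEXT, amount INTEGER)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", rows)
    # One "?" placeholder per requested user id.
    placeholders = ", ".join("?" for _ in user_ids)
    query = f"""
        SELECT SUM(t1.amount)
        FROM transactions AS t1
        JOIN (
            SELECT user_id, MAX(date) AS max_date
            FROM transactions
            WHERE user_id IN ({placeholders})
            GROUP BY user_id
        ) AS t2
          ON t1.user_id = t2.user_id
         AND t1.date = t2.max_date
    """
    total = conn.execute(query, list(user_ids)).fetchone()[0]
    conn.close()
    return total
```

For users 123 and 456 the latest rows are 2022/11/02 (amount 9) and 2022/11/04 (amount 6). If a user can have two transactions on the same latest date, both are summed, so a tiebreaker column would be needed in that case.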

mysql finding the sum of subgroup maximums

If I have the following table in MySQL:
date type amount
2017-12-01 3 2
2018-01-01 1 100
2018-02-01 1 50
2018-03-01 2 2000
2018-04-01 2 4000
2018-05-01 3 2
2018-06-01 3 1
...is there a way to find the sum of the amounts corresponding to the latest dates of each type? There are guaranteed to be no duplicate dates for any given type.
The answer I'd be looking to get from the data above could be broken down like this:
The latest date for type 1 is 2018-02-01, where the amount is 50;
The latest date for type 2 is 2018-04-01, where the amount is 4000;
The latest date for type 3 is 2018-06-01, where the amount is 1;
50 + 4000 + 1 = 4051
Is there a way to arrive directly at 4051 in a single query? This is for a Django project using MySQL if that makes a difference; I wasn't able to find an ORM-related solution either, so figured a raw SQL query might be a better place to start.
Thanks!
Not sure about Django, but in raw SQL you could use a self join to pick the latest row for each type based on its date, and then aggregate the results to get the sum of those amounts:
select sum(a.amount)
from your_table a
left join your_table b on a.type = b.type
and a.date < b.date
where b.type is null
Or
select sum(a.amount)
from your_table a
join (
select type, max(date) max_date
from your_table
group by type
) b on a.type = b.type
and a.date = b.max_date
Or by using a correlated subquery
select sum(a.amount)
from your_table a
where a.date = (
select max(date)
from your_table
where type = a.type
)
For MySQL 8 you can use window functions to get your desired result:
select sum(amount)
from (select *, row_number() over (partition by type order by date desc) as seq
from your_table
) t
where seq = 1;
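To check that the self-join, join-on-max and correlated-subquery variants agree on the question's data, here is a sketch in Python with the standard-library sqlite3 module standing in for MySQL. The helper name sums_of_latest_amounts and the dictionary keys are made up for illustration. The window-function variant is left out only to keep the sketch independent of the SQLite version.

```python
import sqlite3

# Rows from the question as (date, type, amount).
ROWS = [
    ("2017-12-01", 3, 2),
    ("2018-01-01", 1, 100),
    ("2018-02-01", 1, 50),
    ("2018-03-01", 2, 2000),
    ("2018-04-01", 2, 4000),
    ("2018-05-01", 3, 2),
    ("2018-06-01", 3, 1),
]

QUERIES = {
    # Keep rows for which no later row of the same type exists.
    "anti_join": """
        SELECT SUM(a.amount)
        FROM your_table a
        LEFT JOIN your_table b ON a.type = b.type AND a.date < b.date
        WHERE b.type IS NULL
    """,
    # Join each row to the latest date of its type.
    "join_on_max": """
        SELECT SUM(a.amount)
        FROM your_table a
        JOIN (
            SELECT type, MAX(date) AS max_date
            FROM your_table
            GROUP BY type
        ) b ON a.type = b.type AND a.date = b.max_date
    """,
    # Compare each row's date with the latest date of its type.
    "correlated_subquery": """
        SELECT SUM(a.amount)
        FROM your_table a
        WHERE a.date = (
            SELECT MAX(b.date) FROM your_table b WHERE b.type = a.type
        )
    """,
}


def sums_of_latest_amounts(rows=ROWS):
    """Run each variant and return its total keyed by variant name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE your_table (date TEXT, type INTEGER, amount INTEGER)")
    conn.executemany("INSERT INTO your_table VALUES (?, ?, ?)", rows)
    results = {name: conn.execute(sql).fetchone()[0] for name, sql in QUERIES.items()}
    conn.close()
    return results
```

All three variants pick 50, 4000 and 1 as the latest amounts, matching the breakdown in the question.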

SQL Query Groupby UserID and take the first and last for duplicates

Hello, I am trying to select certain elements from a table. My table has the columns Time, UserId, Data. For each UserId, I want the first and last value of Data according to the timestamp.
Example:
Time UserID Data
8 PM 1 200
9 PM 1 300
10 PM 1 100
8 PM 2 150
9 PM 2 250
10 PM 2 350
8 PM 3 100
So my result should look like:
1 200 100
2 150 350
3 100 100
Find the min and max time per userid
Then Join the result with main table with userid and min time to get the min data per userid
Then again join the result with main table with userid and max time to get the max data per userid
Try this.
select A.UserID, A.Data as Min_data, C.Data as Max_data
from test A
join
(
SELECT UserID, MIN(Times) AS Min_Time,
Max(Times) AS Max_Time
FROM test
GROUP BY UserID
) B
ON A.UserID = B.UserID
and A.Times = B.Min_Time
join test C
ON C.UserID = B.UserID
and C.Times = B.Max_Time
There are several steps to this.
The first is to find the earliest and latest time for each userid. That you do like this:
SELECT UserID,
MIN(Time) startTime,
MAX(Time) endTime
FROM theTable
GROUP BY UserID
Then, you need to use that result to fetch the Data associated with the start time. You do that by joining the above summary query (virtual table) like so.
SELECT b.UserID,
b.Data startData
FROM (
SELECT UserID,
MIN(Time) startTime,
MAX(Time) endTime
FROM theTable
GROUP BY UserID
) a
JOIN theTable b ON a.UserID = b.UserID AND a.startTime = b.Time
Finally, you need to cope with the end value in a similar way.
SELECT b.UserID,
b.Data startData,
c.Data endData
FROM (
SELECT UserID,
MIN(Time) startTime,
MAX(Time) endTime
FROM theTable
GROUP BY UserID
) a
JOIN theTable b ON a.UserID = b.UserID AND a.startTime = b.Time
JOIN theTable c ON a.UserID = c.UserID AND a.endTime = c.Time
This is a slightly tricky query because you have to join your table twice to get the two detail rows (start and end time rows) from it.
The "club sandwich" approach to building up your query should serve to make it clear how it works.
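Here is a sketch of the final query run against the question's example, using Python's standard-library sqlite3 module as a stand-in for the real database. Storing the times as 24-hour integers in a column called hour (8 PM as 20, and so on) is an assumption so that MIN and MAX order them correctly. The helper name first_and_last_data is made up, and the third join matches on endTime.

```python
import sqlite3

# Rows from the question as (hour, UserID, Data), with 8 PM, 9 PM and
# 10 PM stored as 20, 21 and 22 (an assumption for sortable times).
ROWS = [
    (20, 1, 200),
    (21, 1, 300),
    (22, 1, 100),
    (20, 2, 150),
    (21, 2, 250),
    (22, 2, 350),
    (20, 3, 100),
]


def first_and_last_data(rows=ROWS):
    """Return (UserID, startData, endData) for every user."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE theTable (hour INTEGER, UserID INTEGER, Data INTEGER)")
    conn.executemany("INSERT INTO theTable VALUES (?, ?, ?)", rows)
    # Summary query in the middle, detail rows joined on either side.
    query = """
        SELECT a.UserID, b.Data AS startData, c.Data AS endData
        FROM (
            SELECT UserID, MIN(hour) AS startTime, MAX(hour) AS endTime
            FROM theTable
            GROUP BY UserID
        ) a
        JOIN theTable b ON a.UserID = b.UserID AND a.startTime = b.hour
        JOIN theTable c ON a.UserID = c.UserID AND a.endTime = c.hour
        ORDER BY a.UserID
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result
```

User 3 has a single row, so its start and end values are the same, as in the expected output of the question.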

Calculate the time difference between two rows

I have a table with a column StartDate, and I want to calculate the time difference between two consecutive records.
Thanks.
@Mark Byers and @Yahia, I have a request table with requestId and startdate:
requestId startdate
1 2011-10-16 13:15:56
2 2011-10-16 13:15:59
3 2011-10-16 13:15:59
4 2011-10-16 13:16:02
5 2011-10-16 13:18:07
I want to know the time difference between requestId 1 & 2, 2 & 3, 3 & 4, and so on. I know I will need a self join on the table, but I am not getting the ON clause right.
To achieve what you are asking try the following (UPDATE after edit from OP):
SELECT A.requestid, A.startdate, (B.startdate - A.startdate) AS timedifference
FROM MyTable A INNER JOIN MyTable B ON B.requestid = (A.requestid + 1)
ORDER BY A.requestid ASC
IF requestid is not consecutive then you can use
SELECT A.requestid, A.startdate, (B.startdate - A.startdate) AS timedifference
FROM MyTable A CROSS JOIN MyTable B
WHERE B.requestid IN (SELECT MIN(C.requestid) FROM MyTable C WHERE C.requestid > A.requestid)
ORDER BY A.requestid ASC
The accepted answer is correct but gives the difference of numbers.
As an example if I have the following 2 timestamps:
2014-06-09 09:48:15
2014-06-09 09:50:11
The difference is given as 196. This is simply 5011 - 4815.
In order to get the time difference, you may modify the script as follows:
SELECT A.requestid, A.startdate, TIMESTAMPDIFF(MINUTE,A.startdate,B.startdate) AS timedifference
FROM MyTable A INNER JOIN MyTable B ON B.requestid = (A.requestid + 1)
ORDER BY A.requestid ASC
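The 196 above can be reproduced outside the database. Here is a small Python sketch using only the standard library. The helper names are made up for illustration, and treating each timestamp as a YYYYMMDDhhmmss integer is my reading of what the plain subtraction does with the values.

```python
from datetime import datetime

FORMAT = "%Y-%m-%d %H:%M:%S"


def numeric_difference(start, end):
    """Subtract the two timestamps as YYYYMMDDhhmmss integers."""
    def as_number(text):
        return int(datetime.strptime(text, FORMAT).strftime("%Y%m%d%H%M%S"))

    return as_number(end) - as_number(start)


def elapsed_seconds(start, end):
    """Return the real elapsed time between the timestamps in seconds."""
    delta = datetime.strptime(end, FORMAT) - datetime.strptime(start, FORMAT)
    return int(delta.total_seconds())
```

For 2014-06-09 09:48:15 and 2014-06-09 09:50:11 the numeric subtraction gives 196, while the elapsed time is 116 seconds, just under two minutes.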
SELECT TIMESTAMPDIFF(SECOND, A.startdate, B.startdate) as TD FROM myTable A
inner join myTable B on A.requestId = B.requestId - 1 and
A.startdate >= '2019-07-01' order by TD desc
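To try the consecutive-row self join on the question's five requests, here is a sketch in Python using the standard-library sqlite3 module. SQLite has no TIMESTAMPDIFF, so the sketch subtracts strftime('%s', ...) epoch seconds as a stand-in. The table name request and the helper name seconds_between_requests are assumptions.

```python
import sqlite3

# Rows from the question as (requestId, startdate).
ROWS = [
    (1, "2011-10-16 13:15:56"),
    (2, "2011-10-16 13:15:59"),
    (3, "2011-10-16 13:15:59"),
    (4, "2011-10-16 13:16:02"),
    (5, "2011-10-16 13:18:07"),
]


def seconds_between_requests(rows=ROWS):
    """Return (requestId, seconds until the next requestId) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE request (requestId INTEGER, startdate TEXT)")
    conn.executemany("INSERT INTO request VALUES (?, ?)", rows)
    # Pair each row with the row whose requestId is one higher, then
    # subtract the two timestamps expressed as epoch seconds.
    query = """
        SELECT A.requestId,
               CAST(strftime('%s', B.startdate) AS INTEGER)
             - CAST(strftime('%s', A.startdate) AS INTEGER) AS seconds
        FROM request A
        JOIN request B ON B.requestId = A.requestId + 1
        ORDER BY A.requestId
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result
```

The last request has no successor, so the inner join leaves it out. If the ids can have gaps, the MIN(requestid) variant shown earlier is the one to adapt.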