+----+-----------+---------+---------------------+
| id | seller_id | prod_id | date                |
+----+-----------+---------+---------------------+
|  1 |       283 |    4243 | 2016-10-10 23:55:01 |
|  2 |       287 |    4243 | 2016-10-10 02:01:06 |
|  3 |       283 |    4243 | 2016-10-11 23:55:06 |
|  4 |       311 |    4243 | 2016-10-11 23:55:07 |
|  5 |       283 |    4243 | 2016-10-12 23:55:07 |
|  6 |       283 |    4243 | 2016-10-13 23:55:07 |
|  7 |       311 |    4243 | 2016-10-13 23:55:07 |
|  8 |       287 |    4243 | 2016-10-14 23:57:06 |
|  9 |       311 |    4243 | 2016-10-14 23:57:06 |
| 10 |       311 |    4243 | 2016-10-15 23:57:06 |
+----+-----------+---------+---------------------+
From the table above, how would I extract the following information using a MySQL query?
+-----------+---------+----------------+--------------+
| seller_id | prod_id | streak in days | begin streak |
+-----------+---------+----------------+--------------+
|       283 |    4243 |              4 | 2016-10-10   |
|       287 |    4243 |              1 | 2016-10-10   |
|       311 |    4243 |              1 | 2016-10-11   |
|       311 |    4243 |              3 | 2016-10-13   |
|       287 |    4243 |              1 | 2016-10-14   |
+-----------+---------+----------------+--------------+
So basically I need to identify each block of consecutive dates for each seller (seller_id) selling products (prod_id).
I limited this example to one prod_id and a range of only a few days, but sellers do sell more than one product (prod_id).
SELECT
    seller_id
    ,prod_id
    ,COUNT(*) as StreakInDays
    ,MIN(DateCol) as BeginStreak
FROM
    (
        SELECT
            seller_id
            ,prod_id
            ,DATE(DateCol) as DateCol
            ,(@rn:= if((@seller = seller_id) AND (@prod = prod_id), @rn + 1,
                       if((@seller:= seller_id) AND (@prod:= prod_id), 1, 1)
                      )
             ) as RowNumber
        FROM
            Transact t
            CROSS JOIN (SELECT @seller:=0, @prod:=0, @rn:=0) var
        ORDER BY
            seller_id
            ,prod_id
            ,DATE(DateCol)
    ) t
GROUP BY
    seller_id
    ,prod_id
    ,DATE_SUB(DateCol, INTERVAL RowNumber Day)
ORDER BY
    prod_id
    ,DATE_SUB(DateCol, INTERVAL RowNumber Day)
    ,seller_id
Generate a row number partitioned by seller_id and prod_id. Then group by the date minus RowNumber, which stays constant within a run of consecutive dates, and you can get to your answer by simple aggregation.
SQL Fiddle to show you it works for multiple products, sellers etc. http://sqlfiddle.com/#!9/0a0c44/8/0
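To see why the date-minus-row-number key works, here is a minimal sketch of the same idea in plain Python rather than MySQL. The function name `streaks` and the tuple layout are illustrative assumptions, not part of the original answer; the rows are the sample data from the question with the time part dropped.

```python
from datetime import date, timedelta
from itertools import groupby

# Sample rows from the question: (seller_id, prod_id, sale_day).
rows = [
    (283, 4243, date(2016, 10, 10)), (287, 4243, date(2016, 10, 10)),
    (283, 4243, date(2016, 10, 11)), (311, 4243, date(2016, 10, 11)),
    (283, 4243, date(2016, 10, 12)), (283, 4243, date(2016, 10, 13)),
    (311, 4243, date(2016, 10, 13)), (287, 4243, date(2016, 10, 14)),
    (311, 4243, date(2016, 10, 14)), (311, 4243, date(2016, 10, 15)),
]

def streaks(rows):
    """Return (seller_id, prod_id, streak_days, begin_day) per consecutive run."""
    result = []
    # De-duplicate and sort so each partition's days are ascending.
    ordered = sorted(set(rows))
    for (seller, prod), part in groupby(ordered, key=lambda r: r[:2]):
        days = [r[2] for r in part]
        # day minus row number is identical for every day of one consecutive run.
        keyed = [(d - timedelta(days=rn), d) for rn, d in enumerate(days, start=1)]
        for _, run in groupby(keyed, key=lambda kd: kd[0]):
            run_days = [d for _, d in run]
            result.append((seller, prod, len(run_days), run_days[0]))
    return result

for row in streaks(rows):
    print(row)
```

For seller 311 the days 11, 13, 14, 15 get row numbers 1 to 4, so the keys are 10, 11, 11, 11: one streak of 1 day and one streak of 3 days, matching the expected result.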
Note: if the same seller can have more than one transaction for a product on the same day, you will need to replace Transact with a derived table of DISTINCT seller_id, prod_id, DATE(DateCol) before generating the row number, like this:
SELECT
    seller_id
    ,prod_id
    ,COUNT(*) as StreakInDays
    ,MIN(DateCol) as BeginStreak
FROM
    (
        SELECT
            seller_id
            ,prod_id
            ,DateCol
            ,(@rn:= if((@seller = seller_id) AND (@prod = prod_id), @rn + 1,
                       if((@seller:= seller_id) AND (@prod:= prod_id), 1, 1)
                      )
             ) as RowNumber
        FROM
            (SELECT DISTINCT seller_id, prod_id, DATE(DateCol) as DateCol
             FROM
                 Transact
            ) t
            CROSS JOIN (SELECT @seller:=0, @prod:=0, @rn:=0) var
        ORDER BY
            seller_id
            ,prod_id
            ,DateCol
    ) t
GROUP BY
    seller_id
    ,prod_id
    ,DATE_SUB(DateCol, INTERVAL RowNumber Day)
ORDER BY
    prod_id
    ,DATE_SUB(DateCol, INTERVAL RowNumber Day)
    ,seller_id
http://sqlfiddle.com/#!9/0a0c44/11
I have two queries that retrieve records from 2 different tables that are almost alike and I need to merge them together.
Both have a created_date column of type datetime, which I'm casting to date because I want to group and order by date only; I don't need the time.
First query:
select cast(created_date as date) the_date, count(*)
from question
where user_id = 2
group by the_date
order by the_date;
+------------+----------+
| the_date | count(*) |
+------------+----------+
| 2021-01-02 | 1 |
| 2021-02-10 | 1 |
| 2021-02-14 | 5 | -- this line contains a mutual date
| 2021-03-16 | 1 |
| 2021-03-26 | 3 |
| 2021-03-27 | 23 |
| 2021-03-28 | 5 |
| 2021-03-29 | 1 |
+------------+----------+
Second query:
select cast(created_date as date) the_date, count(*)
from answer
where user_id = 2
group by the_date
order by the_date;
+------------+----------+
| the_date | count(*) |
+------------+----------+
| 2021-02-08 | 2 |
| 2021-02-14 | 1 | -- this line contains a mutual date
| 2021-04-05 | 5 |
| 2021-04-06 | 2 |
+------------+----------+
What I need is to merge them like this:
+------------+---------------+---------------+
| the_date | count(query1) | count(query2) |
+------------+---------------+---------------+
| 2021-01-02 | 1 | 0 | -- count(query2) is 0 bc. it's not in the second query
| 2021-02-08 | 0 | 2 | -- count(query1) is 0 bc. it's not in the first query
| 2021-02-10 | 1 | 0 |
| 2021-02-14 | 5 | 1 | -- mutual date
| 2021-03-16 | 1 | 0 |
| 2021-03-26 | 3 | 0 |
| 2021-03-27 | 23 | 0 |
| 2021-03-28 | 5 | 0 |
| 2021-03-29 | 1 | 0 |
| 2021-04-05 | 0 | 5 |
| 2021-04-06 | 0 | 2 |
+------------+---------------+---------------+
Basically what I need is to have all dates together and for each date to have the corresponding values from those two queries.
Try something like this:
SELECT the_date, MAX(cnt1), MAX(cnt2)
FROM (
    select cast(created_date as date) the_date, count(*) AS cnt1, 0 AS cnt2
    from question
    where user_id = 2
    group by the_date
    UNION ALL
    select cast(created_date as date) the_date, 0, count(*)
    from answer
    where user_id = 2
    group by the_date
) AS t1
GROUP BY the_date
ORDER BY the_date;
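A quick way to check the UNION ALL pattern is to run it against a throwaway database. The sketch below assumes an in-memory SQLite database instead of MySQL (so `date(created_date)` replaces the cast), and the handful of rows and their times are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE question (user_id INTEGER, created_date TEXT);
    CREATE TABLE answer   (user_id INTEGER, created_date TEXT);
    -- Hypothetical sample rows: one shared date (2021-02-14), two unshared.
    INSERT INTO question VALUES
        (2, '2021-01-02 09:00:00'),
        (2, '2021-02-14 10:00:00'),
        (2, '2021-02-14 11:30:00');
    INSERT INTO answer VALUES
        (2, '2021-02-08 08:15:00'),
        (2, '2021-02-14 12:00:00');
""")

# Each branch fills the other table's count with 0; the outer MAX() then
# collapses the (at most) two rows per date into one.
merged = conn.execute("""
    SELECT the_date, MAX(cnt1) AS questions, MAX(cnt2) AS answers
    FROM (
        SELECT date(created_date) AS the_date, COUNT(*) AS cnt1, 0 AS cnt2
        FROM question WHERE user_id = 2 GROUP BY date(created_date)
        UNION ALL
        SELECT date(created_date), 0, COUNT(*)
        FROM answer WHERE user_id = 2 GROUP BY date(created_date)
    ) AS t1
    GROUP BY the_date
    ORDER BY the_date
""").fetchall()

for row in merged:
    print(row)
```

Dates that appear in only one table keep the 0 from the other branch, which is what produces the zero-filled columns in the desired output.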
Given a sample table:
| id | locationID | Date_Time | temp |
|----|------------|---------------------|------|
| 1 | L001 | 2018-09-04 11:25:00 | 52.6 |
| 2 | L002 | 2018-09-04 11:35:00 | 66.1 |
| 3 | L003 | 2018-09-04 03:30:00 | 41.2 |
| 4 | L003 | 2018-09-05 10:22:00 | 71.8 |
| 5 | L003 | 2018-09-06 14:21:00 | 63.4 |
| 6 | L003 | 2018-09-06 18:18:00 | 50.1 |
I would like to return the latest N records for each group (N = 3 in this example), as below:
Expected output:
| id | locationID | Date_Time | temp |
|----|------------|---------------------|------|
| 1 | L001 | 2018-09-04 11:25:00 | 52.6 |
| 2 | L002 | 2018-09-04 11:35:00 | 66.1 |
| 4 | L003 | 2018-09-05 10:22:00 | 71.8 |
| 5 | L003 | 2018-09-06 14:21:00 | 63.4 |
| 6 | L003 | 2018-09-06 18:18:00 | 50.1 |
I have this query, but it only returns the latest row for each group. I would like to return more than one row (N rows) for each group.
SELECT *
FROM HealthStatus
WHERE Date_Time IN (
SELECT MAX(Date_Time)
FROM HealthStatus
GROUP BY LocationID
)
Would really appreciate some help on how I can achieve my desired output. Thanks in advance.
Is there any way that I can get the latest N rows for each group?
Prior to the availability of row_number(), you can use variables to mimic that function:
SELECT
    *
FROM (
    SELECT
        @row_num := IF(@prev_value = locationID, @row_num + 1, 1) AS RowNumber
        , id, locationID, Date_Time, temp
        , @prev_value := locationID
    FROM HealthStatus
    CROSS JOIN (SELECT @row_num := 1, @prev_value := '') vars
    ORDER BY
        locationID, Date_Time DESC
) derived
WHERE RowNumber < 4
If you upgraded to MySQL 8.0, then this would solve it:
SELECT * FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY locationID ORDER BY Date_Time DESC) AS r
FROM HealthStatus
) T
WHERE r <= 3;
On 5.7... I can't think of a way. :(
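One approach that avoids both window functions and user variables is a correlated subquery that counts how many newer rows exist in the same group. The sketch below runs it against an in-memory SQLite database as a stand-in; the SQL itself is plain enough that it should also be accepted by MySQL 5.7, but that is an assumption that has not been checked here. The alias `newer` and the variable `N` are illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE HealthStatus (id INTEGER, locationID TEXT, Date_Time TEXT, temp REAL);
    INSERT INTO HealthStatus VALUES
        (1, 'L001', '2018-09-04 11:25:00', 52.6),
        (2, 'L002', '2018-09-04 11:35:00', 66.1),
        (3, 'L003', '2018-09-04 03:30:00', 41.2),
        (4, 'L003', '2018-09-05 10:22:00', 71.8),
        (5, 'L003', '2018-09-06 14:21:00', 63.4),
        (6, 'L003', '2018-09-06 18:18:00', 50.1);
""")

N = 3  # rows to keep per locationID

# Keep a row only if fewer than N rows of the same location are newer than it.
latest = conn.execute("""
    SELECT h.id, h.locationID, h.Date_Time, h.temp
    FROM HealthStatus AS h
    WHERE (SELECT COUNT(*)
           FROM HealthStatus AS newer
           WHERE newer.locationID = h.locationID
             AND newer.Date_Time > h.Date_Time) < ?
    ORDER BY h.locationID, h.Date_Time
""", (N,)).fetchall()

for row in latest:
    print(row)
```

Two caveats: rows that tie on Date_Time within a location can push the result above N rows, and the subquery runs once per row, so on a large table an index on (locationID, Date_Time) would likely matter.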
This is a follow-up question to "MySQL count / track streaks or consecutive dates".
The solution provided by Matt to my earlier question works great, but I'm running into an issue now that I'm dealing with 1 extra column (prod_cond). A product can be used or new, and will be listed under the same prod_id. In these cases the streaks are not calculated correctly anymore.
I created an example here: http://sqlfiddle.com/#!9/3f04c3/17
I haven't been able to get them to display correctly with this additional column.
+----+-----------+---------+-----------+---------------------+
| id | seller_id | prod_id | prod_cond | date                |
+----+-----------+---------+-----------+---------------------+
|  1 |       283 |    4243 |         1 | 2016-10-10 23:55:01 |
|  2 |       283 |    4243 |         2 | 2016-10-10 02:01:06 |
|  3 |       283 |    4243 |         1 | 2016-10-11 23:55:06 |
|  4 |       283 |    4243 |         2 | 2016-10-11 23:55:07 |
|  5 |       283 |    4243 |         1 | 2016-10-12 23:55:07 |
|  6 |       283 |    4243 |         2 | 2016-10-13 23:55:07 |
|  7 |       283 |    4243 |         1 | 2016-10-14 23:55:07 |
|  8 |       283 |    4243 |         2 | 2016-10-14 23:57:06 |
|  9 |       283 |    4243 |         1 | 2016-10-15 23:57:06 |
| 10 |       283 |    4243 |         2 | 2016-10-15 23:57:06 |
+----+-----------+---------+-----------+---------------------+
So basically I need to identify each block of consecutive dates for each seller (seller_id) selling products (prod_id) with product condition (prod_cond) new (1) or used (2).
This is what the result should look like:
+-----------+---------+---------+----------------+--------------+
| seller_id | prod_id | cond_id | streak in days | begin streak |
+-----------+---------+---------+----------------+--------------+
|       283 |    4243 |       1 |              3 | 2016-10-10   |
|       283 |    4243 |       1 |              2 | 2016-10-14   |
|       283 |    4243 |       2 |              2 | 2016-10-10   |
|       283 |    4243 |       2 |              3 | 2016-10-13   |
+-----------+---------+---------+----------------+--------------+
But as you can see here: http://sqlfiddle.com/#!9/3f04c3/17
It is not working correctly.
In MySQL, you would do this using variables:
select seller_id, prod_id, cond_id, count(*) as numdays,
       min(date), max(date)
from (select t.*,
             (@rn := if(@grp = concat_ws(':', seller_id, prod_id, cond_id), @rn + 1,
                        if(@grp := concat_ws(':', seller_id, prod_id, cond_id), 1, 1)
                       )
             ) rn
      from transact t cross join
           (select @grp := '', @rn := 0) params
      order by seller_id, prod_id, cond_id, date
     ) t
group by seller_id, prod_id, cond_id, date_sub(date(date), interval rn day)
The idea is that for each group -- based on seller, product, and condition -- the query enumerates the dates. Then, the date minus the enumerated value is constant for consecutive dates.
Here is a SQL Fiddle showing it working.
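The "date minus enumerated value" step can be checked by hand. A small sketch in plain Python, using the five days on which seller 283 sold product 4243 in condition 1 (taken from the table in the question); the variable names are illustrative:

```python
from datetime import date, timedelta

# Days with a condition-1 sale for seller 283 / prod 4243 (ids 1, 3, 5, 7, 9).
days = [date(2016, 10, d) for d in (10, 11, 12, 14, 15)]

# Enumerate the days in order, then subtract the row number from each day.
keys = [(d, rn, d - timedelta(days=rn)) for rn, d in enumerate(days, start=1)]

for d, rn, key in keys:
    print(d, rn, key)
```

The first three days all map to 2016-10-09 and the last two to 2016-10-10, so grouping on that value yields a 3-day streak starting 2016-10-10 and a 2-day streak starting 2016-10-14, as in the expected result.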
I am trying to come up with a single query which will take the following table (named sales):
user_id | order_total | order_date |
1 | 100 | 2012-01-01 |
1 | 200 | 2013-06-04 |
1 | 150 | 2012-01-08 |
2 | 100 | 2015-02-01 |
3 | 105 | 2014-10-27 |
And will return the following:
user_id | order_total | num_orders | last_order |
1 | 450 | 3 | 2013-06-04 |
3 | 105 | 1 | 2014-10-27 |
2 | 100 | 1 | 2015-02-01 |
So far I have come up with the following SQL to get the result:
SELECT
DISTINCT a.user_id,
SUM(order_total) AS order_total,
COUNT(*) AS num_orders,
b.order_date as last_order
FROM
`sales` AS a,
(
SELECT
order_date,
user_id
FROM `sales`
ORDER BY order_date DESC
) AS b
WHERE a.user_id = b.user_id
GROUP BY user_id
ORDER BY order_total DESC
The problem, however, is that it returns:
user_id | order_total | num_orders | last_order |
1 | 1350 | 9 | 2013-06-04 |
3 | 105 | 1 | 2014-10-27 |
2 | 100 | 1 | 2015-02-01 |
Is there some way to prevent the sub-query from affecting the results of Sum and Count? Or am I going about this the wrong way?
Why are you using a subselect? MAX(order_date) gives you the last order directly:
SELECT user_id,
       SUM(order_total) AS order_total,
       COUNT(*) AS num_orders,
       MAX(order_date) AS last_order
FROM sales
GROUP BY user_id
ORDER BY order_total DESC
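The inflated numbers in the question come from the join itself: pairing every order of a user with every other order of the same user turns user 1's 3 rows into 3 x 3 = 9. A minimal sketch that reproduces both results, assuming an in-memory SQLite database as a stand-in for MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (user_id INTEGER, order_total INTEGER, order_date TEXT);
    INSERT INTO sales VALUES
        (1, 100, '2012-01-01'), (1, 200, '2013-06-04'), (1, 150, '2012-01-08'),
        (2, 100, '2015-02-01'), (3, 105, '2014-10-27');
""")

# Self-join on user_id: each order is repeated once per order of that user,
# so SUM and COUNT are multiplied by the user's order count.
inflated = conn.execute("""
    SELECT a.user_id, SUM(a.order_total), COUNT(*)
    FROM sales AS a JOIN sales AS b ON a.user_id = b.user_id
    GROUP BY a.user_id
    ORDER BY a.user_id
""").fetchall()

# Single pass with MAX(): no join, so nothing is double counted.
single_pass = conn.execute("""
    SELECT user_id, SUM(order_total), COUNT(*), MAX(order_date)
    FROM sales
    GROUP BY user_id
    ORDER BY SUM(order_total) DESC
""").fetchall()

print(inflated)
print(single_pass)
```

The first query shows 1350 and 9 for user 1, the same figures reported in the question; the second returns 450, 3 and 2013-06-04.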
id | userid | exerciseid | date | time | weight | distance | reps
1 | 24 | 1 | 2013-09-28 00:00:00 | 2321 | 231 | 121 | NULL
2 | 24 | 24 | 2013-09-28 00:00:00 | 2321 | 231 | 121 | NULL
3 | 24 | 1 | 2013-09-28 00:00:00 | 2321 | 231 | 121 | NULL
4 | 24 | 1 | 2000-00-00 00:00:00 | NULL | 100 | NULL | 2
5 | 24 | 1 | 2013-09-28 00:00:00 | 2321 | 231 | 121 | NULL
Rows 1, 3, and 5 are the same. I want to do a count that groups them together, whilst also adding a column with the count value.
SELECT id, userid, exerciseid, date, time, weight, distance, reps
FROM `exercises`
WHERE `userid` = 1 AND `date` < now()
So I want this to return something similar to:
id | userid | exerciseid | date | time | weight | distance | reps | count
1 | 24 | 1 | 2013-09-28 00:00:00 | 2321 | 231 | 121 | NULL | 3
4 | 24 | 1 | 2000-00-00 00:00:00 | NULL | 100 | NULL | 2 | NULL
Try this:
-- id is unique per row, so it is aggregated with MIN() instead of grouped on
SELECT MIN(id) AS id, userid, exerciseid, date, time, weight, distance, reps,
       COUNT(*) AS count
FROM `exercises`
WHERE userid = 1 AND date < now()
GROUP BY userid, exerciseid, date, time, weight, distance, reps
Try this:
SELECT COUNT(*) AS CountM, MIN(id) AS id, userid, exerciseid, date, time, weight, distance, reps
FROM `exercises`
WHERE `userid` = 1 AND `date` < now()
GROUP BY userid, exerciseid, date, time, weight, distance, reps
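Note that id is different on every row, so it has to stay out of the GROUP BY for the duplicates to collapse into one group. A minimal sketch checking this on the sample rows, assuming an in-memory SQLite database in place of MySQL; the alias `dup_count` is an illustrative name, and the filter uses userid 24 because that is the value in the sample data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE exercises (
        id INTEGER, userid INTEGER, exerciseid INTEGER, date TEXT,
        time INTEGER, weight INTEGER, distance INTEGER, reps INTEGER
    );
    INSERT INTO exercises VALUES
        (1, 24, 1,  '2013-09-28 00:00:00', 2321, 231, 121, NULL),
        (2, 24, 24, '2013-09-28 00:00:00', 2321, 231, 121, NULL),
        (3, 24, 1,  '2013-09-28 00:00:00', 2321, 231, 121, NULL),
        (4, 24, 1,  '2000-00-00 00:00:00', NULL, 100, NULL, 2),
        (5, 24, 1,  '2013-09-28 00:00:00', 2321, 231, 121, NULL);
""")

# Group on every column except id; MIN(id) picks one representative id.
counts = conn.execute("""
    SELECT MIN(id) AS id, COUNT(*) AS dup_count
    FROM exercises
    WHERE userid = 24
    GROUP BY userid, exerciseid, date, time, weight, distance, reps
    ORDER BY MIN(id)
""").fetchall()

print(counts)
```

Rows 1, 3 and 5 collapse into a single group with a count of 3, while rows 2 and 4 each form their own group with a count of 1. NULL values in the grouped columns are treated as equal to each other for grouping purposes.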