Selecting first and last time stamps of a section - mysql

I have a MySQL database with a table like:
CREATE TABLE example (Batch_Num int, Time_Stamp datetime);
INSERT INTO example VALUES
(1, '2020-12-10 16:37:43'),
(1, '2020-12-11 09:47:31'),
(1, '2020-12-11 14:02:17'),
(1, '2020-12-11 15:28:02'),
(2, '2020-12-12 15:08:52'),
(2, '2020-12-14 10:38:02'),
(2, '2020-12-14 16:22:35'),
(2, '2020-12-15 08:44:13'),
(3, '2020-12-16 11:38:05'),
(3, '2020-12-17 10:19:13'),
(3, '2020-12-17 14:45:28');
+-----------+-----------------------+
| Batch_Num | Time_Stamp |
+-----------+-----------------------+
| 1 | '2020-12-10 16:37:43' |
| 1 | '2020-12-11 09:47:31' |
| 1 | '2020-12-11 14:02:17' |
| 1 | '2020-12-11 15:28:02' |
| 2 | '2020-12-12 15:08:52' |
| 2 | '2020-12-14 10:38:02' |
| 2 | '2020-12-14 16:22:35' |
| 2 | '2020-12-15 08:44:13' |
| 3 | '2020-12-16 11:38:05' |
| 3 | '2020-12-17 10:19:13' |
| 3 | '2020-12-17 14:45:28' |
+-----------+-----------------------+
I would like to select from this table the first and last timestamp for each value of Batch_Num. I would like the result to look like:
+-----------+-----------------------+-----------------------+
| Batch_Num | Beginning_Time_Stamp | End_Time_Stamp |
+-----------+-----------------------+-----------------------+
| 1 | '2020-12-10 16:37:43' | '2020-12-11 15:28:02' |
| 2 | '2020-12-12 15:08:52' | '2020-12-15 08:44:13' |
| 3 | '2020-12-16 11:38:05' | '2020-12-17 14:45:28' |
+-----------+-----------------------+-----------------------+
I am not sure how to select both the row where the previous Batch_Num is different from the current one and the row where the next one is different.

A basic GROUP BY query should work here:
SELECT
Batch_Num,
MIN(Time_Stamp) AS Beginning_Time_Stamp,
MAX(Time_Stamp) AS End_Time_Stamp
FROM example
GROUP BY
Batch_Num
ORDER BY
Batch_Num;
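As a quick way to check the GROUP BY query above without a MySQL server, here is a minimal sketch that runs the same statement against an in-memory SQLite database from Python. Using SQLite instead of MySQL, and storing the timestamps as text, are assumptions made only to keep the example self-contained.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for runnability).
# Timestamps are stored as ISO-style text, which sorts chronologically,
# so MIN/MAX pick the earliest and latest values per batch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example (Batch_Num INTEGER, Time_Stamp TEXT)")
conn.executemany(
    "INSERT INTO example VALUES (?, ?)",
    [
        (1, "2020-12-10 16:37:43"), (1, "2020-12-11 09:47:31"),
        (1, "2020-12-11 14:02:17"), (1, "2020-12-11 15:28:02"),
        (2, "2020-12-12 15:08:52"), (2, "2020-12-14 10:38:02"),
        (2, "2020-12-14 16:22:35"), (2, "2020-12-15 08:44:13"),
        (3, "2020-12-16 11:38:05"), (3, "2020-12-17 10:19:13"),
        (3, "2020-12-17 14:45:28"),
    ],
)

rows = conn.execute(
    """
    SELECT Batch_Num,
           MIN(Time_Stamp) AS Beginning_Time_Stamp,
           MAX(Time_Stamp) AS End_Time_Stamp
    FROM example
    GROUP BY Batch_Num
    ORDER BY Batch_Num
    """
).fetchall()

for row in rows:
    print(row)
```

Each batch collapses to one row, which matches the desired table in the question.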

If the same batch number might appear in different series, then aggregation alone cannot solve the problem. You would typically approach this with a gaps-and-islands technique; here, a simple approach uses the difference between row numbers to identify groups of adjacent records (islands):
select batch_num,
min(time_stamp) as start_time_stamp,
max(time_stamp) as end_time_stamp,
count(*) as cnt
from (
select e.*,
row_number() over(order by time_stamp) as rn1,
row_number() over(partition by batch_num order by time_stamp) as rn2
from example e
) t
group by batch_num, rn1 - rn2
order by start_time_stamp
Here is a demo. I added a new occurrence of batch 1 at the end of the dataset:
batch_num | start_time_stamp | end_time_stamp | cnt
--------: | :------------------ | :------------------ | --:
1 | 2020-12-10 16:37:43 | 2020-12-11 15:28:02 | 4
2 | 2020-12-12 15:08:52 | 2020-12-15 08:44:13 | 4
3 | 2020-12-16 11:38:05 | 2020-12-17 14:45:28 | 3
1 | 2020-12-18 14:02:17 | 2020-12-18 15:28:02 | 2
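To see why the difference rn1 - rn2 identifies an island, here is a small Python sketch that mimics the two row_number() calls by hand. It is an illustration of the idea, not the query itself; the two extra batch 1 rows are inferred from the demo output above (start, end and a count of 2), and the function name is made up for this example.

```python
from collections import defaultdict

# (batch_num, time_stamp); the last two rows are the added batch 1 occurrence,
# inferred from the demo output (an assumption).
rows = [
    (1, "2020-12-10 16:37:43"), (1, "2020-12-11 09:47:31"),
    (1, "2020-12-11 14:02:17"), (1, "2020-12-11 15:28:02"),
    (2, "2020-12-12 15:08:52"), (2, "2020-12-14 10:38:02"),
    (2, "2020-12-14 16:22:35"), (2, "2020-12-15 08:44:13"),
    (3, "2020-12-16 11:38:05"), (3, "2020-12-17 10:19:13"),
    (3, "2020-12-17 14:45:28"),
    (1, "2020-12-18 14:02:17"), (1, "2020-12-18 15:28:02"),
]

def find_islands(rows):
    """Group adjacent rows of the same batch using the rn1 - rn2 trick."""
    ordered = sorted(rows, key=lambda r: r[1])   # order by time_stamp
    seen_per_batch = defaultdict(int)            # running rn2 per batch
    groups = {}                                  # (batch, rn1 - rn2) -> [start, end, cnt]
    for rn1, (batch, ts) in enumerate(ordered, start=1):
        seen_per_batch[batch] += 1
        rn2 = seen_per_batch[batch]
        key = (batch, rn1 - rn2)                 # constant within one island
        if key not in groups:
            groups[key] = [ts, ts, 0]
        group = groups[key]
        group[0] = min(group[0], ts)
        group[1] = max(group[1], ts)
        group[2] += 1
    result = [(b, s, e, c) for (b, _), (s, e, c) in groups.items()]
    return sorted(result, key=lambda r: r[1])    # order by start_time_stamp

islands = find_islands(rows)
for island in islands:
    print(island)
```

Within a run of adjacent rows both counters advance by one per row, so their difference stays constant; as soon as another batch interleaves, rn1 moves ahead of rn2 and the difference changes.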

Related

islands and gaps ordering issue MYSQL 8.0

I am trying to use PARTITION BY and row_number() to count consecutive duplicate values for a given date range. Essentially, it's an attempt to capture "streaks": if there is a break in the streak, the count should start over when the value occurs again.
To reproduce these results here is the code:
CREATE TABLE partion_test (
daily DATE,
response_short_name VARCHAR(10)
);
INSERT INTO `partion_test` (`daily`, `response_short_name`) VALUES
('2020-09-21', 'A'),
('2020-09-25', 'A'),
('2020-09-26', 'A'),
('2020-09-27', 'A'),
('2020-09-28', 'A'),
('2020-09-22', 'B'),
('2020-09-20', 'C'),
('2020-09-23', 'C'),
('2020-09-24', 'C');
SELECT
daily,
response_short_name
,row_number() over (partition by response_short_name order by daily) as seqnum
FROM (
select
daily,
response_short_name
FROM partion_test
order by daily limit 1000
) A;
Here is the current output:
+------------+---------------------+--------+--+
| daily | response_short_name | seqnum | |
+------------+---------------------+--------+--+
| 2020-09-21 | A | 1 | |
| 2020-09-25 | A | 2 | |
| 2020-09-26 | A | 3 | |
| 2020-09-27 | A | 4 | |
| 2020-09-28 | A | 5 | |
| 2020-09-22 | B | 1 | |
| 2020-09-20 | C | 1 | |
| 2020-09-23 | C | 2 | |
| 2020-09-24 | C | 3 | |
+------------+---------------------+--------+--+
Here is the desired output:
+------------+---------------------+--------+--+
| daily | response_short_name | seqnum | |
+------------+---------------------+--------+--+
| 2020-09-20 | C | 1 | |
| 2020-09-21 | A | 1 | |
| 2020-09-22 | B | 1 | |
| 2020-09-23 | C | 1 | |
| 2020-09-24 | C | 2 | |
| 2020-09-25 | A | 1 | |
| 2020-09-26 | A | 2 | |
| 2020-09-27 | A | 3 | |
| 2020-09-28 | A | 4 | |
+------------+---------------------+--------+--+
I've been racking my brain over this for a while. Any help would be appreciated.
You can do:
select *,
row_number() over(partition by grp order by daily) as seqnum
from (
select *,
sum(inc) over(order by daily) as grp
from (
select *,
case when lag(response_short_name) over(order by daily) = response_short_name
then 0 else 1 end as inc
from partion_test
order by daily
) x
) y
order by daily
Result:
daily response_short_name inc grp seqnum
----------- -------------------- ---- ---- ------
2020-09-20 C 1 1 1
2020-09-21 A 1 2 1
2020-09-22 B 1 3 1
2020-09-23 C 1 4 1
2020-09-24 C 0 4 2
2020-09-25 A 1 5 1
2020-09-26 A 0 5 2
2020-09-27 A 0 5 3
2020-09-28 A 0 5 4
See the running example at DB Fiddle.
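The three nested steps of that query (flag a change with lag, take a running sum of the flags, number the rows inside each group) can be traced in a few lines of Python. This is only a sketch of the logic over the sample rows, with a made-up function name, not a replacement for the SQL.

```python
rows = [
    ("2020-09-21", "A"), ("2020-09-25", "A"), ("2020-09-26", "A"),
    ("2020-09-27", "A"), ("2020-09-28", "A"), ("2020-09-22", "B"),
    ("2020-09-20", "C"), ("2020-09-23", "C"), ("2020-09-24", "C"),
]

def number_streaks(rows):
    """Return (daily, name, inc, grp, seqnum) for each row, ordered by daily."""
    result = []
    previous_name = None   # plays the role of lag(response_short_name)
    grp = 0                # running sum of inc
    seqnum = 0             # row_number() inside the current grp
    for daily, name in sorted(rows):             # order by daily
        inc = 0 if name == previous_name else 1  # 1 marks the start of a streak
        grp += inc
        seqnum = seqnum + 1 if inc == 0 else 1
        result.append((daily, name, inc, grp, seqnum))
        previous_name = name
    return result

streaks = number_streaks(rows)
for row in streaks:
    print(row)
```

On the first row there is no previous value, so the comparison fails and inc is 1, just as the CASE expression falls through to ELSE when lag() returns NULL.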
Your data doesn't fit your result, so it is quite difficult to achieve your result.
select `daily`,`response_short_name`,
row_number() over (partition by `response_short_name`, grp order by `daily`) as row_num
from (select t.*,
(row_number() over (order by `daily`) -
row_number() over (partition by `response_short_name` order by `daily`)
) as grp
from partion_test t
) t
ORDER BY `daily`
daily | response_short_name | row_num
:--------- | :------------------ | ------:
2020-09-20 | C | 1
2020-09-21 | A | 1
2020-09-22 | B | 1
2020-09-23 | C | 1
2020-09-24 | C | 2
2020-09-25 | A | 1
2020-09-26 | A | 2
2020-09-27 | A | 3
2020-09-28 | A | 4

Calculate tax amount between 3 different tables with MySQL

I have the following tables structure and trying to make a report from these:
___BillableDatas
|--------|------------|---------|--------------|------------|
| BIL_Id | BIL_Date |BIL_Rate | BIL_Quantity | BIL_Status |
|--------|------------|---------|--------------|------------|
| 1 | 2018-03-01 | 105 | 1 | charged |
| 2 | 2018-03-02 | 105 | 1 | cancelled |
| 3 | 2018-03-01 | 15 | 2 | notcharged |
| 4 | 2018-03-01 | 21 | 1 | notcharged |
| 5 | 2018-03-02 | 15 | 2 | notcharged |
| 6 | 2018-03-02 | 21 | 1 | notcharged |
|--------|------------|---------|--------------|------------|
___SalesTaxes
|--------|--------------|------------|
| STX_Id | STX_TaxeName | STX_Amount |
|--------|--------------|------------|
| 8 | Tax 1 | 5.000 |
| 9 | Tax 2 | 5.000 |
| 10 | Tax 3 | 19.975 |
|--------|--------------|------------|
STX_Amount is a percentage.
___ApplicableTaxes
|-----------|-----------|
| ATX_BILId | ATX_STXId |
|-----------|-----------|
| 1 | 8 |
| 1 | 9 |
| 1 | 10 |
| 2 | 8 |
| 2 | 9 |
| 2 | 10 |
| 3 | 9 |
| 3 | 10 |
| 4 | 9 |
| 5 | 9 |
| 5 | 10 |
| 6 | 9 |
|-----------|-----------|
ATX_BILId is the item ID link with ___BillableDatas.
ATX_STXId is the tax ID link with ___SalesTaxes.
I need to get the sum of the items per day
- without tax
- with tax
Something like this:
|------------------|---------------|------------|
| BIL_RateNonTaxed | BIL_RateTaxed | BIL_Status |
|------------------|---------------|------------|
| 105.00 | 136.47 | charged | <- Taxes #8, #9 and #10 applicable
| 102.00 | 118.035 | notcharged | <- Taxes #9 and #10 applicable
|------------------|---------------|------------|
Explanation of the totals:
105 = 105*1 -- (total of the charged item multiplied by the quantity)
102 = (15*2)*2+(21*2) -- (total of the notcharged items multiplied by the quantity)
136.47 = 105+(105*(5+5+19.975)/100)
119.085 = 102+(((15*2)*2)*(5+19.975)/100+(21*2)*5/100)
My last try was this one:
SELECT
BIL_Date,
(BIL_Rate*BIL_Quantity) AS BIL_RateNonTaxed,
(((BIL_Rate*BIL_Quantity)*SUM(STX_Amount)/100)+BIL_Rate*BIL_Quantity) AS BIL_RateTaxed,
BIL_Status
FROM ___BillableDatas
LEFT JOIN ___SalesTaxes
ON FIND_IN_SET(STX_Id, BIL_ApplicableTaxes) > 0
LEFT JOIN ___ApplicableTaxes
ON ___BillableDatas.BIL_Id = ___ApplicableTaxes.ATX_BILId
WHERE BIL_BookingId=1
GROUP BY BIL_Id AND BIL_Status
ORDER BY BIL_Date
ASC
Please see this SQLFiddle to help you if needed:
http://sqlfiddle.com/#!9/425854f
Thanks.
I cannot bear to work with your naming policy, so I made my own...
DROP TABLE IF EXISTS bills;
CREATE TABLE bills
(bill_id SERIAL PRIMARY KEY
,bill_date DATE NOT NULL
,bill_rate INT NOT NULL
,bill_quantity INT NOT NULL
,bill_status ENUM('charged','cancelled','notcharged')
);
INSERT INTO bills VALUES
(1,'2018-03-01',105,1,'charged'),
(2,'2018-03-02',105,1,'cancelled'),
(3,'2018-03-01',15,2,'notcharged'),
(4,'2018-03-01',21,1,'notcharged'),
(5,'2018-03-02',15,2,'notcharged'),
(6,'2018-03-02',21,1,'notcharged');
DROP TABLE IF EXISTS sales_taxes;
CREATE TABLE sales_taxes
(sales_tax_id SERIAL PRIMARY KEY
,sales_tax_name VARCHAR(12) NOT NULL
,sales_tax_amount DECIMAL(5,3) NOT NULL
);
INSERT INTO sales_taxes VALUES
( 8,'Tax 1', 5.000),
( 9,'Tax 2', 5.000),
(10,'Tax 3',19.975);
DROP TABLE IF EXISTS applicable_taxes;
CREATE TABLE applicable_taxes
(bill_id INT NOT NULL
,sales_tax_id INT NOT NULL
,PRIMARY KEY(bill_id,sales_tax_id)
);
INSERT INTO applicable_taxes VALUES
(1, 8),
(1, 9),
(1,10),
(2, 8),
(2, 9),
(2,10),
(3, 9),
(3,10),
(4, 9),
(5, 9),
(5,10),
(6, 9);
SELECT bill_status
, SUM(bill_rate*bill_quantity) untaxed
, SUM((bill_rate*bill_quantity)+(bill_rate*bill_quantity*total_sales_tax/100)) total
FROM
( SELECT b.*
, SUM(t.sales_tax_amount) total_sales_tax
FROM bills b
JOIN applicable_taxes bt
ON bt.bill_id = b.bill_id
JOIN sales_taxes t
ON t.sales_tax_id = bt.sales_tax_id
GROUP
BY b.bill_id
) x
GROUP
BY bill_status;
+-------------+---------+-------------+
| bill_status | untaxed | total |
+-------------+---------+-------------+
| charged | 105 | 136.4737500 |
| cancelled | 105 | 136.4737500 |
| notcharged | 102 | 119.0850000 |
+-------------+---------+-------------+
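My total for notcharged (119.085) is very slightly different from the 118.035 in your expected table, so one of us has made a mistake somewhere; note that it does match the 119.085 in your own explanation. Either way, this should get you pretty close.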
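The arithmetic behind those totals can be re-checked outside the database. The sketch below repeats the two steps of the query in plain Python with Decimal (first the total tax percentage per bill, then the sums per status); the variable names are invented for this illustration.

```python
from collections import defaultdict
from decimal import Decimal

# bill_id -> (rate, quantity, status), copied from the sample data
bills = {
    1: (105, 1, "charged"),
    2: (105, 1, "cancelled"),
    3: (15, 2, "notcharged"),
    4: (21, 1, "notcharged"),
    5: (15, 2, "notcharged"),
    6: (21, 1, "notcharged"),
}
# sales_tax_id -> percentage
tax_rates = {8: Decimal("5.000"), 9: Decimal("5.000"), 10: Decimal("19.975")}
# (bill_id, sales_tax_id) pairs
applicable = [
    (1, 8), (1, 9), (1, 10), (2, 8), (2, 9), (2, 10),
    (3, 9), (3, 10), (4, 9), (5, 9), (5, 10), (6, 9),
]

# Step 1 (inner query): total tax percentage per bill.
total_pct = defaultdict(Decimal)
for bill_id, tax_id in applicable:
    total_pct[bill_id] += tax_rates[tax_id]

# Step 2 (outer query): untaxed and taxed totals per status.
# Unlike the inner JOIN in the SQL, a bill without any tax would be kept
# here with 0 percent; every sample bill has at least one tax.
totals = defaultdict(lambda: [Decimal(0), Decimal(0)])
for bill_id, (rate, quantity, status) in bills.items():
    net = Decimal(rate * quantity)
    totals[status][0] += net
    totals[status][1] += net + net * total_pct[bill_id] / 100

for status, (untaxed, total) in totals.items():
    print(status, untaxed, total)
```

Summing the percentages per bill before multiplying is what keeps a bill with three taxes from being counted three times in the untaxed total.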
SELECT a.BIL_Date, BIL_RateNonTaxed, BIL_RateNonTaxed+BIL_RateTaxed AS BIL_RateTaxed FROM (
SELECT BIL_Date,
SUM(BIL_Rate*BIL_Quantity) AS BIL_RateNonTaxed
FROM ___BillableDatas
WHERE BIL_Status != 'cancelled'
GROUP BY BIL_Date
) a INNER JOIN (
SELECT BIL_Date,
SUM(BIL_Rate*BIL_Quantity*STX_Amount/100) AS BIL_RateTaxed
FROM ___BillableDatas
LEFT JOIN ___ApplicableTaxes
ON ___BillableDatas.BIL_Id = ___ApplicableTaxes.ATX_BILId
LEFT JOIN ___SalesTaxes
ON STX_Id = ATX_STXId
WHERE BIL_Status != 'cancelled'
GROUP BY BIL_Date
) b
ON a.BIL_Date = b.BIL_Date
ORDER BY a.BIL_Date;
Explanation:
Your BIL_RateNonTaxed calculation does not use the ___SalesTaxes table, so that table must not appear in that query; otherwise the extra joined rows would inflate the SUM.
However, your BIL_RateTaxed calculation does use the ___SalesTaxes table. I solved this by creating two subqueries and joining the results.
I know there are better answers, but I'm not familiar with MySQL syntax.

Order result by IDs with the most recent date

I tried a lot, but I cannot figure out a way to do this:
I have a table with (not unique) IDs and dates. All entries should be selected in the end, but they need to be sorted.
Table:
+----+------------+
| id | date |
+----+------------+
| 1 | 2017-12-10 |
| 1 | 2015-05-22 |
| 7 | 2016-04-05 |
| 2 | 2017-12-12 |
| 2 | 2014-03-10 |
| 7 | 2016-01-14 |
| 1 | 2016-08-17 |
+----+------------+
What I need:
+----+------------+
| id | date |
+----+------------+
| 2 | 2017-12-12 |
| 2 | 2014-03-10 |
| 1 | 2017-12-10 |
| 1 | 2016-08-17 |
| 1 | 2015-05-22 |
| 7 | 2016-04-05 |
| 7 | 2016-01-14 |
+----+------------+
I need everything "grouped" by the ids, starting with the id that has the most recent date linked to it.
id: 2 / date: 2017-12-12
has the most recent date, so now all rows with Id 2 follow, ordered by the date descending. After that, which "block" of ids comes next is determined again by the next most recent date and so on.
Using a subquery that groups by id, we get the max date, then joining this to the source data gives us the max date on every row to sort by.
SQL Fiddle
MySQL 5.6 Schema Setup:
CREATE TABLE Table1
(`id` int, `date` datetime)
;
INSERT INTO Table1
(`id`, `date`)
VALUES
(1, '2017-12-10 00:00:00'),
(1, '2015-05-22 00:00:00'),
(7, '2016-04-05 00:00:00'),
(2, '2017-12-12 00:00:00'),
(2, '2014-03-10 00:00:00'),
(7, '2016-01-14 00:00:00'),
(1, '2016-08-17 00:00:00')
;
Query 1:
select t.*
from table1 t
inner join (
select id, max(`date`) maxdate
from table1
group by id
) g on t.id = g.id
order by g.maxdate DESC, t.id, t.date DESC
Results:
| id | date |
|----|----------------------|
| 2 | 2017-12-12T00:00:00Z |
| 2 | 2014-03-10T00:00:00Z |
| 1 | 2017-12-10T00:00:00Z |
| 1 | 2016-08-17T00:00:00Z |
| 1 | 2015-05-22T00:00:00Z |
| 7 | 2016-04-05T00:00:00Z |
| 7 | 2016-01-14T00:00:00Z |
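The same ordering can be reproduced in Python to check the idea: compute the maximum date per id, then sort by that maximum first. This is a minimal sketch over the sample rows and relies on Python's sort being stable.

```python
rows = [
    (1, "2017-12-10"), (1, "2015-05-22"), (7, "2016-04-05"),
    (2, "2017-12-12"), (2, "2014-03-10"), (7, "2016-01-14"),
    (1, "2016-08-17"),
]

# Equivalent of the grouped subquery: id -> max(date).
max_date = {}
for id_, date in rows:
    max_date[id_] = max(date, max_date.get(id_, date))

# ORDER BY g.maxdate DESC, t.id, t.date DESC.
# With a stable sort, apply the least significant key first.
ordered = sorted(rows, key=lambda r: r[1], reverse=True)               # t.date DESC
ordered = sorted(ordered, key=lambda r: r[0])                          # t.id
ordered = sorted(ordered, key=lambda r: max_date[r[0]], reverse=True)  # g.maxdate DESC

for row in ordered:
    print(row)
```

The middle key (id) only matters when two ids share the same most recent date; it keeps their blocks from interleaving.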
If your table is named stack:
SELECT `stack`.* FROM `stack` LEFT OUTER JOIN (SELECT `id`, MAX(`date`) AS `date` FROM `stack` GROUP BY `id`) t1 ON
`stack`.`id` = t1.`id` ORDER BY t1.`date` DESC, `stack`.`id`, `stack`.`date` DESC

Correlated Subqueries with MAX() and GROUP BY

I have an issue using MAX() and GROUP BY.
I have the following tables:
personal_prizes
___________ ___________ _________ __________
| id | userId | specId| group |
|___________|___________|_________|__________|
| 1 | 1 | 1 | 1 |
|___________|___________|_________|__________|
| 2 | 1 | 2 | 1 |
|___________|___________|_________|__________|
| 3 | 2 | 3 | 1 |
|___________|___________|_________|__________|
| 4 | 2 | 4 | 2 |
|___________|___________|_________|__________|
| 5 | 1 | 5 | 2 |
|___________|___________|_________|__________|
| 6 | 1 | 6 | 2 |
|___________|___________|_________|__________|
| 7 | 2 | 7 | 3 |
|___________|___________|_________|__________|
prizes
___________ ___________ _________
| id | title | group |
|___________|___________|_________|
| 1 | First | 1 |
|___________|___________|_________|
| 2 | Second | 1 |
|___________|___________|_________|
| 3 | Newby | 1 |
|___________|___________|_________|
| 4 | General| 2 |
|___________|___________|_________|
| 5 | Leter | 2 |
|___________|___________|_________|
| 6 | Ter | 2 |
|___________|___________|_________|
| 7 | Mentor | 3 |
|___________|___________|_________|
So, I need to select the highest title for each user.
E.g. the user with id = 1 must have the prizes 'Second' and 'Ter'.
I don't know how to implement this in one query.
So, first of all, I tried to select the highest specId for the user.
I tried the following:
SELECT pp.specID
FROM personal_prizes pp
WHERE pp.specID IN (SELECT MAX(pp1.id)
FROM personal_prizes pp1
WHERE pp1.userId = 1
GROUP BY pp1.group)
But it doesn't work.
Please help me solve this problem.
And if you can also help me select the prizes for a user, that would be great!
The problem I perceive here is that prizes.id isn't really a reliable way to determine which is the "highest" prize. Ignoring this however I suggest using ROW_NUMBER() OVER() to locate the "highest" prize per user as follows:
Refer to this SQL Fiddle
CREATE TABLE personal_prizes
([id] int, [userId] int, [specId] int, [group] int)
;
INSERT INTO personal_prizes
([id], [userId], [specId], [group])
VALUES
(1, 1, 1, 1),
(2, 1, 2, 1),
(3, 2, 3, 1),
(4, 2, 4, 2),
(5, 1, 5, 2),
(6, 1, 6, 2),
(7, 2, 7, 3)
;
CREATE TABLE prizes
([id] int, [title] varchar(7), [group] int)
;
INSERT INTO prizes
([id], [title], [group])
VALUES
(1, 'First', 1),
(2, 'Second', 1),
(3, 'Newby', 1),
(4, 'General', 2),
(5, 'Leter', 2),
(6, 'Ter', 2),
(7, 'Mentor', 3)
;
Query 1:
select
*
from (
select
pp.*, p.title
, row_number() over(partition by pp.userId order by p.id ASC) as prize_order
from personal_prizes pp
inner join prizes p on pp.specid = p.id
) d
where prize_order = 1
Results:
| id | userId | specId | group | title | prize_order |
|----|--------|--------|-------|-------|-------------|
| 1 | 1 | 1 | 1 | First | 1 |
| 3 | 2 | 3 | 1 | Newby | 1 |
The result can be "reversed" by changing the ORDER BY within the over clause:
select
*
from (
select
pp.*, p.title
, row_number() over(partition by pp.userId order by p.id DESC) as prize_order
from personal_prizes pp
inner join prizes p on pp.specid = p.id
) d
where prize_order = 1
| id | userId | specId | group | title | prize_order |
|----|--------|--------|-------|--------|-------------|
| 6 | 1 | 6 | 2 | Ter | 1 |
| 7 | 2 | 7 | 3 | Mentor | 1 |
You could extend this logic to locate the "highest prize per group" too:
select
*
from (
select
pp.*, p.title
, row_number() over(partition by pp.userId, p.[group] order by p.id ASC) as prize_order
from personal_prizes pp
inner join prizes p on pp.specid = p.id
) d
where prize_order = 1
| id | userId | specId | group | title | prize_order |
|----|--------|--------|-------|---------|-------------|
| 1 | 1 | 1 | 1 | First | 1 |
| 5 | 1 | 5 | 2 | Leter | 1 |
| 3 | 2 | 3 | 1 | Newby | 1 |
| 4 | 2 | 4 | 2 | General | 1 |
| 7 | 2 | 7 | 3 | Mentor | 1 |
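The pattern in all three queries is "number the rows inside each partition, keep number 1". A rough Python equivalent is sketched below; the helper name is hypothetical, and it partitions on the group column of personal_prizes, which equals the prizes group for every row of the sample data. With descending order per (userId, group) it yields 'Second' and 'Ter' for user 1, the pair mentioned in the question.

```python
from itertools import groupby

# (id, userId, specId, group), copied from the sample data
personal_prizes = [
    (1, 1, 1, 1), (2, 1, 2, 1), (3, 2, 3, 1), (4, 2, 4, 2),
    (5, 1, 5, 2), (6, 1, 6, 2), (7, 2, 7, 3),
]
# prizes.id -> title
prize_titles = {
    1: "First", 2: "Second", 3: "Newby", 4: "General",
    5: "Leter", 6: "Ter", 7: "Mentor",
}

def first_per_partition(partition_key, descending):
    """Emulate row_number() over(partition by ... order by p.id) = 1."""
    # Join: specId matches prizes.id, so append the title to each row.
    joined = [row + (prize_titles[row[2]],) for row in personal_prizes]
    joined.sort(key=lambda r: r[2], reverse=descending)  # order by p.id
    joined.sort(key=partition_key)                       # stable: keeps the inner order
    return [next(group) for _, group in groupby(joined, key=partition_key)]

per_user = first_per_partition(lambda r: r[1], descending=True)
per_user_and_group = first_per_partition(lambda r: (r[1], r[3]), descending=True)

print(per_user)
print(per_user_and_group)
```

Switching `descending` to False gives the ASC variants shown in the answer.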

Manipulating user data in MySQL

New to MySQL and need help manipulating user data in table 1 into the structure shown in table 2.
table 1
table 2
A user session is defined as a period of user activity with requests at least every 30 minutes. A session ends when the user has been inactive for over 30 minutes.
Does anyone know how to write mysql code that transforms table 1 into 2?
The following code can be used to create the log table:
CREATE TABLE log
( user_id int, request_timestamp datetime);
INSERT INTO log
VALUES
(1, '2014-10-26 10:51:18'), (1, '2014-10-26 10:52:20'), (1, '2014-10-26 11:15:03'), (1, '2014-10-26 11:39:18'), (1, '2014-10-26 15:01:18'), (1, '2014-10-26 15:01:21'), (1, '2014-10-27 21:22:19'),
(2, '2014-10-15 12:19:01'), (2, '2014-10-15 12:19:12'), (2, '2014-10-15 12:19:45'), (2, '2014-10-15 12:20:03'), (2, '2014-10-17 14:55:13'), (2, '2014-10-17 14:55:19'),(2, '2014-10-17 14:55:22')
;
First we will give the following a name just to visualize it:
Note below the 1800 means 30 min * 60 sec/minute
Specimen A
-----
select l.user_id,l.request_timestamp,
@sessionnum :=
if((@curuser = user_id and TIME_TO_SEC(TIMEDIFF(request_timestamp,@theDt))>1800),@sessionnum + 1,
if(@curuser <> user_id,1,@sessionnum)) as sessionnum,
@curuser := user_id as v_curuser,
@theDt:=request_timestamp as v_theDt
from log l cross join
(select @curuser := '', @sessionnum := 0,@theDt:='') gibberish
order by l.user_id,l.request_timestamp
+---------+---------------------+------------+-----------+---------------------+
| user_id | request_timestamp | sessionnum | v_curuser | v_theDt |
+---------+---------------------+------------+-----------+---------------------+
| 1 | 2014-10-26 10:51:18 | 1 | 1 | 2014-10-26 10:51:18 |
| 1 | 2014-10-26 10:52:20 | 1 | 1 | 2014-10-26 10:52:20 |
| 1 | 2014-10-26 11:15:03 | 1 | 1 | 2014-10-26 11:15:03 |
| 1 | 2014-10-26 11:39:18 | 1 | 1 | 2014-10-26 11:39:18 |
| 1 | 2014-10-26 15:01:18 | 2 | 1 | 2014-10-26 15:01:18 |
| 1 | 2014-10-26 15:01:21 | 2 | 1 | 2014-10-26 15:01:21 |
| 1 | 2014-10-27 21:22:19 | 3 | 1 | 2014-10-27 21:22:19 |
| 2 | 2014-10-15 12:19:01 | 1 | 2 | 2014-10-15 12:19:01 |
| 2 | 2014-10-15 12:19:12 | 1 | 2 | 2014-10-15 12:19:12 |
| 2 | 2014-10-15 12:19:45 | 1 | 2 | 2014-10-15 12:19:45 |
| 2 | 2014-10-15 12:20:03 | 1 | 2 | 2014-10-15 12:20:03 |
| 2 | 2014-10-17 14:55:13 | 2 | 2 | 2014-10-17 14:55:13 |
| 2 | 2014-10-17 14:55:19 | 2 | 2 | 2014-10-17 14:55:19 |
| 2 | 2014-10-17 14:55:22 | 2 | 2 | 2014-10-17 14:55:22 |
+---------+---------------------+------------+-----------+---------------------+
Then we are done if you want. But for pretty printing, you can wrap Specimen A inside another query:
select user_id,request_timestamp,sessionnum
from
( select l.user_id,l.request_timestamp,
@sessionnum :=
if((@curuser = user_id and TIME_TO_SEC(TIMEDIFF(request_timestamp,@theDt))>1800),@sessionnum + 1,
if(@curuser <> user_id,1,@sessionnum)) as sessionnum,
@curuser := user_id as v_curuser,
@theDt:=request_timestamp as v_theDt
from log l cross join
(select @curuser := '', @sessionnum := 0,@theDt:='') gibberish
order by l.user_id,l.request_timestamp
) SpecimenA
order by user_id,sessionnum
+---------+---------------------+------------+
| user_id | request_timestamp | sessionnum |
+---------+---------------------+------------+
| 1 | 2014-10-26 10:51:18 | 1 |
| 1 | 2014-10-26 10:52:20 | 1 |
| 1 | 2014-10-26 11:15:03 | 1 |
| 1 | 2014-10-26 11:39:18 | 1 |
| 1 | 2014-10-26 15:01:18 | 2 |
| 1 | 2014-10-26 15:01:21 | 2 |
| 1 | 2014-10-27 21:22:19 | 3 |
| 2 | 2014-10-15 12:19:01 | 1 |
| 2 | 2014-10-15 12:19:12 | 1 |
| 2 | 2014-10-15 12:19:45 | 1 |
| 2 | 2014-10-15 12:20:03 | 1 |
| 2 | 2014-10-17 14:55:13 | 2 |
| 2 | 2014-10-17 14:55:19 | 2 |
| 2 | 2014-10-17 14:55:22 | 2 |
+---------+---------------------+------------+
14 rows in set (0.02 sec)
Note the OP's definition of a session. It is one of inactivity, not duration.
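The rule the user variables implement (restart at 1 for a new user, add 1 when the gap to the previous request exceeds 1800 seconds) is easy to trace in Python. The following is only a sketch of that rule over the sample log; the function and constant names are made up for the illustration.

```python
from datetime import datetime, timedelta

log = [
    (1, "2014-10-26 10:51:18"), (1, "2014-10-26 10:52:20"),
    (1, "2014-10-26 11:15:03"), (1, "2014-10-26 11:39:18"),
    (1, "2014-10-26 15:01:18"), (1, "2014-10-26 15:01:21"),
    (1, "2014-10-27 21:22:19"),
    (2, "2014-10-15 12:19:01"), (2, "2014-10-15 12:19:12"),
    (2, "2014-10-15 12:19:45"), (2, "2014-10-15 12:20:03"),
    (2, "2014-10-17 14:55:13"), (2, "2014-10-17 14:55:19"),
    (2, "2014-10-17 14:55:22"),
]

SESSION_GAP = timedelta(minutes=30)  # the 1800 seconds from the query

def number_sessions(log):
    """Return (user_id, request_timestamp, sessionnum) rows."""
    numbered = []
    current_user = None
    previous_ts = None
    session = 0
    # order by user_id, request_timestamp (the text timestamps sort chronologically)
    for user_id, raw in sorted(log):
        ts = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")
        if user_id != current_user:
            session = 1                      # first request of a new user
        elif ts - previous_ts > SESSION_GAP:
            session += 1                     # inactive for over 30 minutes
        numbered.append((user_id, raw, session))
        current_user, previous_ts = user_id, ts
    return numbered

numbered = number_sessions(log)
for row in numbered:
    print(row)
```

The gap is always measured against the previous request, not against the start of the session, which is why the first session of user 1 can last 48 minutes.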
Try this:
SELECT user_id,
count(*) as request_count,
min(request_timestamp) as session_start,
max(request_timestamp) as session_end,
timestampdiff(
SECOND,
min(request_timestamp),
max(request_timestamp)
) as session_duration
FROM `log`
GROUP BY user_id
Update:
Now, with the valuable answer from @drew, you can get exactly the proposed table 2:
Take my query and insert his code inside the parentheses.
SELECT user_id,
sessionnum as `session`,
count(*) as request_count,
min(request_timestamp) as session_start,
max(request_timestamp) as session_end,
timestampdiff(
SECOND,
min(request_timestamp),
max(request_timestamp)
) as session_duration
FROM (put code of drew here) ttt
GROUP BY user_id, sessionnum
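For illustration, the aggregation step of that query can be sketched in Python. The input rows below are user 1's numbered requests taken from the output shown earlier; the function name is hypothetical.

```python
from collections import defaultdict
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# (user_id, request_timestamp, sessionnum) for user 1, from the earlier output
numbered = [
    (1, "2014-10-26 10:51:18", 1), (1, "2014-10-26 10:52:20", 1),
    (1, "2014-10-26 11:15:03", 1), (1, "2014-10-26 11:39:18", 1),
    (1, "2014-10-26 15:01:18", 2), (1, "2014-10-26 15:01:21", 2),
    (1, "2014-10-27 21:22:19", 3),
]

def summarize_sessions(numbered):
    """Return (user_id, session, request_count, start, end, duration_seconds)."""
    stamps_by_session = defaultdict(list)
    for user_id, raw, session in numbered:
        stamps_by_session[(user_id, session)].append(datetime.strptime(raw, FMT))
    summary = []
    for (user_id, session), stamps in sorted(stamps_by_session.items()):
        start, end = min(stamps), max(stamps)
        duration = int((end - start).total_seconds())  # like timestampdiff(SECOND, ...)
        summary.append(
            (user_id, session, len(stamps), start.strftime(FMT), end.strftime(FMT), duration)
        )
    return summary

summary = summarize_sessions(numbered)
for row in summary:
    print(row)
```

A session with a single request gets a duration of 0, which is also what the SQL produces when the minimum and maximum timestamps coincide.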
However:
I still think you would be better off storing the session number in a separate column, set by an insert trigger on the table(s) where activity is recorded, to avoid heavy load on the DB once the log grows large.
Also, avoid using reserved words and MySQL function names as table or column names and aliases (e.g. log and session in your sample).