Join two tables with same id and date - mysql

I'm having trouble making this query, can I get some help?
I have a table named measurements that looks like this:
+----+----------+-------+------+
| id | cost | month | year |
+----+----------+-------+------+
| 1 | 6860.52 | 5 | 2018 |
| 1 | 11993.52 | 6 | 2018 |
| 1 | 3823.2 | 7 | 2018 |
| 1 | 3557.7 | 8 | 2018 |
| 1 | 3355.92 | 9 | 2018 |
| 1 | 357.54 | 10 | 2018 |
+----+----------+-------+------+
and a table named payment
+------------+---------------+-----------------+
| id | period | payment |
+------------+---------------+-----------------+
| 1 | 2018-05-01 | 0 |
| 1 | 2018-06-01 | 0 |
| 1 | 2018-06-01 | 34327 |
| 1 | 2018-07-01 | 100 |
| 1 | 2018-07-01 | 500 |
| 1 | 2018-07-01 | 400 |
| 1 | 2018-08-01 | 0 |
+------------+---------------+-----------------+
I'm having trouble writing a select statement that returns this:
+------------+---------------+----------------+-----------------+
| id | period | date | payment |
+------------+---------------+----------------+-----------------+
| 1 | 2018-05-01 | 2018-05-01 | 0 |
| 1 | 2018-06-01 | 2018-06-01 | 34327 |
| 1 | 2018-07-01 | 2018-07-01 | 1000 |
| 1 | 2018-08-01 | 2018-08-01 | 0 |
| 1 | 2018-09-01 | NULL | 0 |
| 1 | 2018-10-01 | NULL | 0 |
+------------+---------------+----------------+-----------------+
period is built from concat(year,'-',month,'-',1); date is payment.period
Thank you
Schema:
CREATE TABLE measurements (id INT, cost FLOAT, month INT, year INT);
INSERT INTO measurements VALUES (1, 6860.52, 5, 2018),
(1, 11993.52, 6, 2018), (1, 3823.2, 7, 2018),
(1, 3557.7, 8, 2018), (1, 3355.92, 9, 2018), (1, 357.54, 10, 2018);
CREATE TABLE payment (id INT, period DATE, payment INT);
INSERT INTO payment VALUES (1, '2018-05-01', 0),
(1, '2018-06-01', 0),(1, '2018-06-01', 34327 ),(1, '2018-07-01', 100),
(1, '2018-07-01', 500),(1, '2018-07-01', 400), (1, '2018-08-01', 0);

Are you searching for this?
select measurements.id,
cast(concat(measurements.year, '-', measurements.month, '-01') as date) as period,
payment.period as date, coalesce(sum(payment.payment), 0) as payment
from measurements
left join payment on measurements.id = payment.id and cast(concat(measurements.year, '-', measurements.month, '-01') as date) = payment.period
group by measurements.id, measurements.year, measurements.month, payment.period
order by measurements.year, measurements.month;

You can try the query below, using a left join:
select a.id,str_to_date(concat(year,'-',month,'-',1),'%Y-%m-%d') as period,b.period as `date`,coalesce(sum(b.payment),0) as payment
from measurements a left join payment b
on a.id=b.id and str_to_date(concat(year,'-',month,'-',1),'%Y-%m-%d')=b.period
group by a.id,a.year,a.month,b.period
order by a.year,a.month
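For anyone who wants to try the left-join idea without a MySQL server, here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in. SQLite has no str_to_date, so printf is used to build the first-of-month date instead; the function name monthly_payments and the aliases m and p are made up for this illustration, and the COALESCE is only there because the requested output shows 0 rather than NULL for months without payments.

```python
import sqlite3

def monthly_payments():
    """Return one row per measurement month with the summed payments."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE measurements (id INT, cost FLOAT, month INT, year INT);
        INSERT INTO measurements VALUES
            (1, 6860.52, 5, 2018), (1, 11993.52, 6, 2018), (1, 3823.2, 7, 2018),
            (1, 3557.7, 8, 2018), (1, 3355.92, 9, 2018), (1, 357.54, 10, 2018);
        CREATE TABLE payment (id INT, period DATE, payment INT);
        INSERT INTO payment VALUES
            (1, '2018-05-01', 0), (1, '2018-06-01', 0), (1, '2018-06-01', 34327),
            (1, '2018-07-01', 100), (1, '2018-07-01', 500), (1, '2018-07-01', 400),
            (1, '2018-08-01', 0);
    """)
    rows = conn.execute("""
        SELECT m.id,
               printf('%04d-%02d-01', m.year, m.month) AS period,  -- SQLite stand-in for str_to_date
               p.period AS date,
               COALESCE(SUM(p.payment), 0) AS payment             -- 0 instead of NULL when nothing was paid
        FROM measurements AS m
        LEFT JOIN payment AS p
               ON p.id = m.id
              AND p.period = printf('%04d-%02d-01', m.year, m.month)
        GROUP BY m.id, m.year, m.month, p.period
        ORDER BY m.year, m.month
    """).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for row in monthly_payments():
        print(row)
```

The months 9 and 10 keep a NULL date because no payment row matches them, which is the point of using a left join here.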

Related

SQL — get all rows where column A's date is a multiple of 7 days apart from column B's date

CREATE TABLE table_1 (
`userid` VARCHAR(2),
`date_accessed` DATE,
`rank` INT,
`country` VARCHAR(2)
);
INSERT INTO table_1
(`userid`, `date_accessed`, `rank`, `country`)
VALUES
('A.', '2019-01-01', 1, 'US'),
('B.', '2019-01-02', 1, 'FR'),
('A.', '2019-01-03', 2, 'US'),
('A.', '2019-01-04', 3, 'US'),
('B.', '2019-01-04', 2, 'FR');
Here's the fiddle: https://www.db-fiddle.com/f/9F7XPiGtuQAYXQ99HfNJGN/0
And below is an example of the database. I want all the rows where the record date is a multiple of 7 days apart from the start date. The start date and record dates aren't unique; it'll be unique for each country, but both US and FR can have start dates of January 1 and record dates of January 8, for example. In the below table, I'd like to pull the rows where start date is 2019-01-01 and record date is 2019-01-08, for example.
| start_date | num_people | record_date | rating | country |
| ---------- | ---------- | ----------- | ------ | ------- |
| 2019-01-01 | 275 | 2019-01-08 | 4 | FR |
| 2019-01-02 | 150 | 2019-01-10 | 4 | FR |
| 2019-01-03 | 175 | 2019-01-09 | 5 | FR |
| 2019-01-04 | 300 | 2019-01-11 | 2 | FR |
| 2019-01-01 | 100 | 2019-01-08 | 8.5 | US |
| 2019-01-03 | 50 | 2019-01-10 | 5.5 | US |
| 2019-01-03 | 50 | 2019-01-17 | 5 | US |
---
I want to do this out to 84 days (every 7 days/every week for 12 weeks).
You just need the difference and a modulo function.
In MySQL:
select t.*
from t
where mod(datediff(record_date, start_date), 7) = 0;
In PrestoDB, that would be:
where mod(date_diff('day', start_date, record_date), 7) = 0
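The difference-plus-modulo idea can be checked against the sample rows from the question with a small Python sketch. It uses sqlite3 in place of MySQL, so julianday() stands in for datediff(); the table name t comes from the answer, while the function name weekly_rows and the max_days parameter (84 days, from the question) are assumptions made for the example.

```python
import sqlite3

SAMPLE = [
    ("2019-01-01", 275, "2019-01-08", 4.0, "FR"),
    ("2019-01-02", 150, "2019-01-10", 4.0, "FR"),
    ("2019-01-03", 175, "2019-01-09", 5.0, "FR"),
    ("2019-01-04", 300, "2019-01-11", 2.0, "FR"),
    ("2019-01-01", 100, "2019-01-08", 8.5, "US"),
    ("2019-01-03", 50, "2019-01-10", 5.5, "US"),
    ("2019-01-03", 50, "2019-01-17", 5.0, "US"),
]

def weekly_rows(max_days=84):
    """Rows whose record_date is 7, 14, ... up to max_days days after start_date."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (start_date TEXT, num_people INT, record_date TEXT,"
        " rating REAL, country TEXT)"
    )
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", SAMPLE)
    rows = conn.execute("""
        SELECT start_date, record_date, country
        FROM t
        WHERE CAST(julianday(record_date) - julianday(start_date) AS INTEGER) % 7 = 0
          AND julianday(record_date) - julianday(start_date) BETWEEN 7 AND ?
        ORDER BY country, start_date, record_date
    """, (max_days,)).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for row in weekly_rows():
        print(row)
```

The BETWEEN condition is only there to cap the range at 12 weeks; drop it if any multiple of 7 days should match.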

Calculate tax amount between 3 different tables with MySQL

I have the following tables structure and trying to make a report from these:
___BillableDatas
|--------|------------|---------|--------------|------------|
| BIL_Id | BIL_Date |BIL_Rate | BIL_Quantity | BIL_Status |
|--------|------------|---------|--------------|------------|
| 1 | 2018-03-01 | 105 | 1 | charged |
| 2 | 2018-03-02 | 105 | 1 | cancelled |
| 3 | 2018-03-01 | 15 | 2 | notcharged |
| 4 | 2018-03-01 | 21 | 1 | notcharged |
| 5 | 2018-03-02 | 15 | 2 | notcharged |
| 6 | 2018-03-02 | 21 | 1 | notcharged |
|--------|------------|---------|--------------|------------|
___SalesTaxes
|--------|--------------|------------|
| STX_Id | STX_TaxeName | STX_Amount |
|--------|--------------|------------|
| 8 | Tax 1 | 5.000 |
| 9 | Tax 2 | 5.000 |
| 10 | Tax 3 | 19.975 |
|--------|--------------|------------|
STX_Amount is a percentage.
___ApplicableTaxes
|-----------|-----------|
| ATX_BILId | ATX_STXId |
|-----------|-----------|
| 1 | 8 |
| 1 | 9 |
| 1 | 10 |
| 2 | 8 |
| 2 | 9 |
| 2 | 10 |
| 3 | 9 |
| 3 | 10 |
| 4 | 9 |
| 5 | 9 |
| 5 | 10 |
| 6 | 9 |
|-----------|-----------|
ATX_BILId is the item ID link with ___BillableDatas.
ATX_STXId is the tax ID link with ___SalesTaxes.
I need to get the sum of the items per day:
- without tax
- with tax
Something like this:
|------------------|---------------|------------|
| BIL_RateNonTaxed | BIL_RateTaxed | BIL_Status |
|------------------|---------------|------------|
| 105.00 | 136.47 | charged | <- Taxes #8, #9 and #10 applicable
| 102.00 | 118.035 | notcharged | <- Taxes #9 and #10 applicable
|------------------|---------------|------------|
Explanation of the totals:
105 = 105*1 -- (total of the charged item multiplied by the quantity)
102 = (15*2)*2+(21*2) -- (total of the notcharged items multiplied by the quantity)
136.47 = 105+(105*(5+5+19.975)/100)
119.085 = 102+(((15*2)*2)*(5+19.975)/100+(21*2)*5/100)
My last try was this one:
SELECT
BIL_Date,
(BIL_Rate*BIL_Quantity) AS BIL_RateNonTaxed,
(((BIL_Rate*BIL_Quantity)*SUM(STX_Amount)/100)+BIL_Rate*BIL_Quantity) AS BIL_RateTaxed,
BIL_Status
FROM ___BillableDatas
LEFT JOIN ___SalesTaxes
ON FIND_IN_SET(STX_Id, BIL_ApplicableTaxes) > 0
LEFT JOIN ___ApplicableTaxes
ON ___BillableDatas.BIL_Id = ___ApplicableTaxes.ATX_BILId
WHERE BIL_BookingId=1
GROUP BY BIL_Id AND BIL_Status
ORDER BY BIL_Date
ASC
Please see this SQLFiddle to help you if needed:
http://sqlfiddle.com/#!9/425854f
Thanks.
I cannot bear to work with your naming policy, so I made my own...
DROP TABLE IF EXISTS bills;
CREATE TABLE bills
(bill_id SERIAL PRIMARY KEY
,bill_date DATE NOT NULL
,bill_rate INT NOT NULL
,bill_quantity INT NOT NULL
,bill_status ENUM('charged','cancelled','notcharged')
);
INSERT INTO bills VALUES
(1,'2018-03-01',105,1,'charged'),
(2,'2018-03-02',105,1,'cancelled'),
(3,'2018-03-01',15,2,'notcharged'),
(4,'2018-03-01',21,1,'notcharged'),
(5,'2018-03-02',15,2,'notcharged'),
(6,'2018-03-02',21,1,'notcharged');
DROP TABLE IF EXISTS sales_taxes;
CREATE TABLE sales_taxes
(sales_tax_id SERIAL PRIMARY KEY
,sales_tax_name VARCHAR(12) NOT NULL
,sales_tax_amount DECIMAL(5,3) NOT NULL
);
INSERT INTO sales_taxes VALUES
( 8,'Tax 1', 5.000),
( 9,'Tax 2', 5.000),
(10,'Tax 3',19.975);
DROP TABLE IF EXISTS applicable_taxes;
CREATE TABLE applicable_taxes
(bill_id INT NOT NULL
,sales_tax_id INT NOT NULL
,PRIMARY KEY(bill_id,sales_tax_id)
);
INSERT INTO applicable_taxes VALUES
(1, 8),
(1, 9),
(1,10),
(2, 8),
(2, 9),
(2,10),
(3, 9),
(3,10),
(4, 9),
(5, 9),
(5,10),
(6, 9);
SELECT bill_status
, SUM(bill_rate*bill_quantity) nontaxed
, SUM((bill_rate*bill_quantity)+(bill_rate*bill_quantity*total_sales_tax/100)) taxed
FROM
( SELECT b.*
, SUM(t.sales_tax_amount) total_sales_tax
FROM bills b
JOIN applicable_taxes bt
ON bt.bill_id = b.bill_id
JOIN sales_taxes t
ON t.sales_tax_id = bt.sales_tax_id
GROUP
BY b.bill_id
) x
GROUP
BY bill_status;
+-------------+----------+-------------+
| bill_status | nontaxed | taxed |
+-------------+----------+-------------+
| charged | 105 | 136.4737500 |
| cancelled | 105 | 136.4737500 |
| notcharged | 102 | 119.0850000 |
+-------------+----------+-------------+
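My total for notcharged (119.085) differs slightly from the 118.035 in your expected table, but it matches the 119.085 in your own explanation of the totals, so the value in the table looks like a typo. Either way, this should get you pretty close.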
SELECT a.BIL_Date, BIL_RateNonTaxed, BIL_RateNonTaxed+BIL_RateTaxed AS BIL_RateTaxed FROM (
SELECT BIL_Date,
SUM(BIL_Rate*BIL_Quantity) AS BIL_RateNonTaxed
FROM ___BillableDatas
WHERE BIL_Status != 'cancelled'
GROUP BY BIL_Date
) a INNER JOIN (
SELECT BIL_Date,
SUM(BIL_Rate*BIL_Quantity*STX_Amount/100) AS BIL_RateTaxed
FROM ___BillableDatas
LEFT JOIN ___ApplicableTaxes
ON ___BillableDatas.BIL_Id = ___ApplicableTaxes.ATX_BILId
LEFT JOIN ___SalesTaxes
ON STX_Id = ATX_STXId
WHERE BIL_Status != 'cancelled'
GROUP BY BIL_Date
) b
ON a.BIL_Date = b.BIL_Date
ORDER BY a.BIL_Date;
Explanation:
Your BIL_RateNonTaxed calculation does not use the ___SalesTaxes table, so that table must not appear in the same query; otherwise the join would repeat each item once per applicable tax and inflate the SUM.
However, your BIL_RateTaxed does use the ___SalesTaxes table, so I solved it by creating two subqueries and joining the results.
I know there are better answers, but I'm not familiar with MySQL syntax.
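To double-check the arithmetic, here is a rough sketch in Python with sqlite3 standing in for MySQL. It follows the "sum the tax rates per item first, then aggregate per status" approach and uses the renamed tables from the first answer; the function name totals_by_status and the derived column names amount and total_tax are invented for the example.

```python
import sqlite3

def totals_by_status():
    """Return {status: (nontaxed, taxed)} for the sample data in the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE bills (bill_id INTEGER PRIMARY KEY, bill_date TEXT,
                            bill_rate INT, bill_quantity INT, bill_status TEXT);
        INSERT INTO bills VALUES
            (1, '2018-03-01', 105, 1, 'charged'),
            (2, '2018-03-02', 105, 1, 'cancelled'),
            (3, '2018-03-01', 15, 2, 'notcharged'),
            (4, '2018-03-01', 21, 1, 'notcharged'),
            (5, '2018-03-02', 15, 2, 'notcharged'),
            (6, '2018-03-02', 21, 1, 'notcharged');
        CREATE TABLE sales_taxes (sales_tax_id INTEGER PRIMARY KEY,
                                  sales_tax_name TEXT, sales_tax_amount REAL);
        INSERT INTO sales_taxes VALUES (8, 'Tax 1', 5.0), (9, 'Tax 2', 5.0), (10, 'Tax 3', 19.975);
        CREATE TABLE applicable_taxes (bill_id INT, sales_tax_id INT,
                                       PRIMARY KEY (bill_id, sales_tax_id));
        INSERT INTO applicable_taxes VALUES
            (1, 8), (1, 9), (1, 10), (2, 8), (2, 9), (2, 10),
            (3, 9), (3, 10), (4, 9), (5, 9), (5, 10), (6, 9);
    """)
    rows = conn.execute("""
        SELECT x.bill_status,
               SUM(x.amount) AS nontaxed,
               SUM(x.amount * (1 + x.total_tax / 100.0)) AS taxed
        FROM (
            -- one row per bill, with all of its tax percentages added up
            SELECT b.bill_id, b.bill_status,
                   b.bill_rate * b.bill_quantity AS amount,
                   SUM(t.sales_tax_amount) AS total_tax
            FROM bills AS b
            JOIN applicable_taxes AS bt ON bt.bill_id = b.bill_id
            JOIN sales_taxes AS t ON t.sales_tax_id = bt.sales_tax_id
            GROUP BY b.bill_id
        ) AS x
        GROUP BY x.bill_status
    """).fetchall()
    conn.close()
    return {status: (nontaxed, taxed) for status, nontaxed, taxed in rows}

if __name__ == "__main__":
    for status, (nontaxed, taxed) in sorted(totals_by_status().items()):
        print(status, nontaxed, round(taxed, 5))
```

Aggregating the taxes per item before summing per status is what keeps the untaxed total from being multiplied by the number of taxes.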

Include NULL in SQL Join when using WHERE

I have the following two tables:
Table TempUser22 : 57,000 rows:
+------+-----------+
| Id | Followers |
+------+-----------+
| 874 | 55542 |
| 1081 | 330624 |
| 1378 | 17919 |
| 1621 | 920 |
| 1688 | 255463 |
| 2953 | 751 |
| 3382 | 204466 |
| 3840 | 273489 |
| 4145 | 376 |
| ... | ... |
+------+-----------+
Table temporal_users : 10,000,000 rows total, 3200 rows Where Date=2010-12-31:
+---------------------+---------+--------------------+
| Date | User_Id | has_original_tweet |
+---------------------+---------+--------------------+
| 2008-02-22 12:00:00 | 676493 | 2 |
| 2008-02-22 12:00:00 | 815263 | 1 |
| 2008-02-22 12:00:00 | 6245822 | 1 |
| 2008-02-22 12:00:00 | 8854092 | 1 |
| 2008-02-23 12:00:00 | 676493 | 2 |
| 2008-02-23 12:00:00 | 815263 | 1 |
| 2008-02-23 12:00:00 | 6245822 | 1 |
| 2008-02-23 12:00:00 | 8854092 | 1 |
| 2008-02-24 12:00:00 | 676493 | 2 |
| ............. | ... | .. |
+---------------------+---------+--------------------+
I am running the following join query on these tables:
SELECT sum(has_original_tweet), b.Id
FROM temporal_users AS a
RIGHT JOIN TempUser22 AS b
ON a.User_ID = b.Id
GROUP BY b.Id;
Which returns 57,000 rows as expected, with NULL answers on the first field:
+-------------------------+------+
| sum(has_original_tweet) | Id |
+-------------------------+------+
| NULL | 874 |
| NULL | 1081 |
| 135 | 1378 |
| 164 | 1621 |
| 652 | 1688 |
| 691 | 2953 |
| NULL | 3382 |
| NULL | 3840 |
| NULL | 4145 |
| ... | .... |
+-------------------------+------+
However, when adding the WHERE line specifying a date as below:
SELECT sum(has_original_tweet), b.Id
FROM temporal_users AS a
RIGHT JOIN TempUser22 AS b
ON a.User_ID = b.Id
WHERE a.Date BETWEEN '2010-12-31-00:00:00' AND '2010-12-31-23:59:59'
GROUP BY b.Id;
I receive the following answer, of only 3200 rows, and without any NULL in the first field.
+-------------------------+---------+
| sum(has_original_tweet) | Id |
+-------------------------+---------+
| 1 | 797194 |
| 1 | 815263 |
| 0 | 820678 |
| 1 | 1427511 |
| 0 | 4653731 |
| 1 | 5933862 |
| 2 | 7530552 |
| 1 | 7674072 |
| 1 | 8149632 |
| .. | .... |
+-------------------------+---------+
My question is: for a given date, how do I get a result of 57,000 rows, one for each user in TempUser22, with NULL values where has_original_tweet is not present in temporal_users for that date?
Thanks.
SELECT b.Id, SUM(a.has_original_tweet) s
FROM TempUser22 b
LEFT JOIN temporal_users a ON b.Id = a.User_Id
AND a.Date BETWEEN '2010-12-31-00:00:00' AND '2010-12-31-23:59:59'
GROUP BY b.Id;
Id s
1 null
2 1
3 null
4 3
5 null
6 null
For debugging, I used:
CREATE TEMPORARY TABLE TempUser22(Id INT, Followers INT)
SELECT 1 Id, 10 Followers UNION ALL
SELECT 2, 20 UNION ALL
SELECT 3, 30 UNION ALL
SELECT 4, 40 UNION ALL
SELECT 5, 50 UNION ALL
SELECT 6, 60
;
CREATE TEMPORARY TABLE temporal_users(`Date` DATETIME, User_Id INT, has_original_tweet INT)
SELECT '2008-02-22 12:00:00' `Date`, 1 User_Id, 1 has_original_tweet UNION ALL
SELECT '2008-12-31 12:00:00', 2, 1 UNION ALL
SELECT '2010-12-31 12:00:00', 2, 1 UNION ALL
SELECT '2012-12-31 12:00:00', 2, 1 UNION ALL
SELECT '2008-12-31 12:00:00', 4, 9 UNION ALL
SELECT '2010-12-31 12:00:00', 4, 1 UNION ALL
SELECT '2010-12-31 12:00:00', 4, 2 UNION ALL
SELECT '2012-12-31 12:00:00', 4, 9
;
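The difference between filtering in the ON clause and filtering in the WHERE clause can be seen side by side with the debugging data above. This is a minimal sketch in Python, with sqlite3 standing in for MySQL; the function name compare_on_vs_where is made up, and the date bounds are written in the standard 'YYYY-MM-DD hh:mm:ss' form because SQLite compares them as plain text.

```python
import sqlite3

def compare_on_vs_where():
    """Return (rows with the date filter in ON, rows with the date filter in WHERE)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE TempUser22 (Id INT, Followers INT);
        INSERT INTO TempUser22 VALUES (1, 10), (2, 20), (3, 30), (4, 40), (5, 50), (6, 60);
        CREATE TABLE temporal_users (Date TEXT, User_Id INT, has_original_tweet INT);
        INSERT INTO temporal_users VALUES
            ('2008-02-22 12:00:00', 1, 1),
            ('2008-12-31 12:00:00', 2, 1),
            ('2010-12-31 12:00:00', 2, 1),
            ('2012-12-31 12:00:00', 2, 1),
            ('2008-12-31 12:00:00', 4, 9),
            ('2010-12-31 12:00:00', 4, 1),
            ('2010-12-31 12:00:00', 4, 2),
            ('2012-12-31 12:00:00', 4, 9);
    """)
    bounds = ("2010-12-31 00:00:00", "2010-12-31 23:59:59")
    # Filter inside ON: unmatched users survive with a NULL sum.
    on_rows = conn.execute("""
        SELECT b.Id, SUM(a.has_original_tweet)
        FROM TempUser22 AS b
        LEFT JOIN temporal_users AS a
               ON b.Id = a.User_Id AND a.Date BETWEEN ? AND ?
        GROUP BY b.Id ORDER BY b.Id
    """, bounds).fetchall()
    # Filter inside WHERE: rows where a.Date is NULL are thrown away after the join.
    where_rows = conn.execute("""
        SELECT b.Id, SUM(a.has_original_tweet)
        FROM TempUser22 AS b
        LEFT JOIN temporal_users AS a ON b.Id = a.User_Id
        WHERE a.Date BETWEEN ? AND ?
        GROUP BY b.Id ORDER BY b.Id
    """, bounds).fetchall()
    conn.close()
    return on_rows, where_rows

if __name__ == "__main__":
    on_rows, where_rows = compare_on_vs_where()
    print("ON:   ", on_rows)
    print("WHERE:", where_rows)
```

With the condition in ON, all six users come back; with it in WHERE, the outer join effectively turns into an inner join and only users 2 and 4 remain.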
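That's because the WHERE clause is applied after the join, and a comparison against NULL is never true, so the unmatched rows (where a.Date is NULL) are discarded.
You can use coalesce in your WHERE clause:
WHERE coalesce(a.Date, 'some-date-in-the-range') BETWEEN '2010-12-31-00:00:00' AND '2010-12-31-23:59:59'
This makes rows with a NULL a.Date pass the filter. Note that it only keeps users who have no rows in temporal_users at all; a user whose rows all fall on other dates is still filtered out, so moving the date condition into the ON clause is the safer fix.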
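A quick way to see what the coalesce variant returns is to run it on a small data set. The sketch below uses Python's sqlite3 as a stand-in for MySQL and a cut-down version of the sample data from this thread; the function name coalesce_filter is hypothetical, and the in-range default date passed to COALESCE is an arbitrary choice.

```python
import sqlite3

def coalesce_filter():
    """Run the COALESCE-in-WHERE variant and return (Id, sum) rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE TempUser22 (Id INT, Followers INT);
        INSERT INTO TempUser22 VALUES (1, 10), (2, 20), (3, 30), (4, 40), (5, 50), (6, 60);
        CREATE TABLE temporal_users (Date TEXT, User_Id INT, has_original_tweet INT);
        INSERT INTO temporal_users VALUES
            ('2008-02-22 12:00:00', 1, 1),   -- user 1 only has a row on another date
            ('2010-12-31 12:00:00', 2, 1),
            ('2010-12-31 12:00:00', 4, 1),
            ('2010-12-31 12:00:00', 4, 2),
            ('2012-12-31 12:00:00', 4, 9);
    """)
    rows = conn.execute("""
        SELECT b.Id, SUM(a.has_original_tweet)
        FROM TempUser22 AS b
        LEFT JOIN temporal_users AS a ON b.Id = a.User_Id
        WHERE COALESCE(a.Date, '2010-12-31 12:00:00')
              BETWEEN '2010-12-31 00:00:00' AND '2010-12-31 23:59:59'
        GROUP BY b.Id ORDER BY b.Id
    """).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    print(coalesce_filter())
```

Users 3, 5 and 6 (no rows at all) are kept with a NULL sum, but user 1 is missing from the result because its only row has a real, out-of-range date that COALESCE leaves untouched.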

How to query multiple tables using a single query?

I want my tables to output something like this
---------------------------------------------------------------------------------------------
| date | location | time | delegate 1 | delegate 2 |
|--------------------------------------------------------------------------------------------
| 2015-12-07 | Table 1 | 9:00 | first_name_4 last_name_4 | first_name_5 last_name_5 |
|--------------------------------------------------------------------------------------------
| | 9:30 | first_name_4 last_name_4 | first_name_6 last_name_6 |
|--------------------------------------------------------------------------------------------
| | 9:30 | first_name_3 last_name_3 | first_name_7 last_name_7 |
|--------------------------------------------------------------------------------------------
| | 9:00 | first_name_3 last_name_3 | first_name_7 last_name_7 |
|--------------------------------------------------------------------------------------------
Here are the tables on my db
meetings table
-------------------------------------------------------------------------------------------------
| id | date_id | time_id | location_id | delegate_id_1 | delegate_id_2 | status |
|------------------------------------------------------------------------------------------------
| 1 | 1 | 1 | 1 | 4 | 5 | A |
|------------------------------------------------------------------------------------------------
| 2 | 1 | 2 | 1 | 4 | 6 | A |
|------------------------------------------------------------------------------------------------
| 3 | 1 | 1 | 1 | 2 | 6 | P |
|------------------------------------------------------------------------------------------------
| 4 | 1 | 2 | 1 | 1 | 3 | A |
|------------------------------------------------------------------------------------------------
| 5 | 1 | 1 | 1 | 1 | 3 | A |
|------------------------------------------------------------------------------------------------
users table
-----------------------------------------
| id | first_name | last_name |
|----------------------------------------
| 1 | first_name_1 | last_name_1 |
|----------------------------------------
| 2 | first_name_2 | last_name_2 |
|----------------------------------------
| 3 | first_name_3 | last_name_3 |
|----------------------------------------
| 4 | first_name_4 | last_name_4 |
|----------------------------------------
| 5 | first_name_5 | last_name_5 |
|----------------------------------------
| 6 | first_name_6 | last_name_6 |
|----------------------------------------
locations table
-----------------------------
| id | location_name |
|----------------------------
| 1 | Table 1 |
|----------------------------
time table
-------------------------
| id | meeting_time |
|------------------------
| 1 | 9:00:00 |
|------------------------
| 2 | 9:30:00 |
|------------------------
dates table
-------------------------
| id | meeting_date |
|------------------------
| 1 | 2015-12-07 |
|------------------------
| 2 | 2015-12-08 |
|------------------------
| 3 | 2015-12-09 |
|------------------------
My initial query goes like this
-- $query_date
SELECT meeting_date
FROM dates
WHERE meeting_date = '2015-12-07'
-- $query_location
SELECT location_name.location
from location
LEFT JOIN meetings
ON meetings.location_id=location.id
LEFT JOIN date
ON meetings.date_id=date.id
WHERE meeting_date.dates = '2015-12-07'
Now, here's the part where I got it wrong.
-- $query_final
SELECT meeting_time.time, delegate1.first_name AS first_name_1,
delegate1.last_name AS last_name_1, delegate2.first_name AS first_name_2,
delegate2.last_name AS last_name_2
FROM meetings
INNER JOIN users delegate1
ON meetings.delegate_id_1=users.id
LEFT JOIN users delegate2
ON meetings.delegate_id_2=users.id
WHERE meetings.status='A'
The results on my last query give me unexpected results since the results show more entries than my meetings table.
I know the queries I made are costly, but I don't know how to write a more optimized query. I don't even know if it's possible to get the results with a single query. Any help will do. Thanks.
You can bring back everything with a single query with the right JOIN.
Be careful: when you qualify a column name in SQL, the syntax is TABLE.COLUMN_NAME; you seem to get the order wrong quite often.
I renamed some tables, as you sometimes use an s at the end and sometimes not.
As time and date are SQL keywords, it's better to use the s everywhere.
SQL Fiddle
MySQL 5.6 Schema Setup:
CREATE TABLE meetings (`id` int, `date_id` int, `time_id` int, `location_id` int, `delegate_id_1` int, `delegate_id_2` int, `status` varchar(1));
INSERT INTO meetings (`id`, `date_id`, `time_id`, `location_id`, `delegate_id_1`, `delegate_id_2`, `status`)
VALUES (1, 1, 1, 1, 4, 5, 'A'),
(2, 1, 2, 1, 4, 6, 'A'),
(3, 1, 1, 1, 2, 6, 'P'),
(4, 1, 2, 1, 1, 3, 'A'),
(5, 1, 1, 1, 1, 3, 'A');
CREATE TABLE users (`id` int, `first_name` varchar(12), `last_name` varchar(11));
INSERT INTO users (`id`, `first_name`, `last_name`)
VALUES (1, 'first_name_1', 'last_name_1'),
(2, 'first_name_2', 'last_name_2'),
(3, 'first_name_3', 'last_name_3'),
(4, 'first_name_4', 'last_name_4'),
(5, 'first_name_5', 'last_name_5'),
(6, 'first_name_6', 'last_name_6');
CREATE TABLE locations (`id` int, `location_name` varchar(7));
INSERT INTO locations (`id`, `location_name`)
VALUES (1, 'Table 1');
CREATE TABLE times (`id` int, `meeting_time` varchar(7));
INSERT INTO times (`id`, `meeting_time`)
VALUES (1, '9:00:00'),
(2, '9:30:00') ;
CREATE TABLE dates (`id` int, `meeting_date` varchar(10)) ;
INSERT INTO dates (`id`, `meeting_date`)
VALUES (1, '2015-12-07'),
(2, '2015-12-08'),
(3, '2015-12-09') ;
Query 1:
-- $query_final
SELECT locations.location_name,
`times`.meeting_time,
delegate1.first_name AS first_name_1,
delegate1.last_name AS last_name_1,
delegate2.first_name AS first_name_2,
delegate2.last_name AS last_name_2
FROM meetings
LEFT JOIN locations
ON meetings.location_id=locations.id
LEFT JOIN dates
ON meetings.date_id=`dates`.id
LEFT JOIN times
ON meetings.time_id=`times`.id
INNER JOIN users delegate1
ON meetings.delegate_id_1 = delegate1.id
LEFT JOIN users delegate2
ON meetings.delegate_id_2 = delegate2.id
WHERE
meetings.status = 'A'
AND dates.meeting_date = '2015-12-07'
Results:
| location_name | meeting_time | first_name_1 | last_name_1 | first_name_2 | last_name_2 |
|---------------|--------------|--------------|-------------|--------------|-------------|
| Table 1 | 9:00:00 | first_name_1 | last_name_1 | first_name_3 | last_name_3 |
| Table 1 | 9:30:00 | first_name_1 | last_name_1 | first_name_3 | last_name_3 |
| Table 1 | 9:00:00 | first_name_4 | last_name_4 | first_name_5 | last_name_5 |
| Table 1 | 9:30:00 | first_name_4 | last_name_4 | first_name_6 | last_name_6 |
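The key point, joining the users table twice under two different aliases, can be tried out with the following sketch. It is written in Python with sqlite3 as a stand-in for MySQL; the function name meetings_on and the short aliases (m, d, t, l, u1, u2) are made up for the example, and the names are concatenated with || where MySQL would use CONCAT.

```python
import sqlite3

def meetings_on(date, status="A"):
    """Return (date, location, time, delegate 1, delegate 2) for one day."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE meetings (id INT, date_id INT, time_id INT, location_id INT,
                               delegate_id_1 INT, delegate_id_2 INT, status TEXT);
        INSERT INTO meetings VALUES
            (1, 1, 1, 1, 4, 5, 'A'), (2, 1, 2, 1, 4, 6, 'A'), (3, 1, 1, 1, 2, 6, 'P'),
            (4, 1, 2, 1, 1, 3, 'A'), (5, 1, 1, 1, 1, 3, 'A');
        CREATE TABLE users (id INT, first_name TEXT, last_name TEXT);
        INSERT INTO users VALUES
            (1, 'first_name_1', 'last_name_1'), (2, 'first_name_2', 'last_name_2'),
            (3, 'first_name_3', 'last_name_3'), (4, 'first_name_4', 'last_name_4'),
            (5, 'first_name_5', 'last_name_5'), (6, 'first_name_6', 'last_name_6');
        CREATE TABLE locations (id INT, location_name TEXT);
        INSERT INTO locations VALUES (1, 'Table 1');
        CREATE TABLE times (id INT, meeting_time TEXT);
        INSERT INTO times VALUES (1, '9:00:00'), (2, '9:30:00');
        CREATE TABLE dates (id INT, meeting_date TEXT);
        INSERT INTO dates VALUES (1, '2015-12-07'), (2, '2015-12-08'), (3, '2015-12-09');
    """)
    rows = conn.execute("""
        SELECT d.meeting_date, l.location_name, t.meeting_time,
               u1.first_name || ' ' || u1.last_name AS delegate_1,
               u2.first_name || ' ' || u2.last_name AS delegate_2
        FROM meetings AS m
        JOIN dates AS d ON d.id = m.date_id
        JOIN times AS t ON t.id = m.time_id
        JOIN locations AS l ON l.id = m.location_id
        JOIN users AS u1 ON u1.id = m.delegate_id_1       -- first copy of users
        LEFT JOIN users AS u2 ON u2.id = m.delegate_id_2  -- second copy of users
        WHERE m.status = ? AND d.meeting_date = ?
        ORDER BY m.id
    """, (status, date)).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for row in meetings_on("2015-12-07"):
        print(row)
```

Each join condition refers to its own alias (u1 or u2), which is what prevents the row multiplication seen in the original attempt.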

Manipulating user data in MySQL

New to MySQL and need help manipulating user data in table 1 into the structure shown in table 2.
table 1
table 2
A user session is defined as a period of user activity with requests at least every 30 minutes. A session ends when the user has been inactive for over 30 minutes.
Does anyone know how to write mysql code that transforms table 1 into 2?
The following code can be used to create the log table:
CREATE TABLE log
( user_id int, request_timestamp datetime);
INSERT INTO log
VALUES
(1, '2014-10-26 10:51:18'), (1, '2014-10-26 10:52:20'), (1, '2014-10-26 11:15:03'), (1, '2014-10-26 11:39:18'), (1, '2014-10-26 15:01:18'), (1, '2014-10-26 15:01:21'), (1, '2014-10-27 21:22:19'),
(2, '2014-10-15 12:19:01'), (2, '2014-10-15 12:19:12'), (2, '2014-10-15 12:19:45'), (2, '2014-10-15 12:20:03'), (2, '2014-10-17 14:55:13'), (2, '2014-10-17 14:55:19'),(2, '2014-10-17 14:55:22')
;
First we will give the following query a name, just to visualize it.
Note that the 1800 below is 30 minutes * 60 seconds per minute.
Specimen A
-----
select l.user_id,l.request_timestamp,
@sessionnum :=
if((@curuser = user_id and TIME_TO_SEC(TIMEDIFF(request_timestamp,@theDt))>1800),@sessionnum + 1,
if(@curuser <> user_id,1,@sessionnum)) as sessionnum,
@curuser := user_id as v_curuser,
@theDt:=request_timestamp as v_theDt
from log l cross join
(select @curuser := '', @sessionnum := 0,@theDt:='') gibberish
order by l.user_id,l.request_timestamp
+---------+---------------------+------------+-----------+---------------------+
| user_id | request_timestamp | sessionnum | v_curuser | v_theDt |
+---------+---------------------+------------+-----------+---------------------+
| 1 | 2014-10-26 10:51:18 | 1 | 1 | 2014-10-26 10:51:18 |
| 1 | 2014-10-26 10:52:20 | 1 | 1 | 2014-10-26 10:52:20 |
| 1 | 2014-10-26 11:15:03 | 1 | 1 | 2014-10-26 11:15:03 |
| 1 | 2014-10-26 11:39:18 | 1 | 1 | 2014-10-26 11:39:18 |
| 1 | 2014-10-26 15:01:18 | 2 | 1 | 2014-10-26 15:01:18 |
| 1 | 2014-10-26 15:01:21 | 2 | 1 | 2014-10-26 15:01:21 |
| 1 | 2014-10-27 21:22:19 | 3 | 1 | 2014-10-27 21:22:19 |
| 2 | 2014-10-15 12:19:01 | 1 | 2 | 2014-10-15 12:19:01 |
| 2 | 2014-10-15 12:19:12 | 1 | 2 | 2014-10-15 12:19:12 |
| 2 | 2014-10-15 12:19:45 | 1 | 2 | 2014-10-15 12:19:45 |
| 2 | 2014-10-15 12:20:03 | 1 | 2 | 2014-10-15 12:20:03 |
| 2 | 2014-10-17 14:55:13 | 2 | 2 | 2014-10-17 14:55:13 |
| 2 | 2014-10-17 14:55:19 | 2 | 2 | 2014-10-17 14:55:19 |
| 2 | 2014-10-17 14:55:22 | 2 | 2 | 2014-10-17 14:55:22 |
+---------+---------------------+------------+-----------+---------------------+
At that point we are done, if you want. But for pretty printing, you can wrap Specimen A inside another query:
select user_id,request_timestamp,sessionnum
from
( select l.user_id,l.request_timestamp,
@sessionnum :=
if((@curuser = user_id and TIME_TO_SEC(TIMEDIFF(request_timestamp,@theDt))>1800),@sessionnum + 1,
if(@curuser <> user_id,1,@sessionnum)) as sessionnum,
@curuser := user_id as v_curuser,
@theDt:=request_timestamp as v_theDt
from log l cross join
(select @curuser := '', @sessionnum := 0,@theDt:='') gibberish
order by l.user_id,l.request_timestamp
) SpecimenA
order by user_id,sessionnum
+---------+---------------------+------------+
| user_id | request_timestamp | sessionnum |
+---------+---------------------+------------+
| 1 | 2014-10-26 10:51:18 | 1 |
| 1 | 2014-10-26 10:52:20 | 1 |
| 1 | 2014-10-26 11:15:03 | 1 |
| 1 | 2014-10-26 11:39:18 | 1 |
| 1 | 2014-10-26 15:01:18 | 2 |
| 1 | 2014-10-26 15:01:21 | 2 |
| 1 | 2014-10-27 21:22:19 | 3 |
| 2 | 2014-10-15 12:19:01 | 1 |
| 2 | 2014-10-15 12:19:12 | 1 |
| 2 | 2014-10-15 12:19:45 | 1 |
| 2 | 2014-10-15 12:20:03 | 1 |
| 2 | 2014-10-17 14:55:13 | 2 |
| 2 | 2014-10-17 14:55:19 | 2 |
| 2 | 2014-10-17 14:55:22 | 2 |
+---------+---------------------+------------+
14 rows in set (0.02 sec)
Note the OP's definition of a session. It is one of inactivity, not duration.
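The inactivity rule (a new session starts whenever more than 30 minutes pass between two requests of the same user) can also be written out procedurally, which may make the logic easier to follow. Here is a minimal sketch in plain Python; the names LOG and assign_sessions are invented for the example and the data is the sample log from the question.

```python
from datetime import datetime, timedelta

LOG = [
    (1, "2014-10-26 10:51:18"), (1, "2014-10-26 10:52:20"), (1, "2014-10-26 11:15:03"),
    (1, "2014-10-26 11:39:18"), (1, "2014-10-26 15:01:18"), (1, "2014-10-26 15:01:21"),
    (1, "2014-10-27 21:22:19"),
    (2, "2014-10-15 12:19:01"), (2, "2014-10-15 12:19:12"), (2, "2014-10-15 12:19:45"),
    (2, "2014-10-15 12:20:03"), (2, "2014-10-17 14:55:13"), (2, "2014-10-17 14:55:19"),
    (2, "2014-10-17 14:55:22"),
]

def assign_sessions(log, gap=timedelta(minutes=30)):
    """Return (user_id, timestamp, session number) tuples, ordered by user and time."""
    rows = sorted(
        (user, datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")) for user, ts in log
    )
    result = []
    prev_user, prev_ts, session = None, None, 0
    for user, ts in rows:
        if user != prev_user:
            session = 1            # first request of a new user
        elif ts - prev_ts > gap:
            session += 1           # inactive for more than `gap`: next session
        result.append((user, ts, session))
        prev_user, prev_ts = user, ts
    return result

if __name__ == "__main__":
    for user, ts, session in assign_sessions(LOG):
        print(user, ts, session)
```

The gap is always measured against the previous request, not against the start of the session, which is why a session can last longer than 30 minutes in total.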
Try this:
SELECT user_id,
count(*) as request_count,
min(request_timestamp) as session_start,
max(request_timestamp) as session_end,
timestampdiff(
SECOND,
min(request_timestamp),
max(request_timestamp)
) as session_duration
FROM `log`
GROUP BY user_id
APPENDED
Now, with the valuable answer from @drew, you can get exactly the proposed table 2:
Take my query and insert his code inside the parentheses.
SELECT user_id,
sessionnum as `session`,
count(*) as request_count,
min(request_timestamp) as session_start,
max(request_timestamp) as session_end,
timestampdiff(
SECOND,
min(request_timestamp),
max(request_timestamp)
) as session_duration
FROM (put code of drew here) ttt
GROUP BY user_id, sessionnum
However
I still think you'd be better off storing the session number in a separate column, set by an insert trigger on the table(s) with the observed activity, to avoid putting a heavy load on the DB once the log becomes very large.
Also, stop using reserved words and MySQL function names as table or column names and aliases (e.g. log, session in your sample).
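For comparison, the per-session summary (request count, start, end and duration in seconds) can be computed in one pass outside the database. This is only a sketch in plain Python under the same 30-minute inactivity rule; LOG, session_summary and the helper _describe are hypothetical names, and the tuple layout mirrors the columns of the grouped query above.

```python
from datetime import datetime, timedelta
from itertools import groupby

LOG = [
    (1, "2014-10-26 10:51:18"), (1, "2014-10-26 10:52:20"), (1, "2014-10-26 11:15:03"),
    (1, "2014-10-26 11:39:18"), (1, "2014-10-26 15:01:18"), (1, "2014-10-26 15:01:21"),
    (1, "2014-10-27 21:22:19"),
    (2, "2014-10-15 12:19:01"), (2, "2014-10-15 12:19:12"), (2, "2014-10-15 12:19:45"),
    (2, "2014-10-15 12:20:03"), (2, "2014-10-17 14:55:13"), (2, "2014-10-17 14:55:19"),
    (2, "2014-10-17 14:55:22"),
]

def _describe(user, session, stamps):
    """Build one summary row from the timestamps of a finished session."""
    start, end = stamps[0], stamps[-1]
    return (user, session, len(stamps), start, end, int((end - start).total_seconds()))

def session_summary(log, gap=timedelta(minutes=30)):
    """Return (user, session, request_count, start, end, duration_seconds) rows."""
    rows = sorted(
        (user, datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")) for user, ts in log
    )
    summary = []
    for user, group in groupby(rows, key=lambda row: row[0]):
        current = []   # timestamps of the session currently being built
        session = 0
        for _, ts in group:
            if current and ts - current[-1] > gap:
                session += 1
                summary.append(_describe(user, session, current))
                current = []
            current.append(ts)
        session += 1   # close the last open session of this user
        summary.append(_describe(user, session, current))
    return summary

if __name__ == "__main__":
    for row in session_summary(LOG):
        print(row)
```

Doing this in application code avoids the user-variable trick entirely, at the cost of pulling the log rows out of the database first.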