I need to get each user's visit duration for each day in MySQL.
I have a table like:
user_id,date,time_start, time_end
1, 2018-09-01, 09:00:00, 12:30:00
2, 2018-09-01, 13:00:00, 15:10:00
1, 2018-09-03, 09:30:00, 12:30:00
2, 2018-09-03, 13:00:00, 15:10:00
and need to get:
user_id,2018-09-01_duration,2018-09-03_duration
1,03:30:00,03:00:00
2,02:10:00,02:10:00
So the columns need to be dynamic, as some dates can be missing (2018-09-02).
Is it possible to do this in one query, without an explicit join per day (as some days can be null)?
Update #1
Yes, I can generate the columns on the application side, but I still end up with a terrible query like
SELECT user_id, d1.dt AS "2018-08-01_duration", d2.dt AS "2018-08-03_duration"...
FROM (SELECT
user_id,
time_format(TIMEDIFF(TIMEDIFF(time_out,time_in),time_norm),"%H:%i") AS dt
FROM visits
WHERE date = "2018-09-01") d1
LEFT JOIN(
SELECT
user_id,
time_format(TIMEDIFF(TIMEDIFF(time_out,time_in),time_norm),"%H:%i") AS dt
FROM visits
WHERE date = "2018-09-03") d3
ON users.id = d3.user_id...
Update #2
Yes, data like
select user_id, date, SEC_TO_TIME(SUM(TIME_TO_SEC(time_out) - TIME_TO_SEC(time_in))) as total
from visits
group by user_id, date;
is correct, but in this case each user's data comes back as consecutive rows. I am hoping there is a way to get rows for users and columns for dates (like in the example above).
Try something like this:
select user_id, date, sum(time_end - time_start)
from table
group by user_id, date;
You will need to do some tweaking, as you didn't mention the RDBMS provider, but it should give you a clear idea on how to do it.
MySQL has no built-in PIVOT operator, but you might use the following for your case:
create table t(user_id int, time_start timestamp, time_end timestamp);
insert into t values(1,'2018-09-01 09:00:00', '2018-09-01 12:30:00');
insert into t values(2,'2018-09-01 13:00:00', '2018-09-01 15:10:00');
insert into t values(1,'2018-09-03 09:30:00', '2018-09-03 12:30:00');
insert into t values(2,'2018-09-03 13:00:00', '2018-09-03 15:10:00');
select min(q.user_id) as user_id,
min(CASE WHEN (q.date='2018-09-01') THEN q.time_diff END) as '2018-09-01_duration',
min(CASE WHEN (q.date='2018-09-03') THEN q.time_diff END) as '2018-09-03_duration'
from
(
select user_id, date(time_start) date,
concat(concat(lpad(hour(timediff(time_end, time_start)),2,'0'),':'),
concat(lpad(minute(timediff(time_end, time_start)),2,'0'),':'),
lpad(second(timediff(time_end, time_start)),2,'0')) as time_diff
from t
) q
group by user_id;
If you know the dates that you want in the result set, you don't need a dynamic query. You can just use conditional aggregation:
select user_id,
SEC_TO_TIME(SUM(CASE WHEN date = '2018-09-01' THEN TIME_TO_SEC(time_out) - TIME_TO_SEC(time_in) END)) as total_20180901,
SEC_TO_TIME(SUM(CASE WHEN date = '2018-09-02' THEN TIME_TO_SEC(time_out) - TIME_TO_SEC(time_in) END)) as total_20180902,
SEC_TO_TIME(SUM(CASE WHEN date = '2018-09-03' THEN TIME_TO_SEC(time_out) - TIME_TO_SEC(time_in) END)) as total_20180903
from visits
group by user_id;
You only need dynamic SQL if you don't know the dates you want in the result set. In that case, I would suggest following the same structure with the dates that you do want.
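If the dates are only known at run time, the same conditional-aggregation structure can be generated on the application side. Below is a minimal Python sketch of that idea. The function name build_pivot_query is hypothetical, the table and column names (visits, date, time_in, time_out) are taken from the query above, and the list of dates is assumed to come from something like SELECT DISTINCT date FROM visits. It only builds the SQL text; it does not run it.

```python
from datetime import date


def build_pivot_query(dates):
    """Build a MySQL query with one SUM(CASE ...) column per date.

    `dates` must be datetime.date objects, so only well-formed
    'YYYY-MM-DD' literals ever end up in the SQL text.
    """
    columns = []
    for d in dates:
        iso = d.isoformat()                       # e.g. '2018-09-01'
        alias = "total_" + iso.replace("-", "")   # e.g. total_20180901
        columns.append(
            f"SEC_TO_TIME(SUM(CASE WHEN date = '{iso}' "
            f"THEN TIME_TO_SEC(time_out) - TIME_TO_SEC(time_in) END)) AS {alias}"
        )
    return (
        "SELECT user_id,\n  "
        + ",\n  ".join(columns)
        + "\nFROM visits\nGROUP BY user_id;"
    )


# Hypothetical usage: the two dates that occur in the sample data.
query = build_pivot_query([date(2018, 9, 1), date(2018, 9, 3)])
print(query)
```

Days with no visits simply never appear in the list, so no column is generated for them.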
You can solve your problem with the query below. It is dynamic, and you can improve on it.
I use T-SQL for the query; you can apply the same idea in MySQL.
declare
@columns as nvarchar(max),
@query as nvarchar(max)
select
@columns =
stuff
((
select
distinct
',' + quotename([date])
from
table_test
for xml path(''), type
).value('.', 'nvarchar(max)'), 1, 1, '')
--select @columns
set @query =
'with
cte_result
as
(
select
[user_id] ,
[date] ,
time_start ,
time_end ,
datediff(minute, time_start, time_end) as duration
from
table_test
)
select
[user_id], ' + @columns + '
from
(
select
[user_id] ,
[date] ,
duration
from
cte_result
)
sourceTable
pivot
(
sum(duration)
for [date] in (' + @columns + ')
)
pivotTable'
execute(@query)
I need to write an SQL query.
I'll only put the tables and the results I need here, as I think this is the clearest way to explain it (at the bottom of the question I have provided the SQL statements that fill the database).
Short description:
TASK: After a full join, I get a result where (for example) the tableA.point column used in the SELECT statement is NULL in some cells. In those cases, I need to use tableB.point (from the joined table) instead of tableA.point.
So, tables:
(Columns point + date are composite key.)
outcome_o:
income_o:
An example of the result I need (as we can see, I need a combined table with both the out and inc columns in each row):
My attempt:
SELECT outcome_o.point,
outcome_o.date,
inc,
out
FROM income_o
FULL JOIN outcome_o ON income_o.point = outcome_o.point AND income_o.date = outcome_o.date
The result is what I need, except for the NULLs in the point and date columns:
I tried to avoid this with a CASE statement:
SELECT
CASE outcome_o.point
WHEN NULL
THEN income_o.point
ELSE outcome_o.point
END as point,
....
But this does not work as I imagined (all cells in the point column became NULL).
Could anyone help me with this? I know I have to use JOIN, CASE (CASE is mandatory) and possibly UNION.
Thanks
Tables creation:
CREATE TABLE income(
point INT,
date VARCHAR(60),
inc FLOAT
);
CREATE TABLE outcome(
point INT,
date VARCHAR(60),
ou_t FLOAT
);
INSERT INTO income VALUES
(1, '2001-03-22', 15000.0000),
(1, '2001-03-23', 15000.0000),
(1, '2001-03-24', 3400.0000),
(1, '2001-04-13', 5000.0000),
(1, '2001-05-11', 4500.0000),
(2, '2001-03-22', 10000.0000),
(2, '2001-03-24', 1500.0000),
(3, '2001-09-13', 11500.0000),
(3, '2001-10-02', 18000.0000);
INSERT INTO outcome VALUES
(1, '2001-03-14 00:00:00.000', 15348.0000),
(1, '2001-03-24 00:00:00.000', 3663.0000),
(1, '2001-03-26 00:00:00.000', 1221.0000),
(1, '2001-03-28 00:00:00.000', 2075.0000),
(1, '2001-03-29 00:00:00.000', 2004.0000),
(1, '2001-04-11 00:00:00.000', 3195.0400),
(1, '2001-04-13 00:00:00.000', 4490.0000),
(1, '2001-04-27 00:00:00.000', 3110.0000),
(1, '2001-05-11 00:00:00.000', 2530.0000),
(2, '2001-03-22 00:00:00.000', 1440.0000),
(2, '2001-03-29 00:00:00.000', 7848.0000),
(2, '2001-04-02 00:00:00.000', 2040.0000),
(3, '2001-09-13 00:00:00.000', 1500.0000),
(3, '2001-09-14 00:00:00.000', 2300.0000),
(3, '2002-09-16 00:00:00.000', 2150.0000);
The first step is to create a date range reference table. To do that, we can use Common Table Expression (cte):
WITH RECURSIVE cte AS (
SELECT Min(mndate) mindt, MAX(mxdate) maxdt
FROM (SELECT MIN(date) AS mndate, MAX(date) AS mxdate
FROM outcome
UNION
SELECT MIN(date), MAX(date)
FROM income) v
UNION
SELECT mindt + INTERVAL 1 DAY, maxdt
FROM cte
WHERE mindt + INTERVAL 1 DAY <= maxdt)
SELECT mindt
FROM cte
Here I'm generating the date range dynamically from the minimum and maximum date values across both of your tables. This is particularly useful when you don't want to keep changing the date range, but if you don't mind doing that, you can simply hard-code the range like so:
WITH RECURSIVE cte AS (
SELECT '2001-03-14 00:00:00' dt
UNION
SELECT dt + INTERVAL 1 DAY
FROM cte
WHERE dt + INTERVAL 1 DAY <= '2002-09-16')
SELECT dt
FROM cte
From here, I'll do a CROSS JOIN to get the distinct point value from both tables:
...
CROSS JOIN (SELECT DISTINCT point FROM outcome
UNION
SELECT DISTINCT point FROM income) p
Now we have a reference table with every point and every date in the range. Let's wrap those in another cte.
WITH RECURSIVE cte AS (
SELECT Min(mndate) mindt, MAX(mxdate) maxdt
FROM (SELECT MIN(date) AS mndate, MAX(date) AS mxdate
FROM outcome
UNION
SELECT MIN(date), MAX(date)
FROM income) v
UNION
SELECT mindt + INTERVAL 1 DAY, maxdt
FROM cte
WHERE mindt + INTERVAL 1 DAY <= maxdt),
cte2 AS (
SELECT point, mindt
FROM cte
CROSS JOIN (SELECT DISTINCT point FROM outcome
UNION
SELECT DISTINCT point FROM income) p)
SELECT *
FROM cte2;
The next step is to take your current query attempt and LEFT JOIN it to the reference table:
WITH RECURSIVE cte AS (
SELECT Min(mndate) mindt, MAX(mxdate) maxdt
FROM (SELECT MIN(date) AS mndate, MAX(date) AS mxdate
FROM outcome
UNION
SELECT MIN(date), MAX(date)
FROM income) v
UNION
SELECT mindt + INTERVAL 1 DAY, maxdt
FROM cte
WHERE mindt + INTERVAL 1 DAY <= maxdt),
cte2 AS (
SELECT point, CAST(mindt AS DATE) AS rdate
FROM cte
CROSS JOIN (SELECT DISTINCT point FROM outcome
UNION
SELECT DISTINCT point FROM income) p)
SELECT *
FROM cte2
LEFT JOIN outcome
ON cte2.point=outcome.point
AND cte2.rdate=outcome.date
LEFT JOIN income
ON cte2.point=income.point
AND cte2.rdate=income.date
/*added conditions*/
WHERE cte2.point=1
AND COALESCE(outcome.date, income.date) IS NOT NULL
/*****/
ORDER BY cte2.rdate;
I noticed that your date column uses the VARCHAR() datatype instead of DATE or DATETIME, which is why my initial test returned only one result. However, I noticed that comparing a YYYY-MM-DD value against your table's date values does return the other results, which is why I used CAST(mindt AS DATE) AS rdate in cte2. I do recommend that you change the date column to the MySQL standard date format, though.
You will probably find the query a bit too long, but if you have a table where you store dates (a calendar table, as we call it), the query will be much shorter, perhaps like this:
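To illustrate why the text dates trip up the join, here is a small Python sketch (not part of the query itself). It compares one income-style value and one outcome-style value from the sample data, first as plain strings and then as real dates. The helper name to_date is hypothetical, and it assumes every value starts with a well-formed YYYY-MM-DD part.

```python
from datetime import date


def to_date(value):
    """Parse the leading 'YYYY-MM-DD' part of a text date; ignore any time part."""
    return date.fromisoformat(value[:10])


income_date = "2001-03-24"                 # format used in the income inserts
outcome_date = "2001-03-24 00:00:00.000"   # format used in the outcome inserts

# Compared as text the two values differ, so a join on the raw strings misses.
same_as_text = income_date == outcome_date

# Compared as dates they are the same day.
same_as_date = to_date(income_date) == to_date(outcome_date)

print(same_as_text, same_as_date)
```

Storing the column as DATE in the first place removes the need for this kind of normalisation.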
SELECT *
FROM calendar
LEFT JOIN outcome
ON calendar.point=outcome.point
AND calendar.rdate=outcome.date
LEFT JOIN income
ON calendar.point=income.point
AND calendar.rdate=income.date
/*added conditions*/
WHERE calendar.point=1
AND COALESCE(outcome.date, income.date) IS NOT NULL
/*****/
ORDER BY calendar.rdate;
It seems I was using the wrong syntax for the solution. As I found out, a column can be chosen conditionally within the SELECT list.
The correct CASE statement:
(
CASE
WHEN outcome_o.point IS NULL
THEN income_o.point
ELSE outcome_o.point
END
) as point,
In this case the query selects the joined table's column whenever the main table's column is NULL.
Full query (returns exactly the result I need):
SELECT
(
CASE
WHEN outcome_o.point IS NULL
THEN income_o.point
ELSE outcome_o.point
END
) as point,
(
CASE
WHEN outcome_o.date IS NULL
THEN income_o.date
ELSE outcome_o.date
END
) as date,
inc,
out
FROM income_o
FULL JOIN outcome_o ON income_o.point = outcome_o.point AND income_o.date = outcome_o.date
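The difference between the two CASE forms can be checked in isolation. The sketch below uses SQLite through Python's sqlite3 module, which is an assumption made purely for convenience and not the engine from the question; the rule it shows (a simple CASE x WHEN NULL compares with =, and nothing is equal to NULL) is standard SQL behaviour. The literals 'income' and 'outcome' are just stand-ins for the two point columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

simple, searched, coalesced = conn.execute(
    """
    SELECT
      -- simple CASE: tests NULL = NULL, which is never true, so ELSE is taken
      CASE NULL WHEN NULL THEN 'income' ELSE 'outcome' END,
      -- searched CASE: IS NULL is the correct test for a missing value
      CASE WHEN NULL IS NULL THEN 'income' ELSE 'outcome' END,
      -- COALESCE returns the first non-NULL argument
      COALESCE(NULL, 'income')
    """
).fetchone()
conn.close()

print(simple, searched, coalesced)
```

Where CASE is not mandatory, COALESCE(outcome_o.point, income_o.point) would express the same fallback more briefly.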
I know this question has already been asked in a similar way, but I could not find one that uses an alias in the WHERE clause.
I have a table structure like this:
CREATE TABLE Orders
( ID int NOT NULL Primary Key
, OrderNr VARCHAR(6) NOT NULL
, Date DATE NOT NULL
, Time CHAR(6) NOT NULL
, GeoCode CHAR(6) NULL) ;
My insert looks like this:
INSERT INTO orders (ID, OrderNr, Date, Time, GeoCode) VALUES (1, '123456', '2022-02-15', '111110', '4022')
, (2, '123457', '2022-02-15', '121210', '4022')
, (3, '123455', '2021-04-15', '171515', '4020')
, (4, '123455', '2021-04-16', '150302', '4022')
, (5, '123466', '2022-03-03', '191810', '4020')
, (6, '123466', '2022-03-04', '121410', '4022')
Now I'm trying to get the latest Date and Time values for every OrderNr like this:
SELECT ID, OrderNr, MAX(cast(concat(Date, ' ', cast(Time as Time)) as datetime)) as DateAndTime, GeoCode
FROM Orders o1
GROUP BY OrderNr
The result shows the right latest date and time, but the GeoCode is wrong. E.g. for OrderNr 123455 it is 4020 but should be 4022.
I know that similar questions have already been asked, but I can't use the alias in the WHERE clause. Can somebody explain to me what I'm doing wrong?
Thank you very much in advance.
If your MySQL version supports the ROW_NUMBER window function (MySQL 8.0 and later), you can try this:
SELECT *
FROM (
SELECT ID,
OrderNr,
cast(concat(Date, ' ', cast(Time as Time)) as datetime) DateAndTime,
GeoCode,
ROW_NUMBER() OVER(PARTITION BY OrderNr ORDER BY cast(concat(Date, ' ', cast(Time as Time)) as datetime) DESC) rn
FROM Orders o1
) t1
WHERE rn = 1
Or use a subquery with EXISTS:
SELECT *
FROM Orders o1
WHERE EXISTS (
SELECT 1
FROM Orders oo
WHERE oo.OrderNr = o1.OrderNr
HAVING MAX(oo.Date) = o1.Date
)
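For comparison, the "latest row per OrderNr" logic behind the ROW_NUMBER query can be sketched in plain Python. This is only an illustration of the idea, using the rows from the question's INSERT. The function name latest_per_order is hypothetical, and it assumes dates are ISO strings and times are zero-padded HHMMSS strings, so that comparing them as text orders them chronologically.

```python
# (ID, OrderNr, Date, Time, GeoCode), copied from the question's INSERT
orders = [
    (1, "123456", "2022-02-15", "111110", "4022"),
    (2, "123457", "2022-02-15", "121210", "4022"),
    (3, "123455", "2021-04-15", "171515", "4020"),
    (4, "123455", "2021-04-16", "150302", "4022"),
    (5, "123466", "2022-03-03", "191810", "4020"),
    (6, "123466", "2022-03-04", "121410", "4022"),
]


def latest_per_order(rows):
    """Keep, for each OrderNr, the whole row with the greatest (Date, Time)."""
    latest = {}
    for row in rows:
        _id, order_nr, day, time, _geo = row
        key = (day, time)  # text comparison is chronological for these formats
        if order_nr not in latest or key > latest[order_nr][0]:
            latest[order_nr] = (key, row)
    return {order_nr: row for order_nr, (_key, row) in latest.items()}


result = latest_per_order(orders)
print(result["123455"])
```

The point is the same as in the window-function query: the whole row is selected together with its maximum, so GeoCode always comes from the same row as the latest date and time.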
Get only the biggest date:
These are check-in and check-out records of employees. Sometimes they make two or more entries of the same kind in a row; in this sample there were two check-outs in a row. Assuming these rows are always ordered, for check-outs I would like to keep the latest date, and for check-ins the earliest date.
In that case I would like to have this:
The smaller date was excluded:
Try this: in this big CASE statement I increment a counter by one whenever checkin switches from null to not null or the other way around. Then it's enough to group by this counter, taking the max of checkout and the min of checkin:
select @checkinLag := null, @rn := 0;
select max(id),
functionario,
loja,
min(checkin),
max(checkout)
from (
select case when (checkinLag is null and checkin is not null) or
(checkinLag is not null and checkin is null)
then @rn := @rn + 1 else @rn end rn,
checkin,
checkout,
loja,
id,
functionario
from (
select @checkinLag checkinLag,
@checkinLag := checkin,
checkin,
checkout,
loja,
id,
functionario
from dummyTable
order by coalesce(checkin, checkout)
) a
) a group by functionario, loja, rn
I have used subqueries to guarantee the order in which the expressions are evaluated (assigning and using @checkinLag), as Gordon Linoff pointed out.
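The run-grouping idea can also be sketched outside SQL. The Python below is a simplified illustration, not a translation of the query: it assumes a single employee and store, rows already sorted by time, and exactly one of checkin/checkout set per row. The function name collapse_runs and the sample timestamps are hypothetical.

```python
from datetime import datetime


def collapse_runs(rows):
    """Collapse consecutive rows of the same kind into one row.

    rows: list of (checkin, checkout) tuples, ordered by time, with exactly
    one of the two values set. A run of check-ins keeps the earliest value,
    a run of check-outs keeps the latest.
    """
    result = []
    prev_is_checkin = None
    for checkin, checkout in rows:
        is_checkin = checkin is not None
        if is_checkin != prev_is_checkin:
            result.append([checkin, checkout])   # kind switched: start a new run
        elif is_checkin:
            result[-1][0] = min(result[-1][0], checkin)
        else:
            result[-1][1] = max(result[-1][1], checkout)
        prev_is_checkin = is_checkin
    return [tuple(r) for r in result]


day = lambda h, m: datetime(2018, 9, 1, h, m)
rows = [
    (day(8, 0), None),
    (None, day(12, 0)),
    (None, day(12, 1)),   # repeated check-out, only the later one is kept
    (day(13, 0), None),
]
collapsed = collapse_runs(rows)
print(collapsed)
```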
My solution:
Select
*
from dummyTable base
where (base.checkout is null or not exists (
select
1
from dummyTable co
where co.checkout between base.checkout and DATE_ADD(base.checkout, INTERVAL 5 SECOND)
and base.id <> co.id
and base.functionario = co.functionario
and base.loja = co.loja
)) and (base.checkin is null or not exists (
select
1
from dummyTable ci
where ci.checkin between DATE_SUB(base.checkin, INTERVAL 5 SECOND) and base.checkin
and base.id <> ci.id
and base.functionario = ci.functionario
and base.loja = ci.loja
));
You can test the query here. The rows do not need to be ordered. I chose 5 seconds as the interval within which repeated check-ins/outs are ignored.
I have the following columns in a table called meetings: meeting_id - int, start_time - time, end_time - time. Assuming that this table has data for one calendar day only, what is the minimum number of rooms I need to accommodate all the meetings? Room size and the number of people attending the meetings don't matter.
Here's the solution:
select * from
(select t.start_time,
t.end_time,
count(*) - 1 overlapping_meetings,
count(*) minimum_rooms_required,
group_concat(distinct concat(y.start_time,' to ',t.end_time)
separator ' // ') meeting_details from
(select 1 meeting_id, '08:00' start_time, '09:15' end_time union all
select 2, '13:20', '15:20' union all
select 3, '10:00', '14:00' union all
select 4, '13:55', '16:25' union all
select 5, '14:00', '17:45' union all
select 6, '14:05', '17:45') t left join
(select 1 meeting_id, '08:00' start_time, '09:15' end_time union all
select 2, '13:20', '15:20' union all
select 3, '10:00', '14:00' union all
select 4, '13:55', '16:25' union all
select 5, '14:00', '17:45' union all
select 6, '14:05', '17:45') y
on t.start_time between y.start_time and y.end_time
group by start_time, end_time) z;
My question - is there anything wrong with this answer? Even if there's nothing wrong with this, can someone share a better answer?
Let's say you have a table called 'meetings' like this:
Then you can use this query to get the minimum number of meeting rooms required to accommodate all meetings.
select max(minimum_rooms_required)
from (select count(*) minimum_rooms_required
from meetings t
left join meetings y on t.start_time >= y.start_time and t.start_time < y.end_time
group by t.id
) z;
This looks clearer and simpler, and works fine.
Meetings can "overlap". So, GROUP BY start_time, end_time can't figure this out.
Not every algorithm can be done in SQL. Or, at least, it may be grossly inefficient.
I would use a real programming language for the computation, leaving the database for what it is good at -- being a data repository.
Build an array of 1440 entries (one per minute of the day); initialize each to 0.
Foreach meeting:
Foreach minute in the meeting (excluding last minute):
increment element in array.
Find the largest element in the array -- the number of rooms needed.
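The steps above can be sketched in Python as follows. This is a minimal illustration rather than a tuned implementation: the function names are hypothetical, times are assumed to be 'HH:MM' strings within a single day, and the sample meetings are the ones from the question.

```python
def to_minutes(hhmm):
    """Convert 'HH:MM' to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def min_rooms(meetings):
    """Count, minute by minute, how many meetings are running; return the peak."""
    busy = [0] * 1440                      # one counter per minute of the day
    for start, end in meetings:
        # range() stops before `end`, so the last minute is excluded and
        # back-to-back meetings can share a room.
        for minute in range(to_minutes(start), to_minutes(end)):
            busy[minute] += 1
    return max(busy)


meetings = [
    ("08:00", "09:15"),
    ("13:20", "15:20"),
    ("10:00", "14:00"),
    ("13:55", "16:25"),
    ("14:00", "17:45"),
    ("14:05", "17:45"),
]
rooms = min_rooms(meetings)
print(rooms)  # 4: from 14:05 the meetings 2, 4, 5 and 6 all run at once
```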
CREATE TABLE [dbo].[Meetings](
[id] [int] NOT NULL,
[Starttime] [time](7) NOT NULL,
[EndTime] [time](7) NOT NULL) ON [PRIMARY]
GO
sample data set:
INSERT INTO Meetings VALUES (1,'8:00','09:00')
INSERT INTO Meetings VALUES (2,'8:00','10:00')
INSERT INTO Meetings VALUES (3,'10:00','11:00')
INSERT INTO Meetings VALUES (4,'11:00','12:00')
INSERT INTO Meetings VALUES (5,'11:00','13:00')
INSERT INTO Meetings VALUES (6,'13:00','14:00')
INSERT INTO Meetings VALUES (7,'13:00','15:00')
To Find Minimum number of rooms required run the below query:
create table #TempMeeting
(
id int,Starttime time,EndTime time,MeetingRoomNo int,Rownumber int
)
insert into #TempMeeting select id, Starttime,EndTime,0 as MeetingRoomNo,ROW_NUMBER()
over (order by starttime asc) as Rownumber from Meetings
declare @RowCounter int
select top 1 @RowCounter=Rownumber from #TempMeeting order by Rownumber
WHILE @RowCounter<=(Select count(*) from #TempMeeting)
BEGIN
update #TempMeeting set MeetingRoomNo=1
where Rownumber=(select top 1 Rownumber from #TempMeeting where
Rownumber>@RowCounter and Starttime>=(select top 1 EndTime from #TempMeeting
where Rownumber=@RowCounter) and MeetingRoomNo=0)
set @RowCounter=@RowCounter+1
END
select count(*) from #TempMeeting where MeetingRoomNo=0
Consider a table meetings with columns id, start_time and end_time. Then the following query should give the correct answer.
with mod_meetings as (select id, to_timestamp(start_time, 'HH24:MI')::TIME as start_time,
to_timestamp(end_time, 'HH24:MI')::TIME as end_time from meetings)
select CASE when max(a_cnt)>1 then max(a_cnt)+1
when max(a_cnt)=1 and max(b_cnt)=1 then 2 else 1 end as rooms
from
(select count(*) as a_cnt, a.id, count(b.id) as b_cnt from mod_meetings a left join mod_meetings b
on a.start_time>b.start_time and a.start_time<b.end_time group by a.id) join_table;
Sample DATA:
DROP TABLE IF EXISTS meeting;
CREATE TABLE "meeting" (
"meeting_id" INTEGER NOT NULL UNIQUE,
"start_time" TEXT NOT NULL,
"end_time" TEXT NOT NULL,
PRIMARY KEY("meeting_id")
);
INSERT INTO meeting values (1,'08:00','14:00');
INSERT INTO meeting values (2,'09:00','10:30');
INSERT INTO meeting values (3,'11:00','12:00');
INSERT INTO meeting values (4,'12:00','13:00');
INSERT INTO meeting values (5,'10:15','11:00');
INSERT INTO meeting values (6,'12:00','13:00');
INSERT INTO meeting values (7,'10:00','10:30');
INSERT INTO meeting values (8,'11:00','13:00');
INSERT INTO meeting values (9,'11:00','14:00');
INSERT INTO meeting values (10,'12:00','14:00');
INSERT INTO meeting values (11,'10:00','14:00');
INSERT INTO meeting values (12,'12:00','14:00');
INSERT INTO meeting values (13,'10:00','14:00');
INSERT INTO meeting values (14,'13:00','14:00');
Solution:
DROP VIEW IF EXISTS Final;
CREATE VIEW Final AS SELECT time, group_concat(event), sum(num) num from (
select start_time time, 's' event, 1 num from meeting
union all
select end_time time, 'e' event, -1 num from meeting)
group by 1
order by 1;
select max(room) AS Min_Rooms_Required FROM (
select
a.time,
sum(b.num) as room
from
Final a
, Final b
where a.time >= b.time
group by a.time
order by a.time
);
Here's an explanation of gashu's nicely working code (or, put differently, a non-code explanation of how to solve it in any language).
Firstly, renaming the variable 'minimum_rooms_required' to 'overlap' would make the whole thing much easier to understand, because for each start or end time we want to know the number of overlapping ongoing meetings. Once we have found the maximum, we know there is no way of getting by with fewer rooms than that, because those meetings overlap.
By the way, I think there might be a mistake in the code. It should check for t.start_time or t.end_time between y.start_time and y.end_time. Counterexample: meeting 1 starts at 8:00 and ends at 11:00, and meeting 2 starts at 10:00 and ends at 12:00.
(I'd post this as a comment on gashu's answer, but I don't have enough reputation.)
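The "maximum overlap" idea can be made concrete with a small sweep over start and end events. This is a Python sketch of one common way to compute it, not gashu's query: the function name max_overlap is hypothetical, and it assumes zero-padded 'HH:MM' strings, so that sorting them as text sorts them by time.

```python
def max_overlap(meetings):
    """Return the largest number of meetings running at the same moment."""
    events = []
    for start, end in meetings:
        events.append((start, 1))    # a meeting starts: one more room in use
        events.append((end, -1))     # a meeting ends: one room is freed
    # Tuples sort by time first; at equal times -1 sorts before +1, so a room
    # freed at 09:00 can be reused by a meeting that starts at 09:00.
    events.sort()
    current = best = 0
    for _time, delta in events:
        current += delta
        best = max(best, current)
    return best


# The counterexample from the text: the two meetings overlap from 10:00 to 11:00.
overlap = max_overlap([("08:00", "11:00"), ("10:00", "12:00")])
print(overlap)
```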
I'd go for the LEAD() analytic function:
select
sum(needs_room_ind) as min_rooms
from (
select
id,
start_time,
end_time,
case when lead(start_time,1) over (order by start_time asc) between start_time
and end_time then 1 else 0 end as needs_room_ind
from
meetings
) a
IMO, I want to take the difference between how many meetings have started and how many have ended by the time each meeting_id starts (assuming meetings start and end on time).
My code looks like this:
with alpha as
(
select a.meeting_id,a.start_time,
count(distinct b.meeting_id) ttl_meeting_start_before,
count(distinct c.meeting_id) ttl_meeting_end_before
from meeting a
left join
(
select meeting_id,start_time from meeting
) b
on a.start_time > b.start_time
left join
(
select meeting_id,end_time from meeting
) c
on a.start_time > c.end_time
group by a.meeting_id,a.start_time
)
select max(ttl_meeting_start_before-ttl_meeting_end_before) max_meeting_room
from alpha
I have this table in SQL Server:
I want a result like this
I want to write an SQL query that counts the transactions and consolidates them by month.
Thank you in advance.
First, you need to derive the column you want to group by.
SELECT
A.YYYYMM,
COUNT(*) TxnCount
FROM
(
SELECT
*,
LEFT(TXN_DATE, 6) YYYYMM
FROM
Tbl
) A
GROUP BY
A.YYYYMM
Use GROUP BY and SUBSTRING:
SELECT
SUBSTRING(CAST(TXNDate AS VARCHAR(12)),0,9) AS TXNDate,COUNT(*) AS 'TXN Count'
FROM
#tblTest
GROUP BY SUBSTRING(CAST(TXNDate AS VARCHAR(12)),0,9)
You can use a GROUP BY clause to count the transactions.
Assuming your TXN Date is of date type, you can use the following query:
SELECT CONVERT(VARCHAR(6), TXN_DATE, 112) AS YYYYMM, COUNT(*) AS TXN_COUNT
FROM MyTable
GROUP BY CONVERT(VARCHAR(6), TXN_DATE, 112)
ORDER BY CONVERT(VARCHAR(6), TXN_DATE, 112)
EDIT: since your TXN_DATE is of int type, you can use the following:
SELECT LEFT(CONVERT(VARCHAR, TXN_DATE), 6) AS YYYYMM, COUNT(*) AS TXN_COUNT
FROM MyTable
GROUP BY LEFT(CONVERT(VARCHAR, TXN_DATE), 6)
ORDER BY LEFT(CONVERT(VARCHAR, TXN_DATE), 6)
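The same month bucketing is easy to check outside the database. The Python sketch below assumes, as the EDIT above does, that each TXN_DATE is an integer in YYYYMMDD form; the function name count_by_month and the sample values are hypothetical.

```python
from collections import Counter


def count_by_month(txn_dates):
    """Count transactions per month for integer dates in YYYYMMDD form."""
    # Integer division by 100 drops the day: 20190115 -> 201901
    return Counter(d // 100 for d in txn_dates)


counts = count_by_month([20190101, 20190115, 20190203])
for yyyymm, txn_count in sorted(counts.items()):
    print(yyyymm, txn_count)
```

This mirrors what LEFT(CONVERT(VARCHAR, TXN_DATE), 6) does in the query: both keep only the YYYYMM part before grouping.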