I am trying to calculate the online time of riders from the data below, but I can't handle the change of day correctly. status = 1 means the rider is online and status = 0 means offline. Sometimes riders forget to mark themselves offline (status 0) at the end of their shift and only do it the next morning, just before starting a new shift.
I am using the query below to get the time differences, but it also calculates the difference across the day boundary. What I want is: if a rider's last status of the day is not 0, the last login time should automatically be used as the logout time.
SELECT
fleet_id,
login_time,
logout_time,
timediff(logout_time, login_time) AS logged_in_time,
(unix_timestamp(logout_time) - unix_timestamp(login_time))/60 AS minutes_logged_in_time
FROM
(SELECT
fleet_id,
creation_datetime AS login_time,
coalesce(
(SELECT creation_datetime
FROM tb_fleet_duty_logs t_out
WHERE t_out.fleet_id = t_in.fleet_id
AND t_out.creation_datetime >= t_in.creation_datetime
AND t_out.status = 0
ORDER BY creation_datetime
LIMIT 1
),
creation_datetime
) AS logout_time
FROM
tb_fleet_duty_logs t_in
WHERE
status = 1
) AS q1
ORDER BY
fleet_id, login_time
I'll assume that you have some analytic functions available as well as CTEs. I have the impression that status = 0 is a login. If not then reverse the test in the case expression.
with data as (
select
fleet_id,
creation_datetime,
cast(creation_datetime as date) as dt,
count(case when status = 0 then 1 end) over (
partition by fleet_id, cast(creation_datetime as date)
order by creation_datetime asc) as login_count
from tb_fleet_duty_logs
)
select fleet_id,
min(creation_datetime) as login_time,
max(creation_datetime) as logout_time
/* , ... other calculations ... */
from data
group by fleet_id, dt, login_count
order by fleet_id, login_time;
The trick is to count off the number of logins per day per rider using the order of the timestamps. After that you only need to use simple grouping to collapse the pairs of rows (or single rows) into the login/logout times.
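To make the counting trick concrete, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. The sample rows are made up, the test in the case expression is reversed to status = 1 (since the question says 1 means online), and the `login_count > 0` filter is my own addition to drop a late "offline" row that arrives the next morning before any login. Treat it as an illustration under those assumptions, not as the exact production query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table tb_fleet_duty_logs (fleet_id integer, status integer, creation_datetime text);
-- hypothetical sample data: the rider forgets to log out on 2021-03-02
insert into tb_fleet_duty_logs values
  (1, 1, '2021-03-01 09:00:00'),
  (1, 0, '2021-03-01 17:00:00'),
  (1, 1, '2021-03-02 09:00:00'),
  (1, 0, '2021-03-03 08:55:00'),
  (1, 1, '2021-03-03 09:00:00'),
  (1, 0, '2021-03-03 18:00:00');
""")

query = """
with data as (
  select fleet_id,
         creation_datetime,
         date(creation_datetime) as dt,
         -- running number of logins per rider per day
         count(case when status = 1 then 1 end) over (
           partition by fleet_id, date(creation_datetime)
           order by creation_datetime) as login_count
  from tb_fleet_duty_logs
)
select fleet_id,
       min(creation_datetime) as login_time,
       max(creation_datetime) as logout_time
from data
where login_count > 0   -- assumption: ignore rows that precede the first login of the day
group by fleet_id, dt, login_count
order by fleet_id, login_time
"""
sessions = conn.execute(query).fetchall()
for row in sessions:
    print(row)
```

On the day with the forgotten logout the group contains a single row, so login_time and logout_time are the same and the online time comes out as zero, which is what the question asks for.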
In this scenario I have two tables, users and transactions. I would like to split all the transactions for a specified time period into 3 categories: first-time deposits, second-time deposits and additional deposits.
A transaction is a first-time deposit if the user has no earlier transactions (based on the created_at field), a second-time deposit if the user has exactly one earlier transaction, and an additional deposit if the user has 2 or more earlier transactions.
The transactions table has 2 fields we care about here:
user (user id)
created_at (time transaction was created)
Here is my attempt but I am having trouble visualising the whole query. Any ideas on how I would do this?
SELECT
COUNT(t.id) as first_time_deposits
FROM
transactions t
WHERE
status = 'approved' AND DATE(t.created_at) BETWEEN (CURDATE() - INTERVAL 0 DAY) AND CURDATE()
GROUP BY user
HAVING NOT EXISTS
(
SELECT
u.id
FROM
transactions u
WHERE
u.created_at < t.created_at
)
I use the date interval here just for filtering transactions within a day, week, etc. This query doesn't work because I am trying to reference the date of the outer query in the subquery. I am also missing second-time deposits and additional deposits.
Example output I am looking for:
first_time_deposits    second_time_deposits    additional_deposits
15                     5                       6
All for a selected time period.
Any help would be greatly appreciated.
This is how I'd do it. The solution also works if, for example, several "first" transactions took place at the same time (they share a rank), and the same goes for the other ranks.
"first_to_last" is a recursive query that just generates the numbers we need transactions for (1 to 3 in your case). This makes the query easily adjustable if you suddenly need not the first 3 but the first 10 transactions.
"numbered" ranks the transactions of each user by date.
The main query joins the two CTEs and replaces the numbers with words like "first", "second" and "third". I didn't find any way other than hardcoding the values.
with recursive first_to_last(step) as (
select 1
union all
select step + 1
from first_to_last
where step < 3 -- how many lines to display
),
numbered as (
select dense_rank() over(partition by user_id order by created_at) rnk, created_at, user_id
from transactions
)
select user_id,
concat(case when f.step = 1 then 'first_deposit: '
when f.step = 2 then 'second_deposit: '
when f.step = 3 then 'third_deposit: '
end,
count(rnk))
from numbered n
join first_to_last f
on n.rnk = f.step
group by user_id, f.step
order by user_id, f.step
UPD. Answer to the additional question: "I just want the count of all first, second and any deposit that isn't first or second"
Just remove the "first_to_last" CTE:
with numbered as (
select dense_rank() over(partition by user_id order by created_at) rnk, created_at, user_id
from transactions
)
select user_id,
concat(case when n.rnk = 1 then 'first_deposit: '
when n.rnk = 2 then 'second_deposit: '
else 'other_deposits: '
end,
count(rnk))
from numbered n
group by user_id, case when n.rnk = 1 then 'first_deposit: '
when n.rnk = 2 then 'second_deposit: '
else 'other_deposits: '
end
order by user_id, min(rnk)
UPD2. output in 3 columns: first, second and others
with numbered as (
select dense_rank() over(partition by user_id order by created_at) rnk, created_at, user_id
from transactions
)
select
sum(case when n.rnk = 1 then 1 else 0 end) first_deposit,
sum(case when n.rnk = 2 then 1 else 0 end) second_deposit,
sum(case when n.rnk not in (1,2) then 1 else 0 end) other_deposit
from numbered n
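For anyone who wants to try the three-column version, here is a small sketch using Python's built-in sqlite3 module. The rows and the date range are invented, and the period filter in the outer query is my own addition: the ranking is done over the full history first and only then restricted to the selected period, so a user's second deposit is still counted as "second" even when the first one happened before the period.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table transactions (id integer primary key, user_id integer, created_at text);
-- hypothetical sample data
insert into transactions (user_id, created_at) values
  (1, '2024-01-01 10:00:00'),
  (1, '2024-01-10 10:00:00'),
  (1, '2024-01-11 10:00:00'),
  (1, '2024-01-12 10:00:00'),
  (2, '2024-01-10 12:00:00'),
  (3, '2023-12-01 09:00:00'),
  (3, '2024-01-11 09:00:00');
""")

query = """
with numbered as (
  -- rank every transaction within its user's full history
  select dense_rank() over (partition by user_id order by created_at) as rnk,
         created_at, user_id
  from transactions
)
select sum(case when rnk = 1 then 1 else 0 end) as first_deposit,
       sum(case when rnk = 2 then 1 else 0 end) as second_deposit,
       sum(case when rnk not in (1, 2) then 1 else 0 end) as other_deposit
from numbered
where date(created_at) between ? and ?   -- period filter applied after ranking
"""
counts = conn.execute(query, ("2024-01-10", "2024-01-12")).fetchone()
print(counts)
```

Filtering inside the CTE instead would restart the ranking at the beginning of the period, which is probably not what is wanted here.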
So far I have been able to collect_set() everyone that is active with no problem:
with aux as(
select date
,collect_set(user_id) over(
partition by feature
order by cast(timestamp(date) as float)
range between (-90*60*60*24) following and 0 preceding
) as user_id
,feature
--
from (
select date
,feature
,collect_set(user_id) as user_id
--
from table
--
group by date, feature
)
)
--
select date
,array_distinct(flatten(user_id))
,feature
--
from aux
The problem is that I now have to keep only users that were created more than 90 days before each date.
I tried this, but it didn't work:
select date
,collect_set(case when user_created_at < date - interval 90 day
then user_id end) over(
partition by feature
order by cast(timestamp(date) as float)
range between (-90*60*60*24) following and 0 preceding
) as teste
,feature
from table
It didn't work because the filter inside collect_set() is applied per row, using each row's own date, instead of comparing every user in the 90-day window against the current date, so the result contains more users than expected.
How can I get this right?
For reference, I'm using this query to verify whether the result is correct:
select
count(distinct user_id) as total
,count(distinct case when user_created_at < date('2020-04-30') - interval 90 day then user_id end)
,count(distinct case when user_created_at >= date('2020-04-30') - interval 90 day then user_id end)
--
from table
--
where 1=1
and date >= date('2020-04-30') - interval 90 day
and date <= '2020-04-30'
and feature = 'a_feature'
A pretty ugly workaround, but:
select data
,feature
,collect_set(cus.client_id) as client
from (
select data
,explode(array_distinct(flatten(cliente))) as client
,feature
from(
select data
,collect_set(client_id) over(
partition by feature
order by cast(timestamp(data) as float)
range between (-90*60*60*24) following and 0 preceding
) as cliente
,feature
from (
select data
,feature
,collect_set(client_id) as cliente
from da_pandora.ds_transaction dtr
--
group by data, feature
)
)
)as dtr
left join costumer as cus
on cus.client_id = dtr.client and date(client_created_at) < data - interval 90 day
group by data, feature
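The underlying logic (first gather everyone active in the 90-day window, then compare each of them against the current date) can also be written as a plain range self-join. This is a different technique from the collect_set() window above, shown here only as a sketch on SQLite through Python; the `activity` and `users` tables and all rows are hypothetical stand-ins for the real tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- hypothetical tables standing in for the real ones
create table activity (date text, feature text, user_id text);
create table users (user_id text, user_created_at text);
insert into activity values
  ('2020-04-30', 'a_feature', 'u1'),
  ('2020-04-30', 'a_feature', 'u2'),
  ('2020-03-01', 'a_feature', 'u3'),
  ('2020-01-01', 'a_feature', 'u4');
insert into users values
  ('u1', '2019-01-01'),
  ('u2', '2020-04-01'),
  ('u3', '2019-06-01'),
  ('u4', '2019-01-01');
""")

query = """
select distinct d.date, d.feature, a.user_id
from (select distinct date, feature from activity) d
join activity a
  on a.feature = d.feature
 and a.date between date(d.date, '-90 days') and d.date   -- active in the window
join users u
  on u.user_id = a.user_id
 and u.user_created_at < date(d.date, '-90 days')         -- created before the window
order by d.date, a.user_id
"""
old_active = {}
for date, feature, user_id in conn.execute(query):
    old_active.setdefault((date, feature), []).append(user_id)
print(old_active)
```

The key point is that the creation-date check uses the reporting date (`d.date`), not the date of the activity row, which is exactly what the filter inside the windowed collect_set() could not do.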
Hi there, I want to write this query in MySQL.
Statement: of all the customers that transacted during 2017, what % made another transaction within 30 days?
Can you tell me how such a query can be designed?
This is the picture of the table to perform this query on:
Table name is: transactions
Just use lead() to get the next date. Then aggregate at the customer level to determine if any transaction in the time period has another within 30 days for that customer.
Finally, aggregate again:
select avg(case when mindiff < 30 then 1.0 else 0 end) as within_30days
from (select customerid, min(datediff(next_date, date)) as mindiff
from (select t.*, lead(date) over (partition by customerid order by date) as next_date
from transactions t
) t
where date >= '2017-01-01' and date < '2018-01-01'
group by customerid
) c
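Here is a small runnable sketch of the same lead() approach on SQLite via Python, in case it helps to see it on data. The rows are invented, and since SQLite has no datediff(), the day difference is computed with julianday() instead; the structure of the query is otherwise the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table transactions (customerid integer, date text);
-- hypothetical sample data
insert into transactions values
  (1, '2017-01-05'), (1, '2017-01-20'),
  (2, '2017-03-01'), (2, '2017-06-01'),
  (3, '2017-12-20'), (3, '2018-01-05'),
  (4, '2016-05-01'),
  (5, '2017-07-07');
""")

query = """
select avg(case when mindiff < 30 then 1.0 else 0 end) as within_30days
from (select customerid,
             min(julianday(next_date) - julianday(date)) as mindiff
      from (select t.*,
                   -- next transaction date of the same customer
                   lead(date) over (partition by customerid order by date) as next_date
            from transactions t
           ) t
      where date >= '2017-01-01' and date < '2018-01-01'
      group by customerid
     ) c
"""
share = conn.execute(query).fetchone()[0]
print(share)
```

Because lead() is evaluated in the inner query before the 2017 filter, a follow-up transaction in early 2018 still counts for a customer whose first transaction was in late 2017 (customer 3 in the sample).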
I have a SQL query that I'm using to return the number of training sessions recorded by a client on each day of the week (during the last year).
SELECT COUNT(*) total_sessions
, DAYNAME(log_date) day_name
FROM programmes_results
WHERE log_date >= DATE_SUB(CURDATE(), INTERVAL 1 YEAR)
AND log_date <= CURDATE()
AND client_id = 7171
GROUP
BY day_name
ORDER
BY FIELD(day_name, 'MONDAY', 'TUESDAY', 'WEDNESDAY', 'THURSDAY', 'FRIDAY', 'SATURDAY', 'SUNDAY')
I would like to then plot a table showing these values as a percentage of the total, as opposed to as a 'count' for each day. However I'm at a bit of a loss as to how to do that without another query (which I'd like to avoid).
Any thoughts?
Use a derived table
select day_name, total_sessions, total_sessions / sum(total_sessions) over () * 100 percentage
from (
query from your question goes here
) temp
You can add the number of trainings per day in your client application to get the total count. This way you definitely avoid having a 2nd query to get the total.
Use the with rollup modifier in the query to get the total returned in the last row:
...GROUP BY day_name WITH ROLLUP ORDER BY ...
Use a subquery to return the overall count within each row
SELECT ..., t.total_count
...FROM programmes_results INNER JOIN (SELECT COUNT(*) as total_count FROM programmes_results WHERE <same where criteria>) as t --NO join condition
...
This will have the largest performance impact on the database, however, it enables you to have the total number in each row.
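As a rough illustration of getting the percentage in a single statement, the sketch below wraps the per-day counts in a derived table and divides by a window total, run on SQLite from Python. It assumes window functions are available (MySQL 8+ in the original setting). The rows are made up, and because SQLite has no DAYNAME(), the day is represented by strftime('%w') (0 = Sunday, 1 = Monday, and so on).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table programmes_results (client_id integer, log_date text);
-- hypothetical sample data; 2024-01-01 is a Monday
insert into programmes_results values
  (7171, '2024-01-01'), (7171, '2024-01-01'),
  (7171, '2024-01-02'),
  (7171, '2024-01-03'),
  (9999, '2024-01-01');
""")

query = """
select day_no,
       total_sessions,
       -- window total over all rows of the derived table
       100.0 * total_sessions / sum(total_sessions) over () as percentage
from (select strftime('%w', log_date) as day_no, count(*) as total_sessions
      from programmes_results
      where client_id = 7171
      group by day_no
     ) temp
order by day_no
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The empty over () makes the sum run across every row of the derived table, so each day's count is divided by the grand total without a second round trip.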
Please take a look at this query:
SELECT DATE(datetime), COUNT(1) as numVisits
FROM ".table_stats."
WHERE type='profile_visit'
AND user_url = '".$_GET['ref']."'
AND id_user='".$_SESSION['user_code']."'
GROUP BY DATE(DATE_SUB(datetime, INTERVAL 1 DAY))
This query counts, for each date, the number of rows whose type is 'profile_visit'; as a result it gives me two columns (DATE(datetime), numVisits). This is a screen capture of the table table_stats:
Table_Stats
As you can see, every time a user comes to the site a new row is inserted into the table with type = 'profile_visit' and the datetime field set to the date and time of the visit. That's why I use a GROUP BY on the date to count the total number of visits per day.
Here comes the complex part: when the type field is 'click' and the origin is 'imp', it means that a user hit a particular button on the page. I would like to know how many times that button was clicked per day (regardless of the IP), just like I did with the profile visits.
I could make two queries, one for the total visits (like the one above) and a similar one that groups by date where type is 'click' and origin is 'imp'.
The problem is that I would like to do this in a single call, with the total visits per date in the numVisits column as before and a new column called numClicks with the total number of clicks. I don't want extra calculations on my PHP server; if possible, it would be great to do all the calculation on the SQL server.
So finally, if you run this query against the table:
SELECT DATE( DATETIME ) , COUNT( 1 ) AS numVisits
FROM stats_ram
WHERE TYPE = 'profile_visit'
AND user_url = 'xxx'
AND id_user = '88e91'
GROUP BY DATE( DATE_SUB( DATETIME, INTERVAL 1
DAY ) )
LIMIT 0 , 30
You will get:
DATE(datetime) numVisits
2011-11-16 7
How can I add another column with the total of type = 'click' AND origin = 'imp' per DATE(datetime)?
Thanks for any help!
SELECT
DATE(DATETIME),
SUM(CASE WHEN type = 'profile_visit' THEN 1 ELSE 0 END) AS numVisits,
SUM(CASE WHEN type = 'click' AND origin = 'imp' THEN 1 ELSE 0 END) numClicks
FROM stats_ram
WHERE user_url = 'xxx'
AND id_user = '88e91'
GROUP BY DATE(DATETIME)
LIMIT 0, 30
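A quick way to check the conditional aggregation is to run it against a few made-up rows. The sketch below uses SQLite through Python; the sample data and the 'other' origin value are hypothetical, and the user values are passed as bound parameters rather than concatenated into the SQL string, which is generally the safer pattern when values come from a request.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table stats_ram (datetime text, type text, origin text, user_url text, id_user text);
-- hypothetical sample data
insert into stats_ram values
  ('2011-11-16 10:00:00', 'profile_visit', null,    'xxx', '88e91'),
  ('2011-11-16 11:00:00', 'profile_visit', null,    'xxx', '88e91'),
  ('2011-11-16 11:05:00', 'click',         'imp',   'xxx', '88e91'),
  ('2011-11-16 12:00:00', 'click',         'other', 'xxx', '88e91'),
  ('2011-11-17 09:00:00', 'profile_visit', null,    'xxx', '88e91'),
  ('2011-11-17 09:30:00', 'click',         'imp',   'yyy', '88e91');
""")

query = """
select date(datetime) as day,
       -- each sum counts only the rows that match its own condition
       sum(case when type = 'profile_visit' then 1 else 0 end) as numVisits,
       sum(case when type = 'click' and origin = 'imp' then 1 else 0 end) as numClicks
from stats_ram
where user_url = ? and id_user = ?
group by date(datetime)
order by day
"""
rows = conn.execute(query, ("xxx", "88e91")).fetchall()
for row in rows:
    print(row)
```

Note that the type condition has to move out of the WHERE clause and into the CASE expressions, otherwise the click rows would be filtered away before they can be counted.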