I have a table like the one below:
timestamp           | user_id | activity
2021-02-01 03:21:11 | mike12  | read
2021-02-02 03:45:22 | bob55   | like
2021-02-03 04:21:33 | sarah22 | post
2021-02-01 04:11:33 | cindy11 | sign-in
I want to calculate the number of users churned in the last 7 days, defined as:
number of all users - active users (where active users are those who like, read, comment, or post)
with active_users as
(
select count(distinct user_id)
from table
where activity IN ('comment','post','read','like')
and date_diff(timestamp, current_date()) <= 7
)
, inactive_users as
(select count(distinct user_id)
from table
where activity IN ('sign-in')
and date_diff(timestamp, current_date()) <= 7)
What would be the correct way to subtract the two counts above? I am unsure how to join the two CTEs in the final query. Thanks for helping!
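One possible way to subtract the two counts, sketched in Python with the standard sqlite3 module so it can be run as-is: give each CTE's count a column alias, then CROSS JOIN the two single-row CTEs and subtract. The table name events, the alias n, and the :today parameter are assumptions made for this example, "all users" is taken to mean every distinct user_id in the table, and SQLite's julianday() stands in for the date_diff call in the question.

```python
import sqlite3

# Sample rows from the question; the table name "events" is an assumption.
ROWS = [
    ("2021-02-01 03:21:11", "mike12", "read"),
    ("2021-02-02 03:45:22", "bob55", "like"),
    ("2021-02-03 04:21:33", "sarah22", "post"),
    ("2021-02-01 04:11:33", "cindy11", "sign-in"),
]

CHURN_SQL = """
WITH active_users AS (
    SELECT COUNT(DISTINCT user_id) AS n
    FROM events
    WHERE activity IN ('comment', 'post', 'read', 'like')
      -- whole days between the event date and the reference date
      AND julianday(:today) - julianday(date(timestamp)) BETWEEN 0 AND 7
),
all_users AS (
    SELECT COUNT(DISTINCT user_id) AS n FROM events
)
-- each CTE yields exactly one row, so a CROSS JOIN pairs them up
SELECT all_users.n - active_users.n AS churned
FROM all_users CROSS JOIN active_users
"""


def churned_users(today):
    """Return all users minus users active in the 7 days up to `today`."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (timestamp TEXT, user_id TEXT, activity TEXT)")
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", ROWS)
    return conn.execute(CHURN_SQL, {"today": today}).fetchone()[0]


print(churned_users("2021-02-05"))
```

With the four sample rows and a reference date of 2021-02-05, only cindy11 has no qualifying activity, so the result is 1. The same shape (aliased counts plus a CROSS JOIN) should carry over to other dialects, with the date arithmetic adjusted.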
I have a stored procedure that performs poorly, and I could not figure out how to improve it. There are approximately 3 million records in my database. When I run the procedure one call at a time, it performs well. But when 150 people run it at the same time, the CPU spikes.
As an example, here are my procedure and table structures.
My Stored Procedure:
BEGIN
SELECT ss.car_route from person o
inner join car_time ss on ss.inst_id =o.inst_id
and ss.start_time<=DATE_FORMAT(CURTIME(),'%H:%i') AND ss.finish_time>= date_format(curtime() ,'%H:%i') AND ss.car_id=carid
and ss.days like concat('%',(select WEEKDAY(now())+1),'%')
where (o.car_id=carid or o.back_car_id=carid ) LIMIT 1 into #route_;
select sf.stop_service from car_comp sf
inner join cars s on s.inst_id = sf.id and s.id=carid and s.active=1 limit 1
into #stop_ser;
if #route_ = 1 and #stop_ser=0 THEN
select DISTINCT ss.start_time,ss.finish_time ,o.id,o.name,r.photo, oh.state ,oh.datee,ss.car_route,
ifnull(bh.id,0) AS called,
ifnull(mh.excuse_id,0) AS excuse_id,
ifnull(o.latitude_1,0) AS latitude_1,
ifnull(o.longitude_1,0) AS longitude_1,
ifnull(o.latitude_2,0) AS latitude_2,
ifnull(o.longitude_2,0) AS longitude_2,
case when (ifnull(o.call_notify,0)=1 or ifnull(o.mes_notify,0)=1) then 1 else 0 end AS call_notify ,
ifnull(o.rownumber,0) AS rownumber,
ifnull(o.number_1,0) AS number_1,
ifnull(o.number_2,0) AS number_2,
ifnull(o.brownumber,0) AS brownumber,
ifnull(ROUND(o.notify_meter_1/2),0) AS notify_meter_1,
ifnull(ROUND(o.notify_meter_2/2),0) AS notify_meter_2
from person o
inner join car_time ss on ss.inst_id =o.inst_id and o.car_id=ss.car_id
and ss.start_time<=DATE_FORMAT(CURTIME(),'%H:%i') AND ss.finish_time>= date_format(curtime() ,'%H:%i')
and ss.days like concat('%',(select WEEKDAY(now())+1),'%')
LEFT JOIN notify_records bh ON bh.table_id=o.id AND bh.car_route=#route_
and bh.table_name='person' AND bh.notify=4 AND bh.car_id=o.car_id and bh.date_ >= CURDATE() and bh.date_ < CURDATE() + INTERVAL 1 DAY
left join person_records oh on oh.person_id=o.id
and oh.car_id=o.car_id
and date_format(oh.datee,'%H:%i') >=ss.start_time
and date_format(oh.datee,'%H:%i') <=ss.finish_time
AND oh.car_route= #route_
and
oh.id in(select max(id) from person_records
where date_time >= CURDATE() and date_time < CURDATE() + INTERVAL 1 DAY and car_id = carid and car_id = carid
GROUP by person_id
)
left join inst ok on o.inst_id = ok.id and o.car_id=carid
left join excuse_records mh on mh.person_id=o.id and mh.date_time >= CURDATE() and mh.date_time < CURDATE() + INTERVAL 1 DAY and (mh.car_route=ss.car_route)
left join photo_ r on r.table_id = o.id and r.table_name = 'person'
where
(ss.car_route=o.cars_route_ or o.cars_route_=3) and
o.car_id = carid and o.active=1
AND o.work_time=ss.work_time;
elseif #route_ = 2 and #stop_ser=0 then
select DISTINCT ss.start_time,ss.finish_time ,o.id,o.name,r.photo, oh.state ,oh.datee,ss.car_route,
ifnull(bh.id,0) AS called,
ifnull(mh.excuse_id,0) AS excuse_id,
ifnull(o.latitude_1,0) AS latitude_1,
ifnull(o.longitude_1,0) AS longitude_1,
ifnull(o.latitude_2,0) AS latitude_2,
ifnull(o.longitude_2,0) AS longitude_2,
case when (ifnull(o.call_notify,0)=1 or ifnull(o.mes_notify,0)=1) then 1 else 0 end AS call_notify ,
ifnull(o.rownumber,0) AS rownumber,
ifnull(o.number_1,0) AS number_1,
ifnull(o.number_2,0) AS number_2,
ifnull(o.brownumber,0) AS brownumber,
ifnull(ROUND(o.notify_meter_1/2),0) AS notify_meter_1,
ifnull(ROUND(o.notify_meter_2/2),0) AS notify_meter_2
from person o
inner join car_time ss on ss.inst_id =o.inst_id and o.back_car_id=ss.car_id
and ss.start_time<=DATE_FORMAT(CURTIME(),'%H:%i') AND ss.finish_time>= date_format(curtime() ,'%H:%i')
and ss.days like concat('%',(select WEEKDAY(now())+1),'%')
LEFT JOIN notify_records bh ON bh.table_id=o.id AND bh.car_route=#route_
and bh.table_name='person' AND bh.notify=4 AND bh.car_id=o.back_car_id and bh.date_ >= CURDATE() and bh.date_ < CURDATE() + INTERVAL 1 DAY
left join person_records oh on oh.person_id=o.id
and oh.car_id=o.back_car_id and oh.car_route=2
and date_format(oh.datee,'%H:%i') >=ss.start_time
and date_format(oh.datee,'%H:%i') <=ss.finish_time
AND oh.car_route= #route_
and
oh.id in (select max(id) from person_records
where date_time >= CURDATE() and date_time < CURDATE() + INTERVAL 1 DAY and car_id = carid
GROUP by person_id
)
left join inst ok on o.inst_id = ok.id and o.car_id=carid
left join excuse_records mh on mh.person_id=o.id and mh.date_time >= CURDATE() and mh.date_time < CURDATE() + INTERVAL 1 DAY and (mh.car_route=ss.car_route)
left join photo_ r on r.table_id = o.id and r.table_name = 'person'
where
(ss.car_route=o.cars_route_ or o.cars_route_=3) and
o.back_car_id = carid and o.active=1
AND o.work_time=ss.work_time;
END IF;
end
I have a database example here.
I tuned my.cnf but still have performance problems. What is wrong with this query? What can I change?
Thanks in advance.
Edit:
Server version: 10.1.41-MariaDB - MariaDB Server
The real tables do have indexes. I forgot to add them when creating the test data.
What the heck is this?
ss.days like concat('%',(select WEEKDAY(now())+1),'%')
It can at least be sped up by dropping the subquery:
ss.days like concat('%',WEEKDAY(now())+1,'%')
And, won't that lead to checking against 2, 21, 20, 12, ... if the WEEKDAY is "2"?
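To make the concern concrete, here is a small sketch in Python with the standard sqlite3 module. It assumes, purely for illustration, that days holds a comma-separated list such as '1,2,3'; the question does not show the real format. The first function mimics the LIKE '%…%' test, the second wraps both sides in delimiters so that only whole list items match (in MySQL/MariaDB, FIND_IN_SET(day, days) does a comparable whole-item test).

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def substring_match(days, day):
    """Mimic: days LIKE CONCAT('%', day, '%'), a plain substring test."""
    row = conn.execute("SELECT ? LIKE ('%' || ? || '%')", (days, str(day))).fetchone()
    return bool(row[0])


def delimited_match(days, day):
    """Wrap the list and the item in commas so only whole items match."""
    row = conn.execute(
        "SELECT (',' || ? || ',') LIKE ('%,' || ? || ',%')", (days, str(day))
    ).fetchone()
    return bool(row[0])


# '2' is a substring of '12' and '21', so the plain LIKE reports a match
print(substring_match("12,21", 2))
# the delimited form only accepts a whole item equal to '2'
print(delimited_match("12,21", 2))
print(delimited_match("1,2,3", 2))
```

If days only ever contains the single digits 1 to 7, the substring test happens to be safe; the false positives appear as soon as multi-digit values are possible.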
These might be useful for ss:
(car_id, inst_id, start_time)
(inst_id, car_id, finish_time)
LIMIT 1 without ORDER BY returns an arbitrary row. Is the LIMIT redundant, or is an ORDER BY needed?
Suggest you get some timings -- It is not obvious which of the SELECTs is chewing up the most CPU.
If the PRIMARY KEY of cars is id, then why test for inst_id and active? Yikes! You don't seem to have a PK for cars! Please verify that every table has a PK.
Redundant:
and car_id = carid
and car_id = carid
Why twice? And what tables are those columns in? Please qualify columns so we can understand what is going on.
Unless #stop_ser=0, the procedure does nothing? In which case, perform that test first, so you can avoid computing #route_ when it is not needed.
Change start_time to datatype TIME; then you can get rid of DATE_FORMAT in
and ss.start_time<=DATE_FORMAT(CURTIME(),'%H:%i')
AND ss.finish_time>= date_format(curtime() ,'%H:%i')
Also, beware of the inequality tests; they may lead to some edge cases you did not want.
Don't use (m,n) on FLOAT (eg, float(11,7)); it does rounding that is unnecessary. Also, you can't get 7 decimal places for lat/lng except very near the equator and longitude=0. More on precision: http://mysql.rjweb.org/doc.php/latlng#representation_choices
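The precision point can be checked with a short Python sketch. It assumes the column is a 4-byte single-precision FLOAT and uses the struct module to measure the gap between adjacent representable values; the helper names and the sample coordinate are made up for this example.

```python
import struct


def as_float32(x):
    """Round a Python float to the nearest 4-byte single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def float32_spacing(x):
    """Gap between the float32 nearest to positive x and the next one above it."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    next_up = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return next_up - as_float32(x)


# A longitude around 123 degrees: adjacent float32 values are 2**-17 apart,
# which is much coarser than the 1e-7 step that 7 decimal places would need.
print(float32_spacing(123.4567891))

# Only for values below 1 degree does the spacing drop under 1e-7.
print(float32_spacing(0.5))
```

So for most coordinates the 7 in float(11,7) promises digits that a single-precision value cannot hold.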
After you have cleaned up those and provided the requested info, I will take another look.
I have two MySQL tables memberships and member_cards. Each membership & member card can have three states.
Active = start_date <= today <= end_date
Future = today < start_date
Expired = end_date < today
Memberships table
id | membership_number | start_date | end_date
1  | 123               | 09-20-2014 | 09-20-2015
2  | 123               | 09-20-2015 | 09-20-2016
3  | 123               | 09-20-2016 | 09-20-2017
4  | 123               | 09-20-2017 | 09-20-2018
5  | 456               | 09-20-2013 | 09-20-2014
6  | 456               | 09-20-2014 | 09-20-2015
Membership cards
id | membership_id | start_date | end_date
1  | 1             | 09-20-2014 | 05-15-2015
2  | 1             | 09-20-2014 | 09-20-2015
3  | 2             | 09-20-2015 | 05-13-2016
4  | 2             | 09-20-2015 | 09-20-2016
5  | 3             | 09-20-2016 | 09-21-2016 (past)
6  | 3             | 09-20-2016 | 05-15-2017
7  | 3             | 09-20-2016 | 09-20-2017
8  | 4             | 09-20-2017 | 05-13-2017
9  | 4             | 09-20-2017 | 09-20-2018
10 | 5             | 09-20-2013 | 05-13-2014
11 | 5             | 09-20-2013 | 09-20-2014
12 | 6             | 09-20-2014 | 05-13-2015
13 | 6             | 09-20-2014 | 09-20-2015
I want to retrieve
All the active + future memberships + (if there are no active or future memberships for a particular membership number, the last expired record)
The results:
id | membership_number | start_date | end_date
3  | 123               | 09-20-2016 | 09-20-2017
4  | 123               | 09-20-2017 | 09-20-2018
6  | 456               | 09-20-2014 | 09-20-2015
Active cards + (if the membership has expired, all the cards tied to that membership )
The results:
id | membership_id | start_date | end_date
6  | 3             | 09-20-2016 | 05-15-2017
7  | 3             | 09-20-2016 | 09-20-2017
8  | 4             | 09-20-2017 | 05-13-2017
9  | 4             | 09-20-2017 | 09-20-2018
12 | 6             | 09-20-2014 | 05-13-2015
13 | 6             | 09-20-2014 | 09-20-2015
Each table contains about 200k records. I am trying to do the second query (for the member_cards) using a single MySQL query using UNION. Are there any better approaches?
As has been said, your question is a bit unclear, but I've written a query that returns your second target table based on the first one. The inner query returns your first target.
select
b.id#,
b.membership_id,
b.start_date,
b.end_date
from membership_cards b
full outer join (
select * from
(select a.*,
max(end_date) over (partition by membership_number) max_end_date,
case when
end_date>=sysdate --replace with today or whatever
then 1 --active
else 0 --inactive
end index_active
from memberships a)
where (end_date<=max_end_date and end_date>=sysdate) or
end_date = max_end_date) c
on c.id# = membership_id
where (c.index_active = 1 and b.end_date >= sysdate) or
c.index_active = 0
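The query above uses sysdate and FULL OUTER JOIN, which are not available in MySQL. As a rough check of the same idea against the sample data, here is a sketch in Python with the standard sqlite3 module. The dates are rewritten in ISO format so they compare as strings, the reference date 2016-10-01 is an assumption (the question only implies a date shortly after 09-21-2016), and the names picked and live are invented for the example.

```python
import sqlite3

MEMBERSHIPS = [
    (1, 123, "2014-09-20", "2015-09-20"), (2, 123, "2015-09-20", "2016-09-20"),
    (3, 123, "2016-09-20", "2017-09-20"), (4, 123, "2017-09-20", "2018-09-20"),
    (5, 456, "2013-09-20", "2014-09-20"), (6, 456, "2014-09-20", "2015-09-20"),
]
CARDS = [
    (1, 1, "2014-09-20", "2015-05-15"), (2, 1, "2014-09-20", "2015-09-20"),
    (3, 2, "2015-09-20", "2016-05-13"), (4, 2, "2015-09-20", "2016-09-20"),
    (5, 3, "2016-09-20", "2016-09-21"), (6, 3, "2016-09-20", "2017-05-15"),
    (7, 3, "2016-09-20", "2017-09-20"), (8, 4, "2017-09-20", "2017-05-13"),
    (9, 4, "2017-09-20", "2018-09-20"), (10, 5, "2013-09-20", "2014-05-13"),
    (11, 5, "2013-09-20", "2014-09-20"), (12, 6, "2014-09-20", "2015-05-13"),
    (13, 6, "2014-09-20", "2015-09-20"),
]

# Memberships to keep: anything not yet expired, plus the row with the latest
# end_date per membership_number (the last expired one when nothing is live).
PICKED = """
    SELECT m.id, m.end_date >= :today AS live
    FROM memberships m
    WHERE m.end_date >= :today
       OR m.end_date = (SELECT MAX(x.end_date) FROM memberships x
                        WHERE x.membership_number = m.membership_number)
"""


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE memberships (id, membership_number, start_date, end_date)")
    conn.execute("CREATE TABLE member_cards (id, membership_id, start_date, end_date)")
    conn.executemany("INSERT INTO memberships VALUES (?, ?, ?, ?)", MEMBERSHIPS)
    conn.executemany("INSERT INTO member_cards VALUES (?, ?, ?, ?)", CARDS)
    return conn


def membership_ids(conn, today):
    sql = "SELECT id FROM (" + PICKED + ") ORDER BY id"
    return [r[0] for r in conn.execute(sql, {"today": today})]


def card_ids(conn, today):
    # Expired membership: all of its cards. Live membership: unexpired cards only.
    sql = ("SELECT c.id FROM member_cards c JOIN (" + PICKED + ") p "
           "ON p.id = c.membership_id "
           "WHERE p.live = 0 OR c.end_date >= :today ORDER BY c.id")
    return [r[0] for r in conn.execute(sql, {"today": today})]


conn = make_db()
print(membership_ids(conn, "2016-10-01"))
print(card_ids(conn, "2016-10-01"))
```

With that reference date the two lists match the expected results in the question (memberships 3, 4, 6 and cards 6, 7, 8, 9, 12, 13), and no UNION is needed.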
I'm tracking a user's visits to a website by recording them in a database. A visit has a cooldown period of 6 hours. For this reason, I want to add a row to the table visits only if the user last visited the current website over 6 hours ago. If the last visit was less than 6 hours ago, do nothing.
I've looked around for answers to this and found plenty of quite similar issues, but none of those worked for me.
This is the last query I tried:
INSERT INTO visits (user_id, web_id)
SELECT (66, 2) FROM websites WHERE NOT EXISTS (
SELECT 1 FROM visits WHERE web_id = 2 and user_id = 66 and added_on >= NOW() - INTERVAL 6 HOUR
)
I'm getting a syntax error near WHERE NOT EXISTS.
You might want to enforce this rule with a trigger rather than in the application. But I think the problem is the parentheses in the SELECT clause:
INSERT INTO visits (user_id, web_id)
SELECT 66, 2
FROM websites
WHERE NOT EXISTS (SELECT 1
FROM visits
WHERE web_id = 2 and user_id = 66 and
added_on >= NOW() - INTERVAL 6 HOUR
);
Hmmm . . . Your query is strange. What is websites? Your query is going to insert one row for every row in that table. It seems unlikely that you want this behavior. Perhaps you just want this:
INSERT INTO visits (user_id, web_id)
SELECT w.user_id, w.web_id
FROM (SELECT 66 as user_id, 2 as web_id) w
WHERE NOT EXISTS (SELECT 1
FROM visits v
WHERE v.web_id = w.web_id and
v.user_id = w.user_id and
v.added_on >= NOW() - INTERVAL 6 HOUR
);
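The behavior of this conditional insert can be tried out with a small Python sketch using the standard sqlite3 module. It is only an approximation of the MySQL query: the current time is passed in as a parameter instead of calling NOW(), datetime(:now, '-6 hours') replaces NOW() - INTERVAL 6 HOUR, and the helper names are invented for the example.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (user_id INTEGER, web_id INTEGER, added_on TEXT)")
    return conn


def record_visit(conn, user_id, web_id, now):
    """Insert a visit unless the same user/site pair has one in the last 6 hours."""
    conn.execute(
        """
        INSERT INTO visits (user_id, web_id, added_on)
        SELECT :user_id, :web_id, :now
        WHERE NOT EXISTS (
            SELECT 1 FROM visits v
            WHERE v.user_id = :user_id
              AND v.web_id = :web_id
              AND v.added_on >= datetime(:now, '-6 hours')
        )
        """,
        {"user_id": user_id, "web_id": web_id, "now": now},
    )


def visit_count(conn):
    return conn.execute("SELECT COUNT(*) FROM visits").fetchone()[0]


conn = make_db()
record_visit(conn, 66, 2, "2024-01-01 00:00:00")  # first visit: inserted
record_visit(conn, 66, 2, "2024-01-01 05:59:00")  # inside the cooldown: skipped
record_visit(conn, 66, 2, "2024-01-01 06:01:00")  # cooldown over: inserted
print(visit_count(conn))
```

Because the SELECT produces at most one row of constants, at most one row is inserted per call, regardless of how many rows any other table holds.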
You might try this:
INSERT visits (user_id, web_id)
SELECT distinct user_id, web_id
FROM websites t join websites b
on b.user_id = t.user_id
and b.web_id = t.web_id
and b.added_on = (Select Max(added_on) From websites
Where id = t.user_id
and web_id = t.web_id
and added_on < t.added_on)
WHERE t.added_on >= NOW() - INTERVAL 6 HOUR
and user_id = 66
and web_id = 2
Try the SELECT part by itself first, to see if it returns the correct result.
By the way, if you only have the user_id and the web_id in the visits table, without the datetime of the visit, how do you interpret the data when there are multiple rows for the same user/website combination, resulting from visits more than six hours apart?
I have two tables, one is a list of firms, the other is a list of jobs the firms have advertised with deadlines for application and start dates.
Some of the firms will have advertised no jobs, some will only have jobs that are past their deadline dates, some will only have live jobs and others will have past and live applications.
What I want to be able to show as a result of a query is a list of all the firms, with the nearest deadline they have, sorted by that deadline. So the result might look something like this (if today was 2015-01-01).
Sorry, I misstated that. What I want is to find the next future deadline, and if there is no future deadline, show the last past deadline. So in the first table below, the BillCo deadline has passed, but the next BuffyCo deadline is shown. In the BillCo case there are only earlier deadlines, but in the BuffyCo case there are both earlier and later deadlines.
id  name     title     date
==  ====     =====     ====
1   BobCo    null      null
2   BillCo   Designer  2014-12-01
3   BuffyCo  Admin     2015-01-31
So, BobCo has no jobs listed at all, BillCo has a deadline that has passed and BuffyCo has a deadline in the future.
The problematic part is that BillCo may have a set of jobs like this:
id  title     date        desired hit
==  =====     ====        ===========
1   Coder     2013-12-01
2   Manager   2014-06-30
3   Designer  2014-12-01  <--
And BuffyCo might have:
id  title     date        desired hit
==  =====     ====        ===========
1   Magician  2013-10-01
2   Teaboy    2014-05-19
3   Admin     2015-01-31  <--
4   Writer    2015-02-28
So, I can do something like:
select * from (
select * from firms
left join jobs on firms.id = jobs.firmid
order by date desc)
as t1 group by firmid;
Or I can limit the jobs joined or returned by a date criterion, but I don't seem to be able to get the records I want returned. For example, the above query would return:
id  name     title     date
==  ====     =====     ====
1   BobCo    null      null
2   BillCo   Designer  2014-12-01
3   BuffyCo  Writer    2015-02-28
For BuffyCo it's returning the Writer job rather than the Admin job.
Is it impossible with an SQL query? Any advice appreciated, thanks in advance.
I think this may be what you need. You need to:
1) calculate the delta between each job's date and the current date, and find the minimum delta for each firm.
2) join firms to jobs only where the firm ids match and where the firm's minimum delta matches the delta for the row in jobs.
SELECT f.id, f.name, j.title, j.date
FROM firms f LEFT JOIN
(SELECT firmid, MIN(abs(datediff(date, curdate()))) AS delta
FROM jobs
GROUP BY firmid) d
ON f.id = d.firmid
LEFT JOIN jobs j ON f.id = j.firmid AND d.delta = abs(datediff(j.date, curdate()));
You want to make an outer join with something akin to the group-wise maximum of (next upcoming, last expired):
SELECT * FROM firms LEFT JOIN (
-- fetch the "groupwise" record
jobs NATURAL JOIN (
-- using the relevant date for each firm
SELECT firmid, MAX(closest_date) date
FROM (
-- next upcoming deadline
SELECT firmid, MIN(date) closest_date
FROM jobs
WHERE date >= CURRENT_DATE
GROUP BY firmid
UNION ALL
-- most recent expired deadline
SELECT firmid, MAX(date)
FROM jobs
WHERE date < CURRENT_DATE
GROUP BY firmid
) closest_dates
GROUP BY firmid
) selected_dates
) ON jobs.firmid = firms.id
This will actually give you all jobs that have the best deadline date for each firm. If you want to restrict the results to an indeterminate record from each such group, you can add GROUP BY firms.id to the very end.
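The approach above can be checked against the sample data from the question with a short Python sketch using the standard sqlite3 module. It is an approximation rather than the exact MySQL query: explicit joins replace the NATURAL JOIN, the reference date is passed in as :today instead of CURRENT_DATE, and the Designer deadline is assumed to be 2014-12-01, as in the question's result tables.

```python
import sqlite3

FIRMS = [(1, "BobCo"), (2, "BillCo"), (3, "BuffyCo")]
JOBS = [
    (2, "Coder", "2013-12-01"), (2, "Manager", "2014-06-30"),
    (2, "Designer", "2014-12-01"),
    (3, "Magician", "2013-10-01"), (3, "Teaboy", "2014-05-19"),
    (3, "Admin", "2015-01-31"), (3, "Writer", "2015-02-28"),
]

SQL = """
SELECT f.name, j.title, j.date
FROM firms f
LEFT JOIN (
    -- MAX picks the upcoming date when one exists, else the expired one
    SELECT firmid, MAX(closest_date) AS date
    FROM (
        -- next upcoming deadline
        SELECT firmid, MIN(date) AS closest_date
        FROM jobs WHERE date >= :today GROUP BY firmid
        UNION ALL
        -- most recent expired deadline
        SELECT firmid, MAX(date)
        FROM jobs WHERE date < :today GROUP BY firmid
    ) closest_dates
    GROUP BY firmid
) d ON d.firmid = f.id
LEFT JOIN jobs j ON j.firmid = d.firmid AND j.date = d.date
ORDER BY j.date, f.id
"""


def closest_deadlines(today):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE firms (id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE jobs (firmid INTEGER, title TEXT, date TEXT)")
    conn.executemany("INSERT INTO firms VALUES (?, ?)", FIRMS)
    conn.executemany("INSERT INTO jobs VALUES (?, ?, ?)", JOBS)
    return conn.execute(SQL, {"today": today}).fetchall()


for row in closest_deadlines("2015-01-01"):
    print(row)
```

For 2015-01-01 this yields BobCo with no job, BillCo with Designer (2014-12-01), and BuffyCo with Admin (2015-01-31). Note that SQLite sorts NULL dates first in ascending order.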
The revision to your question makes it rather trickier, but it can still be done. Try this:
select
closest_job.*, firm.name
from
firms
left join (
select future_job.*
from
(
select firmid, min(date) as mindate
from jobs
where date >= curdate()
group by firmid
) future
inner join jobs future_job
on future_job.firmid = future.firmid and future_job.date = future.mindate
union all
select past_job.*
from
(
select firmid, max(date) as maxdate
from jobs
group by firmid
having max(date) < curdate()
) past
inner join jobs past_job
on past_job.firmid = past.firmid and past_job.date = past.maxdate
) closest_job
on firms.id = closest_job.firmid
I think this does what I need:
select * from (
select firms.name, t2.closest_date from firms
left join
(
select * from (
-- get first date in the future
SELECT firmid, MIN(date) closest_date
FROM jobs
WHERE date >= CURRENT_DATE
GROUP BY firmid
UNION ALL
-- most recent expired deadline
SELECT firmid, MAX(date)
FROM jobs
WHERE date < CURRENT_DATE
GROUP BY firmid) as t1
-- order so latest date is first
order by closest_date desc) as t2
on firms.id = t2.firmid
-- group by eliminates all but latest date
group by firms.id) as t3
order by closest_date asc;
Thanks for all the help on this