Subquery or JOIN: which is the optimal solution? - mysql

I have 2 tables: a user table and a table where the
outgoing emails are queued. I want to select the users
who have not been online for a certain amount of time
and send them an email. Users who already received such
an email in the last 7 days, or who have one scheduled
for the next 7 days, should not be selected.
I have 2 queries, and I think they would work well
if they were combined through a subquery.
As this is not an area I'm an expert in, I would
like to kindly invite you to either:
Build a subquery out of the second query
Make a JOIN and exclude the second query's results.
I would be more than happy :)
Thank you for reading.
SELECT
`user_id`
FROM
`user`
WHERE
DATEDIFF( CURRENT_DATE(), date_seen ) >= 7
The results of the second query should be excluded
from the query above.
SELECT
`mail_queue_id`,
`mail_id`,
`user_id`,
`status`,
`date_scheduled`,
`date_processed`
FROM
`mail_queue`
WHERE
(
DATEDIFF( CURRENT_DATE(), date_scheduled ) >= 7
OR
DATEDIFF( date_scheduled, CURRENT_DATE() ) <= 7
)
AND
(
`mail_id` = 'inactive_week'
AND
(
`status` = 'AWAITING'
OR
`status` = 'DELIVERED'
)
)
SOLUTION
SELECT
`user_id`
FROM
`user` as T1
WHERE
DATEDIFF( CURRENT_DATE(), date_seen ) >= 7
AND NOT EXISTS
(
SELECT
`user_id`
FROM
`mail_queue` as T2
WHERE
T2.`user_id` = T1.`user_id`
AND
(
DATEDIFF( CURRENT_DATE(), date_scheduled ) >= 7
OR
DATEDIFF( date_scheduled, CURRENT_DATE() ) <= 7
)
AND
(
`mail_id` = 'inactive_week'
AND
(
`status` = 'AWAITING'
OR
`status` = 'DELIVERED'
)
)
)

You can select the users who match the first criterion (not having logged on in the past seven days) and then "AND" that criterion to another clause using "NOT EXISTS", aliasing the same table:
select * from T where {first criterion}
and not exists
(
select * from T as T2 where T2.userid = T.userid
and ABS( DATEDIFF(datescheduled, CURRENT_DATE()) ) <=7
)
I'm not familiar with the nuances of MySQL's DATEDIFF, i.e. whether it matters which date value appears in which position, but taking the absolute value sidesteps that: if the user was sent a notice in the past 7 days or is scheduled to receive one in the next 7 days, the row satisfies the condition, the NOT EXISTS fails, and that user is excluded from your final set.
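For what it's worth, MySQL's DATEDIFF(a, b) returns a minus b in whole days, so the argument order does matter and ABS() is what makes the check symmetric. Note also that the OR pair in the question (>= 7 one way, <= 7 the other) is true for every row scheduled no later than 7 days from now, including very old ones, so it likely excludes more users than intended. Here is a minimal sketch of the NOT EXISTS pattern with a symmetric 7-day window, written in Python with sqlite3 so it runs without a MySQL server. The sample rows, the fixed "today" and julianday() standing in for DATEDIFF are my assumptions, not part of the original schema.

```python
import sqlite3

# Hypothetical miniature of the `user` and `mail_queue` tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, date_seen TEXT);
    CREATE TABLE mail_queue (
        user_id INTEGER, mail_id TEXT, status TEXT, date_scheduled TEXT
    );
    INSERT INTO user VALUES
        (1, '2024-01-01'), (2, '2024-01-01'),
        (3, '2024-01-01'), (4, '2024-01-19');
    INSERT INTO mail_queue VALUES
        (1, 'inactive_week', 'DELIVERED', '2024-01-17'),
        (2, 'inactive_week', 'DELIVERED', '2023-12-01'),
        (3, 'inactive_week', 'AWAITING',  '2024-01-24');
""")


def users_to_mail(conn, today):
    """Ids of users unseen for >= 7 days who have no 'inactive_week'
    mail within 7 days before or after `today` (a YYYY-MM-DD string)."""
    sql = """
        SELECT u.user_id
        FROM user AS u
        WHERE julianday(:today) - julianday(u.date_seen) >= 7
          AND NOT EXISTS (
              SELECT 1
              FROM mail_queue AS q
              WHERE q.user_id = u.user_id
                AND q.mail_id = 'inactive_week'
                AND q.status IN ('AWAITING', 'DELIVERED')
                -- SQLite stand-in for
                -- ABS(DATEDIFF(date_scheduled, CURRENT_DATE())) <= 7
                AND ABS(julianday(q.date_scheduled) - julianday(:today)) <= 7
          )
        ORDER BY u.user_id
    """
    return [row[0] for row in conn.execute(sql, {"today": today})]


# User 1 was mailed 3 days ago, user 3 has a mail due in 4 days,
# user 4 was seen yesterday; only user 2 (mailed long ago) qualifies.
result = users_to_mail(conn, "2024-01-20")
```

In MySQL the same window could also be written as date_scheduled BETWEEN CURRENT_DATE() - INTERVAL 7 DAY AND CURRENT_DATE() + INTERVAL 7 DAY, which leaves the column bare and so gives an index on it a chance to be used.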

Related

Select statement joining 2 tables, searching by date, and status

OK I think I have messed up somewhere but maybe someone can spot my error, because I have little clue of what I am doing.
I have 2 Tables Players and RegionPlayer (see bottom for structure)
I am trying to find when none of the players on a region have been seen in a while. Players can be on vacation, which gives them 58 days; otherwise it's only 8 days.
If none of the players on a region have been seen in that time, I want the SQL search to return the regionID, as well as the most recent person on that region who was seen.
Now I think the way to do this is to get 2 results from each region, each providing me the most recent player seen who was on vacation, and who was not on vacation.
But while I thought this would give me that, it doesn't seem to.
SELECT RegionPlayer.Regionid, Players.key, Players.Name, Players.Seen, Players.Vacation
FROM RegionPlayer
JOIN Players
ON Players.Key = RegionPlayer.Playerid
where ( RegionPlayer.Status = 1 )
GROUP BY RegionPlayer.Regionid DESC, Players.Vacation DESC
ORDER BY Players.Seen DESC
Then I am going to need to be able to tell who has not been seen in a while, this should give me that.
Now I know I can link both queries together, but I have no idea how, it has been many years since I last had to put this much effort into sql statements.
Select Players.key FROM Players
WHERE
(( Players.Vacation != 1 ) AND
( Players.Seen <= (NOW() - INTERVAL 8 DAY ) ))
OR
(( Players.Vacation != 0 ) AND
( Players.Seen <= (NOW() - INTERVAL 58 DAY ) ))
Is there a better way of doing this? I sort of remember things like views, stored procedures, and functions; would one or more of them be better?
Table structure:
Please forgive the names of the tables and some of the structure. This is an example of why deciding things late at night after half a bottle of wine is a bad idea.
CREATE TABLE IF NOT EXISTS `Players` (
`key` int(11) NOT NULL,
`Name` varchar(255) NOT NULL,
`Vacation` varchar(1) NOT NULL,
`Seen` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`Modified` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
)
CREATE TABLE IF NOT EXISTS `RegionPlayer` (
`Key` int(11) NOT NULL,
`Playerid` int(11) NOT NULL,
`Regionid` int(11) NOT NULL,
`Type` varchar(1) NOT NULL,
`Status` int(1) NOT NULL DEFAULT '1',
`Modified` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`Created` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00'
)
I've put up an SQLFiddle.
The query that answers your basic requirement, which seems to be: list all regions that have no active player seen in the last 8 days and no vacated player seen in the last 58 days, giving also the data of the last seen player in that region:
SELECT r.*
FROM (
SELECT rp.Regionid, p.Key, p.Name, p.Vacation, p.Seen
FROM RegionPlayer rp
JOIN Players p ON p.Key = rp.Playerid
WHERE rp.Status = 1
GROUP BY rp.Regionid
ORDER BY p.Seen DESC
) r
WHERE ((r.Vacation != 1) AND (r.Seen <= (NOW()-INTERVAL 8 DAY)))
OR ((r.Vacation != 0) AND (r.Seen <= (NOW()-INTERVAL 58 DAY)));
I inferred from your SQL that only RegionPlayer rows with a Status of 1 should be considered.
On the SQLFiddle I've created a few regions with different combinations, and this query does its job.
As to your first SQL statement. You say it doesn't work as expected, but to me it seems to do it... the last seen active player and last seen vacated player for each region. The sorting may not make it very readable, but it does do that.
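One caveat: a query that groups by Regionid while selecting non-aggregated player columns depends on which row MySQL happens to keep for each group, and as far as I know that choice is not guaranteed, even with an ORDER BY inside the derived table. A variant that avoids this is to aggregate explicitly: count the "recent enough" players per region and keep only the regions where that count is zero. Here is a minimal sketch in Python with sqlite3; the sample rows, the fixed "now" and the trimmed-down columns are my assumptions, and julianday() stands in for MySQL's date arithmetic.

```python
import sqlite3

# Hypothetical sample data; tables trimmed to the columns the query needs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Players ("key" INTEGER, Name TEXT, Vacation TEXT, Seen TEXT);
    CREATE TABLE RegionPlayer (Playerid INTEGER, Regionid INTEGER, Status INTEGER);
    INSERT INTO Players VALUES
        (1, 'Ann', '0', '2024-02-28 00:00:00'),
        (2, 'Bob', '0', '2024-02-10 00:00:00'),
        (3, 'Cy',  '1', '2024-01-20 00:00:00'),
        (4, 'Di',  '1', '2023-12-01 00:00:00'),
        (5, 'Ed',  '0', '2024-02-15 00:00:00');
    INSERT INTO RegionPlayer VALUES
        (1, 10, 1), (2, 10, 1),
        (2, 20, 1), (4, 20, 1),
        (3, 30, 1), (5, 30, 1),
        (5, 40, 0), (4, 40, 1);
""")

STALE_REGIONS_SQL = """
    SELECT rp.Regionid, MAX(p.Seen) AS last_seen
    FROM RegionPlayer AS rp
    JOIN Players AS p ON p."key" = rp.Playerid
    WHERE rp.Status = 1
    GROUP BY rp.Regionid
    -- Each comparison yields 1 when the player is still inside the
    -- allowance (58 days on vacation, 8 otherwise); a sum of 0 means
    -- nobody in the region was seen recently enough.
    HAVING SUM(
        julianday(:now) - julianday(p.Seen)
            < CASE WHEN p.Vacation = '1' THEN 58 ELSE 8 END
    ) = 0
    ORDER BY rp.Regionid
"""

stale = conn.execute(STALE_REGIONS_SQL, {"now": "2024-03-01 00:00:00"}).fetchall()
```

This returns the region and the latest Seen value; to get the matching player's name as well, the result can be joined back to Players on Seen and region membership.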
Try this
SELECT RegionPlayer.Regionid, m.key, m.Name, m.Seen, m.Vacation
FROM RegionPlayer
JOIN (SELECT * FROM Players
WHERE
(( Players.Vacation != 1 ) AND
( Players.Seen <= (NOW() - INTERVAL 8 DAY ) ))
OR
(( Players.Vacation != 0 ) AND
( Players.Seen <= (NOW() - INTERVAL 58 DAY ) ))) m
ON m.Key = RegionPlayer.Playerid
where ( RegionPlayer.Status = 1 )
GROUP BY RegionPlayer.Regionid DESC, m.Vacation DESC
ORDER BY m.Seen DESC

MySQL Query - Trouble with dates / between

I have a 3rd party plugin that displays events, however for some reason whenever there is an event with multiple days, it stops showing the event when the current day is past the start date, even though the end date is still in the future.
The MySQL query appears to be trying to return these with the BETWEEN part after the OR at the end but it never does. I'm not familiar enough with MySQL to see what's wrong I guess.
For instance the row I'm expecting it to return here contains:
published=1
dates=2014-04-17
enddates=2014-04-19
SQL:
SELECT a.*
FROM eventlist_events AS a
WHERE a.published = 1
AND a.dates >= '2014-04-18'
AND (DATE(a.dates) = DATE_ADD('2014-04-18',INTERVAL 0 DAY)
OR ( a.enddates IS NOT NULL
AND (DATE_ADD('2014-04-18',INTERVAL 0 DAY)
BETWEEN DATE(a.dates) AND DATE(a.enddates))) )
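The a.dates >= '2014-04-18' condition excludes every event that started before that day, so the BETWEEN branch can never match a multi-day event already in progress. To match rows where 2014-04-18 falls between the two column values, drop that condition and test the range directly: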
SELECT a.* FROM eventlist_events AS a
WHERE a.published = 1
AND a.enddates IS NOT NULL
AND '2014-04-18' BETWEEN DATE(a.dates) AND DATE(a.enddates)
And if you also want to fetch rows where a.enddates is NULL (single-day events), then:
SELECT a.* FROM eventlist_events AS a
WHERE a.published = 1
-- AND a.enddates IS NOT NULL
AND '2014-04-18'
BETWEEN DATE( a.dates )
AND DATE( IFNULL( a.enddates, a.dates ) )
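To check the range test against a few edge cases, here is a small sketch in Python with sqlite3 (SQLite also has DATE() and IFNULL(), so the WHERE clause carries over almost unchanged). The id column and the sample rows are my assumptions; only published, dates and enddates come from the question.

```python
import sqlite3

# Hypothetical rows: multi-day, single-day, future, past and unpublished events.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE eventlist_events (
        id INTEGER PRIMARY KEY, published INTEGER, dates TEXT, enddates TEXT
    );
    INSERT INTO eventlist_events VALUES
        (1, 1, '2014-04-17', '2014-04-19'),
        (2, 1, '2014-04-18', NULL),
        (3, 1, '2014-04-19', NULL),
        (4, 1, '2014-04-10', '2014-04-12'),
        (5, 0, '2014-04-17', '2014-04-19');
""")


def events_on(conn, day):
    """Ids of published events whose date range covers `day` (YYYY-MM-DD).
    A NULL end date is treated as a single-day event."""
    sql = """
        SELECT a.id
        FROM eventlist_events AS a
        WHERE a.published = 1
          AND :day BETWEEN DATE(a.dates)
                       AND DATE(IFNULL(a.enddates, a.dates))
        ORDER BY a.id
    """
    return [row[0] for row in conn.execute(sql, {"day": day})]


running = events_on(conn, "2014-04-18")
```

Event 1 is the row from the question (started on the 17th, ends on the 19th) and is returned along with the single-day event 2.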

MYSQL view not filling up empty spots to fill 1000

The following query gets the popular questions from the questions asked in the last 2 days. It looks at a feed table to see what has been talked about most recently, then searches a tag table to find which of those are popular.
I only get about 60 results, which is great, but I need 1000 results. This means I need to fill up the rest with random questions.
My SQL query attempts to do this but does not fill in the rest of the view with more questions that are not in the feed table.
CREATE
ALGORITHM = UNDEFINED
DEFINER = `root`@`%`
SQL SECURITY DEFINER
VIEW `popular` AS
select
`q`.`name` AS `name`,
`q`.`questionUrl` AS `questionUrl`,
`q`.`miRating` AS `miRating`,
`q`.`imageUrl` AS `imageUrl`,
`q`.`foundOn` AS `foundOn`,
`q`.`myId` AS `myId`
from
(`question` `q`
join `feed` `f` ON ((`q`.`myId` = `f`.`question_id`))
join `tag` `t` ON ((`q`.`myId` = `t`.`question_id`)))
where
(`t`.`name` like '%popular%')
group by `q`.`name`
order by (max(`f`.`timeStamp`) >= (now() - interval 1 day)) desc , (`q`.`myId` is not null) desc
limit 0 , 1000
If you need random questions, remove the where clause and move the logic to the order by:
select
`q`.`name` AS `name`,
`q`.`questionUrl` AS `questionUrl`,
`q`.`miRating` AS `miRating`,
`q`.`imageUrl` AS `imageUrl`,
`q`.`foundOn` AS `foundOn`,
`q`.`myId` AS `myId`
from
(`question` `q`
join `feed` `f` ON ((`q`.`myId` = `f`.`question_id`))
join `tag` `t` ON ((`q`.`myId` = `t`.`question_id`)))
group by `q`.`name`
order by (max(`f`.`timeStamp`) >= (now() - interval 1 day)) desc ,
max(`t`.`name` like '%popular%') desc,
rand()
limit 0 , 1000;
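One thing to watch: with inner joins, a question that has no row in feed or tag still drops out before the ORDER BY is applied, so it can never be used as filler. If such questions should count (my reading of the question, so treat it as an assumption), switching to LEFT JOIN keeps them. Here is a minimal sketch in Python with sqlite3; the sample rows and the cutoff value are made up, and I group by myId rather than name.

```python
import sqlite3

# Hypothetical data: q1 is recent and popular, q2 only popular,
# q3 and q4 appear in neither feed nor tag.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE question (myId INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE feed (question_id INTEGER, timeStamp TEXT);
    CREATE TABLE tag (question_id INTEGER, name TEXT);
    INSERT INTO question VALUES (1, 'q1'), (2, 'q2'), (3, 'q3'), (4, 'q4');
    INSERT INTO feed VALUES (1, '2024-05-02 09:00:00');
    INSERT INTO tag VALUES (1, 'popular'), (2, 'most popular');
""")

FILL_SQL = """
    SELECT q.myId
    FROM question AS q
    LEFT JOIN feed AS f ON f.question_id = q.myId
    LEFT JOIN tag  AS t ON t.question_id = q.myId
    GROUP BY q.myId
    -- Sort keys, most important first: recently fed, tagged popular,
    -- then random order among everything that is left.
    ORDER BY MAX(COALESCE(f.timeStamp >= :cutoff, 0)) DESC,
             MAX(COALESCE(t.name LIKE '%popular%', 0)) DESC,
             RANDOM()
    LIMIT :n
"""

params = {"cutoff": "2024-05-01 12:00:00", "n": 1000}
ids = [row[0] for row in conn.execute(FILL_SQL, params)]
```

The first two ids are always 1 and 2; the unfed, untagged questions follow in random order until the limit is reached.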

MYSQL Query : How to get values per category?

I have a huge table with millions of records that stores stock values by timestamp. The structure is as below:
Stock, timestamp, value
goog,1112345,200.4
goog,112346,220.4
Apple,112343,505
Apple,112346,550
I would like to query this table by timestamp. If the timestamp matches, all corresponding stock records should be returned; if there is no record for a stock at that timestamp, the immediately preceding one should be returned. In the example above, if I query by timestamp=1112345 then the query should return 2 records:
goog,1112345,200.4
Apple,112343,505 (immediate previous record)
I have tried several different ways to write this query but with no success, and I'm sure I'm missing something. Can someone help, please?
SELECT `Stock`, `timestamp`, `value`
FROM `myTable`
WHERE `timestamp` = 1112345
UNION ALL
SELECT `Stock`, `timestamp`, `value`
FROM `myTable`
WHERE `timestamp` < 1112345
ORDER BY `timestamp` DESC
LIMIT 1
select Stock, timestamp, value from thisTbl where timestamp = ? and fill in timestamp to whatever it should be? Your demo query is available on this fiddle
I don't think there is an easy way to do this query. Here is one approach:
select tprev.*
from (select s.stock,
(select t2.timestamp
from t t2
where t2.stock = s.stock and t2.timestamp <= <whatever>
order by t2.timestamp desc
limit 1
) as prevtimestamp
from (select distinct stock
from t
) s
) s join
t tprev
on s.prevtimestamp = tprev.timestamp and s.stock = tprev.stock
This is getting the previous or equal timestamp for the record and then joining it back in. If you have indexes on (stock, timestamp) then this may be rather fast.
Another phrasing of it uses group by:
select tprev.*
from (select t.stock,
max(timestamp) as prevtimestamp
from t
where timestamp <= YOURTIMESTAMP
group by t.stock
) s join
t tprev
on s.prevtimestamp = tprev.timestamp and s.stock = tprev.stock
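To see the GROUP BY version working end to end, here is a small sketch in Python with sqlite3. The table name quotes is made up, and I have assumed all timestamps are six digits (112345 rather than 1112345), which seems to be what the sample data intended.

```python
import sqlite3

# Hypothetical copy of the sample rows, with six-digit timestamps assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE quotes (stock TEXT, timestamp INTEGER, value REAL);
    CREATE INDEX quotes_stock_ts ON quotes (stock, timestamp);
    INSERT INTO quotes VALUES
        ('goog',  112345, 200.4),
        ('goog',  112346, 220.4),
        ('Apple', 112343, 505),
        ('Apple', 112346, 550);
""")


def latest_at_or_before(conn, ts):
    """One row per stock: the record at `ts`, or the closest earlier one."""
    sql = """
        SELECT t.stock, t.timestamp, t.value
        FROM (
            -- Latest timestamp not after :ts, per stock.
            SELECT stock, MAX(timestamp) AS prevtimestamp
            FROM quotes
            WHERE timestamp <= :ts
            GROUP BY stock
        ) AS s
        JOIN quotes AS t
          ON t.stock = s.stock AND t.timestamp = s.prevtimestamp
        ORDER BY t.stock
    """
    return conn.execute(sql, {"ts": ts}).fetchall()


rows = latest_at_or_before(conn, 112345)
```

goog has an exact match at 112345, while Apple falls back to its previous record at 112343. A stock with no record at or before the timestamp is simply absent from the result.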

How to optimise my complex MySQL query?

Table
Each row represents a video that was on air at a particular time on a particular date. There are about 1600 videos per day.
CREATE TABLE `air_video` (
`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`date` DATE NOT NULL,
`time` TIME NOT NULL,
`duration` TIME NOT NULL,
`asset_id` INT(10) UNSIGNED NOT NULL,
`name` VARCHAR(100) NOT NULL,
`status` VARCHAR(100) NULL DEFAULT NULL,
`updated` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
UNIQUE INDEX `date_2` (`date`, `time`),
INDEX `date` (`date`),
INDEX `status` (`status`),
INDEX `asset_id` (`asset_id`)
)
ENGINE=InnoDB
Task
There are two conditions.
Each video must be shown not more than 24 times per day.
Each video must be in rotation no longer than 72 hours.
"In rotation" means the time span between the first and the last time the video was on air.
So I need to select all videos that violate those conditions, given user-specified date range.
The result must be grouped by day and by asset_id (video id). For example:
date asset_id name dailyCount rotationSpan
2012-04-27 123 whatever_the_name 35 76
2012-04-27 134 whatever_the_name2 39 20
2012-04-28 125 whatever_the_name3 26 43
Query
By now I have written this query:
SELECT
t1.date, t1.asset_id, t1.name,
(SELECT
COUNT(t3.asset_id)
FROM air_video AS t3
WHERE t2.asset_id = t3.asset_id AND t3.date = t1.date
) AS 'dailyCount',
MIN(CONCAT(t2.date, ' ', t2.time)) AS 'firstAir',
MAX(CONCAT(t2.date, ' ', t2.time)) AS 'lastAir',
ROUND(TIMESTAMPDIFF(
MINUTE,
MIN(CONCAT(t2.date, ' ', t2.time)),
MAX(CONCAT(t2.date, ' ', t2.time))
) / 60) as 'rotationSpan'
FROM
air_video AS t1
INNER JOIN
air_video AS t2 ON
t1.asset_id = t2.asset_id
WHERE
t1.status NOT IN ('bumpers', 'clock', 'weather')
AND t1.date BETWEEN '2012-04-01' AND '2012-04-30'
GROUP BY
t1.asset_id, t1.date
HAVING
`rotationSpan` > 72
OR `dailyCount` > 24
ORDER BY
`date` ASC,
`rotationSpan` DESC,
`dailyCount` DESC
Problems
The bigger the user-specified date range, the longer the query takes to complete (for a one-month range it takes about 9 sec).
The lastAir timestamp is not the latest time the video was aired on a particular date but the latest time it was on air altogether.
If you need to speed up your query you need to remove the select sub query on line 3.
To still have that count you can inner join it again in the from clause with the exact parameters you used initially. This is how it should look:
SELECT
t1.date, t1.asset_id, t1.name,
-- DISTINCT: the joins repeat each t3 row once per matching t1/t2 row
COUNT(DISTINCT t3.id) AS 'dailyCount',
MIN(CONCAT(t2.date, ' ', t2.time)) AS 'firstAir',
MAX(CONCAT(t2.date, ' ', t2.time)) AS 'lastAir',
ROUND(TIMESTAMPDIFF(
MINUTE,
MIN(CONCAT(t2.date, ' ', t2.time)),
MAX(CONCAT(t2.date, ' ', t2.time))
) / 60) as 'rotationSpan'
FROM
air_video AS t1
INNER JOIN
air_video AS t2 ON
(t1.asset_id = t2.asset_id)
INNER JOIN
air_video AS t3
ON (t2.asset_id = t3.asset_id AND t3.date = t1.date)
WHERE
t1.status NOT IN ('bumpers', 'clock', 'weather')
AND t1.date BETWEEN '2012-04-01' AND '2012-04-30'
GROUP BY
t1.asset_id, t1.date
HAVING
`rotationSpan` > 72
OR `dailyCount` > 24
ORDER BY
`date` ASC,
`rotationSpan` DESC,
`dailyCount` DESC
Since t2 is not bound by date, you are obviously looking at the whole table, instead of the date range.
Edit:
Due to a lot of date bindings the query still ran too slowly. I then took a different approach. I created 3 views (which you obviously can combine into a normal query without the views, but I like the end result query better)
--T1--
CREATE VIEW t1 AS select date,asset_id,name from air_video where (status not in ('bumpers','clock','weather')) group by asset_id,date order by date;
--T2--
CREATE VIEW t2 AS select t1.date,t1.asset_id,t1.name,min(concat(t2.date,' ',t2.time)) AS 'firstAir',max(concat(t2.date,' ',t2.time)) AS 'lastAir',round((timestampdiff(MINUTE,min(concat(t2.date,' ',t2.time)),max(concat(t2.date,' ',t2.time))) / 60),0) AS 'rotationSpan' from (t1 join air_video t2 on((t1.asset_id = t2.asset_id))) group by t1.asset_id,t1.date;
--T3--
CREATE VIEW t3 AS select t2.date,t2.asset_id,t2.name,count(t3.asset_id) AS 'dailyCount',t2.firstAir,t2.lastAir,t2.rotationSpan AS rotationSpan from (t2 join air_video t3 on(((t2.asset_id = t3.asset_id) and (t3.date = t2.date)))) group by t2.asset_id,t2.date;
From there you can then just run the following query:
SELECT
date,
asset_id,
name,
dailyCount,
firstAir,
lastAir,
rotationSpan
FROM
t3
WHERE
date BETWEEN '2012-04-01' AND '2012-04-30'
AND (
rotationSpan > 72
OR
dailyCount > 24
)
ORDER BY
date ASC,
rotationSpan DESC,
dailyCount DESC
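The same idea also works without views: aggregate the daily counts and the per-asset first/last airing separately, then join the two small results instead of self-joining the full table. This is a different formulation from the views above, not a transcription of them. Here is a minimal sketch in Python with sqlite3; the sample rows and the lowered count threshold are made up, the status filter is left out for brevity, and julianday() stands in for TIMESTAMPDIFF.

```python
import sqlite3

# Hypothetical airings: asset 1 airs 3 times on one day, asset 2 is in
# rotation for 76 hours, asset 3 airs once.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE air_video (
        id INTEGER PRIMARY KEY, date TEXT, time TEXT, asset_id INTEGER
    );
    INSERT INTO air_video (date, time, asset_id) VALUES
        ('2012-04-01', '10:00:00', 1),
        ('2012-04-01', '12:00:00', 1),
        ('2012-04-01', '14:00:00', 1),
        ('2012-04-01', '08:00:00', 2),
        ('2012-04-04', '12:00:00', 2),
        ('2012-04-02', '09:00:00', 3);
""")

REPORT_SQL = """
    SELECT date, asset_id, dailyCount, rotationSpan
    FROM (
        SELECT d.date, d.asset_id, d.dailyCount,
               -- Hours between the first and last airing of the asset.
               CAST(ROUND((julianday(s.lastAir) - julianday(s.firstAir)) * 24)
                    AS INTEGER) AS rotationSpan
        FROM (
            -- Airings per asset per day, limited to the requested range.
            SELECT date, asset_id, COUNT(*) AS dailyCount
            FROM air_video
            WHERE date BETWEEN :start AND :end
            GROUP BY date, asset_id
        ) AS d
        JOIN (
            -- First and last airing per asset over the whole table.
            SELECT asset_id,
                   MIN(date || ' ' || time) AS firstAir,
                   MAX(date || ' ' || time) AS lastAir
            FROM air_video
            GROUP BY asset_id
        ) AS s ON s.asset_id = d.asset_id
    )
    WHERE dailyCount > :max_count OR rotationSpan > :max_hours
    ORDER BY date ASC, rotationSpan DESC, dailyCount DESC
"""

params = {"start": "2012-04-01", "end": "2012-04-30",
          "max_count": 2, "max_hours": 72}
violations = conn.execute(REPORT_SQL, params).fetchall()
```

With the real limits you would pass max_count=24. Each aggregate scans the table once, so the cost should no longer grow with the number of row pairs per asset.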