I have a database with statistics for a number of websites, and I'm having trouble with a rather complex query that I have no idea how to write (or whether it's even possible).
I have 2 tables: websites and visits. The former is a list of all websites and their properties, while the latter is a list of each user's visit to a specific website.
The program I'm making is supposed to fetch websites that need to be "scanned". The interval between scans for each site depends on the website's total number of visits over the last 30 days. Here is a table with the intended scan-interval:
The tables have the following structure:
Websites
Visits
What I want is a query that returns the websites that are either at or past their individual update deadline (can be seen from the last_scanned column).
Is this easily doable in a single query?
Here's something you can try:
SELECT main.*
FROM (
SELECT
w.web_id,
w.url,
w.last_scanned,
(SELECT COUNT(*)
FROM visits v
WHERE v.web_id = w.web_id
AND TIMESTAMPDIFF(DAY,v.added_on, NOW()) <=30
) AS visit_count,
TIMESTAMPDIFF(HOUR,w.last_scanned, NOW()) AS hrs_since_update
FROM websites w
) main
WHERE
(CASE
WHEN visit_count >= 0 AND visit_count <= 10 AND hrs_since_update >= 4320 THEN 1
WHEN visit_count >= 11 AND visit_count <= 100 AND hrs_since_update >= 2160 THEN 1
WHEN visit_count >= 101 AND visit_count <= 500 AND hrs_since_update >= 1080 THEN 1
WHEN visit_count >= 501 AND visit_count <= 1000 AND hrs_since_update >= 720 THEN 1
WHEN visit_count >= 1001 AND visit_count <= 2000 AND hrs_since_update >= 360 THEN 1
WHEN visit_count >= 2001 AND visit_count <= 5000 AND hrs_since_update >= 168 THEN 1
WHEN visit_count >= 5001 AND visit_count <= 10000 AND hrs_since_update >= 72 THEN 1
WHEN visit_count >= 10001 AND hrs_since_update >= 24 THEN 1
ELSE 0
END) = 1;
Here's the fiddle demo: http://sqlfiddle.com/#!9/1f671/1
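To make the thresholds in the CASE expression easier to check, here is a minimal sketch of the same decision in Python. The bounds and hour values are copied from the query above; the names `UPPER_BOUNDS`, `INTERVAL_HOURS`, `scan_interval_hours` and `is_due` are made up for illustration.

```python
from bisect import bisect_left

# Inclusive upper bound of each visit-count bucket, taken from the CASE expression.
UPPER_BOUNDS = [10, 100, 500, 1000, 2000, 5000, 10000]
# Scan interval in hours for each bucket; the last entry covers counts above 10000.
INTERVAL_HOURS = [4320, 2160, 1080, 720, 360, 168, 72, 24]


def scan_interval_hours(visit_count):
    """Return the scan interval (hours) for a 30-day visit count."""
    return INTERVAL_HOURS[bisect_left(UPPER_BOUNDS, visit_count)]


def is_due(visit_count, hrs_since_update):
    """A site is due when it is at or past its interval."""
    return hrs_since_update >= scan_interval_hours(visit_count)


print(scan_interval_hours(10), scan_interval_hours(11), scan_interval_hours(10001))
print(is_due(750, 719), is_due(750, 720))
```

Keeping the bounds in one list means each bucket edge is written once, which is the same motivation as moving the ranges into a lookup table.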
First, I would make a subquery to get the visit counts from the visits table for each distinct web_id. Then, LEFT OUTER JOIN the websites table to this subquery. You can then query the result for each possible condition in your visits-to-update-frequency table, like so:
SELECT websites.* FROM websites
LEFT OUTER JOIN (
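SELECT visits.web_id, COUNT(*) AS visits_count FROM visits
WHERE visits.added_on >= DATE_SUB(NOW(), INTERVAL 30 DAY) -- only count the last 30 days (assumes an added_on column)
GROUP BY visits.web_id
) v ON v.web_id = websites.web_id
WHERE
(COALESCE(v.visits_count, 0) <= 10 AND websites.last_scanned <= DATE_SUB(NOW(), INTERVAL 4320 HOUR)) OR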
(v.visits_count BETWEEN 11 AND 100 AND websites.last_scanned <= DATE_SUB(NOW(), INTERVAL 2160 HOUR)) OR
(v.visits_count BETWEEN 101 AND 500 AND websites.last_scanned <= DATE_SUB(NOW(), INTERVAL 1080 HOUR)) OR
(v.visits_count BETWEEN 501 AND 1000 AND websites.last_scanned <= DATE_SUB(NOW(), INTERVAL 720 HOUR)) OR
(v.visits_count BETWEEN 1001 AND 2000 AND websites.last_scanned <= DATE_SUB(NOW(), INTERVAL 360 HOUR)) OR
(v.visits_count BETWEEN 2001 AND 5000 AND websites.last_scanned <= DATE_SUB(NOW(), INTERVAL 168 HOUR)) OR
(v.visits_count BETWEEN 5001 AND 10000 AND websites.last_scanned <= DATE_SUB(NOW(), INTERVAL 72 HOUR)) OR
(v.visits_count > 10000 AND websites.last_scanned <= DATE_SUB(NOW(), INTERVAL 24 HOUR));
Just an improvement on @morgb's query, using a table for the visit count ranges
SQL FIDDLE DEMO
create table visitCount (
`min` bigint(20),
`max` bigint(20),
`frequency` bigint(20)
);
SELECT main.*
FROM (
SELECT
w.web_id,
w.url,
w.last_scanned,
(SELECT COUNT(*)
FROM visits v
WHERE v.web_id = w.web_id
AND TIMESTAMPDIFF(DAY,v.added_on, NOW()) <=30
) AS visit_count,
TIMESTAMPDIFF(HOUR,w.last_scanned, NOW()) AS hrs_since_update
FROM websites w
) main inner join
visitCount v on visit_count between v.min and v.max
WHERE
main.hrs_since_update >= v.frequency;
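The range-table join can be tried outside MySQL. Below is a minimal sketch in Python that uses SQLite as a stand-in; the `scan_ranges` and `site_stats` tables, their column names and the sample rows are assumptions for illustration, and the visit count and hours since the last scan are stored directly instead of being computed with TIMESTAMPDIFF.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical lookup table: one row per visit-count range.
    CREATE TABLE scan_ranges (min_visits INTEGER, max_visits INTEGER, frequency_hours INTEGER);
    INSERT INTO scan_ranges VALUES
        (0, 10, 4320), (11, 100, 2160), (101, 500, 1080), (501, 1000, 720),
        (1001, 2000, 360), (2001, 5000, 168), (5001, 10000, 72),
        (10001, 9223372036854775807, 24);

    -- Precomputed per-site figures, standing in for the derived table "main".
    CREATE TABLE site_stats (web_id INTEGER, visit_count INTEGER, hrs_since_update INTEGER);
    INSERT INTO site_stats VALUES
        (1, 5, 5000),     -- low traffic, past its 4320-hour deadline
        (2, 5, 100),      -- low traffic, not due yet
        (3, 20000, 24),   -- high traffic, exactly at its 24-hour deadline
        (4, 750, 719);    -- one hour short of its 720-hour deadline
""")

# Join each site to the range containing its visit count, then keep the sites
# that are at or past that range's interval.
due = [row[0] for row in conn.execute("""
    SELECT s.web_id
    FROM site_stats s
    JOIN scan_ranges r ON s.visit_count BETWEEN r.min_visits AND r.max_visits
    WHERE s.hrs_since_update >= r.frequency_hours
    ORDER BY s.web_id
""")]
print(due)
```

Site 3 is only returned because the comparison is `>=`; a strict `>` would skip sites that are exactly at their deadline.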
Related
I have a table with manufacturing assembly data, including timestamps. I'm trying to determine the average interval in minutes between 'job' starts.
My query that returns the id and time looks like:
select job_id, job_started from JobTable where job_started >= '2016-07-01' and job_started <= '2016-07-31';
I'm looking for output that would be the difference in time between each row:
15
18
21
14
13
Get average interval in seconds:
select (to_seconds(max(job_started)) - to_seconds(min(job_started))) / (count(*) - 1) as average_interval_seconds
from JobTable
where date(job_started) >= '2016-07-01'
and date(job_started) <= '2016-07-31'
;
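The average query works because the consecutive gaps add up to the span between the first and last start, so the mean gap is (max - min) / (count - 1). A small Python sketch of that identity, using made-up start times chosen to reproduce the gaps of 15, 18, 21, 14 and 13 minutes from the question:

```python
from datetime import datetime

# Hypothetical job start times (not taken from any real JobTable).
starts = [
    datetime(2016, 7, 1, 8, 0),
    datetime(2016, 7, 1, 8, 15),
    datetime(2016, 7, 1, 8, 33),
    datetime(2016, 7, 1, 8, 54),
    datetime(2016, 7, 1, 9, 8),
    datetime(2016, 7, 1, 9, 21),
]
starts.sort()

# Gap between each start and the next one, in seconds.
gaps = [(later - earlier).total_seconds() for earlier, later in zip(starts, starts[1:])]

# Mean of the individual gaps ...
avg_from_gaps = sum(gaps) / len(gaps)
# ... equals the overall span divided by the number of gaps.
avg_from_span = (starts[-1] - starts[0]).total_seconds() / (len(starts) - 1)

print([g / 60 for g in gaps])
print(avg_from_gaps, avg_from_span)
```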
Get all intervals in seconds:
select to_seconds((
select t2.job_started
from JobTable t2
where t2.job_started > t1.job_started
and date(t2.job_started) <= '2016-07-31'
order by t2.job_started
limit 1
)) - to_seconds(t1.job_started) as interval_seconds
from JobTable t1
where date(t1.job_started) >= '2016-07-01'
and date(t1.job_started) <= '2016-07-31'
and t1.job_started <> (
select job_started
from JobTable
where date(job_started) <= '2016-07-31'
order by job_started desc
limit 1
)
;
http://sqlfiddle.com/#!9/1f8dc3/2
I have made a demo entity in SQL Fiddle.
My table has the following columns:
displayId | displaytime | campaignId
What I want is to select the last 7 days of entries for a particular campaignId (i.e. where campaignId = some value).
There can be multiple entries for the same date with the same campaignId, so I want to show the number of entries per date.
So if there are 7 records for 2013-03-08 and some for other dates, it should show me records like:
2013-03-02 | 1
2013-03-03 | 1
2013-03-04 | 0
2013-03-05 | 0
2013-03-06 | 0
2013-03-07 | 2
2013-03-08 | 7
My query only returns the dates that have a count greater than 0, e.g.
2013-03-08 | 7
What I have tried is as follows, but it only gives me the one record that has entries in the database:
SELECT date(a.displayTime) AS `DisplayTime`,
ifnull(l.TCount,0) AS TCount
FROM a_ad_display AS a
INNER JOIN
(SELECT count(campaignId) AS TCount,
displayTime
FROM a_ad_display
WHERE CONVERT_TZ(displaytime,'+00:00','-11:00') >= DATE_SUB(CONVERT_TZ(CURDATE(),'+00:00','-11:00') ,INTERVAL 1 DAY)
AND CONVERT_TZ(displaytime,'+00:00','-11:00') <= CONVERT_TZ(CURDATE(),'+00:00','-11:00')
AND campaignId = 20747
GROUP BY DATE(displayTime)) AS l ON date(a.displayTime) = date(l.displayTime)
GROUP BY DATE(a.displayTime)
ORDER BY a.displaytime DESC LIMIT 7
I have implemented time zone handling in my query, so if you can help me with a simple query that's fine; you don't need to include the CONVERT_TZ lines.
This is the link to my dummy entity: http://sqlfiddle.com/#!2/96600c/1
If you use a dates table, your query can be as simple as:
select dates.fulldate, count(a_ad_display.id)
from dates
left outer join a_ad_display ON dates.fulldate = date(a_ad_display.displaytime)
[AND <other conditions here>]
where dates.fulldate BETWEEN date_add(curdate(), interval -6 day) AND curdate()
group by dates.fulldate
http://sqlfiddle.com/#!2/51e17/3
Without using a dates table, here's one way. It's ugly and probably terrible for performance:
select displaytime, sum(c)
from
(
select date(displaytime) AS displaytime, count(a_ad_display.id) AS c
from a_ad_display
where date(displaytime) BETWEEN date_add(curdate(), interval -6 day) AND curdate()
group by date(displaytime)
union all
select curdate(), 0
union all
select date_add(curdate(), interval -1 day), 0
union all
select date_add(curdate(), interval -2 day), 0
union all
select date_add(curdate(), interval -3 day), 0
union all
select date_add(curdate(), interval -4 day), 0
union all
select date_add(curdate(), interval -5 day), 0
union all
select date_add(curdate(), interval -6 day), 0
) x
group by displaytime
order by displaytime
http://sqlfiddle.com/#!2/96600c/16
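Both queries rest on the same idea: start from the full list of dates, then attach whatever counts exist, defaulting to 0. A minimal Python sketch of that zero-filling step, assuming the display dates have already been fetched for one campaign (the `daily_counts` helper and the sample rows are illustrative, built to match the expected output in the question):

```python
from collections import Counter
from datetime import date, timedelta


def daily_counts(display_dates, today, days=7):
    """Return (date, count) for each of the last `days` days, oldest first."""
    counts = Counter(display_dates)
    window = [today - timedelta(days=offset) for offset in range(days - 1, -1, -1)]
    # Dates with no rows still appear, with a count of 0.
    return [(d, counts.get(d, 0)) for d in window]


today = date(2013, 3, 8)
rows = (
    [date(2013, 3, 2), date(2013, 3, 3)]
    + [date(2013, 3, 7)] * 2
    + [date(2013, 3, 8)] * 7
)
result = daily_counts(rows, today)
for d, n in result:
    print(d.isoformat(), "|", n)
```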
Simple! You can do this in PHP. First create a date array like
$date = date('Y-m-d');
$date1 = date('Y-m-d', strtotime('-1 days'));
$date2 = date('Y-m-d', strtotime('-2 days'));
$date3 = date('Y-m-d', strtotime('-3 days'));
$date4 = date('Y-m-d', strtotime('-4 days'));
$date5 = date('Y-m-d', strtotime('-5 days'));
$date6 = date('Y-m-d', strtotime('-6 days'));
$dt_arr = array($date6,$date5,$date4,$date3,$date2,$date1,$date);
for the last 7 days from today, then use foreach and run the query with the date field in the WHERE clause inside the loop, like
foreach ($dt_arr as $value)
{
    // "table", "dt_field" and "value" are placeholder names; the date literal must be quoted
    $qry = mysqli_fetch_array(mysqli_query($link, "select * from `table` where dt_field = '$value'"));
    if (!$qry || $qry['value'] == '')
    {
        $valfield = 0;
        $dtfiled = $value;
    }
}
I can select the total number of unique IP addresses within a single time range
SELECT COUNT(DISTINCT ip_address) as ip_addr, exec_datetime
FROM requests
WHERE exec_datetime >= NOW() - INTERVAL 1 DAY;
ip_addr exec_datetime
45 12/10/2012 5:21
How do I return a result set for the following clauses in a single query...
WHERE exec_datetime >= NOW() - INTERVAL 1 DAY;
WHERE exec_datetime >= NOW() - INTERVAL 2 DAY;
...
WHERE exec_datetime >= NOW() - INTERVAL 14 DAY;
...so that the result set would look like this?
ip_addr exec_datetime
45 11/26/2012 5:21
85 11/27/2012 5:21
130 11/28/2012 5:21
170 11/29/2012 5:21
... ...
I would use something like this:
SELECT
COUNT(DISTINCT requests.ip_address) ip_addr,
days.day
FROM
(SELECT now()-INTERVAL 1 DAY as day UNION
SELECT now()-INTERVAL 2 DAY UNION
...
SELECT now()-INTERVAL 14 DAY) days
INNER JOIN requests
on requests.exec_datetime >= days.day
GROUP BY
days.day
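The join pairs every cutoff with all requests at or after it, so each row is a cumulative distinct count for a growing lookback window. A minimal Python sketch of the same computation, using invented sample requests and a hypothetical `distinct_ips_per_window` helper:

```python
from datetime import datetime, timedelta


def distinct_ips_per_window(requests, now, max_days=14):
    """For each lookback of 1..max_days days, count distinct IPs seen since the cutoff."""
    result = []
    for days in range(1, max_days + 1):
        cutoff = now - timedelta(days=days)
        ips = {ip for ip, ts in requests if ts >= cutoff}
        result.append((days, len(ips)))
    return result


now = datetime(2012, 12, 10, 5, 21)
# (ip_address, exec_datetime) pairs; sample data only.
requests = [
    ("10.0.0.1", now - timedelta(hours=3)),
    ("10.0.0.2", now - timedelta(hours=30)),
    ("10.0.0.1", now - timedelta(days=2, hours=1)),
    ("10.0.0.3", now - timedelta(days=2, hours=5)),
]
counts = distinct_ips_per_window(requests, now, max_days=3)
print(counts)
```

One difference to keep in mind: the INNER JOIN in the query drops any cutoff with no matching requests, whereas this sketch reports such a window with a count of 0.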
I'd like to merge the results of the following three select statements horizontally. I tried using joins but no idea how to proceed since it involves COUNT and GROUP BY too.
SELECT DATE(created_at) as date,COUNT(*) as countd1 FROM b_users WHERE last_loggedin_at < DATE_ADD(created_at,INTERVAL 1 DAY) GROUP BY DATE(created_at)
SELECT DATE(created_at) as date,COUNT(*) as countd2 FROM b_users WHERE last_loggedin_at < DATE_ADD(created_at,INTERVAL 2 DAY) GROUP BY DATE(created_at)
SELECT DATE(created_at) as date,COUNT(*) as countd3 FROM b_users WHERE last_loggedin_at < DATE_ADD(created_at,INTERVAL 3 DAY) GROUP BY DATE(created_at)
The individual results would be
date countd1
2011-12-01 100
2011-12-02 120
2011-12-03 130
date countd2
2011-12-01 200
2011-12-02 220
2011-12-03 230
date countd3
2011-12-01 300
2011-12-02 320
2011-12-03 330
But I'd like to merge them so that I'll get the following result
date countd1 countd2 countd3
2011-12-01 100 200 300
2011-12-02 120 220 320
2011-12-03 130 230 330
How do I do this?
Is it possible to do something like the query below?
SELECT a, COUNT(b where condition), COUNT(c where condition) FROM table GROUP BY a
Update
biziclop provided a great workaround
SELECT DATE(created_at) AS date,
SUM(last_loggedin_at < DATE_ADD( created_at,INTERVAL 1 DAY )) AS countd1,
SUM(last_loggedin_at < DATE_ADD( created_at,INTERVAL 2 DAY )) AS countd2,
SUM(last_loggedin_at < DATE_ADD( created_at,INTERVAL 3 DAY )) AS countd3
FROM b_users GROUP BY DATE(created_at)
Solved, thank you! :)
In MySQL the results of comparisons are 1 if true, 0 if false, so you could SUM() them:
SELECT
DATE(created_at) AS date,
SUM( last_loggedin_at < DATE_ADD( created_at,INTERVAL 1 DAY )) AS countd1,
SUM( last_loggedin_at < DATE_ADD( created_at,INTERVAL 2 DAY )) AS countd2,
SUM( last_loggedin_at < DATE_ADD( created_at,INTERVAL 3 DAY )) AS countd3
FROM b_users
GROUP BY DATE(created_at)
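The same trick of summing a comparison can be tried quickly with SQLite from Python, since SQLite also evaluates a comparison to 1 or 0. This is only a sketch under assumptions: the sample rows are invented, and `julianday()` differences (in days) stand in for MySQL's `DATE_ADD(... INTERVAL n DAY)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE b_users (created_at TEXT, last_loggedin_at TEXT);
    INSERT INTO b_users VALUES
        ('2011-12-01 09:00:00', '2011-12-01 15:00:00'),  -- last login 6 hours after signup
        ('2011-12-01 10:00:00', '2011-12-02 22:00:00'),  -- 36 hours after signup
        ('2011-12-01 11:00:00', '2011-12-05 11:00:00'),  -- 4 days after signup
        ('2011-12-02 08:00:00', '2011-12-04 09:00:00');  -- 49 hours after signup
""")

# Each SUM adds 1 for every row where the comparison is true, giving one
# conditional count per column in a single pass over the table.
rows = conn.execute("""
    SELECT DATE(created_at) AS signup_date,
           SUM(julianday(last_loggedin_at) - julianday(created_at) < 1) AS countd1,
           SUM(julianday(last_loggedin_at) - julianday(created_at) < 2) AS countd2,
           SUM(julianday(last_loggedin_at) - julianday(created_at) < 3) AS countd3
    FROM b_users
    GROUP BY DATE(created_at)
    ORDER BY 1
""").fetchall()
print(rows)
```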
Can anyone help me with this MySQL query?
SELECT p.ProductID,
p.StoreID,
p.DiscountPercentage
FROM Products p
WHERE p.IsSpecial = 1
AND p.SpecialDate >= date_sub(now(),interval 15 minute)
AND p.DiscountPercentage >= ?DiscountPercentage
AND p.ProductID NOT IN (SELECT lf.LiveFeedID
From LiveFeed lf
WHERE p.ProductID = lf.ProductID
AND lf.DateAdded >= date_sub(now(),interval 30 day))
AND p.StoreID NOT IN (SELECT lf.LiveFeedID
From LiveFeed lf
WHERE p.StoreID = lf.StoreID
AND lf.DateAdded >= date_sub(now(),interval 6 hour))
ORDER BY p.StoreID, p.DiscountPercentage DESC
I'm trying to select products where the ProductID is not in the LiveFeed table within the last 30 days and where the StoreID is not in the LiveFeed table within the last 6 hours, but it does not seem to be working. Any idea what I'm doing wrong?
At a glance, it would appear that your first subquery should be selecting ProductID, not LiveFeedID, and your second subquery should be selecting StoreID, not LiveFeedID.
I'm too late:
SELECT p.ProductID,
p.StoreID,
p.DiscountPercentage
FROM Products p
WHERE p.IsSpecial = 1
AND p.SpecialDate >= date_sub(now(),interval 15 minute)
AND p.DiscountPercentage >= ?DiscountPercentage
AND p.ProductID NOT IN (SELECT lf.productid
FROM LIVEFEED lf
WHERE lf.DateAdded >= DATE_SUB(NOW(), INTERVAL 30 DAY))
AND p.storeid NOT IN (SELECT lf.storeid
FROM LIVEFEED lf
WHERE lf.DateAdded >= DATE_SUB(NOW(), INTERVAL 6 HOUR))
ORDER BY p.StoreID, p.DiscountPercentage DESC
You were using EXISTS syntax with a correlated subquery...
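To see the difference concretely, here is a small sketch in Python with SQLite standing in for MySQL. The rows are invented and the date filters are left out, so it only shows how the three subquery forms behave: the original one that returns LiveFeedID, the uncorrelated NOT IN on ProductID, and the correlated NOT EXISTS equivalent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (ProductID INTEGER, StoreID INTEGER);
    CREATE TABLE LiveFeed (LiveFeedID INTEGER, ProductID INTEGER, StoreID INTEGER);
    INSERT INTO Products VALUES (1, 10), (2, 20), (3, 30);
    -- Products 1 and 3 already appear in the feed; product 2 does not.
    INSERT INTO LiveFeed VALUES (2, 1, 10), (7, 3, 40);
""")


def product_ids(where_clause):
    sql = "SELECT p.ProductID FROM Products p WHERE " + where_clause + " ORDER BY p.ProductID"
    return [row[0] for row in conn.execute(sql)]


# Original form: compares ProductID with LiveFeedID values, so nothing is excluded here.
original = product_ids(
    "p.ProductID NOT IN (SELECT lf.LiveFeedID FROM LiveFeed lf WHERE p.ProductID = lf.ProductID)"
)
# Uncorrelated NOT IN on the right column.
not_in = product_ids("p.ProductID NOT IN (SELECT lf.ProductID FROM LiveFeed lf)")
# Correlated NOT EXISTS, the form the original query's correlation was closer to.
not_exists = product_ids(
    "NOT EXISTS (SELECT 1 FROM LiveFeed lf WHERE lf.ProductID = p.ProductID)"
)
print(original, not_in, not_exists)
```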
I'm trying to get the top discount for each store.
In that case, use:
SELECT p.StoreID,
MAX(p.DiscountPercentage)
FROM Products p
WHERE p.IsSpecial = 1
AND p.SpecialDate >= date_sub(now(),interval 15 minute)
AND p.DiscountPercentage >= ?DiscountPercentage
AND p.ProductID NOT IN (SELECT lf.productid
FROM LIVEFEED lf
WHERE lf.DateAdded >= DATE_SUB(NOW(), INTERVAL 30 DAY))
AND p.storeid NOT IN (SELECT lf.storeid
FROM LIVEFEED lf
WHERE lf.DateAdded >= DATE_SUB(NOW(), INTERVAL 6 HOUR))
GROUP BY p.storeid
ORDER BY p.StoreID, MAX(p.DiscountPercentage) DESC