MySQL date range query slow

I have two MySQL tables: spot_times (10k rows) and visit_times (5.3 million rows).
I'm trying to write a query that joins spot_times.spot_date to visit_times.visit_date based on a 10 minute window.
Both date fields are indexed and of type datetime.
I have written the following SQL, which takes hours to run.
SELECT spot_date, COUNT(visit_date) total_visits
FROM spot_times st
LEFT JOIN visit_times v
  ON v.visit_date BETWEEN st.spot_date AND st.spot_date + INTERVAL 10 MINUTE
GROUP BY 1;
My explain plan suggests that the query is not using the indexes.
Please help.

It is notoriously difficult to get useful index performance out of range queries on large datasets.
You might be able to get some benefit out of partitioning visit_times by date range: https://dev.mysql.com/doc/refman/8.0/en/partitioning-range.html

Posting my solution in case it is useful for anyone who comes across the same issue.
I started off by adding an auto_increment column visit_id to the visit_times table, assigned in visit_date order.
The idea is to get the visit_id nearest to st.spot_date and the one nearest to st.spot_date + interval 10 minute. Subtracting one visit_id from the other then gives the total visits in that range.
I created a function that returns the visit_id for a date and an interval. The function uses the visit_date index and loops, adding a second on every iteration, until it finds a record.
DELIMITER //
DROP FUNCTION IF EXISTS `spot_time_function` //
CREATE FUNCTION `spot_time_function`(p_datetime DATETIME, p_time INT)
RETURNS INT
BEGIN
    DECLARE v_id INT;
    DECLARE z INT;
    SET z = 0;
    time_loop: LOOP
        SELECT visit_id INTO v_id
        FROM visit_times
        WHERE visit_date = p_datetime + INTERVAL p_time MINUTE + INTERVAL z SECOND
        LIMIT 1;
        IF v_id IS NOT NULL THEN
            LEAVE time_loop;
        END IF;
        SET z = z + 1;
    END LOOP;
    RETURN v_id;
END //
DELIMITER ;
So the final query looks like this:
Select
spot_date,
spot_time_function(spot_date,10) - spot_time_function(spot_date,0) as total_visit
From spot_times;
The above query runs in 0.110 sec.
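For anyone who would rather do the counting outside the database, the same id-subtraction idea can be sketched in Python: with the visit times sorted, the number of visits in a window is the difference between two binary-search positions. This is only a sketch under assumptions: the names count_visits and WINDOW and the sample timestamps are made up for illustration, the visit times are assumed to fit in memory, and the window is half-open (a visit exactly 10 minutes after the spot is not counted), which is what subtracting the two ids gives as well.

```python
from bisect import bisect_left
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)  # hypothetical name for the 10 minute window


def count_visits(sorted_visits, spot, window=WINDOW):
    """Count visits v with spot <= v < spot + window.

    sorted_visits must be sorted ascending, like visit_id ordered by visit_date.
    """
    first = bisect_left(sorted_visits, spot)           # first visit >= spot
    after = bisect_left(sorted_visits, spot + window)  # first visit >= spot + window
    return after - first


# Invented sample data, standing in for visit_times and spot_times.
visits = sorted([
    datetime(2015, 1, 26, 8, 0, 0),
    datetime(2015, 1, 26, 8, 4, 30),
    datetime(2015, 1, 26, 8, 9, 59),
    datetime(2015, 1, 26, 8, 10, 0),
    datetime(2015, 1, 26, 9, 0, 0),
])
spots = [datetime(2015, 1, 26, 8, 0, 0), datetime(2015, 1, 26, 8, 30, 0)]
totals = {spot: count_visits(visits, spot) for spot in spots}
```

Each spot costs two binary searches instead of a scan, which is roughly why the function-based query above is so much faster than the range join.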

Related

Count mysql rows by hourly for given date

I have a MySQL table with a date column. The column's data type is TIMESTAMP and its default is CURRENT_TIMESTAMP, so it records both date and time.
Now I want to count my rows per hour for a given day.
As an example:
        8am  9am  10am  11am
user1   15   10   11    10
user2   10   10   20    30
Every hour's count should be recorded separately like this.
I tried this, but it's not working:
SELECT COUNT(*) FROM mytable
WHERE `date` = '2015-01-26'
GROUP BY HOUR(`TIMESTAMP`)
How can I achieve this?
I have no idea how to group by user. A sproc is also okay.
I made a sproc like this, but it contains errors. Can someone please help me? I want the count separated into 9 hours.
DELIMITER $$
CREATE DEFINER=`root`@`localhost` PROCEDURE `test22`(IN datestamp DATE)
BEGIN
SELECT username,
COUNT(if(disblid,1,null)) '8:00 AM', where time between '08:00' and '09:00':
COUNT(if(disblid,1,null)) '9:00 AM' , where time between '09:00' and '10:00';
FROM claimloans
WHERE DATE(date) = datestamp
group by Username;
END
Thanks to everyone who helped me. I came up with a sproc that works perfectly fine.
DELIMITER $$
CREATE DEFINER=`root`@`localhost` PROCEDURE `claimscounter`(IN datestamp DATE)
BEGIN
SELECT username,
COUNT(IF(HOUR(date)=0,1,NULL)) AS '12am',
COUNT(IF(HOUR(date)=1,1,NULL)) AS '1am',
COUNT(IF(HOUR(date)=2,1,NULL)) AS '2am',
COUNT(IF(HOUR(date)=3,1,NULL)) AS '3am',
COUNT(IF(HOUR(date)=4,1,NULL)) AS '4am',
COUNT(IF(HOUR(date)=5,1,NULL)) AS '5am',
COUNT(IF(HOUR(date)=6,1,NULL)) AS '6am',
COUNT(IF(HOUR(date)=7,1,NULL)) AS '7am',
COUNT(IF(HOUR(date)=8,1,NULL)) AS '8am',
COUNT(IF(HOUR(date)=9,1,NULL)) AS '9am',
COUNT(IF(HOUR(date)=10,1,NULL)) AS '10am',
COUNT(IF(HOUR(date)=11,1,NULL)) AS '11am',
COUNT(IF(HOUR(date)=12,1,NULL)) AS '12pm',
COUNT(IF(HOUR(date)=13,1,NULL)) AS '1pm',
COUNT(IF(HOUR(date)=14,1,NULL)) AS '2pm',
COUNT(IF(HOUR(date)=15,1,NULL)) AS '3pm',
COUNT(IF(HOUR(date)=16,1,NULL)) AS '4pm',
COUNT(IF(HOUR(date)=17,1,NULL)) AS '5pm',
COUNT(IF(HOUR(date)=18,1,NULL)) AS '6pm',
COUNT(IF(HOUR(date)=19,1,NULL)) AS '7pm',
COUNT(IF(HOUR(date)=20,1,NULL)) AS '8pm',
COUNT(IF(HOUR(date)=21,1,NULL)) AS '9pm',
COUNT(IF(HOUR(date)=22,1,NULL)) AS '10pm',
COUNT(IF(HOUR(date)=23,1,NULL)) AS '11pm'
FROM claimloans
WHERE DATE(date) = datestamp
group by username;
END
But now I have another small problem. This counts all the hours: if there is no entry for some hour, it shows a count of zero. I want to show only the hours that have records. Can someone help me with this?
Thanks
I would solve this as a view, like so:
CREATE TABLE foo (id int not null, val timestamp);
CREATE VIEW foo_by_hours AS (
SELECT
id,
DATE(val) AS 'day',
COUNT(IF(HOUR(val)=0,1,NULL)) AS '12am',
COUNT(IF(HOUR(val)=1,1,NULL)) AS '1am',
COUNT(IF(HOUR(val)=2,1,NULL)) AS '2am',
COUNT(IF(HOUR(val)=3,1,NULL)) AS '3am',
...
FROM foo
GROUP BY id, day);
SELECT * FROM foo_by_hours;
Full example on SQL Fiddle
I also added a view which uses SUM instead of COUNT. The result is the same, it's just a different way of doing it.
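On the follow-up about showing only the hours that actually have records: one option is to drop the fixed hour columns and group by user and hour instead, so an empty hour simply produces no row. Below is a small sketch of that in Python against an in-memory SQLite database, purely so it is self-contained. The table and column names (claimloans, username, date) come from the question; the sample rows are invented. strftime('%H', ...) and date(...) are SQLite functions; in MySQL, HOUR(date) and DATE(date) would take their place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE claimloans (username TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO claimloans VALUES (?, ?)",
    [
        ("user1", "2015-01-26 08:05:00"),
        ("user1", "2015-01-26 08:40:00"),
        ("user1", "2015-01-26 10:15:00"),
        ("user2", "2015-01-26 09:30:00"),
        ("user2", "2015-01-27 09:30:00"),  # different day, filtered out
    ],
)

# One row per (user, hour) that has at least one record on the given day.
rows = conn.execute(
    """
    SELECT username,
           CAST(strftime('%H', date) AS INTEGER) AS hour,
           COUNT(*) AS total
    FROM claimloans
    WHERE date(date) = ?
    GROUP BY username, hour
    ORDER BY username, hour
    """,
    ("2015-01-26",),
).fetchall()
```

The result comes back as rows rather than columns, so turning it into the 8am/9am/... layout is left to whatever displays the report.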

MySQL date difference check within a trigger

Within a trigger, before I insert some data into a table, I want to check whether the difference between the dates of the events I am entering is greater than or equal to 1 day. Only one event can take place in each club per day.
Sample story
If there already is 2014-01-01 19:00:00 date in database and I'm trying to insert another record with 2014-01-01 date (hour does not matter), it should not allow it.
Partial code from the trigger
DECLARE k INT DEFAULT 0;
/* This is where I get the error, ABS is to make it always positive to go through
checking, so that it wont matter whether the NEW date is before or after */
SELECT ABS(DATEDIFF(DATE_FORMAT(`performance_date`, '\'%Y-%m-%d %H:%i:%s\''),
DATE_FORMAT(NEW.`performance_date`, '\'%Y-%m-%d %H:%i:%s\''))) INTO k;
/* Below code is out of scope for this question */
IF k = 0 THEN
SIGNAL SQLSTATE '58005'
SET MESSAGE_TEXT = 'Wrong! Only 1 performance in 1 club is allowed per day! Change your date, or club!';
END IF;
Error Code: 1054. Unknown column 'performance_date' in 'field list'
I've tried something as simple as:
...DATEDIFF(`performance_date`, NEW.`performance_date`)
I'm assuming that you mean one entry per day, and that the performance_date column is of DATETIME or TIMESTAMP type.
You can use a SELECT ... INTO var_list query to COUNT how many entries already in the database fall on the same day:
DECLARE k INT DEFAULT 0;
/* Count number of performances occurring on the same date as the
performance being inserted. The range is half-open, so a performance at
exactly midnight on the following day is not counted. */
SELECT COUNT(*)
FROM tbl
WHERE performance_date >= DATE(NEW.`performance_date`)
AND performance_date < DATE(NEW.`performance_date`) + INTERVAL 1 DAY
INTO k;
/* If k is not 0, error as there is already a performance */
IF k != 0 THEN
SIGNAL SQLSTATE '58005'
SET MESSAGE_TEXT = 'Wrong! Only 1 performance in 1 club is allowed per day! Change your date, or club!';
END IF;
For clarity, if you have a performance with performance_date 2014-01-01 19:00:00 and you insert a new performance with date 2014-01-01 08:30:00 (for example), the above code will run this query, which returns a COUNT of 1 and so causes the trigger to raise that error:
SELECT COUNT(*)
FROM tbl
WHERE performance_date >= DATE("2014-01-01 08:30:00")
AND performance_date < DATE("2014-01-01 08:30:00") + INTERVAL 1 DAY
# The two lines above will become:
# WHERE performance_date >= "2014-01-01"
# AND performance_date < "2014-01-02"
INTO k
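To see the same-day check behave end to end, here is a self-contained sketch in Python using an in-memory SQLite database. Several things in it are assumptions rather than anything from the question: the table name performances, the club_id column (added because the rule is per club), and the SQLite trigger syntax with RAISE(ABORT, ...), which plays the role of SIGNAL in MySQL. The date test is the half-open range "from midnight of that day up to, but not including, midnight of the next day".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE performances (
        club_id INTEGER NOT NULL,          -- hypothetical column
        performance_date TEXT NOT NULL     -- 'YYYY-MM-DD HH:MM:SS'
    );

    CREATE TRIGGER one_performance_per_day
    BEFORE INSERT ON performances
    WHEN EXISTS (
        SELECT 1
        FROM performances
        WHERE club_id = NEW.club_id
          AND performance_date >= date(NEW.performance_date)
          AND performance_date <  date(NEW.performance_date, '+1 day')
    )
    BEGIN
        SELECT RAISE(ABORT, 'Only 1 performance in 1 club is allowed per day');
    END;
    """
)


def try_insert(club_id, performance_date):
    """Return True if the row was inserted, False if the trigger rejected it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO performances VALUES (?, ?)", (club_id, performance_date)
            )
        return True
    except sqlite3.DatabaseError:
        return False


results = [
    try_insert(1, "2014-01-01 19:00:00"),  # first performance of the day
    try_insert(1, "2014-01-01 08:30:00"),  # same club, same day
    try_insert(2, "2014-01-01 08:30:00"),  # other club, same day
    try_insert(1, "2014-01-02 00:00:00"),  # same club, next day at midnight
]
```

The last insert is the case an inclusive BETWEEN would get wrong: midnight of the next day belongs to the next day, so it should be allowed.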

MySQL date intervals generation

I am working with a MySQL database and I have to generate date intervals for a specified period (given by a start and stop date) with a specified step (for example, one day).
I have written a stored procedure that creates a temporary table and populates it with the intervals.
DELIMITER $$
CREATE PROCEDURE showu(IN start DATE, IN stop DATE)
BEGIN
    CREATE TEMPORARY TABLE intervals(single_day DATE);
    next_date: LOOP
        IF start > stop THEN
            LEAVE next_date;
        END IF;
        INSERT INTO intervals(single_day) VALUES(start);
        SET start = DATE_ADD(start, INTERVAL 1 DAY);
    END LOOP next_date;
END$$
DELIMITER ;
I want to use this temporary table in join queries. However, I ran into a problem: calling the procedure with call showu('2008-01-09', '2010-02-09'); takes approximately 30 seconds. Why does it take so long? Is it possible to improve it? If this approach is wrong, how else can I solve my problem?
From comments:
2 big problems: 1. I don't know the exact value of the step (one day, one month or one hour).
Create one big table like this once (not temporary):
full_date | year | month | day | full_time | hour | minute | is_weekend | whatever
----------------------------------------------------------------------------------
...
Create as many indexes as needed and you will have a very performant and powerful Swiss Army knife for all sorts of reports.
Note: You might consider not having time and date in the same table. This is just to simplify the example.
Your second problem
I will clog my database with not model data.
is no problem. Databases are there to hold data. That's it. If you have problems with space or whatever, the solution is to get more space, not to limit your ability to work efficiently.
That being said, here are some examples of how to use this table.
You need dates:
SELECT
*
FROM
(
SELECT full_date AS your_step
FROM your_new_swiss_army_knife
WHERE `year` = 2012
GROUP BY full_date
) dates
LEFT JOIN your_tables_that_you_want_to_build_a_report_on y ON dates.your_step = y.date
Same with months:
SELECT
*
FROM
(
SELECT CONCAT(year, '-', month) AS your_step
FROM your_new_swiss_army_knife
WHERE full_date BETWEEN this AND that
GROUP BY year, month
) dates
LEFT JOIN your_tables_that_you_want_to_build_a_report_on y ON dates.your_step = CONCAT(YEAR(y.date), '-', MONTH(y.date))
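If you would rather fill such a table from a script than from a stored procedure, here is a rough sketch in Python. It uses an in-memory SQLite database only so that it runs as is. The helper name steps is made up, and the intervals table mirrors the one from the question. The step is a parameter, so day and hour steps use the same code; month steps are not a fixed length and would need different arithmetic. All rows go in as one batch inside a single transaction. If the 30 seconds in the question come from every INSERT being committed separately (an assumption, the question does not say), batching like this is the usual remedy.

```python
import sqlite3
from datetime import date, datetime, timedelta


def steps(start, stop, step):
    """Yield start, start + step, ... for as long as the value is <= stop."""
    current = start
    while current <= stop:
        yield current
        current += step


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE intervals (single_day TEXT PRIMARY KEY)")

start, stop = date(2008, 1, 9), date(2010, 2, 9)
with conn:  # one transaction for the whole batch
    conn.executemany(
        "INSERT INTO intervals (single_day) VALUES (?)",
        ((d.isoformat(),) for d in steps(start, stop, timedelta(days=1))),
    )

row_count = conn.execute("SELECT COUNT(*) FROM intervals").fetchone()[0]

# The same generator works for other fixed steps, e.g. every two hours.
hourly = list(
    steps(datetime(2008, 1, 9, 0), datetime(2008, 1, 9, 6), timedelta(hours=2))
)
```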

Can I do that in the SQL statement?

Let's say I have a post table, and I want to query all of today's posts. But if there are fewer than 10 posts today, I also want to get yesterday's posts to fill up the result. If there are 10 or more posts today, there is no need to query yesterday's posts. If a single SQL statement can't do it, is the only way to achieve this to fetch the posts manually? Thank you.
The database is MySQL.
Let me clarify the question with some typical examples:
If today has only 5 posts and yesterday has 10 posts,
return: 5 posts from today and 5 posts from yesterday.
If today has 12 posts
and yesterday has 10 posts,
return: 12 posts from today.
If today has 10 posts and yesterday has 10 posts,
return: 10 posts from today.
If today has only 2 posts, yesterday has 5 posts, and the day before yesterday has 5 posts,
return: 2 posts from today, 5 posts from yesterday and 3 posts from the day before yesterday.
You can try
select count(*) from post_table
where date = todays_date
and if the result is > 10 then
select * from post_table
where date = today's date
else
select * from post_table
order by date desc
limit 10
Just another idea, a little bit shorter:
set @i = 0;
select *, @i := @i + 1
from post_table
where @i < 10 or date = today
order by date desc;
Not sure it is very effective.
Update: it is fast!
I tested it on a sample like this:
create table a(i int primary key, d date not null, index idx(d));
set @i = 0;
insert into a(i, d)
select @i := @i + 1, adddate(curdate(), interval -(@i % 1000) day)
from <100 records> a, <100 records> b, <100 records> c
A tiny development on Jan S's solution (combines the two conditional SELECTs into one with a parametrised LIMIT):
SELECT @count := COUNT(*)
FROM post_table
WHERE date = today;
IF @count < 10 SET @count = 10;
SELECT *
FROM post_table
ORDER BY date DESC
LIMIT @count;
UPDATE
As stated in the documentation:
The LIMIT clause can be used to constrain the number of rows returned by the SELECT statement. LIMIT takes one or two numeric arguments, which must both be nonnegative integer constants, with these exceptions:
Within prepared statements, LIMIT parameters can be specified using ? placeholder markers.
Within stored programs, LIMIT parameters can be specified using integer-valued routine parameters or local variables.
That means, you can only use code like above in a stored procedure, not in a plain query you are issuing in your client application.
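From a client application, the same two-step approach can use the ? placeholder the documentation mentions. Here is a minimal sketch in Python, run against an in-memory SQLite database so it is self-contained. The names recent_posts, make_db and MIN_POSTS, the id column and the sample dates are all made up for illustration, and it assumes there are no posts dated after today.

```python
import sqlite3

MIN_POSTS = 10  # hypothetical name for the "at least 10 posts" rule


def recent_posts(conn, today):
    """Return all of today's posts, topped up with older ones to MIN_POSTS."""
    (todays_count,) = conn.execute(
        "SELECT COUNT(*) FROM post_table WHERE date = ?", (today,)
    ).fetchone()
    limit = max(todays_count, MIN_POSTS)
    return conn.execute(
        "SELECT id, date FROM post_table ORDER BY date DESC, id DESC LIMIT ?",
        (limit,),
    ).fetchall()


def make_db(posts_per_day):
    """Build an in-memory table from {date: number_of_posts} (sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE post_table (id INTEGER PRIMARY KEY, date TEXT)")
    for day, n in posts_per_day.items():
        conn.executemany("INSERT INTO post_table (date) VALUES (?)", [(day,)] * n)
    return conn


few = recent_posts(make_db({"2012-05-02": 5, "2012-05-01": 10}), "2012-05-02")
many = recent_posts(make_db({"2012-05-02": 12, "2012-05-01": 10}), "2012-05-02")
```

With 5 posts today this returns 10 rows (5 from today, 5 from yesterday); with 12 posts today it returns just those 12.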

Group by day and still show days without rows?

I have a log table with a date field called logTime. I need to show the number of rows within a date range and the number of records per day. The issue is that I still want to show days that do not have records.
Is it possible to do this only with SQL?
Example:
SELECT logTime, COUNT(*) FROM logs WHERE logTime >= '2011-02-01' AND logTime <= '2011-02-04' GROUP BY DATE(logTime);
It returns something like this:
+---------------------+----------+
| logTime | COUNT(*) |
+---------------------+----------+
| 2011-02-01 | 2 |
| 2011-02-02 | 1 |
| 2011-02-04 | 5 |
+---------------------+----------+
3 rows in set (0,00 sec)
I would like to show the day 2011-02-03 too.
MySQL will not invent rows for you, so if the data is not there, they will naturally not be shown.
You can create a calendar table and join against it:
create table calendar (
day date primary key
);
Fill this table with dates (easy with a stored procedure, or just some general scripting) up to around 2038; something else will likely break before that becomes a problem.
Your query then becomes e.g.
SELECT cal.day, COUNT(l.logTime)
FROM calendar cal LEFT JOIN logs l ON cal.day = DATE(l.logTime)
WHERE cal.day >= '2011-02-01' AND cal.day <= '2011-02-04' GROUP BY cal.day;
Note the COUNT(l.logTime) rather than COUNT(*): it makes days without log rows report 0 instead of 1.
Now, you could extend the calendar table with other columns that tell you the month, year, week etc., so you can easily produce statistics for other time units. (Purists might argue that the calendar table should have an integer id primary key that the logs table references instead of a date.)
In order to accomplish this, you need to have a table (or derived table) which contains the dates that you can then join from, using a LEFT JOIN.
SQL operates on the concept of mathematical sets, and if you don't have a set of data, there is nothing to SELECT.
If you want more details, please comment accordingly.
I'm not sure if this is a problem that should be solved by SQL. As others have shown, this requires maintaining a second table that contains all of the individual dates of a given time span, which must be updated every time that time span grows (which presumably is "always" if that time span extends to the current time).
Instead, you should use application code to inspect the results of the query and inject dates as necessary. It's completely dynamic and requires no intermediate table. Since you specified no language, here's pseudo code:
EXECUTE QUERY `SELECT logTime, COUNT(*) FROM logs WHERE logTime >= '2011-02-01' AND logTime <= '2011-02-04' GROUP BY DATE(logTime);`
FOREACH row IN query result
    WHILE (date in next row) - (date in this row) > 1 day THEN
        CREATE new row with date = `date in this row + 1 day`, count = `0`
        INSERT new row IN query result AFTER this row
        ADVANCE LOOP INDEX TO new row (`this row` is now the `new row`)
    END WHILE
END FOREACH
Or something like that
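For what it's worth, here is one way the pseudo code could look in Python. It is a sketch, not the only way to do it: the function name fill_missing_days is made up, and the query result is assumed to have already been read into a dict that maps each date to its count. Instead of comparing neighbouring rows, it walks every day of the requested range, which also covers gaps at the very start and end of the range.

```python
from datetime import date, timedelta


def fill_missing_days(counts, first_day, last_day):
    """Return [(day, count)] for every day in the range, using 0 for gaps.

    counts maps a date to the COUNT(*) the query returned for that day.
    """
    filled = []
    day = first_day
    while day <= last_day:
        filled.append((day, counts.get(day, 0)))
        day += timedelta(days=1)
    return filled


# The rows from the example output in the question.
query_result = {
    date(2011, 2, 1): 2,
    date(2011, 2, 2): 1,
    date(2011, 2, 4): 5,
}
report = fill_missing_days(query_result, date(2011, 2, 1), date(2011, 2, 4))
```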
DECLARE @TOTALCount INT
DECLARE @FromDate DateTime = GetDate() - 5
DECLARE @ToDate DateTime = GetDate()
SET @FromDate = DATEADD(DAY,-1,@FromDate)
Select @TOTALCount= DATEDIFF(DD,@FromDate,@ToDate);
WITH d AS
(
SELECT top (@TOTALCount) AllDays = DATEADD(DAY, ROW_NUMBER()
OVER (ORDER BY object_id), REPLACE(@FromDate,'-',''))
FROM sys.all_objects
)
SELECT AllDays From d