Having a Date And DateTime Comparison problem in my SQL

Been testing this over and over, and it fails at the date comparison (item.id_type seems to work fine).
request.date has the datatype DATETIME.
SELECT request.id, request.date, request.total_price,
item.cod_GERFIP,item.price,item.name,request_item.quantity,
section.name AS section,user.firstname,user.lastname
FROM ((((`request_item`
INNER JOIN `request` ON request_item.id_request = request.id)
INNER JOIN `item` ON request_item.id_item = item.cod_GERFIP)
INNER JOIN `user` ON request.id_user = user.id)
INNER JOIN `section` ON user.section = section.id)
WHERE request.date >= '2019-01-09' AND request.date <= '2019-01-10'
AND item.id_type = '1'
ORDER BY request.date DESC

Just a guess, because you haven't explained what or how the query fails, but to my eyes this condition does not look correct:
request.date <= '2019-01-10'
It's a common mistake to expect a condition like this, when used as part of a range, to include all records where the date part of a datetime field is 2019-01-10. That is, if we have an example value in the database of 1 PM on the same day (2019-01-10 13:00:00), the expectation is that this value will narrow to match the 2019-01-10 literal in the query, the two values will be equal, and so it will meet the condition.
It does not work this way.
Instead, the 2019-01-10 literal in the query is widened to a full DateTime that looks more like this: 2019-01-10 00:00:00.000. Now the 1 PM value from the table is compared with this full date time, and it fails the condition.
It's much more common for a date range to compare using an exclusive upper bound set for one day in the future:
request.date < '2019-01-11'
Alternatively, you may be tempted to do this:
request.date <= '2019-01-10 23:59:59.999'
It will even work most of the time. Just be warned that in the (rare) case of leap seconds, you can still end up with incorrect results that way.
You may also be tempted to do something like this:
DATE(request.date) <= '2019-01-10'
This works, but it's not recommended because it prevents the use of any index you might have on the request.date field, and that cuts at the core of database performance.
Or maybe the problem is even simpler. With the start of the range at 2019-01-09, maybe you wanted to get the records for exactly one day, and are surprised to see a few values from midnight on 2019-01-10. Again, the solution is an exclusive boundary at the top of the range:
request.date < '2019-01-10'
As a complete side note to the question, I'm very sad the SQL BETWEEN operator is inclusive at both ends of the range. This may make sense for numeric or string data, but for date values, defining an exclusive upper bound for the BETWEEN operator would have made much more sense.
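To make the widening concrete, here is a minimal Python sketch (not MySQL itself) that models a bare date literal as midnight of that day and compares an inclusive upper bound with the half-open form. The helper `widen` and the sample values are made up for illustration.

```python
from datetime import date, datetime, time, timedelta

def widen(day):
    """Model how a bare date literal is compared: as midnight of that day."""
    return datetime.combine(day, time.min)

# Hypothetical request.date values, for illustration only.
rows = [
    datetime(2019, 1, 9, 8, 30),
    datetime(2019, 1, 10, 0, 0),
    datetime(2019, 1, 10, 13, 0),
    datetime(2019, 1, 11, 0, 0),
]

start = widen(date(2019, 1, 9))
last_day = date(2019, 1, 10)

# request.date >= '2019-01-09' AND request.date <= '2019-01-10'
inclusive = [r for r in rows if start <= r <= widen(last_day)]

# request.date >= '2019-01-09' AND request.date < '2019-01-11'
half_open = [r for r in rows if start <= r < widen(last_day + timedelta(days=1))]

print(len(inclusive), len(half_open))
```

The inclusive form keeps the midnight row for 2019-01-10 but drops the 1 PM row; the half-open form keeps both and still excludes midnight on 2019-01-11.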

Related

DATE type comparison in MySQL

Had a somewhat unintuitive case just now with MySQL:
The query contains a WHERE clause with the comparison WHERE t.date = '2016-12-31' (t.date's datatype is DATE!), and it returns no records. But the query WHERE t.date > '2016-12-31' returns the record with date equal to '2016-12-31' among other records! The record for 2016-12-31 also showed up when I used BETWEEN '20161231' AND '20170101'. I tried different formats and type changes, and nothing helped. After some time spent searching for the cause, I updated the record's date column manually, SETting it to '2016-12-31'. After that, WHERE t.date = '2016-12-31' started to work as expected.
I'm probably missing something, and wondering what can cause such behavior.
Update
date is DATE, not DATETIME
After doing the manual update I can't reproduce the behavior again: now any type of comparison (=, DATE(..)=, STRCMP) works as it should!
Update 2
For 2016-11-30 and 2016-09-30 (ends of months!) I found the same behavior! I won't update those records manually for now, so I can test the suggestions I get here.
Update 3
I've also run OPTIMIZE TABLE on the table with that date column to rebuild the indexes and rule out any corruption problems.
Update 4
Here is more:
If I check the HEX values of the date field for the incorrect rows (end of month), I get wrong values!
SELECT HEX(t.date) FROM table t WHERE t.date BETWEEN DATE('20160930') AND DATE('20161001');
Returns:
323031362D31302D3030
323031362D31302D3031
SELECT HEX(DATE('20160930'));
Returns:
323031362D30392D3330
And 323031362D30392D3330 != 323031362D31302D3030
SELECT X'323031362D31302D3030';
And it returns:
2016-10-00, NOT 2016-09-30!
For the value that I've updated manually - HEX is same.
But what can cause such difference?

Try forcing the format using
WHERE date(t.date) = '2016-12-31'
or
WHERE date(t.date) = str_to_date( '2016-12-31', '%Y-%m-%d')
or based on your test
WHERE date(t.date) = str_to_date( '20161231', '%Y%m%d')

After some investigation I've found the problem, and it's not directly related to date comparison in MySQL. I'll post it here in case anyone gets stuck on the same thing.
The problem was with how the IDE (in my case DataGrip) displayed the results: the value of the date field in the database was 2016-10-00, and the select was showing 2016-09-30! That was confusing. But once the 00 day was found, it was relatively easy to find the cause: CURDATE() - 1 (in my case it should have been CURDATE() - INTERVAL 1 DAY). Don't ever do date arithmetic without date-specific constructs like INTERVAL!
Thanks to everyone who supported the question, and sorry for the confusion. I was confused too and found the answer only after several steps.
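For anyone wondering how a 00 day can appear at all: in a numeric context MySQL treats CURDATE() as a YYYYMMDD number, so subtracting 1 is plain integer arithmetic rather than date arithmetic. A small Python sketch of that arithmetic, assuming the statement ran on 2016-10-01 (the helper `as_yyyymmdd` is hypothetical and only mimics the numeric form):

```python
from datetime import date, timedelta

def as_yyyymmdd(d):
    """Mimic the numeric form of a date: 2016-10-01 becomes 20161001."""
    return d.year * 10000 + d.month * 100 + d.day

today = date(2016, 10, 1)  # assumed run date, for illustration

# CURDATE() - 1: integer subtraction on the YYYYMMDD number
numeric_result = as_yyyymmdd(today) - 1

# CURDATE() - INTERVAL 1 DAY: real calendar arithmetic
calendar_result = today - timedelta(days=1)

print(numeric_result)   # the digits read as year 2016, month 10, day 00
print(calendar_result)
```

The integer result reads as 2016-10-00, which matches the stored value above, while the calendar subtraction gives the intended last day of September.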

Retrieve Timediff between current row and next row in subquery

Why I am getting more than 24 hours? I am trying to get the timediff between each row in the sub-query if the timediff is greater than 10 min. then sum the result per day.
My goal is to figure out, for each user, the total of every break that's longer than 10 min. and list that alongside the number of calls on that particular day.
SELECT DATE_FORMAT(last_call, '%d, %W') AS DAY
, COUNT(call_id) AS calls
, ( SELECT SEC_TO_TIME(SUM((
SELECT timestampdiff(SECOND, c.last_call, c2.last_call)
FROM calls c2
WHERE c2.calling_agent = c.calling_agent
AND c2.last_call > c.last_call
AND timestampdiff(SECOND, c.last_call, c2.last_call) > 600
ORDER BY c2.last_call LIMIT 1
)))
FROM calls AS c
WHERE EXTRACT(DAY FROM c.last_call) = EXTRACT(DAY FROM calls.last_call)
) AS `brakes`
FROM calls
WHERE 9 IN (calls.reg_calling_agent)
AND last_call > DATE_SUB(now() , INTERVAL 12 MONTH)
GROUP BY EXTRACT(DAY FROM last_call)
ORDER BY EXTRACT(DAY FROM last_call) DESC

You're getting more than 24 hours because
1) the row retrieved from c2 could be from a different day. There's no guarantee that the next call (10 minutes after the previous call) isn't the first call made/received by an agent after a week long vacation.
2) that same "gap" of over 10 minutes is going to be reported for the last call the agent made/received. And you're also going to get a "gap" between the call the agent made immediately before the one before the gap, and the one before that. That is, there's no provision made to exclude the calls that DID have a subsequent call within 10 minutes. (The subquery is just looking for any subsequent call that is 10 minutes after a call.)
3) you are getting an aggregate total (SUM) of all of those gaps in a given day, regardless of the agent; all the gaps for all agents are being totaled.
4) the outer query is getting a year's worth of calls (for all agents?) but is grouping by day of month (1 through 31). So, you're getting back one row for the 5th of the month, but there will be multiple agents and multiple "days" (Jan 5, Feb 5, March 5, etc.), multiple values of 'brakes', and only one of those values is going to be included in the result. It's indeterminate which of those row values will be returned. (Other RDBMSs would balk at this construct, a non-aggregate expression in the SELECT list which is not included in the GROUP BY, but by default, MySQL allows it.)
--
FOLLOWUP
Q: could you please post the corrected query?
A: I don't have the table schema, or sample data, or a specification, so it's impossible for me to provide a "corrected" query.
For example, it's not at all clear why there's a predicate on reg_calling_agent in the outermost query, but the subqueries don't have any reference to that column, or any other column from the table in the outer query, except for the last_call column. The query to find a subsequent call is relying on the calling_agent column, not reg_calling_agent, but that's being performed for ALL calls in a given day of month.
I can take a shot at a query that may be closer to what you are looking for, but there is absolutely no guarantee that this is "correct" in terms of matching the schema, the datatypes, the actual data, or the expected output. A query that returns unexpected results is not an adequate specification.
SELECT a.calling_agent
, DATE_FORMAT(a.last_call,'%d, %W') AS `day`
, COUNT(a.call_id) AS `calls`
, SEC_TO_TIME(
SUM(
( SELECT IF(TIMESTAMPDIFF(SECOND, a.last_call, c.last_call) > 600
,TIMESTAMPDIFF(SECOND, a.last_call, c.last_call)
,NULL
) AS `gap`
FROM calls c
WHERE c.calling_agent = a.calling_agent
AND c.last_call > a.last_call
AND c.last_call < DATE(a.last_call)+INTERVAL 1 DAY
ORDER BY c.last_call
LIMIT 1
)
)
) AS `breaks`
FROM calls a
WHERE a.reg_calling_agent = 9
AND a.last_call > DATE(NOW()) - INTERVAL 12 MONTH
GROUP BY a.calling_agent, DATE_FORMAT(a.last_call,'%d, %W')
ORDER BY a.calling_agent, DATE_FORMAT(a.last_call,'%d, %W') DESC
UNPACKING THE QUERY
I thought I might provide some insight as to the design of this query, what it's intended to do. I retained the FROM and WHERE clauses from the original outer query. I just gave an alias to the calls table, and re-wrote the predicates to a form that I think is simpler, and that I'm more used to using.
For the GROUP BY, I added calling_agent, since it doesn't seem to make sense that we would want to lump all of the agents together. (It's really up to you to decide whether that matches the spec or not.) I did this because calling_agent is NOT referenced in the WHERE clause. (There's an equality predicate on reg_calling_agent, but that's a different column.)
I replaced the EXTRACT(DAY FROM ) expression, since that's only returning an integer value between 1 and 31. And it just doesn't seem to make sense to lump together all the "4th day" of all months. I chose to use the expression that's in the SELECT list; because that's the normative pattern... returning the expressions used in the GROUP BY clause in the SELECT list, so the client will be able to distinguish which row in the result belongs to which group identifier.
I also qualified all column references with a table alias, as an aid to the future reader. We're familiar with following that pattern in complex queries. It's natural that we extend that same pattern to simpler queries, even when it's not required.
The big change is to the derived breaks column. (I renamed that from 'brakes', because it seems like what this query is doing is finding out when calling_agents weren't making/receiving calls, when workers were "taking a break". That's entirely a guess on my part.)
There's a SEC_TO_TIME function, all that's doing is reformatting the result.
There's a SUM() aggregate. This is just going to total up the values, for each row in a that's in a "group".
The real "meat" is the correlated subquery. What that does... for each row returned by the outer query (i.e. every row from calls that satisfies the WHERE clause on the outer query)... we are going to run another SELECT. And it's going to look for the very "next" call made/received by the same calling_agent. To do that, the calling_agent on the "next" call needs to match the value from the row from the outer query...
WHERE c.calling_agent = a.calling_agent
Also, the datetime/timestamp of the subsequent "call" needs to be anytime after the datetime/timestamp of the row from the outer query...
AND c.last_call > a.last_call
And, we only want to look for calls that are on the same calendar date (year, month, day) as the previous call. (This prevents us from considering a call made four days later as a "subsequent" call.)
AND c.last_call < DATE(a.last_call)+INTERVAL 1 DAY
And, out of all those potential subsequent calls, we only want the first one, so we order them by datetime/timestamp, and then take just the first one.
ORDER BY c.last_call
LIMIT 1
If we don't get a row, the subquery will return a NULL. If we do get a row, the next thing we want to do is check if the datetime/timestamp on this call is more than 10 minutes after the previous call. We use the same TIMESTAMPDIFF expression from the original query, to derive the number of seconds between the calls, and we compare that to 10 minutes. If the gap is greater than 10 minutes, we consider this as a "break", and we return the difference as number of seconds. Otherwise, we just return a NULL, as if we hadn't found a "next" row.
IF(TIMESTAMPDIFF(SECOND, a.last_call, c.last_call) > 600
,TIMESTAMPDIFF(SECOND, a.last_call, c.last_call)
,NULL
) AS `gap`
That's MySQL-specific shorthand for the ANSI-standard form:
CASE
WHEN TIMESTAMPDIFF(SECOND, a.last_call, c.last_call) > 600
THEN TIMESTAMPDIFF(SECOND, a.last_call, c.last_call)
ELSE NULL
END AS `gap`
(NOTE: the ELSE NULL could be omitted, that would be functionally equivalent because NULL is the default when ELSE is omitted. I include it here for completeness, and for comparison to the MySQL IF() function.)
Finally, we include all of the expressions in the GROUP BY clause in the SELECT list. This isn't required, but it's the usual pattern. If those expressions are omitted, there should be a pretty obvious reason why they are omitted. For example, if the outer query had an equality predicate on calling_agent, e.g.
AND a.calling_agent = 86
Then we'd know that any row returned by the query would have a value of 86 returned for calling_agent, so we could omit the expression from the SELECT list. But if we omit an equality predicate, or change it so that more than one calling_agent could be returned, something like:
AND (a.calling_agent = 86 OR a.calling_agent = 99)
then without calling_agent in the SELECT list, we won't be able to tell which rows are for which calling_agent. If we're going to the bother of doing a GROUP BY on the expression, we usually want to include the expression in the SELECT list; that's the normal pattern.
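As a cross-check on the logic described above, the same per-agent, per-day calculation can be sketched outside SQL. This is a minimal Python sketch under assumptions: the function name `breaks_per_day`, the tuple layout `(calling_agent, last_call)`, and the sample calls are all hypothetical, and days with no qualifying gap total to zero here rather than to NULL as SUM() would.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def breaks_per_day(calls, min_gap=timedelta(minutes=10)):
    """Total the gaps longer than min_gap between consecutive calls,
    grouped by (agent, calendar date)."""
    by_group = defaultdict(list)
    for agent, stamp in calls:
        by_group[(agent, stamp.date())].append(stamp)

    totals = {}
    for key, stamps in by_group.items():
        stamps.sort()
        total = timedelta()
        # Pair each call with the very next call of the same agent on the same day.
        for prev, nxt in zip(stamps, stamps[1:]):
            gap = nxt - prev
            if gap > min_gap:
                total += gap
        totals[key] = total
    return totals

# Hypothetical sample data: (calling_agent, last_call)
calls = [
    (9, datetime(2019, 1, 9, 9, 0)),
    (9, datetime(2019, 1, 9, 9, 5)),    # 5 min gap, ignored
    (9, datetime(2019, 1, 9, 9, 30)),   # 25 min gap, counted
    (9, datetime(2019, 1, 9, 10, 0)),   # 30 min gap, counted
    (9, datetime(2019, 1, 10, 8, 0)),   # next day, never paired with the 9th
]
totals = breaks_per_day(calls)
```

Because calls are grouped by calendar date before pairing, a call after a week-long absence can never be counted as a "break", which is what the same-day predicate in the subquery is for.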

MySQL DATE_ADD running too slow with dynamic interval

I have the following query that's running pretty slow when executing it on thousands of records.
SELECT
name,
id
FROM
meetings
WHERE
meeting_date < '2014-09-20 11:00:00' AND (
meeting_date >= '2014-09-20 09:00:00' OR
DATE_ADD(meeting_date, INTERVAL meeting_length SECOND) > '2014-09-20 09:00:00'
)
The query checks if the meeting overlaps in any way with the period between 2014-09-20 09:00:00 and 2014-09-20 11:00:00. The above query covers all the possible overlapping cases. However, DATE_ADD adds a lot of overhead.
Is there any way to optimize DATE_ADD? Removing DATE_ADD greatly boosts the performance, but then the query won't cover all overlapping cases.

I recommend you eliminate the OR.
MySQL won't (can't) perform a range scan operation on an index on column meeting_date when that column is wrapped in a function.
When the comparison is against the bare column, MySQL can do a range scan. But with the comparison to an expression, MySQL has to evaluate that expression for every row in the table, and then compare.
For a large table, we'd get optimal performance with an index with leading column of meeting_date.
I think the "trick" to getting better performance is to rewrite the query to introduce some additional domain knowledge. Specifically, what are the MINIMUM and MAXIMUM values for meeting_length?
I think it's pretty safe to assume it won't be negative. And we probably don't expect it to be zero. But even if the minimum length is greater than zero, we can use zero as our "known" minimum. (It's going to turn out to be more convenient than some other non-zero value.)
What we really need to know is the MAXIMUM value for meeting_length. If that's a known constant value, that would be great, because we're going to include that value in the query. Let's assume the maximum value of meeting_length is the number of seconds in 7 days.
As a demonstration of what I'm thinking:
SELECT m.name
, m.id
FROM meetings m
WHERE m.meeting_date < '2014-09-20 11:00:00'
AND m.meeting_date > '2014-09-20 09:00:00' + INTERVAL -7 DAY
HAVING m.meeting_date + INTERVAL meeting_length SECOND
> '2014-09-20 09:00:00'
Let's unwrap that a bit.
The first predicate is the same as in your original query... the "start" time of the meeting is before the "end" of the specified period.
The third predicate is the same as in your query too... the "end" of the meeting is after the beginning of the specified period. (My personal preference is to use the + INTERVAL form to add a duration to datetime.)
So, just like the original query we're looking for overlap.
I'm suggesting that we include another sargable predicate. The addition of this predicate doesn't really change the check for the overlap, given that we have a known minimum of 0 for meeting_length. What it does do is add a fixed lower bound that we can check against.
To explain that a little bit... if a meeting row satisfies the condition "meeting end is after the period start", then we also know, for that row, that "meeting start is after (period start MINUS meeting length)". And we also know that "meeting start is after (period start MINUS the MAXIMUM possible value of meeting length)".
And for most rows, that's going to be a bigger range... but the "trick" is that the predicate that checks it can compare a "bare" column against a constant.
And that means MySQL will be able to use an index range scan operation to satisfy that. The query is of the form:
WHERE meeting_date > const
AND meeting_date < const
And that's perfect for an index range scan. That should benefit performance... assuming there's a suitable index and that significantly limits the number of rows that need to be checked.
But by itself, that returns more rows than we need, we're going to get some meetings that start and end before the start of the period.
So we still need the additional check, to further filter down the rows. But that won't have to be evaluated for every row, only for the rows that pass the first two predicates.
AND meeting_date + length > const
We just need MySQL to recognize that length won't ever be negative; to recognize that this is actually a "stricter" range, not a broader range. It might work with the AND, but we can force MySQL to evaluate that condition later, by including it in the HAVING clause.
HAVING meeting_date + length > const
But, all of that is really just a guess.
We'd really need to take a look at the EXPLAIN output.
If that index with the leading column of meeting_date also includes the id and name columns, then MySQL could satisfy the query entirely from the index, without any need to reference pages in the underlying table. (If that happens, we'll see "Using index" in the EXPLAIN output.)
Earlier, I said it would be convenient if we had a known constant for maximum meeting_length.
We could also use a query to determine that from the data:
SELECT MAX(meeting_length) FROM meetings
(An index with meeting_length as the leading column will avoid having to do an expensive full scan of the table.)
We use that value to derive the "constant" value in the predicate.
We could include that query (as an inline view or a subquery), but that might impact performance. (We'd need to test how "smart" the MySQL optimizer is.)
We could try it as a subquery:
SELECT m.name
, m.id
FROM meetings m
WHERE m.meeting_date < '2014-09-20 11:00:00'
AND m.meeting_date > '2014-09-20 09:00:00'
- INTERVAL (SELECT MAX(l.meeting_length) FROM meetings l) SECOND
HAVING m.meeting_date + INTERVAL meeting_length SECOND
> '2014-09-20 09:00:00'
Or try it as an inline view:
SELECT m.name
, m.id
FROM ( SELECT MAX(l.meeting_length) AS max_seconds
FROM meetings l
) d
CROSS
JOIN meetings m
WHERE m.meeting_date < '2014-09-20 11:00:00'
AND m.meeting_date > '2014-09-20 09:00:00'
- INTERVAL d.max_seconds SECOND
HAVING m.meeting_date + INTERVAL meeting_length SECOND
> '2014-09-20 09:00:00'
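The reasoning behind the extra lower bound can be checked with a small Python sketch. It is only a model of the predicate logic, not of MySQL's index use; the function names, the tuple layout `(name, start, length)`, the 7-day maximum, and the sample meetings are all assumptions for illustration.

```python
from datetime import datetime, timedelta

def overlapping_direct(meetings, period_start, period_end):
    """Original overlap test: starts before the period ends, ends after it starts."""
    return [m for m in meetings
            if m[1] < period_end and m[1] + m[2] > period_start]

def overlapping_bounded(meetings, period_start, period_end, max_length):
    """Same test, preceded by a coarse range on the bare start column."""
    lower_bound = period_start - max_length
    # Step 1: the part an index range scan could answer (bare column vs constants).
    candidates = [m for m in meetings if lower_bound < m[1] < period_end]
    # Step 2: the exact end-time check, applied only to the surviving candidates.
    return [m for m in candidates if m[1] + m[2] > period_start]

period_start = datetime(2014, 9, 20, 9, 0)
period_end = datetime(2014, 9, 20, 11, 0)
max_length = timedelta(days=7)  # assumed known maximum meeting length

# Hypothetical sample data: (name, meeting_date, meeting_length)
meetings = [
    ("early",  datetime(2014, 9, 20, 7, 0),  timedelta(hours=1)),
    ("spans",  datetime(2014, 9, 20, 8, 30), timedelta(hours=1)),
    ("inside", datetime(2014, 9, 20, 9, 30), timedelta(minutes=30)),
    ("late",   datetime(2014, 9, 20, 11, 0), timedelta(hours=1)),
    ("long",   datetime(2014, 9, 18, 10, 0), timedelta(days=2)),
]

direct = overlapping_direct(meetings, period_start, period_end)
bounded = overlapping_bounded(meetings, period_start, period_end, max_length)
```

As long as no meeting is longer than `max_length`, both functions return the same rows; the bounded version just discards most rows using comparisons on the start time alone.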

MySQL TimeStamp query

I am trying to search for studentID within a date range.
I only have one date in my database, therefore I only want the users to input one date, rather than having them input a start date and an end date for:
WHERE timeStamp BETWEEN startDate AND endDate
So i am trying this...
SELECT * FROM scansTable
INNER JOIN registeredUsers ON scansTable.studentID = registeredUsers.id
INNER JOIN labSession ON scansTable.labSessionID = labSession.id
INNER JOIN staffTable ON labSession.lecturer = staffTable.id
INNER JOIN unitTable ON labSession.unit = unitTable.id
WHERE studentID = '10'
AND labSession.StartTimeStamp BETWEEN '2011 -05 -30'+00:00:00
AND '2011 -05 -30'+23:59:59;
But it is not returning anything, even though I know for sure there is a student with id 10 and a row in that date range in the database.
Am I doing the +00:00:00 wrong?
thanks

Remove the spaces and plus symbols:
BETWEEN '2011-05-30 00:00:00' AND '2011-05-30 23:59:59'

It seems you are using something like '2011 -05 -30'+00:00:00, where you should use '2011-05-30 00:00:00' (and make corresponding changes to the second condition), because the TIMESTAMP format (I assume this field is in this format) is YYYY-MM-DD HH:MM:SS.
Did it help? If not, give us the definition of the table plus an example row (at least both timestamp columns).
EDIT:
If you wanted to concatenate, you should have used CONCAT() function (see MySQL's documentation). It would look like this:
CONCAT('2011-05-30',' 00:00:00')
or, more meaningfully:
CONCAT_WS(' ','2011-05-30','00:00:00')

If you haven't changed the default date format, remove the + signs and the extra spaces.
labSession.StartTimeStamp BETWEEN '2011-05-30 00:00:00' AND '2011-05-30 23:59:59';
Also, I don't know if it's a byproduct of the copy/paste, but MySQL won't even run the query as-is. The time portion of your timestamp needs to be within the quotes.
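Since the user only enters one date, the two bounds can be derived in application code before the query is built. The following is a minimal Python sketch of that idea; the helper name `day_bounds` is hypothetical, and it uses a half-open range (start of the day up to, but not including, the start of the next day) instead of a 23:59:59 upper bound.

```python
from datetime import date, datetime, time, timedelta

def day_bounds(day):
    """Return (start, end) covering one calendar day as a half-open range."""
    start = datetime.combine(day, time.min)
    return start, start + timedelta(days=1)

start, end = day_bounds(date(2011, 5, 30))

# Formatted the way MySQL expects a TIMESTAMP/DATETIME literal.
start_str = start.strftime("%Y-%m-%d %H:%M:%S")
end_str = end.strftime("%Y-%m-%d %H:%M:%S")
print(start_str, end_str)
```

The two values would then be passed as parameters to a condition of the form labSession.StartTimeStamp >= start AND labSession.StartTimeStamp < end.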

BETWEEN includes both bounds: x BETWEEN a AND b is equivalent to x >= a AND x <= b.
So neither end of the range is excluded here; the problem is the format of the literals.

How to use where clause in separate datetime(year,month,day)

http://upic.me/i/hq/capture.png
http://upic.me/i/3g/capture.png
I have a table that splits the datetime into separate fields (year, month, day), and these fields are indexed.
I would like to use a WHERE clause on a date range, e.g. between 2010/06/21 and 2011/05/15.
I tried
where concat_ws('-',year,month,day) between '2010/06/21' and '2011/05/15'
It works, because I use the concat function to make these fields look like an ordinary datetime,
but it does not use the index and the query is slow. This table has 3 million records.
To use the index, I tried this query:
where
year = '2011'
and month between 05 and 06
and day between 21 and 15
It almost works, except for the last line:
day between 21 and 15
I can't use this condition.
I have tried to solve this problem but can't find a solution, and I can't change the table structure.
I'm looking for an answer.
Thank you.
Now I can use the OR operation for the query, thanks for your answer.
In another case, if I want to find 2009/08/20 to 2011/04/15, the query gets longer and confusing. Does anyone have an idea?

If it's a datestamp type, you can just use the where/between clause directly. I would consider switching to that; it's considerably faster than a varchar with a custom date format.
WHERE yourdate BETWEEN "2011-05-01" AND "2011-06-15"
Although checking ranges may work for single months, you will find that queries spanning several months have some margin of error because, if you think about it, you're selecting more than you necessarily want. Using a datestamp will fix the performance and usability issues that arise from storing the date in a custom varchar.
Here are the two queries to convert your times around if you're interested:
ALTER TABLE `yourtable` ADD `newdate` DATE NOT NULL;
UPDATE `yourtable` SET `newdate` = STR_TO_DATE(`olddate`, '%Y/%m/%d');
Just change "yourtable", "newdate", and "olddate" to your table's name, the new date column name, and the old datestamp column names respectively.

If you can't change the table structure, you could use something like the following:
WHERE year = '2011'
AND ((month = '05' AND day >= 21) OR (month = '06' AND day <= '15'))
(At least, I think that query does what you want in your specific case. But for e.g. a longer span of time, you'd have to think about the query again, and I suspect queries like this could become a pain to maintain)
UPDATE for the updated requirement
The principle remains the same, only the query becomes more complex. For the range of 2009/08/20 to 2011/04/15 it might look like this:
WHERE year = '2009' AND (month = '08' AND day >= '20' OR month BETWEEN '09' AND '12')
OR year = '2010'
OR year = '2011' AND (month BETWEEN '01' AND '03' OR month = '04' AND day <= '15')
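The expanded condition is really a lexicographic comparison of (year, month, day) against the two endpoints. A short Python sketch can confirm that the two forms agree for every calendar day around the range; the function names are hypothetical, and the values are treated as integers here rather than the quoted strings used in the query.

```python
from datetime import date, timedelta

def in_range_expanded(y, m, d):
    """The year/month/day condition for 2009/08/20 to 2011/04/15, spelled out."""
    return (
        (y == 2009 and ((m == 8 and d >= 20) or 9 <= m <= 12))
        or y == 2010
        or (y == 2011 and (1 <= m <= 3 or (m == 4 and d <= 15)))
    )

def in_range_tuple(y, m, d):
    """The same range as a single lexicographic (year, month, day) comparison."""
    return (2009, 8, 20) <= (y, m, d) <= (2011, 4, 15)

# Compare both forms for every day from 2009-01-01 through 2011-12-31.
mismatches = []
day = date(2009, 1, 1)
while day <= date(2011, 12, 31):
    parts = (day.year, day.month, day.day)
    if in_range_expanded(*parts) != in_range_tuple(*parts):
        mismatches.append(day)
    day += timedelta(days=1)

print(len(mismatches))
```

Thinking of the range as "first partial year, whole years in between, last partial year" is what keeps the OR form manageable for longer spans.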

where year = 2011
and ((month = 5 and day > 20) or (month = 6 and day < 16))
You were separating days and months, whereas you must keep them together.
The parentheses must be set ...
Mike

It is important that you use OR; otherwise the condition makes no sense.
