Mysql optimizing date filter - mysql

Is there any difference, in any SQL engine (and particularly in MySQL), between the following two queries?
SELECT * FROM table where date = '2019-01-01'
And:
SELECT * FROM table where date = DATE('2019-01-01')
Running EXPLAIN returns the same result for both, but perhaps there's some difference that I'm not catching? I need to run a query against a multi-billion-row table and am trying to optimize it before running it.

There should not be. The expression DATE('2019-01-01') should be evaluated during the compilation phase turning the result into a date. Similarly, the constant value '2019-01-01' is implicitly converted to a date for the comparison.
This allows MySQL (and most other databases) to use indexes and partitions defined on that column.

The DATE() function extracts the date part of a date or date/time expression.
For example, suppose the value of the field BirthTime is "2017-09-26 16:44:15.581".
You can use the following query to get just the date:
SELECT DATE(BirthTime)
The result is: 2017-09-26

Related

MySQL where query by hour diff of 2 timestamps

I have a table with two timestamps: start_time and end_time. How can I query with a condition such as: select all rows where the difference between those two fields is more than X hours?
Also, does the field type (timestamp vs. datetime) have any impact on the query I'm trying to write?
The data types of the fields do differ in some respects.
As stated in MySQL Timestamp Difference, two datetime fields usually get converted to timestamps in order to subtract them.
The query could be something like:
SELECT *
FROM Table
WHERE TIMESTAMPDIFF(HOUR,start_time,end_time)>2
Edit:
What is written above is valid for MySQL; if you are working in SQL Server, you can use DATEDIFF instead.
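One detail worth checking before relying on the query above: as far as I know, TIMESTAMPDIFF(HOUR, ...) returns a whole number of hours, so a gap of 2 hours 30 minutes counts as 2 and would not pass a "> 2" filter. The Python sketch below only mimics that arithmetic to make the boundary visible. The helper names are made up for illustration, and it assumes end_time is not earlier than start_time.

```python
from datetime import datetime, timedelta


def whole_hours_between(start, end):
    """Whole hours between two datetimes, fraction discarded (truncated toward zero)."""
    seconds = (end - start).total_seconds()
    return int(seconds / 3600)


def longer_than(start, end, hours):
    """True when the exact duration exceeds the given number of hours."""
    return end - start > timedelta(hours=hours)


start_time = datetime(2020, 2, 3, 10, 0, 0)
end_time = datetime(2020, 2, 3, 12, 30, 0)

# Whole-hour comparison, in the style of the query above.
passes_whole_hour_filter = whole_hours_between(start_time, end_time) > 2

# Exact-duration comparison.
passes_exact_filter = longer_than(start_time, end_time, 2)
```

If "more than X hours" should include gaps such as 2 h 30 min, comparing at a finer unit (minutes or seconds) is one way to get that behaviour.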

Generalize a SQL query to accommodate full and incremental extract

I have the following query:
select * from table where createddate>='03-Feb-2020' and createddate<'04-Feb-2020'
The above query gives me the incremental count for a single day.
How do I generalize the above query so that I can get the entire historical data (a full dump) without changing the where clause?
For example:
select * from table where createddate>='VARIABLE1' and createddate<'VARIABLE2'
Is there a way, without changing the structure of the SQL query, to just pass in different values for createddate to get the full dump?
Is this what you want?
where createddate >= '1000-01-01' and createddate < '9999-12-31'
Note that date literals in MySQL should be formatted as YYYY-MM-DD.
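As a rough sketch of how the calling code might supply those two values, the Python helper below returns either a one-day window or the wide sentinel range from the answer above. The function name, the constants, and the %s placeholder style (used by the common MySQL drivers for Python) are assumptions for illustration, not part of the original question.

```python
from datetime import date, timedelta

# Sentinel bounds taken from the answer above; they stand in for "no limit".
FULL_START = "1000-01-01"
FULL_END = "9999-12-31"

# The where clause never changes; only the two bound values do.
QUERY = "select * from table where createddate >= %s and createddate < %s"


def extract_bounds(day=None):
    """Return (lower, upper) strings: one day if a date is given, else everything."""
    if day is None:
        return FULL_START, FULL_END
    next_day = day + timedelta(days=1)
    return day.isoformat(), next_day.isoformat()


incremental_params = extract_bounds(date(2020, 2, 3))
full_params = extract_bounds()
```

With a DB-API cursor, the call would then look something like cursor.execute(QUERY, incremental_params) for the daily run and cursor.execute(QUERY, full_params) for the full dump.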
Do you want group by?
select date(createddate), count(*)
from table
group by date(createddate);
This returns the count for each day.

Mysql query help to display results less than given date

My date field in MySQL is stored as a char, in a form such as 050712. I would like to display the rows in the database whose date is less than a given date. I wrote the following:
Condition should fail
select * from tblFedACHRDFI where date_format(changedate,'%m/%d/%Y')> 05/08/12;
This displays all the records that are available, but I don't want that. I would like rows to be displayed only when the date is 05/06/12, which means
True Condition
select * from tblFedACHRDFI where date_format(changedate,'%m/%d/%Y')> 05/06/12;
The same worked for me in SQL Server when I wrote it as follows
Records are not displayed, which is correct as per my requirement
select * from tblFedACHRDFI where
CONVERT(datetime,(SUBSTRING(ChangeDate,1,2)+'/'
+SUBSTRING(ChangeDate,3,2)+'/'+dbo.Years
(SUBSTRING(ChangeDate,5,2))+SUBSTRING(ChangeDate,5,2)))>
'05/08/2012'
So can anyone help me see where I went wrong in the MySQL statement?
A MySQL date should be written as YYYY-MM-DD, and the column type should be DATE.
If you wish to store a date any other way (for example, a CHAR(6) as you do here), you'll have to use conversions each time you use the date. This is slower, uses more CPU, and can fail because you can store invalid values in your CHAR field.
It does work, however. Use the STR_TO_DATE function to convert your CHAR column to a proper date. Now you can compare it against a proper date, use INTERVAL functions, the whole shebang:
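-- %y matches the two-digit year in values like 050712;
-- '2012-05-08' is the 05/08/12 (mm/dd/yy) from the question
select *
from tblFedACHRDFI
where str_to_date(changedate,'%m%d%y') > '2012-05-08';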
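To make the conversion step concrete outside the database, here is a small Python sketch that parses the same six-character mmddyy strings and compares them as real dates. It is only an analogy for what STR_TO_DATE does in the query above; the helper names are invented, and Python's two-digit-year rule (00 to 68 map to 20xx) is assumed to suit the data.

```python
from datetime import date, datetime


def parse_mmddyy(text):
    """Parse a 6-character 'mmddyy' string into a date; raises ValueError if invalid."""
    return datetime.strptime(text, "%m%d%y").date()


def is_valid_mmddyy(text):
    """Report whether the stored string is a real calendar date."""
    try:
        parse_mmddyy(text)
        return True
    except ValueError:
        return False


stored = parse_mmddyy("050712")  # the sample value from the question

# Compared as dates, the two conditions from the question behave as intended.
later_than_may_8 = stored > date(2012, 5, 8)
later_than_may_6 = stored > date(2012, 5, 6)

# A CHAR column will happily hold values that are not dates at all.
feb_30_is_valid = is_valid_mmddyy("023012")
```

The last line illustrates the point about invalid values: nothing stops "023012" from being stored in a CHAR(6) column, whereas a DATE column would reject or flag it.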

Explain how the following query works?

I have a MySQL query which works in a strange way. I am posting the two queries with the input data changed; the output is listed under each query.
Query 1 (Area to be noted BETWEEN '13/05/11' AND '30/05/11'):
SELECT COUNT(pos_transaction_id) AS total,
DATE_FORMAT(pt.timestamp,'%d-%m-%Y %H:%i:%S') AS Date,
SUM(amount) AS amount
FROM pos_transactions pt
WHERE DATE_FORMAT(pt.timestamp,'%e/%m/%y') BETWEEN '13/05/11' AND '30/05/11'
GROUP BY WEEK(pt.timestamp) ORDER BY pt.timestamp
Output:
Query 2 (Area to be noted BETWEEN '3/05/11' AND '30/05/11'):
SELECT COUNT(pos_transaction_id) AS total,
DATE_FORMAT(pt.timestamp,'%d-%m-%Y %H:%i:%S') AS Date,
SUM(amount) AS amount
FROM pos_transactions pt
WHERE DATE_FORMAT(pt.timestamp,'%e/%m/%y') BETWEEN '3/05/11' AND '30/05/11'
GROUP BY WEEK(pt.timestamp) ORDER BY pt.timestamp
Output:
Now, when the range is widened in the second query, why am I getting just one record? And even in the first query I am getting records which are out of range. What is wrong with it?
EDIT
The changed query looks like this, and it still does not do what I want it to do.
SELECT COUNT(pos_transaction_id) AS total,
DATE_FORMAT(pt.timestamp,'%d-%m-%Y %H:%i:%S') AS Date,
SUM(amount) AS amount
FROM pos_transactions pt
WHERE DATE_FORMAT(pt.timestamp,'%e/%m/%y') BETWEEN STR_TO_DATE('01/05/11','%e/%m/%y') AND STR_TO_DATE('30/05/11','%e/%m/%y')
GROUP BY WEEK(pt.timestamp) ORDER BY pt.timestamp
The output is:
I think you're seeing the result of the intersection of two bad practices.
First, the date_format() function returns a string, so your WHERE clause does a string comparison. In PostgreSQL:
select '26/04/2011' between '13/05/11' AND '30/05/11';
--
T
That's because the string '26' is between the strings '13' and '30'. If you write them as dates, though, PostgreSQL will correctly tell you that '2011-04-26' (following the datestyle setting on my server) isn't in that range.
Second, I'm guessing that the odd out-of-range values appear because you're using an indeterminate expression alongside your aggregates. The query groups by WEEK(pt.timestamp), but the SELECT list contains DATE_FORMAT(pt.timestamp, ...), which is neither aggregated nor part of the GROUP BY. I think nearly every other SQL engine will throw an error if you try to do that. MySQL accepts it when the ONLY_FULL_GROUP_BY SQL mode is off, and then returns an apparently arbitrary value from each group.
To avoid these kinds of errors, don't do string comparisons on date or timestamp ranges, and don't use indeterminate aggregate expressions.
Posting DDL and minimal SQL INSERT statements to reproduce the problem helps people help you.
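The string-versus-date effect described above can be reproduced in a few lines of Python, which compares plain strings character by character in much the same way. This is only an illustration under that assumption, not MySQL itself: fmt_e_m_y is a made-up helper that imitates DATE_FORMAT(..., '%e/%m/%y'), where %e is the day of the month without zero padding.

```python
from datetime import date


def fmt_e_m_y(d):
    """Imitate DATE_FORMAT(d, '%e/%m/%y'): unpadded day, 2-digit month and year."""
    return f"{d.day}/{d.month:02d}/{d.year % 100:02d}"


def between(value, low, high):
    """Inclusive range test, like SQL BETWEEN."""
    return low <= value <= high


april_26 = date(2011, 4, 26)
may_3 = date(2011, 5, 3)
may_20 = date(2011, 5, 20)

# Query 1: an April date slips in, because '2' sorts between '1' and '3'.
q1_string_match = between(fmt_e_m_y(april_26), "13/05/11", "30/05/11")
q1_date_match = between(april_26, date(2011, 5, 13), date(2011, 5, 30))

# Query 2: the lower bound '3/05/11' starts with '3', so '20/05/11' now
# sorts below it and is rejected, while '3/05/11' itself still matches.
q2_may_20_match = between(fmt_e_m_y(may_20), "3/05/11", "30/05/11")
q2_may_3_match = between(fmt_e_m_y(may_3), "3/05/11", "30/05/11")
```

That would be consistent with the second query returning far fewer rows even though the intended date range is wider.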
I'm not at all sure, but maybe the comparison is being done as a string and not as a date.
DATE_FORMAT returns a string, and both of your bounds are strings too.
You should try without the DATE_FORMAT, using just the column, or maybe try converting the bounds to dates.
I'm thinking of something like this:
pt.timestamp BETWEEN STR_TO_DATE('13/05/11', '%e/%m/%y') AND STR_TO_DATE('30/05/11', '%e/%m/%y')
I am pretty sure you mean to do
WHERE pt.timestamp BETWEEN STR_TO_DATE('13/05/11', '%d/%m/%y') AND STR_TO_DATE('30/05/11', '%d/%m/%y')
Before, you were asking for a string between two other strings.
Update
I think a few points are being missed here. Based on the calculations you are doing on pos_transactions.timestamp, I am going to assume its type is timestamp. In your query you need to use the timestamp directly if you want to do a range comparison. A timestamp already contains all the data you need for this comparison. You don't need to convert it to day/month/year to compare it.
What you need to do is this:
Find all values where the timestamp is between a date created from '13/05/11' and a date created from '30/05/11'. pt.timestamp is already a timestamp, so there is no need to convert it in your WHERE clause.
What you keep doing is converting it into a string representation. That's OK when you want to display it, but not when you want to compare it with other values.

Timestamp as int field, query performance

I'm storing a timestamp in an int field. On a large table it takes too long to get the rows inserted on a given date, because I'm using the MySQL function FROM_UNIXTIME.
SELECT * FROM table WHERE FROM_UNIXTIME(timestamp_field, '%Y-%m-%d') = '2010-04-04'
Is there any way to speed up this query? Maybe I should query for rows using timestamp_field >= x AND timestamp_field < y?
Thank you
EDITED: This query works great, but make sure there is an index on timestamp_field.
SELECT * FROM table WHERE
timestamp_field >= UNIX_TIMESTAMP('2010-04-14 00:00:00')
AND timestamp_field <= UNIX_TIMESTAMP('2010-04-14 23:59:59')
Use UNIX_TIMESTAMP on the constant instead of FROM_UNIXTIME on the column:
SELECT * FROM table
WHERE timestamp_field
BETWEEN UNIX_TIMESTAMP('2010-04-14 00:00:00')
AND UNIX_TIMESTAMP('2010-04-14 23:59:59')
This can be faster because it allows the database to use an index on the column timestamp_field, if one exists. It is not possible for the database to use the index when you use a non-sargable function like FROM_UNIXTIME on the column.
If you don't have an index on timestamp_field then add one.
Once you have done this you can also try to further improve performance by selecting the columns you need instead of using SELECT *.
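The difference between filtering on a wrapped column and filtering on the bare column can be seen in a query plan. The sketch below uses Python's built-in sqlite3 module as a stand-in, purely because it runs anywhere; SQLite is not MySQL, its date() function replaces FROM_UNIXTIME here, and the table and index names are invented. On MySQL the equivalent check would be EXPLAIN on the two queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    "id INTEGER PRIMARY KEY, timestamp_field INTEGER, payload TEXT)"
)
conn.execute("CREATE INDEX idx_events_ts ON events (timestamp_field)")


def plan(sql, params=()):
    """Return SQLite's query-plan detail text for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


# Function applied to the column: every row has to be converted and checked.
wrapped_plan = plan(
    "SELECT * FROM events "
    "WHERE date(timestamp_field, 'unixepoch') = '2010-04-14'"
)

# Bare column compared with constants: the index can narrow the search.
ranged_plan = plan(
    "SELECT * FROM events "
    "WHERE timestamp_field >= ? AND timestamp_field < ?",
    (1271203200, 1271289600),
)

print(wrapped_plan)
print(ranged_plan)
```

The first plan should report a full scan and the second a search using idx_events_ts; the exact wording of the plan text varies between SQLite versions.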
If you're able to, it would be faster to either store the date as a proper datetime field, or, in the code running the query, to convert the date you're after to a unix timestamp before sending it to the query.
FROM_UNIXTIME would have to convert every record in the table before it can be checked, which, as you can see, causes performance issues. Using a native data type that is closest to what you're actually using in your queries, or querying with the column's own data type, is the fastest way.
So, if you need to continue using an int field for your time, then yes, using < and > on a plain integer would boost performance greatly, assuming you store things to the second rather than storing the timestamp for midnight of that day.
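For the option of converting the date in the calling code, a minimal Python sketch might look like the following. It assumes the int column holds seconds since the Unix epoch and that a day means a UTC calendar day; MySQL's UNIX_TIMESTAMP() uses the session time zone instead, so the tz argument would need to match your setup. The function name is hypothetical.

```python
from datetime import date, datetime, timedelta, timezone


def unix_day_bounds(day, tz=timezone.utc):
    """Return (start, end) epoch seconds for one calendar day, end exclusive."""
    start = datetime(day.year, day.month, day.day, tzinfo=tz)
    end = start + timedelta(days=1)
    return int(start.timestamp()), int(end.timestamp())


start_ts, end_ts = unix_day_bounds(date(2010, 4, 14))

# The two integers are then passed as parameters to a query of the form:
#   SELECT ... WHERE timestamp_field >= %s AND timestamp_field < %s
params = (start_ts, end_ts)
```

Using an exclusive upper bound (the next midnight) avoids having to spell out 23:59:59 and still covers the whole day.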