I'm trying to do something like this:
SELECT MAX(
ADDDATE(expirationdate, INTERVAL 1 YEAR),
ADDDATE(now(), INTERVAL 1 YEAR)
)
That is, get "a year from now", or "a year from the expiration date stored in the table", whichever is greater (I'm renewing people's subscriptions).
This obviously doesn't work, since MAX() aggregates across rows rather than comparing two values. Is there a function that'll do this in MySQL? (I'd like to avoid doing an IF.)
Use GREATEST(), which returns the largest of its arguments.
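Applied to the query from the question, that might look like the sketch below. The table name subscriptions is an assumption (the question does not name the table). Note that GREATEST() returns NULL if any argument is NULL, so rows without an expirationdate would need a COALESCE() or similar.

```sql
-- Sketch only: `subscriptions` is a hypothetical table name.
SELECT GREATEST(
         ADDDATE(expirationdate, INTERVAL 1 YEAR),  -- a year past the stored expiry
         ADDDATE(NOW(), INTERVAL 1 YEAR)            -- a year from now
       ) AS new_expirationdate
FROM subscriptions;
```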
This is a question from LeetCode. Using the second query I got the question wrong, but I could not identify why.
SELECT
user_id,
max(time_stamp) as "last_stamp"
from
logins
where
year(time_stamp) = '2020'
group by
user_id
and
select
user_id,
max(time_stamp) as "last_stamp"
from
logins
where
time_stamp between '2020-01-01' and '2020-12-31'
group by
user_id
The first query uses a function on every row to extract the year (an integer) and compares that to a string. (It would be preferable to use an integer instead.) Whilst this may be sub-optimal, this query would accurately locate all rows that fall into the year 2020.
The second query could fail to locate all rows that fall into 2020. Here it is important to remember that days have a 24 hour duration, and that each day starts at midnight and concludes at midnight 24 hours later. That is; a day does have a start point (midnight) and an end-point (midnight+24 hours).
However, a single date used in SQL code cannot be both the start-point and the end-point of the same day, so when a date is compared against a datetime or timestamp value it represents only the start-point (midnight). Also note that between does NOT magically change the second given date into "the end of that day": it simply cannot (and does not) do that.
So, when you use time_stamp between '2020-01-01' and '2020-12-31' you need to think of it as meaning "from the start of 2020-01-01 up to and including the start of 2020-12-31". Hence, everything after midnight at the start of 2020-12-31, nearly the whole of that last day, is excluded.
The safest way to deal with this is to NOT use between at all, instead write just a few characters more code which will be accurate regardless of the time precision used by any date/datetime/timestamp column:
where
time_stamp >= '2020-01-01' and time_stamp <'2021-01-01'
with the second date being "the start-point of the next day"
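Applied to the second query from the question, the half-open range looks like this (a sketch that reuses the logins table and column names from the question):

```sql
SELECT user_id,
       MAX(time_stamp) AS last_stamp
FROM logins
WHERE time_stamp >= '2020-01-01'   -- from the start of 2020 (inclusive)
  AND time_stamp <  '2021-01-01'   -- up to, but not including, the start of 2021
GROUP BY user_id;
```

Because the upper bound is exclusive, a login at 2020-12-31 23:59:59 is included and the query does not depend on the column's time precision.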
See answer to SQL "between" not inclusive
In a MySQL DB table that stores sale orders, I have a LastReviewed column that holds the last date and time when the sale order was modified (type timestamp, default value CURRENT_TIMESTAMP). I'd like to plot the number of sales that were modified each day, for the last 90 days, for a particular user.
I'm trying to craft a SELECT that returns the number of days since LastReviewed date, and how many records fall within that range. Below is my query, which works just fine:
SELECT DATEDIFF(CURDATE(), LastReviewed) AS days, COUNT(*) AS number FROM sales
WHERE UserID=123 AND DATEDIFF(CURDATE(),LastReviewed)<=90
GROUP BY days
ORDER BY days ASC
Notice that I am computing the DATEDIFF() as well as CURDATE() multiple times for each record. This seems really inefficient, so I'd like to know how I can reuse the results of the previous computation. The first thing I tried was:
SELECT DATEDIFF(CURDATE(), LastReviewed) AS days, COUNT(*) AS number FROM sales
WHERE UserID=123 AND days<=90
GROUP BY days
ORDER BY days ASC
Error: Unknown column 'days' in 'where clause'. So I started to look around the net. Based on another discussion (Can I reuse a calculated field in a SELECT query?), I next tried the following:
SELECT DATEDIFF(CURDATE(), LastReviewed) AS days, COUNT(*) AS number FROM sales
WHERE UserID=123 AND (SELECT days)<=90
GROUP BY days
ORDER BY days ASC
Error: Unknown column 'days' in 'field list'. I also tried the following:
SELECT @days := DATEDIFF(CURDATE(), LastReviewed) AS days,
COUNT(*) AS number FROM sales
WHERE UserID=123 AND @days <=90
GROUP BY days
ORDER BY days ASC
The query returns zero results, so @days<=90 seems to return false even though, if I put it in the SELECT clause and remove the WHERE clause, I can see some results with @days values below 90.
I've gotten things to work by using a sub-query:
SELECT * FROM (
SELECT DATEDIFF(CURDATE(),LastReviewed) AS days,
COUNT(*) AS number FROM sales
WHERE UserID=123
GROUP BY days
) AS t
WHERE days<=90
ORDER BY days ASC
However, I don't know whether it's the most efficient way. Not to mention that even this solution computes CURDATE() once per record even though its value will be the same from the start to the end of the query. Isn't that wasteful? Am I overthinking this? Help would be welcome.
Note: Mods, should this be on CodeReview? I posted here because the code I'm trying to use doesn't actually work
There are actually two problems with your question.
First, you're overlooking the fact that WHERE precedes SELECT. When the server evaluates WHERE <expression>, it then already knows the value of the calculations done to evaluate <expression> and can use those for SELECT.
Worse than that, though, you should almost never write a WHERE condition that uses a column as an argument to a function, since that usually requires the server to evaluate the expression for each row and prevents it from using an index on that column.
Instead, you should use this:
WHERE LastReviewed >= DATE_SUB(CURDATE(), INTERVAL 90 DAY)
The optimizer will see this and get all excited, because DATE_SUB(CURDATE(), INTERVAL 90 DAY) can be resolved to a constant, which can be used on one side of a >= comparison, which means that if an index exists with LastReviewed as the leftmost relevant column, then the server can immediately eliminate all of the rows with LastReviewed < that constant value, using the index.
Then DATEDIFF(CURDATE(), LastReviewed) AS days (still needed for SELECT) will only be evaluated against the rows we already know we want.
Add a single index on (UserID, LastReviewed) and the server will be able to pinpoint exactly the relevant rows extremely quickly.
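Put together, the advice above might look like the following sketch. The index name idx_user_reviewed is made up for illustration. Since the goal is the last 90 days, the date condition is a lower bound on LastReviewed.

```sql
-- Hypothetical index name; the column order (UserID first) is what matters.
ALTER TABLE sales ADD INDEX idx_user_reviewed (UserID, LastReviewed);

SELECT DATEDIFF(CURDATE(), LastReviewed) AS days,
       COUNT(*) AS number
FROM sales
WHERE UserID = 123
  AND LastReviewed >= DATE_SUB(CURDATE(), INTERVAL 90 DAY)  -- constant, index-friendly
GROUP BY days
ORDER BY days ASC;
```

The bare LastReviewed column on the left of the comparison is what lets the index narrow the range; DATEDIFF() then runs only on the rows that survive the filter.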
Builtin functions are much less costly than, say, fetching rows.
You could get a lot more performance improvement with the following 'composite' index:
INDEX(UserID, LastReviewed)
and change to
WHERE UserID=123
AND LastReviewed >= CURRENT_DATE() - INTERVAL 90 DAY
Your formulation is 'hiding' LastReviewed in a function call, making it unusable in an index.
If you are still not satisfied with that improvement, then consider a nightly query that computes yesterday's statistics and puts them in a "Summary table". From there, the SELECT you mentioned can run even faster.
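One possible shape for such a summary table is sketched below. Everything here is an assumption made for illustration: the table name sales_daily_summary, its column names, and the INT type for UserID. It also assumes that a day's counts are final once the day is over, which may not hold if orders are modified again later (LastReviewed would then move to a newer day).

```sql
-- Hypothetical summary table: one row per user per day.
CREATE TABLE sales_daily_summary (
  UserID      INT  NOT NULL,
  review_date DATE NOT NULL,
  number      INT  NOT NULL,
  PRIMARY KEY (UserID, review_date)
);

-- Nightly job: summarize yesterday (half-open range, index-friendly).
INSERT INTO sales_daily_summary (UserID, review_date, number)
SELECT UserID, DATE(LastReviewed), COUNT(*)
FROM sales
WHERE LastReviewed >= CURDATE() - INTERVAL 1 DAY
  AND LastReviewed <  CURDATE()
GROUP BY UserID, DATE(LastReviewed);

-- The report then reads at most about 90 small rows per user.
SELECT DATEDIFF(CURDATE(), review_date) AS days, number
FROM sales_daily_summary
WHERE UserID = 123
  AND review_date >= CURDATE() - INTERVAL 90 DAY
ORDER BY days ASC;
```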
I have searched SO for this question and found slightly similar posts but was unable to adapt to my needs.
I have a database of server requests going back indefinitely, each one with a timestamp, and I'm trying to come up with a query that allows me to create a heat matrix chart (CCC HeatGrid).
The sql query result must represent the server load grouped by each hour of each weekday.
Like this: Example table
I just need the SQL query; I know how to create the chart.
Thank you,
Those look like "counts" of rows.
One of the issues is "sparse" data, we can address that later.
To get the day of the week ('Sunday','Monday',etc.) returned, you can use the DATE_FORMAT function. To get those ordered, we need to include an integer value 0 through 6, or 1 through 7. We can use an ORDER BY clause on that expression to get the rows returned in the order we want.
To get the "hour" across the top, we can use expressions in the SELECT list that conditionally increment the count.
Assuming your timestamp column is named ts, and assuming you want to pull all rows from the year 2014, we start with something like this:
SELECT DAYOFWEEK(t.ts)
, DATE_FORMAT(t.ts,'%W')
FROM mytable t
WHERE t.ts >= '2014-01-01'
AND t.ts < '2015-01-01'
GROUP BY DAYOFWEEK(t.ts)
ORDER BY DAYOFWEEK(t.ts)
(WEEKDAY and DAYOFWEEK are similar, but they number the days differently: DAYOFWEEK returns 1 for Sunday through 7 for Saturday, while WEEKDAY returns 0 for Monday through 6 for Sunday. We want the lowest value for Sunday and the highest value for Saturday, so DAYOFWEEK is the one to use.)
The "trick" now is the columns across the top.
We can extract the "hour" from timestamp using the DATE_FORMAT() function, the HOUR() function, or an EXTRACT() function... take your pick.
The expressions we want are going to return a 1 if the timestamp is in the specified hour, and a zero otherwise. Then, we can use a SUM() aggregate to count up the 1. A boolean expression returns a value of 1 for TRUE and 0 for FALSE.
, SUM( HOUR(t.ts)=0 ) AS `h0`
, SUM( HOUR(t.ts)=1 ) AS `h1`
, SUM( HOUR(t.ts)=2 ) AS `h2`
, '...'
, SUM( HOUR(t.ts)=22 ) AS `h22`
, SUM( HOUR(t.ts)=23 ) AS `h23`
A boolean expression can also evaluate to NULL, but since we have a predicate (i.e. condition in the WHERE clause) that ensures us that ts can't be NULL, that won't be an issue.
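Assembled into one statement, the pieces above might look like the sketch below. It keeps the assumed names mytable and ts, adds the aliases dow and day_name for readability, and lists the day-name expression in GROUP BY as well so that the query is also accepted when ONLY_FULL_GROUP_BY is enabled.

```sql
SELECT DAYOFWEEK(t.ts)         AS dow        -- 1 = Sunday ... 7 = Saturday
     , DATE_FORMAT(t.ts, '%W') AS day_name
     , SUM(HOUR(t.ts) = 0)  AS h0,  SUM(HOUR(t.ts) = 1)  AS h1
     , SUM(HOUR(t.ts) = 2)  AS h2,  SUM(HOUR(t.ts) = 3)  AS h3
     , SUM(HOUR(t.ts) = 4)  AS h4,  SUM(HOUR(t.ts) = 5)  AS h5
     , SUM(HOUR(t.ts) = 6)  AS h6,  SUM(HOUR(t.ts) = 7)  AS h7
     , SUM(HOUR(t.ts) = 8)  AS h8,  SUM(HOUR(t.ts) = 9)  AS h9
     , SUM(HOUR(t.ts) = 10) AS h10, SUM(HOUR(t.ts) = 11) AS h11
     , SUM(HOUR(t.ts) = 12) AS h12, SUM(HOUR(t.ts) = 13) AS h13
     , SUM(HOUR(t.ts) = 14) AS h14, SUM(HOUR(t.ts) = 15) AS h15
     , SUM(HOUR(t.ts) = 16) AS h16, SUM(HOUR(t.ts) = 17) AS h17
     , SUM(HOUR(t.ts) = 18) AS h18, SUM(HOUR(t.ts) = 19) AS h19
     , SUM(HOUR(t.ts) = 20) AS h20, SUM(HOUR(t.ts) = 21) AS h21
     , SUM(HOUR(t.ts) = 22) AS h22, SUM(HOUR(t.ts) = 23) AS h23
  FROM mytable t
 WHERE t.ts >= '2014-01-01'
   AND t.ts <  '2015-01-01'
 GROUP BY DAYOFWEEK(t.ts), DATE_FORMAT(t.ts, '%W')
 ORDER BY dow;
```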
The other issue we can encounter (as I mentioned earlier) is "sparse" data. To illustrate that, consider what happens (with our query) if there are no rows that have a ts value for a Monday. What happens is that we don't get a row in the resultset for Monday. If it does happen that a row is "missing" for Monday (or any day of the week), we do know that all of the hourly counts across the "missing" Monday row would all be zero.
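One common way to fill in such missing rows is to LEFT JOIN the counts onto a fixed list of the seven weekdays and replace the resulting NULLs with zero. The sketch below is an assumption about how that could be done, not part of the original answer. To stay short it shows only a total per weekday; the hourly SUM() columns could be added to the inner query and wrapped in COALESCE() in the same way.

```sql
SELECT d.dow
     , d.day_name
     , COALESCE(c.total, 0) AS total          -- 0 for weekdays with no rows
  FROM ( SELECT 1 AS dow, 'Sunday' AS day_name
         UNION ALL SELECT 2, 'Monday'
         UNION ALL SELECT 3, 'Tuesday'
         UNION ALL SELECT 4, 'Wednesday'
         UNION ALL SELECT 5, 'Thursday'
         UNION ALL SELECT 6, 'Friday'
         UNION ALL SELECT 7, 'Saturday'
       ) d
  LEFT JOIN ( SELECT DAYOFWEEK(t.ts) AS dow   -- same numbering as the list above
                   , COUNT(*)        AS total
                FROM mytable t
               WHERE t.ts >= '2014-01-01'
                 AND t.ts <  '2015-01-01'
               GROUP BY DAYOFWEEK(t.ts)
            ) c
    ON c.dow = d.dow
 ORDER BY d.dow;
```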
I am building a report for people who signed up 1 year ago.
I want to run this report at a given time. So anyone that has been a member for 1 year between two dates.
It's a form with between date1 and date2 with a submit.
So, as an example, I want to see anyone who has been a member between 01-08-2014 and 01-10-2014. Anyone who would have been or was a member between those dates should show in a list.
So far I have this, but it's not displaying any records:
SELECT *
FROM `nfw_users`
WHERE DATE(date_join) BETWEEN 2012-05-20
AND 2012-10-20 AND date_join >= DATE_SUB(NOW(),INTERVAL 1 YEAR)
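The unquoted literals 2012-05-20 and 2012-10-20 in your query are parsed as integer subtraction (2012 - 5 - 20 = 1987 and 2012 - 10 - 20 = 1982), and those numbers likely evaluate to NULL in a "date" context. (That's valid syntax, but likely not what you want.)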
Date literals should be enclosed in single quotes, e.g.
... BETWEEN '2012-05-20' AND '2012-10-20'
            ^          ^     ^          ^
As of right now ('2014-10-14 06:36:36'), this predicate:
date_join >= DATE_SUB(NOW(),INTERVAL 1 YEAR)
is equivalent to:
date_join >= '2013-10-14 06:36:36'
That means that no rows with date_join less than that value will be returned, so no rows can be returned, since there are no date_join values that are greater than '2013-10-14' that are also less than or equal to '2012-10-20'. The predicates in your query make it impossible for any rows to match.
Your specification is a little ambiguous. Some example data, and which rows you expect to be returned would go a long ways towards clarifying the specification. You want to return rows for individuals who were members for exactly one year, or at least one year, within a given date range?
To return rows for "members" who hit a one year anniversary sometime between two specific dates:
WHERE date_join >= '2013-05-20' + INTERVAL -1 YEAR
AND date_join < '2013-10-20' + INTERVAL 1 DAY + INTERVAL -1 YEAR
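As a complete statement against the nfw_users table from the question, that could look like the sketch below. The date range (anniversaries falling from 2013-05-20 through 2013-10-20) is just the example range used above.

```sql
SELECT *
FROM `nfw_users`
WHERE date_join >= '2013-05-20' + INTERVAL -1 YEAR                   -- joined on or after 2012-05-20
  AND date_join <  '2013-10-20' + INTERVAL 1 DAY + INTERVAL -1 YEAR; -- joined before 2012-10-21
```

Comparing the bare date_join column against constants, rather than wrapping it in DATE(), also leaves an index on date_join usable.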
To return rows for "members" who have been (or would have been) members for at least a full year between two dates, I don't see that two boundaries would be required for that, a check against a single lower bound date would be sufficient.
Try the query below. You are missing quotes around the dates, e.g. "2012-05-20".
SELECT *
FROM `nfw_users`
WHERE DATE(date_join) BETWEEN "2013-05-20" AND "2013-10-20"
AND YEAR(date_join) = YEAR(NOW() - INTERVAL 1 YEAR)
I have a table that holds the information of installation dates for shop displays in a store. Every shop display has a certain warranty period, which can be 6 months, 1 year, 2 years etc., which I store in another table as "6 MONTH", "1 YEAR", "2 YEAR", etc.
Is it possible to calculate the expiry dates for each shop display using a single MySQL query? I am looking for something like this:
SELECT t1.install_date, (t1.install_date + INTERVAL t2.period) as expiry_date FROM t1, t2
So, I am basically trying to treat string values that I get from the t2 table as a part of MySQL statement. Is this possible to do? For my query MySQL does not give any errors but displays the values in the expiry_date column like this: "20110228", "20110224" for the corresponding values of t1.install_date "2011-02-28", "2011-02-24".
This is harder than I thought because DATE_ADD won't accept a string as an INTERVAL unit. So the solution I came up with was to dynamically convert all your YEAR values to months by multiplying by 12.
Also, you need to specify a join condition; otherwise you will be calculating intervals for the entire Cartesian product of the two tables.
SELECT t1.install_date,
       DATE_ADD(t1.install_date,
                INTERVAL IF(SUBSTRING_INDEX(t2.period, ' ', -1) = 'YEAR',
                            SUBSTRING_INDEX(t2.period, ' ', 1) * 12,  -- years converted to months
                            SUBSTRING_INDEX(t2.period, ' ', 1))       -- already in months
                MONTH) AS expiry_date
FROM t1, t2
WHERE t1.id = t2.id
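If units other than YEAR and MONTH can appear, an alternative is a CASE expression with one branch per unit, since the unit keyword must be written literally while the count can be any expression. This is only a sketch: the DAY branch is an assumed extra unit that the question does not mention, and the join on id is carried over from the query above.

```sql
SELECT t1.install_date,
       CASE SUBSTRING_INDEX(t2.period, ' ', -1)      -- unit part, e.g. 'MONTH'
         WHEN 'YEAR'  THEN t1.install_date
                           + INTERVAL CAST(SUBSTRING_INDEX(t2.period, ' ', 1) AS UNSIGNED) YEAR
         WHEN 'MONTH' THEN t1.install_date
                           + INTERVAL CAST(SUBSTRING_INDEX(t2.period, ' ', 1) AS UNSIGNED) MONTH
         WHEN 'DAY'   THEN t1.install_date           -- assumed extra unit
                           + INTERVAL CAST(SUBSTRING_INDEX(t2.period, ' ', 1) AS UNSIGNED) DAY
       END AS expiry_date                            -- NULL for an unrecognized unit
FROM t1
JOIN t2 ON t1.id = t2.id;
```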
Have a look at STR_TO_DATE(str,format). But I think this will be really hard. Good luck.