I need to select the one row with the "highest" date and time from a table, but I can't get it: ORDER BY ... DESC doesn't seem to work.
Here's my query:
SELECT count(*) as c,
start,
UNIX_TIMESTAMP(start) as S,
duration
FROM appuntamento
WHERE DATE(start) = DATE('2014-04-08 18:30:00')
ORDER BY S DESC
LIMIT 1
I don't care about getting the start value as a Unix timestamp; that was just my nth attempt at getting this to work.
Any tips?
A few problems here. First, the presence of COUNT(*) turns this into an aggregate query, which you do not want. That's probably the cause of your trouble.
Second, if you have a lot of rows in your appuntamento table, the performance of this query will be bad, because wrapping the column in DATE(start) prevents the use of an index on start.
Presuming that you want the time and duration of the last (latest-in-time) row from a particular day in your table, and the number of appointments for that same day, you need to do this:
SELECT a.start, a.duration, b.count
FROM (
SELECT start,
duration
FROM appuntamento
WHERE start >= DATE('2014-04-08 18:30:00')
AND start < DATE('2014-04-08 18:30:00') + INTERVAL 1 DAY
ORDER BY start DESC, duration DESC
LIMIT 1
) AS a
JOIN (
SELECT COUNT(*) AS count
FROM appuntamento
WHERE start >= DATE('2014-04-08 18:30:00')
AND start < DATE('2014-04-08 18:30:00') + INTERVAL 1 DAY
) AS b
Explanation: First, this form of searching on start allows you to use an index on the start column. You want that for performance reasons.
WHERE start >= DATE('2014-04-08 18:30:00')
AND start < DATE('2014-04-08 18:30:00') + INTERVAL 1 DAY
Second, you need to handle the COUNT(*) as a separate subquery. I have done that.
Third, you can definitely do ORDER BY start DESC and it will work if start is a DATETIME column. No need for UNIX_TIMESTAMP().
Fourth, I used ORDER BY start DESC, duration DESC to arrange to return the longest appointment if there happen to be several with the same start time.
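To make the four points above concrete, here is a minimal sketch that runs the same shape of query against an in-memory SQLite database from Python. SQLite only stands in for MySQL here: date(x, '+1 day') is SQLite's spelling of DATE(x) + INTERVAL 1 DAY, the alias n replaces count, and the sample rows and the index name idx_start are made up for illustration.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE appuntamento (start TEXT, duration INTEGER)")
conn.execute("CREATE INDEX idx_start ON appuntamento (start)")
conn.executemany(
    "INSERT INTO appuntamento VALUES (?, ?)",
    [
        ("2014-04-07 10:00:00", 30),
        ("2014-04-08 09:00:00", 30),
        ("2014-04-08 18:30:00", 45),
        ("2014-04-08 18:30:00", 60),
        ("2014-04-09 08:00:00", 15),
    ],
)

# Half-open range on the bare column: start is compared directly,
# not wrapped in a function, so an index on it remains usable.
day_filter = "start >= date(:ts) AND start < date(:ts, '+1 day')"

latest_with_count = conn.execute(
    f"""
    SELECT a.start, a.duration, b.n
    FROM (SELECT start, duration
          FROM appuntamento
          WHERE {day_filter}
          ORDER BY start DESC, duration DESC
          LIMIT 1) AS a
    CROSS JOIN (SELECT COUNT(*) AS n
                FROM appuntamento
                WHERE {day_filter}) AS b
    """,
    {"ts": "2014-04-08 18:30:00"},
).fetchone()

print(latest_with_count)  # ('2014-04-08 18:30:00', 60, 3)
```

Two rows share the latest start time, and the duration DESC tie-breaker picks the longer one, while the count comes from the separate subquery.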
If all you want is one row returned, use the MAX() function without the ORDER BY. That should do the trick.
SELECT count(*) as c,
MAX(start) as highest_date,
UNIX_TIMESTAMP(start) as S,
duration
FROM appuntamento
WHERE DATE(start) = DATE('2014-04-08 18:30:00')
Also, regarding your ORDER BY: you need to add a GROUP BY so that you aren't combining unrelated rows under the COUNT() aggregate.
SELECT count(*) as c,
start,
UNIX_TIMESTAMP(start) as S,
duration
FROM appuntamento
WHERE DATE(start) = DATE('2014-04-08 18:30:00')
GROUP BY S
ORDER BY S DESC
LIMIT 1
Related
I recently upgraded my MySQL server to version 5.7 and the following example query does not work:
SELECT *
FROM (SELECT *
FROM exam_results
WHERE exam_body_id = 6674
AND exam_date >= DATE_SUB(CURDATE(), INTERVAL 1 WEEK)
AND subject_ids LIKE '%4674%'
ORDER BY score DESC
) AS top_scores
GROUP BY user_id
ORDER BY percent_score DESC, time_advantage DESC
LIMIT 10
The query is supposed to select exam results from the specified table matching the top scorers who wrote a particular exam within some time interval. The reason why I had to include a GROUP BY clause when I first wrote the query was to eliminate duplicate users, i.e. users who have more than one top score from writing the exam within the same time period. Without eliminating duplicate user IDs, a query for the top 10 high scorers could return exam results from the same person.
My question is: how do I rewrite this query to remove the error associated with MySQL 5.7 strict mode enforced on GROUP BY clauses while still retaining the functionality I want?
That is because you never really wanted aggregation to begin with. So, you used a MySQL extension that allowed your syntax -- even though it is wrong by the definition of SQL: The GROUP BY and SELECT clauses are incompatible.
You appear to want the row with the maximum score for each user meeting the filtering conditions. A much better approach is to use window functions:
SELECT er.*
FROM (SELECT er.*,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY score DESC) as seqnum
FROM exam_results er
WHERE exam_body_id = 6674 AND
exam_date >= DATE_SUB(CURDATE(), INTERVAL 1 WEEK) AND
subject_ids LIKE '%4674%'
) er
WHERE seqnum = 1
ORDER BY percent_score DESC, time_advantage DESC
LIMIT 10;
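As a rough illustration of the ROW_NUMBER() approach, the sketch below runs a trimmed-down version of the query from Python against SQLite (3.25 or later, which also supports window functions). The sample rows are invented and the filtering columns are left out, so treat it as a demonstration of the pattern rather than the full query.

```python
import sqlite3

# Hypothetical sample data: users 1 and 2 each have two results.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE exam_results (user_id INTEGER, score INTEGER, percent_score REAL)"
)
conn.executemany(
    "INSERT INTO exam_results VALUES (?, ?, ?)",
    [(1, 70, 70.0), (1, 90, 90.0), (2, 85, 85.0), (2, 60, 60.0), (3, 75, 75.0)],
)

# Number each user's rows from best score to worst, then keep only row 1.
best_per_user = conn.execute(
    """
    SELECT user_id, score, percent_score
    FROM (SELECT er.*,
                 ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY score DESC) AS seqnum
          FROM exam_results er) ranked
    WHERE seqnum = 1
    ORDER BY percent_score DESC
    LIMIT 10
    """
).fetchall()

print(best_per_user)  # [(1, 90, 90.0), (2, 85, 85.0), (3, 75, 75.0)]
```

Each user appears once, and every column in a returned row comes from the same stored row.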
You can do something similar in older versions of MySQL. Probably the closest method uses variables:
SELECT er.*,
(@rn := if(@u = user_id, @rn + 1,
if(@u := user_id, 1, 1)
)
) as rn
FROM (SELECT er.*
FROM exam_results er
WHERE exam_body_id = 6674 AND
exam_date >= DATE_SUB(CURDATE(), INTERVAL 1 WEEK) AND
subject_ids LIKE '%4674%'
ORDER BY user_id, score DESC
) er CROSS JOIN
(SELECT @u := -1, @rn := 0) params
HAVING rn = 1
ORDER BY percent_score DESC, time_advantage DESC
LIMIT 10
When you aggregate (GROUP BY) a result set by a subset of the columns (user_id), then all the other columns need to be aggregated.
Note: according to the SQL Standard if you are grouping by the primary key this is not necessary, since all the other columns are dependent on the PK. Nevertheless, this is not the case in your question.
Now, you can use any aggregation function like MAX(), MIN(), SUM(), etc. I chose to use MAX(), but you can change it for any of them.
The query can run as:
SELECT
user_id,
max(exam_body_id),
max(exam_date),
max(subject_ids),
max(percent_score),
max(time_advantage)
FROM exam_results
WHERE exam_body_id = 6674
AND exam_date >= DATE_SUB(CURDATE(), INTERVAL 1 WEEK)
AND subject_ids LIKE '%4674%'
GROUP BY user_id
ORDER BY max(percent_score) DESC, max(time_advantage) DESC
LIMIT 10
See running example at DB Fiddle.
Now, why do you need to aggregate the other columns, you ask? Since you are grouping rows, the engine needs to produce a single row per group. Therefore, you need to tell the engine which value to pick when there are many values to pick from: the biggest one, the smallest one, the average of them, etc.
In MySQL 5.7.4 or older, the engine didn't require you to aggregate the other columns. The engine silently and randomly decided for you. You may have got the result you wanted today, but tomorrow the engine could choose the MIN() instead of the MAX() without you knowing, therefore leading to unpredictable results every time you run the query.
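One thing worth checking before relying on per-column aggregates: each MAX() is computed independently, so the values in one output row may come from different input rows. The small sketch below (Python with an in-memory SQLite database standing in for MySQL, invented sample data, and only a few of the columns) shows that effect.

```python
import sqlite3

# Hypothetical data: one user with two exam results.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE exam_results "
    "(user_id INTEGER, exam_date TEXT, percent_score REAL, time_advantage INTEGER)"
)
conn.executemany(
    "INSERT INTO exam_results VALUES (?, ?, ?, ?)",
    [(1, "2024-01-01", 90.0, 5), (1, "2024-01-03", 60.0, 40)],
)

stored = conn.execute("SELECT * FROM exam_results").fetchall()

# Every column is aggregated on its own within the group.
mixed = conn.execute(
    """
    SELECT user_id, MAX(exam_date), MAX(percent_score), MAX(time_advantage)
    FROM exam_results
    GROUP BY user_id
    """
).fetchone()

print(mixed)            # (1, '2024-01-03', 90.0, 40)
print(mixed in stored)  # False: no stored row has this combination
```

If you need the other columns to belong to the row holding the top score, a row-numbering query keeps each row intact.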
An alternative to Gordon's answer using user-defined variables and a CASE conditional statement for older versions of MySQL is as follows:
SELECT *
FROM (
SELECT *,
@row_number := CASE WHEN @user_id <> er.user_id
THEN 1
ELSE @row_number + 1 END
AS row_number,
@user_id := er.user_id
FROM exam_results er
CROSS JOIN (SELECT @row_number := 0, @user_id := null) params
WHERE exam_body_id = 6674 AND
exam_date >= DATE_SUB(CURDATE(), INTERVAL 1 WEEK) AND
subject_ids LIKE '%4674%'
ORDER BY er.user_id, score DESC
) inner_er
HAVING inner_er.row_number = 1
ORDER BY score DESC, percent_score DESC, time_advantage DESC
LIMIT 10
This achieved the filtering behavior I wanted without having to rely on the unpredictable behavior of a GROUP BY clause and aggregate functions.
I am new to SQL. How do I write a query that returns a minimum value together with its timestamp?
Previously I managed to get the minimum value, but not its timestamp, with this query:
SELECT min(score) as lowest
FROM rank
WHERE time >= CAST(CURDATE() AS DATE)
here is the table that i've created:
(cannot attach image because of the reputation rule)
sorry for the bad english.
If you either expect that there would be only a single record with the lowest score, or if there be ties, you don't care which record gets returned, then using LIMIT might be the easiest way to go here:
SELECT timestamp, score
FROM rank
WHERE time >= CAST(CURDATE() AS DATE)
ORDER BY score
LIMIT 1;
If you care about ties, and want to see all of them, then we can use a subquery:
SELECT timestamp, score
FROM rank
WHERE time >= CAST(CURDATE() AS DATE) AND
score = (SELECT MIN(score) FROM rank WHERE time >= CAST(CURDATE() AS DATE));
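Here is a small sketch of both variants, run from Python against an in-memory SQLite database as a stand-in for MySQL. The table name scores, the sample rows, and the :today parameter (used in place of CURDATE()) are assumptions for the example; I also added time as a tie-breaker so the LIMIT 1 result is deterministic.

```python
import sqlite3

# Hypothetical table and data; two of today's rows tie for the lowest score.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (time TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [
        ("2024-04-30 23:00:00", 1),  # yesterday, filtered out
        ("2024-05-01 08:00:00", 7),
        ("2024-05-01 09:00:00", 3),
        ("2024-05-01 11:00:00", 3),
    ],
)
params = {"today": "2024-05-01"}

# Variant 1: one row only, ties resolved by the earliest time.
single = conn.execute(
    "SELECT time, score FROM scores WHERE time >= :today "
    "ORDER BY score, time LIMIT 1",
    params,
).fetchone()

# Variant 2: every row that matches today's minimum score.
all_ties = conn.execute(
    """
    SELECT time, score
    FROM scores
    WHERE time >= :today
      AND score = (SELECT MIN(score) FROM scores WHERE time >= :today)
    ORDER BY time
    """,
    params,
).fetchall()

print(single)    # ('2024-05-01 09:00:00', 3)
print(all_ties)  # [('2024-05-01 09:00:00', 3), ('2024-05-01 11:00:00', 3)]
```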
It's possible in the following way.
Note: It only works if you want to get a single record at once
select score, time
FROM rank
WHERE time >= CAST(CURDATE() AS DATE)
ORDER BY score ASC LIMIT 1
I have a simple table for events with a date column. I can easily select
the next n events with (assuming n = 3):
SELECT * FROM events WHERE `date` > NOW() ORDER BY `date` LIMIT 3
However, there will not always be 3 events in the future. In this case,
I'd like to return the ones available in the future and complete what is
missing with the closest ones to today. E.g., if today is day 12-04, the
following dates marked with a * should be selected of the whole list:
10-03
20-03
30-03 *
10-04 *
20-04 *
While I can easily check the result of the first query to find out how
many rows were returned and build another query to find the past dates
if necessary, I'm interested to know if there is a way to fetch these
rows in a single query.
You can use multiple keys in the order by. So:
SELECT e.*
FROM events e
ORDER BY (date > now()) DESC, -- future events first
(CASE WHEN date > now() THEN date END) ASC, -- nearest future events first
date DESC -- other dates in descending order
LIMIT 3;
If your table is large, it is probably faster to get three events from the near future and near past and combine those:
select e.*
from ((select e.*
from events e
where date > now()
order by date asc
limit 3
) union all
(select e.*
from events e
where date <= now()
order by date desc
limit 3
)
) e
order by date desc
limit 3;
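To check the multi-key ORDER BY against the example dates from the question, here is a minimal sketch in Python using an in-memory SQLite database as a stand-in for MySQL. The year 2024 and the :today parameter (replacing now()) are assumptions so that the result is reproducible.

```python
import sqlite3

# The five dates from the question, written as ISO strings with an assumed year.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (date TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?)",
    [(d,) for d in ["2024-03-10", "2024-03-20", "2024-03-30", "2024-04-10", "2024-04-20"]],
)

today = "2024-04-12"
picked = [
    row[0]
    for row in conn.execute(
        """
        SELECT e.date
        FROM events e
        ORDER BY (e.date > :today) DESC,                          -- future events first
                 CASE WHEN e.date > :today THEN e.date END ASC,   -- nearest future first
                 e.date DESC                                      -- then most recent past
        LIMIT 3
        """,
        {"today": today},
    )
]

print(picked)  # ['2024-04-20', '2024-04-10', '2024-03-30']
```

Only one event lies in the future, so the remaining two slots are filled with the most recent past events, which matches the rows marked with * in the question.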
I want to create a timeline report that shows, for each date in the timeline, a moving average of the latest N data points in a data set that has some measures and the dates they were measured. I have a calendar table populated with every day to provide the dates. I can calculate a timeline to show the overall average prior to that date fairly simply with a correlated subquery (the real situation is much more complex than this, but it can essentially be simplified to this):
SELECT c.date
, ( SELECT AVG(m.value)
FROM measures as m
WHERE m.measured_on_dt <= c.date
) as `average_to_date`
FROM calendar c
WHERE c.date between date1 AND date2 -- graph boundaries
ORDER BY c.date ASC
I've spent days reading around this and I've not found any good solutions. Some have suggested that LIMIT might work in the subquery (LIMIT is supported in subqueries in the current version of MySQL); however, LIMIT applies to the return set, not to the rows going into the aggregate, so adding it makes no difference.
Nor can I write a non-aggregated SELECT with a LIMIT and then aggregate over that, because a correlated subquery is not allowed inside a FROM statement. So this (sadly) WON'T work:
SELECT c.date
, SELECT AVG(last_5.value)
FROM ( SELECT m.value
FROM measures as m
WHERE m.measured_on_dt <= c.date
ORDER BY m.measured_on_dt DESC
LIMIT 5
) as `last_5`
FROM calendar c
WHERE c.date between date1 AND date2 -- graph boundaries
ORDER BY c.date ASC
I'm thinking I need to avoid the subquery approach completely and see if I can do this with a clever join / row-numbering technique with user variables and then aggregate that. While I'm working on that, I thought I'd ask whether anyone knows a better method.
UPDATE:
Okay, I've got a solution working, which I've simplified for this example. It relies on some user-variable trickery to number the measures backwards from the calendar date. It also does a cross product with the calendar table (instead of a subquery), but this has the unfortunate side effect of causing the row-numbering trick to fail (user variables are evaluated when they're sent to the client, not when the row is evaluated). To work around this, I've had to nest the query one level, order the results, and then apply the row-numbering trick to that set, which then works.
This query only returns calendar dates for which there are measures, so if you wanted the whole timeline you'd simply select the calendar and LEFT JOIN to this result set.
set @day = 0;
set @num = 0;
set @LIMIT = 5;
SELECT full_date
, AVG(value) as recent_N_AVG
FROM
( SELECT *
, @num := if(@day = full_date, @num + 1, 1) as day_row_number
, @day := full_date as dummy
FROM
( SELECT c.full_date
, m.value
, m.measured_on_dt
FROM calendar c
JOIN measures as m
WHERE m.measured_on_dt <= c.full_date
AND c.full_date BETWEEN date1 AND date2
ORDER BY c.full_date ASC, measured_on_dt DESC
) as full_data
) as numbered
WHERE day_row_number <= @LIMIT
GROUP BY full_date
The row numbering trick can be generalised to more complex data (my measures are in several dimensions which need aggregating up).
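For anyone on a database with window functions, the same row-numbering idea can be sketched without user variables. The example below is only an illustration: it uses Python with an in-memory SQLite database (3.25 or later) as a stand-in, ROW_NUMBER() in place of the variable trick, invented sample data, and N = 2 to keep the numbers easy to check.

```python
import sqlite3

# Hypothetical data: four daily measures and a three-day calendar window.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calendar (full_date TEXT)")
conn.execute("CREATE TABLE measures (measured_on_dt TEXT, value REAL)")
conn.executemany(
    "INSERT INTO calendar VALUES (?)",
    [("2024-01-02",), ("2024-01-03",), ("2024-01-04",)],
)
conn.executemany(
    "INSERT INTO measures VALUES (?, ?)",
    [("2024-01-01", 10.0), ("2024-01-02", 20.0), ("2024-01-03", 30.0), ("2024-01-04", 40.0)],
)

# For each calendar date, number the measures taken on or before it from
# newest to oldest, keep the first N, and average those.
moving_avg = conn.execute(
    """
    SELECT full_date, AVG(value) AS recent_n_avg
    FROM (SELECT c.full_date,
                 m.value,
                 ROW_NUMBER() OVER (PARTITION BY c.full_date
                                    ORDER BY m.measured_on_dt DESC) AS day_row_number
          FROM calendar c
          JOIN measures m ON m.measured_on_dt <= c.full_date) numbered
    WHERE day_row_number <= :n
    GROUP BY full_date
    ORDER BY full_date
    """,
    {"n": 2},
).fetchall()

print(moving_avg)
# [('2024-01-02', 15.0), ('2024-01-03', 25.0), ('2024-01-04', 35.0)]
```

As with the original, calendar dates that have no earlier measures drop out of the result, so a LEFT JOIN from the calendar is still needed for a complete timeline.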
If your timeline is continuous (1 value each day) you could improve your first attempt like this:
SELECT c.date,
( SELECT AVG(m.value)
FROM measures as m
WHERE m.measured_on_dt
BETWEEN DATE_SUB(c.date, INTERVAL 5 day) AND c.date
) as `average_to_date`
FROM calendar c
WHERE c.date between date1 AND date2 -- graph boundaries
ORDER BY c.date ASC
If your timeline has holes in it, this would result in fewer than 5 values for the average.
Let's say I have a table of messages that users have sent, each with a timestamp.
I want to make a query that will tell me (historically) the largest number of messages a user ever sent in an hour.
In other words: in any given 1-hour period, what was the largest number of messages sent?
Any ideas?
Assuming timestamp to be a DATETIME - otherwise, use FROM_UNIXTIME to convert to a DATETIME...
For a [rolling] count within the last hour:
SELECT COUNT(*) AS cnt
FROM MESSAGES m
WHERE m.timestamp BETWEEN DATE_SUB(NOW(), INTERVAL 1 HOUR)
AND NOW()
GROUP BY m.user
ORDER BY cnt DESC
LIMIT 1
If you want a specific hour, specify the hour:
SELECT COUNT(*) AS cnt
FROM MESSAGES m
WHERE m.timestamp BETWEEN '2011-06-06 14:00:00'
AND '2011-06-06 15:00:00'
GROUP BY m.user
ORDER BY cnt DESC
LIMIT 1
Need more details on table structure etc. but something like:
select date(timestmp), hour(timestmp) , count(*)
from yourtable group by date(timestmp) , hour(timestmp)
order by count(*) DESC
limit 100;
would give you the desired result.
Something like this should work:
SELECT MAX(PerHr) FROM
(SELECT COUNT(*) AS PerHr FROM messages WHERE msg_uid=?
GROUP BY FLOOR(msg_time/3600)) t
I suspect this would be horribly slow, but for an arbitrary historical max hour, something like this might work (downvote me if I'm way off, I'm not a MySQL person):
SELECT base.user, base.time, COUNT(later.time)
FROM messages base
INNER JOIN messages later ON later.time BETWEEN base.time AND DATE_ADD(base.time, INTERVAL 1 HOUR) AND base.user = later.user
WHERE base.user = ? -- bind a single user here; this query only works for one user
GROUP BY base.user, base.time
ORDER BY COUNT(later.time) DESC
LIMIT 1
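To see how fixed hourly buckets and a sliding one-hour window can disagree, here is a small sketch in Python with an in-memory SQLite database as a stand-in for MySQL. The column names msg_user and ts, the integer Unix-second timestamps, and the sample rows are all assumptions for the example; the window is half-open (start inclusive, start + 3600 exclusive), which differs slightly from BETWEEN.

```python
import sqlite3

# Hypothetical data: five messages clustered around an hour boundary, one later.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (msg_user TEXT, ts INTEGER)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?)",
    [("u1", t) for t in (3000, 3300, 3500, 3700, 3900, 9000)],
)

# Fixed buckets: integer division assigns each message to a clock hour.
bucket_max = conn.execute(
    """
    SELECT MAX(per_hr)
    FROM (SELECT COUNT(*) AS per_hr
          FROM messages
          WHERE msg_user = ?
          GROUP BY ts / 3600) t
    """,
    ("u1",),
).fetchone()[0]

# Sliding window: for each message, count messages in the hour starting at it.
sliding_max = conn.execute(
    """
    SELECT base.ts, COUNT(*) AS n
    FROM messages base
    JOIN messages later
      ON later.msg_user = base.msg_user
     AND later.ts >= base.ts
     AND later.ts < base.ts + 3600
    WHERE base.msg_user = ?
    GROUP BY base.ts
    ORDER BY n DESC, base.ts
    LIMIT 1
    """,
    ("u1",),
).fetchone()

print(bucket_max)   # 3
print(sliding_max)  # (3000, 5)
```

The burst straddles the boundary between two clock hours, so the bucketed count reports 3 while the window starting at the first message sees all 5. The self-join does more work per row, so it is likely to be slow on a large table.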