Select all the rows (a varying number) with the latest date - MySQL

A MySQL database with data as follows:
project_id | updated | next_steps
1 | 2014-08-01 03:19:20 | new_com
2 | 2014-08-12 03:20:34 | NULL
3 | 2014-08-12 07:01:12 | NULL
4 | 2014-08-05 09:25:45 | comment
I want to select all the rows with the latest date in the 'updated' column. The difference in hours/minutes should be ignored. For this example I expect to get rows 2 and 3:
2 | 2014-08-12 03:20:34 | NULL
3 | 2014-08-12 07:01:12 | NULL
In the real table, the number of rows that meet this criterion changes daily and could be 100, 200, 324, etc. (it is not a fixed number). I have tried the following queries and always get errors.
SELECT * FROM `table` WHERE updated LIKE %DATE(MAX(updated))%;
or
SELECT * FROM `table` WHERE updated LIKE %CAST(DATE(MAX(updated)) AS CHAR)%;
Error message is
"#1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '%CAST(DATE(MAX(updated)) AS CHAR)% LIMIT 0, 30' at line 1"

Use a subquery that returns the latest date with the time part stripped off:
SELECT MAX(DATE(updated)) FROM `table`
For the sample data this returns 2014-08-12. Then select the rows whose date matches it:
SELECT * FROM `table` WHERE DATE(updated) = (SELECT MAX(DATE(updated)) FROM `table`)
This returns every row that was updated on the latest date.

You need a WHERE clause that uses DATE(x) to compute the latest date without its time part, and then selects every row whose date (again without the time) matches it.
Try this:
SELECT * FROM `table` WHERE DATE(`updated`) = (SELECT MAX(DATE(`updated`)) FROM `table`)
And if you still want them ordered
SELECT * FROM `table`
WHERE DATE(`updated`) = (SELECT MAX(DATE(`updated`)) FROM `table`) ORDER BY `updated` DESC
Happy Coding!
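To see the subquery approach run end to end, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained; the table name `projects` is hypothetical). SQLite's DATE() strips the time part in the same way for this data.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE projects (project_id INTEGER, updated TEXT, next_steps TEXT)"
)
conn.executemany(
    "INSERT INTO projects VALUES (?, ?, ?)",
    [
        (1, "2014-08-01 03:19:20", "new_com"),
        (2, "2014-08-12 03:20:34", None),
        (3, "2014-08-12 07:01:12", None),
        (4, "2014-08-05 09:25:45", "comment"),
    ],
)

# The subquery finds the latest calendar date; the outer query keeps
# every row whose date (time ignored) equals it.
latest_rows = conn.execute(
    """
    SELECT project_id, updated, next_steps
    FROM projects
    WHERE DATE(updated) = (SELECT MAX(DATE(updated)) FROM projects)
    ORDER BY updated DESC
    """
).fetchall()

for row in latest_rows:
    print(row)
```

The number of rows returned depends only on how many rows share the latest date, so it works for 2 rows or 324.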

If you only want the two most recent rows (a fixed count, regardless of their date):
SELECT * FROM `table` ORDER BY `updated` DESC LIMIT 2

Related

MySQL: Select multiple columns but with the latest value of one column for each distinct value

I would like to select several columns. One column holds two distinct values, "top" and "bot". Every row has a timestamp. For each of the values "top" and "bot", I would like to get the row with the latest timestamp.
Table:
uid | datetime | device | temp | hum
================================================
1 |2022-08-30 17:34:34 |top |11.5 |88.90
2 |2022-08-30 17:34:22 |bot |13.2 |88.90
3 |2020-10-06 13:48:33 |top |24.3 |75.00
4 |2020-10-06 14:35:37 |bot |21.7 |75.00
I would like to get the following result with the SQL statement:
datetime | device | temp | hum
===========================================
2022-08-30 17:34:34 |top |11.5 |88.90
2022-08-30 17:34:22 |bot |13.2 |88.90
But what I get with my current statement is:
datetime | device | temp | hum
===========================================
2020-10-06 13:48:33 |top |24.3 |75.00
2020-10-06 14:35:37 |bot |21.7 |75.00
So it's not the most current row, it is the oldest one.
My best SQL try so far is:
SELECT datetime, device, temp, hum
FROM <table_name>
WHERE uid=123456789
GROUP BY device
ORDER BY datetime DESC
LIMIT 2
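This one should work fine for you. It matches each device against its own latest timestamp, so a row is not picked up just because its timestamp happens to equal another device's maximum:
SELECT datetime, device, temp, hum
FROM table_name
WHERE (device, datetime) IN (SELECT device, MAX(datetime) FROM table_name GROUP BY device)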
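The same "latest row per device" idea can also be written as a join against a derived table of per-device maxima. The sketch below uses Python's sqlite3 module in place of MySQL (an assumption for runnability) and a hypothetical table name `readings`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings "
    "(uid INTEGER, datetime TEXT, device TEXT, temp REAL, hum REAL)"
)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2022-08-30 17:34:34", "top", 11.5, 88.90),
        (2, "2022-08-30 17:34:22", "bot", 13.2, 88.90),
        (3, "2020-10-06 13:48:33", "top", 24.3, 75.00),
        (4, "2020-10-06 14:35:37", "bot", 21.7, 75.00),
    ],
)

# Derived table m holds one (device, latest timestamp) pair per device;
# joining on both columns keeps only each device's newest row.
latest_per_device = conn.execute(
    """
    SELECT r.datetime, r.device, r.temp, r.hum
    FROM readings AS r
    JOIN (
        SELECT device, MAX(datetime) AS latest
        FROM readings
        GROUP BY device
    ) AS m
      ON m.device = r.device AND m.latest = r.datetime
    ORDER BY r.datetime DESC
    """
).fetchall()

for row in latest_per_device:
    print(row)
```

If two rows for the same device share the exact same latest timestamp, both are returned.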

Is it possible to get a sum without changing sql_mode?

I have a problem getting a SUM value from multiple tables using a JOIN. The error is:
this is incompatible with sql_mode=only_full_group_by
Is it possible to get the sum without changing sql_mode? If so, how should the SQL statement be written?
Table fuel:
vehicle_id | liter
-----------+-----------
2 | 43.5
4 | 78.3
8 | 20.5
Table usage:
date_usage | vehicle_id
-----------+-----------
2019-10-01 | 8
2019-10-15 | 2
2019-10-20 | 8
2019-10-20 | 4
2019-11-02 | 8
The SQL statement is below:
SELECT fuel.vehicle_id, SUM(fuel.liter), usage.date_usage
FROM fuel
LEFT JOIN usage ON fuel.vehicle_id = usage.vehicle_id
WHERE fuel.vehicle_id='8'
AND usage.date_usage >='2019-10-01' AND usage.date_usage <='2019-10-31'
GROUP BY fuel.vehicle_id
You have two options to solve this:
1. Remove the column usage.date_usage from the SELECT list.
2. Use an aggregate function on the column usage.date_usage too:
   MAX to get the highest value of the column.
   MIN to get the lowest value of the column.
   ANY_VALUE to get an arbitrary value of the column.
So your query can look like the following (using MAX on column usage.date_usage):
SELECT fuel.vehicle_id, SUM(fuel.liter), MAX(`usage`.date_usage)
FROM fuel LEFT JOIN `usage` ON fuel.vehicle_id = `usage`.vehicle_id
WHERE fuel.vehicle_id = 8
AND `usage`.date_usage BETWEEN '2019-10-01' AND '2019-10-31'
GROUP BY fuel.vehicle_id
Note: be careful using words like usage as identifiers (e.g. table and column names) in MySQL. There are keywords and reserved words you should avoid or quote with backticks.
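As a rough check of the aggregated query's shape, here is a sketch against the sample data using Python's sqlite3 module. This is an assumption made for runnability: SQLite does not enforce ONLY_FULL_GROUP_BY, so it cannot reproduce the original error, only the result of the rewritten query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE fuel (vehicle_id INTEGER, liter REAL);
    CREATE TABLE "usage" (date_usage TEXT, vehicle_id INTEGER);
    INSERT INTO fuel VALUES (2, 43.5), (4, 78.3), (8, 20.5);
    INSERT INTO "usage" VALUES
        ('2019-10-01', 8), ('2019-10-15', 2), ('2019-10-20', 8),
        ('2019-10-20', 4), ('2019-11-02', 8);
    """
)

# Every selected column is either in GROUP BY or wrapped in an aggregate,
# which is what ONLY_FULL_GROUP_BY requires in MySQL.
row = conn.execute(
    """
    SELECT fuel.vehicle_id, SUM(fuel.liter), MAX("usage".date_usage)
    FROM fuel
    LEFT JOIN "usage" ON fuel.vehicle_id = "usage".vehicle_id
    WHERE fuel.vehicle_id = 8
      AND "usage".date_usage BETWEEN '2019-10-01' AND '2019-10-31'
    GROUP BY fuel.vehicle_id
    """
).fetchone()

print(row)
```

With this data the sum comes out as 41.0 rather than 20.5, because the single fuel row for vehicle 8 joins to two October usage rows and is counted once per match. Whether that is the intended total depends on what the fuel table is meant to record.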

Random sampling using subquery and rand() gives unexpected results

Edit: If it makes any difference, I am using mysql 5.7.19.
I have a table A, and am trying to randomly sample on average 10% of the rows. I have decided that using rand() in a subquery, and then filtering out on that random result would do the trick, but it is giving unexpected results. When I print out the randomly generated value after filtering, I get random values that do not match my main query's "where" clause, so I suppose it is regenerating the random value in the outer select.
I guess I'm missing something to do with subqueries and when things are executed, but I'm really not sure what's going on.
Can anyone explain what I might be doing wrong? I've checked out this post: "In which sequence are queries and sub-queries executed by the SQL engine?", and my subquery is correlated, so I assume that my subquery is being executed first and the main query then filters on its result. Given my assumptions, I do not understand why the result has values that should have been filtered away.
Query:
select
*
from
(
select
*,
rand() as rand_value
from
A
) a_rand
where
rand_value < 0.1;
Result:
--------------------------------------
| id | events | rand_value |
--------------------------------------
| c | 1 | 0.5512495763145849 | <- not what I expected
--------------------------------------
I am not able to reproduce this in SQL Fiddle, even after running the setup and query below several times:
CREATE TABLE Table1
(`x` int)
;
INSERT INTO Table1
(`x`)
VALUES
(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
;
Query 1:
select
*
from (
select
*
, rand() as rand_value
from Table1
) a_rand
where
rand_value < 0.1
[Results]:
| x | rand_value |
|---|---------------------|
| 1 | 0.03006686086772649 |
| 1 | 0.09353976332912199 |
| 1 | 0.08519635823107917 |
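The thread does not settle why the filter and the displayed value disagreed for the asker. One possible explanation (an assumption here, not confirmed by the source) is that the optimizer merges the derived table into the outer query, so rand() is evaluated once for the filter and again for the output. A way to sidestep the question is to store one random value per row first and only then filter on the stored value. A minimal sketch of that idea, using Python's sqlite3 module and a seeded generator in place of MySQL's rand():

```python
import random
import sqlite3

rng = random.Random(42)  # fixed seed so the run is repeatable

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a_rand (id INTEGER, rand_value REAL)")

# Materialise exactly one random value per row before any filtering.
conn.executemany(
    "INSERT INTO a_rand VALUES (?, ?)",
    [(i, rng.random()) for i in range(1000)],
)

# The filter now reads a stored value, so what is printed is what was tested.
sample = conn.execute(
    "SELECT id, rand_value FROM a_rand WHERE rand_value < 0.1"
).fetchall()

print(len(sample), "rows sampled")
```

In MySQL the analogous step would be writing the values into a temporary table before selecting from it.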

Query database in weekly interval

I have a database with a created_at column containing the datetime in Y-m-d H:i:s format.
The latest datetime entry is 2011-09-28 00:10:02.
I need the query to be relative to the latest datetime entry.
The first value in the query should be the latest datetime entry.
The second value in the query should be the entry closest to 7 days from the first value.
The third value should be the entry closest to 7 days from the second value.
REPEAT #3.
What I mean by "closest to 7 days from":
The following are dates, the interval I desire is a week, in seconds a week is 604800 seconds.
7 days from the first value is equal to 1316578202 (1317183002-604800)
the value closest to 1316578202 (7 days) is... 1316571974
unix timestamp | Y-m-d H:i:s
1317183002 | 2011-09-28 00:10:02 -> appear in query (first value)
1317101233 | 2011-09-27 01:27:13
1317009182 | 2011-09-25 23:53:02
1316916554 | 2011-09-24 22:09:14
1316836656 | 2011-09-23 23:57:36
1316745220 | 2011-09-22 22:33:40
1316659915 | 2011-09-21 22:51:55
1316571974 | 2011-09-20 22:26:14 -> closest to 7 days from 1317183002 (first value)
1316499187 | 2011-09-20 02:13:07
1316064243 | 2011-09-15 01:24:03
1315967707 | 2011-09-13 22:35:07 -> closest to 7 days from 1316571974 (second value)
1315881414 | 2011-09-12 22:36:54
1315794048 | 2011-09-11 22:20:48
1315715786 | 2011-09-11 00:36:26
1315622142 | 2011-09-09 22:35:42
I would really appreciate any help. I have not been able to do this in MySQL, and no online resources seem to deal with relative date manipulation like this. I would like the query to be flexible enough that the interval can be changed to weekly, monthly, or yearly. Thanks in advance!
Answer #1 Reply:
SELECT
UNIX_TIMESTAMP(created_at)
AS unix_timestamp,
(
SELECT MIN(UNIX_TIMESTAMP(created_at))
FROM my_table
WHERE created_at >=
(
SELECT max(created_at) - 7
FROM my_table
)
)
AS `random_1`,
(
SELECT MIN(UNIX_TIMESTAMP(created_at))
FROM my_table
WHERE created_at >=
(
SELECT MAX(created_at) - 14
FROM my_table
)
)
AS `random_2`
FROM my_table
WHERE created_at =
(
SELECT MAX(created_at)
FROM my_table
)
Returns:
unix_timestamp | random_1 | random_2
1317183002 | 1317183002 | 1317183002
Answer #2 Reply:
RESULT SET:
This is the result set for a yearly interval:
id | created_at | period_index | period_timestamp
267 | 2010-09-27 22:57:05 | 0 | 1317183002
1 | 2009-12-10 15:08:00 | 1 | 1285554786
I desire this result:
id | created_at | period_index | period_timestamp
626 | 2011-09-28 00:10:02 | 0 | 0
267 | 2010-09-27 22:57:05 | 1 | 1317183002
I hope this makes more sense.
It's not exactly what you asked for, but the following example is pretty close....
Example 1:
select
floor(timestampdiff(SECOND, tbl.time, most_recent.time)/604800) as period_index,
unix_timestamp(max(tbl.time)) as period_timestamp
from
tbl
, (select max(time) as time from tbl) most_recent
group by period_index
gives results:
+--------------+------------------+
| period_index | period_timestamp |
+--------------+------------------+
| 0 | 1317183002 |
| 1 | 1316571974 |
| 2 | 1315967707 |
+--------------+------------------+
This breaks the dataset into groups based on "periods", where (in this example) each period is 7-days (604800 seconds) long. The period_timestamp that is returned for each period is the 'latest' (most recent) timestamp that falls within that period.
The period boundaries are all computed based on the most recent timestamp in the database, rather than computing each period's start and end time individually based on the timestamp of the period before it. The difference is subtle - your question requests the latter (iterative approach), but I'm hoping that the former (approach I've described here) will suffice for your needs, since SQL doesn't lend itself well to implementing iterative algorithms.
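The bucketing described above can be checked against the timestamps from the question. This sketch uses Python's sqlite3 module with integer Unix timestamps (an assumption for runnability; the column name `ts` is hypothetical), so the period index is plain integer division rather than timestampdiff.

```python
import sqlite3

PERIOD_SECONDS = 604800  # 7 days

TIMESTAMPS = [
    1317183002, 1317101233, 1317009182, 1316916554, 1316836656,
    1316745220, 1316659915, 1316571974, 1316499187, 1316064243,
    1315967707, 1315881414, 1315794048, 1315715786, 1315622142,
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (ts INTEGER NOT NULL)")
conn.executemany("INSERT INTO tbl VALUES (?)", [(t,) for t in TIMESTAMPS])

# Each row's distance from the most recent timestamp, divided by the period
# length, gives its period index; MAX picks the newest row in each period.
periods = conn.execute(
    """
    SELECT (most_recent.ts - tbl.ts) / ? AS period_index,
           MAX(tbl.ts) AS period_timestamp
    FROM tbl, (SELECT MAX(ts) AS ts FROM tbl) AS most_recent
    GROUP BY period_index
    ORDER BY period_index
    """,
    (PERIOD_SECONDS,),
).fetchall()

for period_index, period_timestamp in periods:
    print(period_index, period_timestamp)
```

Changing PERIOD_SECONDS to 31536000 gives the yearly grouping used later in the answer.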
If you really do need to determine each period based on the timestamp in the previous period, then your best bet is going to be an iterative approach -- either using a programming language of your choice (like php), or by building a stored procedure that uses a cursor.
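For comparison, the iterative approach could look roughly like the following in Python. The function name `pick_by_interval` and the stopping rule (stop once the next target falls before the oldest entry) are assumptions made for this sketch; the question does not say when to stop.

```python
from typing import List


def pick_by_interval(timestamps: List[int], interval: int) -> List[int]:
    """Start at the latest timestamp, then repeatedly pick the entry
    closest to (previous pick - interval)."""
    if not timestamps:
        return []
    ordered = sorted(timestamps, reverse=True)
    oldest = ordered[-1]
    picked = [ordered[0]]
    while True:
        target = picked[-1] - interval
        if target < oldest:
            break  # assumed stopping rule: no data that far back
        older = [t for t in ordered if t < picked[-1]]
        if not older:
            break
        # Closest entry to the target, on either side of it.
        picked.append(min(older, key=lambda t: abs(t - target)))
    return picked


TIMESTAMPS = [
    1317183002, 1317101233, 1317009182, 1316916554, 1316836656,
    1316745220, 1316659915, 1316571974, 1316499187, 1316064243,
    1315967707, 1315881414, 1315794048, 1315715786, 1315622142,
]

weekly = pick_by_interval(TIMESTAMPS, 604800)
print(weekly)
```

Unlike the grouped query, each pick here depends on the previous pick rather than on fixed boundaries measured from the latest entry.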
Edit #1
Here's the table structure for the above example.
CREATE TABLE `tbl` (
`id` int(10) unsigned NOT NULL auto_increment PRIMARY KEY,
`time` datetime NOT NULL
)
Edit #2
Ok, first: I've improved the original example query (see revised "Example 1" above). It still works the same way, and gives the same results, but it's cleaner, more efficient, and easier to understand.
Now... the query above is a group-by query, meaning it shows aggregate results for the "period" groups as I described above, not row-by-row results like a "normal" query. With a group-by query, you're limited to using aggregate columns only. Aggregate columns are those columns that are named in the group by clause, or that are computed by an aggregate function like MAX(time). It is not possible to extract meaningful values for non-aggregate columns (like id) from within the projection of a group-by query.
Unfortunately, MySQL doesn't generate an error when you try to do this unless the ONLY_FULL_GROUP_BY SQL mode is enabled (it is on by default only from MySQL 5.7.5 onward). Instead, it picks an arbitrary value from within the grouped rows and shows that value for the non-aggregate column in the grouped result. This is what's causing the odd behavior the OP reported when trying to use the code from Example #1.
Fortunately, this problem is fairly easy to solve. Just wrap another query around the group query, to select the row-by-row information you're interested in...
Example 2:
SELECT
entries.id,
entries.time,
periods.idx as period_index,
unix_timestamp(periods.time) as period_timestamp
FROM
tbl entries
JOIN
(select
floor(timestampdiff( SECOND, tbl.time, most_recent.time)/31536000) as idx,
max(tbl.time) as time
from
tbl
, (select max(time) as time from tbl) most_recent
group by idx
) periods
ON entries.time = periods.time
Result:
+-----+---------------------+--------------+------------------+
| id | time | period_index | period_timestamp |
+-----+---------------------+--------------+------------------+
| 598 | 2011-09-28 04:10:02 | 0 | 1317183002 |
| 996 | 2010-09-27 22:57:05 | 1 | 1285628225 |
+-----+---------------------+--------------+------------------+
Notes:
Example 2 uses a period length of 31536000 seconds (365 days), while Example 1 (above) uses a period of 604800 seconds (7 days). Other than that, the inner query in Example 2 is the same as the primary query shown in Example 1.
If a matching period_time belongs to more than one entry (i.e. two or more entries have the exact same time, and that time matches one of the selected period_time values), then the above query (Example 2) will include multiple rows for the given period timestamp (one for each match). Whatever code consumes this result set should be prepared to handle such an edge case.
It's also worth noting that these queries will perform much, much better if you define an index on your datetime column. For my example schema, that would look like this:
ALTER TABLE tbl ADD INDEX idx_time ( time )
If you're willing to go for the closest that is after the week is out then this'll work. You can extend it to work out the closest but it'll look so disgusting it's probably not worth it.
select unix_timestamp
, ( select min(unix_tstamp)
from my_table
where sql_tstamp >= ( select max(sql_tstamp) - 7
from my_table )
)
, ( select min(unix_tstamp)
from my_table
where sql_tstamp >= ( select max(sql_tstamp) - 14
from my_table )
)
from my_table
where sql_tstamp = ( select max(sql_tstamp)
from my_table )
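A note on the date arithmetic: in MySQL, subtracting a bare integer from a DATETIME is numeric subtraction, not a shift by that many days, so a 7-day offset is normally written as `MAX(sql_tstamp) - INTERVAL 7 DAY`. The sketch below shows the intended "first entry on or after one week back" logic using Python's sqlite3 module, where the equivalent is the `datetime(..., '-7 days')` modifier (the use of SQLite and the reduced sample data are assumptions for runnability).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (sql_tstamp TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?)",
    [
        ("2011-09-28 00:10:02",),
        ("2011-09-27 01:27:13",),
        ("2011-09-21 22:51:55",),
        ("2011-09-20 22:26:14",),
        ("2011-09-13 22:35:07",),
    ],
)


def first_on_or_after(days_back: int) -> str:
    """Earliest entry that is not older than (latest entry - days_back days)."""
    modifier = f"-{days_back} days"
    return conn.execute(
        """
        SELECT MIN(sql_tstamp)
        FROM my_table
        WHERE sql_tstamp >= (
            SELECT datetime(MAX(sql_tstamp), ?) FROM my_table
        )
        """,
        (modifier,),
    ).fetchone()[0]


one_week = first_on_or_after(7)
two_weeks = first_on_or_after(14)
print(one_week, two_weeks)
```

This picks the first entry after the boundary, which is not necessarily the entry closest to it.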

To cast or not to cast?

I am developing a system using MySQL queries written by another programmer, and am adapting his code.
I have three questions:
1.
One of the queries has this select statement:
SELECT
[...]
AVG(mytable.foo, 1) AS 'myaverage',
Is the 1 in AVG(mytable.foo, 1) AS 'myaverage' legitimate? I can find no documentation to support its usage.
2.
The result of this gives me average values to 2 decimal places. Why?
3.
I am using this to create a temp table. So:
(SELECT
[...]
AVG(`mytable`.`foo`, 1) AS `myaverage`,
FROM
[...]
WHERE
[...]
GROUP BY
[...])
UNION
(SELECT
[...]
FROM
[...]
WHERE
[...]
GROUP BY
[...])
) AS `tmptable`
ORDER BY
`tmptable`.`myaverage` DESC
When I sort the table on this column I get output which indicates that this average is being stored as a string, so the result is like:
9.3
11.1
In order to get around this what should I use?
Should I be using CAST or CONVERT, as DECIMAL (which I read is basically binary), BINARY itself, or UNSIGNED?
Or, is there a way to state that myaverage should be an integer when I name it in the AS statement?
Something like:
SELECT
AVG(myaverage) AS `myaverage`, INT(10)
Thanks.
On your last question: can you post the exact MySQL query that you are using?
The result type of a column from a UNION is determined by everything you get back. See http://dev.mysql.com/doc/refman/5.0/en/union.html .
So, even if your AVG() function returns a DOUBLE, the other part of the UNION may still return a string. In which case the column type of the result will be a string.
See the following example:
mysql> select a from (select 19 as a union select '120') c order by a;
+-----+
| a |
+-----+
| 120 |
| 19 |
+-----+
2 rows in set (0.00 sec)
mysql> select a from (select 19 as a union select 120) c order by a;
+-----+
| a |
+-----+
| 19 |
| 120 |
+-----+
2 rows in set (0.00 sec)
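The effect on sorting, and the fix of casting back to a number, can be sketched as follows. This uses Python's sqlite3 module rather than MySQL (an assumption for runnability), with the values stored as text to mimic a UNION column that ended up as a string. In MySQL the cast would more likely be written as `CAST(myaverage AS DECIMAL(10,1))`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# TEXT column: mimics a UNION result column that was typed as a string.
conn.execute("CREATE TABLE tmptable (myaverage TEXT)")
conn.executemany("INSERT INTO tmptable VALUES (?)", [("9.3",), ("11.1",)])

# String comparison: '9...' sorts after '1...', so 9.3 comes out on top.
as_text = [
    r[0]
    for r in conn.execute(
        "SELECT myaverage FROM tmptable ORDER BY myaverage DESC"
    )
]

# Casting to a number in ORDER BY restores numeric ordering.
as_number = [
    r[0]
    for r in conn.execute(
        "SELECT myaverage FROM tmptable ORDER BY CAST(myaverage AS REAL) DESC"
    )
]

print(as_text)
print(as_number)
```

Making sure both halves of the UNION return a numeric type avoids the need for the cast in the first place.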
Just for anyone who's interested, I must have deleted or changed my predecessor's code, so this AVG question was incorrect. The correct code was ROUND(AVG(myaverage),1). Apologies to those who scratched their heads over my stupidity.
On 1:
AVG() accepts exactly one argument, otherwise MySQL will raise an error:
mysql> SELECT AVG( id, 1 ) FROM anytable;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ' 1 )' at line 1
http://dev.mysql.com/doc/refman/5.1/en/group-by-functions.html#function_avg
Just because I'm curious - what should the second argument do?
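As the asker's follow-up notes, the intended expression was ROUND(AVG(...), 1), where the 1 is ROUND's number of decimal places. A small sketch with Python's sqlite3 module (an assumption for runnability; SQLite also rejects a two-argument AVG, though with a different error message than MySQL's):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (foo REAL)")
conn.executemany("INSERT INTO mytable VALUES (?)", [(1.0,), (2.0,), (2.0,)])

# ROUND takes the precision; AVG takes only the column.
rounded = conn.execute("SELECT ROUND(AVG(foo), 1) FROM mytable").fetchone()[0]
print(rounded)

# Passing the precision to AVG itself is rejected.
try:
    conn.execute("SELECT AVG(foo, 1) FROM mytable").fetchone()
    two_arg_avg_rejected = False
except sqlite3.OperationalError:
    two_arg_avg_rejected = True
print(two_arg_avg_rejected)
```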