I am running a MariaDB database (10.1.39-MariaDB, mariadb.org binary distribution).
I have the following table:
| id | date | product_name | close |
|----|---------------------|--------------|-------|
| 1 | 2019-08-07 00:00:00 | Product 1 | 806 |
| 2 | 2019-08-06 00:00:00 | Product 1 | 982 |
| 3 | 2019-08-05 00:00:00 | Product 1 | 64 |
| 4 | 2019-08-07 00:00:00 | Product 2 | 874 |
| 5 | 2019-08-06 00:00:00 | Product 2 | 739 |
| 6 | 2019-08-05 00:00:00 | Product 2 | 555 |
| 7 | 2019-08-07 00:00:00 | Product 3 | 762 |
| 8 | 2019-08-06 00:00:00 | Product 3 | 955 |
| 9 | 2019-08-05 00:00:00 | Product 3 | 573 |
I want to get the following output:
| id | date | product_name | close | daily_return |
|----|---------------------|--------------|-------|--------------|
| 4 | 2019-08-07 00:00:00 | Product 2 | 874 | 0,182679296 |
| 1 | 2019-08-07 00:00:00 | Product 1 | 806 | -0,179226069 |
Basically, I want to get the top 2 products with the highest return, where the return for each product is calculated as (close_currentDay - close_previousDay) / close_previousDay.
I tried the following:
SELECT
*,
(
CLOSE -(
SELECT
(t2.close)
FROM
prices t2
WHERE
t2.date < t1.date
ORDER BY
t2.date
DESC
LIMIT 1
)
) /(
SELECT
(t2.close)
FROM
prices t2
WHERE
t2.date < t1.date
ORDER BY
t2.date
DESC
LIMIT 1
) AS daily_return
FROM
prices t1
WHERE DATE >= DATE(NOW()) - INTERVAL 1 DAY
Which gives me the return for each product_name.
How can I get only the latest row for each product_name and sort the result by the highest daily_return?
Problem Statement: Find the top 2 products with the highest returns on the latest date i.e. max date in the table.
Solution:
If you have an index on the date field, this will be very fast.
It scans the table only once and also uses a date filter (the index allows MySQL to process only the rows in the given date range).
A user-defined variable @old_close is used to compute the return. Note that this requires the data to be sorted by product and date.
SELECT *
FROM (
SELECT
prices.*,
CAST((`close` - @old_close) / @old_close AS DECIMAL(20, 10)) AS daily_return, -- Uses @old_close, which still holds the previous row's close; the next column updates it.
@old_close := `close` -- Store this row's close so the next row can use it
FROM prices
INNER JOIN (
SELECT
DATE(MAX(`date`)) - INTERVAL 1 DAY AS date_from, -- If the previous day with data may be further back, widen this to 2 or 3 days.
@old_close := 0 AS o_c
FROM prices
) AS t ON prices.date >= t.date_from
ORDER BY product_name, `date` ASC
) AS tt
ORDER BY `date` DESC, daily_return DESC
LIMIT 2;
Another version which doesn't depend on this date parameter.
SELECT *
FROM (
SELECT
prices.*,
CAST((`close` - @old_close) / @old_close AS DECIMAL(20, 10)) AS daily_return, -- Uses @old_close, which still holds the previous row's close; the next column updates it.
@old_close := `close` -- Store this row's close so the next row can use it
FROM prices,
(SELECT @old_close := 0 AS o_c) AS t
ORDER BY product_name, `date` ASC
) AS tt
ORDER BY `date` DESC, daily_return DESC
LIMIT 2
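The idea behind the variable-based queries above can be sketched outside SQL as a single ordered pass that carries the previous close along. This is only an illustration in Python using the sample rows from the question; the function name `latest_top_returns` is hypothetical, and unlike the SQL it explicitly resets the carried value when the product changes.

```python
from operator import itemgetter

# Sample rows from the question: (id, date, product_name, close)
prices = [
    (1, "2019-08-07", "Product 1", 806),
    (2, "2019-08-06", "Product 1", 982),
    (3, "2019-08-05", "Product 1", 64),
    (4, "2019-08-07", "Product 2", 874),
    (5, "2019-08-06", "Product 2", 739),
    (6, "2019-08-05", "Product 2", 555),
    (7, "2019-08-07", "Product 3", 762),
    (8, "2019-08-06", "Product 3", 955),
    (9, "2019-08-05", "Product 3", 573),
]


def latest_top_returns(rows, limit=2):
    """Hypothetical helper: one ordered pass, carrying the previous close."""
    # Same ordering as the SQL: product_name, then date ascending.
    ordered = sorted(rows, key=itemgetter(2, 1))
    old_close = None
    old_product = None
    results = []
    for row_id, date, product, close in ordered:
        # Only compare against the previous row of the same product.
        if product == old_product and old_close:
            daily_return = (close - old_close) / old_close
        else:
            daily_return = None
        results.append((row_id, date, product, close, daily_return))
        old_close, old_product = close, product
    # Keep only the latest date, then rank by return, highest first.
    latest = max(r[1] for r in results)
    ranked = sorted(
        (r for r in results if r[1] == latest and r[4] is not None),
        key=lambda r: r[4],
        reverse=True,
    )
    return ranked[:limit]


top = latest_top_returns(prices)
for row in top:
    print(row)
```

On the sample data this selects ids 4 and 1, in that order, which matches the output requested in the question.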
You can do it with a self join:
select
p.*,
cast((p.close - pp.close) / pp.close as decimal(20, 10)) as daily_return
from prices p left join prices pp
on p.product_name = pp.product_name
and pp.date = date_add(p.date, interval -1 day)
order by p.date desc, daily_return desc, p.product_name
limit 2
See the demo.
Results:
| id | date | product_name | close | daily_return |
| --- | ------------------- | ------------ | ----- | ------------ |
| 4 | 2019-08-07 00:00:00 | Product 2 | 874 | 0.182679296 |
| 1 | 2019-08-07 00:00:00 | Product 1 | 806 | -0.179226069 |
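The self join can be cross-checked locally with a small script. The sketch below uses Python's built-in `sqlite3` module as a stand-in for MariaDB, so it is an approximation: dates are stored as `YYYY-MM-DD` text, and SQLite's `date(x, '-1 day')` replaces `date_add(...)`. The table layout is reduced to the four columns from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prices (id INTEGER, date TEXT, product_name TEXT, close REAL)"
)
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?, ?)",
    [
        (1, "2019-08-07", "Product 1", 806),
        (2, "2019-08-06", "Product 1", 982),
        (3, "2019-08-05", "Product 1", 64),
        (4, "2019-08-07", "Product 2", 874),
        (5, "2019-08-06", "Product 2", 739),
        (6, "2019-08-05", "Product 2", 555),
        (7, "2019-08-07", "Product 3", 762),
        (8, "2019-08-06", "Product 3", 955),
        (9, "2019-08-05", "Product 3", 573),
    ],
)

# Join each row to the same product's row from the previous calendar day.
query = """
SELECT p.id, p.date, p.product_name, p.close,
       (p.close - pp.close) / pp.close AS daily_return
FROM prices p
LEFT JOIN prices pp
  ON p.product_name = pp.product_name
 AND pp.date = date(p.date, '-1 day')
ORDER BY p.date DESC, daily_return DESC, p.product_name
LIMIT 2
"""
top = conn.execute(query).fetchall()
for row in top:
    print(row)
```

Note that joining on exactly one day earlier assumes there is a row for every consecutive calendar day; with gaps (weekends, for example) the previous row would not be found.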
This is the table I am working with:
+---------------------+--------------+
| Field | Type |
+---------------------+--------------+
| ID | binary(17) |
| MiscSensor_ID | binary(17) |
| rawValue | varchar(100) |
| RawValueUnitType_ID | int |
| timestamp | timestamp |
+---------------------+--------------+
My goal is to implement an event that deletes all entries older than a month, BUT for each week I want to keep one entry per MiscSensor_ID (the one with the lowest rawValue).
This is what I have so far:
CREATE EVENT delete_old_miscsensordatahistory
ON SCHEDULE EVERY 1 DAY
STARTS CURRENT_DATE + INTERVAL 1 DAY
DO
DELETE
FROM history
WHERE TIMESTAMPDIFF(DAY, timestamp,NOW()) > 31;
I need to do something like "delete if (value > minvalue)", grouped by MiscSensor_ID and by 7-day periods, but I am stuck on how to do that.
Any help would be much appreciated.
You can try using the ROW_NUMBER window function to match the rows which you don't want to delete. Records having row number equal to 1 will be those rows with the minimum "rawValue" for each combination of (week, sensorId).
WITH cte AS (
SELECT *, ROW_NUMBER() OVER(
PARTITION BY MiscSensor_ID, WEEK(timestamp)
ORDER BY rawValue ) AS rn
FROM history
WHERE TIMESTAMPDIFF(DAY, timestamp,NOW()) > 31
)
DELETE history -- a DELETE with a JOIN must name the table to delete from
FROM history
INNER JOIN cte
ON history.ID = cte.ID
WHERE rn > 1;
This is how I have implemented the event for now:
CREATE EVENT delete_old_miscsensordatahistory
ON SCHEDULE EVERY 1 DAY
STARTS CURRENT_DATE + INTERVAL 1 DAY
DO
WITH cte AS (
SELECT *, ROW_NUMBER() OVER(
PARTITION BY MiscSensor_ID, WEEK(timestamp)
ORDER BY CAST(rawValue AS SIGNED) ) AS rn
FROM MiscSensorDataHistory
WHERE TIMESTAMPDIFF(DAY, timestamp,NOW()) > 31
)
DELETE MiscSensorDataHistory
FROM MiscSensorDataHistory
INNER JOIN cte
ON cte.ID = MiscSensorDataHistory.ID
WHERE rn > 1
Testing my method I found out that there are still entries with the same MiscSensor_ID and less than 7 days apart:
| 0x3939333133303037343939353436393032 | 0x3439303031303031303730303030303535 | 554 | 30 | 2022-02-17 23:09:21 |
| 0x3939333133303037343939313631333039 | 0x3439303031303031303730303030303535 | 554 | 30 | 2022-02-06 16:52:48 |
| 0x3939333133303037343938383835353239 | 0x3439303031303031303730303030303535 | 553 | 30 | 2022-01-30 08:21:55 |
| 0x3939333133303037343938383639333436 | 0x3439303031303031303730303030303535 | 554 | 30 | 2022-01-29 22:48:06 |
| 0x3939333133303037343937303734353537 | 0x3439303031303031303730303030303535 | 444 | 30 | 2021-12-26 06:12:07 |
| 0x3939333133303037343937303530363738 | 0x3439303031303031303730303030303535 | 446 | 30 | 2021-12-25 21:53:03 |
| 0x3939333133303037343936333034343238 | 0x3439303031303031303730303030303535 | 0 | 30 | 2021-12-14 13:08:04 |
| 0x3939333133303037343935393934303832 | 0x3439303031303031303730303030303535 | 415 | 30 | 2021-12-08 12:56:43 |
Any suggestions would be much appreciated.
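One possible explanation for the rows above is that WEEK(timestamp) groups by calendar week rather than by rolling 7-day periods, so two rows one day apart can land in different partitions and both be kept. The sketch below checks this for some of the timestamps listed. It assumes the server uses the default week mode 0 (weeks start on Sunday), which Python's `%U` format is taken to mirror here; the helper name `week_mode0` is hypothetical.

```python
from datetime import datetime


def week_mode0(ts: str) -> int:
    """Hypothetical stand-in for WEEK(ts) in mode 0.

    %U is the week of the year with Sunday as the first day of the week;
    days before the first Sunday of the year fall in week 0.
    """
    return int(datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").strftime("%U"))


# Pairs of surviving rows from the output above that are less than 7 days apart.
kept = [
    "2022-01-30 08:21:55",
    "2022-01-29 22:48:06",
    "2021-12-26 06:12:07",
    "2021-12-25 21:53:03",
    "2021-12-14 13:08:04",
    "2021-12-08 12:56:43",
]
weeks = {ts: week_mode0(ts) for ts in kept}
for ts, week in weeks.items():
    print(ts, "-> week", week)
```

Under that assumption, each pair straddles a Saturday/Sunday boundary, so each row is the minimum of its own calendar week. It may also be worth noting that WEEK() returns only the week number, so the same week number in different years would share a partition.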
I am running a MariaDB database (10.1.39-MariaDB, mariadb.org binary distribution).
I have the following table:
| id | date | api_endpoint | ticker | open | high | low | close | volume |
|------|---------------------|--------------|--------|-----------|-----------|-----------|-----------|-----------|
| 18 | 2019-08-07 00:00:00 | daily | AAPL | 195.41000 | 199.56000 | 193.82000 | 199.04000 | 33364400 |
| 19 | 2019-08-06 00:00:00 | daily | AAPL | 196.31000 | 198.07000 | 194.04000 | 197.00000 | 35824800 |
| 20 | 2019-08-05 00:00:00 | daily | AAPL | 197.99000 | 198.65000 | 192.58000 | 193.34000 | 52393000 |
| 21 | 2019-08-02 00:00:00 | daily | AAPL | 205.53000 | 206.43000 | 201.62470 | 204.02000 | 40862100 |
| 44 | 2019-08-01 00:00:00 | monthly | AAPL | 213.90000 | 218.03000 | 206.74000 | 208.43000 | 54017900 |
| 5273 | 1999-09-07 00:00:00 | monthly | AAPL | 73.75000 | 77.93800 | 73.50000 | 76.37500 | 246198400 |
I am calculating returns using mysql:
SELECT *
,(CLOSE - (SELECT (t2.close)
FROM prices t2
WHERE t2.date < t1.date
ORDER BY t2.date DESC
LIMIT 1 ) ) / (SELECT (t2.close)
FROM prices t2
WHERE t2.date < t1.date
ORDER BY t2.date DESC
LIMIT 1 ) AS daily_returns
FROM prices t1
The above query adds a column daily_returns to my table.
I would like to get the top 5 highest daily_returns. I tried to use ORDER BY, however, this does not work with a calculated column.
Any suggestions how to get the top 5 highest daily_returns?
Update: MySQL 8
SELECT
prices.*,
(prices.close - LAG(prices.close) OVER w) / LAG(prices.close) OVER w AS daily_return -- relative return; NULL for the first row
FROM prices
WHERE api_endpoint = 'daily'
WINDOW w AS (ORDER BY prices.`date` ASC)
ORDER BY daily_return DESC
LIMIT 5;
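A LAG-based return can be tried out without a MySQL 8 server. The sketch below uses Python's built-in `sqlite3` module as a stand-in, which assumes the bundled SQLite is version 3.25 or newer (needed for window functions). It keeps only the date, api_endpoint and close columns from the question's table and computes the return as the ratio (close - previous close) / previous close.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (date TEXT, api_endpoint TEXT, close REAL)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [
        ("2019-08-07", "daily", 199.04),
        ("2019-08-06", "daily", 197.00),
        ("2019-08-05", "daily", 193.34),
        ("2019-08-02", "daily", 204.02),
        ("2019-08-01", "monthly", 208.43),
    ],
)

# LAG(close) reads the close of the previous row in date order.
query = """
SELECT date, close,
       (close - LAG(close) OVER (ORDER BY date)) / LAG(close) OVER (ORDER BY date)
           AS daily_return
FROM prices
WHERE api_endpoint = 'daily'
ORDER BY daily_return DESC
LIMIT 5
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The earliest daily row has no predecessor, so its return is NULL; in SQLite that row sorts last under ORDER BY ... DESC.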
MySQL 5.7 & Lower
Use a MySQL user-defined variable to store the previous day's close value, then compare it with the current row's close value to do the calculation.
SELECT
*
FROM (
SELECT
prices.*,
(`close` - @old_close) / @old_close AS daily_return, -- Uses @old_close, which still holds the previous row's close; the next column updates it.
@old_close := `close` -- Store this row's close so the next row can use it
FROM prices,
(SELECT @old_close := 0 AS o_c) AS t -- Initialize @old_close to 0
WHERE api_endpoint = 'daily'
ORDER BY `date` ASC -- return is calculated based on last day close, so keep it sorted based on ascending order of date
) AS tt
ORDER BY daily_return DESC
LIMIT 5;
Reference: How to get diff between two consecutive rows
I have a table with 'ON' and 'OFF' values in column activity and another column datetime.
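| id (AUTOINCREMENT) | id_device | activity | datetime |
|----|-----------|----------|---------------------|
| 1 | a | ON | 2017-05-26 22:00:00 |
| 2 | b | ON | 2017-05-26 05:00:00 |
| 3 | a | OFF | 2017-05-27 04:00:00 |
| 4 | b | OFF | 2017-05-26 08:00:00 |
| 5 | a | ON | 2017-05-28 12:00:00 |
| 6 | a | OFF | 2017-05-28 15:00:00 |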
I need to get total ON time by day
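| day | id_device | total_minutes_on |
|------------|-----------|------------------|
| 2017-05-26 | a | 120 |
| 2017-05-26 | b | 180 |
| 2017-05-27 | a | 240 |
| 2017-05-27 | b | 0 |
| 2017-05-28 | a | 180 |
| 2017-05-28 | b | 0 |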
I have searched and tried answers from other posts. Using a time difference, I get the correct total time,
but I can't find a way to get the total time grouped by date.
I appreciate your help.
I'm not posting this as a definitive answer; it's more of an experiment for me, and hopefully you'll find it useful in your case. I should also mention that the MySQL version I'm working with is quite old, so the method I'm using is very manual, to say the least.
First of all, let's break down your expected output:
The date value in day needs to be repeated for each id_device, a and b.
Minutes are calculated based on the activity: if the activity is 'ON' past midnight, that day counts until the day ends at 24:00:00, and the next day counts from 00:00:00 until the activity is 'OFF'.
What I come up with is this:
Creating condition (1):
SELECT * FROM
(SELECT DATE(datetime) dtt FROM mytable GROUP BY DATE(datetime)) a,
(SELECT id_device FROM mytable GROUP BY id_device) b
ORDER BY dtt,id_device;
The query above will return the following result:
+------------+-----------+
| dtt | id_device |
+------------+-----------+
| 2017-05-26 | a |
| 2017-05-26 | b |
| 2017-05-27 | a |
| 2017-05-27 | b |
| 2017-05-28 | a |
| 2017-05-28 | b |
+------------+-----------+
*The query above only covers the dates that exist in the table. If you want every date regardless of whether there was activity, I suggest creating a calendar table (refer: Generating a series of dates).
This becomes the base query. I then added an outer query that left joins it with the original data table:
SELECT v.*,
GROUP_CONCAT(w.activity ORDER BY w.datetime SEPARATOR ' ') activity,
GROUP_CONCAT(TIME_TO_SEC(TIME(w.datetime)) ORDER BY w.datetime SEPARATOR ' ') tr
FROM
-- this was the first query
(SELECT * FROM
(SELECT DATE(datetime) dtt FROM mytable GROUP BY DATE(datetime)) a,
(SELECT id_device FROM mytable GROUP BY id_device) b
ORDER BY a.dtt,b.id_device) v
--
LEFT JOIN
mytable w
ON v.dtt=DATE(w.datetime) AND v.id_device=w.id_device
GROUP BY DATE(v.dtt),v.id_device
What's new in this query is the GROUP_CONCAT operation on both activity and the time value extracted from the datetime column, which is converted into seconds. Notice that both GROUP_CONCAT calls use the same ORDER BY condition; this is important so that the values in the two lists correspond to each other.
The query above will return the following result:
+------------+-----------+----------+-------------+
| dtt | id_device | activity | tr |
+------------+-----------+----------+-------------+
| 2017-05-26 | a | ON | 79200 |
| 2017-05-26 | b | ON OFF | 18000 28800 |
| 2017-05-27 | a | OFF | 14400 |
| 2017-05-27 | b | (NULL) | (NULL) |
| 2017-05-28 | a | ON OFF | 43200 54000 |
| 2017-05-28 | b | (NULL) | (NULL) |
+------------+-----------+----------+-------------+
From here, I added another outer query to calculate the number of minutes and produce the expected result:
SELECT dtt,id_device,
CASE
WHEN SUBSTRING_INDEX(activity,' ',1)='ON' AND SUBSTRING_INDEX(activity,' ',-1)='OFF'
THEN (SUBSTRING_INDEX(tr,' ',-1)-SUBSTRING_INDEX(tr,' ',1))/60
WHEN activity='ON' THEN 1440-(tr/60)
WHEN activity='OFF' THEN tr/60
WHEN activity IS NULL AND tr IS NULL THEN 0
END AS 'total_minutes_on'
FROM
-- from the last query
(SELECT v.*,
GROUP_CONCAT(w.activity ORDER BY w.datetime SEPARATOR ' ') activity,
GROUP_CONCAT(TIME_TO_SEC(TIME(w.datetime)) ORDER BY w.datetime SEPARATOR ' ') tr
FROM
-- this was the first query
(SELECT * FROM
(SELECT DATE(datetime) dtt FROM mytable GROUP BY DATE(datetime)) a,
(SELECT id_device FROM mytable GROUP BY id_device) b
ORDER BY a.dtt,b.id_device) v
--
LEFT JOIN
mytable w
ON v.dtt=DATE(w.datetime) AND v.id_device=w.id_device
GROUP BY DATE(v.dtt),v.id_device
--
) z
In this last step: if the activity value has both ON and OFF on the same day, then (OFF - ON) / 60 gives the total minutes. If the activity value is only ON, the device stayed on until '24:00:00', so the total is 24 h * 60 min = 1440 minus ON / 60. If the activity is only OFF, I just convert the seconds to minutes, because the day starts at 00:00:00 anyway.
+------------+-----------+------------------+
| dtt | id_device | total_minutes_on |
+------------+-----------+------------------+
| 2017-05-26 | a | 120 |
| 2017-05-26 | b | 180 |
| 2017-05-27 | a | 240 |
| 2017-05-27 | b | 0 |
| 2017-05-28 | a | 180 |
| 2017-05-28 | b | 0 |
+------------+-----------+------------------+
Hopefully this will give you some ideas. ;)
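For comparison, the same per-day totals can be computed outside SQL by pairing each ON with the next OFF for the same device and splitting the interval at midnight. This is only a sketch in Python on the sample rows from the question, not the method used in the queries above; the function name `minutes_on_by_day` is hypothetical, and it assumes every ON is eventually followed by an OFF.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Sample rows from the question: (id_device, activity, datetime)
events = [
    ("a", "ON", "2017-05-26 22:00:00"),
    ("b", "ON", "2017-05-26 05:00:00"),
    ("a", "OFF", "2017-05-27 04:00:00"),
    ("b", "OFF", "2017-05-26 08:00:00"),
    ("a", "ON", "2017-05-28 12:00:00"),
    ("a", "OFF", "2017-05-28 15:00:00"),
]


def minutes_on_by_day(rows):
    """Hypothetical helper: total ON minutes per (day, device)."""
    parsed = sorted(
        (dev, datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"), act)
        for dev, act, ts in rows
    )
    totals = defaultdict(int)
    on_since = {}
    for dev, ts, act in parsed:
        if act == "ON":
            on_since[dev] = ts
        elif act == "OFF" and dev in on_since:
            start = on_since.pop(dev)
            # Split the ON interval at each midnight it crosses.
            while start < ts:
                midnight = datetime.combine(
                    start.date() + timedelta(days=1), datetime.min.time()
                )
                end = min(ts, midnight)
                minutes = int((end - start).total_seconds() // 60)
                totals[(start.date().isoformat(), dev)] += minutes
                start = end
    # Report every (day, device) combination seen in the data, 0 if never ON.
    days = sorted({ts.date().isoformat() for _, ts, _ in parsed})
    devices = sorted({dev for dev, _, _ in parsed})
    return [(day, dev, totals.get((day, dev), 0)) for day in days for dev in devices]


report = minutes_on_by_day(events)
for line in report:
    print(line)
```

Like the SQL version, this only lists dates that appear in the data; a device that stays ON across a day with no events at all is still credited for that day in the totals, but the day would need to be added to the report separately.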
I have two tables, one that stores product information and one that stores reviews for the products.
I am trying to get the number of reviews submitted for the products between two dates, but for some reason I get the same results regardless of the dates I put in.
This is my query:
SELECT
productName,
COUNT(*) as `count`,
avg(rating) as `rating`
FROM `Reviews`
LEFT JOIN `Products` using(`productID`)
WHERE `date` BETWEEN '2015-07-20' AND '2015-07-30'
GROUP BY
`productName`
ORDER BY `count` DESC, `rating` DESC;
This returns:
+-------------+-------+-----------+
| productName | count | rating    |
+-------------+-------+-----------+
| productA    | 23    | 4.3333333 |
| productB    | 17    | 4.25      |
| productC    | 10    | 3.5       |
+-------------+-------+-----------+
Products table:
+-----------+-------------+
| productID | productName |
+-----------+-------------+
| 1         | productA    |
| 2         | productB    |
| 3         | productC    |
+-----------+-------------+
Reviews table
+---------+-----------+--------+---------------------+
|reviewID | productID | rating | date |
+---------+-----------+--------+---------------------+
| 1 | 1 | 4.5 | 2015-07-27 17:47:01|
| 2 | 1 | 3.5 | 2015-07-27 18:54:22|
| 3 | 3 | 2 | 2015-07-28 13:28:37|
| 4 | 1 | 5 | 2015-07-28 18:33:14|
| 5 | 2 | 1.5 | 2015-07-29 11:58:17|
| 6 | 2 | 3.5 | 2015-07-30 15:04:25|
| 7 | 2 | 2.5 | 2015-07-30 18:11:11|
| 8 | 1 | 3 | 2015-07-30 18:26:23|
| 9 | 1 | 3 | 2015-07-30 21:35:05|
| 10 | 1 | 4.5 | 2015-07-31 14:25:47|
| 11 | 3 | 0.5 | 2015-07-31 14:47:48|
+---------+-----------+--------+---------------------+
When I put in two random dates that I know for sure are not in the date column, I still get the same results. Even when I want to retrieve records for a single day only, I get the same results.
You should not use a LEFT JOIN here, because it keeps every row from the left-hand table even when there is no match in the other table. Use an inner join instead, something like:
select
productName,
count(*) as `count`,
avg(rating) as `rating`
from
products p,
reviews r
where
p.productID = r.productID
and `date` between '2015-07-20' and '2015-07-30'
group by productName
order by count desc, rating desc;
If, given your sample data, the result you're looking for is:
| productName | count | rating |
|-------------|-------|--------|
| productA | 5 | 4 |
| productB | 3 | 3 |
| productC | 1 | 2 |
(that is, the count and average of reviews made on any date between 2015-07-20 and 2015-07-30 inclusive),
then there are two issues with your query. First, you need to change the join to an inner join instead of a left join. More importantly, you need to change the date condition, because you are currently excluding reviews that fall on the last date of the range but after midnight.
This happens because your BETWEEN clause compares datetime values with date values, so the comparison ends up being date between '2015-07-20 00:00:00' and '2015-07-30 00:00:00', which excludes every review made later in the day on 2015-07-30.
The fix is to either change the date condition so that the end is a day later:
where date >= '2015-07-20' and date < '2015-07-31'
or cast the date column to a date value, which will remove the time part:
where date(date) between '2015-07-20' and '2015-07-30'
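The effect of the two date conditions can be seen on the sample Reviews rows. The sketch below uses Python's built-in `sqlite3` module as a stand-in for MySQL, so the mechanics differ slightly: SQLite compares the stored text values, whereas MySQL compares datetimes, but on these rows the outcome is the same. The helper name `count_reviews` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Reviews (reviewID INTEGER, productID INTEGER, rating REAL, date TEXT)"
)
conn.executemany(
    "INSERT INTO Reviews VALUES (?, ?, ?, ?)",
    [
        (1, 1, 4.5, "2015-07-27 17:47:01"),
        (2, 1, 3.5, "2015-07-27 18:54:22"),
        (3, 3, 2.0, "2015-07-28 13:28:37"),
        (4, 1, 5.0, "2015-07-28 18:33:14"),
        (5, 2, 1.5, "2015-07-29 11:58:17"),
        (6, 2, 3.5, "2015-07-30 15:04:25"),
        (7, 2, 2.5, "2015-07-30 18:11:11"),
        (8, 1, 3.0, "2015-07-30 18:26:23"),
        (9, 1, 3.0, "2015-07-30 21:35:05"),
        (10, 1, 4.5, "2015-07-31 14:25:47"),
        (11, 3, 0.5, "2015-07-31 14:47:48"),
    ],
)


def count_reviews(where_clause):
    """Hypothetical helper: count rows matching a fixed WHERE clause."""
    sql = "SELECT COUNT(*) FROM Reviews WHERE " + where_clause
    return conn.execute(sql).fetchone()[0]


# BETWEEN stops at midnight at the start of 2015-07-30.
between_count = count_reviews("date BETWEEN '2015-07-20' AND '2015-07-30'")
# The half-open range includes everything up to the end of 2015-07-30.
half_open_count = count_reviews("date >= '2015-07-20' AND date < '2015-07-31'")

print(between_count, half_open_count)
```

With BETWEEN, the four reviews made during 2015-07-30 are missing; the half-open range picks them up while still excluding the two from 2015-07-31.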
Sample SQL Fiddle
You are using a LEFT JOIN between your reviews and your products tables. This will result in all the rows of reviews being shown with some rows having all product columns left empty.
You should use INNER JOIN, as this will filter only the wanted results.
(In the end I can only guess, since I don't even know which column belongs to which table ...)
The full query (very similar to Angelo Giannis's solution):
select
productName,
count(*) as `count`,
avg(rating) as `rating`
from
products INNER JOIN reviews USING(productId)
where date between '2015-07-20' and '2015-07-30'
group by productName
order by count desc, rating desc;
Here is a fiddle with my solution and Angelo's (they both work).
I have a table like this:
| id | date | user_id |
|----|------------|---------|
| 1 | 2008-01-01 | 10 |
| 2 | 2009-03-20 | 15 |
| 3 | 2008-06-11 | 10 |
| 4 | 2009-01-21 | 15 |
| 5 | 2010-01-01 | 10 |
| 6 | 2011-06-01 | 10 |
| 7 | 2012-01-01 | 10 |
| 8 | 2008-05-01 | 15 |
I'm looking for a way to select every user_id where the difference between the MIN and MAX dates is more than 3 years. For the above data I should get:
| user_id |
|---------|
| 10 |
Can anyone help?
SELECT user_id
FROM mytable
GROUP BY user_id
HAVING MAX(`date`) > (MIN(`date`) + INTERVAL '3' YEAR);
Tested here: http://sqlize.com/MC0618Yg58
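The GROUP BY / HAVING approach can also be checked locally on the sample rows. The sketch below uses Python's built-in `sqlite3` module as a stand-in for MySQL, so the interval arithmetic is written with SQLite's `date(x, '+3 years')` instead of `+ INTERVAL '3' YEAR`; the table name `mytable` follows the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, date TEXT, user_id INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, "2008-01-01", 10),
        (2, "2009-03-20", 15),
        (3, "2008-06-11", 10),
        (4, "2009-01-21", 15),
        (5, "2010-01-01", 10),
        (6, "2011-06-01", 10),
        (7, "2012-01-01", 10),
        (8, "2008-05-01", 15),
    ],
)

# Keep users whose latest date is more than 3 years after their earliest date.
query = """
SELECT user_id
FROM mytable
GROUP BY user_id
HAVING MAX(date) > date(MIN(date), '+3 years')
"""
users = [row[0] for row in conn.execute(query)]
print(users)
```

On the sample data only user 10 qualifies: its dates span 2008-01-01 to 2012-01-01, while user 15 spans less than a year.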
Similar to bernie's approach, I'd keep date formats native. I'd also list the MAX first so as to avoid an ABS call (ensuring a positive number is always returned).
SELECT user_id
FROM mytable
GROUP BY user_id
HAVING DATEDIFF(MAX(`date`), MIN(`date`)) > 3 * 365 -- aggregates belong in HAVING, not WHERE; 3 * 365 days approximates 3 years
DATEDIFF just returns the delta (in days) between two given date fields.
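SELECT user_id
FROM (SELECT user_id, MIN(`date`) AS m0, MAX(`date`) AS m1
FROM mytable
GROUP BY user_id) AS t -- a derived table needs an alias
WHERE EXTRACT(YEAR FROM m1) - EXTRACT(YEAR FROM m0) > 3 -- compares calendar years only, so it is coarser than a day-level comparison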
SELECT DISTINCT A.user_id
FROM mytable AS A
JOIN mytable AS B
ON A.user_id = B.user_id
WHERE DATEDIFF(A.`date`, B.`date`) > 3 * 365 -- 3 * 365 days approximates 3 years