MySQL rank based on AVG - mysql

I have a table with these columns
-----------------------------------------------------------------------
|id | deviceId | totalMethaneInGrams | totalFeedInMinutes | date |
-----------------------------------------------------------------------
|1 |141 | 402 |305 |2020-10-13 |
|2 |141 | 410 |368 |2020-10-13 |
|3 |145 | 361 |300 |2020-10-13 |
-----------------------------------------------------------------------
Now I want to calculate the averages of totalMethaneInGrams and totalFeedInMinutes for a subset of devices, where date is earlier than a given day. I want to group the rows by deviceId, order them by avg(totalMethaneInGrams), and get the global rank of those devices based on avg(totalMethaneInGrams).
This is what I have so far:
SELECT
deviceId,
ROUND(avg(totalMethaneInGrams),2) as methane,
ROUND(avg(totalFeedInMinutes)) as feed
FROM sensor_data
WHERE
deviceId IN (141,123,145) AND date < '2020-10-14'
GROUP BY deviceId
ORDER BY methane
What I don't understand is how to calculate the global rank. My understanding is that we need to calculate the rank of all devices in the table, and then I can look up my devices in the returned global dataset. Can it be done in a single query?

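MySQL does not handle ranking over multiple columns well (e.g. feed within methane). A workaround is to rank each measure separately and join the results. Note that rank() is a window function, so this requires MySQL 8.0 or later.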
SELECT t.deviceId,
       ROUND(avg(totalMethaneInGrams),2) methane,
       rank() over (order by ROUND(avg(totalMethaneInGrams),2) desc) as rankmethane,
       max(feed) feed,
       max(rankfeed) rankfeed
FROM t
JOIN
    (SELECT
         deviceId,
         ROUND(avg(totalFeedInMinutes)) as feed,
         rank() over (order by ROUND(avg(totalFeedInMinutes),2) desc) as rankfeed
     FROM t
     WHERE deviceId IN (141,123,145) AND date < '2020-10-14'
     GROUP BY deviceId) s
ON s.deviceId = t.deviceId
WHERE t.deviceId IN (141,123,145) AND date < '2020-10-14'
GROUP BY t.deviceId ;
+----------+---------+-------------+------+----------+
| deviceId | methane | rankmethane | feed | rankfeed |
+----------+---------+-------------+------+----------+
| 141 | 406.00 | 1 | 337 | 2 |
| 145 | 361.00 | 2 | 400 | 1 |
+----------+---------+-------------+------+----------+
2 rows in set (0.002 sec)
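The query above ranks devices only within the filtered subset. If a global rank is wanted, one option is to rank all devices in a derived table first and filter afterwards. Below is a minimal sketch of that idea, assuming Python's built-in sqlite3 module as a stand-in for MySQL 8.0+ (the nested SELECT itself is plain SQL with a window function). The function name global_methane_ranks and device 150 are made up for illustration.

```python
import sqlite3

def global_methane_ranks(device_ids, before_date):
    """Rank every device by average methane, then keep only the requested ones."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sensor_data (id INTEGER, deviceId INTEGER, "
        "totalMethaneInGrams REAL, totalFeedInMinutes REAL, date TEXT)"
    )
    conn.executemany(
        "INSERT INTO sensor_data VALUES (?, ?, ?, ?, ?)",
        [
            (1, 141, 402, 305, "2020-10-13"),
            (2, 141, 410, 368, "2020-10-13"),
            (3, 145, 361, 300, "2020-10-13"),
            (4, 150, 500, 280, "2020-10-13"),  # made-up device outside the subset
        ],
    )
    placeholders = ", ".join("?" for _ in device_ids)
    sql = f"""
        SELECT deviceId, methane, feed, methane_rank
        FROM (
            -- The rank is computed over ALL devices, before the subset filter
            SELECT deviceId, methane, feed,
                   RANK() OVER (ORDER BY methane DESC) AS methane_rank
            FROM (
                SELECT deviceId,
                       ROUND(AVG(totalMethaneInGrams), 2) AS methane,
                       ROUND(AVG(totalFeedInMinutes), 2) AS feed
                FROM sensor_data
                WHERE date < ?
                GROUP BY deviceId
            ) AS averages
        ) AS ranked
        WHERE deviceId IN ({placeholders})
        ORDER BY methane_rank
    """
    rows = conn.execute(sql, [before_date, *device_ids]).fetchall()
    conn.close()
    return rows
```

Calling global_methane_ranks([141, 123, 145], "2020-10-14") places device 141 at rank 2 and device 145 at rank 3, because the made-up device 150 takes rank 1 even though it is not in the requested subset.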

Related

How to group by year from a concatenated column

I have a MySQL table like this, where id is the concatenation of the date in Ymd format (the moment the row was inserted) and an incremental id.
| id | weight |
| 20200128001 | 100 |
| 20200601002 | 250 |
| 20201208003 | 300 |
| 20210128001 | 150 |
| 20210601002 | 200 |
| 20211208003 | 350 |
To sum 'weight' for a single year I'm running:
SELECT sum(weight) as weight FROM `table` WHERE id LIKE '2020%';
which in this case results in:
650
How can I get a table of weights by year, instead of querying every possible year separately? In this case the result would be:
| date | weight |
| 2020 | 650 |
| 2021 | 700 |
Use one of MySQL's string functions, such as LEFT():
SELECT LEFT(id,4) as Year, SUM(weight) as Weight
FROM `table`
GROUP BY LEFT(id,4)
ORDER BY LEFT(id,4)
And if you want to limit the results to just those two years:
SELECT LEFT(id,4) as Year, SUM(weight) as Weight
FROM `table`
WHERE LEFT(id,4) IN ('2020', '2021')
GROUP BY LEFT(id,4)
ORDER BY LEFT(id,4)

Getting top 5 results based on difference from last day data

I am running a MariaDB 10.1.39 database (mariadb.org binary distribution).
I have the following table:
| id | date | api_endpoint | ticker | open | high | low | close | volume |
|------|---------------------|--------------|--------|-----------|-----------|-----------|-----------|-----------|
| 18 | 2019-08-07 00:00:00 | daily | AAPL | 195.41000 | 199.56000 | 193.82000 | 199.04000 | 33364400 |
| 19 | 2019-08-06 00:00:00 | daily | AAPL | 196.31000 | 198.07000 | 194.04000 | 197.00000 | 35824800 |
| 20 | 2019-08-05 00:00:00 | daily | AAPL | 197.99000 | 198.65000 | 192.58000 | 193.34000 | 52393000 |
| 21 | 2019-08-02 00:00:00 | daily | AAPL | 205.53000 | 206.43000 | 201.62470 | 204.02000 | 40862100 |
| 44 | 2019-08-01 00:00:00 | monthly | AAPL | 213.90000 | 218.03000 | 206.74000 | 208.43000 | 54017900 |
| 5273 | 1999-09-07 00:00:00 | monthly | AAPL | 73.75000 | 77.93800 | 73.50000 | 76.37500 | 246198400 |
I am calculating returns using mysql:
SELECT *
,(t1.close - (SELECT (t2.close)
FROM prices t2
WHERE t2.date < t1.date
ORDER BY t2.date DESC
LIMIT 1 ) ) / (SELECT (t2.close)
FROM prices t2
WHERE t2.date < t1.date
ORDER BY t2.date DESC
LIMIT 1 ) AS daily_returns
FROM prices t1
The above query adds a column daily_returns to my table.
I would like to get the top 5 highest daily_returns. I tried to use ORDER BY, however, this does not work with a calculated column.
Any suggestions how to get the top 5 highest daily_returns?
Update: MySQL 8 (uses window functions, which MariaDB 10.1 does not support)
SELECT
prices.*,
(prices.close - LAG(prices.close) OVER w) / LAG(prices.close) OVER w AS daily_return
FROM prices
WHERE api_endpoint = 'daily'
WINDOW w AS (ORDER BY prices.`date` ASC)
ORDER BY daily_return DESC
LIMIT 5;
MySQL 5.7 and lower
Use a MySQL user variable to store the previous day's close value, and compare it with the close value of the current row to do the calculation.
SELECT
*
FROM (
SELECT
prices.*,
(`close` - @old_close) / @old_close AS daily_return, -- Uses @old_close, which still holds the previous row's close; the next column sets it to the current close value.
@old_close:= `close` -- Set @old_close to the close value of this row, so it can be used in the next row
FROM prices,
(SELECT @old_close:= 0 as o_c) AS t -- Initialize @old_close to 0
WHERE api_endpoint = 'daily'
ORDER BY `date` ASC -- the return is calculated from the previous day's close, so keep rows sorted by ascending date
) AS tt
ORDER BY daily_return DESC
LIMIT 5;
Reference: How to get diff between two consecutive rows
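To check the window-function approach on the sample rows from the question, here is a small sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL 8. The function name top_daily_returns is made up for illustration. It divides by the previous close and drops the first day, which has no previous close and therefore a NULL return.

```python
import sqlite3

def top_daily_returns(limit=5):
    """Return (date, daily_return) pairs for the highest daily returns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE prices (id INTEGER, date TEXT, api_endpoint TEXT, "
        "ticker TEXT, close REAL)"
    )
    conn.executemany(
        "INSERT INTO prices VALUES (?, ?, ?, ?, ?)",
        [
            (18, "2019-08-07", "daily", "AAPL", 199.04),
            (19, "2019-08-06", "daily", "AAPL", 197.00),
            (20, "2019-08-05", "daily", "AAPL", 193.34),
            (21, "2019-08-02", "daily", "AAPL", 204.02),
            (44, "2019-08-01", "monthly", "AAPL", 208.43),  # excluded by the filter
        ],
    )
    sql = """
        SELECT date, daily_return
        FROM (
            SELECT date,
                   (close - LAG(close) OVER (ORDER BY date))
                       / LAG(close) OVER (ORDER BY date) AS daily_return
            FROM prices
            WHERE api_endpoint = 'daily'
        ) AS returns
        WHERE daily_return IS NOT NULL   -- the earliest day has no previous close
        ORDER BY daily_return DESC
        LIMIT ?
    """
    rows = conn.execute(sql, [limit]).fetchall()
    conn.close()
    return rows
```

On these four daily rows the best day is 2019-08-06 (about +1.9%), followed by 2019-08-07 and then 2019-08-05, which has a negative return.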

Calculate average, minimum, maximum interval between date

I am trying to do this with SQL. I have a transaction table that contains a transaction_date column. After grouping by date, I got this list:
| transaction_date |
| 2019-03-01 |
| 2019-03-04 |
| 2019-03-05 |
| ... |
From these 3 transaction dates, I want to get:
Average = ((4-1) + (5-4)) / 2 = 2 days (the DATEDIFF between each pair of consecutive dates)
Minimum = 1 day
Maximum = 3 days
Is there a good way to do this in SQL, before I resort to iterating over all the rows with WHILE?
Thanks in advance
If your MySQL version does not support the LAG or LEAD window functions, you can use a correlated subquery to fetch the next transaction_date for each row, then use DATEDIFF to compute the gap between the two dates.
Query 1:
SELECT avg(diffDt),min(diffDt),MAX(diffDt)
FROM (
SELECT DATEDIFF((SELECT transaction_date
FROM T tt
WHERE tt.transaction_date > t1.transaction_date
ORDER BY tt.transaction_date
LIMIT 1
),transaction_date) diffDt
FROM T t1
) t1
Results:
| avg(diffDt) | min(diffDt) | MAX(diffDt) |
|-------------|-------------|-------------|
| 2 | 1 | 3 |
If your MySQL version is 8.0 or later, you can use the LEAD window function instead of the correlated subquery.
Query #1
SELECT avg(diffDt),min(diffDt),MAX(diffDt)
FROM (
SELECT DATEDIFF(LEAD(transaction_date) OVER(ORDER BY transaction_date),transaction_date) diffDt
FROM T t1
) t1;
| avg(diffDt) | min(diffDt) | MAX(diffDt) |
| ----------- | ----------- | ----------- |
| 2 | 1 | 3 |
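The LEAD approach can be tried without a MySQL server. The sketch below assumes Python's built-in sqlite3 module as a stand-in; SQLite has no DATEDIFF, so the difference of two julianday() values is used instead. The function name gap_stats is made up for illustration.

```python
import sqlite3

def gap_stats():
    """Return (average, minimum, maximum) gap in days between consecutive dates."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE T (transaction_date TEXT)")
    conn.executemany(
        "INSERT INTO T VALUES (?)",
        [("2019-03-01",), ("2019-03-04",), ("2019-03-05",)],
    )
    sql = """
        SELECT AVG(diffDt), MIN(diffDt), MAX(diffDt)
        FROM (
            -- julianday() difference plays the role of MySQL's DATEDIFF here
            SELECT julianday(LEAD(transaction_date) OVER (ORDER BY transaction_date))
                   - julianday(transaction_date) AS diffDt
            FROM T
        ) AS gaps
    """
    row = conn.execute(sql).fetchone()
    conn.close()
    return row
```

The last date has no successor, so its gap is NULL, and the aggregate functions skip it.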

How can I retrieve all the columns on a timerange aggregation?

I am currently struggling to aggregate my daily data into other time ranges (weeks, months, quarters, etc.).
Here is what my raw data looks like:
| date | traffic_type | visits |
|----------|--------------|---------|
| 20180101 | 1 | 1221650 |
| 20180101 | 2 | 411424 |
| 20180101 | 4 | 108407 |
| 20180101 | 5 | 298117 |
| 20180101 | 6 | 26806 |
| 20180101 | 7 | 12033 |
| 20180101 | 8 | 80368 |
| 20180101 | 9 | 69544 |
| 20180101 | 10 | 39919 |
| 20180101 | 11 | 26291 |
| 20180102 | 1 | 1218490 |
| 20180102 | 2 | 410965 |
| 20180102 | 4 | 108037 |
| 20180102 | 5 | 297727 |
| 20180102 | 6 | 26719 |
| 20180102 | 7 | 12019 |
| 20180102 | 8 | 80074 |
First, I would like to check the sum of visits regardless of traffic_type:
SELECT date, SUM(visits) as visits_per_day
FROM visits_tbl
GROUP BY date
Here is the outcome:
| ymd | visits_per_day |
|:--------:|:--------------:|
| 20180101 | 2294563 |
| 20180102 | 2289145 |
| 20180103 | 2300367 |
| 20180104 | 2310256 |
| 20180105 | 2368098 |
| 20180106 | 2372257 |
| 20180107 | 2373863 |
| 20180108 | 2364236 |
However, when I try to find the specific day on which visits_per_day was highest within each time aggregation (e.g. month), I struggle to get the right output.
Here is what I did:
SELECT
(date div 100) as y_month, MAX(visits_per_day) as max_visit_per_day
FROM
(SELECT date, SUM(visits) as visits_per_day
FROM visits_tbl
GROUP BY date) as t1
GROUP BY
y_month
And here is the output of my query:
| y_month | max_visit_per_day |
|:-------:|:-----------------:|
| 201801 | 2435845 |
| 201802 | 2519000 |
| 201803 | 2528097 |
| 201804 | 2550645 |
However, this does not tell me the exact day on which visits_per_day was highest.
Desired output:
| y_month | max_visit_per_day | ymd |
|:-------:|:-----------------:|:--------:|
| 201801 | 2435845 | 20180130 |
| 201802 | 2519000 | 20180220 |
| 201803 | 2528097 | 20180325 |
| 201804 | 2550645 | 20180406 |
ymd would represent the day in which the visits_per_day was the highest.
This logic would be used in a dashboard with the help of programming in order to automatically select the time aggregation.
Can someone please help me?
This is a job for the structured part of structured query language. That is, you will write some subqueries and treat them as tables.
You already know how to find the number of visits per day. Let's add the month for each day to that query (http://sqlfiddle.com/#!9/a8455e/13/0).
SELECT date DIV 100 as month, date,
SUM(visits) as visits
FROM visits_tbl
GROUP BY date
Next you need to find the largest number of daily visits in each month. (http://sqlfiddle.com/#!9/a8455e/12/0)
SELECT month, MAX(visits) max_daily_visits
FROM (
SELECT date DIV 100 as month, date,
SUM(visits) as visits
FROM visits_tbl
GROUP BY date
) dayvisits
GROUP BY month
Then, the trick is retrieving the date on which that maximum occurred in each month. That requires a join. Without common table expressions (which MySQL versions before 8.0 lack) you need to repeat the first subquery. (http://sqlfiddle.com/#!9/a8455e/11/0)
SELECT detail.*
FROM (
SELECT month, MAX(visits) max_daily_visits
FROM (
SELECT date DIV 100 as month, date,
SUM(visits) as visits
FROM visits_tbl
GROUP BY date
) dayvisits
GROUP BY month
) maxvisits
JOIN (
SELECT date DIV 100 as month, date,
SUM(visits) as visits
FROM visits_tbl
GROUP BY date
) detail ON detail.visits = maxvisits.max_daily_visits
AND detail.month = maxvisits.month
The outline of this rather complex query helps explain it. Instead of that subquery, we'll use an imaginary table called dayvisits.
SELECT detail.*
FROM (
SELECT month, MAX(visits) max_daily_visits
FROM dayvisits
GROUP BY month
) maxvisits
JOIN dayvisits detail ON detail.visits = maxvisits.max_daily_visits
AND detail.month = maxvisits.month
You're seeking an extreme value for each month in the subquery. (This is a fairly standard sort of SQL operation.) To do that you find that value with a MAX() ... GROUP BY query. Then you join that to the subquery itself to find the other values corresponding to the extreme value.
If you do have common table expressions, the query looks like this. MySQL 8.0 and later supports them, as does the MySQL fork called MariaDB (10.2 and later).
WITH dayvisits AS (
SELECT date DIV 100 as month, date,
SUM(visits) as visits
FROM visits_tbl
GROUP BY date
)
SELECT dayvisits.*
FROM (
SELECT month, MAX(visits) max_daily_visits
FROM dayvisits
GROUP BY month
) maxvisits
JOIN dayvisits ON dayvisits.visits = maxvisits.max_daily_visits
AND dayvisits.month = maxvisits.month
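Here is a small runnable sketch of the same MAX() ... GROUP BY plus join-back pattern, assuming Python's built-in sqlite3 module as a stand-in for a MySQL or MariaDB version with CTE support. The visit counts are made up for illustration, as is the function name best_day_per_month; SQLite uses / for integer division where MySQL uses DIV.

```python
import sqlite3

def best_day_per_month():
    """Return (month, max_daily_visits, date) for the busiest day of each month."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE visits_tbl (date INTEGER, traffic_type INTEGER, visits INTEGER)"
    )
    conn.executemany(
        "INSERT INTO visits_tbl VALUES (?, ?, ?)",
        [
            # made-up numbers: two traffic types per day, two days per month
            (20180101, 1, 100), (20180101, 2, 50),
            (20180102, 1, 120), (20180102, 2, 60),
            (20180201, 1, 90), (20180201, 2, 10),
            (20180202, 1, 70), (20180202, 2, 20),
        ],
    )
    sql = """
        WITH dayvisits AS (
            SELECT date / 100 AS month, date, SUM(visits) AS visits
            FROM visits_tbl
            GROUP BY date
        )
        SELECT d.month, d.visits, d.date
        FROM dayvisits AS d
        JOIN (
            SELECT month, MAX(visits) AS max_daily_visits
            FROM dayvisits
            GROUP BY month
        ) AS maxvisits
          ON d.month = maxvisits.month
         AND d.visits = maxvisits.max_daily_visits
        ORDER BY d.month, d.date
    """
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows
```

If two days in the same month tie for the maximum, the join returns both of them, which may or may not be what a dashboard wants.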
The following query was checked on MS SQL Server; it is quick and efficient.
select visit_sum_day_wise.date
, visit_sum_day_wise.Max_Visits
, visit_sum_day_wise.traffic_type
, LAST_VALUE(visit_sum_day_wise.visits) OVER(
    PARTITION BY visit_sum_day_wise.date
    ORDER BY visit_sum_day_wise.date
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS max_visit_per_day
from (
select visits_tbl.date , visits_tbl.visits , visits_tbl.traffic_type
, max(visits_tbl.visits ) OVER (
    PARTITION BY visits_tbl.date
    ORDER BY visits_tbl.date
    ROWS BETWEEN UNBOUNDED PRECEDING AND 0 PRECEDING) Max_visits
from visits_tbl
) as visit_sum_day_wise
where visit_sum_day_wise.visits = (
    select max(visits_B.visits )
    from visits_tbl visits_B
    where visits_B.Date = visit_sum_day_wise.date )

Joining one table to the latest row in another table using MySQL

I want to join two tables in a special way. The first table is devices, which holds a list of devices.
The second table is datalog, which stores a bit of data every time a device in devices gets polled.
Devices Table:
+----------+------------+----------------------------+---------------------+
| deviceId | deviceName | deviceDescription | timeCreated |
+----------+------------+----------------------------+---------------------+
| 1 | System 1 | Main System in Server Room | 2010-01-01 00:00:00 |
| 2 | System 2 | Outdoor System | 2010-01-01 00:00:00 |
+----------+------------+----------------------------+---------------------+
DataLog Table:
+----+---------------------+----------+-----------+---------+
| id | time_stamp | DeviceId | FuelLevel | Voltage |
+----+---------------------+----------+-----------+---------+
| 1 | 2010-01-01 00:00:00 | 1 | 60 | 220 |
| 2 | 2010-01-01 00:00:00 | 2 | 20 | 221 |
| 3 | 2010-01-02 00:00:00 | 1 | 100 | 219 |
| 4 | 2010-01-02 00:00:00 | 2 | 100 | 222 |
| 5 | 2010-01-03 00:00:00 | 1 | 80 | 219 |
| 6 | 2010-01-03 00:00:00 | 2 | 99 | 220 |
+----+---------------------+----------+-----------+---------+
Currently I get the latest data for each device by querying the DataLog table with:
WHERE DeviceId = 1 ORDER BY time_stamp DESC LIMIT 1
What I would like is one query to return a list of all devices, with the columns joined with the latest data for each device like this:
+----------+------------+----------------------------+---------------------+-----------+---------+
| deviceId | deviceName | deviceDescription | time_stamp |FuelLevel | Voltage |
+----------+------------+----------------------------+---------------------+-----------+---------+
| 1 | System 1 | Main System in Server Room | 2010-01-03 00:00:00 | 80 | 219 |
| 2 | System 2 | Outdoor System | 2010-01-03 00:00:00 | 99 | 220 |
+----------+------------+----------------------------+---------------------+-----------+---------+
You can't apply "limit 1" at the outer level, because you would lose what you are looking for: the last entry of ALL devices. Use a pre-query to get the last ID of each device, then join back:
select
Devices.*,
DataLog.Time_Stamp,
DataLog.FuelLevel,
DataLog.Voltage
from
( select DeviceID,
max( ID ) LastActionID
from
DataLog
group by
1 ) LastInstance
join DataLog
on LastInstance.LastActionID = DataLog.ID
join Devices
on LastInstance.DeviceID = Devices.DeviceID
order by
Devices.DeviceName
Per your last comment, I would actually change to something like this:
Add a "LastLogID" column to your Devices table. Then, via an insert trigger on your DataLog table, update the Devices table immediately with the new ID. This way you never need to pre-query the data log; you already have the last ID and can join directly to the data log on that ID.
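As an alternative to max(ID), the latest row per device can also be picked by time_stamp with ROW_NUMBER(), which requires MySQL 8.0+ or another engine with window functions. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in. The function name latest_log_per_device and device 3 (which has no log rows) are made up for illustration. The LEFT JOIN keeps devices that have never been polled.

```python
import sqlite3

def latest_log_per_device():
    """Return every device joined with its most recent datalog row, if any."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE devices (deviceId INTEGER, deviceName TEXT)")
    conn.execute(
        "CREATE TABLE datalog (id INTEGER, time_stamp TEXT, DeviceId INTEGER, "
        "FuelLevel INTEGER, Voltage INTEGER)"
    )
    conn.executemany(
        "INSERT INTO devices VALUES (?, ?)",
        [(1, "System 1"), (2, "System 2"), (3, "System 3")],  # device 3 is made up
    )
    conn.executemany(
        "INSERT INTO datalog VALUES (?, ?, ?, ?, ?)",
        [
            (1, "2010-01-01 00:00:00", 1, 60, 220),
            (2, "2010-01-01 00:00:00", 2, 20, 221),
            (3, "2010-01-02 00:00:00", 1, 100, 219),
            (4, "2010-01-02 00:00:00", 2, 100, 222),
            (5, "2010-01-03 00:00:00", 1, 80, 219),
            (6, "2010-01-03 00:00:00", 2, 99, 220),
        ],
    )
    sql = """
        SELECT d.deviceId, d.deviceName, l.time_stamp, l.FuelLevel, l.Voltage
        FROM devices AS d
        LEFT JOIN (
            -- rn = 1 marks the newest row of each device; id breaks timestamp ties
            SELECT DeviceId, time_stamp, FuelLevel, Voltage,
                   ROW_NUMBER() OVER (PARTITION BY DeviceId
                                      ORDER BY time_stamp DESC, id DESC) AS rn
            FROM datalog
        ) AS l
          ON l.DeviceId = d.deviceId AND l.rn = 1
        ORDER BY d.deviceId
    """
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows
```

Devices 1 and 2 come back with their 2010-01-03 readings, and the made-up device 3 comes back with NULLs in the log columns.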
I know it's horrible, not elegant and time consuming, but this query works:
SELECT deviceId,deviceName,deviceDescription,
(SELECT time_stamp FROM datalog
WHERE datalog.DeviceId=devices.deviceId
ORDER BY time_stamp DESC LIMIT 0,1) time_stamp,
(SELECT FuelLevel FROM datalog
WHERE datalog.DeviceId=devices.deviceId
ORDER BY time_stamp DESC LIMIT 0,1) FuelLevel,
(SELECT Voltage FROM datalog
WHERE datalog.DeviceId=devices.deviceId
ORDER BY time_stamp DESC LIMIT 0,1) Voltage
FROM devices
I tried to use a single subquery retrieving multiple columns, but MySQL complains because a subquery in the select list may return only one column.
By the way, if you only want the single latest row, you can find it via the auto-increment field (datalog_table.id), sorted in descending order so that the newest row comes first. Note that this returns one row in total, not one row per device:
SELECT dvc.deviceId,dvc.deviceName,dvc.deviceDescription,
dtl.time_stamp,dtl.FuelLevel,dtl.Voltage
FROM device_table dvc
INNER JOIN datalog_table dtl
ON dtl.DeviceId=dvc.deviceId
ORDER BY dtl.id DESC LIMIT 1
SELECT
d.deviceId, d.deviceName, d.deviceDescription,
dl.time_stamp, dl.FuelLevel, dl.Voltage
FROM Devices d, DataLog dl
WHERE d.deviceId=dl.deviceID
ORDER BY time_stamp DESC
LIMIT 1