There are many devices, and while in use each one uploads data every few seconds or minutes.
I want to get the sections of date-time during which a device is in use.
Id date-time value
0 2021-07-08 14:46:46 1
1 2021-07-08 14:47:47 5
2 2021-07-08 14:48:48 2
3 2021-07-08 14:49:49 4
4 2021-07-08 15:30:01 7
5 2021-07-08 15:30:46 4
6 2021-07-08 15:30:46 4
7 2021-07-08 15:50:04 4
8 2021-07-08 15:50:05 6
Would it work to group the data by an interval?
Let us consider interval = 1 minute.
Then split the data wherever the difference between two consecutive date-times is more than 1 minute.
Then Id=0, 1, 2, 3 would be one group and Id=4, 5, 6, 7, 8 would be another group.
What I want is that each group contains nearby date-times.
If the difference between two records is more than 1 minute, they are in two groups; if not, they are in the same group.
That means that within a group, each time is within 1 minute of at least one of the other times.
If the time difference from the previous record is larger than the interval (1 or 10 minutes), the record belongs to a new group.
I am using MySQL.
You can use the LAG window function to obtain the previous date_time.
One way to calculate the time difference in seconds is to convert the timestamp to an integer with the UNIX_TIMESTAMP function.
Make a newgroup flag that equals one if and only if the difference from the previous record is larger than 60*10 seconds (10 minutes).
The cumulative sum of newgroup then becomes the section group ID.
with tmp AS (
SELECT
*,
coalesce(unix_timestamp(date_time) - unix_timestamp(lag(date_time) over (ORDER BY date_time)), 0) > 60*10 AS newgroup
FROM
tbl
)
,tmp2 AS (
SELECT
*,
sum(newgroup) over (ORDER BY date_time) AS groupid
FROM
tmp
)
SELECT * FROM tmp2
This query would return:
id date_time value newgroup groupid
0 2021-07-08 14:46:46 1 0 0
1 2021-07-08 14:47:47 5 0 0
2 2021-07-08 14:48:48 2 0 0
3 2021-07-08 14:49:49 4 0 0
4 2021-07-08 15:30:01 7 1 1
5 2021-07-08 15:30:46 4 0 1
6 2021-07-08 15:30:46 4 0 1
7 2021-07-08 15:50:04 4 1 2
8 2021-07-08 15:50:05 6 0 2
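If you need each usage section as a start/end range rather than labelled rows, a final aggregation over groupid would do it (a sketch reusing the same CTEs, assuming the table is named tbl as above):
with tmp AS (
SELECT
*,
coalesce(unix_timestamp(date_time) - unix_timestamp(lag(date_time) over (ORDER BY date_time)), 0) > 60*10 AS newgroup
FROM
tbl
)
,tmp2 AS (
SELECT
*,
sum(newgroup) over (ORDER BY date_time) AS groupid
FROM
tmp
)
SELECT
groupid,
min(date_time) AS section_start,
max(date_time) AS section_end,
count(*) AS uploads
FROM
tmp2
GROUP BY
groupid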
Hmmm . . . It sounds like you are looking for gaps to define groups of related rows, where the gaps are determined by the interval.
In pseudo-SQL, this might look like:
select min(date_time), max(date_time), count(*), avg(value)
from (select t.*,
sum(case when prev_date_time > date_time - interval 1 minute then 0 else 1 end) over (order by date_time) as grp
from (select t.*,
lag(date_time) over (order by date_time) as prev_date_time
from t
) t
) t
group by grp;
Table1:
id hour date tableValue1 tableValue2
1 3 2020-05-29 123 145
2 2 2020-05-29 1500 3400
Table2:
id hour date tableValue3 tableValue4
1 1 2020-05-29 4545 3697
2 3 2020-05-29 5698 2896
Table3:
id hour date tableValue5 tableValue6
1 2 2020-05-29 7841 5879
2 1 2020-05-29 1485 3987
I want to select multiple columns from different tables with one query.
Expected Output:
hour tableValue1 tableValue3 tableValue5
1 0 4545 1485
2 1500 0 7841
3 123 5698 0
I've tried this query without success:
SELECT hour , tableValue1 WHERE date = "2020-05-29" AND hour BETWEEN 0 AND 10 FROM table1
UNION ALL
SELECT hour , tableValue3 WHERE date = "2020-05-29" AND hour BETWEEN 0 AND 10 FROM table2
UNION ALL
SELECT hour , tableValue5 WHERE date = "2020-05-29" AND hour BETWEEN 10 AND 10 FROM table3
I'm getting instead the following:
hour tableValue1
3 123
2 1500
1 4545
3 5698
2 5879
1 3987
The columns the tables have in common are hour and date. Do I need to redesign the database structure to link the tables so that I can use a JOIN, and if so, how? Or is there a SQL command to select multiple columns from multiple tables?
There are a couple of issues in your code:
your WHERE clause should come after the FROM clause in your subqueries
you want different columns, but you select only one value column from each table: if you want three value columns, each of your subqueries should return three columns
your rows are not ordered because you are missing an ORDER BY clause at the end of your query
your rows are not aggregated to remove the excess zeroes: it is sufficient to apply the MAX aggregation function to each relevant field, grouping on the hour field
WITH cte AS (
SELECT hour,
tableValue1,
0 AS tableValue3,
0 AS tableValue5
FROM table1
WHERE date = "2020-05-29" AND hour BETWEEN 0 AND 10
UNION ALL
SELECT hour,
0 AS tableValue1,
tableValue3,
0 AS tableValue5
FROM table2
WHERE date = "2020-05-29" AND hour BETWEEN 0 AND 10
UNION ALL
SELECT hour,
0 AS tableValue1,
0 AS tableValue3,
tableValue5
FROM table3
WHERE date = "2020-05-29" AND hour BETWEEN 0 AND 10
)
SELECT hour,
MAX(tableValue1) AS tableValue1,
MAX(tableValue3) AS tableValue3,
MAX(tableValue5) AS tableValue5
FROM cte
GROUP BY hour
ORDER BY hour
You must introduce placeholder columns so that each query in the union returns the same three value columns:
SELECT hour , tableValue1, 0 tableValue3, 0 tableValue5 FROM table1 WHERE date = "2020-05-29" AND hour BETWEEN 0 AND 10
UNION ALL
SELECT hour , 0, tableValue3, 0 FROM table2 WHERE date = "2020-05-29" AND hour BETWEEN 0 AND 10
UNION ALL
SELECT hour , 0, 0, tableValue5 FROM table3 WHERE date = "2020-05-29" AND hour BETWEEN 0 AND 10
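To collapse the placeholder zeroes into a single row per hour, as in the expected output, you can wrap the union in a grouped query, much like the previous answer does (a sketch):
SELECT hour,
       MAX(tableValue1) AS tableValue1,
       MAX(tableValue3) AS tableValue3,
       MAX(tableValue5) AS tableValue5
FROM (
    SELECT hour , tableValue1, 0 tableValue3, 0 tableValue5 FROM table1 WHERE date = "2020-05-29" AND hour BETWEEN 0 AND 10
    UNION ALL
    SELECT hour , 0, tableValue3, 0 FROM table2 WHERE date = "2020-05-29" AND hour BETWEEN 0 AND 10
    UNION ALL
    SELECT hour , 0, 0, tableValue5 FROM table3 WHERE date = "2020-05-29" AND hour BETWEEN 0 AND 10
) u
GROUP BY hour
ORDER BY hour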
vehicle_assignment_history
id companyAccountId date totalVan totalBike
1 4 2021-11-11 00:00:00 2 0
2 4 2021-11-11 00:00:00 3 0
3 4 2021-11-11 00:00:00 1 0
4 8 2021-11-11 00:00:00 1 0
5 8 2021-11-12 00:00:00 2 0
6 9 2021-11-13 00:00:00 0 2
7 9 2021-11-14 00:00:00 0 1
I want to calculate the sum of the last row of each companyAccountId group, where the date is within a given range.
For example:
2021-11-11 -> 2021-11-13
totalVan: 1 + 2 + 0 = 3, totalBike: 0 + 0 + 2 = 2
2021-11-11 -> 2021-11-14
totalVan: 1 + 2 + 0 = 3, totalBike: 0 + 0 + 1 = 1
One way to do this is to take, for each companyAccountId, the MAX of a composite string that concatenates the zero-padded id with the field you want from the highest-id row, then extract that field from the end of the string and convert it back to a number (all in a subquery, so you can sum the resulting values):
select sum(latestTotalVan) as totalVan, sum(latestTotalBike) as totalBike
from (
select
cast(substring(max(concat(lpad(id,11,'0'),totalVan)) from 12) as unsigned) latestTotalVan,
cast(substring(max(concat(lpad(id,11,'0'),totalBike)) from 12) as unsigned) latestTotalBike
from vehicle_assignment_history
where date between '2021-11-11 00:00:00' and '2021-11-14 00:00:00'
group by companyAccountId
) latest_values
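The LPAD to a fixed width of 11 digits is what makes this trick work: with a constant-length prefix, string comparison orders the id part the same way numeric comparison would, so MAX picks the row with the highest id per companyAccountId, and SUBSTRING(... FROM 12) then strips those 11 characters to leave only the appended value.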
MySQL 8 adds window functions that make this kind of thing much easier: number the rows of each companyAccountId from newest to oldest, keep only the newest one, and sum those.
SELECT
    sum(totalVan) AS totalVan,
    sum(totalBike) AS totalBike
FROM (
    SELECT
        totalVan,
        totalBike,
        row_number() OVER (PARTITION BY companyAccountId ORDER BY id DESC) AS rn
    FROM vehicle_assignment_history
    WHERE date BETWEEN '2021-11-11' AND '2021-11-13'
) latest
WHERE rn = 1
Table name: run_detail
I have to calculate the average time of jobs over the last 7 days, but in some cases there are fewer than 7 runs; e.g. abc has only 2 run_dates.
(4.5+6+.....+7)/7 = 5.83 and (23.9+45.7)/2 = 34.8. I also need to calculate this based on the latest 7 runs, e.g. from 2020-07-04 to 2020-07-10, not from 2020-07-01.
Job_name run_date rownum count elapsed_time(sec) avg_time
xyz 2020-07-01 1 10 4.5 ?
xyz 2020-07-02 2 10 6 ?
.......
xyz 2020-07-10 10 10 7.0 ?
abc 2020-07-01 1 2 23.9 ?
abc 2020-07-02 2 2 45.7 ?
Desired Output
Job_name run_date rownum count elapsed_time(sec) avg_time
xyz 2020-07-01 1 10 4.5 5.83
xyz 2020-07-02 2 10 6 5.83
.......
xyz 2020-07-10 10 10 7.0 5.83
abc 2020-07-01 1 2 23.9 34.8
abc 2020-07-02 2 2 45.7 34.8
Could you please help me achieve this average time in MySQL?
If you want the average over the preceding 7 days, you can use a window function:
select t.*,
avg(elapsed_time) over (partition by job_name
order by run_date
range between interval 6 day preceding and current row
) as avg_time
from t;
Note: This assumes that you really want six preceding days plus the current date. If you really want 7 days before to 1 day before (the preceding week), then use:
range between interval 7 day preceding and interval 1 day preceding
EDIT:
In older versions of MySQL, you can use a correlated subquery:
select t.*,
(select avg(t2.elapsed_time)
from t t2
where t2.job_name = t.job_name and
t2.run_date <= t.run_date and
t2.run_date > t.run_date - interval 7 day
) as avg_time
from t;
Adjust the date comparison to get exactly the period you want.
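If you need the same average repeated on every row of a job and computed over that job's latest 7 runs (as in the desired output), a sketch for MySQL 8 could rank the runs per job and average only the newest seven; the table and column names (run_detail, job_name, run_date, elapsed_time) are taken from the question and may need adjusting:
select r.*, j.avg_time
from run_detail r
join (select job_name,
             avg(elapsed_time) as avg_time
      from (select job_name,
                   elapsed_time,
                   row_number() over (partition by job_name order by run_date desc) as rn
            from run_detail
           ) ranked
      where rn <= 7
      group by job_name
     ) j
  on j.job_name = r.job_name;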
This question is a follow-up to my other question.
A Python solution has already been built on an extract from the MySQL DB (5.6.34) where the original data are stored.
My question is: is it possible to do this calculation directly in MySQL?
Just to remind:
There is a 'runners' table with an accumulated distance per runner and reset flags:
runner startdate cum_distance reset_event
0 1 2017-04-01 100 1
1 1 2018-04-20 125 0
2 1 2018-05-25 130 1
3 2 2015-04-05 10 1
4 2 2015-10-20 20 1
5 2 2016-11-29 50 0
I would like to calculate the accumulated distance per runner since the last reset point (my comments in parentheses):
runner startdate cum_distance reset_event runner_dist_since_reset
0 1 2017-04-01 100 1 100 <-(no reset since begin)
1 1 2018-04-20 125 0 25 <-(125-100)
2 1 2018-05-25 130 1 30 <-(130-100)
3 2 2015-04-05 10 1 10 <-(no reset since begin)
4 2 2015-10-20 20 1 10 <-(20-10)
5 2 2016-11-29 50 0 30 <-(50-20)
So far I was able to calculate only differences between reset events:
SET @DistSinceReset=0;
SELECT
runner,
startdate,
reset_event,
IF(cum_distance - @DistSinceReset < 0, cum_distance, cum_distance - @DistSinceReset) AS 'runner_dist_since_reset',
@DistSinceReset := cum_distance AS 'cum_distance'
FROM
runners
WHERE
reset_event = 1
GROUP BY runner, startdate
This answer is for MySQL 8.
The information you want is the most recent cum_distance for each runner with reset_event = 1; in MySQL 8 you can get it with a window function.
Here is one method. Because cum_distance only ever grows, the MAX of cum_distance over the preceding reset rows is exactly the value at the most recent reset:
select r.*,
       cum_distance - coalesce(
           max(case when reset_event = 1 then cum_distance end) over
               (partition by runner
                order by startdate
                rows between unbounded preceding and 1 preceding
               ), 0) as runner_dist_since_reset
from runners r;
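Since the question mentions MySQL 5.6.34, which has no window functions, here is a correlated-subquery sketch of the same idea (again assuming cum_distance only ever grows per runner):
select r.*,
       r.cum_distance - coalesce(
           (select max(r2.cum_distance)
            from runners r2
            where r2.runner = r.runner and
                  r2.reset_event = 1 and
                  r2.startdate < r.startdate
           ), 0) as runner_dist_since_reset
from runners r;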
I have a table that contains:
id date user_id duration amount
1 2014-01-01 00:00:00 1 1 £10
2 2014-01-02 00:00:00 2 2 £10
3 2014-01-03 00:00:00 3 3 £10
I'm trying to display the amount per month. Any ideas how to do this in a query?
Working on the assumptions that you can extract the month from your datetime easily (so the real question is about the aggregation logic), and that you can create a numbers table.
Here is a simple example that shows the pattern.
CREATE TABLE Num (num int);
INSERT INTO Num VALUES (0),(1),(2),(3),(4);
CREATE TABLE Tbl (start int, run int);
INSERT INTO Tbl VALUES (1,2),(2,3);
SELECT start + num active_month
,count(*) * 10 income
FROM Tbl
INNER JOIN
Num ON num < run
GROUP BY start + num
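With this sample data, (start=1, run=2) expands to months 1 and 2 and (start=2, run=3) expands to months 2, 3 and 4, so the grouped result would be:
active_month income
1 10
2 20
3 10
4 10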
Like Karl, I'm pretty sure some kind of numbers table is necessary here. Personally I like the approach given here, which defines a view (well, several of them) to generate numbers, instead of having to actually store a table full of numbers. Whether you use a table or a view, when you SELECT from it, it just looks like this:
n
---
0
1
2
3
…
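The linked view definitions are not reproduced here, but as a minimal sketch, one way to build such a numbers source yourself (generator_4 is an assumed helper name, and this small version only yields n = 0..15, enough for durations up to 16 months) is:
CREATE OR REPLACE VIEW generator_4 AS
    SELECT 0 AS n UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3;
CREATE OR REPLACE VIEW generator_16 AS
    SELECT hi.n * 4 + lo.n AS n
    FROM generator_4 lo
    CROSS JOIN generator_4 hi;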
With that you can construct a query like this:
SELECT
purchases.purchase_id,
purchases.date_purchased,
purchases.duration,
-- generator_16 is our numbers table
generator_16.n,
-- Below we calculate the year and month (year_mon) in the following way:
-- (1) Get the first day of the year, e.g. if date_purchased is 2012-12-28,
-- this gives us 2012-01-01.
-- (2) Get the month number, e.g. 12 for 2012-12-28) and add that many months
-- to the first day of the year, which gives us the first day of the
-- month, 2012-12-01.
-- (3) Add "n" months, where "n" is the number we get from the numbers table,
-- starting at 0.
DATE_ADD( -- (3)
DATE_ADD( -- (2)
MAKEDATE( YEAR(purchases.date_purchased), 1 ), -- (1)
INTERVAL MONTH(purchases.date_purchased) - 1 MONTH -- (2)
),
INTERVAL generator_16.n MONTH -- (3)
) AS year_mon,
purchases.amount_income / purchases.duration AS amount
FROM purchases
-- The below JOIN means that if `purchases.duration` is 3, we get three rows
-- that have 0, 1, and 2 in the `n` column, which we use as the number of dates
-- to add in (3) above.
JOIN generator_16
ON generator_16.n BETWEEN 0 AND purchases.duration - 1
ORDER BY purchases.purchase_id, year_mon;
This gives us a result like this (SQL Fiddle):
purchase_id date_purchased duration n year_mon amount
----------- -------------- -------- - ------------ ------
1 2013-12-28 … 2 0 2013-12-01 … 7.5
1 2013-12-28 … 2 1 2014-01-01 … 7.5

2 2014-01-04 … 1 0 2014-01-01 … 10

3 2014-02-04 … 6 0 2014-02-01 … 6.6667
3 2014-02-04 … 6 1 2014-03-01 … 6.6667
3 2014-02-04 … 6 2 2014-04-01 … 6.6667
3 2014-02-04 … 6 3 2014-05-01 … 6.6667
3 2014-02-04 … 6 4 2014-06-01 … 6.6667
3 2014-02-04 … 6 5 2014-07-01 … 6.6667
I inserted blank lines to separate the purchase_id groups so you can see how n increases from 0 to duration - 1 within each group. As you can see, year_mon is equal to the first day of the date_purchased month plus n months, and amount is equal to amount_income / duration.
We're almost done, but as you can see year_mon has repetition: 2014-01-01 is shown twice. This is great news, because we can then use GROUP BY to group by that column and SUM(amount) to get the total for that month:
SELECT
DATE_ADD(
DATE_ADD(
MAKEDATE( YEAR(purchases.date_purchased), 1 ),
INTERVAL MONTH(purchases.date_purchased) - 1 MONTH
),
INTERVAL generator_16.n MONTH
) AS year_mon,
SUM(purchases.amount_income / purchases.duration) AS total
FROM purchases
JOIN generator_16
ON generator_16.n BETWEEN 0 AND purchases.duration - 1
GROUP BY year_mon
ORDER BY year_mon;
The only difference between this query and the previous one is that we GROUP BY year_mon and then SUM(amount_income / duration) to get the total for the month, yielding this result (SQL Fiddle):
year_mon total
------------ ------
2013-12-01 … 7.5
2014-01-01 … 17.5
2014-02-01 … 6.6667
2014-03-01 … 6.6667
2014-04-01 … 6.6667
2014-05-01 … 6.6667
2014-06-01 … 6.6667
2014-07-01 … 6.6667
...and of course you can use DATE_FORMAT and CAST or ROUND to get nicely-formatted dates and amounts, or you can do that in your front-end code.
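For example, here is a sketch of that last step applied to the grouped query above, formatting the month as YYYY-MM and rounding the total to two decimals:
SELECT
    DATE_FORMAT(
        DATE_ADD(
            DATE_ADD(
                MAKEDATE( YEAR(purchases.date_purchased), 1 ),
                INTERVAL MONTH(purchases.date_purchased) - 1 MONTH
            ),
            INTERVAL generator_16.n MONTH
        ),
        '%Y-%m'
    ) AS year_mon,
    ROUND(SUM(purchases.amount_income / purchases.duration), 2) AS total
FROM purchases
JOIN generator_16
    ON generator_16.n BETWEEN 0 AND purchases.duration - 1
GROUP BY year_mon
ORDER BY year_mon;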
What about:
SELECT a.my_date, a.income, IFNULL(SUM(DISTINCT(a.income)) + sum( b.income ), a.income) as roll_up
FROM (
SELECT purchase_id, DATE_FORMAT( date_purchased, '%y-%m') AS my_date, SUM( amount_income / duration ) AS "income"
FROM incomes
GROUP BY my_date
) AS a
LEFT OUTER JOIN (
SELECT purchase_id, DATE_FORMAT( date_purchased, '%y-%m') AS my_date, SUM(amount_income / duration ) AS "income"
FROM incomes
GROUP BY my_date
) AS b ON ( a.purchase_id > b.purchase_id )
GROUP BY a.purchase_id
It's a bit tricky to do that in one shot - and it could probably be improved - but it gives the following results:
my_date income roll_up
13-12 8.5000 8.5000
14-01 10.0000 18.5000
14-02 16.6667 35.1667
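Note that, unlike the numbers-table approach above, this query books each purchase's amount_income / duration entirely in its purchase month instead of spreading it over the following duration months; roll_up is then a running total of those monthly figures.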
My data set is:
id date user_id duration amount
1 2013-12-28 00:00:00 1 2 15
2 2014-01-04 00:00:00 2 1 10
3 2014-02-04 00:00:00 3 6 40
4 2013-12-29 00:00:00 4 1 1
5 2014-02-28 00:00:00 5 2 20