mysql query using aggregate function to select field values - mysql

MySQL 5.5.29
Here is a mysql query I am working on without success:
SELECT ID, Bike,
(SELECT IF( MIN( ABS( DATEDIFF( '2011-1-1', Reading_Date ) ) ) = ABS( DATEDIFF( '2011-1-1', Reading_Date ) ) , Reading_Date, NULL ) FROM odometer WHERE Bike=10 ) AS StartDate,
(SELECT IF( MIN( ABS( DATEDIFF( '2011-1-1', Reading_Date ) ) ) = ABS( DATEDIFF( '2011-1-1', Reading_Date ) ) , Miles, NULL ) FROM odometer WHERE Bike=10 ) AS BeginMiles,
(SELECT IF( MIN( ABS( DATEDIFF( '2012-1-1', Reading_Date ) ) ) = ABS( DATEDIFF( '2012-1-1', Reading_Date ) ) , Reading_Date, NULL ) FROM odometer WHERE Bike=10 ) AS EndDate,
(SELECT IF( MIN( ABS( DATEDIFF( '2012-1-1', Reading_Date ) ) ) = ABS( DATEDIFF( '2012-1-1', Reading_Date ) ) , Miles, NULL ) FROM odometer WHERE Bike=10 ) AS EndMiles
FROM `odometer`
WHERE Bike =10;
And the result is:
ID  Bike  StartDate   BeginMiles  EndDate  EndMiles
14  10    2011-04-15  27.0        NULL     NULL
15  10    2011-04-15  27.0        NULL     NULL
16  10    2011-04-15  27.0        NULL     NULL
Motorcycle owners enter odometer readings once a year, at or near January 1. I want to calculate the total mileage per motorcycle.
Here is what the data in the table odometer looks like:
(image of the table data not available; source: bmwmcindy.org)
So to calculate the mileage for this bike for 2011, I need to determine which of these records is closest to Jan. 1, 2011, and that is record 14. The starting mileage would be 27. I need to find the record closest to Jan. 1, 2012, and that is record 15. The ending mileage for 2011 is 10657 (which will also be the starting odometer reading when 2012 is calculated).
Here is the table:
DROP TABLE IF EXISTS `odometer`;
CREATE TABLE IF NOT EXISTS `odometer` (
`ID` int(3) NOT NULL AUTO_INCREMENT,
`Bike` int(3) NOT NULL,
`is_MOA` tinyint(1) NOT NULL,
`Reading_Date` date NOT NULL,
`Miles` decimal(8,1) NOT NULL,
PRIMARY KEY (`ID`),
KEY `Bike` (`Bike`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=22 ;
data for table odometer
INSERT INTO `odometer` (`ID`, `Bike`, `is_MOA`, `Reading_Date`, `Miles`) VALUES
(1, 1, 0, '2012-01-01', 5999.0),
(2, 6, 0, '2013-02-01', 14000.0),
(3, 7, 0, '2013-03-01', 53000.2),
(6, 1, 1, '2012-04-30', 10001.0),
(7, 1, 0, '2013-01-04', 31000.0),
(14, 10, 0, '2011-04-15', 27.0),
(15, 10, 0, '2011-12-31', 10657.0),
(16, 10, 0, '2012-12-31', 20731.0),
(19, 1, 1, '2012-09-30', 20000.0),
(20, 6, 0, '2011-12-31', 7000.0),
(21, 7, 0, '2012-01-03', 23000.0);
I am trying to get dates and miles from different records so that I can subtract the beginning miles from the ending miles to get the total miles for a particular bike (in the example, Bike=10) for a particular year (in this case, 2011).
I have read quite a bit about aggregate functions and the problem of getting values from the correct record. I thought the answer was somehow in subqueries, but when I try the query above I get data from only the first record. In this case the ending miles should come from the second record.
I hope someone can point me in the right direction.

Miles should be steadily increasing. It would be nice if something like this worked:
select year(Reading_Date) as yr,
max(miles) - min(miles) as MilesInYear
from odometer o
where bike = 10
group by year(reading_date)
Alas, your logic is really much harder than you think. This would be easier in a database such as SQL Server 2012 or Oracle that has the lead and lag functions.
My approach is to find the first and last reading dates for each year. You can calculate this using a correlated subquery:
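select o.*,
       (select max(o2.Reading_Date)
        from odometer o2
        where o2.Bike = o.Bike
          -- latest reading on or before January 1 of this row's year
          and o2.Reading_Date <= date_sub(o.Reading_Date, interval dayofyear(o.Reading_Date) - 1 day)
       ) as ReadDateForYear
from odometer o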
Next, summarize this at the bike and year levels. If there is no read date on or before the beginning of the year, use the first date in that year:
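select Bike, year(Reading_Date) as yr,
       coalesce(min(ReadDateForYear), min(Reading_Date)) as FirstReadDate,
       coalesce(min(ReadDateForNextYear), max(Reading_Date)) as LastReadDate
from (select o.*,
             -- latest reading on or before January 1 of this row's year
             (select max(o2.Reading_Date)
              from odometer o2
              where o2.Bike = o.Bike
                and o2.Reading_Date <= date_sub(o.Reading_Date, interval dayofyear(o.Reading_Date) - 1 day)
             ) as ReadDateForYear,
             -- latest reading on or before January 1 of the following year
             (select max(o2.Reading_Date)
              from odometer o2
              where o2.Bike = o.Bike
                and o2.Reading_Date <= date_add(date_sub(o.Reading_Date, interval dayofyear(o.Reading_Date) - 1 day), interval 1 year)
             ) as ReadDateForNextYear
      from odometer o
     ) o
group by Bike, year(Reading_Date)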
Let me call this <q>. To get the final results, you need something like:
select <the fields you need>
from <q> q join
     odometer s   -- reading at the start of the year
     on s.Bike = q.Bike and s.Reading_Date = q.FirstReadDate join
     odometer e   -- reading at the end of the year
     on e.Bike = q.Bike and e.Reading_Date = q.LastReadDate
Note: this SQL is untested. I'm sure there are syntax errors.
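For the single bike and year in the question, a shorter route may be to pick the reading closest to each January 1 with ORDER BY ... LIMIT 1 instead of an aggregate. This is only a sketch against the odometer table and sample rows from the question. It is a different technique from the summary query above, and it assumes that a tie in distance can be broken by taking the earlier reading.

```sql
-- Miles ridden by bike 10 in 2011: the reading closest to 2012-01-01
-- minus the reading closest to 2011-01-01.
SELECT
  (SELECT Miles
     FROM odometer
    WHERE Bike = 10
    ORDER BY ABS(DATEDIFF(Reading_Date, '2012-01-01')), Reading_Date
    LIMIT 1)
  -
  (SELECT Miles
     FROM odometer
    WHERE Bike = 10
    ORDER BY ABS(DATEDIFF(Reading_Date, '2011-01-01')), Reading_Date
    LIMIT 1) AS MilesIn2011;
```

With the sample rows, the first subquery picks record 15 (10657.0) and the second picks record 14 (27.0), so the result is 10630.0.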

Related

GROUP BY function ret

start                end                  category
2022:10:14 17:13:00  2022:10:14 17:19:00  A
2022:10:01 16:29:00  2022:10:01 16:49:00  B
2022:10:19 18:55:00  2022:10:19 19:03:00  A
2022:10:31 07:52:00  2022:10:31 07:58:00  A
2022:10:13 18:41:00  2022:10:13 19:26:00  B
The table is sample data about trips. The goal is to calculate the time consumed for each category, e.g. category A = 02:18:02.
First, I changed the timestamp format in the CSV file to YYYY/MM/DD HH:MM:SS to match MySQL, and removed the headers.
Then I created a table in MySQL Workbench with the following code:
CREATE TABLE trip (
start TIMESTAMP,
end TIMESTAMP,
category VARCHAR(6)
);
Then, to calculate the time consumed, I wrote:
SELECT category, SUM(TIMEDIFF(end, start)) as length
FROM trip
GROUP BY CATEGORY;
The result was plain numbers: A = 34900 and B = 38000.
So I added CONVERT(..., TIME) as follows:
SELECT category, Convert(SUM(TIMEDIFF(end, start)), Time) as length
FROM trip
GROUP BY category;
The result was great for category A (03:49:00), but unfortunately category B came back as NULL instead of 03:08:00.
What have I done wrong, and what different approach should I have taken?
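One likely explanation, sketched on a throwaway table: SUM() treats each TIME value as a plain number (00:45:00 becomes 4500), so the totals are not durations, and a total such as 6500 has no valid reading as HH:MM:SS. The table name trip_demo is made up for this illustration, and the rows are only the five sample rows shown in the question.

```sql
CREATE TEMPORARY TABLE trip_demo (
  `start`  DATETIME,
  `end`    DATETIME,
  category VARCHAR(6)
);

INSERT INTO trip_demo VALUES
  ('2022-10-14 17:13:00', '2022-10-14 17:19:00', 'A'),  -- 6 minutes
  ('2022-10-01 16:29:00', '2022-10-01 16:49:00', 'B'),  -- 20 minutes
  ('2022-10-19 18:55:00', '2022-10-19 19:03:00', 'A'),  -- 8 minutes
  ('2022-10-31 07:52:00', '2022-10-31 07:58:00', 'A'),  -- 6 minutes
  ('2022-10-13 18:41:00', '2022-10-13 19:26:00', 'B');  -- 45 minutes

-- as_number:    A = 600 + 800 + 600 = 2000, B = 2000 + 4500 = 6500
-- real_seconds: A = 1200 (20 minutes),      B = 3900 (65 minutes)
SELECT category,
       SUM(TIMEDIFF(`end`, `start`))              AS as_number,
       SUM(TIMESTAMPDIFF(SECOND, `start`, `end`)) AS real_seconds
FROM trip_demo
GROUP BY category;
```

Read as HHMMSS, 2000 happens to be right for A (00:20:00), but 6500 would mean 65 minutes past the hour, which is not a valid TIME. The same reading fits the numbers in the question: 34900 converts to 03:49:00, while 38000 would be 03:80:00, which would explain the NULL for B.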
You can do it as follows.
This approach also gets around MySQL's TIME value limit of 838:59:59:
SELECT category,
CONCAT(FLOOR(SUM(TIMESTAMPDIFF(SECOND, `start`, `end`))/3600),":",FLOOR((SUM(TIMESTAMPDIFF(SECOND, `start`, `end`))%3600)/60),":",(SUM(TIMESTAMPDIFF(SECOND, `start`, `end`))%3600)%60) as `length`
FROM trip
GROUP BY category;
This is to get time like 00:20:00 instead of 0:20:0
SELECT category,
CONCAT(
if(FLOOR(SUM(TIMESTAMPDIFF(SECOND, `start`, `end`))/3600) >= 10, FLOOR(SUM(TIMESTAMPDIFF(SECOND, `start`, `end`))/3600), CONCAT('0',FLOOR(SUM(TIMESTAMPDIFF(SECOND, `start`, `end`))/3600)) ) ,
":",
if(FLOOR((SUM(TIMESTAMPDIFF(SECOND, `start`, `end`))%3600)/60) >= 10, FLOOR((SUM(TIMESTAMPDIFF(SECOND, `start`, `end`))%3600)/60), CONCAT('0', FLOOR((SUM(TIMESTAMPDIFF(SECOND, `start`, `end`))%3600)/60) ) ),
":",
if( (SUM(TIMESTAMPDIFF(SECOND, `start`, `end`) )%3600)%60 >= 10, (SUM(TIMESTAMPDIFF(SECOND, `start`, `end`) )%3600)%60, concat('0', (SUM(TIMESTAMPDIFF(SECOND, `start`, `end`) )%3600)%60))
) as `length`
FROM trip
GROUP BY category;
You'd calculate the length for each separate trip in seconds, get sum of the lengths per category then convert seconds to time:
SELECT category, SEC_TO_TIME(SUM(TIMESTAMPDIFF(SECOND, `start`, `end`))) as `length`
FROM trip
GROUP BY category;
If SUM() exceeds the limit of the TIME datatype (838:59:59), that maximum value is returned instead.
For values which exceed the TIME limit, use:
SELECT category,
CONCAT_WS(':',
secs DIV (60 * 60),
LPAD(secs DIV 60 MOD 60, 2, 0),
LPAD(secs MOD 60, 2, 0)) AS `length`
FROM (
SELECT category, SUM(TIMESTAMPDIFF(SECOND, `start`, `end`)) AS secs
FROM trip
GROUP BY category
) subquery
;
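A quick check of the building blocks used above, relying only on built-in functions: TIMESTAMPDIFF() returns its third argument minus its second, and SEC_TO_TIME() clips at the TIME maximum while the manual formatting does not. The value 3600000 seconds (1000 hours) is just an illustrative number.

```sql
SELECT
  -- later value goes last: 17:19:00 minus 17:13:00 is 360 seconds
  TIMESTAMPDIFF(SECOND, '2022-10-14 17:13:00', '2022-10-14 17:19:00') AS secs,
  -- clipped to the TIME maximum, 838:59:59
  SEC_TO_TIME(3600000) AS clipped,
  -- manual formatting keeps the full value, 1000:00:00
  CONCAT_WS(':',
            3600000 DIV (60 * 60),
            LPAD(3600000 DIV 60 MOD 60, 2, 0),
            LPAD(3600000 MOD 60, 2, 0)) AS unclipped;
```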

count without invalid use of group function mysql

I have a table like this,
CREATE TABLE order_match
(`order_buyer_id` int, `createdby` int, `createdAt` datetime, `quantity` decimal(10,2))
;
INSERT INTO order_match
(`order_buyer_id`, `createdby`, `createdAt`, `quantity`)
VALUES
(19123, 19, '2017-02-02', 5),
(193241, 19, '2017-02-03', 5),
(123123, 20, '2017-02-03', 1),
(32242, 20, '2017-02-04', 4),
(32434, 20, '2017-02-04', 5),
(2132131, 12, '2017-02-02', 6)
;
here's the fiddle
In this table, order_buyer_id is the id of the transaction, createdby is the buyer, createdAt is the time of each transaction, and quantity is the quantity of the transaction.
I want to find the maximum, minimum, median and average number of orders per repeat buyer (a buyer with more than one transaction).
So for this table, the expected result is:
+-----+-----+---------+--------+
| MAX | MIN | Average | Median |
+-----+-----+---------+--------+
| 3 | 2 | 2.5 | 3 |
+-----+-----+---------+--------+
Note: I'm using MySQL 5.7.
I am using this query:
select -- om.createdby, om.quantity, x1.count_
MAX(count(om.createdby)) AS max,
MIN(count(om.createdby)) AS min,
AVG(count(om.createdby)) AS average
from (select count(xx.count_) as count_
from (select count(createdby) as count_ from order_match
group by createdby
having count(createdby) > 1) xx
) x1,
(select createdby
from order_match
group by createdby
having count(createdby) > 1) yy,
order_match om
where yy.createdby = om.createdby
and om.createdAt <= '2017-02-04'
and EXISTS (select 1 from order_match om2
where om.createdby = om2.createdby
and om2.createdAt >= '2017-02-02'
and om2.createdAt <= '2017-02-04')
but it says:
Invalid use of group function
We can try aggregating by createdby, and then taking the aggregates you want:
SELECT
MAX(cnt) AS MAX,
MIN(cnt) AS MIN,
AVG(cnt) AS Average
FROM
(
SELECT createdby, COUNT(*) AS cnt
FROM order_match
GROUP BY createdby
HAVING COUNT(*) > 1  -- repeat buyers only
) t
To simulate the median in MySQL 5.7 is a lot of work, and ugly. If you have a long term need for median, consider upgrading to MySQL 8+.
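If upgrading is an option, the median could be sketched in MySQL 8+ with window functions along these lines. This is an illustration under assumptions, not a tested solution: the CTE names counts and ranked are made up here, and the query averages the two middle values when the number of repeat buyers is even.

```sql
WITH counts AS (
  -- number of orders per repeat buyer
  SELECT createdby, COUNT(*) AS cnt
  FROM order_match
  GROUP BY createdby
  HAVING COUNT(*) > 1
),
ranked AS (
  -- position of each count in sorted order, plus the total number of rows
  SELECT cnt,
         ROW_NUMBER() OVER (ORDER BY cnt) AS rn,
         COUNT(*)     OVER ()             AS n
  FROM counts
)
SELECT AVG(cnt) AS Median
FROM ranked
-- one middle row when n is odd, the two middle rows when n is even
WHERE rn IN (FLOOR((n + 1) / 2), CEIL((n + 1) / 2));
```

On the sample data the repeat buyers have 2 and 3 orders, so this averages them to 2.5. To get the 3 shown in the expected output, keep only the row where rn = FLOOR(n / 2) + 1 instead.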

alternative for outer apply

These are my tables:
create table #vehicles (vehicle_id int, sVehicleName varchar(50))
create table #location_history ( vehicle_id int, location varchar(50), date datetime)
insert into #vehicles values
(1, 'MH 14 aa 1111'),
(2,'MH 12 bb 2222'),
(3,'MH 13 cc 3333'),
(4,'MH 42 dd 4444')
insert into #location_history values
( 1, 'aaa', getdate()),
( 1, 'bbb' , getdate()),
( 2, 'ccc', getdate()),
( 2, 'ddd', getdate()),
(3, 'eee', getdate()),
( 3, 'fff', getdate()),
( 4, 'ggg', getdate()),
( 4 ,'hhh', getdate())
This is the query which I execute in SQL Server:
select v.sVehicleName as VehicleNo, ll.Location
from #vehicles v outer APPLY
(select top 1 Location from #location_history where vehicle_id = v.vehicle_id
) ll
This is the output in SQL Server:
VehicleNo  | Location
MH14aa1111 | aaa
MH12bb2222 | ccc
MH13cc3333 | eee
MH42dd4444 | ggg
I want to execute this in MySQL and get the same output as above.
First, the SQL Server query doesn't actually make sense, because you are using top without an order by.
Presumably, you intend something like this:
select v.sVehicleName as VehicleNo, ll.Location
from #vehicles v outer APPLY
(select top 1 Location
from #location_history
where vehicle_id = v.vehicle_id
order by ?? -- something to indicate ordering
) ll;
You need a method to get the latest record for each vehicle. Under normal circumstances, I think date would contain this information -- however, this is not true in your sample data.
Assuming that date really does contain unique values, then you can do:
select v.sVehicleName as VehicleNo, lh.location
from vehicles v join
location_history lh
using (vehicle_id)
where lh.date = (select max(lh2.date)
from location_history lh2
where lh2.vehicle_id = lh.vehicle_id
);
Otherwise, you can do what you want using a correlated subquery. However, this will return an arbitrary matching value on the most recent date:
select v.sVehicleName as VehicleNo,
(select lh2.location
from location_history lh2
where lh2.vehicle_id = v.vehicle_id
order by date desc
limit 1
) as location
from vehicles v ;
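In MySQL 8.0.14 and later, a lateral derived table is probably the closest counterpart to OUTER APPLY. A sketch, assuming the tables are named vehicles and location_history as in the queries above and that date is the column that defines "latest":

```sql
SELECT v.sVehicleName AS VehicleNo, ll.location
FROM vehicles v
LEFT JOIN LATERAL (
    -- evaluated once per vehicle row, like the body of OUTER APPLY
    SELECT lh.location
    FROM location_history lh
    WHERE lh.vehicle_id = v.vehicle_id
    ORDER BY lh.`date` DESC
    LIMIT 1
) AS ll ON TRUE;
```

The LEFT JOIN ... ON TRUE keeps vehicles that have no history rows (with a NULL location), which is what OUTER APPLY does; a plain inner join would drop them.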

Sum Multiple Row Date Difference Mysql

Table car_log
Speed LogDate
5 2013-04-30 10:10:09 ->row1
6 2013-04-30 10:12:15 ->row2
4 2013-04-30 10:13:44 ->row3
17 2013-04-30 10:15:32 ->row4
22 2013-04-30 10:18:19 ->row5
3 2013-04-30 10:22:33 ->row6
4 2013-04-30 10:24:14 ->row7
15 2013-04-30 10:26:59 ->row8
2 2013-04-30 10:29:19 ->row9
I want to know how long the car's speed was under 10.
My idea is to take the LogDate difference between row1 and row4 (because at, say, 10:14:44, between row3 and row4, the speed is still 4) and add the LogDate difference between row6 and row8. I am not sure whether that is right.
How can I calculate this in a MySQL query? Thank you.
For every row, find the first row with a later LogDate. If the speed in the current row is less than 10, take the difference between the next row's date and this row's date; otherwise use 0.
A query that would give a list of the values counted this way should look like:
SELECT ( SELECT IF( c1.speed <10, unix_timestamp( c2.LogDate ) - unix_timestamp( c1.logdate ) , 0 )
FROM car_log c2
WHERE c2.LogDate > c1.LogDate
ORDER BY c2.LogDate -- without this, LIMIT 1 may not return the next log row
LIMIT 1
) AS seconds_below_10
FROM car_log c1
Now it's just a matter of summing it up:
SELECT sum( seconds_below_10) FROM
( SELECT ( SELECT IF( c1.speed <10, unix_timestamp( c2.LogDate ) - unix_timestamp( c1.logdate ) , 0 )
FROM car_log c2
WHERE c2.LogDate > c1.LogDate
ORDER BY c2.LogDate -- without this, LIMIT 1 may not return the next log row
LIMIT 1
) AS seconds_below_10
FROM car_log c1 ) seconds_between_logs
Update after comment about adding CarId:
When you have more than 1 car you need to add one more WHERE condition inside dependent subquery (we want next log for that exact car, not just any next log) and group whole rowset by CarId, possibly adding said CarId to the select to show it too.
SELECT sbl.carId, sum( sbl.seconds_below_10 ) as `seconds_with_speed_less_than_10` FROM
( SELECT c1.carId,
( SELECT IF( c1.speed <10, unix_timestamp( c2.LogDate ) - unix_timestamp( c1.logdate ) , 0 )
FROM car_log c2
WHERE c2.LogDate > c1.LogDate AND c2.carId = c1.carId
ORDER BY c2.LogDate -- without this, LIMIT 1 may not return the next log row
LIMIT 1 ) AS seconds_below_10
FROM car_log c1 ) sbl
GROUP BY sbl.carId
See an example at Sqlfiddle.
If the type of column 'LogDate' is a MySQL DATETIME type, you can use the timestampdiff() function in your select statement to get the difference between timestamps. The timestampdiff function is documented in the manual at:
http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html#function_timestampdiff
You need to break the query down into subqueries, and then use the TIMESTAMPDIFF function.
The function takes three arguments: the unit you want the result in (e.g. SECOND, MINUTE, DAY), then the earlier value, and last the later value. The result is the third argument minus the second.
To get the maximum value for LogDate where speed is less than 10 use:
select MAX(LogDate) from <yourtable> where Speed<10
To get the minimum value for LogDate where speed is less than 10 use:
select MIN(LogDate) from <yourtable> where Speed<10
Now, combine these into a single query with the TIMESTAMPDIFF function:
select TIMESTAMPDIFF(SECOND, (select MIN(LogDate) from <yourtable> where Speed<10), (select MAX(LogDate) from <yourtable> where Speed<10));
If LogDate is of a different type, there are other Date/Time Diff functions to handle math between any of these types. You will just need to change 'TIMESTAMPDIFF' to the correct function for your column type.
Additional ref: http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html
Try this SQL (SQL Server syntax):
;with data as (
select *
from ( values
( 5, convert(datetime,'2013-04-30 10:10:09') ),
( 6, convert(datetime,'2013-04-30 10:12:15') ),
( 4, convert(datetime,'2013-04-30 10:13:44') ),
(17, convert(datetime,'2013-04-30 10:15:32') ),
(22, convert(datetime,'2013-04-30 10:18:19') ),
( 3, convert(datetime,'2013-04-30 10:22:33') ),
( 4, convert(datetime,'2013-04-30 10:24:14') ),
(15, convert(datetime,'2013-04-30 10:26:59') ),
( 2, convert(datetime,'2013-04-30 10:29:19') )
) data(speed,logDate)
)
, durations as (
select
duration = case when speed<=10
then datediff(ss, logDate, endDate)
else 0
end
from (
select
t1.speed, t1.logDate, endDate = (
select top 1 logDate
from data
where data.logDate > t1.logDate
order by data.logDate
)
from data t1
) T
where endDate is not null
)
select TotalDuration = sum(duration)
from durations
which calculates 589 seconds from the sample data provided.
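For MySQL 8.0 or later, the same pairing of each row with the next one can be sketched with the LEAD() window function instead of a correlated subquery. The table name car_log_demo is made up for this illustration; the rows are the sample rows from the question, and "under 10" is taken literally as Speed < 10.

```sql
CREATE TEMPORARY TABLE car_log_demo (Speed INT, LogDate DATETIME);

INSERT INTO car_log_demo VALUES
  ( 5, '2013-04-30 10:10:09'),
  ( 6, '2013-04-30 10:12:15'),
  ( 4, '2013-04-30 10:13:44'),
  (17, '2013-04-30 10:15:32'),
  (22, '2013-04-30 10:18:19'),
  ( 3, '2013-04-30 10:22:33'),
  ( 4, '2013-04-30 10:24:14'),
  (15, '2013-04-30 10:26:59'),
  ( 2, '2013-04-30 10:29:19');

-- Each row contributes the seconds until the next log entry if its speed is
-- below 10. The last row has no next entry, so it yields NULL, which SUM() skips.
SELECT SUM(IF(Speed < 10, TIMESTAMPDIFF(SECOND, LogDate, NextLogDate), 0)) AS seconds_below_10
FROM (
  SELECT Speed, LogDate,
         LEAD(LogDate) OVER (ORDER BY LogDate) AS NextLogDate
  FROM car_log_demo
) AS t;
```

On these rows the contributing intervals are 126 + 89 + 108 + 101 + 165 seconds, which again gives 589.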

Sum amount of overlapping datetime ranges in MySQL

I have a question that is almost the same as Sum amount of overlapping datetime ranges in MySQL, so I'm reusing part of that question's text; I hope that is OK.
I have a table of events, each with a StartTime and EndTime (as type DateTime) in a MySQL Table.
I'm trying to output the sum of overlapping times for each type of event and the number of events that overlapped.
What is the most efficient / simple way to perform this query in MySQL?
CREATE TABLE IF NOT EXISTS `events` (
`EventID` int(10) unsigned NOT NULL auto_increment,
`EventType` int(10) unsigned NOT NULL,
`StartTime` datetime NOT NULL,
`EndTime` datetime default NULL,
PRIMARY KEY (`EventID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=37 ;
INSERT INTO `events` (`EventID`, EventType,`StartTime`, `EndTime`) VALUES
(10001,1, '2009-02-09 03:00:00', '2009-02-09 10:00:00'),
(10002,1, '2009-02-09 05:00:00', '2009-02-09 09:00:00'),
(10003,1, '2009-02-09 07:00:00', '2009-02-09 09:00:00'),
(10004,3, '2009-02-09 11:00:00', '2009-02-09 13:00:00'),
(10005,3, '2009-02-09 12:00:00', '2009-02-09 14:00:00');
# if the query was run using the data above,
# the table below would be the desired output
# Number of Overlapped Events , The event type, | Total Amount of Time those events overlapped.
1,1, 03:00:00
2,1, 02:00:00
3,1, 02:00:00
1,3, 01:00:00
There is a really beautiful solution given there by Mark Byers and I'm wondering if that one can be extended to include "Event Type".
His solution without event type was:
SELECT `COUNT`, SEC_TO_TIME(SUM(Duration))
FROM (
SELECT
COUNT(*) AS `Count`,
UNIX_TIMESTAMP(Times2.Time) - UNIX_TIMESTAMP(Times1.Time) AS Duration
FROM (
SELECT @rownum1 := @rownum1 + 1 AS rownum, `Time`
FROM (
SELECT DISTINCT(StartTime) AS `Time` FROM events
UNION
SELECT DISTINCT(EndTime) AS `Time` FROM events
) AS AllTimes, (SELECT @rownum1 := 0) AS Rownum
ORDER BY `Time` DESC
) As Times1
JOIN (
SELECT @rownum2 := @rownum2 + 1 AS rownum, `Time`
FROM (
SELECT DISTINCT(StartTime) AS `Time` FROM events
UNION
SELECT DISTINCT(EndTime) AS `Time` FROM events
) AS AllTimes, (SELECT @rownum2 := 0) AS Rownum
ORDER BY `Time` DESC
) As Times2
ON Times1.rownum = Times2.rownum + 1
JOIN events ON Times1.Time >= events.StartTime AND Times2.Time <= events.EndTime
GROUP BY Times1.rownum
) Totals
GROUP BY `Count`
SELECT
COUNT(*) as occurrence
, sub.EventID
, SEC_TO_TIME(SUM(LEAST(e1end, e2end) - GREATEST(e1start, e2start))) as duration
FROM
( SELECT
e1.EventID
, UNIX_TIMESTAMP(e1.StartTime) as e1start
, UNIX_TIMESTAMP(e1.EndTime) as e1end
, UNIX_TIMESTAMP(e2.StartTime) as e2start
, UNIX_TIMESTAMP(e2.EndTime) as e2end
FROM events e1
INNER JOIN events e2
ON (e1.EventType = e2.EventType AND e1.EventID <> e2.EventID
AND NOT(e1.StartTime > e2.EndTime OR e1.EndTime < e2.StartTime))
) sub
GROUP BY sub.EventID
ORDER BY occurrence DESC
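For MySQL 8.0 or later, the per-type breakdown the question asks for could also be sketched with CTEs and LEAD(): cut each event type's timeline at every start and end time, count the events covering each piece, then sum the piece lengths per count and type. The CTE names points, segments and covered are made up for this illustration, and it runs against the events table from the question; this is a different formulation from either query above, not an extension of them.

```sql
WITH points AS (
  -- every distinct boundary time, per event type
  SELECT EventType, StartTime AS t FROM events
  UNION
  SELECT EventType, EndTime FROM events
),
segments AS (
  -- consecutive boundaries form the pieces of each type's timeline
  SELECT EventType,
         t AS seg_start,
         LEAD(t) OVER (PARTITION BY EventType ORDER BY t) AS seg_end
  FROM points
),
covered AS (
  -- how many events of the same type span each piece
  SELECT s.EventType, s.seg_start, s.seg_end, COUNT(*) AS overlapped
  FROM segments s
  JOIN events e
    ON e.EventType = s.EventType
   AND e.StartTime <= s.seg_start
   AND e.EndTime   >= s.seg_end
  GROUP BY s.EventType, s.seg_start, s.seg_end
)
SELECT overlapped, EventType,
       SEC_TO_TIME(SUM(TIMESTAMPDIFF(SECOND, seg_start, seg_end))) AS total
FROM covered
GROUP BY overlapped, EventType
ORDER BY EventType, overlapped;
```

Worked by hand on the sample rows, type 1 splits into 03-05 (1 event), 05-07 (2), 07-09 (3) and 09-10 (1), giving 03:00:00, 02:00:00 and 02:00:00 for counts 1, 2 and 3, as in the desired output. For type 3 the same logic gives 02:00:00 with one event and 01:00:00 with two, which differs from the single row listed in the question, so the intended counting rule there would need to be confirmed.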