Table car_log
Speed LogDate
5 2013-04-30 10:10:09 ->row1
6 2013-04-30 10:12:15 ->row2
4 2013-04-30 10:13:44 ->row3
17 2013-04-30 10:15:32 ->row4
22 2013-04-30 10:18:19 ->row5
3 2013-04-30 10:22:33 ->row6
4 2013-04-30 10:24:14 ->row7
15 2013-04-30 10:26:59 ->row8
2 2013-04-30 10:29:19 ->row9
I want to know for how long the car's speed was under 10.
My idea is to take the LogDate difference between row1 and row4 (because at, say, 10:14:44, between row3 and row4, the speed is still 4) and add the LogDate difference between row6 and row8. I am not sure whether this is right.
How can I calculate this with a MySQL query? Thank you.
For every row, find the first row with a later LogDate. If the speed in the current row is less than 10, take the difference between the next row's date and this row's date; otherwise use 0.
A query that would give a list of the values counted this way should look like:
SELECT ( SELECT IF( c1.speed <10, unix_timestamp( c2.LogDate ) - unix_timestamp( c1.logdate ) , 0 )
FROM car_log c2
WHERE c2.LogDate > c1.LogDate
ORDER BY c2.LogDate
LIMIT 1
) AS seconds_below_10
FROM car_log c1
Now it's just a matter of summing it up:
SELECT sum( seconds_below_10) FROM
( SELECT ( SELECT IF( c1.speed <10, unix_timestamp( c2.LogDate ) - unix_timestamp( c1.logdate ) , 0 )
FROM car_log c2
WHERE c2.LogDate > c1.LogDate
ORDER BY c2.LogDate
LIMIT 1
) AS seconds_below_10
FROM car_log c1 ) seconds_between_logs
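To check the logic outside the database, here is a minimal Python sketch of the same "pair each row with the next one" idea, run on the sample rows from the question. The names `LOG` and `seconds_below` are made up for this illustration.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Sample rows from the question: (speed, LogDate)
LOG = [
    (5, "2013-04-30 10:10:09"),
    (6, "2013-04-30 10:12:15"),
    (4, "2013-04-30 10:13:44"),
    (17, "2013-04-30 10:15:32"),
    (22, "2013-04-30 10:18:19"),
    (3, "2013-04-30 10:22:33"),
    (4, "2013-04-30 10:24:14"),
    (15, "2013-04-30 10:26:59"),
    (2, "2013-04-30 10:29:19"),
]


def seconds_below(rows, limit=10):
    """Sum the time until the next reading for every reading slower than `limit`."""
    parsed = sorted((datetime.strptime(ts, FMT), speed) for speed, ts in rows)
    total = 0
    # Pair each reading with its successor; the last reading has none and adds nothing.
    for (start, speed), (end, _) in zip(parsed, parsed[1:]):
        if speed < limit:
            total += int((end - start).total_seconds())
    return total


print(seconds_below(LOG))  # 589
```

The explicit sort plays the role of ordering the subquery by LogDate: without a defined order, "the next row" is not well defined.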
Update after comment about adding CarId:
When you have more than 1 car you need to add one more WHERE condition inside dependent subquery (we want next log for that exact car, not just any next log) and group whole rowset by CarId, possibly adding said CarId to the select to show it too.
SELECT sbl.carId, sum( sbl.seconds_below_10 ) as `seconds_with_speed_less_than_10` FROM
( SELECT c1.carId,
( SELECT IF( c1.speed <10, unix_timestamp( c2.LogDate ) - unix_timestamp( c1.logdate ) , 0 )
FROM car_log c2
WHERE c2.LogDate > c1.LogDate AND c2.carId = c1.carId
ORDER BY c2.LogDate
LIMIT 1 ) AS seconds_below_10
FROM car_log c1 ) sbl
GROUP BY sbl.carId
See an example at Sqlfiddle.
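The per-car variant can be sketched the same way in Python: split the log by car first, then pair consecutive readings within each car only. The readings in `CAR_LOG` are invented for illustration (the question gives no multi-car data), and `seconds_below_per_car` is a hypothetical helper name.

```python
from collections import defaultdict
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Hypothetical readings for two cars: (carId, speed, LogDate)
CAR_LOG = [
    (1, 5, "2013-04-30 10:10:00"),
    (2, 30, "2013-04-30 10:10:30"),
    (1, 12, "2013-04-30 10:11:00"),
    (2, 8, "2013-04-30 10:12:30"),
    (1, 7, "2013-04-30 10:13:00"),
    (2, 9, "2013-04-30 10:13:00"),
    (2, 40, "2013-04-30 10:15:00"),
]


def seconds_below_per_car(rows, limit=10):
    """Return {carId: seconds spent below `limit`}, pairing readings per car."""
    by_car = defaultdict(list)
    for car_id, speed, ts in rows:
        by_car[car_id].append((datetime.strptime(ts, FMT), speed))
    totals = {}
    for car_id, readings in by_car.items():
        readings.sort()  # the "next" reading must belong to the same car
        totals[car_id] = sum(
            int((end - start).total_seconds())
            for (start, speed), (end, _) in zip(readings, readings[1:])
            if speed < limit
        )
    return totals


print(seconds_below_per_car(CAR_LOG))  # {1: 60, 2: 150}
```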
If the type of column 'LogDate' is a MySQL DATETIME type, you can use the timestampdiff() function in your select statement to get the difference between timestamps. The timestampdiff function is documented in the manual at:
http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html#function_timestampdiff
You need to break the query down into subqueries, and then use the TIMESTAMPDIFF function.
The function takes three arguments: the unit you want the result in (e.g. SECOND, MINUTE, DAY), then the earlier value, and last the later value. It returns the later value minus the earlier one, expressed in that unit.
To get the maximum value for LogDate where speed is less than 10 use:
select MAX(LogDate) from <yourtable> where Speed<10
To get the minimum value for LogDate where speed is less than 10 use:
select MIN(LogDate) from <yourtable> where Speed<10
Now, combine these into a single query with the TIMESTAMPDIFF function:
select TIMESTAMPDIFF(SECOND, (select MIN(LogDate) from <yourtable> where Speed<10), (select MAX(LogDate) from <yourtable> where Speed<10));
If LogDate is of a different type, there are other Date/Time Diff functions to handle math between any of these types. You will just need to change 'TIMESTAMPDIFF' to the correct function for your column type.
Additional ref: http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html
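As a sanity check on what the MIN/MAX approach measures, here is a small Python sketch that applies it to the sample rows from the question. It computes the span between the first and last reading below 10, which also includes the stretches in between where the speed was 10 or more, so it is not the same quantity as the 589 seconds obtained by summing row-to-row intervals. The names `LOG`, `low` and `span_seconds` are illustrative only.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Sample rows from the question: (speed, LogDate)
LOG = [
    (5, "2013-04-30 10:10:09"),
    (6, "2013-04-30 10:12:15"),
    (4, "2013-04-30 10:13:44"),
    (17, "2013-04-30 10:15:32"),
    (22, "2013-04-30 10:18:19"),
    (3, "2013-04-30 10:22:33"),
    (4, "2013-04-30 10:24:14"),
    (15, "2013-04-30 10:26:59"),
    (2, "2013-04-30 10:29:19"),
]

# Equivalent of MIN(LogDate) / MAX(LogDate) ... WHERE Speed < 10
low = [datetime.strptime(ts, FMT) for speed, ts in LOG if speed < 10]

# Equivalent of TIMESTAMPDIFF(SECOND, min, max)
span_seconds = int((max(low) - min(low)).total_seconds())
print(span_seconds)  # 1150
```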
Try this SQL (SQL Server syntax):
;with data as (
select *
from ( values
( 5, convert(datetime,'2013-04-30 10:10:09') ),
( 6, convert(datetime,'2013-04-30 10:12:15') ),
( 4, convert(datetime,'2013-04-30 10:13:44') ),
(17, convert(datetime,'2013-04-30 10:15:32') ),
(22, convert(datetime,'2013-04-30 10:18:19') ),
( 3, convert(datetime,'2013-04-30 10:22:33') ),
( 4, convert(datetime,'2013-04-30 10:24:14') ),
(15, convert(datetime,'2013-04-30 10:26:59') ),
( 2, convert(datetime,'2013-04-30 10:29:19') )
) data(speed,logDate)
)
, durations as (
select
duration = case when speed<10
then datediff(ss, logDate, endDate)
else 0
end
from (
select
t1.speed, t1.logDate, endDate = (
select top 1 logDate
from data
where data.logDate > t1.logDate
order by data.logDate
)
from data t1
) T
where endDate is not null
)
select TotalDuration = sum(duration)
from durations
This calculates 589 seconds from the sample data provided.
raw data
no  group  date        value  flag
1   a      2022-10-13  old    y
2   a      2022-10-15  new    y
3   b      2022-01-01  old    n
4   b      2022-01-03  new    n
Step 1: insert row no 1.
Step 2: modify its date value using row no 2.
That is, I want to update row no 1 with the latest date, taken from row no 2,
on the condition that `flag` = "y".
final sql table
no  group  date        value  flag
1   a      2022-10-15  old    y
3   b      2022-01-01  old    n
Is it possible?
P.S. I insert/update the raw data row by row.
The question is not entirely clear, but I hope the answer below gives you a hint, if not the solution.
select no,
`group`,
case when flag='Y' then mx_dt else `date` end as new_date,
value,
flag
from ( select no,
`group`,
value,
`date`,
flag ,
row_number() over(partition by `group` order by `date` asc ) as rn,
max(`date`) over (partition by `group`,(case when flag <> 'Y' then `date` end) ) mx_dt
from raw_data
) as tbl
where rn=1;
The code above selects the max(date) per group if flag = 'Y'; otherwise it takes the row's own date.
https://dbfiddle.uk/JhRUti2h
The solution is to self join the source table and select the right field, prioritizing the latest date.
Here you have a working query:
WITH source_data AS (
SELECT 1 AS no_, 'a' AS group_, CAST('2022-10-13' AS DATE) AS date, 'old' AS value, 'y' AS flag
UNION ALL
SELECT 2, 'a', CAST('2022-10-15' AS DATE), 'new', 'y'
UNION ALL
SELECT 3, 'b', CAST('2022-01-01' AS DATE), 'old', 'n'
UNION ALL
SELECT 4, 'b', CAST('2022-01-03' AS DATE), 'new', 'n')
SELECT no_, group_, COALESCE(new_date, date), value, flag
FROM
(SELECT * FROM source_data WHERE value = 'old') old_values
LEFT JOIN (SELECT group_ AS new_group, date AS new_date FROM source_data WHERE value = 'new' AND flag='y') new_values
ON old_values.group_ = new_values.new_group
The result is what you expected:
no_ group_ f0_ value flag
1 a 2022-10-15 old y
3 b 2022-01-01 old n
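For readers who want to trace the join logic step by step, here is a rough Python equivalent of the self join above, using the four rows from the question. `SOURCE` and `merge_latest` are names invented for this sketch, and it assumes at most one flagged 'new' row per group (a SQL left join would instead produce one output row per match).

```python
from datetime import date

# The four raw rows from the question
SOURCE = [
    {"no": 1, "group": "a", "date": date(2022, 10, 13), "value": "old", "flag": "y"},
    {"no": 2, "group": "a", "date": date(2022, 10, 15), "value": "new", "flag": "y"},
    {"no": 3, "group": "b", "date": date(2022, 1, 1), "value": "old", "flag": "n"},
    {"no": 4, "group": "b", "date": date(2022, 1, 3), "value": "new", "flag": "n"},
]


def merge_latest(rows):
    """Keep the 'old' rows, taking the date of a flagged 'new' row when one exists."""
    # Right side of the join: date of each flagged 'new' row, keyed by group.
    new_dates = {
        r["group"]: r["date"]
        for r in rows
        if r["value"] == "new" and r["flag"] == "y"
    }
    result = []
    for r in rows:
        if r["value"] != "old":
            continue
        # COALESCE(new_date, date): fall back to the row's own date without a match.
        result.append({**r, "date": new_dates.get(r["group"], r["date"])})
    return result


for row in merge_latest(SOURCE):
    print(row["no"], row["group"], row["date"], row["value"], row["flag"])
```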
I want to apply a condition to a single column within a SELECT statement.
I want to compute the average of TOTAL_TIMEONSITE per visitor over June, July and August 2020, and give the resulting column an alias.
The query as a whole must only cover August 2020, so the three-month constraint should apply only to the TOTAL_TIMEONSITE average.
select FULLVISITORID AS VISITOR_ID,
VISITID AS VISIT_ID,
VISITSTARTTIME_TS,
USER_ACCOUNT_TYPE,
(select AVG(TOTAL_TIMEONSITE) AS AVG_TOTAL_TIME_ON_SITE_LAST_3M FROM "ACRO_DEV"."GA"."GA_MAIN" WHERE
(cast((visitstarttime_ts) as DATE) >= to_date('2020-06-01 00:00:00.000') and CAST((visitstarttime_ts) AS DATE) <= to_date('2020-08-31 23:59:00.000'))
GROUP BY TOTAL_TIMEONSITE),
CHANNELGROUPING,
GEONETWORK_CONTINENT
from "ACRO_DEV"."GA"."GA_MAIN"
where (FULLVISITORID) in (select distinct (FULLVISITORID) from "ACRO_DEV"."GA"."GA_MAIN" where user_account_type in ('anonymous', 'registered')
and (cast((visitstarttime_ts) as DATE) >= to_date('2020-08-01 00:00:00.000') and CAST((visitstarttime_ts) AS DATE) <= to_date('2020-08-31 23:59:00.000')));
The issue is that the resulting column is named after the whole subquery text instead of the alias, and every row gets the same value, whereas I want one value per visitor.
So for Snowflake:
I am going to assume visitstarttime_ts is a timestamp, so
cast((visitstarttime_ts) as DATE) is the same as `visitstarttime_ts::date`:
select to_timestamp('2020-08-31 23:59:00') as ts
,cast((ts) as DATE) as date_a
,ts::date as date_b;
gives:
TS                       DATE_A      DATE_B
2020-08-31 23:59:00.000  2020-08-31  2020-08-31
and thus the date range check can also be simpler:
select to_timestamp('2020-08-31 13:59:00') as ts
,cast((ts) as DATE) as date_a
,ts::date as date_b
,date_a >= to_date('2020-08-01 00:00:00.000') and date_a <= to_date('2020-08-31 23:59:00.000') as comp_a
,date_b >= to_date('2020-08-01 00:00:00.000') and date_b <= to_date('2020-08-31 23:59:00.000') as comp_b
,date_b >= '2020-08-01'::date and date_a <= '2020-08-31 23:59:00.000'::date as comp_c
,date_b between '2020-08-01'::date and '2020-08-31 23:59:00.000'::date as comp_d
TS                       DATE_A      DATE_B      COMP_A  COMP_B  COMP_C  COMP_D
2020-08-31 13:59:00.000  2020-08-31  2020-08-31  TRUE    TRUE    TRUE    TRUE
Anyways, if I understand what you want, I would write it using CTEs to make it more readable (to me):
with distinct_aug_ids as (
SELECT DISTINCT
fullvisitorid
FROM acro_dev.ga.ga_main
WHERE user_account_type IN ('anonymous', 'registered')
AND visitstarttime_ts::date BETWEEN '2020-08-01'::date AND '2020-08-31'::date
), three_month_avg as (
SELECT
fullvisitorid
,AVG(total_timeonsite) AS avg_total_time_on_site_last_3m
FROM acro_dev.ga.ga_main
WHERE visitstarttime_ts::DATE BETWEEN to_date('2020-06-01 00:00:00.000') AND to_date('2020-08-31 23:59:00.000')
GROUP BY 1
)
select
m.fullvisitorid as visitor_id,
m.visitid as visit_id,
m.visitstarttime_ts,
m.user_account_type,
tma.avg_total_time_on_site_last_3m,
m.channelgrouping,
m.geonetwork_continent
FROM acro_dev.ga.ga_main as m
JOIN distinct_aug_ids AS dai
ON m.fullvisitorid = dai.fullvisitorid
JOIN three_month_avg AS tma
ON m.fullvisitorid = tma.fullvisitorid
;
But if you want that to be sub-selects, they are the same:
select
m.fullvisitorid as visitor_id,
m.visitid as visit_id,
m.visitstarttime_ts,
m.user_account_type,
tma.avg_total_time_on_site_last_3m,
m.channelgrouping,
m.geonetwork_continent
FROM acro_dev.ga.ga_main as m
JOIN (
SELECT DISTINCT
fullvisitorid
FROM acro_dev.ga.ga_main
WHERE user_account_type IN ('anonymous', 'registered')
AND visitstarttime_ts::date BETWEEN '2020-08-01'::date AND '2020-08-31'::date
) AS dai
ON m.fullvisitorid = dai.fullvisitorid
JOIN (
SELECT
fullvisitorid
,AVG(total_timeonsite) AS avg_total_time_on_site_last_3m
FROM acro_dev.ga.ga_main
WHERE visitstarttime_ts::DATE BETWEEN to_date('2020-06-01 00:00:00.000') AND to_date('2020-08-31 23:59:00.000')
GROUP BY 1
)AS tma
ON m.fullvisitorid = tma.fullvisitorid
;
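To make the shape of the two joined sets concrete, here is a small Python sketch of the same idea: one set of visitors seen in August, one per-visitor average over June to August, and a join of both back onto the visits. The rows in `VISITS` are hypothetical, as are the helper names.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical visits: (fullvisitorid, visit date, total_timeonsite, user_account_type)
VISITS = [
    ("v1", date(2020, 6, 10), 100, "registered"),
    ("v1", date(2020, 8, 5), 300, "registered"),
    ("v2", date(2020, 7, 1), 50, "anonymous"),
    ("v2", date(2020, 8, 20), 150, "anonymous"),
    ("v3", date(2020, 7, 15), 80, "anonymous"),  # no August visit
]


def august_report(visits):
    # distinct_aug_ids: visitors with at least one qualifying visit in August 2020
    aug_ids = {
        vid
        for vid, day, _, account in visits
        if account in ("anonymous", "registered")
        and date(2020, 8, 1) <= day <= date(2020, 8, 31)
    }
    # three_month_avg: per-visitor average over June..August 2020
    times = defaultdict(list)
    for vid, day, seconds, _ in visits:
        if date(2020, 6, 1) <= day <= date(2020, 8, 31):
            times[vid].append(seconds)
    avg = {vid: mean(values) for vid, values in times.items()}
    # Inner joins: keep visits whose visitor appears in both sets
    return [
        (vid, day, avg[vid])
        for vid, day, _, _ in visits
        if vid in aug_ids and vid in avg
    ]


for row in august_report(VISITS):
    print(row)
```

Each visitor gets their own average (200 for v1 and 100 for v2 here), and v3 is dropped because it has no August visit.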
I have a table like this:
ID_____StartDate_____EndDate
----------------------------
1______05/01/2012___02/03/2013
2______06/30/2013___07/12/2013
3______02/17/2010___02/17/2013
4______12/10/2012___11/16/2013
I'm trying to get a count of the IDs that were active during each year. If an ID was active in multiple years, it should be counted once for each of them. I don't want to hardcode years into my query because the data spans many years (i.e. I can't use CASE YEAR(StartDate) WHEN x THEN y, or IF...).
Desired Result from the table above:
YEAR_____COUNT
2010_____1
2011_____1
2012_____3
2013_____4
I've tried:
SELECT COUNT(ID)
FROM table
WHERE (DATE_FORMAT(StartDate, '%Y-%m') BETWEEN '2013-01' AND '2013-12'
OR DATE_FORMAT(EndDate, '%Y-%m') BETWEEN '2013-01' AND '2013-12')
Of course, this only covers the year 2013. I also tried:
SELECT YEAR(StartDate) AS 'Start Year', YEAR(EndDate) AS 'End Year', COUNT(id)
FROM table
WHERE StartDate IS NOT NULL
GROUP BY YEAR(StartDate);
But this gave me only those that started in a given year.
Assuming that there is an auxiliary table that contains consecutive numbers from 1 to X (where X must be greater than the possible number of years in the table):
create table series( x int primary key auto_increment );
insert into series( x )
select null from information_schema.tables;
then the query might look like:
SELECT years.year, count(*)
FROM (
SELECT mm.min_year + s.x - 1 as year
FROM (
SELECT min( year( start_date )) min_year,
max( year( end_date )) max_year
FROM tab
) mm
JOIN series s
ON s.x <= mm.max_year - mm.min_year + 1
GROUP BY mm.min_year + s.x - 1
) years
JOIN tab
ON years.year between year( tab.start_date )
and year( tab.end_date )
GROUP BY years.year
;
see a demo: http://www.sqlfiddle.com/#!2/f49ab/14
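The expected counts can be cross-checked with a short Python sketch that expands each row into the years it spans, which is what the join against the generated list of years does. `ROWS` holds the four rows from the question (dates read as MM/DD/YYYY); `active_per_year` is a name made up for this example.

```python
from collections import Counter
from datetime import date

# Rows from the question: (ID, StartDate, EndDate)
ROWS = [
    (1, date(2012, 5, 1), date(2013, 2, 3)),
    (2, date(2013, 6, 30), date(2013, 7, 12)),
    (3, date(2010, 2, 17), date(2013, 2, 17)),
    (4, date(2012, 12, 10), date(2013, 11, 16)),
]


def active_per_year(rows):
    """Count, for every year, how many IDs were active at some point in it."""
    counts = Counter()
    for _id, start, end in rows:
        # An ID counts once for each calendar year between its start and end year.
        for year in range(start.year, end.year + 1):
            counts[year] += 1
    return dict(sorted(counts.items()))


print(active_per_year(ROWS))  # {2010: 1, 2011: 1, 2012: 3, 2013: 4}
```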
MySQL 5.5.29
Here is a mysql query I am working on without success:
SELECT ID, Bike,
(SELECT IF( MIN( ABS( DATEDIFF( '2011-1-1', Reading_Date ) ) ) = ABS( DATEDIFF( '2011-1-1', Reading_Date ) ) , Reading_Date, NULL ) FROM odometer WHERE Bike=10 ) AS StartDate,
(SELECT IF( MIN( ABS( DATEDIFF( '2011-1-1', Reading_Date ) ) ) = ABS( DATEDIFF( '2011-1-1', Reading_Date ) ) , Miles, NULL ) FROM odometer WHERE Bike=10 ) AS BeginMiles,
(SELECT IF( MIN( ABS( DATEDIFF( '2012-1-1', Reading_Date ) ) ) = ABS( DATEDIFF( '2012-1-1', Reading_Date ) ) , Reading_Date, NULL ) FROM odometer WHERE Bike=10 ) AS EndDate,
(SELECT IF( MIN( ABS( DATEDIFF( '2012-1-1', Reading_Date ) ) ) = ABS( DATEDIFF( '2012-1-1', Reading_Date ) ) , Miles, NULL ) FROM odometer WHERE Bike=10 ) AS EndMiles
FROM `odometer`
WHERE Bike =10;
And the result is:
ID Bike StartDate BeginMiles EndDate EndMiles
14 10 [->] 2011-04-15 27.0 NULL NULL
15 10 [->] 2011-04-15 27.0 NULL NULL
16 10 [->] 2011-04-15 27.0 NULL NULL
Motorcycle owners enter odometer readings once a year, at or near January 1. I want to calculate the total yearly mileage for each motorcycle.
Here is what the data in the table odometer looks like:
(source: bmwmcindy.org)
So to calculate the mileage for this bike for 2011, I need to determine which of these records is closest to Jan. 1, 2011, and that is record 14. The starting mileage would be 27. I then need to find the record closest to Jan. 1, 2012, and that is record 15. The ending mileage for 2011 is 10657 (which will also be the starting odometer reading when 2012 is calculated).
Here is the table:
DROP TABLE IF EXISTS `odometer`;
CREATE TABLE IF NOT EXISTS `odometer` (
`ID` int(3) NOT NULL AUTO_INCREMENT,
`Bike` int(3) NOT NULL,
`is_MOA` tinyint(1) NOT NULL,
`Reading_Date` date NOT NULL,
`Miles` decimal(8,1) NOT NULL,
PRIMARY KEY (`ID`),
KEY `Bike` (`Bike`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=22 ;
data for table odometer
INSERT INTO `odometer` (`ID`, `Bike`, `is_MOA`, `Reading_Date`, `Miles`) VALUES
(1, 1, 0, '2012-01-01', 5999.0),
(2, 6, 0, '2013-02-01', 14000.0),
(3, 7, 0, '2013-03-01', 53000.2),
(6, 1, 1, '2012-04-30', 10001.0),
(7, 1, 0, '2013-01-04', 31000.0),
(14, 10, 0, '2011-04-15', 27.0),
(15, 10, 0, '2011-12-31', 10657.0),
(16, 10, 0, '2012-12-31', 20731.0),
(19, 1, 1, '2012-09-30', 20000.0),
(20, 6, 0, '2011-12-31', 7000.0),
(21, 7, 0, '2012-01-03', 23000.0);
I am trying to get dates and miles from different records so that I can subtract the beginning miles from the ending miles to get the total miles for a particular bike (in the example, Bike=10) for a particular year (in this case 2011).
I have read quite a bit about aggregate functions and the problems of getting values from the correct record. I thought the answer was somehow in subqueries. But when I try the query above, I get data from only the first record. In this case the ending miles should come from the second record.
I hope someone can point me in the right direction.
Miles should be steadily increasing. It would be nice if something like this worked:
select year(Reading_Date) as yr,
max(miles) - min(miles) as MilesInYear
from odometer o
where bike = 10
group by year(reading_date)
Alas, your logic is really much harder than you think. This would be easier in a database such as SQL Server 2012 or Oracle that has the lead and lag functions.
My approach is to find the first and last reading dates for each year. You can calculate this using a correlated subquery:
select o.*,
(select max(o2.Reading_Date) from odometer o2 where o.bike = o2.bike and o2.Reading_Date <= makedate(year(o.Reading_Date), 1)
) ReadDateForYear
from odometer o
Next, summarize this at the bike and year levels. If there is no read date on or before the beginning of the year, use the first date:
select bike, year(Reading_Date) as yr,
coalesce(min(ReadDateForYear), min(Reading_Date)) as FirstReadDate,
coalesce(min(ReadDateForNextYear), max(Reading_Date)) as LastReadDate
from (select o.*,
(select max(o2.Reading_Date) from odometer o2 where o.bike = o2.bike and o2.Reading_Date <= makedate(year(o.Reading_Date), 1)
) ReadDateForYear,
(select max(o2.Reading_Date) from odometer o2 where o.bike = o2.bike and o2.Reading_Date <= makedate(year(o.Reading_Date) + 1, 1)
) ReadDateForNextYear
from odometer o
) o
group by bike, year(Reading_Date)
Let me call this <q>. To get the final results, you need something like:
select the fields you need
from <q> q join
odometer s
on s.bike = q.bike and s.Reading_Date = q.FirstReadDate join
odometer e
on e.bike = q.bike and e.Reading_Date = q.LastReadDate
Note: this SQL is untested. I'm sure there are syntax errors.
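For comparison, the "closest reading to January 1" rule described in the question can be sketched in a few lines of Python. This is only an illustration of that rule on the bike 10 rows from the INSERT statement, not a translation of the SQL above (which picks the latest reading on or before January 1 instead). `READINGS`, `closest_reading` and `miles_in_year` are hypothetical names.

```python
from datetime import date

# Bike 10 readings from the question's INSERT statement: (ID, Reading_Date, Miles)
READINGS = [
    (14, date(2011, 4, 15), 27.0),
    (15, date(2011, 12, 31), 10657.0),
    (16, date(2012, 12, 31), 20731.0),
]


def closest_reading(readings, target):
    """Return the reading whose date is nearest to `target` (before or after)."""
    return min(readings, key=lambda r: abs((r[1] - target).days))


def miles_in_year(readings, year):
    """Miles between the readings closest to Jan 1 of `year` and of `year + 1`."""
    start = closest_reading(readings, date(year, 1, 1))
    end = closest_reading(readings, date(year + 1, 1, 1))
    return end[2] - start[2]


print(miles_in_year(READINGS, 2011))  # 10630.0
print(miles_in_year(READINGS, 2012))  # 10074.0
```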
I have a query that works correctly to pull a series of targets and total hours worked for company A. I would like to run the exact same query for company B and join them on a common date, which happens to be grouped by week. My current query:
SELECT * FROM (
SELECT org, date,
( SELECT SUM( target ) FROM target WHERE org = "companyA" ) AS companyA_target,
SUM( hours ) AS companyA_actual
FROM time_management_system
WHERE org = "companyA"
GROUP BY WEEK( date )
ORDER BY DATE
) q1
LEFT JOIN (
SELECT org, date,
( SELECT SUM( target ) FROM target WHERE org = "companyB" ) AS companyB_target,
SUM( hours ) AS companyB_actual
FROM time_management_system
WHERE org = "companyB"
GROUP BY WEEK( date )
ORDER BY DATE
) q2
ON q1.date = q2.date
The results show all of the dates / information of companyA, however companyB only shows sporadic data. Separately, the two queries will show the exact same set of dates, just with different information in the 'target' and 'actual' columns.
companyA 2012-01-28 105.00 39.00 NULL NULL NULL NULL
companyA 2012-02-05 105.00 15.00 NULL NULL NULL NULL
companyA 2012-02-13 105.00 60.50 companyB 2012-02-13 97.50 117.50
Any idea why I'm not getting all the information for companyB?
As a side note, would anybody be able to point in the direction of converting each row's week value into a column? With companyA and companyB as the only two rows?
I appreciate all the help! Thanks.
With no date apparent in the target table, the summation will be constant across all weeks. So, I have performed a pre-query for only those "org" values of company A and B with a group by. This will ensure only 1 record per "org" so you don't get a Cartesian result.
Then, I am querying the time_management_system ONCE for BOTH companies. Within the field computations, I am applying an IF() to test the company value and apply when correct. The WEEK activity is the same for both in the final result, so I don't have to do separately and join. This also prevents the need of having the date column appear twice. I also don't need to explicitly add the org column names as the final column names reflect that.
SELECT
WEEK( tms.date ) as GrpWeek,
MAX( IF( tms.org = "companyA", TargetSums.CompTarget, 00000.00 )) as CompanyATarget,
SUM( IF( tms.org = "companyA", tms.hours, 0000.00 )) as CompanyAHours,
MAX( IF( tms.org = "companyB", TargetSums.CompTarget, 00000.00 )) as CompanyBTarget,
SUM( IF( tms.org = "companyB", tms.hours, 000.00 )) as CompanyBHours
from
time_management_system tms
JOIN ( select
t.org,
SUM( t.target ) as CompTarget
from
target t
where
t.org in ( "companyA", "companyB" )
group by
t.org ) as TargetSums
ON tms.org = TargetSums.org
where
tms.org in ( "companyA", "companyB" )
group by
WEEK( tms.date )
order by
WEEK( tms.date )
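The conditional aggregation used above (one pass over the rows, with each company's hours routed into its own column per week) can be illustrated with a short Python sketch. The rows in `TIME_ROWS` are hypothetical, and the sketch uses ISO week numbers, which do not necessarily match the numbering of MySQL's WEEK() in its default mode.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows from time_management_system: (org, date, hours)
TIME_ROWS = [
    ("companyA", date(2012, 2, 13), 8.0),
    ("companyB", date(2012, 2, 14), 6.5),
    ("companyA", date(2012, 2, 15), 4.0),
    ("companyB", date(2012, 2, 21), 7.0),
]


def hours_per_week(rows):
    """Return {week: {"companyA": hours, "companyB": hours}} in a single pass."""
    weeks = defaultdict(lambda: {"companyA": 0.0, "companyB": 0.0})
    for org, day, hours in rows:
        week = day.isocalendar()[1]  # ISO week number, used here as the grouping key
        # Same effect as SUM( IF( org = "...", hours, 0 ) ) for each company column
        weeks[week][org] += hours
    return dict(sorted(weeks.items()))


print(hours_per_week(TIME_ROWS))
# {7: {'companyA': 12.0, 'companyB': 6.5}, 8: {'companyA': 0.0, 'companyB': 7.0}}
```

Because both companies share one grouping key, a week in which only one of them logged hours still produces a row, with 0 for the other, and no join on a representative date is needed.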
Both of your subqueries are wrong.
Either you want this:
SELECT
org,
WEEK(date),
( SELECT SUM( target ) FROM target WHERE org = "companyB" ) AS companyB_target,
SUM( hours ) AS companyB_actual
FROM time_management_system
WHERE org = "companyB"
GROUP BY WEEK( date )
Or else you want this:
SELECT
org,
date,
( SELECT SUM( target ) FROM target WHERE org = "companyB" ) AS companyB_target,
SUM( hours ) AS companyB_actual
FROM time_management_system
WHERE org = "companyB"
GROUP BY date
The way you are doing it now is not correctly formed SQL. In pretty much any other database your query would fail immediately with an error. MySQL is more lax and runs the query but gives indeterminate results.
See "GROUP BY and HAVING with Hidden Columns" in the MySQL manual.