I have the following sql table:
id time
1 2018-12-30
1 2018-12-31
1 2019-01-03
2 2018-12-15
2 2018-12-30
I want to write a query that produces the following result:
id start_time end_time
1 2018-12-30 2018-12-31
1 2018-12-31 2019-01-03
2 2018-12-15 2018-12-30
Is this even possible in SQL in a reasonable amount of time, or is it better done by other means?
Failed approach (it takes too long):
SELECT id, time AS start_time, (
SELECT MIN(time)
FROM my_table AS T2
WHERE T2.id = T1.id
AND T2.time > T1.time
) AS end_time
FROM my_table AS T1
My table holds dates, each with a non-unique id. I want to calculate the time range between neighbouring dates for each id, so the transformation should be applied to each id separately and should not affect other ids.
We can even forget about ids and imagine that the table has a single column of dates. I want to sort the dates and slide a window of size 2 over them with step 1. So if I have 10 dates, the result should contain 9 time ranges in increasing order. Assume we have four dates: D1 < D2 < D3 < D4. The result should be (D1,D2), (D2,D3), (D3,D4).
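The sliding window described above can be sketched outside the database in a few lines of Python. This is only an illustration of the pairing rule, not a replacement for a SQL solution; the function name `consecutive_ranges` and the sample dates are made up for the example.

```python
def consecutive_ranges(dates):
    """Return (start, end) pairs for neighbouring dates in ascending order."""
    ordered = sorted(dates)
    # Pair each element with its successor: window of size 2, step 1.
    return list(zip(ordered, ordered[1:]))


# Hypothetical sample: ISO-formatted date strings sort chronologically,
# so plain strings are enough for this sketch.
sample = ["2018-03-01", "2018-01-01", "2018-04-01", "2018-02-01"]
for start, end in consecutive_ranges(sample):
    print(start, end)
```

Ten input dates give nine pairs, and a single date gives none, which matches the behaviour asked for in the question.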
In MySQL 8.x you can use the LEAD() function to peek at the next row:
with x as (
select
id,
time as start_time,
lead(time) over(partition by id order by time) as end_time
from my_table
)
select * from x where end_time is not null
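To try the LEAD() query without a MySQL 8 server, here is a minimal sketch that runs the same statement against an in-memory SQLite database from Python. It assumes the bundled SQLite is version 3.25 or newer (the first release with window functions), and SQLite's dialect is only a stand-in for MySQL here. The rows mirror the question's table, with the last date for id 1 written as 2019-01-03 so that it sorts after 2018-12-31.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, time TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [
        (1, "2018-12-30"), (1, "2018-12-31"), (1, "2019-01-03"),
        (2, "2018-12-15"), (2, "2018-12-30"),
    ],
)

# Same shape as the answer's query; ORDER BY is added only to make the
# printed result deterministic.
query = """
WITH x AS (
    SELECT id,
           time AS start_time,
           LEAD(time) OVER (PARTITION BY id ORDER BY time) AS end_time
    FROM my_table
)
SELECT id, start_time, end_time
FROM x
WHERE end_time IS NOT NULL
ORDER BY id, start_time
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each id contributes one row fewer than it has dates, because the last date of every partition has no successor and is filtered out by the `end_time IS NOT NULL` condition.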
Related
I need to extract data from a MySQL table, but am not allowed to include a record if there's a previous record less than a year old.
Given the following records, only the records 1, 3 and 5 should be included (because record 2 was created 1 month after record 1, and record 4 was created 1 month after record 3):
1 2019-12-21
2 2020-01-21
3 2021-12-21
4 2022-01-21
5 2023-12-21
I came up with the following non-functional solution:
SELECT
*
FROM
table t
WHERE
(created_at > DATE_ADD(
(SELECT
created_at
FROM
table t2
WHERE
t2.created_at < t.created_at
ORDER BY
t2.created_at
DESC LIMIT 1), INTERVAL 1 YEAR))
But this only returns the first and the last record, but not the third:
1 2019-12-21
5 2023-12-21
I know why: the third record is excluded because record 2 precedes it by less than a year. But record 2 shouldn't be taken into account, because it won't make the list itself.
How can I solve this?
Using lag(), assuming your MySQL version supports it, you can calculate the difference in months with period_diff():
with d as (
select * ,
period_diff(extract(year_month FROM date),
extract(year_month from lag(date,1,date) over (order by date))
) as m
from t
)
select id, date
from d
where m=0 or m>12
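Note that the query above compares each row with its immediate predecessor, whereas the question asks for a comparison with the last record that actually made the list. On the sample data both rules give records 1, 3 and 5, but they can differ on other inputs. The stricter rule is easy to state procedurally; the sketch below does it in Python, assuming the rows have already been fetched. The names `add_one_year` and `filter_by_gap` are hypothetical, and the February 29 handling is chosen to match what MySQL's DATE_ADD does with INTERVAL 1 YEAR.

```python
from datetime import date


def add_one_year(d):
    """Return the same day one year later (Feb 29 becomes Feb 28)."""
    try:
        return d.replace(year=d.year + 1)
    except ValueError:
        return d.replace(year=d.year + 1, day=28)


def filter_by_gap(records):
    """Keep a record only if it is more than a year after the last kept one.

    records: iterable of (record_id, created_at) tuples.
    """
    kept = []
    last_kept = None
    for rec_id, created in sorted(records, key=lambda r: r[1]):
        if last_kept is None or created > add_one_year(last_kept):
            kept.append((rec_id, created))
            last_kept = created
    return kept


records = [
    (1, date(2019, 12, 21)),
    (2, date(2020, 1, 21)),
    (3, date(2021, 12, 21)),
    (4, date(2022, 1, 21)),
    (5, date(2023, 12, 21)),
]
print([rec_id for rec_id, _ in filter_by_gap(records)])
```

A chain such as 2019-01-01, 2019-07-01, 2020-03-01 shows the difference: the third date is only eight months after the second, but the second is itself dropped, so the sketch keeps the first and third.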
Given a database table that contains a list of race times, I need to identify which performances are faster than all earlier finish times for that athlete at a specific distance, i.e. which were their best time at the time of the performance.
Also, would it be better to store this as a boolean in an additional column rather than calculating it in every SELECT? The database isn't populated in chronological order, so I'm not sure a TRIGGER would help. I was thinking of a query that runs over the whole table after any inserts or updates. I appreciate this may have a performance impact, so it could be run periodically rather than on each row update.
This is on a MySQL 5.6.47 server.
Example table
athleteId date distance finishTime
1 2020-01-04 5K 30:00
1 2020-01-11 5K 30:09
1 2020-01-18 5K 29:45
1 2020-01-25 5K 29:32
1 2020-02-01 5K 31:18
1 2020-02-02 10K 1:06:07
1 2020-02-08 5K 28:25
1 2020-02-23 10K 1:06:02
1 2020-02-23 10K 1:07:30
Expected output
athleteId date distance finishTime isPersonalBest
1 2020-01-04 5K 30:00 Y
1 2020-01-11 5K 30:09 N
1 2020-01-18 5K 29:45 Y
1 2020-01-25 5K 29:32 Y
1 2020-02-01 5K 31:18 N
1 2020-02-02 10K 1:06:07 Y
1 2020-02-08 5K 28:25 Y
1 2020-02-23 10K 1:06:02 Y
1 2020-02-23 10K 1:07:30 N
The data is just an example. The actual finish times are stored in seconds. There will be many more athletes and different event distances. If a performance is the first for that athlete at that distance, it would be classed as a personal best.
If you are running MySQL 8.0, you can use window functions:
select
t.*,
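case when finishTime >= min(finishTime) over(
partition by athleteId, distance
order by date
rows between unbounded preceding and 1 preceding
)
then 'N' -- an earlier time was at least as fast
else 'Y' -- also covers the first result, where the window min is null
end isPersonalBest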
from mytable t
In earlier versions, one option is a correlated subquery:
select
t.*,
case when exists(
select 1
from mytable t1
where
t1.athleteId = t.athleteId
and t1.distance = t.distance
and t1.date < t.date
and t1.finishTime <= t.finishTime
)
then 'N'
else 'Y'
end isPersonalBest
from mytable t
I wouldn't recommend actually storing this derived information. Instead, you can use the above query to create a view.
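For checking the expected output by hand, the same rule can be sketched in Python: group by athlete and distance, walk the results in date order, and compare each time with the best time from strictly earlier dates (the same condition as `t1.date < t.date` in the correlated subquery). The function name `mark_personal_bests` is made up for this example, and finish times are given in seconds, as the question says they are stored.

```python
from itertools import groupby


def mark_personal_bests(rows):
    """rows: iterable of (athlete_id, date, distance, finish_seconds) tuples.

    Returns the rows with a trailing 'Y'/'N' flag, ordered by athlete,
    distance and date. A row gets 'Y' when no result on a strictly earlier
    date for the same athlete and distance was at least as fast.
    """
    result = []
    ordered = sorted(rows, key=lambda r: (r[0], r[2], r[1]))
    for _, event_rows in groupby(ordered, key=lambda r: (r[0], r[2])):
        best_so_far = None  # best time on strictly earlier dates
        for _, same_day in groupby(event_rows, key=lambda r: r[1]):
            same_day = list(same_day)
            for row in same_day:
                is_best = best_so_far is None or row[3] < best_so_far
                result.append(row + ("Y" if is_best else "N",))
            day_best = min(r[3] for r in same_day)
            if best_so_far is None or day_best < best_so_far:
                best_so_far = day_best
    return result


sample = [
    (1, "2020-01-04", "5K", 1800),
    (1, "2020-01-11", "5K", 1809),
    (1, "2020-01-18", "5K", 1785),
    (1, "2020-01-25", "5K", 1772),
    (1, "2020-02-01", "5K", 1878),
    (1, "2020-02-02", "10K", 3967),
    (1, "2020-02-08", "5K", 1705),
    (1, "2020-02-23", "10K", 3962),
    (1, "2020-02-23", "10K", 4050),
]
for row in mark_personal_bests(sample):
    print(row)
```

Results that share a date are judged only against earlier dates, not against each other, which is why both 10K runs on 2020-02-23 are compared with the 2020-02-02 time.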
You can use a cumulative min in MySQL 8+:
select t.*,
(case when finishTime >=
min(finishTime) over (partition by athleteid, distance
order by date
rows between unbounded preceding and 1 preceding
)
then 'N' else 'Y'
end) as isPersonalBest
from t;
In earlier versions, you could use not exists:
select t.*,
(case when not exists (select 1
from t t2
where t2.athleteid = t.athleteid and
t2.distance = t.distance and
t2.date < t.date and
t2.finishTime <= t.finishTime
)
then 'Y' else 'N'
end) as isPersonalBest
from t;
I have the following database schema
ID creation_date
1 2019-06-03
2 2019-06-04
3 2019-06-04
4 2019-06-10
5 2019-06-11
I need to find the total size of the table as of each week, i.e. a running count grouped by week. The output I am looking for is something like
year week number_of_records
2019 23 3
2019 24 5
I wrote the following query, which only gives me the number of records created in each week:
select year(creation_date) as year, weekofyear(creation_date) as week,
count(id) as number_of_records from input group by year, week;
Output I get is
year week number_of_records
2019 23 3
2019 24 2
Take a look at window (or analytic) functions.
Unlike aggregate functions, window functions keep every row of the result and let you compute values across related rows. When the over clause contains an order by, the default window runs from the first row to the current row in the specified order, which is exactly what you need.
select year, week, sum(number_of_records) over (order by year, week)
from (
select year(creation_date) as year, weekofyear(creation_date) as week,
count(id) as number_of_records
from input group by year, week
) your_sql
I guess you will also need to reset the sum for each year, which I leave as an exercise for you (hint: the partition by clause).
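As a cross-check of the expected numbers, the running total can be sketched in Python with a per-week count followed by a cumulative sum. This is an illustration only; `running_weekly_totals` is a made-up name. It uses ISO weeks, which is what WEEKOFYEAR() returns, but it also uses the ISO year, which can differ from YEAR() for dates around New Year.

```python
from collections import Counter
from datetime import date
from itertools import accumulate


def running_weekly_totals(creation_dates):
    """Return (iso_year, iso_week, running_total) tuples in chronological order."""
    per_week = Counter()
    for d in creation_dates:
        iso = d.isocalendar()  # (ISO year, ISO week, ISO weekday)
        per_week[(iso[0], iso[1])] += 1
    weeks = sorted(per_week)
    totals = accumulate(per_week[w] for w in weeks)
    return [(year, week, total) for (year, week), total in zip(weeks, totals)]


dates = [
    date(2019, 6, 3), date(2019, 6, 4), date(2019, 6, 4),
    date(2019, 6, 10), date(2019, 6, 11),
]
for year, week, total in running_weekly_totals(dates):
    print(year, week, total)
```

The first step corresponds to the inner GROUP BY and the `accumulate` call to the `sum(...) over (order by ...)` in the query above.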
For versions prior to 8.0...
Schema (MySQL v5.7)
CREATE TABLE my_table
(ID SERIAL PRIMARY KEY
,creation_date DATE NOT NULL
);
INSERT INTO my_table VALUES
(1 , '2019-06-03'),
(2 , '2019-06-04'),
(3 , '2019-06-04'),
(4 ,'2019-06-10'),
(5 ,'2019-06-11');
Query #1
SELECT a.yearweek
, @i:=@i+a.total running
FROM
(SELECT DATE_FORMAT(x.creation_date,'%x-%v') yearweek
, COUNT(*) total
FROM my_table x
GROUP BY yearweek
)a
JOIN (SELECT @i:=0) vars
ORDER BY a.yearweek;
| yearweek | running |
| -------- | ------- |
| 2019-23 | 3 |
| 2019-24 | 5 |
You seem to want a cumulative sum. You can do this with window functions directly in an aggregation query:
select year(creation_date) as year, weekofyear(creation_date) as week,
count(*) as number_of_starts,
sum(count(*)) over (order by min(creation_date)) as number_of_records
from input
group by year, week;
I want to visualize my entries by counting how many were created on the same day.
SELECT dayname(created_at), count(*) FROM logs
group by day(created_at)
ORDER BY created_at desc
LIMIT 7
So I get something like:
Thursday 4
Wednesday 12
Monday 4
Sunday 1
Saturday 20
Friday 23
Thursday 10
But I also want Tuesday in there with a count of 0, so that I have a full week.
Is there a way to do this purely in MySQL, or do I need to patch the result before I pass it to the chart?
EDIT:
This is the final query:
SELECT
DAYNAME(date_add(NOW(), interval days.id day)) AS day,
count(logs.id) AS amount
FROM days LEFT OUTER JOIN
(SELECT *
FROM logs
WHERE TIMESTAMPDIFF(DAY,DATE(created_at),now()) < 7) logs
on datediff(created_at, NOW()) = days.id
GROUP BY days.id
ORDER BY days.id desc;
The table days includes numbers from 0 to -6
You only need a table of offsets which could be a real table or something built on the fly like select 0 ofs union all select -1 ....
create table days (ofs int);
insert into days (ofs) values
(0), (-1), (-2), (-3),
(-4), (-5), (-6), (-7);
select
date_add('20160121', interval days.ofs day) as created_at,
count(data.id) as cnt
from days left outer join logs data
on datediff(data.created_at, '20160121') = days.ofs
group by days.ofs
order by days.ofs;
http://sqlfiddle.com/#!9/3e6bc7/1
For performance it would probably be better to limit the search in the data (logs) table:
select
date_add('20160121', interval days.ofs day) as created_at,
count(data.id) as cnt
from days left outer join
(select * from logs where created_at between <start> and <end>) data
on datediff(data.created_at, '20160121') = days.ofs
group by days.ofs
order by days.ofs;
One downside is that you do have to parameterize this with a fixed anchor date in a couple of expressions. It might be better to have a table of real dates sitting in a table somewhere so you don't have to do the calculations.
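If the zero-filling ends up being done in application code instead, the offsets idea carries over directly: build the seven days from an anchor date and look each one up in a per-day count. A minimal Python sketch, assuming the `created_at` values have already been fetched as datetime objects; `counts_for_last_week` is a hypothetical name and the sample timestamps are invented.

```python
from collections import Counter
from datetime import date, datetime, timedelta


def counts_for_last_week(created_at_values, anchor):
    """Return [(day, count)] for anchor and the six days before it, newest first."""
    per_day = Counter(ts.date() for ts in created_at_values)
    days = [anchor - timedelta(days=offset) for offset in range(7)]
    # A Counter returns 0 for missing keys, so empty days need no special case.
    return [(day, per_day[day]) for day in days]


# Invented sample data; the anchor matches the date used in the queries above.
logs = [
    datetime(2016, 1, 21, 9, 0),
    datetime(2016, 1, 21, 17, 30),
    datetime(2016, 1, 19, 12, 0),
    datetime(2016, 1, 15, 8, 15),
    datetime(2016, 1, 10, 8, 15),  # older than a week, ignored
]
for day, count in counts_for_last_week(logs, date(2016, 1, 21)):
    print(day.isoformat(), count)
```

The list of days plays the role of the `days` table, and the lookup with a default of 0 plays the role of the left outer join.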
Use a RIGHT JOIN to a dates table, so that you get a row for every day whether or not it has data; days without data will simply show as zero or NULL.
You can create a dates table, a kind of calendar table:
id_day | day_date |
--------------------
1 | 2016-01-01 |
2 | 2016-01-02 |
.
.
365 | 2016-12-31 |
With this table you can join on the date and then extract the day, month, week, or whatever you want with the MySQL date and time functions:
SELECT dayname(t2.day_date), count(t1.created_at)
FROM logs t1
RIGHT JOIN dates_table t2 ON date(t1.created_at) = t2.day_date
GROUP BY t2.day_date
ORDER BY t2.day_date DESC
LIMIT 7
I want to write a MySQL query that gets daily differential values from a table that looks like this:
Date | VALUE
--------------------------------
"2011-01-14 19:30" | 5
"2011-01-15 13:30" | 6
"2011-01-15 23:50" | 9
"2011-01-16 9:30" | 10
"2011-01-16 18:30" | 15
I have made two subqueries. The first one is to get the last daily value, because I want to compute the difference values from this data:
SELECT r.Date, r.VALUE
FROM table AS r
JOIN (
SELECT DISTINCT max(t.Date) AS Date
FROM table AS t
WHERE t.Date < CURDATE()
GROUP BY DATE(t.Date)
) AS x USING (Date)
The second one is made to get the differential values from the result of the first one (I show it with "table" name):
SELECT Date, VALUE - IFNULL(
(SELECT MAX( VALUE )
FROM table
WHERE Date < t1.Date) , 0) AS diff
FROM table AS t1
ORDER BY Date
At first, I tried to save the result of the first query in a temporary table, but it's not possible to use temporary tables with the second query. If I put the first query inside the FROM of the second one, in parentheses with an alias, the server complains that the table alias doesn't exist. How can I get something like this:
Date | VALUE
---------------------------
"2011-01-15 00:00" | 4
"2011-01-16 00:00" | 6
Try this query -
SELECT
t1.dt AS date,
t1.value - t2.value AS value
FROM
(SELECT DATE(date) dt, MAX(value) value FROM table GROUP BY dt) t1
JOIN
(SELECT DATE(date) dt, MAX(value) value FROM table GROUP BY dt) t2
ON t1.dt = t2.dt + INTERVAL 1 DAY
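To sanity-check the expected numbers, the calculation can be sketched in Python: keep the last reading of each day, then subtract the previous day's value. Two details differ from the query above and are assumptions of this sketch: it takes the last reading of the day rather than MAX(value), and it pairs each day with the previous day that has data rather than strictly the calendar day before. The function name `daily_differences` is made up.

```python
from datetime import datetime


def daily_differences(readings):
    """readings: iterable of (datetime, value). Returns [(date, diff)] in date order."""
    last_per_day = {}
    for ts, value in sorted(readings):
        last_per_day[ts.date()] = value  # later readings overwrite earlier ones
    days = sorted(last_per_day)
    return [
        (cur, last_per_day[cur] - last_per_day[prev])
        for prev, cur in zip(days, days[1:])
    ]


readings = [
    (datetime(2011, 1, 14, 19, 30), 5),
    (datetime(2011, 1, 15, 13, 30), 6),
    (datetime(2011, 1, 15, 23, 50), 9),
    (datetime(2011, 1, 16, 9, 30), 10),
    (datetime(2011, 1, 16, 18, 30), 15),
]
for day, diff in daily_differences(readings):
    print(day.isoformat(), diff)
```

On the sample data the values only increase, so the last reading of a day is also its maximum and both approaches agree.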