I'm trying to find a way of retrieving a value from the previous row. What I want to do is first sort the rows by Date 1 (earliest first). Then, if Date 2 is later than all previous dates in that column, I want to pull out that row (plus the first initial row). My server does not support the LAG function. I have tried suggestions using CTE, but my server does not seem to recognise that either.
What I want to do is check whether, after sorting by Date 1, if Date_2 for row 2 > Date_2 for row 1, and if so return that row.
Here's an example table. As you can see, the ID is not in the same order as Date 1.
ID Date 1 Date 2
1 2000-01-01 2010-01-01
2 2001-08-01 2013-06-01
3 2000-06-01 2011-01-01
4 1999-07-01 2010-12-01
5 2002-02-01 2012-12-01
So in my example, I want these 3 records to be returned:
ID Date_1 Date_2 Previous_max
4 1999-07-01 2010-12-01 NULL
3 2000-06-01 2011-01-01 2010-12-01
2 2001-08-01 2013-06-01 2011-01-01
ID 1 and 5 are not returned because Date 1 is later and Date 2 is earlier than another row (4 and 2 respectively).
You should be able to do this with a correlated subquery:
select t.*,
(select max(date_2) from table t2 where t2.date_1 < t.date_1) as prev_max
from table t
having prev_max is null or prev_max < date_2;
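The logic of that filter can be checked outside the database. The following is a minimal Python sketch, not the server-side solution: the function name `rows_with_new_max` and the `SAMPLE` list are hypothetical, and it assumes the Date 1 values are distinct (the SQL above compares with a strict `<`, so ties would behave differently).

```python
from datetime import date

# Sample rows from the question: (id, date_1, date_2)
SAMPLE = [
    (1, date(2000, 1, 1), date(2010, 1, 1)),
    (2, date(2001, 8, 1), date(2013, 6, 1)),
    (3, date(2000, 6, 1), date(2011, 1, 1)),
    (4, date(1999, 7, 1), date(2010, 12, 1)),
    (5, date(2002, 2, 1), date(2012, 12, 1)),
]

def rows_with_new_max(rows):
    """Sort by date_1 and keep rows whose date_2 exceeds every earlier date_2.

    Returns tuples of (id, date_1, date_2, prev_max).
    """
    result = []
    prev_max = None  # largest date_2 among rows with an earlier date_1
    for row_id, d1, d2 in sorted(rows, key=lambda r: r[1]):
        if prev_max is None or d2 > prev_max:
            result.append((row_id, d1, d2, prev_max))
        prev_max = d2 if prev_max is None else max(prev_max, d2)
    return result

if __name__ == "__main__":
    for row in rows_with_new_max(SAMPLE):
        print(row)
```

On the sample data this keeps IDs 4, 3 and 2, in that order, with the same Previous_max values as the expected result in the question.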
I need to extract and migrate values from one table to another. The source table contains summarized values for a specific effective date. Whenever one of the component values changes, a new row is written, and its data is valid starting from that effective date.
source_id  entity_id  effective_date  component_1  component_2  component_3
int(ai)    int        date            int          int          int
1          159        2020-01-01      100          0            90
2          159        2020-05-01      140          50           90
3          159        2020-08-01      0            30           90
5          159        2020-12-01      0            30           50
I now need to migrate this data to a new table like the one below. The goal is that selecting data for a given month returns the values that are valid for that month.
id       source_id  entity_id  startdate  enddate  component_type  value
int(ai)  int        int        date       date     int             int
Each row represents the value of one component, valid for a period of months.
I currently run the insert/update for each effective month by setting it as a parameter.
I insert value changes as new rows into the table and prevent duplicates by using a unique key (entity_id,effective_date,component_type).
SET @effective_date = '2020-01-01';
INSERT INTO component_final
select NULL,
source_id,
entity_id,
effective_date,
NULL,
1,
component_1
FROM component_source
WHERE effective_date = @effective_date
AND component_1>0;
After migrating the first row, the result should be:
id  source_id  entity_id  startdate   enddate  component_type  value
1   1          159        2020-01-01  NULL     1               100
2   1          159        2020-01-01  NULL     3               90
SET @effective_date = '2020-05-01';
INSERT INTO component_final
select NULL,
source_id,
entity_id,
effective_date,
NULL,
1,
component_1
FROM component_source
WHERE effective_date = @effective_date
AND component_1>0;
After migrating the second row, the result should be:
id  source_id  entity_id  startdate   enddate     component_type  value
1   1          159        2020-01-01  2020-04-30  1               100
2   1          159        2020-01-01  NULL        3               90
3   2          159        2020-05-01  NULL        1               140
4   2          159        2020-05-01  NULL        2               50
So if a value changes at a later date, the end date of the previous row has to be set.
I'm not able to do the second step: updating the existing rows when the component changes later.
Maybe it is possible to do this with a trigger that fires after inserting a new row with the same entity and component, but I was not able to make it work.
Any ideas? I want to handle this inside MySQL only.
You do not need the column enddate in the table component_final, because its value depends on other values in the same table:
SELECT
id,
source_id,
entity_id,
startdate,
( SELECT DATE_ADD(MIN(cf2.startdate),INTERVAL -1 DAY)
FROM component_final cf2
WHERE cf2.startdate > cf1.startdate
AND cf2.component_type = cf1.component_type
AND cf2.entity_id = cf1.entity_id
) as enddate,
component_type,
value
FROM component_final cf1;
I understand that the core issue is how to find the source_ids where a component changes (0 means a removal, so we don't want these entries in the result) and how to assign the respective end dates at the same time. For the sake of illustration I simplify your example a bit:
There is only one component_type (I take into account that there might then be consecutive entries with an unchanged value).
There is only one entity_id, so we can ignore it.
It should be easy to extend this simpler version to your real-world problem.
This is an example input:
source_id  effective_date  value
1          2020-01-01      100
2          2020-01-03      100
3          2020-01-05      80
4          2020-01-10      0
5          2020-01-12      30
I would expect the following output to be generated:
source_id  start_date  end_date    value
1          2020-01-01  2020-01-04  100
3          2020-01-05  2020-01-09  80
5          2020-01-12  NULL        30
You can achieve this with one query by joining each row with the previous one to check whether the value has changed (this finds the start dates of periods) and with the first later row that has a different value (this finds the start of the next period). If there is no previous row, the row is considered a start as well. If there is no later change of the value, there is no end_date.
SELECT
main.source_id,
main.effective_date as start_date,
DATE_SUB(next_start.effective_date, INTERVAL 1 DAY) as end_date,
main.value
FROM source main
LEFT JOIN source prev ON prev.effective_date = (
SELECT MAX(effective_date)
FROM source
WHERE effective_date < main.effective_date
)
LEFT JOIN source next_start ON next_start.effective_date = (
SELECT MIN(effective_date)
FROM source
WHERE effective_date > main.effective_date AND value <> main.value
)
WHERE
(ISNULL(prev.source_id) OR prev.value <> main.value)
AND main.value <> 0
ORDER BY main.source_id;
As I said: This will have to be adapted to your problem, e.g. by adding proper join conditions for the entity_id.
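To make the intended result easy to verify, here is a minimal Python sketch of the same period logic for the simplified case (one entity, one component). The function name `periods` and the `SAMPLE` list are hypothetical; this is an illustration of the rule, not a replacement for the query.

```python
from datetime import date, timedelta

# Simplified sample input: (source_id, effective_date, value)
SAMPLE = [
    (1, date(2020, 1, 1), 100),
    (2, date(2020, 1, 3), 100),
    (3, date(2020, 1, 5), 80),
    (4, date(2020, 1, 10), 0),
    (5, date(2020, 1, 12), 30),
]

def periods(rows):
    """Return (source_id, start_date, end_date, value) for each value period.

    A row starts a period if it is the first row or its value differs from
    the previous row. Rows with value 0 (removals) are never reported.
    The period ends the day before the next row with a different value.
    """
    rows = sorted(rows, key=lambda r: r[1])
    out = []
    for i, (source_id, start, value) in enumerate(rows):
        is_start = i == 0 or rows[i - 1][2] != value
        if not is_start or value == 0:
            continue
        # First later row whose value differs marks the start of the next period.
        next_start = next((d for _, d, v in rows[i + 1:] if v != value), None)
        end = next_start - timedelta(days=1) if next_start else None
        out.append((source_id, start, end, value))
    return out

if __name__ == "__main__":
    for period in periods(SAMPLE):
        print(period)
```

On the sample input this yields the three periods listed in the expected output above (source_ids 1, 3 and 5).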
@Luuk pointed out that you don't need the end date because it can be derived from the data. This would be the case if you also had entries for the start of "0 periods", i.e. periods in which no value is set. If you don't have entries for these, you can't derive the end from the start of the next period, since there might be a gap in between.
Here is some sample data:
Key Start Date Stop date Order
1 2010-07-10 11:50:11 2011-10-20 9:10:59 1
1 2013-01-09 13:04:12 2013-03-11 13:42:25 2
1 2014-05-23 14:45:40 2015-10-16 8:53:54 3
1 2013-01-09 13:04:12 9999-12-31 0:00:00 4
2 2015-12-15 11:16:06 2016-12-15 11:16:06 1
2 2016-12-15 11:16:06 2017-12-15 11:16:06 2
2 2017-12-15 11:16:06 9999-12-31 0:00:00 3
I want to check that the start and stop dates of an order do not conflict with another order for the same key. Only one order is allowed within any start/stop date range.
I want to write a MySQL query that prints every key that has an invalid order.
In this example, key 1 is invalid because orders 2 and 4 conflict. Is it possible to check this with a MySQL query?
I think you can get the rows that are out of order using:
select s.*
from sample s
where exists (select 1
from sample s2
where s2.`key` = s.`key` and
s2.startdate < s.startdate and
s2.`order` > s.`order`
);
Note that order and key are reserved words in MySQL, so they are very poor choices for column names; they have to be quoted with backticks, as in the query above.
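A different way to read the requirement is as an overlap test: two orders for the same key conflict when each one starts before the other stops. The Python sketch below illustrates that reading on the sample data. It is a swapped-in technique, not the query above; the function name `keys_with_overlap` is hypothetical, and it assumes that an order starting exactly when another stops does not count as a conflict.

```python
from datetime import datetime
from itertools import combinations

# Sample rows from the question: (key, start, stop, order)
SAMPLE = [
    (1, datetime(2010, 7, 10, 11, 50, 11), datetime(2011, 10, 20, 9, 10, 59), 1),
    (1, datetime(2013, 1, 9, 13, 4, 12), datetime(2013, 3, 11, 13, 42, 25), 2),
    (1, datetime(2014, 5, 23, 14, 45, 40), datetime(2015, 10, 16, 8, 53, 54), 3),
    (1, datetime(2013, 1, 9, 13, 4, 12), datetime(9999, 12, 31, 0, 0, 0), 4),
    (2, datetime(2015, 12, 15, 11, 16, 6), datetime(2016, 12, 15, 11, 16, 6), 1),
    (2, datetime(2016, 12, 15, 11, 16, 6), datetime(2017, 12, 15, 11, 16, 6), 2),
    (2, datetime(2017, 12, 15, 11, 16, 6), datetime(9999, 12, 31, 0, 0, 0), 3),
]

def keys_with_overlap(rows):
    """Return {key: [(order_a, order_b), ...]} for every overlapping pair."""
    by_key = {}
    for key, start, stop, order in rows:
        by_key.setdefault(key, []).append((start, stop, order))
    conflicts = {}
    for key, spans in by_key.items():
        for (s1, e1, o1), (s2, e2, o2) in combinations(spans, 2):
            # Strict comparison: touching end points are not a conflict.
            if s1 < e2 and s2 < e1:
                conflicts.setdefault(key, []).append((o1, o2))
    return conflicts

if __name__ == "__main__":
    print(keys_with_overlap(SAMPLE))
```

Under these assumptions only key 1 is reported, with order 4 overlapping orders 2 and 3.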
I want to write a query that fetches the users who registered a whole number of weeks before today.
For example, if today's date is 2017-08-17, then I need the users who registered on 2017-08-10, 2017-08-03, 2017-07-27 and so on. Likewise, if today's date is 2017-08-20, I need the users who registered on 2017-08-13, 2017-08-06 and so on.
id name date
1 ABC 2018-08-16
2 PQR 2018-08-10
3 LMN 2018-07-27
4 AAA 2018-01-01
Output will be
id name date
2 PQR 2018-08-10
3 LMN 2018-07-27
One way to express this problem is to recognize that we want to retain dates whose difference from today is a multiple of 7 days. We can take the UNIX timestamps of today and of each record's registration date and check whether the difference in seconds, divided by the number of seconds in 7 days, leaves a remainder of zero.
SELECT *
FROM yourTable
WHERE
MOD(UNIX_TIMESTAMP(CURDATE()) -
UNIX_TIMESTAMP(DATE(reg_date)), 7*24*60*60) = 0
SELECT * FROM user WHERE WEEKDAY(`date`) = WEEKDAY(NOW());
This will get you all users that registered 0, 7, 14, 21 etc. days ago.
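Both answers rely on the same fact: a date falls on the same weekday as today exactly when the number of days between them is a multiple of 7. A minimal Python sketch of that check, using the dates from the question's prose; the function name `registered_whole_weeks_ago` is hypothetical, and excluding future dates is an added assumption.

```python
from datetime import date

def registered_whole_weeks_ago(reg_date, today):
    """True if reg_date is 0, 7, 14, ... days before today."""
    days = (today - reg_date).days
    return days >= 0 and days % 7 == 0

if __name__ == "__main__":
    today = date(2017, 8, 17)
    for d in (date(2017, 8, 10), date(2017, 8, 3), date(2017, 7, 27), date(2017, 8, 9)):
        # The weekday comparison agrees with the modulo test for past dates.
        same_weekday = d.weekday() == today.weekday()
        print(d, registered_whole_weeks_ago(d, today), same_weekday)
```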
here's my example table (room reservation system):
id available room_id
----------------------------
1 2014-02-05 4
2 2014-02-06 4
3 2014-02-07 4
4 2014-02-09 4
5 2014-02-10 4
I want to query whether the room with id 4 is available between 2014-02-05 and 2014-02-10.
I know I can query by using the BETWEEN operator, but the problem is that I need to consider continuous date ranges, so it should return zero records because the record for 2014-02-08 is missing.
any ideas?
thanks
Here is an idea. Count the number of rows that match and then compare these to the number of days in the period:
select room_id
from example
where available between date('2014-02-05') and date('2014-02-10')
group by room_id
having count(*) = datediff(date('2014-02-10'), date('2014-02-05')) + 1;
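The counting idea can be illustrated with a small Python sketch: a room is continuously available when every day of the requested range appears among its available dates. The function name `is_continuously_available` is hypothetical, and the list below is the room 4 data from the question.

```python
from datetime import date, timedelta

# Available dates for room 4 from the question (2014-02-08 is missing)
ROOM_4 = [
    date(2014, 2, 5),
    date(2014, 2, 6),
    date(2014, 2, 7),
    date(2014, 2, 9),
    date(2014, 2, 10),
]

def is_continuously_available(available, start, end):
    """True if every day from start to end (inclusive) is in available."""
    wanted = {start + timedelta(days=i) for i in range((end - start).days + 1)}
    return wanted <= set(available)

if __name__ == "__main__":
    print(is_continuously_available(ROOM_4, date(2014, 2, 5), date(2014, 2, 10)))
    print(is_continuously_available(ROOM_4, date(2014, 2, 5), date(2014, 2, 7)))
```

The first call reports that the full range is not available because of the gap, while the shorter range from 2014-02-05 to 2014-02-07 is.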
I have a dataset (df1)
ID DATE
1 10-April-2013
2 11-April-2013
3 12-April-2013
1 12-April-2013
2 13-April-2013
4 16-April-2013
I need to get 1 row/ID reporting the earliest DATE
ID DATE
1 10-April-2013
2 11-April-2013
3 12-April-2013
4 16-April-2013
undf1 <- unique(df1[ ,c("ID","DATE")]) is not working since DATE is unique as well
I'd really appreciate any input here...
SELECT ID, MIN(DATE) AS DATE FROM df1 GROUP BY ID
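Whatever tool is used, the dates have to be compared as dates rather than as text, because strings such as "10-April-2013" do not sort chronologically in general. A minimal Python sketch of "earliest date per ID" under that assumption; the function name `earliest_per_id` is hypothetical, and parsing with `%B` assumes English month names in the default locale.

```python
from datetime import datetime

# Rows from the question: (ID, DATE as text)
DF1 = [
    (1, "10-April-2013"),
    (2, "11-April-2013"),
    (3, "12-April-2013"),
    (1, "12-April-2013"),
    (2, "13-April-2013"),
    (4, "16-April-2013"),
]

def earliest_per_id(rows):
    """Return {id: earliest date} after parsing the text dates."""
    earliest = {}
    for row_id, text in rows:
        d = datetime.strptime(text, "%d-%B-%Y").date()
        if row_id not in earliest or d < earliest[row_id]:
            earliest[row_id] = d
    return earliest

if __name__ == "__main__":
    for row_id, d in sorted(earliest_per_id(DF1).items()):
        print(row_id, d.isoformat())
```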