Update first occurrence of value in a time interval - mysql

I'm trying to set the value of another column on the first occurrence of each value in a username column within monthly intervals, but only for rows where a third column has a specific value.
create table table1
(
username varchar(30) not null,
`date` date not null,
eventid int not null,
firstFlag int null
);
insert table1 (username,`date`, eventid) values
('john','2015-01-01', 1)
, ('kim','2015-01-01', 1)
, ('john','2015-01-01', 1)
, ('john','2015-01-01', 1)
, ('john','2015-03-01', 2)
, ('john','2015-03-01', 1)
, ('kim','2015-01-01', 1)
, ('kim','2015-02-01', 1);
This should result in:
| username | date | eventid | firstFlag |
|----------|------------|---------|-----------|
| john | 2015-01-01 | 1 | 1 |
| kim | 2015-01-01 | 1 | 1 |
| john | 2015-01-01 | 1 | (null) |
| john | 2015-01-01 | 1 | (null) |
| john | 2015-03-01 | 2 | 1 |
| john | 2015-03-01 | 1 | (null) |
| kim | 2015-01-01 | 1 | (null) |
| kim | 2015-02-01 | 1 | 1 |
I've tried using joins as described here, but it updates all rows:
update table1 t1
inner join
( select username,min(`date`) as minForGroup
from table1
group by username,`date`
) inr
on inr.username=t1.username and inr.minForGroup=t1.`date`
set firstFlag=1;

As a1ex07 points out, it needs an additional per-row unique constraint to update only the rows I need:
update table1 t1
inner join
( select id, username,min(`date`) as minForGroup
from table1
where eventid = 1
group by username,month(`date`)
) inr
on inr.id=t1.id and inr.username=t1.username and inr.minForGroup=t1.`date`
set firstFlag=1;
Add an id column and use it in the join's ON conditions.
To restrict the update to rows that satisfy a specific condition on another column, the where clause has to go inside the subquery. Otherwise the subquery could return a row with eventid=2 as the first one in its group while the update only considers rows with eventid=1, so the two sides would not match.
To use yearly intervals instead of monthly, change the group by statement to use years.
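For anyone who wants to try the idea outside MySQL, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. It assumes an added auto-increment id column as described above, keeps the eventid = 1 filter from the answer, and groups by year and month together so the same month in different years is not merged. The function name flag_first_per_month is hypothetical, and the NOT EXISTS form is swapped in for the UPDATE ... JOIN because SQLite handles it more portably.

```python
import sqlite3

# Sample rows from the question: (username, date, eventid)
SAMPLE_ROWS = [
    ("john", "2015-01-01", 1),
    ("kim", "2015-01-01", 1),
    ("john", "2015-01-01", 1),
    ("john", "2015-01-01", 1),
    ("john", "2015-03-01", 2),
    ("john", "2015-03-01", 1),
    ("kim", "2015-01-01", 1),
    ("kim", "2015-02-01", 1),
]


def flag_first_per_month(rows, eventid=1):
    """Flag the first row per (username, year-month) among rows with the given eventid."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table table1 ("
        " id integer primary key,"  # assumed per-row unique key
        " username text not null,"
        " date text not null,"
        " eventid int not null,"
        " firstFlag int null)"
    )
    conn.executemany(
        "insert into table1 (username, date, eventid) values (?, ?, ?)", rows
    )
    # A row is "first" when no other matching row in the same user/month
    # has an earlier date, or the same date and a smaller id.
    conn.execute(
        """
        update table1
        set firstFlag = 1
        where eventid = :e
          and not exists (
              select 1
              from table1 t2
              where t2.eventid = :e
                and t2.username = table1.username
                and strftime('%Y-%m', t2.date) = strftime('%Y-%m', table1.date)
                and (t2.date < table1.date
                     or (t2.date = table1.date and t2.id < table1.id))
          )
        """,
        {"e": eventid},
    )
    return conn.execute(
        "select id, username, date, eventid, firstFlag from table1 order by id"
    ).fetchall()


if __name__ == "__main__":
    for row in flag_first_per_month(SAMPLE_ROWS):
        print(row)
```

Because of the eventid filter, john's 2015-03-01 row with eventid 2 is left unflagged here and the eventid 1 row on the same date is flagged instead.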

Related

A join or function in Mysql to include all unique items of one column and all unique items of second column

I need to combine the unique values of one column and the unique values of another column as one new column. As shown in Table example 2
Union to append all DISTINCT return dates
drop table if exists t;
create table t
(cid int, bid int,cdate varchar(6),rdate varchar(6));
insert into t values
(103,23,'15-mar','26-jan'),
(103,23,'14-apr','26-jan'),
(103,23,'18-may','26-jan');
select cid,bid,cdate,rdate,cdate as newdate from t
union all
(select distinct cid,bid,null,null,rdate from t)
;
+------+------+--------+--------+---------+
| cid | bid | cdate | rdate | newdate |
+------+------+--------+--------+---------+
| 103 | 23 | 15-mar | 26-jan | 15-mar |
| 103 | 23 | 14-apr | 26-jan | 14-apr |
| 103 | 23 | 18-may | 26-jan | 18-may |
| 103 | 23 | NULL | NULL | 26-jan |
+------+------+--------+--------+---------+
4 rows in set (0.002 sec)
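The same union can be checked quickly with Python's sqlite3 module. This is only a sketch under the assumption that SQLite is an acceptable stand-in for MySQL here; the function name append_distinct_rdates is made up for illustration, and the parentheses around the second select are dropped because SQLite does not accept them in a compound select.

```python
import sqlite3

# Rows from the example above: (cid, bid, cdate, rdate)
SAMPLE_T = [
    (103, 23, "15-mar", "26-jan"),
    (103, 23, "14-apr", "26-jan"),
    (103, 23, "18-may", "26-jan"),
]


def append_distinct_rdates(rows):
    """Return every row with cdate as newdate, plus one extra row per distinct rdate."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (cid int, bid int, cdate text, rdate text)")
    conn.executemany("insert into t values (?, ?, ?, ?)", rows)
    return conn.execute(
        """
        select cid, bid, cdate, rdate, cdate as newdate from t
        union all
        select distinct cid, bid, null, null, rdate from t
        """
    ).fetchall()


if __name__ == "__main__":
    for row in append_distinct_rdates(SAMPLE_T):
        print(row)
```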
select column1,column2,count(column1) as c1 ,count(column2) as c2 from MyTable having c1 =1 OR c2 =1

Cross join or subquery when one table is empty

I have two tables, 'property' and 'bookings'. I want to find the available properties for a given city, checkin and checkout, including when the bookings table is empty.
When the bookings table is empty, for city = 'bali' and checkin = '2020-07-20' and checkout = '2020-07-30', the expected output is property ids 1 and 2.
When the bookings table is not empty, for city = 'bali' and checkin = '2020-07-20' and checkout = '2020-07-30', the expected output is property id 1.
The query should work whether the bookings table is empty or not.
property:
+----+---------+------+
| id | city | type |
+----+---------+------+
| 1 | bali | 1 |
| 2 | bali | 1 |
| 3 | bangkok | 1 |
+----+---------+------+
bookings:
+----+-------------+------------+------------+
| id | property_id | checkin | checkout |
+----+-------------+------------+------------+
| 1 | 1 | 2020-07-18 | 2020-07-19 |
| 2 | 2 | 2020-07-20 | 2020-07-25 |
| 3 | 3 | 2020-07-20 | 2020-07-30 |
+----+-------------+------------+------------+
Which is the best approach, a subquery or a left join? I tried both approaches but was unable to get the expected result.
As @Strawberry suggested, I was able to make it work:
SELECT property.id FROM property
LEFT JOIN bookings
ON bookings.checkin < '2020-08-30'
AND bookings.checkout > '2020-08-20'
AND bookings.property_id = property.id
WHERE city = 'bali'
AND bookings.id IS NULL
Try this:
SELECT *
FROM property
LEFT JOIN bookings ON bookings.property_id = property.id
WHERE city='bali'
AND IF(bookings.id IS NULL, 1, bookings.checkin = '2020-07-18')
AND IF(bookings.id IS NULL, 1, bookings.checkout = '2020-07-19')
If there are no records in the bookings table, the filter is applied on the city only; otherwise the filter is applied on the checkin and checkout as well.
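To check the overlap-based LEFT JOIN against the dates in the question, here is a small sketch using Python's sqlite3 module in place of MySQL. The function name available_properties and the parameter names are assumptions for illustration; the query is the anti-join from the first answer with the question's checkin and checkout plugged in.

```python
import sqlite3

PROPERTIES = [(1, "bali", 1), (2, "bali", 1), (3, "bangkok", 1)]
BOOKINGS = [
    (1, 1, "2020-07-18", "2020-07-19"),
    (2, 2, "2020-07-20", "2020-07-25"),
    (3, 3, "2020-07-20", "2020-07-30"),
]


def available_properties(bookings, city, checkin, checkout):
    """Return ids of properties in `city` with no booking overlapping the stay."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table property (id integer primary key, city text, type int)")
    conn.execute(
        "create table bookings ("
        " id integer primary key, property_id int, checkin text, checkout text)"
    )
    conn.executemany("insert into property values (?, ?, ?)", PROPERTIES)
    conn.executemany("insert into bookings values (?, ?, ?, ?)", bookings)
    rows = conn.execute(
        """
        select p.id
        from property p
        left join bookings b
          on b.property_id = p.id
         and b.checkin < :checkout   -- booking starts before the stay ends
         and b.checkout > :checkin   -- and ends after the stay starts
        where p.city = :city
          and b.id is null           -- keep only properties with no overlap
        order by p.id
        """,
        {"city": city, "checkin": checkin, "checkout": checkout},
    ).fetchall()
    return [r[0] for r in rows]


if __name__ == "__main__":
    print(available_properties(BOOKINGS, "bali", "2020-07-20", "2020-07-30"))
    print(available_properties([], "bali", "2020-07-20", "2020-07-30"))
```

Putting the date conditions in the ON clause rather than the WHERE clause is what makes the empty-table case work: every property still comes back from the left side, and only those with an overlapping booking are filtered out.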

Performant way to self-join and filter by revised rows

I'm trying to select all rows in this table, with the constraint that revised id's are selected instead of the original ones. So, if a row has a revision, that revision is selected instead of that row, if there are multiple revision numbers the highest revision number is preferred.
I think an example table, output, and query will explain this better:
Table:
+----+-------+-------------+-----------------+-------------+
| id | value | original_id | revision_number | is_revision |
+----+-------+-------------+-----------------+-------------+
| 1 | abcd | null | null | 0 |
| 2 | zxcv | null | null | 0 |
| 3 | qwert | null | null | 0 |
| 4 | abd | 1 | 1 | 1 |
| 5 | abcde | 1 | 2 | 1 |
| 6 | zxcvb | 2 | 1 | 1 |
| 7 | poiu | null | null | 0 |
+----+-------+-------------+-----------------+-------------+
Desired Output:
+----+-------+-------------+-----------------+
| id | value | original_id | revision_number |
+----+-------+-------------+-----------------+
| 3 | qwert | null | null |
| 5 | abcde | 1 | 2 |
| 6 | zxcvb | 2 | 1 |
| 7 | poiu | null | null |
+----+-------+-------------+-----------------+
View Called revisions_max:
SELECT
responses.original_id AS original_id,
MAX(responses.revision_number) AS revision
FROM
responses
WHERE
original_id IS NOT NULL
GROUP BY responses.original_id
My Current Query:
SELECT
responses.*
FROM
responses
WHERE
id NOT IN (
SELECT
original_id
FROM
revisions_max
)
AND
is_revision = 0
UNION
SELECT
responses.*
FROM
responses
INNER JOIN revisions_max ON revisions_max.original_id = responses.original_id
AND revisions_max.revision = responses.revision_number
This query works, but takes 0.06 seconds to run on a table of only 2000 rows. This table will quickly start expanding to tens or hundreds of thousands of rows. The query under the union is what takes most of the time.
What can I do to improve this query's performance?
How about using coalesce()?
SELECT COALESCE(y.id, x.id) AS id,
COALESCE(y.value, x.value) AS value,
COALESCE(y.original_id, x.original_id) AS original_id,
COALESCE(y.revision_number, x.revision_number) AS revision_number
FROM responses x
LEFT JOIN (SELECT r1.*
FROM responses r1
INNER JOIN (SELECT responses.original_id AS
original_id,
Max(responses.revision_number) AS
revision
FROM responses
WHERE original_id IS NOT NULL
GROUP BY responses.original_id) rev
ON r1.original_id = rev.original_id
AND r1.revision_number = rev.revision) y
ON x.id = y.original_id
WHERE y.id IS NOT NULL
OR x.original_id IS NULL;
The approach I would take with any other DBMS is to use NOT EXISTS:
SELECT r1.*
FROM Responses AS r1
WHERE NOT EXISTS
( SELECT 1
FROM Responses AS r2
WHERE r2.original_id = COALESCE(r1.original_id, r1.id)
AND r2.revision_number > COALESCE(r1.revision_number, 0)
);
This removes any rows where a higher revision number exists for the same id (or original_id if it is populated). However, in MySQL, LEFT JOIN/IS NULL will perform better than NOT EXISTS [1]. As such I would rewrite the above as:
SELECT r1.*
FROM Responses AS r1
LEFT JOIN Responses AS r2
ON r2.original_id = COALESCE(r1.original_id, r1.id)
AND r2.revision_number > COALESCE(r1.revision_number, 0)
WHERE r2.id IS NULL;
Example on DBFiddle
I realise that you have said that you don't want to use LEFT JOIN and check for nulls, but I don't see that there is a better solution.
[1] At least this was the case historically; I don't actively use MySQL, so I don't keep up to date with developments in the optimiser.
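The LEFT JOIN/IS NULL query can be checked against the sample table with Python's sqlite3 module. This is a sketch that assumes SQLite behaves like MySQL for this query; it says nothing about MySQL performance, and the function name latest_revisions is hypothetical.

```python
import sqlite3

# Rows from the question: (id, value, original_id, revision_number, is_revision)
RESPONSES = [
    (1, "abcd", None, None, 0),
    (2, "zxcv", None, None, 0),
    (3, "qwert", None, None, 0),
    (4, "abd", 1, 1, 1),
    (5, "abcde", 1, 2, 1),
    (6, "zxcvb", 2, 1, 1),
    (7, "poiu", None, None, 0),
]


def latest_revisions(rows):
    """Return each response at its highest revision, or the original if unrevised."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table responses ("
        " id integer primary key, value text,"
        " original_id int, revision_number int, is_revision int)"
    )
    conn.executemany("insert into responses values (?, ?, ?, ?, ?)", rows)
    return conn.execute(
        """
        select r1.id, r1.value, r1.original_id, r1.revision_number
        from responses as r1
        left join responses as r2
          on r2.original_id = coalesce(r1.original_id, r1.id)
         and r2.revision_number > coalesce(r1.revision_number, 0)
        where r2.id is null   -- no newer revision exists for this row
        order by r1.id
        """
    ).fetchall()


if __name__ == "__main__":
    for row in latest_revisions(RESPONSES):
        print(row)
```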

SQL increment date using recursion

I have a table which contains the following structure:
|-----------|------------|-----------|
| Number | Date | Subject |
|-----------|------------|-----------|
| 1 | 2015-01-01 | ABC |
| 2 | 2015-01-01 | ABC |
| 3 | 2015-01-01 | ABC |
| 4 | 2015-01-01 | ABC |
|-----------|------------|-----------|
I need to loop through the table and increment each date by n days.
So, assuming n = 10 I should get this result:
|-----------|------------|-----------|
| Number | Date | Subject |
|-----------|------------|-----------|
| 1 | 2015-01-01 | ABC |
| 2 | 2015-01-11 | ABC |
| 3 | 2015-01-21 | ABC |
| 4 | 2015-01-31 | ABC |
|-----------|------------|-----------|
The problem is a bit more complicated because n is generated by a function which needs the previously generated date.
I am trying to accomplish this with the following CTE, but I get more rows than expected.
WITH myCte(Number, Date, Subject)
AS
(
SELECT * FROM MyTable
UNION ALL
SELECT
Number, dbo.get_next_date(Date)
FROM MyCte
)
SELECT * FROM MyCte
That is because you have no WHERE clause in the recursive part of the CTE. Without one, the recursion only ends when the MAXRECURSION limit (default 100) is reached, at which point the statement fails with an error.
Here is an example to set a limit with a WHERE clause
DECLARE @MyTable TABLE
(
Number int,
Dt Date,
Sub CHAR(3)
)
INSERT INTO @MyTable
VALUES
(1,'2015-01-01','ABC'),
(2,'2015-01-11','ABC'),
(3,'2015-01-21','ABC'),
(4,'2015-01-31','ABC')
;WITH myCte(Number, Date, Subject)
AS
(
SELECT * FROM @MyTable
UNION ALL
UNION ALL
SELECT
Number, DATEADD(day, 10, Date),Subject
FROM MyCte
WHERE Date < GETDATE()
)
SELECT * FROM MyCte
EDIT - If you know the number of rows, then you could just use TOP and get those rows.
SELECT TOP 4 * FROM MyCte
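As a rough illustration of bounding the recursion, here is a sketch that runs a recursive CTE through Python's sqlite3 module rather than SQL Server. SQLite's date() function with a '+N days' modifier stands in for dbo.get_next_date / DATEADD, and the function name increment_dates and its parameters are assumptions. Each new date is derived from the previously generated one, and the WHERE clause in the recursive member stops the recursion after a fixed number of rows.

```python
import sqlite3


def increment_dates(start, count, step_days):
    """Generate `count` rows, each date `step_days` after the previous one."""
    conn = sqlite3.connect(":memory:")
    return conn.execute(
        """
        with recursive seq(number, dt) as (
            select 1, :start                      -- anchor row
            union all
            select number + 1, date(dt, :step)    -- next date from the previous one
            from seq
            where number < :count                 -- termination condition
        )
        select number, dt from seq order by number
        """,
        {"start": start, "step": f"+{step_days} days", "count": count},
    ).fetchall()


if __name__ == "__main__":
    for row in increment_dates("2015-01-01", 4, 10):
        print(row)
```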

Update the next row of the target row in MySQL

Suppose I have a table that tracks if a payment is missed like this:
+----+---------+------------+------------+---------+--------+
| id | loan_id | amount_due | due_at | paid_at | missed |
+----+---------+------------+------------+---------+--------+
| 1 | 1 | 100 | 2013-08-17 | NULL | NULL |
| 5 | 1 | 100 | 2013-09-17 | NULL | NULL |
| 7 | 1 | 100 | 2013-10-17 | NULL | NULL |
+----+---------+------------+------------+---------+--------+
And, for example, I ran a query that checks if a payment is missed like this:
UPDATE loan_payments
SET missed = 1
WHERE DATEDIFF(NOW(), due_at) >= 10
AND paid_at IS NULL
Then suppose that the row with id = 1 gets affected. I want the amount_due of the row with id = 1 to be added to the amount_due of the next row so the table would look like this:
+----+---------+------------+------------+---------+--------+
| id | loan_id | amount_due | due_at | paid_at | missed |
+----+---------+------------+------------+---------+--------+
| 1 | 1 | 100 | 2013-08-17 | NULL | 1 |
| 5 | 1 | 200 | 2013-09-17 | NULL | NULL |
| 7 | 1 | 100 | 2013-10-17 | NULL | NULL |
+----+---------+------------+------------+---------+--------+
Any advice on how to do it?
Thanks
Take a look at this :
SQL Fiddle
MySQL 5.5.32 Schema Setup:
CREATE TABLE loan_payments
(`id` int, `loan_id` int, `amount_due` int,
`due_at` varchar(10), `paid_at` varchar(4), `missed` varchar(4))
;
INSERT INTO loan_payments
(`id`, `loan_id`, `amount_due`, `due_at`, `paid_at`, `missed`)
VALUES
(1, 1, 100, '2013-09-17', NULL, NULL),
(3, 2, 100, '2013-09-17', NULL, NULL),
(5, 1, 100, '2013-10-17', NULL, NULL),
(7, 1, 100, '2013-11-17', NULL, NULL)
;
UPDATE loan_payments AS l
LEFT OUTER JOIN (SELECT loan_id, MIN(ID) AS ID
FROM loan_payments
WHERE DATEDIFF(NOW(), due_at) < 0
GROUP BY loan_id) AS l2 ON l.loan_id = l2.loan_id
LEFT OUTER JOIN loan_payments AS l3 ON l2.id = l3.id
SET l.missed = 1, l3.amount_due = l3.amount_due + l.amount_due
WHERE DATEDIFF(NOW(), l.due_at) >= 10
AND l.paid_at IS NULL
;
Query 1:
SELECT *
FROM loan_payments
Results:
| ID | LOAN_ID | AMOUNT_DUE | DUE_AT | PAID_AT | MISSED |
|----|---------|------------|------------|---------|--------|
| 1 | 1 | 100 | 2013-09-17 | (null) | 1 |
| 3 | 2 | 100 | 2013-09-17 | (null) | 1 |
| 5 | 1 | 200 | 2013-10-17 | (null) | (null) |
| 7 | 1 | 100 | 2013-11-17 | (null) | (null) |
Unfortunately I don't have time at the moment to write out full-blown SQL, but here's the pseudocode I think you need to implement:
select all DISTINCT loan_id from table loan_payments
for each loan_id:
set missed = 1 for all outstanding payments for loan_id (as determined by date)
select the sum of all outstanding payments for loan_id
add this sum to the amount_due for the loan's next due date after today
Refer to this for how to loop using pure MySQL: http://dev.mysql.com/doc/refman/5.7/en/cursors.html
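The pseudocode above could be sketched roughly as follows, using Python's sqlite3 module and an application-side loop instead of MySQL cursors. Everything here is an assumption for illustration: the function names, the 10-day grace period taken from the question's DATEDIFF check, julianday() as a stand-in for DATEDIFF, and the extra "missed is null" filter that keeps a second run from rolling the same payment over twice.

```python
import sqlite3


def make_loan_db(rows):
    """Build an in-memory loan_payments table from (id, loan_id, amount_due, due_at) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table loan_payments ("
        " id integer primary key, loan_id int, amount_due int,"
        " due_at text, paid_at text, missed int)"
    )
    conn.executemany(
        "insert into loan_payments (id, loan_id, amount_due, due_at) values (?, ?, ?, ?)",
        rows,
    )
    return conn


def roll_over_missed(conn, today, grace_days=10):
    """Mark overdue unpaid payments as missed and add their total to the next due payment."""
    overdue_filter = (
        "paid_at is null and missed is null"
        " and julianday(:today) - julianday(due_at) >= :grace"
    )
    params = {"today": today, "grace": grace_days}
    # Sum the outstanding amounts per loan before anything is modified.
    totals = conn.execute(
        f"select loan_id, sum(amount_due) from loan_payments"
        f" where {overdue_filter} group by loan_id",
        params,
    ).fetchall()
    for loan_id, total in totals:
        conn.execute(
            f"update loan_payments set missed = 1"
            f" where loan_id = :loan and {overdue_filter}",
            {**params, "loan": loan_id},
        )
        # Add the missed total to the loan's next payment due after today.
        conn.execute(
            """
            update loan_payments
            set amount_due = amount_due + :total
            where id = (select id from loan_payments
                        where loan_id = :loan and due_at > :today
                        order by due_at limit 1)
            """,
            {"total": total, "loan": loan_id, "today": today},
        )
    conn.commit()


if __name__ == "__main__":
    db = make_loan_db(
        [(1, 1, 100, "2013-08-17"), (5, 1, 100, "2013-09-17"), (7, 1, 100, "2013-10-17")]
    )
    roll_over_missed(db, "2013-08-30")
    for row in db.execute("select id, amount_due, missed from loan_payments order by id"):
        print(row)
```

If a loan has no payment due after today, the second update simply matches nothing, so the missed amount is not carried anywhere; whether that is acceptable depends on how the loans are modelled.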
I fixed my own problem by adding a missed_at field. I put the current timestamp ($now) in a variable before I update the first row to missed = 1 and missed_at = $now then I ran this query to update the next row's amount_due:
UPDATE loan_payments lp1 JOIN loan_payments lp2 ON lp1.due_at > lp2.due_at
SET lp1.amount_due = lp2.amount_due + lp1.amount_due
WHERE lp2.missed_at = $now AND DATEDIFF(lp1.due_at, lp2.due_at) <= DAYOFMONTH(LAST_DAY(lp1.due_at))
I wish I could just add LIMIT 1 to that query, but it turns out that's not possible for an UPDATE query with a JOIN.
So all in all, I used two queries to achieve what I want. It did the trick.
Please advise if you have better solutions.
Thanks!