self join providing wrong answers - mysql

Hypothetical data - tbl1:
orderID  SupplierID  Status  Reason       Created At
29                   1                    22-01-2021 22:08
29                   2                    22-01-2021 22:10
29       265         3                    23-01-2021 06:25
29                   2       sometext     23-01-2021 12:25
29       1605        3                    24-01-2021 10:21
29       1605        4       anothertext  24-01-2021 11:03
29       324         3                    26-01-2021 06:43
29                   2       sometext     26-01-2021 12:43
29       1564        3                    26-01-2021 16:09
Desired result:
orderID  SupplierID  Status  Reason  Created At
29       265         3               23-01-2021 06:25
29       324         3               26-01-2021 06:43
My query -
select distinct tbl1.orderID, tbl1.created_at, tbl2.supplierID
from tblxyz as tbl1 left join tblxyz as tbl2
on tbl1.orderID = tbl2.orderID
where tbl1.status=2 and tbl1.reason='sometext' and tbl2.status=3 and tbl1.created_at < (tbl2.created_at + INTERVAL 1 DAY)
group by tbl2.supplierID
I am unable to figure out what is wrong with my query.

You can use the LAG window function to fetch the previous row's status and reason, then filter on those values.
Schema (MySQL v8.0)
CREATE TABLE tblxyz(
orderID int,
SupplierID INT,
Status INT,
Reason VARCHAR(50),
CreatedAt DATETIME
);
INSERT INTO tblxyz VALUES (29,NULL, 1,'','2021-01-22 22:08');
INSERT INTO tblxyz VALUES (29,NULL, 2,'','2021-01-22 22:10');
INSERT INTO tblxyz VALUES (29,265 , 3,'','2021-01-23 06:25');
INSERT INTO tblxyz VALUES (29,NULL, 2,'sometext','2021-01-23 12:25');
INSERT INTO tblxyz VALUES (29,1605, 3,'','2021-01-24 10:21');
INSERT INTO tblxyz VALUES (29,1605, 4,'anothertext','2021-01-24 11:03');
INSERT INTO tblxyz VALUES (29,324 , 3,'','2021-01-26 06:43');
INSERT INTO tblxyz VALUES (29,NULL, 2,'sometext','2021-01-26 12:43');
INSERT INTO tblxyz VALUES (29,1564, 3,'','2021-01-26 16:09');
Query #1
SELECT t1.orderID, t1.SupplierID, t1.Status, t1.Reason, t1.PreviewCreatedAt
FROM (
    SELECT *,
        LAG(Status) OVER (PARTITION BY orderID ORDER BY CreatedAt) PreviewStatus,
        LAG(Reason) OVER (PARTITION BY orderID ORDER BY CreatedAt) PreviewReason,
        LAG(CreatedAt) OVER (PARTITION BY orderID ORDER BY CreatedAt) PreviewCreatedAt
    FROM tblxyz
) t1
WHERE PreviewStatus = 2 AND Status = 3 AND PreviewReason = 'sometext';
orderID  SupplierID  Status  Reason  PreviewCreatedAt
29       1605        3               2021-01-23 12:25:00
29       1564        3               2021-01-26 12:43:00
View on DB Fiddle
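Note that the LAG query returns the status-3 rows that come right after a 'sometext' row (suppliers 1605 and 1564), while the desired result in the question lists the status-3 rows that come right before one (265 and 324). A minimal sketch of that variant, assuming the tblxyz schema from this answer and MySQL 8.0 or later, swaps LAG for LEAD. The aliases NextStatus and NextReason are made up for illustration, and the sketch only looks at the immediately following row; it does not enforce the one-day window from the question's query.

```sql
-- Sketch only: assumes the tblxyz table defined above and MySQL 8.0+.
-- NextStatus / NextReason are hypothetical alias names.
SELECT t1.orderID, t1.SupplierID, t1.Status, t1.Reason, t1.CreatedAt
FROM (
    SELECT *,
        -- Status and reason of the row that follows within the same order
        LEAD(Status) OVER (PARTITION BY orderID ORDER BY CreatedAt) AS NextStatus,
        LEAD(Reason) OVER (PARTITION BY orderID ORDER BY CreatedAt) AS NextReason
    FROM tblxyz
) t1
-- Keep status-3 rows whose next row is status 2 with reason 'sometext'
WHERE t1.Status = 3
  AND t1.NextStatus = 2
  AND t1.NextReason = 'sometext'
ORDER BY t1.CreatedAt;
```

With the sample rows above, this should keep the rows for suppliers 265 and 324.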

Is this what you need?
SELECT t2.*
FROM tbl1 t1
JOIN tbl1 t2 USING (orderID)
WHERE t1.Status = 2
AND t2.Status = 3
AND t1.Reason = 'sometext'
AND t2.Created_At BETWEEN t1.Created_At - INTERVAL 1 DAY AND t1.Created_At
ORDER BY t1.Created_At;
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=543d9150d1b23a01df01e0223f3fb3f2

Related

Select only unique ids row of table with different values in Descending order

I'm trying to get, for a given medicine_id, the last inserted row for each unique insurance_id. I used GROUP BY and ORDER BY, but got arbitrary rows instead of the last inserted ones.
I tried this query, but it does not return the last inserted rows:
SELECT
`m1`.`*`
FROM
(
`pricings` `m1`
LEFT JOIN `pricings` `m2` ON
(
(
(
`m1`.`medicine_id` = `m2`.`medicine_id`
)
)
)
)
WHERE m1.medicine_id = 2
group BY m1.insurance_id DESC
ORDER BY m1.created_at;
This is the full table:
id    medicine_id  insurance_id  created_at
4311  2            1             2021-04-12 16:05:07
4766  2            1             2022-01-15 11:56:06
4767  2            38            2021-05-12 08:17:11
7177  2            38            2022-03-30 10:14:11
4313  2            39            2021-04-12 16:05:46
4768  2            39            2021-05-12 08:17:30
1356  2            40            2020-11-02 11:25:43
3764  2            40            2021-03-08 15:42:16
4769  2            40            2021-05-12 08:17:44
And I want a result like this:
id    medicine_id  insurance_id  created_at
4766  2            1             2022-01-15 11:56:06
4768  2            39            2021-05-12 08:17:30
4769  2            40            2021-05-12 08:17:44
7177  2            38            2022-03-30 10:14:11
MySQL 5.x: Use a sub-query to find the max created_at value per group, then join that on the source table to identify the row it was from.
SELECT
    p.*
FROM
    `pricings` p
INNER JOIN
(
    -- Latest created_at per (medicine_id, insurance_id) group
    SELECT
        `medicine_id`,
        `insurance_id`,
        MAX(created_at) AS `created_at`
    FROM
        `pricings`
    GROUP BY
        `medicine_id`,
        `insurance_id`
)
    p_max
        ON  p.`medicine_id`  = p_max.`medicine_id`
        AND p.`insurance_id` = p_max.`insurance_id`
        AND p.`created_at`   = p_max.`created_at`
WHERE
    p.`medicine_id` = 2
ORDER BY
    p.`created_at`;
MySQL 8: Use ROW_NUMBER() to enumerate each group, then pick the first row from each group.
SELECT
    p.*
FROM
(
    SELECT
        *,
        -- Number the rows of each group, newest first
        ROW_NUMBER() OVER (
            PARTITION BY `medicine_id`,
                         `insurance_id`
            ORDER BY `created_at` DESC
        )
            AS `row_id`
    FROM
        `pricings`
)
    p
WHERE
    p.`medicine_id` = 2
    AND p.`row_id` = 1
ORDER BY
    p.`created_at`;
Adding this as an answer as well. I have not tested it; adjust the formatting to work with whatever database version you are using, and let me know the results.
SELECT m1.id , m1.Insurance_id , m1.medicine_id , max(m1.created_at)
FROM (
`pricings` `m1` LEFT JOIN `pricings` `m2` ON `m1`.`medicine_id` = `m2`.`medicine_id`
)
WHERE m1.medicine_id = 2 and m1.insurance_id in (1,39,40,38)
GROUP BY m1.insurance_id DESC
ORDER BY m1.created_at;
Edit: I also removed the six extra parentheses; I don't see how they could be of any use.

How can I count distinct months names in a set of date values?

I have a table 'Data' with two fields, Data_date1 and Data_date2, and I want to count the values in each field by month.
This is my table:
Table: Data
Data_date1 Data_date2
---------------------------------
2019-07-23 2019-01-23
2019-08-23 2019-01-24
2019-08-24 2019-02-23
2019-09-21 2019-07-23
2019-09-22 2019-09-22
2019-09-23 2019-09-23
and I want results like this:
Month Count_Date1 Count_Date2
Jan 0 2
Feb 0 1
July 1 1
Aug 2 0
Sep 3 9
Try this:
SELECT MONTH(data_date) m
     , SUM(d=1) d1
     , SUM(d=2) d2
FROM
   ( SELECT 1 d, data_date1 data_date FROM my_table
     UNION ALL  -- UNION ALL keeps repeated dates; plain UNION would collapse them and undercount
     SELECT 2, data_date2 FROM my_table
   ) x
GROUP BY m;
Here’s some setup with which to test this query, which produces the desired results:
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(data_date1 DATE NOT NULL
,data_date2 DATE NOT NULL
);
INSERT INTO my_table VALUES
('2019-07-23','2019-01-23'),
('2019-08-23','2019-01-24'),
('2019-08-24','2019-02-23'),
('2019-09-21','2019-07-23'),
('2019-09-22','2019-09-22'),
('2019-09-23','2019-09-23');
You can use union all and group by:
select month(dte), sum(cnt1), sum(cnt2)
from ((select data_date1 as dte, 1 as cnt1, 0 as cnt2
from t
) union all
(select data_date2, 0, 1
from t
)
) dd
group by month(dte);
This shows the month number rather than the month name.
If you want the month name, you would do:
select monthname(dte), sum(cnt1), sum(cnt2)
from ((select data_date1 as dte, 1 as cnt1, 0 as cnt2
from t
) union all
(select data_date2, 0, 1
from t
)
) dd
group by monthname(dte), month(dte)
order by month(dte);

MySQL - Date Difference and Flags

I am very new to MySQL and currently working on a table with three columns: trx_id, user_id, last_activity. (Churn Analysis)
tbl_activity:
The table captures user activity. I am having difficulty performing two tasks.
1) I would like to see two new columns through SQL query
date difference between subsequent transactions.
flag based on condition > 30 days.
Desired table:
2) One of the objectives of this study is to identify when (date) a customer churned. Ideally in my case it would be the 31st day since last activity. Any way to arrive at this date?
I am new to SQL and finding it difficult to write queries for the above tasks.
Try this:
For SQL Server:
CREATE TABLE #tbl_activity(Trx_ID INT, User_Id INT, Last_Activity DATETIME)
INSERT INTO #tbl_activity VALUES(1,1100,'2015-06-08')
INSERT INTO #tbl_activity VALUES(2,1100,'2015-06-10')
INSERT INTO #tbl_activity VALUES(3,1100,'2015-06-10')
INSERT INTO #tbl_activity VALUES(4,1100,'2015-06-12')
INSERT INTO #tbl_activity VALUES(5,1100,'2015-06-13')
INSERT INTO #tbl_activity VALUES(6,1100,'2015-06-14')
INSERT INTO #tbl_activity VALUES(7,1100,'2015-09-25')
SELECT T1.Trx_ID, T1.User_Id, T1.Last_Activity
,DATEDIFF(DAY, T1.Last_Activity, T2.Last_Activity) days_Diff
,CASE WHEN DATEDIFF(DAY, T1.Last_Activity, T2.Last_Activity) >30 THEN 1 ELSE 0 END Flag
FROM #tbl_activity T1
LEFT JOIN #tbl_activity T2 ON T1.Trx_ID = T2.Trx_ID-1
DROP TABLE #tbl_activity
For MySQL:
CREATE TABLE tbl_activity(Trx_ID INT, User_Id INT, Last_Activity DATETIME);
INSERT INTO tbl_activity VALUES(1,1100,'2015-06-08');
INSERT INTO tbl_activity VALUES(2,1100,'2015-06-10');
INSERT INTO tbl_activity VALUES(3,1100,'2015-06-10');
INSERT INTO tbl_activity VALUES(4,1100,'2015-06-12');
INSERT INTO tbl_activity VALUES(5,1100,'2015-06-13');
INSERT INTO tbl_activity VALUES(6,1100,'2015-06-14');
INSERT INTO tbl_activity VALUES(7,1100,'2015-09-25');
SELECT T1.Trx_ID, T1.User_Id, T1.Last_Activity
,DATEDIFF(T2.Last_Activity, T1.Last_Activity) days_Diff
,CASE WHEN DATEDIFF(T2.Last_Activity, T1.Last_Activity) >30 THEN 1 ELSE 0 END Flag
FROM tbl_activity T1
LEFT JOIN tbl_activity T2 ON T1.Trx_ID = T2.Trx_ID-1;
DROP TABLE tbl_activity;
Try this in #SQL Fiddle
Output:
Trx_ID User_Id Last_Activity days_Diff Flag
1 1100 2015-06-08 00:00:00.000 2 0
2 1100 2015-06-10 00:00:00.000 0 0
3 1100 2015-06-10 00:00:00.000 2 0
4 1100 2015-06-12 00:00:00.000 1 0
5 1100 2015-06-13 00:00:00.000 1 0
6 1100 2015-06-14 00:00:00.000 103 1
7 1100 2015-09-25 00:00:00.000 NULL 0
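The join on Trx_ID = Trx_ID-1 relies on consecutive IDs and a single user, and the second part of the question (the churn date) is left open. A possible sketch for MySQL 8.0 or later, assuming the tbl_activity table above still exists (that is, run it before the DROP TABLE), uses LEAD per user and derives the churn date as the 31st day after the last activity. The names next_activity and churn_date are made up for illustration.

```sql
-- Sketch only: assumes tbl_activity(Trx_ID, User_Id, Last_Activity) as above, MySQL 8.0+.
-- next_activity and churn_date are hypothetical names.
SELECT
    Trx_ID,
    User_Id,
    Last_Activity,
    DATEDIFF(next_activity, Last_Activity) AS days_Diff,
    CASE WHEN DATEDIFF(next_activity, Last_Activity) > 30 THEN 1 ELSE 0 END AS Flag,
    -- Churn date: the 31st day after the activity that preceded a gap of more than 30 days
    CASE WHEN DATEDIFF(next_activity, Last_Activity) > 30
         THEN DATE(Last_Activity + INTERVAL 31 DAY)
    END AS churn_date
FROM (
    SELECT
        *,
        -- Next activity of the same user; Trx_ID breaks ties on equal timestamps
        LEAD(Last_Activity) OVER (
            PARTITION BY User_Id
            ORDER BY Last_Activity, Trx_ID
        ) AS next_activity
    FROM tbl_activity
) t
ORDER BY User_Id, Last_Activity, Trx_ID;
```

For the sample rows, only Trx_ID 6 is flagged, with a churn_date of 2015-07-15. The last row of each user has no next activity, so deciding whether that user has churned would need a reference date such as CURDATE().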

Selecting data interval

I have a table with date and id columns. How can I select the ids at 7-day intervals?
My data is:
date id
2013-07-01 11
2013-07-02 22
2013-07-03 33
2013-07-04 33
2013-07-05 44
2013-07-06 44
2013-07-07 45
2013-07-08 46
2013-07-09 47
2013-07-10 48
2013-07-11 48
2013-07-12 49
2013-07-13 50
2013-07-14 51
2013-07-15 52
2013-07-16 52
2013-07-17 53
2013-07-18 53
2013-07-19 54
What I want is:
date id
2013-07-01 11
2013-07-08 46
2013-07-15 52
Thanks
SELECT date,id FROM table1 GROUP BY WEEK(`date`, 1)
http://sqlfiddle.com/#!2/b128c/1
GROUP BY alone does not do the trick in standard SQL. MySQL accepts it (when ONLY_FULL_GROUP_BY is disabled), but other databases will not allow this:
SELECT date,id FROM table1 GROUP BY WEEK(`date`, 1)
It would result in an error along the lines of: id is not part of GROUP BY.
If you group by col1 and not by col2, the database does not know which value you want for col2.
MySQL seems to assume "if I do not group by the other columns, take the smallest or the first in database order"; strictly speaking, which value it picks is not guaranteed.
If, and only if, you want the first (!) result, or whatever MySQL decides to use for the column you did not group by, you are OK.
explanation:
assume this:
CREATE TABLE table1
(`id` int, `val` int );
INSERT INTO table1
(`id`, `val`)
VALUES
(1,99), -- !!!!
(1,2),
(1,3),
(1,4),
(1,2),
(2,1),
(1,1),
(2,2),
(3,1),
(4,1)
;
Note that there are 6 rows with id = 1, two with id = 2, and the others are unique.
select id, val FROM table1 GROUP BY id
evaluates to:
ID VAL
1 99
2 1
3 1
4 1
This is only probably what you want; if the column is a date, there is a chance (!) that it will be what you want.
To get a well-defined result set (without relying on the database's interpretation) you will have to use:
select id, some_aggregation_function(val) from table1 group by id
where the aggregation function is min, max, or similar.
That is something like:
select id, val FROM table1 a
where (id,val)=(select id, min(val) from table1 b where a.id=b.id)
if you want the minimum.
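Applied to the original question, the same idea can be sketched without relying on MySQL picking a row for you (ONLY_FULL_GROUP_BY is enabled by default since MySQL 5.7.5, so the bare GROUP BY query is rejected there). This assumes the table1(date, id) table from the question with one row per date; the alias first_date is made up for illustration.

```sql
-- Sketch only: assumes table1(`date`, id) with one row per date.
-- first_date is a hypothetical alias.
SELECT t.`date`, t.id
FROM table1 AS t
JOIN (
    -- Earliest date in each Monday-based week (mode 1); YEARWEEK keeps years apart
    SELECT MIN(`date`) AS first_date
    FROM table1
    GROUP BY YEARWEEK(`date`, 1)
) AS w
    ON t.`date` = w.first_date
ORDER BY t.`date`;
```

Grouping by YEARWEEK rather than WEEK avoids merging the same week number from different years. Since 2013-07-01 is a Monday, the sample data should give the rows for 2013-07-01, 2013-07-08 and 2013-07-15.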

Select last Order state from Orders

I have tables:
orders:
id_order id_customer
1 1
2 2
3 1
orders_history
id_history id_order id_order_state date_add
1 1 1 2010-01-01 00:00:00
2 1 2 2010-01-02 00:00:00
3 1 3 2010-01-03 00:00:00
4 2 2 2010-05-01 00:00:00
5 2 3 2011-05-02 00:00:00
6 3 1 2011-05-03 00:00:00
7 3 2 2011-06-01 00:00:00
order_state
id_order_state name
1 New
2 Sent
3 Rejected
4 ...
How can I get all id_order values where the last id_order_state of that order (by last I mean the one with MAX(id_history) or MAX(date_add)) is not equal to 1 or 3?
select oh.id_history, oh.id_order, oh.id_order_state, oh.date_add
from (
select id_order, max(date_add) as MaxDate
from orders_history
where id_order_state not in (1, 3)
group by id_order
) ohm
inner join orders_history oh on ohm.id_order = oh.id_order
and ohm.MaxDate = oh.date_add
I think what he's after is which orders are complete, i.e. their final status, not the rows that merely exclude states 1 and 3. The first pre-query should find the max ID regardless of the status code:
select
    orders.*
from
    ( select oh.id_order,
             max( oh.id_history ) LastID_HistoryPerOrder
        from
            orders_history oh
        group by
            oh.id_order ) PreQuery
    join orders_history oh2
        on PreQuery.id_order = oh2.id_order
        AND PreQuery.LastID_HistoryPerOrder = oh2.id_history
        AND NOT oh2.id_order_state IN (1, 3)  -- this eliminates 1's and 3's from the result set
    join orders  -- now anything left after the above is joined to orders
        on PreQuery.id_order = orders.id_order
Just to re-show YOUR data... I've marked the last SEQUENCE (ID_History) per ORDER... This is what the PREQUERY is going to return...
id_history id_order id_order_state date_add
1 1 1 2010-01-01 00:00:00
2 1 2 2010-01-02 00:00:00
**3 1 3 2010-01-03 00:00:00
4 2 2 2010-05-01 00:00:00
**5 2 3 2011-05-02 00:00:00
6 3 1 2011-05-03 00:00:00
**7 3 2 2011-06-01 00:00:00
The "PreQuery" will result with the following subset
ID_Order LastID_HistoryPerOrder (ID_History)
1 3 (state=3) THIS ONE WILL BE SKIPPED IN FINAL RESULT
2 5 (state=3) THIS ONE WILL BE SKIPPED IN FINAL RESULT
3 7 (state=2)
Now, the result of this is then re-joined back to order history on just these two elements... yet adds the criteria to EXCLUDE the 1,3 entries for "order state".
In this case,
1 would be rejected as its state = 3 (sequence #3),
2 would be rejected since its last history is state = 3 (sequence #5).
3 would be INCLUDED since its state = 2 (sequence #7)
Finally, joining what remains to the orders table leaves ONE ID, which matches up with the orders table on id_order alone and gives the desired result.
Another possible solution:
SELECT DISTINCT
    OH1.id_order
FROM
    orders_history OH1
LEFT OUTER JOIN orders_history OH2 ON
    OH2.id_order = OH1.id_order AND
    OH2.id_order_state IN (1, 3) AND
    OH2.date_add >= OH1.date_add
WHERE
    OH2.id_order IS NULL
I'm posting this as an answer to my own question because I need to show the results of your queries.
Unfortunately, not all of your answers work. Let's prepare a test environment:
CREATE TABLE `order_history` (
`id_order_history` int(11) NOT NULL AUTO_INCREMENT,
`id_order` int(11) NOT NULL,
`id_order_state` int(11) NOT NULL,
`date_add` datetime NOT NULL,
PRIMARY KEY (`id_order_history`)
) ENGINE=MyISAM AUTO_INCREMENT=11 DEFAULT CHARSET=latin2;
CREATE TABLE `orders` (
`id_order` int(11) NOT NULL AUTO_INCREMENT,
`id_customer` int(11) DEFAULT NULL,
PRIMARY KEY (`id_order`)
) ENGINE=MyISAM AUTO_INCREMENT=8 DEFAULT CHARSET=latin2;
INSERT INTO `order_history`
(`id_order_history`, `id_order`, `id_order_state`, `date_add`) VALUES
(1,1,1,'2011-01-01 00:00:00'),
(2,1,2,'2011-01-01 00:10:00'),
(3,1,3,'2011-01-01 00:20:00'),
(4,2,1,'2011-02-01 00:00:00'),
(5,2,2,'2011-02-01 00:25:01'),
(6,2,3,'2011-02-01 00:25:59'),
(7,3,1,'2011-03-01 00:00:01'),
(8,3,2,'2011-03-01 00:00:02'),
(9,3,3,'2011-03-01 00:01:00'),
(10,3,2,'2011-03-02 00:00:01');
COMMIT;
INSERT INTO `orders` (`id_order`, `id_customer`) VALUES
(1,1),
(2,2),
(3,3),
(4,4),
(5,5),
(6,6),
(7,7);
COMMIT;
Now, let's select the last (max) state date for each order with a simple query:
select id_order, max(date_add) as MaxDate
from `order_history`
group by id_order
This gives us the CORRECT results, no rocket science so far:
id_order MaxDate
---------+-------------------
1 2011-01-01 00:20:00 //last order_state=3
2 2011-02-01 00:25:59 //last order_state=3
3 2011-03-02 00:00:01 //last order_state=2
Now, for simplicity, let's change our queries to get the orders whose last state is not equal to 3.
We expect a one-row result with id_order = 3.
So let's test the queries:
QUERY 1 made by RedFilter:
select oh.id_order, oh.id_order_state, oh.date_add
from (
select id_order, max(date_add) as MaxDate
from `order_history`
where id_order_state not in (3)
group by id_order
) ohm
inner join `order_history` oh on ohm.id_order = oh.id_order
and ohm.MaxDate = oh.date_add
Result:
id_order id_order_state date_add
-------------------------------------------------
1 2 2011-01-01 00:10:00
2 2 2011-02-01 00:25:01
3 2 2011-03-02 00:00:01
So this result is not correct.
QUERY 2 made by Tom H.:
SELECT DISTINCT OH1.id_order
FROM order_history OH1
LEFT OUTER JOIN order_history OH2 ON
OH2.id_order = OH1.id_order AND
OH2.id_order_state NOT IN (3) AND
OH2.`id_order_history` >= OH1.`id_order_history`
WHERE
OH2.id_order IS NULL
Result:
id_order
--------
1
2
So this result is not correct.
Any suggestions appreciated.
EDIT
Thanks to Andriy M.'s comment we have a proper solution. It is a modification of Tom H.'s query and should look as follows:
SELECT DISTINCT
OH1.id_order
FROM
order_history OH1
LEFT OUTER JOIN order_history OH2 ON
OH2.id_order = OH1.id_order
AND OH2.date_add > OH1.date_add
WHERE OH1.id_order_state NOT IN (3) AND OH2.id_order IS NULL
EDIT 2:
QUERY 3 made by DRapp:
select
distinct orders.`id_order`
from
( select oh.id_order,
max( oh.id_order_history ) LastID_HistoryPerOrder
from
order_history oh
group by
oh.id_order ) PreQuery
join order_history oh2
on PreQuery.id_order = oh2.id_order
AND PreQuery.LastID_HistoryPerOrder = oh2.id_order_history
AND NOT oh2.id_order_state IN (1,3)
join orders
on PreQuery.id_order = orders.id_order
Result:
id_order
--------
3
So this one is finally correct.
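On MySQL 8.0 or later, the same "latest row per order, then filter on its state" logic can also be sketched with ROW_NUMBER(). This assumes the order_history test table created above; the alias rn and the derived-table name latest are made up for illustration.

```sql
-- Sketch only: assumes the order_history test table above and MySQL 8.0+.
-- rn and latest are hypothetical names.
SELECT id_order
FROM (
    SELECT
        id_order,
        id_order_state,
        -- rn = 1 marks the most recent history row of each order;
        -- id_order_history breaks ties on identical timestamps
        ROW_NUMBER() OVER (
            PARTITION BY id_order
            ORDER BY date_add DESC, id_order_history DESC
        ) AS rn
    FROM order_history
) latest
WHERE rn = 1
  AND id_order_state NOT IN (1, 3);
```

With the test data, the latest states are 3, 3 and 2 for orders 1, 2 and 3, so only id_order 3 should come back, in line with the results of the working queries.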