Find Missing numbers in SQL query - mysql

I need to find the missing numbers between 0 and 16.
My table is like this:
CarId FromCity_Id ToCity_Id Ran_Date RunId
1001 0 2 01-08-2013 1
1001 5 9 02-08-2013 2
1001 11 16 03-08-2013 3
1002 0 11 02-08-2013 4
1002 11 16 08-08-2013 5
I need to find out:
In the past three months from now(), between which cities has the car not run?
For example, in the above records:
Car 1001 has not run between 02-05 and 09-11
Car 1002 has run fully (i.e. between 0-11 and 11-16)
Overall, I need a query that shows the sections in which the car has not run in the past 3 months, along with the last run date.
How can I write such a query? If a stored procedure is needed, please advise.

God help me. This uses a doubly-correlated subquery, a table that might not exist in your system, and too much caffeine. But hey, it works.
Right, here goes.
SELECT CarId, GROUP_CONCAT(DISTINCT missing) missing
FROM MyTable r,
(SELECT @a := @a + 1 missing
FROM mysql.help_relation, (SELECT @a := -1) t
WHERE @a < 16 ) y
WHERE NOT EXISTS
(SELECT r.CarID FROM MyTable m
WHERE y.missing BETWEEN FromCity_Id AND ToCity_Id
AND r.carid = m.carid)
GROUP BY CarID;
Produces (changing the first row for CarID 1002 to 0-9 to open up 10 and give us better test data):
+-------+---------+
| CarId | missing |
+-------+---------+
| 1001 | 3,4,10 |
| 1002 | 10 |
+-------+---------+
2 rows in set (0.00 sec)
And how does it all work?
Firstly...
The inner query gives us a list of numbers from 0 to 16:
(SELECT @a := @a + 1 missing
FROM mysql.help_relation, (SELECT @a := -1) t
WHERE @a < 16 ) y
It does that by starting at -1, and then displaying the result of adding 1 to that number for each row in some sacrificial table. I'm using mysql.help_relation because it's got over a thousand rows and most basic systems have it. YMMV.
Then we cross join that with MyTable:
SELECT CarId, ...
FROM MyTable r,
(...) y
This gives us every possible combination of rows, so we have each CarId and To/From IDs mixed with every number from 0-16.
Filtering...
This is where it gets interesting. We need to find rows that don't match the numbers, and we need to do so per CarID. This sort of thing would do it (as long as y.missing exists, which it will when we correlate the subquery):
SELECT m.CarID FROM MyTable m
WHERE y.missing BETWEEN FromCity_Id AND ToCity_Id
AND m.CarID = 1001;
Remember: y.missing is set to a number between 0-16, cross-joined with the rows in MyTable. This gives us a list of all numbers from 0-16 where CarID 1001 is busy. We can invert that set with a NOT EXISTS, and while we're at it, correlate (again) with CarId so we can get all such IDs.
Then it's an easy matter of filtering the rows that don't fit:
SELECT CarId, ...
FROM MyTable r,
(...) y
WHERE NOT EXISTS
(SELECT r.CarID FROM MyTable m
WHERE y.missing BETWEEN FromCity_Id AND ToCity_Id
AND r.carid = m.carid)
Output
To give a sensible result (attempt 1), we could then get distinct combinations. Here's that version:
SELECT DISTINCT CarId, missing
FROM MyTable r,
(SELECT @a := @a + 1 missing
FROM mysql.help_relation, (SELECT @a := -1) t
WHERE @a < 16 ) y
WHERE NOT EXISTS
(SELECT r.CarID FROM MyTable m
WHERE y.missing BETWEEN FromCity_Id AND ToCity_Id
AND r.carid = m.carid);
This gives:
+-------+---------+
| CarId | missing |
+-------+---------+
| 1001 | 3 |
| 1001 | 4 |
| 1001 | 10 |
| 1002 | 10 |
+-------+---------+
4 rows in set (0.01 sec)
The simple addition of a GROUP BY and a GROUP_CONCAT gives the pretty result you get at the top of this answer.
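For readers who find the set logic easier to follow outside SQL, here is a minimal Python sketch of the same idea: build the numbers 0-16, mark what each car covers, and keep the rest. The function name `missing_cities` and the tuple layout are assumptions made for illustration; this is not part of the query above.

```python
from collections import defaultdict

def missing_cities(runs, low=0, high=16):
    """Return {car_id: [numbers in low..high not covered by any run]}.

    runs is a list of (car_id, from_city, to_city) tuples (hypothetical layout).
    """
    covered = defaultdict(set)
    for car_id, from_city, to_city in runs:
        # BETWEEN is inclusive at both ends, so include to_city itself.
        covered[car_id].update(range(from_city, to_city + 1))
    return {
        car_id: [n for n in range(low, high + 1) if n not in cities]
        for car_id, cities in covered.items()
    }

runs = [
    (1001, 0, 2), (1001, 5, 9), (1001, 11, 16),
    (1002, 0, 9), (1002, 11, 16),  # first 1002 row changed to 0-9, as in the test data
]
result = missing_cities(runs)
print(result)
```

With the modified test data this gives 3, 4 and 10 for car 1001 and 10 for car 1002, matching the tables in this answer.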
I apologise for the inconvenience.

select * from carstable where CarId not in
(select distinct CarId from ranRecordTable where DATEDIFF(NOW(), Ran_Date) <= 90)
Hope this helps.

Here is the idea. Create a list of all cars and all numbers. Then, return all combinations that are not covered by the data. This is hard because there is more than one row for each car.
Here is one method:
select cars.CarId, n.n
from (select distinct CarId from t) cars cross join
(select 0 as n union all select 1 union all select 2 union all select 3 union all
select 4 union all select 5 union all select 6 union all select 7 union all
select 8 union all select 9 union all select 10 union all select 11 union all
select 12 union all select 13 union all select 14 union all select 15 union all
select 16
) n
where not exists (select 1
from t t2
where t2.ran_date >= now() - interval 90 day and
t2.CarId = cars.CarId and
n.n between t2.FromCity_id and t2.ToCity_id
);

SQL Fiddle
MySQL 5.5.32 Schema Setup:
CREATE TABLE Table1
(`CarId` int, `FromCity_Id` int, `ToCity_Id` int, `Ran_Date` datetime, `RunId` int)
;
INSERT INTO Table1
(`CarId`, `FromCity_Id`, `ToCity_Id`, `Ran_Date`, `RunId`)
VALUES
(1001, 0, 2, '2013-08-01 00:00:00', 1),
(1001, 5, 9, '2013-08-02 00:00:00', 2),
(1001, 11, 16, '2013-08-03 00:00:00', 3),
(1002, 0, 11, '2013-08-02 00:00:00', 4),
(1002, 11, 16, '2013-08-08 00:00:00', 5)
;
Query 1:
SELECT r1.CarId,r1.ToCity_Id as Missing_From, r2.FromCity_Id as Missing_To,
max(t.Ran_Date) as Last_Run_Date
FROM (
SELECT @i1:=@i1+1 AS rownum, t.*
FROM Table1 as t, (SELECT @i1:=0) as foo
ORDER BY CarId, Ran_Date) as r1
INNER JOIN (
SELECT @i2:=@i2+1 AS rownum, t.*
FROM Table1 as t, (SELECT @i2:=0) as foo
ORDER BY CarId, Ran_Date) as r2 ON r1.CarId = r2.CarId AND
r1.ToCity_Id != r2.FromCity_Id AND
r2.rownum = (r1.rownum + 1)
INNER JOIN Table1 as t ON r1.CarId = t.CarId
WHERE r1.Ran_Date >= now() - interval 90 day
GROUP BY r1.CarId, r1.ToCity_Id, r2.FromCity_Id
Results:
| CARID | MISSING_FROM | MISSING_TO | LAST_RUN_DATE |
|-------|--------------|------------|-------------------------------|
| 1001 | 2 | 5 | August, 03 2013 00:00:00+0000 |
| 1001 | 9 | 11 | August, 03 2013 00:00:00+0000 |
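The row-pairing step in Query 1 can also be sketched procedurally. The following Python is only an illustration under assumptions: the function name `run_gaps` and the tuple layout are invented here, and the 90-day filter is left out for brevity. Like the query, it assumes that a car's runs ordered by date are also in city order.

```python
from datetime import date
from itertools import groupby

def run_gaps(rows):
    """rows: (car_id, from_city, to_city, ran_date) tuples (hypothetical layout).

    Returns (car_id, missing_from, missing_to, last_run_date) for each pair of
    consecutive runs where the next run does not start where the previous ended.
    """
    ordered = sorted(rows, key=lambda r: (r[0], r[3]))
    gaps = []
    for car_id, group in groupby(ordered, key=lambda r: r[0]):
        car_rows = list(group)
        last_run = max(r[3] for r in car_rows)
        # Pair each run with the one that follows it (rownum and rownum + 1).
        for prev, nxt in zip(car_rows, car_rows[1:]):
            if prev[2] != nxt[1]:
                gaps.append((car_id, prev[2], nxt[1], last_run))
    return gaps

rows = [
    (1001, 0, 2, date(2013, 8, 1)),
    (1001, 5, 9, date(2013, 8, 2)),
    (1001, 11, 16, date(2013, 8, 3)),
    (1002, 0, 11, date(2013, 8, 2)),
    (1002, 11, 16, date(2013, 8, 8)),
]
gaps = run_gaps(rows)
for gap in gaps:
    print(gap)
```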

Related

How to pick a row randomly based on a number of tickets you have

I have this table called my_users
my_id | name | raffle_tickets
1 | Bob | 3
2 | Sam | 59
3 | Bill | 0
4 | Jane | 10
5 | Mike | 12
As you can see Sam has 59 tickets so he has the highest chance of winning.
Chance of winning:
Sam = 59/84
Bob = 3/84
Jane = 10/84
Bill = 0/84
Mike = 12/84
PS: 84 is the total number of tickets in the table (just so you know I didn't randomly pick 84)
Based on this, how can I randomly pick a winner, but ensure those who have more raffle tickets have a higher chance of being randomly picked? The winner who is picked then has 1 ticket deducted from their total tickets.
UPDATE my_users
SET raffle_tickets = raffle_tickets - 1
WHERE my_id = --- Then I get stuck here...
Server version: 5.7.30
For MySQL 8+
WITH
cte1 AS ( SELECT name, SUM(raffle_tickets) OVER (ORDER BY my_id) cum_sum
FROM my_users ),
cte2 AS ( SELECT SUM(raffle_tickets) * RAND() random_sum
FROM my_users )
SELECT name
FROM cte1
CROSS JOIN cte2
WHERE cum_sum >= random_sum
ORDER BY cum_sum LIMIT 1;
For MySQL 5+
SELECT cte1.name
FROM ( SELECT t2.my_id id, t2.name, SUM(t1.raffle_tickets) cum_sum
FROM my_users t1
JOIN my_users t2 ON t1.my_id <= t2.my_id
WHERE t1.raffle_tickets > 0
GROUP BY t2.my_id, t2.name ) cte1
CROSS JOIN ( SELECT RAND() * SUM(raffle_tickets) random_sum
FROM my_users ) cte2
WHERE cte1.cum_sum >= cte2.random_sum
ORDER BY cte1.cum_sum LIMIT 1;
fiddle
You want a weighted pull from a random sample. For this purpose, variables are probably the most efficient solution:
select u.*
from (select u.*, (@t := @t + raffle_tickets) as running_tickets
from my_users u cross join
(select @t := 0, @r := rand()) params
where raffle_tickets > 0
) u
where @r >= (running_tickets - raffle_tickets) / @t and
@r < (running_tickets / @t);
What this does is calculate the running sum of tickets and then divide by the total number of tickets to get a number between 0 and 1. For example, this might produce:
my_id | name | raffle_tickets | running_tickets | running_tickets / @t
1 | Bob | 3 | 3 | 0.03571428571428571
2 | Sam | 59 | 62 | 0.7380952380952381
4 | Jane | 10 | 72 | 0.8571428571428571
5 | Mike | 12 | 84 | 1
The ordering of the original rows doesn't matter -- which is why there is no ordering in the subquery.
The ratio is then used with rand() to select a particular row.
Note that in the outer query, @t is the total number of tickets.
Here is a db<>fiddle.
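To make the running-sum selection concrete, here is a minimal Python sketch of the same weighted pull. The function name `pick_winner` and the optional `r` parameter (which lets you pass a fixed number instead of a random one) are assumptions added for illustration; they are not part of the query.

```python
import random

def pick_winner(users, r=None):
    """users: list of (my_id, name, raffle_tickets); r: number in [0, 1).

    Each user owns the slice [running - tickets, running) of the ticket range,
    and the user whose slice contains r * total wins.
    """
    if r is None:
        r = random.random()
    total = sum(tickets for _, _, tickets in users)
    if total <= 0:
        raise ValueError("no tickets to draw from")
    threshold = r * total
    running = 0
    for my_id, name, tickets in users:
        running += tickets
        if tickets > 0 and threshold < running:
            return my_id, name
    raise ValueError("r must be in [0, 1)")

users = [(1, "Bob", 3), (2, "Sam", 59), (3, "Bill", 0), (4, "Jane", 10), (5, "Mike", 12)]
print(pick_winner(users, r=0.5))   # 0.5 * 84 = 42 falls in Sam's slice [3, 62)
print(pick_winner(users))          # a genuinely random draw
```

The returned my_id is what the UPDATE in the question would need in its WHERE clause to deduct one ticket from the winner.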

Select duplicates while concatenating every one except the first

I am trying to write a query that will select all of the numbers in my table, but for numbers with duplicates I want to append something on the end that marks them as duplicates. However, I am not sure how to do this.
Here is an example of the table
TableA
ID Number
1 1
2 2
3 2
4 3
5 4
SELECT statement output would be like this.
Number
1
2
2-dup
3
4
Any insight on this would be appreciated.
If your MySQL version doesn't support window functions, you can write a correlated subquery that computes a row number per Number, then use CASE WHEN to append '-dup' when rn > 1.
create table T (ID int, Number int);
INSERT INTO T VALUES (1,1);
INSERT INTO T VALUES (2,2);
INSERT INTO T VALUES (3,2);
INSERT INTO T VALUES (4,3);
INSERT INTO T VALUES (5,4);
Query 1:
select t1.id,
(CASE WHEN rn > 1 then CONCAT(Number,'-dup') ELSE Number END) Number
from (
SELECT *,(SELECT COUNT(*)
FROM T tt
where tt.Number = t1.Number and tt.id <= t1.id
) rn
FROM T t1
)t1
Results:
| id | Number |
|----|--------|
| 1 | 1 |
| 2 | 2 |
| 3 | 2-dup |
| 4 | 3 |
| 5 | 4 |
If you can use window functions, use ROW_NUMBER() partitioned by Number to compute the same row number.
select t1.id,
(CASE WHEN rn > 1 then CONCAT(Number,'-dup') ELSE Number END) Number
from (
SELECT *,row_number() over(partition by Number order by id) rn
FROM T t1
)t1
sqlfiddle
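The row-number idea behind both queries can be sketched in a few lines of Python. This is only an illustration, with a made-up function name `mark_duplicates`: the first occurrence of each Number (lowest ID) is kept as is, and every later occurrence gets the '-dup' suffix.

```python
def mark_duplicates(rows):
    """rows: (id, number) tuples. Returns the Number column as strings,
    with '-dup' appended to every occurrence after the first (by id)."""
    seen = set()
    output = []
    for _id, number in sorted(rows):
        if number in seen:
            # Equivalent to rn > 1 in the queries above.
            output.append(f"{number}-dup")
        else:
            seen.add(number)
            output.append(str(number))
    return output

table_a = [(1, 1), (2, 2), (3, 2), (4, 3), (5, 4)]
marked = mark_duplicates(table_a)
print(marked)
```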
I made a list of all the IDs that weren't dups (left join select) and then compared them to the entire list(case when):
select
case when a.id <> b.min_id then cast(a.Number as varchar(6)) + '-dup' else cast(a.Number as varchar(6)) end as Number
from table_a a
left join (select MIN(b.id) min_id, Number from table_a b group by b.number)b on b.number = a.number
I did this in MS SQL 2016, hope it works for you.
This populates the table used:
insert into table_a (ID, Number)
select 1,1
union all
select 2,2
union all
select 3,2
union all
select 4,3
union all
select 5,4

Use my sql to randomly select exclusive records

I have a Table A as below
id (integer)
follow_up (integer, days under observation)
matched_id (integer)
id ; follow_up ; matched_id
1 ; 10 ; 19
1 ; 10 ; 20
1 ; 10 ; 21
2 ; 5 ; 22
2 ; 5 ; 23
2 ; 5 ; 24
2 ; 5 ; 19
2 ; 5 ; 20
3 ; 6 ; 25
3 ; 6 ; 26
3 ; 6 ; 27
4 ; 7 ; 19
4 ; 7 ; 28
4 ; 7 ; 29
I would like to limit the result to 2 records per id, and the records should be randomly picked and be exclusive to each id. For example:
matched_id: "19" and "20" were given to id:1, then "19" and "20" should not be given to id:2
matched_id: "19" was given to id:1, then "19" should not be given to id:4
and so on for the rest of the table.
Required output:
id ; follow_up ; matched_id
1 ; 10 ; 19
1 ; 10 ; 20
2 ; 5 ; 22
2 ; 5 ; 23
3 ; 6 ; 25
3 ; 6 ; 26
4 ; 7 ; 28
4 ; 7 ; 29
Please help me. Thank you so much!
This is a very good and very challenging SQL question.
Your requirements are:
1. No matched_id should appear more than once in the result set
2. No ID should be given more than two matches
3. The matching should be random
We will stick to a pure SQL solution, assuming that you can't return, say, a larger result set and do some filtering using business logic in your implementation language.
First, let's tackle random assignment. Randomly ordering items inside of groups is a fun question. I decided to tackle it by ordering on a SHA1 hash of the data in the row (id, follow_up, matched_id), which will give a repeatable result with a feeling of randomness. (This would be best if there were a column that contained the date/time created or modified.)
SELECT * FROM
(
SELECT
a.id,
a.follow_up,
a.matched_id,
a.rank_hash,
count(*) rank
FROM
(SELECT *, SHA1(CONCAT(id, follow_up, matched_id)) rank_hash FROM TableA) a
JOIN
(SELECT *, SHA1(CONCAT(id, follow_up, matched_id)) rank_hash FROM TableA) b
ON a.rank_hash >= b.rank_hash
AND a.id = b.id
GROUP BY a.id, a.matched_id
ORDER BY a.id, rank
) groups
WHERE rank <= 2
GROUP BY matched_id
This might suffice for your use case if there are sufficient matched_id values for each id. But what if there is a hidden fourth requirement:
4. If possible, an ID should receive a match.
In other words, what if, as a result of random shuffling, a matched_id was assigned to an id that had several other matches, but further down the result set it was the only match for an id? An optimal solution in which every ID were matched with a matched_id was possible, but it never happened because all the matched_ids were used up earlier in the process?
For example:
CREATE TABLE TableA
(`id` int, `follow_up` int, `matched_id` varchar(1))
;
INSERT INTO TableA
(`id`, `follow_up`, `matched_id`)
VALUES
(1, 10, 'A'),
(1, 10, 'B'),
(1, 10, 'C'),
(2, 5, 'D'),
(2, 5, 'E'),
(2, 5, 'F'),
(3, 5, 'C')
;
In the above set, if IDs and their matches are assigned randomly, if ID 1 gets assigned matched_id C, then ID 3 will not get a matched_id at all.
What if we first find out how many matches an ID received, and order by that first?
SELECT
a.*,
frequency
FROM TableA a
JOIN
( SELECT
matched_id,
count(*) frequency
FROM
TableA
GROUP BY matched_id
) b
ON a.matched_id = b.matched_id
GROUP BY a.matched_id
ORDER BY b.frequency
This is where a middleman programming language might come in handy to help limit the result set.
But note that we also lost our requirement of randomness! As you can see, a pure SQL solution might get pretty ugly. It is possible, though, by combining the techniques outlined above.
Hopefully this will get your imagination firing.
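If a middleman language is an option, the requirements can be sketched as a greedy pass. The Python below is a rough illustration under assumptions, not a guaranteed-optimal matching: the function name `assign_matches`, its parameters and the "fewest candidates first" heuristic for the hidden fourth requirement are all choices made here for the example.

```python
import random
from collections import defaultdict

def assign_matches(rows, per_id=2, seed=None):
    """rows: (id, follow_up, matched_id) tuples.

    Gives each id up to per_id randomly chosen matched_ids, never reusing a
    matched_id. Greedy, so it may miss an assignment that a full search finds.
    """
    rng = random.Random(seed)
    candidates = defaultdict(list)
    follow_up = {}
    for id_, days, matched_id in rows:
        candidates[id_].append(matched_id)
        follow_up[id_] = days
    used = set()
    result = []
    # Heuristic: ids with the fewest candidates choose first.
    for id_ in sorted(candidates, key=lambda i: len(candidates[i])):
        options = candidates[id_][:]
        rng.shuffle(options)  # requirement 3: random choice
        picked = 0
        for matched_id in options:
            if picked == per_id:  # requirement 2: at most per_id matches
                break
            if matched_id not in used:  # requirement 1: exclusive
                used.add(matched_id)
                result.append((id_, follow_up[id_], matched_id))
                picked += 1
    return sorted(result)

rows = [
    (1, 10, 19), (1, 10, 20), (1, 10, 21),
    (2, 5, 22), (2, 5, 23), (2, 5, 24), (2, 5, 19), (2, 5, 20),
    (3, 6, 25), (3, 6, 26), (3, 6, 27),
    (4, 7, 19), (4, 7, 28), (4, 7, 29),
]
assigned = assign_matches(rows)
for row in assigned:
    print(row)
```

On the question's data every id always ends up with two matches, because each id has at least two candidates that no other id can take.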
Along with RAND() and MySQL user defined variables you can achieve this:
SELECT
t.id,
t.follow_up,
t.matched_id
FROM
(
SELECT
randomTable.*,
IF(@sameID = id, @rn := @rn + 1,
IF(@sameID := id, @rn := 1, @rn := 1)
) AS rowNumber
FROM
(
SELECT
*
FROM tableA
ORDER BY id, RAND()
) AS randomTable
CROSS JOIN (SELECT @sameID := 0, @rn := 0) var
) AS t
WHERE t.rowNumber <= 2
ORDER BY t.id
See Demo
Here's a solution for the specific problem given. It does not scale!
SELECT *
FROM
( SELECT a.matched_id m1
, b.matched_id m2
, c.matched_id m3
, d.matched_id m4
FROM my_table a
JOIN my_table b
ON b.matched_id NOT IN(a.matched_id)
JOIN my_table c
ON c.matched_id NOT IN(a.matched_id,b.matched_id)
JOIN my_table d
ON d.matched_id NOT IN(a.matched_id,b.matched_id,c.matched_id)
WHERE a.id = 1
AND b.id = 2
AND c.id = 3
AND d.id = 4
) x
JOIN
( SELECT a.matched_id n1
, b.matched_id n2
, c.matched_id n3
, d.matched_id n4
FROM my_table a
JOIN my_table b
ON b.matched_id NOT IN(a.matched_id)
JOIN my_table c
ON c.matched_id NOT IN(a.matched_id,b.matched_id)
JOIN my_table d
ON d.matched_id NOT IN(a.matched_id,b.matched_id,c.matched_id)
WHERE a.id = 1
AND b.id = 2
AND c.id = 3
AND d.id = 4
) y
ON y.n1 NOT IN(x.m1,x.m2,x.m3,x.m4)
AND y.n2 NOT IN(x.m1,x.m2,x.m3,x.m4)
AND y.n3 NOT IN(x.m1,x.m2,x.m3,x.m4)
AND y.n4 NOT IN(x.m1,x.m2,x.m3,x.m4)
ORDER
BY RAND() LIMIT 1;
+----+----+----+----+----+----+----+----+
| m1 | m2 | m3 | m4 | n1 | n2 | n3 | n4 |
+----+----+----+----+----+----+----+----+
| 20 | 24 | 27 | 29 | 21 | 23 | 26 | 28 |
+----+----+----+----+----+----+----+----+
So, in this example, the pairs are:
id1: 20,21
id2: 24,23
id3: 27,26
id4: 29,28

How to show 0 when no data

I want to show 0 (or some other value I choose) when there is no data. This is my query.
SELECT `icDate`,IFNULL(SUM(`icCost`),0) AS icCost
FROM `incomp`
WHERE (`icDate` BETWEEN "2016-01-01" AND "2016-01-05")
AND `compID` = "DDY"
GROUP BY `icDate`
And this is result of this query.
icDate | icCost
--------------------------
2016-01-01 | 1000.00
2016-01-02 | 2000.00
2016-01-03 | 3000.00
2016-01-04 | 4000.00
2016-01-05 | 5000.00
If every day in the range has data, there is no problem. But when a day has no data, that day is not shown at all, like this.
icDate | icCost
--------------------------
2016-01-01 | 1000.00
2016-01-02 | 2000.00
2016-01-04 | 4000.00
2016-01-05 | 5000.00
But I want it to show the data like this.
icDate | icCost
--------------------------
2016-01-01 | 1000.00
2016-01-02 | 2000.00
2016-01-03 | 0.00
2016-01-04 | 4000.00
2016-01-05 | 5000.00
How do I write a query to get this result? Thank you.
I ran a simulation but could not reproduce your problem. I created a test table, inserted some data and ran this select, and the result was normal:
SELECT icDate,
format(ifnull(sum(icCost), 0),2) as icCost,
count(icDate) as entries
FROM incomp
WHERE icDate BETWEEN '2016-01-01' AND '2016-01-05'
AND compID = 'DDY'
group by icDate;
This is result of my test, exported in csv file:
icDate | icCost | entries
----------------------------------
2016-01-01 | 8,600.00 | 8
2016-01-02 | 5,600.00 | 4
2016-01-03 | 5,400.00 | 3
2016-01-04 | 0.00 | 1
2016-01-05 | 7,050.00 | 7
Is the icCost field set to NULL or to the number zero? Remember that in some cases a NULL value is treated differently from an empty or zero value.
I found the answer. It works with a calendar table.
SELECT tbd.`db_date`,
(SELECT IFNULL(SUM(icCost),0) AS icCost
FROM `incomp`
WHERE icDate = tbd.db_date
AND compID = "DDY"
)AS icCost
FROM tb_date AS tbd
WHERE (tbd.`db_date` BETWEEN "2016-01-01" AND "2016-01-05")
GROUP BY tbd.`db_date`
LIMIT 0,100
Simple, but it works.
OK, you can check whether your table is filled correctly for every day. First, create a temporary table like this:
CREATE TEMPORARY TABLE myCalendar (
CalendarDate date primary key not null
);
Next, fill this table with the valid days. To do so, use this procedure:
DELIMITER $$
CREATE PROCEDURE doWhile()
BEGIN
# IF YOU WANT TO USE CURRENT MONTH
#SET @startCount = ADDDATE(LAST_DAY(SUBDATE(CURDATE(), INTERVAL 1 MONTH)), 1);
#SET @endOfCount = LAST_DAY(sysdate());
# USE TO SET A DATE
SET @startCount = '2016-01-01';
SET @endOfCount = '2016-01-30';
WHILE @startCount <= @endOfCount DO
INSERT INTO myCalendar (CalendarDate) VALUES (@startCount);
SET @startCount = date_add(@startCount, interval 1 day);
END WHILE;
END$$
DELIMITER ;
Run the procedure with this command:
CALL doWhile();
Now, run the following:
SELECT format(ifnull(sum(t1.icCost), 0),2) as icCost,
ifnull(t1.icDate, 'Not found') as icDate,
t2.CalendarDate as 'For the day'
from incomp t1
right join myCalendar t2 ON
t2.CalendarDate = t1.icDate group by t2.CalendarDate;
I think this will help you find a solution, for example by showing whether a record exists for a given day or not.
I hope this can help you!
Sorry for my earlier answer. I gave a MSSQL answer instead of a MySQL answer.
You need a calendar table to have a set of all dates in your range. This could be a permanent table or a temporary table. Either way, there are a number of ways to populate it. Here is one way (borrowed from here):
set @beginDate = '2016-01-01';
set @endDate = '2016-01-05';
create table DateSequence(Date Date);
insert into DateSequence
select * from
(select adddate('1970-01-01',t4.i*10000 + t3.i*1000 + t2.i*100 + t1.i*10 + t0.i) selected_date from
(select 0 i union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t0,
(select 0 i union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t1,
(select 0 i union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t2,
(select 0 i union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t3,
(select 0 i union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t4) v
where selected_date between @beginDate and @endDate;
Your best bet is probably to make a permanent table that has every possible date. That way you only have to populate it once and it's ready to go whenever you need it.
Now you can outer join the calendar table with your inComp table.
set @beginDate = '2016-01-01';
set @endDate = '2016-01-05';
select d.Date,
sum(ifnull(i.icCost, 0)) inComp
from DateSequence d
-- filter on compID in the ON clause so that days without rows are kept
left outer join inComp i on i.icDate = d.Date
and i.compID = 'DDY'
where d.Date between @beginDate and @endDate
group by d.Date
order by d.Date;
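The calendar-table idea boils down to "start from every date, then add whatever data exists". Here is a minimal Python sketch of that logic; the function name `daily_totals` and the tuple layout are assumptions made for illustration, and the sample rows are invented test data.

```python
from datetime import date, timedelta

def daily_totals(rows, begin, end, comp_id):
    """rows: (ic_date, comp_id, ic_cost) tuples.

    Returns {date: total cost} with an entry for every day from begin to end,
    using 0.0 for days that have no rows for comp_id.
    """
    totals = {}
    day = begin
    while day <= end:  # plays the role of the calendar table
        totals[day] = 0.0
        day += timedelta(days=1)
    for ic_date, row_comp, cost in rows:
        if row_comp == comp_id and ic_date in totals:
            totals[ic_date] += cost
    return totals

rows = [
    (date(2016, 1, 1), "DDY", 1000.0),
    (date(2016, 1, 2), "DDY", 2000.0),
    (date(2016, 1, 3), "XYZ", 9999.0),  # other company: must not count for DDY
    (date(2016, 1, 4), "DDY", 4000.0),
    (date(2016, 1, 5), "DDY", 5000.0),
]
totals = daily_totals(rows, date(2016, 1, 1), date(2016, 1, 5), "DDY")
for day, cost in totals.items():
    print(day, f"{cost:.2f}")
```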

SQL statement for querying with multiple conditions including 3 most recent dates

I need help in finding the rows that correspond to the most recent date, the next most recent and the one after that, where some condition ABC is "Y" and group it by a column name XYZ ASC but XYZ can appear multiple times. So, say XYZ is 50, then for the rows in the three years, the XYZ will be 50. I have the following code that executes but returns only two rows out of thousands which is impossible. I tried executing just the date condition but it returned dates that were less than or equal to MAX(DATE)-3 as well. Don't know where I am going wrong.
select * from money.cash where DATE =(
select
MAX(DATE)
from
money.cash
where
DATE > (select MAX(DATE)-3 from money.cash)
)
GROUP BY XYZ ASC
having ABC = "Y";
The structure of the table is as follows (only a schematic, not the real thing).
Comp_ID DATE XYZ ABC $$$$ ....
1 2012-1-1 10 Y SOME-AMOUNT
2 2011-1-1 10 Y
3 2006-1-1 10 Y
4 2011-1-1 20 Y
5 2002-1-1 20 Y
6 2000-1-1 20 Y
7 1998-1-1 20 Y
The desired o/p would be the first three rows for XYZ=10 in ascending order and the most recent 3 dates for XYZ=20.
LAST AND IMPORTANT-This table's values keeps changing as new data comes in. So, the o/p(which will be in a new table) must reflect the dynamics in the 1st/original/above TABLE.
MySQL (before 8.0, which added window functions) doesn't have functionality that is friendly to greatest-n-per-group queries.
One option would be...
- Find the MAX(Date) per group (XYZ)
- Then use that result to find the MAX(Date) of all records before that date
- Then do it again for all records before that date
It's really inefficient, but MySQL hasn't got the functionality required to do this efficiently. Sorry...
CREATE TABLE yourTable
(
comp_id INT,
myDate DATE,
xyz INT,
abc VARCHAR(1)
)
;
INSERT INTO yourTable SELECT 1, '2012-01-01', 10, 'Y';
INSERT INTO yourTable SELECT 2, '2011-01-01', 10, 'Y';
INSERT INTO yourTable SELECT 3, '2006-01-01', 10, 'Y';
INSERT INTO yourTable SELECT 4, '2011-01-01', 20, 'Y';
INSERT INTO yourTable SELECT 5, '2002-01-01', 20, 'Y';
INSERT INTO yourTable SELECT 6, '2000-01-01', 20, 'Y';
INSERT INTO yourTable SELECT 7, '1998-01-01', 20, 'Y';
SELECT
yourTable.*
FROM
(
SELECT
lookup.XYZ,
COALESCE(MAX(yourTable.myDate), lookup.MaxDate) AS MaxDate
FROM
(
SELECT
lookup.XYZ,
COALESCE(MAX(yourTable.myDate), lookup.MaxDate) AS MaxDate
FROM
(
SELECT
yourTable.XYZ,
MAX(yourTable.myDate) AS MaxDate
FROM
yourTable
WHERE
yourTable.ABC = 'Y'
GROUP BY
yourTable.XYZ
)
AS lookup
LEFT JOIN
yourTable
ON yourTable.XYZ = lookup.XYZ
AND yourTable.myDate < lookup.MaxDate
AND yourTable.ABC = 'Y'
GROUP BY
lookup.XYZ,
lookup.MaxDate
)
AS lookup
LEFT JOIN
yourTable
ON yourTable.XYZ = lookup.XYZ
AND yourTable.myDate < lookup.MaxDate
AND yourTable.ABC = 'Y'
GROUP BY
lookup.XYZ,
lookup.MaxDate
)
AS lookup
INNER JOIN
yourTable
ON yourTable.XYZ = lookup.XYZ
AND yourTable.myDate >= lookup.MaxDate
WHERE
yourTable.ABC = 'Y'
ORDER BY
yourTable.comp_id
;
DROP TABLE yourTable;
There are other options, but they're all a bit hacky. Search SO for greatest-n-per-group mysql.
My results using your example data:
Comp_ID | DATE | XYZ | ABC
------------------------------
1 | 2012-1-1 | 10 | Y
2 | 2011-1-1 | 10 | Y
3 | 2006-1-1 | 10 | Y
4 | 2011-1-1 | 20 | Y
5 | 2002-1-1 | 20 | Y
6 | 2000-1-1 | 20 | Y
Here's another way, hopefully more efficient than Dems' answer.
Test it with an index on (abc, xyz, date):
SELECT m.xyz, m.date -- for all columns: SELECT m.*
FROM
( SELECT DISTINCT xyz
FROM money.cash
WHERE abc = 'Y'
) AS dm
JOIN
money.cash AS m
ON m.abc = 'Y'
AND m.xyz = dm.xyz
AND m.date >= COALESCE(
( SELECT im.date
FROM money.cash AS im
WHERE im.abc = 'Y'
AND im.xyz = dm.xyz
ORDER BY im.date DESC
LIMIT 1
OFFSET 2 -- to get 3 latest rows per xyz
), DATE('1000-01-01') ) ;
If more than one row has the same (abc, xyz, date), the query may return more than 3 rows per xyz (all rows tied for 3rd place will be shown).
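The cutoff logic (find the 3rd latest date per xyz, then keep everything on or after it) can be sketched in Python as follows. The function name `latest_per_group` and the tuple layout are assumptions for illustration only; `date.min` stands in for the COALESCE fallback used when a group has fewer than 3 rows.

```python
from collections import defaultdict
from datetime import date

def latest_per_group(rows, n=3):
    """rows: (comp_id, row_date, xyz, abc) tuples.

    Keeps rows with abc == 'Y' whose date is on or after the n-th latest date
    of their xyz group, so ties at the cutoff are all kept.
    """
    dates = defaultdict(list)
    for _, row_date, xyz, abc in rows:
        if abc == "Y":
            dates[xyz].append(row_date)
    cutoff = {}
    for xyz, group_dates in dates.items():
        group_dates.sort(reverse=True)
        # Same role as LIMIT 1 OFFSET n-1 with a fallback for small groups.
        cutoff[xyz] = group_dates[n - 1] if len(group_dates) >= n else date.min
    return [r for r in rows if r[3] == "Y" and r[1] >= cutoff[r[2]]]

rows = [
    (1, date(2012, 1, 1), 10, "Y"),
    (2, date(2011, 1, 1), 10, "Y"),
    (3, date(2006, 1, 1), 10, "Y"),
    (4, date(2011, 1, 1), 20, "Y"),
    (5, date(2002, 1, 1), 20, "Y"),
    (6, date(2000, 1, 1), 20, "Y"),
    (7, date(1998, 1, 1), 20, "Y"),
]
latest = latest_per_group(rows)
for row in latest:
    print(row)
```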