Here's the SQLFiddle with schema and data.
I'm trying to sum 2 columns, one at parent level and the other at child level.
The current query I'm using gives me the right sum amount on child level, but doubles up the amount on parent level, due to another 1-many relationship involved on the child level.
Ugh... that's a terrible explanation - here's the English version:
Joe the salesman is involved in 2 sales.
For the 1st sale, he gets 2 sets of commissions, based on 2 different commission types. I'm trying to show Joe's total sale value alongside the total value of his applicable splits. The split value total is fine, but the sale value gets doubled up because I'm obviously grouping/joining incorrectly (see the last example below).
This is fine:
select sp.person_name, pr.description,
sum(spl.split) as SplitValue
from sale s, product pr, sales_person sp, sales_split spl
where s.product_id = pr.id
and s.id = spl.sale_id
and sp.id = spl.sales_person_id
group by sp.id;
person_name | description | SplitValue
----------- | ----------- | ----------
Joe | Widget 1 | 50
Sam | Widget 1 | 10
This also yields the correct split and sale values, but now 3 rows are displayed for Joe (i.e. the 2nd row is a duplicate of the 1st one). I only want to display Joe's "Widget 1" sale once, so this is not correct:
select sp.person_name, pr.description,
sum(s.sale_value) as SaleValue, sum(spl.split) as SplitValue
from sale s, product pr, sales_person sp, sales_split spl, sales_split_agreement ssa
where s.id = spl.sale_id
and s.product_id = pr.id
and sp.id = spl.sales_person_id
and sp.id = ssa.sales_person_id
and spl.sales_person_id = ssa.sales_person_id
and ssa.id = spl.sales_split_agreement_id
group by sp.id, spl.id;
person_name | description | SplitValue | SaleValue
----------- | ----------- | ---------- | ---------
Joe | Widget 1 | 10 | 20
Joe | Widget 1 | 10 | 20
Joe | Widget 2 | 30 | 30
Sam | Widget 1 | 10 | 20
Now the duplicated row is gone, but Joe's SaleValue is incorrect - it should be 50, not 70:
select sp.person_name, pr.description,
sum(spl.split) as SplitValue, sum(s.sale_value) as SaleValue
from sale s, product pr, sales_person sp, sales_split spl, sales_split_agreement ssa
where s.id = spl.sale_id
and s.product_id = pr.id
and sp.id = spl.sales_person_id
and sp.id = ssa.sales_person_id
and spl.sales_person_id = ssa.sales_person_id
and ssa.id = spl.sales_split_agreement_id
group by sp.id;
person_name | description | SplitValue | SaleValue
----------- | ----------- | ---------- | ---------
Joe | Widget 1 | 50 | 70
Sam | Widget 1 | 10 | 20
In other words, I'm after the query that will yield this result (with Joe's correct SaleValue of 50):
person_name | description | SplitValue | SaleValue
----------- | ----------- | ---------- | ---------
Joe | Widget 1 | 50 | 50
Sam | Widget 1 | 10 | 20
Any help will be greatly appreciated!
UPDATE 1:
For clarity - here's the schema and test data from the fiddle:
CREATE TABLE product
(`id` int, `description` varchar(12))
;
INSERT INTO product
(`id`, `description`)
VALUES
(1, 'Widget 1'),
(2, 'Widget 2')
;
CREATE TABLE sales_person
(`id` int, `person_name` varchar(7))
;
INSERT INTO sales_person
(`id`, `person_name`)
VALUES
(1, 'Joe'),
(2, 'Sam')
;
CREATE TABLE sale
(`id` int, `product_id` int, `sale_value` int)
;
INSERT INTO sale
(`id`, `product_id`, `sale_value`)
VALUES
(1, 1, 20.00),
(2, 2, 30.00)
;
CREATE TABLE split_type
(`id` int, `description` varchar(6))
;
INSERT INTO split_type
(`id`, `description`)
VALUES
(1, 'Type 1'),
(2, 'Type 2')
;
CREATE TABLE sales_split_agreement
(`id` int, `sales_person_id` int, `split_type_id` int, `percentage` int)
;
INSERT INTO sales_split_agreement
(`id`, `sales_person_id`, `split_type_id`, `percentage`)
VALUES
(1, 1, 1, 50),
(2, 1, 2, 50),
(3, 2, 1, 50),
(4, 1, 1, 100)
;
CREATE TABLE sales_split
(`id` int, `sale_id` int, `sales_split_agreement_id` int, `sales_person_id` int, `split` int )
;
INSERT INTO sales_split
(`id`, `sale_id`, `sales_split_agreement_id`, `sales_person_id`, `split`)
VALUES
(1, 1, 1, 1, 10),
(2, 1, 2, 1, 10),
(3, 1, 3, 2, 10),
(4, 2, 4, 1, 30)
;
I think you were on the right track, but I decided to start over and approach it from the beginning. Getting the SplitValue for each person does not require all those tables. In fact, all you need are sales_split and sales_person, like this:
SELECT sp.person_name, SUM(ss.split) AS SplitValue
FROM sales_person sp
JOIN sales_split ss ON sp.id = ss.sales_person_id
GROUP BY sp.id;
Similarly, you can get the total sale value for each person with a join between sale, sales_split, and sales_person:
SELECT sp.person_name, SUM(s.sale_value) AS SaleValue
FROM sale s
JOIN sales_split ss ON ss.sale_id = s.id
JOIN sales_person sp ON sp.id = ss.sales_person_id
GROUP BY sp.id;
At this point, I realize you have an error in your expected results (for this data set). Joe does in fact have a sale value of 70, because his sales_split rows 1 (sale value 20), 2 (sale value 20), and 4 (sale value 30) add up to 70. However, I still think this query will help you out more than the one you have.
Next, you can get the values for each sales_person_id by joining those two subqueries to the sales_person table. I took out the join to sales_person in the subqueries, as it is no longer needed there. It even makes the subqueries a little cleaner:
SELECT sp.person_name, COALESCE(t1.SplitValue, 0) AS SplitValue, COALESCE(t2.SaleValue, 0) AS SaleValue
FROM sales_person sp
LEFT JOIN(
SELECT ss.sales_person_id, SUM(ss.split) AS SplitValue
FROM sales_split ss
GROUP BY ss.sales_person_id) t1 ON t1.sales_person_id = sp.id
LEFT JOIN(
SELECT ss.sales_person_id, SUM(s.sale_value) AS SaleValue
FROM sale s
JOIN sales_split ss ON ss.sale_id = s.id
GROUP BY ss.sales_person_id) t2 ON t2.sales_person_id = sp.id;
Here is an SQL Fiddle example.
EDIT: I understand now why Joe's actual sale price is 50, because he split twice on sale id 1. To work around this, I first got a list of distinct sales for each sales_person like this:
SELECT DISTINCT sale_id, sales_person_id
FROM sales_split;
This way, there is only one row for sales_person_id = 1 and sale_id = 1. Then, it was easy enough to join that to the sale table and get the proper sales value for each sales_person:
SELECT t.sales_person_id, SUM(s.sale_value) AS SaleValue
FROM(
SELECT DISTINCT sale_id, sales_person_id
FROM sales_split) t
JOIN sale s ON s.id = t.sale_id
GROUP BY t.sales_person_id;
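To see the duplication that DISTINCT removes, it can help to count the split rows per person and sale. This is only a diagnostic sketch against the sample data from the question; the alias split_rows is made up for illustration:

```sql
-- Diagnostic sketch: how many sales_split rows exist per (person, sale) pair.
-- Any pair with split_rows > 1 causes SUM(s.sale_value) to count that sale
-- more than once when sale is joined directly to sales_split.
SELECT ss.sales_person_id,
       ss.sale_id,
       COUNT(*) AS split_rows   -- hypothetical alias
FROM sales_split ss
GROUP BY ss.sales_person_id, ss.sale_id;
```

With the sample data, only the pair sales_person_id = 1, sale_id = 1 has a count of 2, which is why sale 1 was added twice for Joe.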
The rest of my answer above still fits. I wrote one query to get SplitValue and one query to get SaleValue, and I joined them together. So, all I have to do now is replace the incorrect SaleValue subquery from further up with the subquery I just gave you:
SELECT sp.person_name, COALESCE(t1.SplitValue, 0) AS SplitValue, COALESCE(t2.SaleValue, 0) AS SaleValue
FROM sales_person sp
LEFT JOIN(
SELECT ss.sales_person_id, SUM(ss.split) AS SplitValue
FROM sales_split ss
GROUP BY ss.sales_person_id) t1 ON t1.sales_person_id = sp.id
LEFT JOIN(
SELECT t.sales_person_id, SUM(s.sale_value) AS SaleValue
FROM(
SELECT DISTINCT sale_id, sales_person_id
FROM sales_split) t
JOIN sale s ON s.id = t.sale_id
GROUP BY t.sales_person_id) t2 ON t2.sales_person_id = sp.id;
Here is the updated SQL Fiddle.
You mentioned in the comments that you shortened your data for brevity, which is fine. I am leaving my joins as they are, and I trust that it gives you enough direction that you can adjust them accordingly to match your proper structure.
I want to join two tables, with the condition that I only join rows that have exactly one match. For example:
books:
id | name | price
1 | book1 | 19
2 | book2 | 19
3 | book3 | 30
price_offer:
id | offer | price
1 | offer1 | 19
2 | offer2 | 30
So now, if I run a select query on these tables:
SELECT * FROM price_offer
JOIN books ON price_offer.price = books.price
I only want to join the book with id 3, as it has only one match with the price_offer table.
You could use a self join on the books table to pick a book with only a single match:
select po.*, b1.*
from price_offer po
join books b1 on po.price = b1.price
join (
select price,max(id) id
from books
group by price
having count(*) = 1
) b2 on b1.id = b2.id
Try the following query:
Sample data:
create table books(id int, name varchar(10), price int);
insert into books values
(1, 'book1', 19),
(2, 'book2', 19),
(3, 'book3', 30);
create table price_offer(id int, offer varchar(10), price int);
insert into price_offer values
(1, 'offer1', 19),
(2, 'offer2', 30);
Query:
select max(b.id)
from price_offer p
left join books b on b.price = p.price
where p.id is not null
group by b.price
having count(*) = 1;
If you want to avoid nesting queries where you have to use self-joins, you can use the window functions of MySQL 8.0.11, which are designed for exactly this kind of case.
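As a minimal sketch of the window-function approach for the books and price_offer tables above (assuming MySQL 8.0 or later; the aliases books_at_price and book_id are hypothetical), COUNT(*) OVER can tag each book with the number of books sharing its price, so that only unique prices are joined:

```sql
-- Sketch: join each offer only to a book whose price is unique in books.
SELECT po.id, po.offer, po.price, b.id AS book_id, b.name
FROM price_offer po
JOIN (
    SELECT id, name, price,
           COUNT(*) OVER (PARTITION BY price) AS books_at_price  -- books sharing this price
    FROM books
) b ON b.price = po.price
WHERE b.books_at_price = 1;
```

With the sample data this should leave only offer2 joined to book3, since two books share the price 19.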
I have two tables in MySQL; the first contains an ID and the name of some products. I have to get the cheapest combination of brand/market for each product. So, I've inserted some items into both tables:
UPDATE: Inserted new product (bed) with no 'Product_Brand_Market' to test LEFT JOIN.
UPDATE: Changed some product prices for better testing.
CREATE TABLE Product(
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(20) NOT NULL);
CREATE TABLE Product_Brand_Market(
product INT UNSIGNED,
market INT UNSIGNED, /*this will be a FOREIGN KEY*/
brand INT UNSIGNED, /*this will be a FOREIGN KEY*/
price DECIMAL(10,2) UNSIGNED NOT NULL,
PRIMARY KEY(product, market, brand),
CONSTRAINT FOREIGN KEY (product) REFERENCES Product(id));
INSERT INTO Product
(name) VALUES
('Chair'), /*will get id=1*/
('Table'), /*will get id=2*/
('Bed'); /*will get id=3*/
INSERT INTO Product_Brand_Market
(product, market, brand, price) VALUES
(1, 1, 1, 8.00), /*cheapest chair (brand=1, market=1)*/
(1, 1, 2, 8.50),
(1, 2, 1, 9.00),
(1, 2, 2, 9.50),
(2, 1, 1, 11.50),
(2, 1, 2, 11.00),
(2, 2, 1, 10.50),
(2, 2, 2, 10.00); /*cheapest table (brand=2, market=2)*/
/*no entries for bed, must return null*/
And tried the following code to get the desired values:
UPDATE: Changed INNER JOIN for LEFT JOIN.
SELECT p.id product, MIN(pbm.price) price, pbm.brand, pbm.market
FROM Product p
LEFT JOIN Product_Brand_Market pbm
ON p.id = pbm.product
GROUP BY p.id;
The returned price is OK, but I'm getting the wrong keys:
| product | price | brand | market |
|---------|-------|-------|--------|
| 1 | 8 | 1 | 1 |
| 2 | 10 | 1 | 1 |
| 3 | null | null | null |
So the only way I could think to solve it is with subqueries, but I had to use two subqueries to get both brand and market:
SELECT
p.id product,
(
SELECT pbm.brand
FROM Product_Brand_Market pbm
WHERE p.id = pbm.product
ORDER BY pbm.price
LIMIT 1
) as brand,
(
SELECT pbm.market
FROM Product_Brand_Market pbm
WHERE p.id = pbm.product
ORDER BY pbm.price
LIMIT 1
) as market
FROM Product p;
It returns the desired table:
| product | brand | market |
|---------|-------|--------|
| 1 | 1 | 1 |
| 2 | 2 | 2 |
| 3 | null | null |
But I want to know whether I really need these two similar subqueries, or whether there is a better way to do this in MySQL. Any ideas?
Use a correlated subquery with LIMIT 1 in the WHERE clause:
SELECT product, brand, market
FROM Product_Brand_Market pbm
WHERE (pbm.brand, pbm.market) = (
SELECT pbm1.brand, pbm1.market
FROM Product_Brand_Market pbm1
WHERE pbm1.product = pbm.product
ORDER BY pbm1.price ASC
LIMIT 1
)
This will return only one row per product, even if two or more of its rows share the same lowest price.
Demo: http://rextester.com/UIC44628
Update:
To get all products even if they have no entries in the Product_Brand_Market table, you will need a LEFT JOIN. Note that the condition should be moved to the ON clause.
SELECT p.id as product, pbm.brand, pbm.market
FROM Product p
LEFT JOIN Product_Brand_Market pbm
ON pbm.product = p.id
AND (pbm.brand, pbm.market) = (
SELECT pbm1.brand, pbm1.market
FROM Product_Brand_Market pbm1
WHERE pbm1.product = pbm.product
ORDER BY pbm1.price ASC
LIMIT 1
);
Demo: http://rextester.com/MGXN36725
The following query might make better use of your PK for the JOIN:
SELECT p.id as product, pbm.brand, pbm.market
FROM Product p
LEFT JOIN Product_Brand_Market pbm
ON (pbm.product, pbm.market, pbm.brand) = (
SELECT pbm1.product, pbm1.market, pbm1.brand
FROM Product_Brand_Market pbm1
WHERE pbm1.product = p.id
ORDER BY pbm1.price ASC
LIMIT 1
);
An index on Product_Brand_Market(product, price) should also help to improve the performance of the subquery.
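The index suggestion could be written as below, followed by a possible alternative for MySQL 8.0 or later that uses ROW_NUMBER() instead of a correlated subquery. This is a sketch under assumptions: the index name idx_pbm_product_price and the alias rn are made up, and the tie-break on market and brand is an arbitrary choice that is not part of the original answer.

```sql
-- Hypothetical index name; supports "cheapest row per product" lookups.
CREATE INDEX idx_pbm_product_price ON Product_Brand_Market (product, price);

-- Alternative sketch (MySQL 8.0+): rank rows per product by price and keep rank 1.
SELECT p.id AS product, r.brand, r.market, r.price
FROM Product p
LEFT JOIN (
    SELECT product, brand, market, price,
           ROW_NUMBER() OVER (
               PARTITION BY product
               ORDER BY price, market, brand   -- assumed tie-break for equal prices
           ) AS rn
    FROM Product_Brand_Market
) r ON r.product = p.id
   AND r.rn = 1;   -- kept in the ON clause so products without prices still appear
```

Because the rn = 1 condition sits in the ON clause rather than in WHERE, a product with no entries (the bed in the sample data) still comes back with NULL brand, market, and price.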
I have two tables:
CREATE TABLE instructions (
`id_instruction` INT(11),
`id_step` INT(11)
);
CREATE TABLE steps (
`id_instruction` INT(11),
`id_step` INT(11),
`val` VARCHAR(255)
);
One table contains instructions, another table contains steps. Each instruction may have many steps. Now, the data is:
INSERT INTO instructions (`id_instruction`, `id_step`) VALUES (1, 0), (1, 1), (1, 2);
INSERT INTO steps (`id_instruction`, `id_step`, `val` ) VALUES (1, 0, 'One'), (1, 0, 'Two'), (1, 0, 'Three'); /* step 0 */
INSERT INTO steps (`id_instruction`, `id_step`, `val` ) VALUES (1, 1, 'Five'), (1, 1, 'Six'), (1, 1, 'Seven'); /* step 1 */
INSERT INTO steps (`id_instruction`, `id_step`, `val` ) VALUES (1, 2, 'Eight'), (1, 2, 'Nine'), (1, 2, 'Ten'); /* step 2 */
For each instruction I want to have two concatenations - one which concatenates values from val column for the zero step, and another one which concatenates values from the same column for the largest step of the instruction. I know how to get the largest step and how to make a single group concatenation, but trying to do two concatenations, I get duplicates. Now, my query looks like this:
SELECT maxstep, i.id_instruction, i.id_step, GROUP_CONCAT(s.val) AS val_0
FROM instructions i
INNER JOIN (
SELECT MAX(id_step) AS maxstep, id_instruction FROM instructions i
GROUP BY i.id_instruction
) i2 ON i2.id_instruction = i.id_instruction
LEFT JOIN steps s ON s.id_instruction = i.id_instruction AND s.id_step = i.id_step
GROUP BY i.id_instruction, i.id_step
It just concatenates the values for each instruction-step pair. But I want one more concatenation, which would also concatenate the values for the maxstep. The desired result should look like this:
| maxstep | id_instruction | val_0 | val_1 |
| 2 | 1 | One,Two, Three | Eight, Nine, Ten |
PS. I do join instead of just MAX and grouping, because I want to use its value in additional joining for further concatenation.
What you're trying to do is called pivoting. In MySQL there's no built-in function for this, but you can do it like this:
SELECT maxstep, id_instruction,
MAX(CASE id_step WHEN 0 THEN val END) AS val_0,
MAX(CASE id_step WHEN 1 THEN val END) AS val_1,
MAX(CASE id_step WHEN 2 THEN val END) AS val_2
FROM (
SELECT maxstep, i.id_instruction, i.id_step, GROUP_CONCAT(s.val) AS val
FROM instructions i
INNER JOIN (
SELECT MAX(id_step) AS maxstep, id_instruction FROM instructions i
GROUP BY i.id_instruction
) i2 ON i2.id_instruction = i.id_instruction
LEFT JOIN steps s ON s.id_instruction = i.id_instruction AND s.id_step = i.id_step
GROUP BY i.id_instruction, i.id_step
) sq
GROUP BY maxstep, id_instruction
Result:
maxstep id_instruction val_0 val_1 val_2
-----------------------------------------------------------------
2 1 One,Two,Three Five,Six,Seven Ten,Eight,Nine
By changing the query a little so that the inner join only gets the highest step and by setting the outer query to only take id_step=0 you can get what you want.
SELECT maxstep, i.id_instruction,GROUP_CONCAT(s.val) AS val_0, val_1
FROM instructions i
INNER JOIN (
SELECT MAX(ins.id_step) AS maxstep, ins.id_instruction, GROUP_CONCAT(st.val) as val_1 FROM instructions ins
LEFT JOIN steps st ON st.id_instruction = ins.id_instruction AND st.id_step = ins.id_step
where (ins.id_instruction, ins.id_step) in (select id_instruction, max(id_step) from instructions group by id_instruction)
GROUP BY ins.id_instruction, ins.id_step
order by maxstep, ins.id_instruction, st.val
)
i2 ON i2.id_instruction = i.id_instruction
LEFT JOIN steps s ON s.id_instruction = i.id_instruction AND s.id_step = i.id_step
where i.id_step=0
GROUP BY i.id_instruction, i.id_step;
Result from the query with extended data now looks like
| maxstep | id_instruction | val_0 | val_1 |
| 2 | 1 | One,Two,Three | Eight,Nine,Ten |
| 3 | 2 | One,Two,Three | 21,22,23 |
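Another way to get both concatenations in a single pass is conditional aggregation inside GROUP_CONCAT, which skips NULL values. The following is a sketch rather than a drop-in replacement: it assumes every instruction has at least one row in steps, and it keeps the val_0 and val_1 aliases from the desired result.

```sql
-- Sketch: one GROUP_CONCAT for step 0, one for the highest step of each instruction.
SELECT i2.maxstep,
       s.id_instruction,
       GROUP_CONCAT(CASE WHEN s.id_step = 0 THEN s.val END) AS val_0,
       GROUP_CONCAT(CASE WHEN s.id_step = i2.maxstep THEN s.val END) AS val_1
FROM steps s
JOIN (
    -- highest step per instruction
    SELECT id_instruction, MAX(id_step) AS maxstep
    FROM instructions
    GROUP BY id_instruction
) i2 ON i2.id_instruction = s.id_instruction
GROUP BY s.id_instruction, i2.maxstep;
```

The order of values inside each concatenated string is not guaranteed unless an ORDER BY is added inside GROUP_CONCAT. If an instruction only has step 0, both columns contain the same values.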
I have two tables in my SQL database.
For example Table1 - ItemPrice:
DATETIME | ITEM | PRICE
2011-08-28 | ABC | 123
2011-09-01 | ABC | 125
2011-09-02 | ABC | 124
2011-09-03 | ABC | 127
2011-09-04 | ABC | 126
Table2 - DayScore:
DATETIME | ITEM | SCORE
2011-08-28 | ABC | 1
2011-08-29 | ABC | 8
2011-09-01 | ABC | 4
2011-09-02 | ABC | 2
2011-09-03 | ABC | 7
2011-09-04 | ABC | 3
I want to write a query which, given an item ID (e.g. ABC), will return the price at that date from ItemPrice (if there is no price for that date, the query should not return anything). If a valid price is found for the query date, the query should return (in 9 columns):
the price of the item from ItemPrice for the past three days (i.e. the most recent 3 prices before the date queried).
In the next three columns it should return, from DayScore, the matching score for those 3 dates selected from ItemPrice.
Finally the dates (t-1 to t-3) selected
In other words, the results for this query, looking at just date='2011-09-03' as an example for item='abc', would be:
DATE | ITEM | PRICE | SCR | PRC_t-1 | PRC_t-2 | PRC_t-3 | SCR_t-1 | SCR_t-2 | SCR_t-3 | DATE_t-1 | DATE_t-2 | DATE_t-3
2011-09-03| ABC | 127 | 7 | 124 | 125 | 123 | 2 | 4 | 1 | 2011-09-02| 2011-09-01| 2011-08-28
....
Etc for each date that appears in ItemPrice table.
What is the neatest and most efficient way to run this query (as it's something that will be run over many millions of rows)?
Cheers!
Not pretty, but it does produce the results. You could probably get rid of some subselects and trim the SQL a bit, but I tried to build it up in steps so you can deduce what it is doing.
The core part is this select:
SELECT
Sub2.*
, (Select MAX(IP3.DateTime) FROM ItemPrice IP3 where IP3.DateTime < T_2) AS T_3
FROM
(SELECT
Sub1.*
, (Select MAX(IP2.DateTime) FROM ItemPrice IP2 where IP2.DateTime < T_1) AS T_2
FROM
(SELECT
ItemPrice.DateTime
, (Select MAX(IP.DateTime) FROM ItemPrice IP where IP.DateTime < ItemPrice.DateTime) AS T_1
From ItemPrice) Sub1
) Sub2
This returns a table with the dates (now, t-1, t-2, t-3). From there it is a simple matter of joining with the price and score for each of those dates. The whole thing, including test data, then becomes this bulk of SQL:
/*
CREATE TABLE ItemPrice (datetime Date, item varchar(3), price int);
CREATE TABLE DayScore ( datetime Date, item varchar(3), score int);
INSERT INTO ItemPrice VALUES ('20110828', 'ABC', 123);
INSERT INTO ItemPrice VALUES ('20110901', 'ABC', 125);
INSERT INTO ItemPrice VALUES ('20110902', 'ABC', 124);
INSERT INTO ItemPrice VALUES ('20110903', 'ABC', 127);
INSERT INTO ItemPrice VALUES ('20110904', 'ABC', 126);
INSERT INTO DayScore VALUES ('20110828', 'ABC', 1);
INSERT INTO DayScore VALUES ('20110829', 'ABC', 8);
INSERT INTO DayScore VALUES ('20110901', 'ABC', 4);
INSERT INTO DayScore VALUES ('20110902', 'ABC', 2);
INSERT INTO DayScore VALUES ('20110903', 'ABC', 7);
INSERT INTO DayScore VALUES ('20110904', 'ABC', 3);
*/
SELECT Hist.*, Current.Item, Current.Price, Current.Score
, Minus1.Price as PRC_1, Minus1.Score SCR_1
, Minus2.Price as PRC_2, Minus2.Score SCR_2
, Minus3.Price as PRC_3, Minus3.Score SCR_3
FROM
(SELECT Sub2.*, (Select MAX(IP3.DateTime) FROM ItemPrice IP3 where IP3.DateTime < T_2) AS T_3
FROM
(SELECT Sub1.*, (Select MAX(IP2.DateTime) FROM ItemPrice IP2 where IP2.DateTime < T_1) AS T_2
FROM
(SELECT ItemPrice.DateTime, (Select MAX(IP.DateTime) FROM ItemPrice IP where IP.DateTime < ItemPrice.DateTime) AS T_1 From ItemPrice) Sub1) Sub2) Hist
INNER JOIN
(SELECT ItemPrice.DateTime, ItemPrice.Item, ItemPrice.Price, DayScore.Score FROM ItemPrice INNER JOIN DayScore ON (ItemPrice.Item = DayScore.Item AND ItemPrice.Datetime = DayScore.DateTime)) CURRENT
ON (Current.DateTime = Hist.DateTime)
LEFT JOIN
(SELECT ItemPrice.DateTime, ItemPrice.Price, DayScore.Score FROM ItemPrice INNER JOIN DayScore ON (ItemPrice.Item = DayScore.Item AND ItemPrice.Datetime = DayScore.DateTime)) MINUS1
ON (Minus1.DateTime = Hist.T_1)
LEFT JOIN
(SELECT ItemPrice.DateTime, ItemPrice.Price, DayScore.Score FROM ItemPrice INNER JOIN DayScore ON (ItemPrice.Item = DayScore.Item AND ItemPrice.Datetime = DayScore.DateTime)) MINUS2
ON (Minus2.DateTime = Hist.T_2)
LEFT JOIN
(SELECT ItemPrice.DateTime, ItemPrice.Price, DayScore.Score FROM ItemPrice INNER JOIN DayScore ON (ItemPrice.Item = DayScore.Item AND ItemPrice.Datetime = DayScore.DateTime)) MINUS3
ON (Minus3.DateTime = Hist.T_3)
WHERE Current.Item = 'ABC'
;
/*
DROP TABLE ItemPrice;
DROP TABLE DayScore;
*/
I'm curious about your explain plan when you do this on 1M rows :) It might not even be that horrible if you have the right indexes which you probably do.
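If the database supports window functions (for example MySQL 8.0 or later, which is an assumption since the question does not name a version), the three previous rows could be fetched with LAG instead of nested MAX subqueries. The following is a sketch using the test tables above; the window name w is made up, and the LEFT JOIN to DayScore is a design choice so that price dates without a score still count as previous dates.

```sql
-- Sketch: previous three prices, scores and dates per item via LAG.
SELECT ip.DateTime, ip.Item, ip.Price, ds.Score,
       LAG(ip.Price, 1)    OVER w AS PRC_1,
       LAG(ip.Price, 2)    OVER w AS PRC_2,
       LAG(ip.Price, 3)    OVER w AS PRC_3,
       LAG(ds.Score, 1)    OVER w AS SCR_1,
       LAG(ds.Score, 2)    OVER w AS SCR_2,
       LAG(ds.Score, 3)    OVER w AS SCR_3,
       LAG(ip.DateTime, 1) OVER w AS DATE_1,
       LAG(ip.DateTime, 2) OVER w AS DATE_2,
       LAG(ip.DateTime, 3) OVER w AS DATE_3
FROM ItemPrice ip
LEFT JOIN DayScore ds
       ON ds.Item = ip.Item
      AND ds.DateTime = ip.DateTime
WHERE ip.Item = 'ABC'
WINDOW w AS (PARTITION BY ip.Item ORDER BY ip.DateTime);
```

Rows near the start of an item's history get NULL in the lag columns where fewer than three earlier prices exist. Unlike the nested subqueries above, the window is partitioned by item, so dates from other items are not mixed in.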
In MySQL, I have two tables with a 1:n relationship.
Table items has products, whose state is kept in another table, like so:
items:
id |ref_num|name |...
1 |0001 |product1|...
2 |0002 |product2|...
items_states :
id|product_id|state_id|date
1 |1 |5 |2010-05-05 10:25:20
2 |1 |9 |2010-05-08 12:38:00
3 |1 |6 |2010-05-10 20:45:12
...
The states table is not relevant and only relates the state_id to the state name and so on.
How can I get products where the latest state is the one I specify, one item per row?
Thank you
You may want to try the following:
SELECT i.ref_num, i.name, s.latest_date
FROM items i
JOIN (
SELECT product_id, MAX(date) as latest_date
FROM items_states
GROUP BY product_id
) s ON (s.product_id = i.id);
If you want to return just one item, simply add a WHERE i.id = ? to the query.
Test case:
CREATE TABLE items (id int, ref_num varchar(10), name varchar(10));
CREATE TABLE items_states (id int, product_id int, state_id int, date datetime);
INSERT INTO items VALUES (1, '0001', 'product1');
INSERT INTO items VALUES (2, '0002', 'product2');
INSERT INTO items_states VALUES (1, 1, 5, '2010-05-05 10:25:20');
INSERT INTO items_states VALUES (2, 1, 9, '2010-05-08 12:38:00');
INSERT INTO items_states VALUES (3, 1, 6, '2010-05-10 20:45:12');
Result:
+---------+----------+---------------------+
| ref_num | name | latest_date |
+---------+----------+---------------------+
| 0001 | product1 | 2010-05-10 20:45:12 |
+---------+----------+---------------------+
1 row in set (0.02 sec)
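The query above returns the latest date per product but does not yet restrict the result to a given state. One way to add that, as a sketch, is to join back to items_states on the latest date and filter on state_id; the value 6 below is only an example taken from the test data.

```sql
-- Sketch: products whose most recent state row has the requested state.
SELECT i.ref_num, i.name, st.state_id, s.latest_date
FROM items i
JOIN (
    -- latest state date per product
    SELECT product_id, MAX(date) AS latest_date
    FROM items_states
    GROUP BY product_id
) s ON s.product_id = i.id
JOIN items_states st
  ON st.product_id = s.product_id
 AND st.date = s.latest_date      -- pick the row that carries the latest date
WHERE st.state_id = 6;            -- example state; replace with the desired one
```

With the test case above this returns product1, whose latest state is 6. If two state rows for a product share the same latest date, both would match the join, so a unique tie-breaker such as the id column may be needed.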
Either LEFT JOIN the items_states table to itself, requiring second.date > first.date, and put a WHERE second.id IS NULL clause in it:
SELECT a.*
FROM items_states a
LEFT JOIN items_states b
ON a.product_id = b.product_id
AND b.date > a.date
WHERE b.id IS NULL AND a.state_id = <desired state>
Or make a row based query: see Mark Byers' example.