I have tried to solve the following problem for the last couple of hours and could not find anything that pointed me in the right direction on Google or Stackoverflow. I believe that this could be a similar problem, but I did not really understand what the author wanted to achieve, hence I am trying it with my own concrete example:
I have a table that basically tracks prices of different products over time:
+------------+--------+----------+
| Product_id | Price | Time |
+------------+--------+----------+
| 1 | 1.30 | 13:00:00 |
| 1 | 1.10 | 13:30:00 |
| 1 | 1.50 | 14:00:00 |
| 1 | 1.60 | 14:30:00 |
| 2 | 2.10 | 13:00:00 |
| 2 | 2.50 | 13:30:00 |
| 2 | 1.90 | 14:00:00 |
| 2 | 2.00 | 14:30:00 |
| 3 | 1.45 | 13:00:00 |
| 3 | 1.15 | 13:30:00 |
| 3 | 1.50 | 14:00:00 |
| 3 | 1.55 | 14:30:00 |
+------------+--------+----------+
I would now like to query the table so that the rows with max. Price for each product are returned:
+------------+--------+----------+
| Product_id | Price | Time |
+------------+--------+----------+
| 1 | 1.60 | 14:30:00 |
| 2 | 2.50 | 13:30:00 |
| 3 | 1.55 | 14:30:00 |
+------------+--------+----------+
Also, in case of duplicates, i.e. if there is a max. Price at two different points in time, it should only return one row, preferably the one with the smallest value of time.
I have tried MAX() and GREATEST(), but could not achieve the desired outcome to show the wanted values for each product. Efficiency of the query is not the most important factor, but I have about 500 different products with several million rows of data, hence splitting the table by unique product did not seem like an appropriate solution.
Group the data by product id to pick the max price, then join back and take the min time among the rows that share that price:
select t1.product_id, t1.price, min(t1.time) as time
from your_table t1
inner join (
select product_id, max(price) as price
from your_table
group by product_id
) t2 on t1.product_id = t2.product_id and t1.price = t2.price
group by t1.product_id, t1.price
Sql Fiddle Example:
http://sqlfiddle.com/#!9/020c3/9
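To try the grouped join without a MySQL server, here is a minimal sketch that runs it against an in-memory SQLite database through Python's sqlite3 module. The table name prices and the use of SQLite instead of MySQL are assumptions made for illustration; the data is the sample from the question.

```python
import sqlite3

# Hypothetical in-memory copy of the question's table (named "prices" here).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (product_id INTEGER, price REAL, time TEXT)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [
        (1, 1.30, "13:00:00"), (1, 1.10, "13:30:00"),
        (1, 1.50, "14:00:00"), (1, 1.60, "14:30:00"),
        (2, 2.10, "13:00:00"), (2, 2.50, "13:30:00"),
        (2, 1.90, "14:00:00"), (2, 2.00, "14:30:00"),
        (3, 1.45, "13:00:00"), (3, 1.15, "13:30:00"),
        (3, 1.50, "14:00:00"), (3, 1.55, "14:30:00"),
    ],
)

# Inner query: max price per product. Outer query: earliest time at that price.
query = """
SELECT t1.product_id, t1.price, MIN(t1.time) AS time
FROM prices t1
JOIN (SELECT product_id, MAX(price) AS price
      FROM prices
      GROUP BY product_id) t2
  ON t1.product_id = t2.product_id AND t1.price = t2.price
GROUP BY t1.product_id, t1.price
ORDER BY t1.product_id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Grouping by both product_id and price keeps the query valid under MySQL's ONLY_FULL_GROUP_BY mode as well.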
http://sqlfiddle.com/#!9/020c3/1
SELECT p.*
FROM prices p
LEFT JOIN prices p1
ON p.product_id = p1.product_id
AND p.time<p1.time
WHERE p1.product_id IS NULL
If you need the row with the maximum price instead, you can use:
http://sqlfiddle.com/#!9/020c3/6
SELECT p.*
FROM prices p
LEFT JOIN prices p1
ON p.product_id = p1.product_id
AND p.price<p1.price
WHERE p1.product_id IS NULL;
And the last approach, since I didn't get the goal from the beginning:
http://sqlfiddle.com/#!9/ace04/2
SELECT p.*
FROM prices p
LEFT JOIN prices p1
ON p.product_id = p1.product_id
AND (
p.price<p1.price
OR (p.price=p1.price AND p.time>p1.time) -- on a price tie, keep the earliest time
)
WHERE p1.product_id IS NULL;
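A small sketch of this exclusion (anti) join with the tie broken toward the earliest time, which is what the question prefers. It assumes an in-memory SQLite table named prices, and the duplicated maximum for product 1 is made-up data added only to exercise the tie-break.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (product_id INTEGER, price REAL, time TEXT)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [
        # Hypothetical data: product 1 reaches its max price (1.60) twice.
        (1, 1.30, "13:00:00"), (1, 1.60, "13:30:00"), (1, 1.60, "14:30:00"),
        (2, 2.50, "13:30:00"), (2, 1.90, "14:00:00"),
    ],
)

# A row survives only if no other row of the same product "beats" it:
# a higher price, or the same price at an earlier time.
query = """
SELECT p.product_id, p.price, p.time
FROM prices p
LEFT JOIN prices p1
  ON p.product_id = p1.product_id
 AND (p.price < p1.price
      OR (p.price = p1.price AND p.time > p1.time))
WHERE p1.product_id IS NULL
ORDER BY p.product_id
"""
result = conn.execute(query).fetchall()
print(result)
```

Flipping the time comparison to p.time < p1.time would keep the latest of the tied rows instead.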
This solution assumes the existence of an additional my_table.id column, which is needed in case there are duplicate values for (Product_id, price, time) in your table. id is assumed to be unique in the table.
SELECT *
FROM my_table t1
WHERE NOT EXISTS (
SELECT *
FROM my_table t2
WHERE t1.Product_id = t2.Product_id
AND ((t1.price < t2.price) OR
(t1.price = t2.price AND t1.time > t2.time) OR
(t1.price = t2.price AND t1.time = t2.time AND t1.id > t2.id))
)
Alternatively, the predicate on price and time could also be expressed using a row value expression predicate (not sure if it's more readable, as t1 and t2 columns are mixed in each row value expression):
SELECT *
FROM my_table t1
WHERE NOT EXISTS (
SELECT *
FROM my_table t2
WHERE t1.Product_id = t2.Product_id
AND (t1.price, t2.time, t2.id) < (t2.price, t1.time, t1.id)
)
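The row value form can be checked on a tiny data set. The sketch below assumes SQLite (which supports row value comparisons since version 3.15) in place of the original database, and the rows, including the exact duplicate that only differs by id, are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table ("
    "id INTEGER PRIMARY KEY, product_id INTEGER, price REAL, time TEXT)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1.60, "14:30:00"),
        (2, 1, 1.60, "13:30:00"),
        (3, 1, 1.60, "13:30:00"),  # same (product, price, time) as id 2
        (4, 2, 2.50, "13:30:00"),
        (5, 2, 2.10, "13:00:00"),
    ],
)

# Rows compare left to right: higher price wins, then earlier time,
# then smaller id. t1 is kept when no t2 ranks strictly ahead of it.
query = """
SELECT t1.id, t1.product_id, t1.price, t1.time
FROM my_table t1
WHERE NOT EXISTS (
    SELECT 1
    FROM my_table t2
    WHERE t1.product_id = t2.product_id
      AND (t1.price, t2.time, t2.id) < (t2.price, t1.time, t1.id)
)
ORDER BY t1.product_id
"""
result = conn.execute(query).fetchall()
print(result)
```

The id column is what guarantees a single row per product even when price and time are both duplicated.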
Related
The table looks like this:
group_id | time                | data  | others...
1        | 2020-11-30 12:00:00 | 13    |
1        | 2020-11-30 13:00:00 | 15    |
2        | 2020-11-30 12:30:00 | 254   |
3        | 2021-02-21 18:00:00 | 25565 |
...
I want to get the "data" at the "latest time" (which can be different for each group) for each specified group,
so I wrote something like
SELECT group_id, max(time), data
where group_id between (...)
(and others = ...)
group by group_id
but its data was not the value at that time (it actually seems to come from the latest record).
How can I get the data of maximum time for each group?
Since you are using a primary key on group_id, time, you can use a standard exclusion join, as opposed to a GROUP BY, since the records are unambiguous.
In this instance, to obtain the group_id and MAX(time) with the corresponding other values, you would LEFT JOIN table AS b ON a.group_id = b.group_id AND a.time < b.time, then remove the undesired entries by filtering the left join results with WHERE b.group_id IS NULL.
Exclusion Join
DB-Fiddle
SELECT t1.group_id, t1.`time`, t1.`data`
FROM `table` AS t1
LEFT JOIN `table` AS t2
ON t2.group_id = t1.`group_id`
AND t1.`time` < t2.`time`
WHERE t1.group_id BETWEEN ...
AND t1.others = ...
AND t2.group_id IS NULL;
Result:
| group_id | time | data |
| -------- | ------------------- | ----- |
| 1 | 2020-11-30 13:00:00 | 15 |
| 2 | 2020-11-30 12:30:00 | 254 |
| 3 | 2021-02-21 18:00:00 | 25565 |
Alternatively, since you have group_id, time as your primary key, you can use a compound IN subquery criteria with the MAX(time).
Compound IN Subquery
DB-Fiddle
SELECT t1.group_id, t1.`time`, t1.`data`
FROM `table` AS t1
WHERE t1.others = ....
AND (t1.group_id, t1.`time`) IN(
SELECT group_id, MAX(`time`)
FROM `table`
WHERE group_id BETWEEN ....
GROUP BY group_id
);
Result:
| group_id | time | data |
| -------- | ------------------- | ----- |
| 1 | 2020-11-30 13:00:00 | 15 |
| 2 | 2020-11-30 12:30:00 | 254 |
| 3 | 2021-02-21 18:00:00 | 25565 |
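The compound IN form can be reproduced with the sample rows. This is a sketch under a few assumptions: SQLite stands in for MySQL, the table is called readings because table is a reserved word, the elided "others" filter is left out, and the range 1 to 3 for group_id is a placeholder for the range elided in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings ("
    "group_id INTEGER, time TEXT, data INTEGER, "
    "PRIMARY KEY (group_id, time))"
)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        (1, "2020-11-30 12:00:00", 13),
        (1, "2020-11-30 13:00:00", 15),
        (2, "2020-11-30 12:30:00", 254),
        (3, "2021-02-21 18:00:00", 25565),
    ],
)

# The subquery yields one (group_id, latest time) pair per group;
# the outer query fetches the full row matching each pair.
query = """
SELECT group_id, time, data
FROM readings
WHERE (group_id, time) IN (
    SELECT group_id, MAX(time)
    FROM readings
    WHERE group_id BETWEEN 1 AND 3  -- placeholder range
    GROUP BY group_id
)
ORDER BY group_id
"""
result = conn.execute(query).fetchall()
print(result)
```

Because (group_id, time) is the primary key, each pair matches exactly one row, so no tie-breaking is needed.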
I found an answer myself using a JOIN:
with tmp as (
SELECT group_id, max(time) as time FROM table
WHERE group_id between (...)
(and others = ...)
group by group_id
)
SELECT r.data FROM table as r
INNER JOIN tmp ON r.group_id = tmp.group_id and r.time = tmp.time
where r.group_id between (...)
(and others = ...)
I am trying to get the tracking number from a customer's most recent order, but I am having trouble using MAX.
This just keeps returning nothing, even though I know table2 has values in there with dates. What's wrong with my query?
SELECT
t1.Invoice_Num,
t1.Tracking_Num
FROM
table1 t1
JOIN
table2 t2a on t1.Invoice_Num = t2a.Invoice_Num
JOIN (
SELECT
t2b.Invoice_Num,
MAX(t2b.Invoice_Date) Last_Sale
FROM
table2 t2b
WHERE
t2b.Customer_Num = 'cust1'
GROUP BY t2b.Invoice_Num
) LS
on t1.Invoice_Num = LS.Invoice_Num
--------------------------------------------------
Table1
+-------------+--------------+
| Invoice_Num | Tracking_Num |
+-------------+--------------+
| abc123 | 12345678 |
| def456 | 87654321 |
+-------------+--------------+
Table2
+-------------+--------------+--------------+
| Invoice_Num | Customer_Num | Invoice_Date |
+-------------+--------------+--------------+
| abc123 | cust1 | 10/25/2017 |
| def456 | cust1 | 10/24/2017 |
+-------------+--------------+--------------+
Desired output is -
+-------------+--------------+
| Invoice_Num | Tracking_Num |
+-------------+--------------+
| abc123 | 12345678 |
+-------------+--------------+
based on the most recent Invoice_Date of cust1
Here's a generic alternative approach that can come in handy:
use ORDER BY .. DESC and LIMIT 1:
SELECT
t1.Invoice_Num,
t1.Tracking_Num
FROM table1 t1
JOIN table2 t2 USING(Invoice_Num)
WHERE t2.Customer_Num = 'cust1'
ORDER BY t2.Invoice_Date DESC
LIMIT 1
SQL Fiddle
SQL DEMO
SELECT
t1.Invoice_Num,
t1.Tracking_Num
FROM table1 t1
JOIN table2 t2
ON t1.Invoice_Num = t2.Invoice_Num
JOIN ( SELECT MAX(t2b.Invoice_Date) Last_Sale
FROM table2 t2b
WHERE t2b.Customer_Num = 'cust1'
) LS
ON t2.Invoice_Date = LS.Last_Sale
WHERE t2.Customer_Num = 'cust1'
Be careful: if multiple rows share the Last Sale date, you will get multiple rows.
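The difference between the two approaches shows up once two invoices share the latest date. The sketch below assumes SQLite via Python's sqlite3, ISO-formatted dates so that string ordering matches date ordering, and a made-up third invoice (ghi789) that ties with abc123.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (Invoice_Num TEXT, Tracking_Num TEXT);
CREATE TABLE table2 (Invoice_Num TEXT, Customer_Num TEXT, Invoice_Date TEXT);
INSERT INTO table1 VALUES
    ('abc123', '12345678'), ('def456', '87654321'), ('ghi789', '11223344');
INSERT INTO table2 VALUES
    ('abc123', 'cust1', '2017-10-25'),
    ('def456', 'cust1', '2017-10-24'),
    ('ghi789', 'cust1', '2017-10-25');
""")

# Joining on the maximum date returns every invoice dated on that day.
join_on_max = conn.execute("""
SELECT t1.Invoice_Num, t1.Tracking_Num
FROM table1 t1
JOIN table2 t2 ON t1.Invoice_Num = t2.Invoice_Num
JOIN (SELECT MAX(Invoice_Date) AS Last_Sale
      FROM table2
      WHERE Customer_Num = 'cust1') LS
  ON t2.Invoice_Date = LS.Last_Sale
WHERE t2.Customer_Num = 'cust1'
ORDER BY t1.Invoice_Num
""").fetchall()

# ORDER BY ... LIMIT 1 always returns a single row; the second sort key
# makes the choice among tied dates deterministic.
order_limit = conn.execute("""
SELECT t1.Invoice_Num, t1.Tracking_Num
FROM table1 t1
JOIN table2 t2 ON t1.Invoice_Num = t2.Invoice_Num
WHERE t2.Customer_Num = 'cust1'
ORDER BY t2.Invoice_Date DESC, t1.Invoice_Num
LIMIT 1
""").fetchall()

print(join_on_max)
print(order_limit)
```

Which behaviour is right depends on whether all orders from the latest day are wanted or exactly one.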
For every ID_Number, there is a bill_date and then two types of bills that happen. I want to return the latest date (max date) for each ID number and then add together the two types of bill amounts. So, based on the table below, it should return:
| 1 | 201604 | 10.00 |
| 2 | 201701 | 28.00 |
tbl_charges
+-----------+-----------+-----------+--------+
| ID_Number | Bill_Date | Bill_Type | Amount |
+-----------+-----------+-----------+--------+
| 1 | 201601 | A | 5.00 |
| 1 | 201601 | B | 7.00 |
| 1 | 201604 | A | 4.00 |
| 1 | 201604 | B | 6.00 |
| 2 | 201701 | A | 15.00 |
| 2 | 201701 | B | 13.00 |
+-----------+-----------+-----------+--------+
Then, if possible, I want to be able to do this in a join in another query, using ID_Number as the column for the join. Would that change the query here?
Note: I am initially only wanting to run the query for about 200 distinct ID_Numbers out of about 10 million. I will be adding an 'IN' clause for those IDs. When I do the join for the final product, I will need to know how to get those latest dates out of all the other join possibilities. (ie, how do I get ID_Number 1 to join with 201604 and not 201601?)
I would use NOT EXISTS and GROUP BY
select t1.id_number, max(t1.bill_date), sum(t1.amount)
from tbl_charges t1
where not exists (
select 1
from tbl_charges t2
where t1.id_number = t2.id_number and
t1.bill_date < t2.bill_date
)
group by t1.id_number
The NOT EXISTS filters out the rows that are not on the latest bill date, and the GROUP BY does the sum.
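Run against the sample tbl_charges rows, the NOT EXISTS plus GROUP BY query gives the totals the question expects. This is a minimal sketch assuming SQLite through Python's sqlite3; the column types are guesses, since the question does not state them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_charges (
    ID_Number INTEGER, Bill_Date INTEGER, Bill_Type TEXT, Amount REAL);
INSERT INTO tbl_charges VALUES
    (1, 201601, 'A', 5.00), (1, 201601, 'B', 7.00),
    (1, 201604, 'A', 4.00), (1, 201604, 'B', 6.00),
    (2, 201701, 'A', 15.00), (2, 201701, 'B', 13.00);
""")

# Keep only rows for which no later bill date exists for the same ID,
# then sum the surviving amounts per ID.
query = """
SELECT t1.id_number, MAX(t1.bill_date), SUM(t1.amount)
FROM tbl_charges t1
WHERE NOT EXISTS (
    SELECT 1
    FROM tbl_charges t2
    WHERE t1.id_number = t2.id_number
      AND t1.bill_date < t2.bill_date
)
GROUP BY t1.id_number
ORDER BY t1.id_number
"""
result = conn.execute(query).fetchall()
print(result)
```

To restrict the work to a subset of IDs, an IN list on t1.id_number can be added to the outer WHERE clause.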
I would be inclined to filter in the where:
select id_number, sum(c.amount)
from tbl_charges c
where c.bill_date = (select max(c2.bill_date)
from tbl_charges c2
where c2.id_number = c.id_number and c2.bill_type = c.bill_type
)
group by id_number;
Or, another fun way is to use IN with tuples:
select id_number, sum(c.amount)
from tbl_charges c
where (c.id_number, c.bill_type, c.bill_date) in
(select c2.id_number, c2.bill_type, max(c2.bill_date)
from tbl_charges c2
group by c2.id_number, c2.bill_type
)
group by id_number;
I am faced with a complicated problem of taking the difference between values in different rows.
The Sales column shows total sales and is updated automatically. I would like to create a table with a SalesUpdate column that holds the difference between the two most recent Sales values for each product in TABLE 1.
TABLE 1.
№ | Date | Product | Sales
----------------------------------------
1 | 2017-03-01 | Coke | 10
2 | 2017-03-02 | Pepsi | 9
3 | 2017-03-03 | Tea | 12
4 | 2017-03-04 | Coke | 20
5 | 2017-03-05 | Coke | 22
6 | 2017-03-06 | Pepsi | 15
TABLE 2.
№ | Product | Date | SalesUpdate
---------------------------------------------------------------
1 | Coke | 2017-03-01 | 22-20 = 2
2 | Pepsi | 2017-03-02 | 15-9 = 6
3 | Tea | 2017-03-03 | 12-0 = 12
Not a very elegant solution, but at least it is not DBMS-specific :) Also, I didn't get which data you want in the Date column of the result set.
SELECT md.product,
md.max_Date Date,
(t1.sales - CASE WHEN t2.sales IS NULL THEN 0 ELSE t2.sales END) SalesUpdate
FROM (SELECT MAX(DATE) max_date,
product
FROM TABLE1
GROUP BY PRODUCT) md
INNER JOIN TABLE1 t1 ON md.product = t1.product AND md.max_date = t1.DATE
LEFT JOIN TABLE1 t2 ON t2.product = t1.product AND t2.date < t1.date
LEFT JOIN TABLE1 t3 ON t2.product = t3.product AND t3.date > t2.date AND t3.date < t1.date
WHERE t3.product IS NULL
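The chain of joins can be followed on the sample data. The sketch below assumes SQLite through Python's sqlite3, uses COALESCE as shorthand for the CASE expression, and writes the t3 join with t3.product on the left, which is equivalent. It returns the latest date per product, since the intended Date column is unclear.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (date TEXT, product TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        ("2017-03-01", "Coke", 10), ("2017-03-02", "Pepsi", 9),
        ("2017-03-03", "Tea", 12), ("2017-03-04", "Coke", 20),
        ("2017-03-05", "Coke", 22), ("2017-03-06", "Pepsi", 15),
    ],
)

# md: latest date per product.  t1: the row on that date.
# t2: any earlier row of the product.  t3: a row strictly between t2 and t1.
# Requiring t3 to be missing leaves only the row immediately before t1.
query = """
SELECT md.product, md.max_date,
       t1.sales - COALESCE(t2.sales, 0) AS sales_update
FROM (SELECT product, MAX(date) AS max_date
      FROM table1
      GROUP BY product) md
JOIN table1 t1 ON md.product = t1.product AND md.max_date = t1.date
LEFT JOIN table1 t2 ON t2.product = t1.product AND t2.date < t1.date
LEFT JOIN table1 t3 ON t3.product = t2.product
                   AND t3.date > t2.date AND t3.date < t1.date
WHERE t3.product IS NULL
ORDER BY md.product
"""
result = conn.execute(query).fetchall()
print(result)
```

Tea has no earlier row, so t2 is NULL and the difference falls back to 12 - 0.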
I have two tables, one is a products table, and the other is an offers table for different vendors offering the same product.
Table1:
sku name
----|-----
a | Iphone
b | Galaxy 5
c | Nexus 6
Table2:
sku price vendor
----|-------|--------
a | 5.00 | storeX
a | 6.00 | storeY
a | 7.00 | storeZ
b | 15.00 | storeP
b | 20.00 | storeQ
b | 30.00 | storeR
c | 11.00 | storeD
c | 12.00 | storeF
c | 13.00 | storeG
I am trying to run a SELECT on these tables so I can get the lowest offer for each item. So my result would be:
sku price vendor
----|--------|--------
a | 5.00 | storeX
b | 15.00 | storeP
c | 11.00 | storeD
I have tried SELECT table1.sku,table2.price FROM table2 JOIN table1 ON table2.sku = table1.sku WHERE table2.sku IN ('a','b','c');
But that just gives me all offers. Any help with this query is appreciated.
You need two queries: one to determine the min price per product, and then a parent query to select the other fields related to that min price:
SELECT table2.*, table1.name
FROM table2
INNER JOIN (
SELECT MIN(price) AS min_price, sku
FROM table2
GROUP BY sku
) AS child ON ((table2.price = child.min_price) AND (table2.sku = child.sku))
LEFT JOIN table1 ON (table2.sku = table1.sku)
This also has the advantage of showing if multiple stores have the same minimum price.
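Here is a sketch of the lowest-offer lookup using an inner join against the per-sku minimum, so that only the cheapest offers remain. It assumes SQLite through Python's sqlite3, and the storeW offer is an invented row added to show two vendors tying on the minimum price.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (sku TEXT, name TEXT);
CREATE TABLE table2 (sku TEXT, price REAL, vendor TEXT);
INSERT INTO table1 VALUES ('a', 'Iphone'), ('b', 'Galaxy 5'), ('c', 'Nexus 6');
INSERT INTO table2 VALUES
    ('a', 5.00, 'storeX'), ('a', 6.00, 'storeY'), ('a', 7.00, 'storeZ'),
    ('b', 15.00, 'storeP'), ('b', 20.00, 'storeQ'), ('b', 30.00, 'storeR'),
    ('c', 11.00, 'storeD'), ('c', 12.00, 'storeF'), ('c', 13.00, 'storeG'),
    ('a', 5.00, 'storeW');
""")

# m holds the minimum price per sku; the inner join drops every offer
# that is not at that minimum, and table1 supplies the product name.
query = """
SELECT o.sku, o.price, o.vendor, p.name
FROM table2 o
JOIN (SELECT sku, MIN(price) AS min_price
      FROM table2
      GROUP BY sku) m
  ON o.sku = m.sku AND o.price = m.min_price
LEFT JOIN table1 p ON p.sku = o.sku
ORDER BY o.sku, o.vendor
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Both storeW and storeX are listed for sku a, which is the tie behaviour described above.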