Mysql Join get Latest value of each group - mysql

Mysql version: 8.0.21
I am looking to get the latest value in "TableData" for each group that has the type "fruit".
Table Name: TableNames
_________________________________________
| id | name | id_group | type |
|-----------------------------------------|
| 0 | AppleGroup | apple | fruit |
| 1 | BananaGroup | banana | fruit |
| 2 | OtherGroup | other | other |
Table Name: TableData
__________________________
| id | id_group | value |
|--------------------------|
| 0 | apple | 12 |
| 1 | banana | 8 |
| 2 | apple | 3 | <--get latest
| 3 | banana | 14 |
| 4 | banana | 4 | <--get latest
With this query I get all the items, but I am looking for the latest of each.
I already tried GROUP BY and ORDER BY, but the problem is that I first need to order and then group, and that does not seem to be possible in MySQL.
SELECT
n.name,
d.value
FROM TableNames n
INNER JOIN
(
SELECT *
FROM TableData
) d ON d.`id_group` = n.`id_group`
WHERE type = 'fruit'
Expected output:
_____________________
| name | value |
|---------------------|
| AppleGroup | 3 |
| BananaGroup | 4 |

If you cannot use ROW_NUMBER() because you are on an older version of MySQL (before 8.0), you can join against a subquery that finds the MAX(id) of each group:
SELECT
TableNames.name,
TableData.value
FROM
TableData
INNER JOIN (
SELECT
id_group,
MAX(id) AS max_id
FROM TableData
GROUP BY id_group) x ON x.id_group = TableData.id_group
INNER JOIN TableNames ON TableNames.id_group = TableData.id_group
WHERE x.max_id = TableData.id
AND TableNames.type = 'fruit'
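The MAX(id) join can be checked against the question's sample data. What follows is only a minimal sketch: it assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, and the `max_id` alias and the `latest_rows` variable are names chosen for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TableNames (id INTEGER PRIMARY KEY, name TEXT, id_group TEXT, type TEXT);
CREATE TABLE TableData  (id INTEGER PRIMARY KEY, id_group TEXT, value INTEGER);
INSERT INTO TableNames VALUES
  (0, 'AppleGroup',  'apple',  'fruit'),
  (1, 'BananaGroup', 'banana', 'fruit'),
  (2, 'OtherGroup',  'other',  'other');
INSERT INTO TableData VALUES
  (0, 'apple', 12), (1, 'banana', 8), (2, 'apple', 3),
  (3, 'banana', 14), (4, 'banana', 4);
""")

# The subquery finds the highest id per group; joining back on that id
# keeps exactly one (the latest) TableData row for each group.
latest_rows = conn.execute("""
SELECT n.name, d.value
FROM TableData d
INNER JOIN (
    SELECT id_group, MAX(id) AS max_id
    FROM TableData
    GROUP BY id_group
) x ON x.id_group = d.id_group AND x.max_id = d.id
INNER JOIN TableNames n ON n.id_group = d.id_group
WHERE n.type = 'fruit'
ORDER BY n.name
""").fetchall()

print(latest_rows)
```

On the sample data this should return one row per fruit group, matching the expected output in the question.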

On MySQL 8+, we can use ROW_NUMBER():
WITH cte AS (
SELECT tn.name, tn.id_group, td.value,
ROW_NUMBER() OVER (PARTITION BY td.id_group ORDER BY td.id DESC) rn
FROM TableNames tn
INNER JOIN TableData td
ON td.id_group = tn.id_group
WHERE tn.type = 'fruit'
)
SELECT name, value
FROM cte
WHERE rn = 1
ORDER BY id_group;
For optimization, you may consider adding the following index to the TableData table:
CREATE INDEX idx ON TableData (id_group);
Note that on InnoDB, MySQL automatically appends the primary key columns to every secondary index, so if id is the primary key the index is effectively (id_group, id). This should let MySQL do the join in the CTE efficiently and also compute ROW_NUMBER().
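The ROW_NUMBER() query can be tried out in the same way. This is a sketch under assumptions: SQLite (3.25 or later, for window function support) stands in for MySQL 8, and the index is created only to mirror the suggestion above, without any claim about how SQLite uses it.

```python
import sqlite3

# Assumption: SQLite >= 3.25 stands in for MySQL 8+ (both support ROW_NUMBER()).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TableNames (id INTEGER PRIMARY KEY, name TEXT, id_group TEXT, type TEXT);
CREATE TABLE TableData  (id INTEGER PRIMARY KEY, id_group TEXT, value INTEGER);
CREATE INDEX idx ON TableData (id_group);
INSERT INTO TableNames VALUES
  (0, 'AppleGroup',  'apple',  'fruit'),
  (1, 'BananaGroup', 'banana', 'fruit'),
  (2, 'OtherGroup',  'other',  'other');
INSERT INTO TableData VALUES
  (0, 'apple', 12), (1, 'banana', 8), (2, 'apple', 3),
  (3, 'banana', 14), (4, 'banana', 4);
""")

# rn = 1 marks the row with the highest id inside each id_group partition.
rows = conn.execute("""
WITH cte AS (
    SELECT tn.name, tn.id_group, td.value,
           ROW_NUMBER() OVER (PARTITION BY td.id_group ORDER BY td.id DESC) AS rn
    FROM TableNames tn
    INNER JOIN TableData td ON td.id_group = tn.id_group
    WHERE tn.type = 'fruit'
)
SELECT name, value
FROM cte
WHERE rn = 1
ORDER BY id_group
""").fetchall()

print(rows)
```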

Related

Select Unique Rows Based on Single Distinct Column - MySQL

I want to select rows that have a distinct email, see the example table below:
Table Name = Users
+----+---------+-------------------+-------------+
| id | title | email | commentname |
+----+---------+-------------------+-------------+
| 3 | test | rob#hotmail.com | rob |
| 4 | i agree | rob#hotmail.com | rob |
| 5 | its ok | rob#hotmail.com | rob |
| 6 | hey | rob#hotmail.com | rob |
| 7 | nice! | simon#hotmail.com | simon |
| 8 | yeah | john#hotmail.com | john |
+----+---------+-------------------+-------------+
The desired result would be:
+----+-------+-------------------+-------------+
| id | title | email | commentname |
+----+-------+-------------------+-------------+
| 5 | its ok| rob#hotmail.com | rob |
| 7 | nice! | simon#hotmail.com | simon |
| 8 | yeah | john#hotmail.com | john |
+----+-------+-------------------+-------------+
The distinct row should be the latest entry in the table (for example, id = 6).
What would be the required SQL?
If you are using MySQL 5.7 or earlier, then you may join your table to a subquery which finds the most recent record for each email:
SELECT t1.id, t1.title, t1.email, t1.commentname
FROM yourTable t1
INNER JOIN
(
SELECT email, MAX(id) AS latest_id
FROM yourTable
GROUP BY email
) t2
ON t1.email = t2.email AND t1.id = t2.latest_id;
If you are using MySQL 8+, then just use ROW_NUMBER here:
WITH cte AS (
SELECT id, title, email, commentname,
ROW_NUMBER() OVER (PARTITION BY email ORDER BY id DESC) rn
FROM yourTable
)
SELECT id, title, email, commentname
FROM cte
WHERE rn = 1;
Note: Your expected output probably has a problem, and the id = 6 record is the latest for rob#hotmail.com.
You can try the following, using a correlated subquery:
select * from table1 a
where id in (select max(id) from table1 b where a.email=b.email group by b.email)
If 'id' is unique or the primary key, you could use this one:
select * from Users where id in (select max(id) from Users group by email)
This one may perform better, because a correlated subquery uses information from the outer query and so may be executed once for every row in the outer query. So I suggest using this answer if 'id' is unique.
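To see that the correlated and the non-correlated forms select the same rows, both can be run against the Users sample data. This is a rough sketch, assuming SQLite via Python's sqlite3 module instead of MySQL; it says nothing about relative performance, which depends on the optimizer and the indexes.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the data is the question's sample.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (id INTEGER PRIMARY KEY, title TEXT, email TEXT, commentname TEXT);
INSERT INTO Users VALUES
  (3, 'test',    'rob#hotmail.com',   'rob'),
  (4, 'i agree', 'rob#hotmail.com',   'rob'),
  (5, 'its ok',  'rob#hotmail.com',   'rob'),
  (6, 'hey',     'rob#hotmail.com',   'rob'),
  (7, 'nice!',   'simon#hotmail.com', 'simon'),
  (8, 'yeah',    'john#hotmail.com',  'john');
""")

# Correlated: the inner query refers to the outer row's email.
correlated = [r[0] for r in conn.execute("""
SELECT id FROM Users a
WHERE id = (SELECT MAX(id) FROM Users b WHERE b.email = a.email)
ORDER BY id
""")]

# Non-correlated: the inner query is independent of the outer row.
uncorrelated = [r[0] for r in conn.execute("""
SELECT id FROM Users
WHERE id IN (SELECT MAX(id) FROM Users GROUP BY email)
ORDER BY id
""")]

print(correlated, uncorrelated)
```

Both lists contain id 6 for rob#hotmail.com, which is consistent with the note that the expected output in the question probably should show id = 6 rather than id = 5.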

MySQL: Enumerate and count

I have two tables, table1 and table2.
Example of the table1 table.
^ invoice ^ valid ^
| 10 | yes |
| 11 | yes |
| 12 | no |
Example of the table2 table
^ invoice ^ detail ^
| 10 | A |
| 10 | C |
| 10 | F |
| 11 | A |
| 11 | F |
| 10 | E |
| 12 | A |
Want to select from table 2 all rows that:
Have a valid invoice in table 1
And enumerate:
the detail for each invoice
the invoice
Here the desired result
^ invoice ^ detail ^ ordination ^ ordinationb ^
| 10 | A | 1 | 1 |
| 10 | C | 2 | 1 |
| 10 | F | 3 | 1 |
| 11 | A | 1 | 2 |
| 11 | F | 2 | 2 |
| 10 | E | 4 | 1 |
The statement should be valid for use in phpMyAdmin 4.8.4.
Here is the MySQL 8+ way of doing this:
SELECT
t2.Invoice,
t2.detail,
ROW_NUMBER() OVER (PARTITION BY t2.Invoice ORDER BY t2.detail) ordination,
DENSE_RANK() OVER (ORDER BY t2.Invoice) ordinationb
FROM table2 t2
WHERE EXISTS (SELECT 1 FROM table1 t1 WHERE t1.Invoice = t2.Invoice AND t1.valid = 'yes');
If you are using a version of MySQL earlier than 8, then you might have to resort to using session variables. This can lead to an ugly query. If you have a long-term need for queries like this one, then I recommend upgrading to MySQL 8+.
Edit:
It just dawned on me that we can use correlated subqueries to simulate both your ROW_NUMBER and DENSE_RANK requirements. Here is one way to do this query in MySQL 5.7 or earlier:
SELECT
t2.Invoice,
t2.detail,
(SELECT COUNT(*) FROM table2 t
WHERE t.Invoice = t2.Invoice AND t.detail <= t2.detail) ordination,
t.dr AS ordinationb
FROM table2 t2
INNER JOIN
(
SELECT DISTINCT
t2.Invoice,
(SELECT COUNT(*)
FROM (SELECT DISTINCT Invoice FROM table2) t
WHERE t.Invoice <= t2.Invoice) dr
FROM table2 t2
) t
ON t.Invoice = t2.Invoice
WHERE EXISTS (SELECT 1 FROM table1 t1 WHERE t1.Invoice = t2.Invoice AND t1.valid = 'yes')
ORDER BY
t2.Invoice,
t2.detail;
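The window function version can be exercised on the sample invoices. The sketch below assumes SQLite 3.25+ (via Python's sqlite3 module) in place of MySQL 8. Note that numbering by detail ranks E third within invoice 10; the question's table, which numbers E fourth, follows insertion order, and reproducing that would need an explicit ordering column that the sample data does not have.

```python
import sqlite3

# Assumption: SQLite >= 3.25 stands in for MySQL 8+.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (invoice INTEGER, valid TEXT);
CREATE TABLE table2 (invoice INTEGER, detail TEXT);
INSERT INTO table1 VALUES (10, 'yes'), (11, 'yes'), (12, 'no');
INSERT INTO table2 VALUES
  (10, 'A'), (10, 'C'), (10, 'F'), (11, 'A'), (11, 'F'), (10, 'E'), (12, 'A');
""")

# ROW_NUMBER() restarts for every invoice; DENSE_RANK() numbers the invoices.
# Window functions are evaluated after WHERE, so invoice 12 is never ranked.
numbered = conn.execute("""
SELECT t2.invoice,
       t2.detail,
       ROW_NUMBER() OVER (PARTITION BY t2.invoice ORDER BY t2.detail) AS ordination,
       DENSE_RANK() OVER (ORDER BY t2.invoice) AS ordinationb
FROM table2 t2
WHERE EXISTS (SELECT 1 FROM table1 t1
              WHERE t1.invoice = t2.invoice AND t1.valid = 'yes')
ORDER BY t2.invoice, t2.detail
""").fetchall()

for row in numbered:
    print(row)
```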

MySQL - Return Latest Date and Total Sum from two rows in a column for multiple entries

For every ID_Number, there is a bill_date and then two types of bills that happen. I want to return the latest date (max date) for each ID number and then add together the two types of bill amounts. So, based on the table below, it should return:
| 1 | 201604 | 10.00 | |
| 2 | 201701 | 28.00 | |
tbl_charges
+-----------+-----------+-----------+--------+
| ID_Number | Bill_Date | Bill_Type | Amount |
+-----------+-----------+-----------+--------+
| 1 | 201601 | A | 5.00 |
| 1 | 201601 | B | 7.00 |
| 1 | 201604 | A | 4.00 |
| 1 | 201604 | B | 6.00 |
| 2 | 201701 | A | 15.00 |
| 2 | 201701 | B | 13.00 |
+-----------+-----------+-----------+--------+
Then, if possible, I want to be able to do this in a join in another query, using ID_Number as the column for the join. Would that change the query here?
Note: I am initially only wanting to run the query for about 200 distinct ID_Numbers out of about 10 million. I will be adding an 'IN' clause for those IDs. When I do the join for the final product, I will need to know how to get those latest dates out of all the other join possibilities. (ie, how do I get ID_Number 1 to join with 201604 and not 201601?)
I would use NOT EXISTS and GROUP BY
select t1.id_number, max(t1.bill_date), sum(t1.amount)
from tbl_charges t1
where not exists (
select 1
from tbl_charges t2
where t1.id_number = t2.id_number and
t1.bill_date < t2.bill_date
)
group by t1.id_number
The NOT EXISTS filters out the irrelevant rows and the GROUP BY does the sum.
I would be inclined to filter in the where:
select c.id_number, sum(c.amount)
from tbl_charges c
where c.bill_date = (select max(c2.bill_date)
from tbl_charges c2
where c2.id_number = c.id_number and c2.bill_type = c.bill_type
)
group by c.id_number;
Or, another fun way is to use IN with tuples:
select c.id_number, sum(c.amount)
from tbl_charges c
where (c.id_number, c.bill_type, c.bill_date) in
(select c2.id_number, c2.bill_type, max(c2.bill_date)
from tbl_charges c2
group by c2.id_number, c2.bill_type
)
group by c.id_number;
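For the follow-up about using this in a join, one possible shape is a derived table that holds the latest Bill_Date per ID_Number and is joined back on both columns. The sketch below is only an illustration: it assumes SQLite via Python's sqlite3 module in place of MySQL, and the names `m`, `latest_date` and `total` are made up for the example.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the data is the question's sample.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_charges (id_number INTEGER, bill_date INTEGER, bill_type TEXT, amount REAL);
INSERT INTO tbl_charges VALUES
  (1, 201601, 'A', 5.00), (1, 201601, 'B', 7.00),
  (1, 201604, 'A', 4.00), (1, 201604, 'B', 6.00),
  (2, 201701, 'A', 15.00), (2, 201701, 'B', 13.00);
""")

# The derived table m keeps one row per id_number with its latest bill_date.
# Joining on (id_number, bill_date) discards all older charges before summing.
# The IN list is where the ~200 ids of interest would go.
totals = conn.execute("""
SELECT c.id_number, c.bill_date, SUM(c.amount) AS total
FROM tbl_charges c
INNER JOIN (
    SELECT id_number, MAX(bill_date) AS latest_date
    FROM tbl_charges
    WHERE id_number IN (1, 2)
    GROUP BY id_number
) m ON m.id_number = c.id_number AND m.latest_date = c.bill_date
GROUP BY c.id_number, c.bill_date
ORDER BY c.id_number
""").fetchall()

print(totals)
```

The same derived table could be joined to other tables on id_number, which is how ID_Number 1 ends up paired with 201604 rather than 201601.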

MySQL JOIN with LIMIT query results

I have 2 tables, products and origins
Products:
p_id | name | origin_id
------------------------
1 | P1 | 1
2 | P2 | 2
3 | P3 | 1
Origins:
o_id | name
-------------
1 | O1
2 | O2
I am using the following query :
SELECT * FROM `products` LEFT OUTER JOIN `origins`
ON ( `products`.`origin_id` = `origins`.`o_id` ) LIMIT 2
I am getting the below results
p_id | name | origin_id | o_id | name
-----------------------------------------
1 | P1 | 1 | 1 | O1
3 | P3 | 1 | 1 | O1
I was wondering how the LEFT OUTER JOIN affects the result where I am getting the first and the third row rather than the first and the second row?
When you do not use an ORDER BY clause, there is no guarantee of a specific order for the rows returned by your SELECT query.
So we should use ORDER BY whenever we need a specific order.
See this: MySQL Ref: What is The Default Sort Order of SELECT with no ORDER BY Clause
You don't control the inherent ordering of rows in a table. It behaves like a set. If you want to order it, use order by clause.
SELECT * FROM `products` p LEFT OUTER JOIN `origins` o
ON ( p.`origin_id` = o.`o_id` ) ORDER BY p.`name` LIMIT 2
Output :
p_id | name | origin_id | o_id | name
-----------------------------------------
1 | P1 | 1 | 1 | O1
2 | P2 | 2 | 2 | O2
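The effect of adding ORDER BY before LIMIT can be reproduced with a small script. This sketch assumes SQLite via Python's sqlite3 module rather than MySQL, and orders by p_id (an arbitrary choice for the example) so that the two rows returned are fully determined.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the data is the question's sample.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE origins  (o_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products (p_id INTEGER PRIMARY KEY, name TEXT, origin_id INTEGER);
INSERT INTO origins  VALUES (1, 'O1'), (2, 'O2');
INSERT INTO products VALUES (1, 'P1', 1), (2, 'P2', 2), (3, 'P3', 1);
""")

# With an explicit ORDER BY, LIMIT 2 always keeps the two lowest p_id values.
# Without it, the engine may return any two rows of the join.
first_two = conn.execute("""
SELECT p.p_id, p.name, o.name
FROM products p
LEFT OUTER JOIN origins o ON p.origin_id = o.o_id
ORDER BY p.p_id
LIMIT 2
""").fetchall()

print(first_two)
```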

Get one single record when existing duplicates

I have an ingredient translations table in this form (some columns have been removed for simplicity, but are still required in the result)
| id | name | ingredient_id | language |
| 1 | Water | 11 | en |
| 2 | Bell pepper | 12 | en |
| 3 | Sweet pepper | 12 | en |
I'm trying to build a query to retrieve just one single ingredient translation per ingredient like this (expected result)
| id | name | ingredient_id |
| 1 | Water | 11 |
| 2 | Bell pepper | 12 |
So far I'm trying to do it with this query
select it1.*
from ingredient_translations it1
left outer join ingredient_translations it2
on it1.ingredient_id = it2.ingredient_id
and it1.id < it2.id
where it1.language = 'es'
but it's not giving the expected results :/
I'm using PostgreSQL, though I was trying to do this using joins so I can devise a cross-db (PostgreSQL - MySQL) solution.
Please, any insight will be appreciated!!! :D
WITH CustomerCTE AS (
SELECT t1.*, ROW_NUMBER() OVER (PARTITION BY t1.ingredient_id ORDER BY t1.id) AS RN
FROM ingredient_translations t1
)
SELECT * FROM CustomerCTE WHERE RN = 1
ORDER BY id;
Use ROW_NUMBER() over partition.
Query
select id,name,ingredient_id,language from
(
select id,name,ingredient_id,language,
row_number() over
(
partition by ingredient_id
order by id
) rn
from tbl_Name
)t
where t.rn < 2;
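Since the question asks for a join-only approach that works across databases, the self-join from the question can also be completed as an anti-join: keep a row only when no row with a smaller id exists for the same ingredient. The sketch below is an assumption-laden illustration, run on SQLite via Python's sqlite3 module rather than PostgreSQL or MySQL, and it filters on 'en' because that is the language present in the sample data.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL/MySQL; only plain joins are used.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ingredient_translations (
    id INTEGER PRIMARY KEY, name TEXT, ingredient_id INTEGER, language TEXT);
INSERT INTO ingredient_translations VALUES
  (1, 'Water',        11, 'en'),
  (2, 'Bell pepper',  12, 'en'),
  (3, 'Sweet pepper', 12, 'en');
""")

# it2 matches any earlier translation of the same ingredient and language.
# "it2.id IS NULL" keeps only rows for which no earlier translation exists.
one_per_ingredient = conn.execute("""
SELECT it1.id, it1.name, it1.ingredient_id
FROM ingredient_translations it1
LEFT JOIN ingredient_translations it2
       ON it2.ingredient_id = it1.ingredient_id
      AND it2.language = it1.language
      AND it2.id < it1.id
WHERE it1.language = 'en'
  AND it2.id IS NULL
ORDER BY it1.id
""").fetchall()

print(one_per_ingredient)
```

Compared with the query in the question, the two differences are the direction of the id comparison and the added IS NULL test on the joined side.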