MariaDB: Select val_1, max(value_2), group by value_3

I have the following columns:
| order_id | client_id | order_timestamp | buyer_id | (all INTs)
It started with the easy-sounding task "Show me the buyer of the last order for each client", so basically
SELECT
client_id,
max(order_timestamp),
buyer_id
FROM table t
GROUP BY client_id;
would do the job, if GROUP BY worked the way one would expect or wish. I know this is a fairly common problem, but I have never seen this particular case, where you need another value in addition to the one you are grouping by. Window functions would probably help, but we are on MariaDB 10.0, so they are not an option. I tried various subselects and joins, but I always end up with the same problem: I can't join on order_id, because I have to group by client_id. I also thought of joining on client_id AND order_timestamp, but that combination is not unique in the table: a client (or a client/buyer combination) can have several orders with the exact same (Unix) timestamp. In that edge case I would need the buyer of the order with the higher order_id, but that's a problem for another day, I guess.
If the table was filled like
| order_id | client_id | order_timestamp | buyer_id |
| 1 | 123 | 9876543 | 2 |
| 2 | 123 | 9876654 | 3 |
| 3 | 234 | 9945634 | 2 |
| 4 | 234 | 9735534 | 1 |
I would like to get
| client_id | buyer_id |
------------|----------|
| 123 | 3 |
| 234 | 2 |
Hopefully, somebody can help me, so I can go to sleep in peace tonight.

If your MariaDB version supports window functions you can use ROW_NUMBER():
select t.client_id, t.buyer_id
from (
select *,
row_number() over (partition by client_id order by order_timestamp desc, order_id desc) rn
from tablename
) t
where t.rn = 1
Results:
| client_id | buyer_id |
| --------- | -------- |
| 123 | 3 |
| 234 | 2 |
Without window functions use NOT EXISTS:
select t.client_id, t.buyer_id
from tablename t
where not exists (
select 1 from tablename
where client_id = t.client_id
and (
order_timestamp > t.order_timestamp
or (order_timestamp = t.order_timestamp and order_id > t.order_id)
)
)
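To see the tie-break at work, here is a minimal sketch that runs the NOT EXISTS query against an in-memory SQLite database from Python. SQLite is only a stand-in for MariaDB, the table name `orders` is an assumption, and the two rows for client 345 are made up to provoke the same-timestamp edge case from the question.

```python
import sqlite3

# SQLite stands in for MariaDB here; the table name `orders` is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INT, client_id INT, order_timestamp INT, buyer_id INT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, 123, 9876543, 2),
        (2, 123, 9876654, 3),
        (3, 234, 9945634, 2),
        (4, 234, 9735534, 1),
        (5, 345, 9900000, 7),  # hypothetical rows, not in the question:
        (6, 345, 9900000, 8),  # same timestamp, so order_id must break the tie
    ],
)

# Keep a row only if no other row of the same client is "later",
# where later means a higher timestamp, or the same timestamp and a higher order_id.
latest_buyers = conn.execute(
    """
    SELECT t.client_id, t.buyer_id
    FROM orders t
    WHERE NOT EXISTS (
        SELECT 1 FROM orders o
        WHERE o.client_id = t.client_id
          AND (o.order_timestamp > t.order_timestamp
               OR (o.order_timestamp = t.order_timestamp AND o.order_id > t.order_id))
    )
    ORDER BY t.client_id
    """
).fetchall()
print(latest_buyers)  # [(123, 3), (234, 2), (345, 8)]
```

Client 345 ends up with buyer 8, i.e. the order with the higher order_id wins when the timestamps are equal.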

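If you select a non-aggregated column next to max(field), MariaDB takes its value from an arbitrary row of the group, not from the row that holds the maximum. In your case buyer_id would come from some arbitrary order of the client, which is not what you want.
Try this. The correlated subquery keeps only the rows with the latest order_timestamp of their client; if several orders are tied on that timestamp, the final group by still picks buyer_id arbitrarily among them.
select client_id, order_timestamp, buyer_id from t
where order_timestamp=
(select max(tcopy.order_timestamp) from t as tcopy where tcopy.client_id= t.client_id )
group by client_id;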
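What happens with tied timestamps is easier to judge with the GROUP BY left out. The sketch below (Python with an in-memory SQLite database as a stand-in for MariaDB; the table name `orders` and the rows for client 345 are assumptions) first shows that a correlated MAX subquery returns every tied row, then shows one possible way to pick a single row per client by selecting the order_id with a correlated ORDER BY ... LIMIT 1 subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INT, client_id INT, order_timestamp INT, buyer_id INT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, 123, 9876543, 2),
        (2, 123, 9876654, 3),
        (5, 345, 9900000, 7),  # hypothetical tied rows for one client
        (6, 345, 9900000, 8),
    ],
)

# Correlated MAX without GROUP BY: both tied rows of client 345 come back.
tied = conn.execute(
    """
    SELECT t.client_id, t.order_id, t.buyer_id
    FROM orders t
    WHERE t.order_timestamp = (
        SELECT MAX(c.order_timestamp) FROM orders c WHERE c.client_id = t.client_id
    )
    ORDER BY t.client_id, t.order_id
    """
).fetchall()

# Picking the row by order_id instead makes the result one row per client,
# with the higher order_id winning on equal timestamps.
one_per_client = conn.execute(
    """
    SELECT t.client_id, t.order_id, t.buyer_id
    FROM orders t
    WHERE t.order_id = (
        SELECT c.order_id FROM orders c
        WHERE c.client_id = t.client_id
        ORDER BY c.order_timestamp DESC, c.order_id DESC
        LIMIT 1
    )
    ORDER BY t.client_id
    """
).fetchall()

print(tied)            # [(123, 2, 3), (345, 5, 7), (345, 6, 8)]
print(one_per_client)  # [(123, 2, 3), (345, 6, 8)]
```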

Related

MySQL: get max row in each group when max value is not unique

There are many questions out there close to this, but I can't find one with a solid example of how to do quite what I want. I need to get a single max row for each group when the maximum value is not unique within a group. Here's a table:
| id | source | name | message_time |
|----|--------|------|--------------|
| 1 | a | cool | 2020-08-18 |
| 2 | a | cool | 2020-08-18 |
| 3 | a | neat | 2020-08-02 |
| 4 | b | nice | 2020-08-19 |
| 5 | b | wow | 2020-08-17 |
For each source, I need a single full row associated with the maximum message_time. Since the max message time is not unique within a group, both of these are valid outputs:
| id | source | name | message_time |
|----|--------|------|--------------|
| 1 | a | cool | 2020-08-18 |
| 4 | b | nice | 2020-08-19 |
| id | source | name | message_time |
|----|--------|------|--------------|
| 2 | a | cool | 2020-08-18 |
| 4 | b | nice | 2020-08-19 |
When there are multiple candidates for max, I just want to randomly select a single row. How can I achieve this with a mysql query?
I'm using MySQL 5.7
Edit:
So I messed around some more and realized this works:
SELECT table.*
FROM (
SELECT source FROM table
GROUP BY source
) groups
LEFT JOIN table
ON id = (
SELECT id FROM table
WHERE source = groups.source
ORDER BY message_time desc
LIMIT 1
);
I think I even understand why it works, but I don't know what no good, very bad practices I am doing here. Also, can it be simplified?
To do this, you use the row_number window function, which requires mysql 8.0 or mariadb 10.2 or better.
select id, source, name, message_time
from (
select id, source, name, message_time,
row_number() over (partition by source order by message_time desc, rand()) as row_num
from a_table
) a_table_with_row_numbers
where row_num=1;
On earlier versions, you can do this:
select
0+substr(max(concat(prefix,id)),19) id,
substr(max(concat(prefix,source)),19) source,
substr(max(concat(prefix,name)),19) name,
max(message_time)
from (
select id, source, name, message_time, concat(message_time, lpad(1e8*rand(),8,'0')) prefix
from a_table
) a_table_with_sortable_prefix_string
group by source;
SELECT
MAX(m.id) as id,
x.source,
x.message_time
FROM
(SELECT
source,
MAX(message_time) AS message_time
FROM
source
GROUP BY
source ) x
INNER JOIN source m ON m.source=x.source AND m.message_time=x.message_time
GROUP BY
x.source,
x.message_time;
This selects each source together with its max(message_time),
and then picks the max(id) among the rows for that source and message_time.
EDIT: fixed some typos in the query and added a test:
sampledata created using:
CREATE TABLE source (id INTEGER, source CHAR(1), name VARCHAR(10), message_time DATE);
INSERT INTO source VALUES
(1,'a','cool','2020-08-18'),
(2,'a','cool','2020-08-18'),
(3,'a','neat','2020-08-02'),
(4,'b','nice','2020-08-19'),
(5,'b','wow','2020-08-17');
output of query:
+------+--------+--------------+
| id | source | message_time |
+------+--------+--------------+
| 2 | a | 2020-08-18 |
| 4 | b | 2020-08-19 |
+------+--------+--------------+
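The output above has no name column, while the question asks for the full row. One way to get it, sketched here under assumptions, is to join the chosen id back to the table once more. The example uses Python with an in-memory SQLite database instead of MySQL 5.7, and the table is called `messages` (a made-up name) to keep it distinct from its `source` column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `messages` is a hypothetical table name; the rows are the sample data from the question.
conn.execute("CREATE TABLE messages (id INT, source TEXT, name TEXT, message_time TEXT)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?, ?)",
    [
        (1, "a", "cool", "2020-08-18"),
        (2, "a", "cool", "2020-08-18"),
        (3, "a", "neat", "2020-08-02"),
        (4, "b", "nice", "2020-08-19"),
        (5, "b", "wow", "2020-08-17"),
    ],
)

full_rows = conn.execute(
    """
    SELECT m.id, m.source, m.name, m.message_time
    FROM messages m
    JOIN (
        -- highest id among the rows that carry the max message_time of their source
        SELECT MAX(s.id) AS id
        FROM messages s
        JOIN (
            SELECT source, MAX(message_time) AS message_time
            FROM messages
            GROUP BY source
        ) x ON s.source = x.source AND s.message_time = x.message_time
        GROUP BY x.source
    ) pick ON pick.id = m.id
    ORDER BY m.id
    """
).fetchall()
print(full_rows)  # [(2, 'a', 'cool', '2020-08-18'), (4, 'b', 'nice', '2020-08-19')]
```

Taking MAX(id) is a deterministic choice rather than a random one, which the question allows since either tied row is a valid output.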

How do I get multiple maximum column values for multiple rows with the same id?

I need to select the maximum amounts in one column for a common id in another column. There could be several id's in the report_id column that have the same, maximum last_update amounts.
Data structure:
+----+-----------+-------------+
| id | report_id | last_update |
+----+-----------+-------------+
| 1  | 1         | 2019-01-24  |
| 2  | 1         | 2019-01-24  |
| 3  | 1         | 2019-01-24  |
| 4  | 2         | 2019-01-24  |
| 5  | 3         | 2019-01-23  |
+----+-----------+-------------+
The problem I am having so far is I can't seem to isolate my results simply by the report_id. For example, with the following query:
"SELECT report_id, last_update
FROM reports
WHERE last_update=(
SELECT MAX(last_update) FROM reports
WHERE report_id='1'
);
";
This returns:
+----+-----------+-------------+
| id | report_id | last_update |
+----+-----------+-------------+
| 1  | 1         | 2019-01-24  |
| 2  | 1         | 2019-01-24  |
| 3  | 1         | 2019-01-24  |
| 4  | 2         | 2019-01-24  |
+----+-----------+-------------+
So it is nearly correct, but it also is including report_id 2 because it also has the MAX value of 2019-01-24 in last_update.
What I really need to do is select all rows with report_id 1, and then select only the rows from that result set that have the MAX(last_update). I have been looking at every greatest-n-per-group and associated question on SO and I just can't get this one.
Anytime I bring MAX into the query it seems to negate the fact I am trying to isolate by report_id as well.
Here are a few solutions:
Tuple comparison:
SELECT report_id, last_update
FROM reports
WHERE (report_id, last_update) = (
SELECT report_id, MAX(last_update) FROM reports
WHERE report_id='1'
GROUP BY report_id
);
Tuple comparison with a derived table instead of a dependent subquery:
SELECT report_id, last_update
FROM reports
INNER JOIN (
SELECT report_id, MAX(last_update) AS last_update
FROM reports WHERE report_id='1' GROUP BY report_id
) AS latest USING (report_id, last_update);
No-subquery solution, using exclusion join to find the reports for which no other report has the same report_id and a greater update date:
SELECT r1.*
FROM reports AS r1
LEFT OUTER JOIN reports AS r2
ON r1.report_id=r2.report_id AND r1.last_update<r2.last_update
WHERE r2.report_id IS NULL;
MySQL 8.0 solution with windowing functions:
WITH ranked_reports AS (
SELECT r.*, DENSE_RANK() OVER (PARTITION BY report_id ORDER BY last_update DESC) AS dr
FROM reports AS r WHERE report_id='1'
)
SELECT * FROM ranked_reports WHERE dr=1;
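As a quick check of the exclusion join listed above, the sketch below runs it from Python against an in-memory SQLite database (a stand-in for MySQL). Two things are assumptions: the extra filter on r1.report_id, added so that only report 1 is returned as the question wants, and the older row with id 6, added so that the join has something to exclude.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reports (id INT, report_id INT, last_update TEXT)")
conn.executemany(
    "INSERT INTO reports VALUES (?, ?, ?)",
    [
        (1, 1, "2019-01-24"),
        (2, 1, "2019-01-24"),
        (3, 1, "2019-01-24"),
        (4, 2, "2019-01-24"),
        (5, 3, "2019-01-23"),
        (6, 1, "2019-01-20"),  # hypothetical older row for report 1
    ],
)

# A row survives only if no row with the same report_id has a later last_update,
# i.e. the LEFT JOIN found no partner and r2 is all NULL.
latest_for_report_1 = conn.execute(
    """
    SELECT r1.id, r1.report_id, r1.last_update
    FROM reports AS r1
    LEFT OUTER JOIN reports AS r2
      ON r1.report_id = r2.report_id AND r1.last_update < r2.last_update
    WHERE r2.report_id IS NULL
      AND r1.report_id = 1
    ORDER BY r1.id
    """
).fetchall()
print(latest_for_report_1)
# [(1, 1, '2019-01-24'), (2, 1, '2019-01-24'), (3, 1, '2019-01-24')]
```

Report 2 no longer leaks into the result, because the comparison only ever happens between rows with the same report_id.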

Get one single record when existing duplicates

I have an ingredient translations table of this form (some columns have been removed for simplicity, but they are still required in the result)
| id | name | ingredient_id | language |
| 1 | Water | 11 | en |
| 2 | Bell pepper | 12 | en |
| 3 | Sweet pepper | 12 | en |
I'm trying to build a query to retrieve just one single ingredient translation per ingredient like this (expected result)
| id | name | ingredient_id |
| 1 | Water | 11 |
| 2 | Bell pepper | 12 |
So far I'm trying to do it with this query
select it1.*
from ingredient_translations it1
left outer join ingredient_translations it2
on it1.ingredient_id = it2.ingredient_id
and it1.id < it2.id
where it1.language = 'es'
but it's not giving the expected results :/
I'm using PostgreSQL, though I was trying to do this using joins so I can devise a cross-db (PostgreSQL - MySQL) solution.
Please, any insight will be appreciated!!! :D
WITH CustomerCTE AS (
SELECT t1.*, ROW_NUMBER() OVER (PARTITION BY t1.ingredient_id ORDER BY t1.id) AS RN
FROM ingredient_translations t1
)
SELECT * FROM CustomerCTE WHERE RN = 1
ORDER BY id;
Use ROW_NUMBER() over partition.
Query
select id,name,ingredient_id,language from
(
select id,name,ingredient_id,language,
row_number() over
(
partition by ingredient_id
order by id
) rn
from tbl_Name
)t
where t.rn < 2;
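For the join-only, cross-database variant the question asks about, a self left join can work once it filters on the joined side being NULL. The sketch below is one possible form, run from Python against an in-memory SQLite database rather than PostgreSQL or MySQL. It assumes the lowest id per ingredient is the one to keep (as in the expected result) and filters on 'en', since the sample rows contain no 'es' translations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ingredient_translations (id INT, name TEXT, ingredient_id INT, language TEXT)"
)
conn.executemany(
    "INSERT INTO ingredient_translations VALUES (?, ?, ?, ?)",
    [
        (1, "Water", 11, "en"),
        (2, "Bell pepper", 12, "en"),
        (3, "Sweet pepper", 12, "en"),
    ],
)

# it2 looks for a translation of the same ingredient and language with a LOWER id.
# If none exists (it2.id IS NULL), it1 is the first translation and is kept.
one_per_ingredient = conn.execute(
    """
    SELECT it1.id, it1.name, it1.ingredient_id
    FROM ingredient_translations it1
    LEFT OUTER JOIN ingredient_translations it2
      ON it1.ingredient_id = it2.ingredient_id
     AND it1.language = it2.language
     AND it2.id < it1.id
    WHERE it1.language = ?
      AND it2.id IS NULL
    ORDER BY it1.id
    """,
    ("en",),
).fetchall()
print(one_per_ingredient)  # [(1, 'Water', 11), (2, 'Bell pepper', 12)]
```

Without the `it2.id IS NULL` condition the left join returns every translation, once for each lower-id partner it finds (or once with NULLs if there is none).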

How to query last records

So, if I have this table:
| ID | Date | Status | Value |
| 1 | 2-2-2012 | A | 5 |
| 2 | 3-4-2012 | B | 3 |
| 1 | 5-6-2012 | C | 1 |
| 2 | 1-1-2012 | D | 4 |
and I need to get the total value and the "most recent" status for every ID, how do I write the query? I tried using GROUP BY, but somehow only the oldest status is shown in the query result.
I need the data to look like this:
| ID | Date | Status |sum(Value)|
| 2 | 3-4-2012 | B | 7 |
| 1 | 5-6-2012 | C | 6 |
I'm a total newbie at this SQL thing and not an IT person; my boss just asked me to extract some data from our database...
Thanks in advance...
Since you have not mentioned any RDBMS, the query below works on almost all of them.
It uses a subquery that separately gets the latest date and the total for every ID (assuming the Date column is really stored as DATE and not as a string). The result of the subquery is then joined back to the table itself in order to get the other columns.
SELECT a.ID, a.Date, a.Status, b.TotalSum
FROM tableName a
INNER JOIN
(
SELECT ID, MAX(date) max_date, SUM(Value) totalSum
FROM tableName
GROUP BY ID
) b ON a.ID = b.ID AND
a.date = b.max_date
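A runnable check of this join, sketched in Python with an in-memory SQLite database as a stand-in for whatever RDBMS is in use. The dates from the question are stored as ISO strings so that MAX compares them chronologically; reading them as day-month-year is an assumption, though either reading gives the same latest row per ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableName (ID INT, Date TEXT, Status TEXT, Value INT)")
conn.executemany(
    "INSERT INTO tableName VALUES (?, ?, ?, ?)",
    [
        (1, "2012-02-02", "A", 5),
        (2, "2012-04-03", "B", 3),
        (1, "2012-06-05", "C", 1),
        (2, "2012-01-01", "D", 4),
    ],
)

# The subquery computes the latest date and the total per ID;
# joining on (ID, Date) pulls in the status of that latest row.
summary = conn.execute(
    """
    SELECT a.ID, a.Date, a.Status, b.totalSum
    FROM tableName a
    INNER JOIN (
        SELECT ID, MAX(Date) AS max_date, SUM(Value) AS totalSum
        FROM tableName
        GROUP BY ID
    ) b ON a.ID = b.ID AND a.Date = b.max_date
    ORDER BY a.ID
    """
).fetchall()
print(summary)  # [(1, '2012-06-05', 'C', 6), (2, '2012-04-03', 'B', 7)]
```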
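If you are using MySQL, the following may work on older versions, but it relies on undocumented behavior: nothing guarantees that GROUP BY takes date and status from the first row of the ordered subquery, and the query is rejected when ONLY_FULL_GROUP_BY is enabled.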
SELECT id,date,status,sum(value)
FROM (select * from yourTable order by DATE desc ) t
group by ID
order by ID desc

How to get the latest items distinctively in a row?

I want to get the remaining/latest balance of the cardnumber from the rows. Below is the sample of the table.
trans_id | cardnumber | trans_date | balance
---------------------------------------------------------------
1 | 1000005240000008 | 2009-07-03 04:54:27 | 88
2 | 1000005120000008 | 2009-07-04 05:00:07 | 2
3 | 1000005110000008 | 2009-07-05 13:18:39 | 3
4 | 1000005110000008 | 2009-07-06 13:18:39 | 4
5 | 1000005110000008 | 2009-07-07 14:25:32 | 4.5
6 | 1000005120000002 | 2009-07-08 16:50:51 | -1
7 | 1000005240000002 | 2009-07-09 17:03:17 | 1
The result should look like this:
trans_id | cardnumber | trans_date | balance
---------------------------------------------------------------
1 | 1000005110000008 | 2009-07-07 14:25:32 | 4.5
2 | 1000005120000002 | 2009-07-08 16:50:51 | -1
3 | 1000005240000002 | 2009-07-09 17:03:17 | 1
I already have a query but it goes something like this:
SELECT cardnumber, MAX(balance), trans_date
FROM transactions
GROUP BY cardnumber
I really need help on this, I'm having a hard time. :(
Thanks in advance.
Mark
I don't have a MySQL in front of me at the moment, but something like this should work:
SELECT latest.cardnumber, latest.max_trans_date, t2.balance
FROM
(
SELECT t1.cardnumber, MAX(t1.trans_date) AS max_trans_date
FROM transactions t1
GROUP BY t1.cardnumber
) latest
JOIN transactions t2 ON (
latest.cardnumber = t2.cardnumber AND
latest.max_trans_date = t2.trans_date
)
Probably requires 5.0.x or later. There may be a better way. It's 3AM :-D
Almost the same as derobert's, but the other way around. The idea is the same: a subquery finds the latest (max) transaction date per cardnumber, and you join that back to the original table. This of course assumes that no two transactions on a cardnumber occur at the exact same time.
SELECT t1.trans_id, t1.cardnumber, t1.trans_date, t1.balance
FROM transactions AS t1
JOIN (SELECT MAX(trans_date) AS max_trans_date, cardnumber FROM transactions GROUP BY cardnumber) AS t2
ON t2.cardnumber = t1.cardnumber AND t2.max_trans_date = t1.trans_date
SELECT * FROM transactions WHERE (cardnumber,trans_date) in (SELECT cardnumber, MAX(trans_date) FROM transactions GROUP BY cardnumber);
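The tuple IN form can be tried out with the sample rows from the question. This is a sketch in Python against an in-memory SQLite database, not MySQL; it assumes an SQLite build recent enough to support row values (3.15 or later), and storing cardnumber as text is a choice made here for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (trans_id INT, cardnumber TEXT, trans_date TEXT, balance REAL)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?)",
    [
        (1, "1000005240000008", "2009-07-03 04:54:27", 88),
        (2, "1000005120000008", "2009-07-04 05:00:07", 2),
        (3, "1000005110000008", "2009-07-05 13:18:39", 3),
        (4, "1000005110000008", "2009-07-06 13:18:39", 4),
        (5, "1000005110000008", "2009-07-07 14:25:32", 4.5),
        (6, "1000005120000002", "2009-07-08 16:50:51", -1),
        (7, "1000005240000002", "2009-07-09 17:03:17", 1),
    ],
)

# Keep the rows whose (cardnumber, trans_date) pair is the latest pair of that card.
latest = conn.execute(
    """
    SELECT trans_id, cardnumber, trans_date, balance
    FROM transactions
    WHERE (cardnumber, trans_date) IN (
        SELECT cardnumber, MAX(trans_date) FROM transactions GROUP BY cardnumber
    )
    ORDER BY cardnumber
    """
).fetchall()
for row in latest:
    print(row)
```

The sample data contains five distinct card numbers, so five rows come back; card 1000005110000008 is represented by trans_id 5 with balance 4.5. If two transactions on one card shared the same trans_date, both would be returned.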