Groupwise maximum

Groupwise maximum - mysql

I have a table from which I am trying to retrieve the latest position for each security:
The Table:
My query to create the table: SELECT id, security, buy_date FROM positions WHERE client_id = 4
+-------+----------+------------+
| id | security | buy_date |
+-------+----------+------------+
| 26 | PCS | 2012-02-08 |
| 27 | PCS | 2013-01-19 |
| 28 | RDN | 2012-04-17 |
| 29 | RDN | 2012-05-19 |
| 30 | RDN | 2012-08-18 |
| 31 | RDN | 2012-09-19 |
| 32 | HK | 2012-09-25 |
| 33 | HK | 2012-11-13 |
| 34 | HK | 2013-01-19 |
| 35 | SGI | 2013-01-17 |
| 36 | SGI | 2013-02-16 |
| 18084 | KERX | 2013-02-20 |
| 18249 | KERX | 0000-00-00 |
+-------+----------+------------+
I have been messing with versions of queries based on this page, but I cannot seem to get the result I'm looking for.
Here is what I've been trying:
SELECT t1.id, t1.security, t1.buy_date
FROM positions t1
WHERE buy_date = (SELECT MAX(t2.buy_date)
FROM positions t2
WHERE t1.security = t2.security)
But this just returns me:
+-------+----------+------------+
| id | security | buy_date |
+-------+----------+------------+
| 27 | PCS | 2013-01-19 |
+-------+----------+------------+
I'm trying to get the maximum/latest buy date for each security, so the results would have one row for each security with the most recent buy date. Any help is greatly appreciated.
EDIT: The position's id must be returned with the max buy date.

You can use this query. When I checked it against a larger data set, it returned results in 75% less time; subqueries take more time.
SELECT p1.id,
p1.security,
p1.buy_date
FROM positions p1
left join
positions p2
on p1.security = p2.security
and p1.buy_date < p2.buy_date
where
p2.id is null;
SQL-Fiddle link
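A minimal sketch of the LEFT JOIN approach above, run against the rows from the question. It assumes an in-memory SQLite database as a stand-in for MySQL (dates are stored as ISO text, so the zero date simply sorts first); the timing claim is not tested here.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; this is an assumption of the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE positions (id INTEGER, security TEXT, buy_date TEXT)")
conn.executemany(
    "INSERT INTO positions VALUES (?, ?, ?)",
    [
        (26, "PCS", "2012-02-08"), (27, "PCS", "2013-01-19"),
        (28, "RDN", "2012-04-17"), (29, "RDN", "2012-05-19"),
        (30, "RDN", "2012-08-18"), (31, "RDN", "2012-09-19"),
        (32, "HK", "2012-09-25"), (33, "HK", "2012-11-13"),
        (34, "HK", "2013-01-19"), (35, "SGI", "2013-01-17"),
        (36, "SGI", "2013-02-16"), (18084, "KERX", "2013-02-20"),
        (18249, "KERX", "0000-00-00"),
    ],
)

# Anti-join: a p1 row survives only when no p2 row for the same
# security has a later buy_date, i.e. p1 holds the group maximum.
latest = conn.execute(
    """
    SELECT p1.id, p1.security, p1.buy_date
    FROM positions p1
    LEFT JOIN positions p2
           ON p1.security = p2.security
          AND p1.buy_date < p2.buy_date
    WHERE p2.id IS NULL
    ORDER BY p1.id
    """
).fetchall()

for row in latest:
    print(row)
```

Each security appears once, together with the id of its most recent position. If two rows of one security shared the same latest date, both would be returned.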

You can use a subquery to get the result:
SELECT p1.id,
p1.security,
p1.buy_date
FROM positions p1
inner join
(
SELECT MAX(buy_date) MaxDate, security
FROM positions
group by security
) p2
on p1.buy_date = p2.MaxDate
and p1.security = p2.security
See SQL Fiddle with Demo
Or you can use a correlated subquery in the WHERE clause:
SELECT t1.id, t1.security, t1.buy_date
FROM positions t1
WHERE buy_date = (SELECT MAX(t2.buy_date)
FROM positions t2
WHERE t1.security = t2.security
group by t2.security)
See SQL Fiddle with Demo

This is done with a simple group by. You want to group by the securities and get the max of buy_date. The SQL:
SELECT security, max(buy_date)
from positions
group by security
Note, this is faster than bluefeet's answer but does not display the ID.

The answer by @bluefeet has two more ways to get the results you want - and the first will probably be more efficient than your query.
What I don't understand is why you say that your query doesn't work. It seems pretty fine and returns the expected result. Tested at SQL-Fiddle
SELECT t1.id, t1.security, t1.buy_date
FROM positions t1
WHERE buy_date = ( SELECT MAX(t2.buy_date)
FROM positions t2
WHERE t1.security = t2.security ) ;
If the problem appears when you add the client_id = 4 condition, then it's because you add it in only one WHERE clause, while you have to add it in both:
SELECT t1.id, t1.security, t1.buy_date
FROM positions t1
WHERE client_id = 4
AND buy_date = ( SELECT MAX(t2.buy_date)
FROM positions t2
WHERE client_id = 4
AND t1.security = t2.security ) ;
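To illustrate why the filter has to appear in both places, here is a small sketch using SQLite in place of MySQL. The client_id column values and the row with id 900 (belonging to a second client, 7) are hypothetical; they are not part of the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE positions (id INTEGER, client_id INTEGER, security TEXT, buy_date TEXT)"
)
conn.executemany(
    "INSERT INTO positions VALUES (?, ?, ?, ?)",
    [
        (26, 4, "PCS", "2012-02-08"),
        (27, 4, "PCS", "2013-01-19"),
        (28, 4, "RDN", "2012-04-17"),
        (31, 4, "RDN", "2012-09-19"),
        (900, 7, "RDN", "2013-03-01"),  # hypothetical row for another client
    ],
)

# Filter only in the outer query: MAX() still sees every client's rows,
# so client 4 loses RDN because the latest RDN date belongs to client 7.
outer_only = conn.execute(
    """
    SELECT t1.id, t1.security, t1.buy_date
    FROM positions t1
    WHERE t1.client_id = 4
      AND t1.buy_date = (SELECT MAX(t2.buy_date)
                         FROM positions t2
                         WHERE t1.security = t2.security)
    ORDER BY t1.id
    """
).fetchall()

# Filter in both queries: the maximum is computed per security for client 4 only.
both = conn.execute(
    """
    SELECT t1.id, t1.security, t1.buy_date
    FROM positions t1
    WHERE t1.client_id = 4
      AND t1.buy_date = (SELECT MAX(t2.buy_date)
                         FROM positions t2
                         WHERE t2.client_id = 4
                           AND t1.security = t2.security)
    ORDER BY t1.id
    """
).fetchall()

print(outer_only)
print(both)
```

With the filter in only the outer query, securities whose overall latest date belongs to another client silently drop out, which would explain a result that is missing rows.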

select security, max(buy_date) from positions group by security;
is all you need to get max buy date for each security (when you say out loud what you want from a query and you include the phrase "for each x", you probably want a group by on x)
When you use a group by, all columns in your select must either be columns that have been grouped by or aggregates, so if, for example, you wanted to include id, you'd probably have to use a subquery similar to what you had before, since there doesn't seem to be any aggregate you can reasonably use on the ids, and another group by would give you too many rows.

Related

MySQL GROUP_CONCAT with SUM() and multiple JOINs inside subquery

I'm very average with MySQL, but usually I can write all the needed queries after reading documentation and searching for examples. Now, I'm in the situation where I spent 3 days re-searching and re-writing queries, but I can't get it to work the exact way I need. Here's the deal:
1st table (mpt_companies) contains companies:
| company_id | company_title |
------------------------------
| 1 | Company A |
| 2 | Company B |
2nd table (mpt_payment_methods) contains payment methods:
| payment_method_id | payment_method_title |
--------------------------------------------
| 1 | Cash |
| 2 | PayPal |
| 3 | Wire |
3rd table (mpt_payments) contains payments for each company:
| payment_id | company_id | payment_method_id | payment_amount |
----------------------------------------------------------------
| 1 | 1 | 1 | 10.00 |
| 2 | 2 | 3 | 15.00 |
| 3 | 1 | 1 | 20.00 |
| 4 | 1 | 2 | 10.00 |
I need to list each company along with many stats. One of stats is the sum of payments in each payment method. In other words, the result should be:
| company_id | company_title | payment_data |
--------------------------------------------------------
| 1 | Company A | Cash:30.00,PayPal:10.00 |
| 2 | Company B | Wire:15.00 |
Obviously, I need to:
Select all the companies;
Join payments for each company;
Join payment methods for each payment;
Calculate sum of payments in each method;
GROUP_CONCAT payment methods and sums;
Unfortunately, SUM() doesn't work inside GROUP_CONCAT. Some solutions I found on this site suggest using CONCAT, but that doesn't produce the list I need. Other solutions suggest using CAST(), but maybe I'm doing something wrong, because that doesn't work either. This is the closest query I wrote; it returns each company and the unique list of payment methods used by each company, but not the sum of payments:
SELECT *,
(some other sub-queries I need...),
(SELECT GROUP_CONCAT(DISTINCT(mpt_payment_methods.payment_method_title))
FROM mpt_payments
JOIN mpt_payment_methods
ON mpt_payments.payment_method_id=mpt_payment_methods.payment_method_id
WHERE mpt_payments.company_id=mpt_companies.company_id
ORDER BY mpt_payment_methods.payment_method_title) AS payment_data
FROM mpt_companies
Then I tried:
SELECT *,
(some other sub-queries I need...),
(SELECT GROUP_CONCAT(DISTINCT(mpt_payment_methods.payment_method_title), ':', CAST(SUM(mpt_payments.payment_amount) AS CHAR))
FROM mpt_payments
JOIN mpt_payment_methods
ON mpt_payments.payment_method_id=mpt_payment_methods.payment_method_id
WHERE mpt_payments.company_id=mpt_companies.company_id
ORDER BY mpt_payment_methods.payment_method_title) AS payment_data
FROM mpt_companies
...and many other variations, but all of them either returned query errors or didn't return or format the data the way I need.
The closest answer I could find was MySQL one to many relationship: GROUP_CONCAT or JOIN or both? but after spending 2 hours re-writing the provided query to work with my data, I couldn't do it.
Could anyone give me a suggestion, please?

You can do that by aggregating twice. First for the sum of payments per method and company and then to concatenate the sums for each company.
SELECT x.company_id,
x.company_title,
group_concat(payment_amount_and_method) payment_data
FROM (SELECT c.company_id,
c.company_title,
concat(pm.payment_method_title, ':', sum(p.payment_amount)) payment_amount_and_method
FROM mpt_companies c
INNER JOIN mpt_payments p
ON p.company_id = c.company_id
INNER JOIN mpt_payment_methods pm
ON pm.payment_method_id = p.payment_method_id
GROUP BY c.company_id,
c.company_title,
pm.payment_method_id,
pm.payment_method_title) x
GROUP BY x.company_id,
x.company_title;
db<>fiddle

Here you go
SELECT company_id,
company_title,
GROUP_CONCAT(
CONCAT(payment_method_title, ':', payment_amount)
) AS payment_data
FROM (
SELECT c.company_id, c.company_title, pm.payment_method_id, pm.payment_method_title, SUM(p.payment_amount) AS payment_amount
FROM mpt_payments p
JOIN mpt_companies c ON p.company_id = c.company_id
JOIN mpt_payment_methods pm ON pm.payment_method_id = p.payment_method_id
GROUP BY p.company_id, p.payment_method_id
) distinct_company_payments
GROUP BY distinct_company_payments.company_id
;
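The aggregate-twice idea in the answers above can be sketched as follows, using the tables from the question. This assumes SQLite instead of MySQL, so `||` and `printf` replace MySQL's `CONCAT` and its decimal formatting, and the order of items inside each concatenated list is not guaranteed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE mpt_companies (company_id INTEGER, company_title TEXT);
    CREATE TABLE mpt_payment_methods (payment_method_id INTEGER, payment_method_title TEXT);
    CREATE TABLE mpt_payments (payment_id INTEGER, company_id INTEGER,
                               payment_method_id INTEGER, payment_amount REAL);
    INSERT INTO mpt_companies VALUES (1, 'Company A'), (2, 'Company B');
    INSERT INTO mpt_payment_methods VALUES (1, 'Cash'), (2, 'PayPal'), (3, 'Wire');
    INSERT INTO mpt_payments VALUES
        (1, 1, 1, 10.00), (2, 2, 3, 15.00), (3, 1, 1, 20.00), (4, 1, 2, 10.00);
    """
)

rows = conn.execute(
    """
    SELECT x.company_id, x.company_title,
           group_concat(x.method_total) AS payment_data
    FROM (
        -- Inner aggregation: one row per company and payment method.
        SELECT c.company_id, c.company_title,
               pm.payment_method_title || ':' ||
                   printf('%.2f', SUM(p.payment_amount)) AS method_total
        FROM mpt_companies c
        JOIN mpt_payments p ON p.company_id = c.company_id
        JOIN mpt_payment_methods pm ON pm.payment_method_id = p.payment_method_id
        GROUP BY c.company_id, c.company_title,
                 pm.payment_method_id, pm.payment_method_title
    ) x
    -- Outer aggregation: one row per company, concatenating the per-method sums.
    GROUP BY x.company_id, x.company_title
    ORDER BY x.company_id
    """
).fetchall()

for row in rows:
    print(row)
```

The inner query does the SUM per company and method, so the outer GROUP_CONCAT only ever sees plain strings and no aggregate is nested inside another.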

Select corresponding non-aggregated column after group by statment in MySQL

I have a temporary table I've derived from a much larger table.
+-----+----------+---------+
| id | phone | attempt |
+-----+----------+---------+
| 1 | 12345678 | 15 |
| 2 | 87654321 | 0 |
| 4 | 12345678 | 16 |
| 5 | 12345678 | 14 |
| 10 | 87654321 | 1 |
| 11 | 87654321 | 2 |
+-----+----------+---------+
I need to find the id (unique) corresponding to the highest attempt made on each phone number. Phone and attempt are not unique.
SELECT id, MAX(attempt) FROM temp2 GROUP BY phone
The above query does not return the id for the corresponding max attempt.

Try this:
select
t.*
from temp2 t
inner join (
select phone, max(attempt) attempt
from temp2
group by phone
) t2 on t.phone = t2.phone
and t.attempt = t2.attempt;
It will return rows with max attempts for a given number.
Note that this returns multiple ids for a phone when several of its rows share that phone's maximum attempt value.
Demo here
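A short sketch of the tie behaviour described above, using SQLite as a stand-in for MySQL. The row with id 12 is hypothetical and is added only to create a tie; picking the lowest id among tied rows is one possible tiebreak, not something the question specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temp2 (id INTEGER, phone TEXT, attempt INTEGER)")
conn.executemany(
    "INSERT INTO temp2 VALUES (?, ?, ?)",
    [
        (1, "12345678", 15), (2, "87654321", 0), (4, "12345678", 16),
        (5, "12345678", 14), (10, "87654321", 1), (11, "87654321", 2),
        (12, "87654321", 2),  # hypothetical row that ties with id 11
    ],
)

# Join against the per-phone maximum: tied rows are all returned.
all_max = conn.execute(
    """
    SELECT t.id, t.phone, t.attempt
    FROM temp2 t
    JOIN (SELECT phone, MAX(attempt) AS attempt
          FROM temp2
          GROUP BY phone) m
      ON t.phone = m.phone AND t.attempt = m.attempt
    ORDER BY t.id
    """
).fetchall()

# Assumed tiebreak: keep the lowest id among the tied rows.
one_per_phone = conn.execute(
    """
    SELECT MIN(t.id) AS id, t.phone, t.attempt
    FROM temp2 t
    JOIN (SELECT phone, MAX(attempt) AS attempt
          FROM temp2
          GROUP BY phone) m
      ON t.phone = m.phone AND t.attempt = m.attempt
    GROUP BY t.phone, t.attempt
    ORDER BY t.phone
    """
).fetchall()

print(all_max)
print(one_per_phone)
```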

As an alternative to the answer given by @GurV, you could also solve this using a correlated subquery:
SELECT t1.*
FROM temp2 t1
WHERE t1.attempt = (SELECT MAX(t2.attempt) FROM temp2 t2 WHERE t2.phone = t1.phone)
This has the advantage of being a bit less verbose. But I would probably go with the join option because it will scale better for large data sets.
Demo

MySQL - Get records from INNER JOIN not between dates

I have two tables
Accounts:
+------------+--------+
| accountsid | name |
+------------+--------+
| 1 | Bob |
| 2 | Rachel |
| 3 | Mark |
+------------+--------+
Sales Orders
+--------------+------------+------------+--------+
| salesorderid | accountsid | so_date | amount |
+--------------+------------+------------+--------+
| 1 | 1 | 2015-12-16 | 50 |
| 2 | 1 | 2016-01-13 | 20 |
| 3 | 2 | 2015-12-14 | 10 |
| 4 | 3 | 2016-02-14 | 35 |
+--------------+------------+------------+--------+
As you can see, this is a 1-N relation: an Account has many Sales Orders and each Sales Order has one Account.
I need to retrieve "old" Accounts that are no longer active. For example, an Account with no Sales Order in 2016 is an inactive Account.
So, in this example the result will be ONLY Rachel.
How can I retrieve this? I think it's the "opposite" of BETWEEN, but I can't figure out how to do it...
Thanks.
PS. Despite the title, a solution without INNER JOIN is fine.

You're looking to effect an anti-join, for which there are three possibilities in MySQL:
Using NOT IN:
SELECT a.*
FROM Accounts a
WHERE a.accountsid NOT IN (
SELECT so.accountsid
FROM `Sales Orders` so
WHERE so.so_date >= '2016-01-01'
)
Using NOT EXISTS:
SELECT a.*
FROM Accounts a
WHERE NOT EXISTS (
SELECT *
FROM `Sales Orders` so
WHERE so.accountsid = a.accountsid
AND so.so_date >= '2016-01-01'
)
Using an outer JOIN:
SELECT a.*
FROM Accounts a LEFT JOIN `Sales Orders` so
ON so.accountsid = a.accountsid
AND so.so_date >= '2016-01-01'
WHERE so.accountsid IS NULL
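The three anti-join forms above can be compared side by side in a small sketch. It assumes SQLite rather than MySQL and renames the table to sales_orders to avoid quoting; the order with a NULL accountsid is hypothetical and is added to show one way the forms can differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE accounts (accountsid INTEGER, name TEXT);
    CREATE TABLE sales_orders (salesorderid INTEGER, accountsid INTEGER,
                               so_date TEXT, amount INTEGER);
    INSERT INTO accounts VALUES (1, 'Bob'), (2, 'Rachel'), (3, 'Mark');
    INSERT INTO sales_orders VALUES
        (1, 1, '2015-12-16', 50), (2, 1, '2016-01-13', 20),
        (3, 2, '2015-12-14', 10), (4, 3, '2016-02-14', 35);
    """
)

NOT_IN = """
    SELECT a.accountsid, a.name FROM accounts a
    WHERE a.accountsid NOT IN (SELECT so.accountsid FROM sales_orders so
                               WHERE so.so_date >= '2016-01-01')
"""
NOT_EXISTS = """
    SELECT a.accountsid, a.name FROM accounts a
    WHERE NOT EXISTS (SELECT 1 FROM sales_orders so
                      WHERE so.accountsid = a.accountsid
                        AND so.so_date >= '2016-01-01')
"""
LEFT_JOIN = """
    SELECT a.accountsid, a.name
    FROM accounts a
    LEFT JOIN sales_orders so
           ON so.accountsid = a.accountsid
          AND so.so_date >= '2016-01-01'
    WHERE so.accountsid IS NULL
"""

def run(sql):
    """Execute one of the queries and return all rows."""
    return conn.execute(sql).fetchall()

# With the question's data, all three forms agree.
before = [run(q) for q in (NOT_IN, NOT_EXISTS, LEFT_JOIN)]

# Hypothetical 2016 order with no account: the NOT IN subquery now yields a NULL.
conn.execute("INSERT INTO sales_orders VALUES (5, NULL, '2016-03-01', 5)")
after = [run(q) for q in (NOT_IN, NOT_EXISTS, LEFT_JOIN)]

print(before)
print(after)
```

Once the subquery can return a NULL, NOT IN evaluates to unknown for every non-matching account and returns no rows at all, while NOT EXISTS and the outer join still return Rachel. That is worth keeping in mind if accountsid is nullable.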

Why do you need to use only an inner join? An inner join is for cases where you have matching data in both tables, but in this case you don't. You need to use a subquery with either "not in" or "not exists".

What you want is to get the ids that didn't make any order, so get the ids that made some order; the rest of them are the ones that didn't make orders.
It should be something like this: SELECT * FROM Accounts WHERE accountsid NOT IN (SELECT accountsid FROM `Sales Orders` WHERE so_date > your_date)

sql - Why doesn't MAX() of SUM() work?

I am trying to understand why the SQL command MAX(SUM(col)) gives a syntax error. I have the two tables below:
+--------+--------+---------+-------+
| pname | rollno | address | score |
+--------+--------+---------+-------+
| A | 1 | CCU | 1234 |
| B | 2 | CCU | 2134 |
| C | 3 | MMA | 4321 |
| D | 4 | MMA | 1122 |
| E | 5 | CCU | 1212 |
+--------+--------+---------+-------+
Personnel Table
+--------+-------+----------+
| rollno | marks | sub |
+--------+-------+----------+
| 1 | 90 | SUB1 |
| 1 | 88 | SUB2 |
| 2 | 89 | SUB1 |
| 2 | 95 | SUB2 |
| 3 | 99 | SUB1 |
| 3 | 99 | SUB2 |
| 4 | 82 | SUB1 |
| 4 | 79 | SUB2 |
| 5 | 92 | SUB1 |
| 5 | 75 | SUB2 |
+--------+-------+----------+
Results Table
Essentially I have a details table and a results table. I want to find the name and marks of the candidate who has got the highest score in SUB1 and SUB2 combined. Basically the person with the highest aggregate marks.
I can find the summation of SUB1 and SUB2 for all candidates using the following query-:
select p.pname, sum(r.marks) from personel p,
result r where p.rollno=r.rollno group by p.pname;
It gives the following output-:
+--------+--------------+
| pname | sum(r.marks) |
+--------+--------------+
| A | 178 |
| B | 167 |
| C | 184 |
| D | 198 |
| E | 161 |
+--------+--------------+
This is fine but I need the output to be only D | 198 as he is the highest scorer. Now when I modify query like the following it fails-:
select p.pname, max(sum(r.marks)) from personel p,
result r where p.rollno=r.rollno group by p.pname;
In MySQL I get the error "Invalid use of group function".
Now searching on SO I did get my correct answer which uses derived tables. I get my answer by using the following query-:
SELECT
pname, MAX(max_sum)
FROM
(SELECT
p.pname AS pname, SUM(r.marks) AS max_sum
FROM
personel p, result r
WHERE
p.rollno = r.rollno
GROUP BY p.pname) a;
But my question is Why doesn't MAX(SUM(col)) work ?
I don't understand why MAX() can't take the value returned by SUM(). An answer on SO stated that since SUM() returns only a single value, it is meaningless for MAX() to compute the maximum of one value, but I have tested the following query:
select max(foo) from a;
on the Table "a" which has only one row with only one column called foo that holds an integer value. So if MAX() can't compute single values then how did this work ?
Can someone explain to me how the query processor executes the query and why I get the error of invalid group function ? From the readability point of view using MAX(SUM(col)) is perfect but it doesn't work out that way. I want to know why.
Are MAX and SUM never to be used together? I am asking because I have seen queries like MAX(COUNT(col)). I don't understand how that works and not this.

Aggregate functions require an argument that provides a value for each row in the group. Other aggregate functions don't do that.
It's not very sensical anyway. Suppose MySQL accepted MAX(SUM(col)) -- what would it mean? Well, the SUM(col) yields the sum of all non-NULL values of column col over all rows in the relevant group, which is a single number. You could take the MAX() of that to be that same number, but what would be the point?
Your approach using a subquery is different, at least in principle, because it aggregates twice. The inner aggregation, in which you perform the SUM(), computes a separate sum for each value of p.pname. The outer query then computes the maximum across all rows returned by the subquery (because you do not specify a GROUP BY in the outer query). If that's what you want, that's how you need to specify it.
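The two-step aggregation described above can be sketched like this, assuming SQLite in place of MySQL and a personel table trimmed to the two columns the query needs. The totals are computed from the Results rows exactly as listed in the question; with those rows the highest total is 198 for rollno 3 (C), so the names do not line up with the output pasted in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE personel (pname TEXT, rollno INTEGER);
    CREATE TABLE result (rollno INTEGER, marks INTEGER, sub TEXT);
    INSERT INTO personel VALUES ('A', 1), ('B', 2), ('C', 3), ('D', 4), ('E', 5);
    INSERT INTO result VALUES
        (1, 90, 'SUB1'), (1, 88, 'SUB2'), (2, 89, 'SUB1'), (2, 95, 'SUB2'),
        (3, 99, 'SUB1'), (3, 99, 'SUB2'), (4, 82, 'SUB1'), (4, 79, 'SUB2'),
        (5, 92, 'SUB1'), (5, 75, 'SUB2');
    """
)

# Step 1: the inner aggregation, one SUM per candidate.
PER_PERSON = """
    SELECT p.pname AS pname, SUM(r.marks) AS total
    FROM personel p
    JOIN result r ON p.rollno = r.rollno
    GROUP BY p.pname
"""

# Step 2: the outer aggregation, MAX over the rows produced by step 1.
max_total = conn.execute(
    f"SELECT MAX(total) FROM ({PER_PERSON}) AS t"
).fetchone()[0]

# To get the matching name as well, filter the per-person rows by that maximum.
top = conn.execute(
    f"""
    SELECT pname, total
    FROM ({PER_PERSON}) AS t
    WHERE total = (SELECT MAX(total) FROM ({PER_PERSON}) AS u)
    """
).fetchall()

print(max_total)
print(top)
```

Filtering by the maximum, rather than selecting pname next to MAX() in the outer query, keeps the name tied to the row that actually holds the highest total.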

The error is 1111: invalid use of group function. As for why specifically MySQL has this problem I can really only say it is part of the underlying engine itself. SELECT MAX(2) does work (in spite of a lack of a GROUP BY) but SELECT MAX(SUM(2)) does not work.
This error will occur when grouping/aggregating functions such as MAX are used in the wrong spot such as in a WHERE clause. SELECT SUM(MAX(2)) also does not work.
You can imagine that MySQL attempts to aggregate both simultaneously rather than doing things in an order of operations, i.e. it does not SUM first and then get the MAX. This is why you need to do the queries as separate steps.

Try something like this:
select max(rs.marksums) maxsum from
(
select p.pname, sum(r.marks) marksums from personel p,
result r where p.rollno=r.rollno group by p.pname
) rs

with temp_table (name, max_marks) as
(select p.pname, sum(r.marks) from personel p, result r where p.rollno = r.rollno group by p.pname)
select * from temp_table where max_marks = (select max(max_marks) from temp_table);
I didn't run this, and the WITH clause needs MySQL 8.0 or later. But try this one. Hope it will work :)

How to get the value of a row with Max aggregation function?

I have a table for comments :
+----------+---------------------+----------+
| match_id | timestampe | comment |
+----------+---------------------+----------+
| 100 | 2014-01-01 01:00:00 | Hi |
| 200 | 2014-01-01 01:10:00 | Hi1 |
| 300 | 2014-01-01 01:20:00 | Hi2 |
| 100 | 2014-01-01 01:01:00 | Hello |
| 100 | 2014-01-01 01:02:00 | Hello1 |
| 200 | 2014-01-01 01:11:00 | hey |
+----------+---------------------+----------+
I want to get the following information from the table
SELECT match_id, max(timestampe) as maxtimestamp, count(match_id) as comments_no
FROM comments
GROUP BY match_id
order by maxtimestamp DESC
The query above works well, but the problem is that I also want the comment belonging to the maxtimestamp.
How can I get the latest comment of each match (the comment of the maxtimestamp) using the most optimized query?

You can do it this way.
This is pretty optimal too.
SELECT c.comment, m.*
FROM
comments c
JOIN
(
SELECT t.match_id, max(t.timestampe) as maxtimestamp, count(t.match_id) as comments_no
FROM comments t
GROUP BY t.match_id
) m on c.match_id = m.match_id and c.timestampe = m.maxtimestamp
SQL Fiddle
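As a rough check of the join above, here is a sketch on the question's rows, assuming SQLite in place of MySQL and timestamps stored as text. An ORDER BY is added only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (match_id INTEGER, timestampe TEXT, comment TEXT)")
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?)",
    [
        (100, "2014-01-01 01:00:00", "Hi"),
        (200, "2014-01-01 01:10:00", "Hi1"),
        (300, "2014-01-01 01:20:00", "Hi2"),
        (100, "2014-01-01 01:01:00", "Hello"),
        (100, "2014-01-01 01:02:00", "Hello1"),
        (200, "2014-01-01 01:11:00", "hey"),
    ],
)

# The derived table m holds one row per match (latest timestamp and count);
# joining back on match_id and timestamp picks up the comment of that row.
latest_comments = conn.execute(
    """
    SELECT c.comment, m.match_id, m.maxtimestamp, m.comments_no
    FROM comments c
    JOIN (SELECT t.match_id,
                 MAX(t.timestampe) AS maxtimestamp,
                 COUNT(t.match_id) AS comments_no
          FROM comments t
          GROUP BY t.match_id) m
      ON c.match_id = m.match_id AND c.timestampe = m.maxtimestamp
    ORDER BY m.maxtimestamp DESC
    """
).fetchall()

for row in latest_comments:
    print(row)
```

If two comments of one match shared the same latest timestamp, both would appear, so a unique key or an extra tiebreak column would be needed in that case.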

I'm not sure about MySQL but Oracle supports window functions, so I can write something like:
select distinct match_id,
first_value(comment) over (partition by match_id order by timestampe desc) as latest_comment
from comments

Here's the easy way to do it with mysql:
SELECT * from (
SELECT match_id, timestampe as maxtimestamp, comment
FROM comments
order by maxtimestamp DESC) x
GROUP BY match_id
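This exploits MySQL's non-standard handling of GROUP BY. It only runs with ONLY_FULL_GROUP_BY disabled (that mode is on by default since MySQL 5.7.5), and MySQL does not guarantee which row's values it returns for the non-grouped columns, so treat the result as unreliable.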

To avoid a subquery, you can use the query below:
SELECT match_id,timestampe,comment,
IF(@prevMatchId IS NULL OR @prevMatchId != match_id,@row:=1,@row:=@row+1) as row,
@prevMatchId := match_id
FROM comments
HAVING row = 1
ORDER BY match_id,timestampe DESC
Try using EXPLAIN to see which queries are more optimal.
Here's EXPLAIN on two queries: http://sqlfiddle.com/#!2/70efa/9/1 I am not all that familiar with EXPLAIN, so maybe some experts can interpret it.
Here's EXPLAIN on the two queries after I added indexes on match_id and timestampe: http://sqlfiddle.com/#!2/30266/1/1
