SQL Max value in a group [duplicate] - mysql

This question already has answers here:
SQL select only rows with max value on a column [duplicate]
(27 answers)
Closed 3 years ago.
I'm struggling to do something in SQL which I'm sure must be simple, but I can't figure it out. I want the MAX() value of a group, but I also want the value of another column in the same row as the max value. Here is an example table definition:
mysql> desc Sales;
+---------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------+---------+------+-----+---------+-------+
| StoreID | int(11) | YES | | NULL | |
| Day | int(11) | YES | | NULL | |
| Amount | int(11) | YES | | NULL | |
+---------+---------+------+-----+---------+-------+
3 rows in set (0.00 sec)
And here is some data for it:
mysql> SELECT * FROM Sales;
+---------+------+--------+
| StoreID | Day | Amount |
+---------+------+--------+
| 1 | 1 | 44 |
| 1 | 2 | 31 |
| 1 | 3 | 91 |
| 2 | 1 | 93 |
| 2 | 2 | 32 |
| 2 | 3 | 41 |
| 3 | 1 | 48 |
| 3 | 2 | 95 |
| 3 | 3 | 12 |
+---------+------+--------+
9 rows in set (0.00 sec)
What I want to know is, what Day had the most sales (Amount) for each StoreID.
Now I know I can do this:
SELECT StoreID, MAX(Amount) FROM Sales GROUP BY StoreID;
+---------+-------------+
| StoreID | MAX(Amount) |
+---------+-------------+
| 1 | 91 |
| 2 | 93 |
| 3 | 95 |
+---------+-------------+
3 rows in set (0.00 sec)
That tells me the max amount of each store, but really what I'm after is the day that it occurred. But I can't add Day back into the query because it's not in the GROUP BY, and I don't think I really want to group by that value, do I?
I'm not sure where to go from here.
In short, the results I want should look like this:
+---------+------+--------+
| 1 | 3 | 91 |
| 2 | 1 | 93 |
| 3 | 2 | 95 |
+---------+------+--------+

You want to filter. Here is one simple method using a correlated subquery:
select s.*
from Sales s
where s.Amount = (select max(s2.Amount)
from Sales s2
where s2.StoreID = s.StoreID
);
If your data is on the large side, you will want an index on Sales(StoreID, Amount).

For the maximum amounts per store there won't exist a higher amount for the same store.
SELECT *
FROM Sales s
WHERE NOT EXISTS (
SELECT 1
FROM Sales s2
WHERE s2.StoreID = s.StoreID
AND s2.Amount > s.Amount
)
ORDER BY Amount ASC, StoreID ASC;

Typically you can just join the aggregating query back to get the rest of the row data...
SELECT s.*
FROM Sales AS s
INNER JOIN (
SELECT StoreID, MAX(Amount) AS MaxAmount
FROM Sales
GROUP BY StoreID
) AS m ON s.StoreID = m.StoreID AND s.Amount = m.MaxAmount
;
If there are multiple Sales with the MaxAmount for the same StoreID, the query will return all of them, not just one of them.
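To see the tie behaviour concretely, here is a minimal sketch that runs the join-back query against the sample data from the question. It assumes an in-memory SQLite database driven from Python's standard `sqlite3` module as a stand-in for MySQL; the extra row for day 4 is made up purely to create a tie.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (StoreID INT, Day INT, Amount INT)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [(1, 1, 44), (1, 2, 31), (1, 3, 91),
     (2, 1, 93), (2, 2, 32), (2, 3, 41),
     (3, 1, 48), (3, 2, 95), (3, 3, 12)],
)

# Aggregate per store, then join back to recover the full rows.
JOIN_BACK = """
SELECT s.StoreID, s.Day, s.Amount
FROM Sales AS s
INNER JOIN (
    SELECT StoreID, MAX(Amount) AS MaxAmount
    FROM Sales
    GROUP BY StoreID
) AS m ON s.StoreID = m.StoreID AND s.Amount = m.MaxAmount
ORDER BY s.StoreID, s.Day
"""

best_days = conn.execute(JOIN_BACK).fetchall()
print(best_days)  # [(1, 3, 91), (2, 1, 93), (3, 2, 95)]

# Hypothetical extra row: day 4 ties store 1's maximum of 91.
conn.execute("INSERT INTO Sales VALUES (1, 4, 91)")
with_tie = conn.execute(JOIN_BACK).fetchall()
print(with_tie)  # store 1 now appears twice, once per tied day
```

If only one row per store is wanted even when amounts tie, some tie-breaking rule (for example the earliest Day) has to be added on top of this.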

Related

How to select other columns of a table when grouping? [duplicate]

This question already has answers here:
SQL select only rows with max value on a column [duplicate]
(27 answers)
Closed 1 year ago.
Please assume this table:
// mytable
+--------+-------------+---------+
| num | business_id | user_id |
+--------+-------------+---------+
| 3 | 503 | 12 |
| 7 | 33 | 12 |
| 1 | 771 | 13 |
| 2 | 86 | 13 |
| 1 | 772 | 13 |
| 4 | 652 | 14 |
| 4 | 567 | 14 |
+--------+-------------+---------+
I need to group it by user_id, so here is my query:
select max(num), user_id from mytable
group by user_id
Here is the result:
// res
+--------+---------+
| num | user_id |
+--------+---------+
| 7 | 12 |
| 2 | 13 |
| 4 | 14 |
+--------+---------+
Now I need to also get the business_id of those rows. Here is the expected result:
// mytable
+--------+-------------+---------+
| num | business_id | user_id |
+--------+-------------+---------+
| 7 | 33 | 12 |
| 2 | 86 | 13 |
| 4 | 567 | 14 | -- This is selected randomly, because of the equality of values
+--------+-------------+---------+
Any idea how I can do that?
You don't group. You filter. One method uses window functions such as row_number():
select t.*
from (select t.*,
row_number() over (partition by user_id order by num desc) as seqnum
from mytable t
) t
where seqnum = 1;
Another method which can have slightly better performance with an index on (user_id, num) is a correlated subquery:
select t.*
from mytable t
where t.num = (select max(t2.num)
from mytable t2
where t2.user_id = t.user_id
);
You should think "group by" when you want to summarize rows. You should think "where" when you want to choose rows with particular characteristics.
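The two methods differ when several rows share the maximum. The sketch below runs both against the question's data, assuming Python's `sqlite3` module with an SQLite build that supports window functions (3.25 or later) in place of MySQL. The extra `business_id` sort key in the window is an added assumption so that the chosen row is deterministic; without it the pick among tied rows is arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite >= 3.25 assumed for ROW_NUMBER()
conn.execute("CREATE TABLE mytable (num INT, business_id INT, user_id INT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(3, 503, 12), (7, 33, 12), (1, 771, 13), (2, 86, 13),
     (1, 772, 13), (4, 652, 14), (4, 567, 14)],
)

# Window-function version: exactly one row per user_id.
one_per_user = conn.execute("""
    SELECT num, business_id, user_id
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY user_id
                                  ORDER BY num DESC, business_id) AS seqnum
        FROM mytable AS t
    ) AS ranked
    WHERE seqnum = 1
    ORDER BY user_id
""").fetchall()

# Correlated-subquery version: every row that equals the per-user maximum.
all_maxima = conn.execute("""
    SELECT t.num, t.business_id, t.user_id
    FROM mytable AS t
    WHERE t.num = (SELECT MAX(t2.num)
                   FROM mytable AS t2
                   WHERE t2.user_id = t.user_id)
    ORDER BY t.user_id, t.business_id
""").fetchall()

print(one_per_user)  # [(7, 33, 12), (2, 86, 13), (4, 567, 14)]
print(all_maxima)    # user 14 appears twice because 652 and 567 tie on num = 4
```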

Distinct order-number sequence for every customer

I have a table of orders. Each customer (identified by the email field) has his own orders. I need to give a different sequence of order numbers to each customer. Here is an example:
----------------------------
| email | number |
----------------------------
| test@com.com | 1 |
----------------------------
| example@com.com | 1 |
----------------------------
| test@com.com | 2 |
----------------------------
| test@com.com | 3 |
----------------------------
| client@aaa.com | 1 |
----------------------------
| example@com.com | 2 |
----------------------------
Is it possible to do that in a simple way with MySQL?
If you want to update the data in this table after an insert, you first need a primary key; a simple auto-increment column does the job.
After that you can try to work out various scripts to fill the number column, but as you can see from the other answers, they are not such a "simple way".
I suggest assigning the order number in the insert statement, obtaining it with this "simpler" query:
select coalesce(max(`number`), 0)+1
from orders
where email='test1@test.com'
If you want to do everything in a single insert (better for performance and to avoid concurrency problems):
insert into orders (email, `number`, other_field)
select 'test1@test.com', coalesce(max(`number`), 0) + 1, 'note...'
from orders where email = 'test1@test.com';
(The email is selected as a literal so that the insert also works for a customer's first order, when no existing row matches.)
To be more confident that the same customer is never assigned two orders with the same number, I strongly suggest adding a unique constraint on the columns (email, number).
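Here is a small sketch of the single-statement insert together with the unique constraint, assuming an in-memory SQLite database via Python's `sqlite3` in place of MySQL. The `add_order` helper and the `id` column are hypothetical names introduced for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        id     INTEGER PRIMARY KEY AUTOINCREMENT,
        email  TEXT NOT NULL,
        number INTEGER NOT NULL,
        UNIQUE (email, number)   -- guards against duplicate numbers per customer
    )
""")

def add_order(email):
    """Insert one order, numbering it max(number) + 1 for that customer."""
    # The email is bound as a literal, so the first order of a new customer
    # (no matching rows, MAX() is NULL) still gets number 1.
    conn.execute(
        "INSERT INTO orders (email, number) "
        "SELECT ?, COALESCE(MAX(number), 0) + 1 FROM orders WHERE email = ?",
        (email, email),
    )

for email in ["test@com.com", "example@com.com", "test@com.com",
              "test@com.com", "client@aaa.com", "example@com.com"]:
    add_order(email)

rows = conn.execute("SELECT email, number FROM orders ORDER BY id").fetchall()
for row in rows:
    print(row)
```

The numbers come out as 1, 1, 2, 3, 1, 2, matching the table in the question. Under concurrent writers the unique constraint makes a colliding insert fail instead of silently duplicating a number, so the caller would need to retry.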
Create a column order_number:
SELECT @i:=1000;
UPDATE yourTable SET order_number = @i:=@i+1;
This keeps incrementing the value in the order_number column, starting right after 1000. You can change the starting value, or you can even use the primary key as the order number, since it is always unique.
I think you need one more column for this type of output.
Example:
+------+------+
| i | j |
+------+------+
| 1 | 11 |
| 1 | 12 |
| 1 | 13 |
| 2 | 21 |
| 2 | 22 |
| 2 | 23 |
| 3 | 31 |
| 3 | 32 |
| 3 | 33 |
| 4 | 14 |
+------+------+
You can get this result:
+------+------+------------+
| i | j | row_number |
+------+------+------------+
| 1 | 11 | 1 |
| 1 | 12 | 2 |
| 1 | 13 | 3 |
| 2 | 21 | 1 |
| 2 | 22 | 2 |
| 2 | 23 | 3 |
| 3 | 31 | 1 |
| 3 | 32 | 2 |
| 3 | 33 | 3 |
| 4 | 14 | 1 |
+------+------+------------+
By running this query, which doesn't need any variable defined:
SELECT a.i, a.j, count(*) as `row_number` FROM test a
JOIN test b ON a.i = b.i AND a.j >= b.j
GROUP BY a.i, a.j
Hope that helps!
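As a quick check, the sketch below runs the self-join count next to a `ROW_NUMBER()` window function on the same data and compares the results. It assumes Python's `sqlite3` with an SQLite build that has window functions (3.25 or later) rather than MySQL; the alias `rn` is used instead of `row_number` only to sidestep keyword clashes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite >= 3.25 assumed for ROW_NUMBER()
conn.execute("CREATE TABLE test (i INT, j INT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [(1, 11), (1, 12), (1, 13), (2, 21), (2, 22), (2, 23),
     (3, 31), (3, 32), (3, 33), (4, 14)],
)

# Self-join: a row's number is how many rows in its group have j <= its own j.
by_self_join = conn.execute("""
    SELECT a.i, a.j, COUNT(*) AS rn
    FROM test AS a
    JOIN test AS b ON a.i = b.i AND a.j >= b.j
    GROUP BY a.i, a.j
    ORDER BY a.i, a.j
""").fetchall()

# Window function: number the rows of each group directly.
by_window = conn.execute("""
    SELECT i, j, ROW_NUMBER() OVER (PARTITION BY i ORDER BY j) AS rn
    FROM test
    ORDER BY i, j
""").fetchall()

print(by_self_join == by_window)  # True for this data
```

The two agree here because j is unique within each i. With duplicate j values the self-join gives tied rows the same count, while `ROW_NUMBER()` still assigns distinct numbers.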
You can add the number using a SELECT statement, without adding any columns to the orders table.
Try this:
SELECT email,
(CASE email
WHEN @email
THEN @rownumber := @rownumber + 1
ELSE @rownumber := 1 AND @email:= email END) as number
FROM orders
JOIN (SELECT @rownumber:=0, @email:='') AS t

MySQL historical data lookup

I've been looking around and trying to get this to work but I can't seem to get it. I have 2 tables:
TABLE: products
| id | name | some more values |
|----|-----------|------------------|
| 1 | Product 1 | Value 1 |
| 2 | Product 2 | Value 2 |
| 3 | Product 3 | Value 3 |
TABLE: value
| pid | value | stamp |
|-----|-----------|------------------|
| 1 | 7 | 2015-07-11 |
| 2 | 4 | 2015-07-11 |
| 3 | 8 | 2015-07-11 |
| 1 | 9 | 2015-07-21 |
| 2 | 4 | 2015-07-21 |
| 3 | 6 | 2015-07-21 |
The first table simply has a list of products; the second table has a value for each product (by pid) and the timestamp of the value. Note: timestamps are not recorded every day, nor are they evenly spaced.
What I would like, is a resulting table like this:
| id | name | some more values | value now | value last month |
|----|-----------|------------------|-----------|------------------|
| 1 | Product 1 | Value 1 | 9 | 7 |
| 2 | Product 2 | Value 2 | 4 | 4 |
| 3 | Product 3 | Value 3 | 6 | 8 |
where 'value now' is the value at the newest timestamp, and 'value last month' is the value at the timestamp closest to the newest timestamp minus 30 days. Keep in mind that there might not be a timestamp exactly 30 days back, so the query will need to find the closest one. (Whether it looks only forward or only backward doesn't matter; it's an approximation.)
I have made some huge queries but I'm pretty sure there must be an easier way... Any help would be appreciated.
Assuming you get the target month and year from PHP or from a MySQL function, here is an untested query that I hope will work the first time:
SELECT p.*, vn.v_now, vl.v_lastmonth FROM products p
LEFT JOIN (SELECT pid, `value` AS v_now FROM value ORDER BY stamp DESC) AS vn ON p.id=vn.pid
LEFT JOIN (SELECT pid, `value` AS v_lastmonth FROM value
WHERE month(stamp)='$month' AND year(stamp)='$year'
ORDER BY stamp DESC) AS vl ON p.id=vl.pid
You can use GROUP BY to get one row per product in the result.

MySQL query SELECT FROM 2 tables, COUNT the most used

I have these 2 tables and I need to return the most used office. Note: 1 office can be used by more than 1 guy, and the column ido in TableB is populated from TableA.
It is probably a query with GROUP BY and DESC LIMIT 1.
TableA
| ido| office | guy |
---------------------
| 1 | office1| guy1|
| 2 | office2| guy2|
| 3 | office1| guy3|
| 4 | office1| guy4|
| 5 | office5| guy5|
| 6 | office2| guy6|
TableB
| idb| vizit | ido|
---------------------
| 1 | date | 4 |
| 2 | date | 2 |
| 3 | date | 5 |
| 4 | date | 6 |
| 5 | date | 1 |
| 6 | date | 6 |
Thanks!
You were correct in that GROUP BY, LIMIT and DESC are useful here; they lead to a fairly straightforward query:
SELECT TableA.office
FROM TableA
JOIN TableB
ON TableA.ido = TableB.ido
GROUP BY TableA.office
ORDER BY COUNT(*) DESC
LIMIT 1
What it does is basically create rows with all valid combinations, counting the number of generated rows per office. A plain descending sort by that count will give you the most frequently used office.
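To make the counting visible, here is a sketch that runs the same join on the question's data but keeps the per-office count and drops the LIMIT, so the full ranking can be inspected. It assumes SQLite via Python's `sqlite3` rather than MySQL; the `visits` alias and the secondary sort on office name are additions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (ido INT, office TEXT, guy TEXT)")
conn.execute("CREATE TABLE TableB (idb INT, vizit TEXT, ido INT)")
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?, ?)",
    [(1, "office1", "guy1"), (2, "office2", "guy2"), (3, "office1", "guy3"),
     (4, "office1", "guy4"), (5, "office5", "guy5"), (6, "office2", "guy6")],
)
conn.executemany(
    "INSERT INTO TableB VALUES (?, ?, ?)",
    [(1, "date", 4), (2, "date", 2), (3, "date", 5),
     (4, "date", 6), (5, "date", 1), (6, "date", 6)],
)

# One joined row per visit; count them per office and rank descending.
usage = conn.execute("""
    SELECT TableA.office, COUNT(*) AS visits
    FROM TableA
    JOIN TableB ON TableA.ido = TableB.ido
    GROUP BY TableA.office
    ORDER BY visits DESC, TableA.office
""").fetchall()

most_used = usage[0][0]
print(usage)      # [('office2', 3), ('office1', 2), ('office5', 1)]
print(most_used)  # office2
```

Note that LIMIT 1 returns a single office even when two offices are tied for first place.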

how to approach this in MySql query?

I want to select data according to a condition: I have a table with physician_key and the corresponding quality score for a given month. I want to select the count of distinct physicians with quality score 1 or 2.
For a month, there could be several entries for a physician_key, each with an assigned quality (on a scale of 1-7). I want to count only those physicians who have quality 1 or 2; if the same physician also has quality > 2 in the given month, I don't want to count that physician. I want the information by product and month.
I created an example table, since you didn't provide one:
mysql> select * from sales_mkt_rep_qual;
+-------------------+---------+-------+-------------------+
| GEO_PHYSICIAN_KEY | product | month | SALES_REP_QUALITY |
+-------------------+---------+-------+-------------------+
| 1 | a | 8 | 1 |
| 1 | a | 8 | 2 |
| 1 | a | 8 | 3 |
| 2 | b | 8 | 2 |
| 2 | b | 8 | 1 |
| 2 | b | 9 | 2 |
| 1 | a | 9 | 2 |
| 2 | b | 9 | 3 |
| 3 | a | 9 | 2 |
+-------------------+---------+-------+-------------------+
The query from your comment indeed gives an error:
SELECT COUNT(DISTINCT GEO_PHYSICIAN_KEY) AS encount_1to2,
product,MONTH
FROM sales_mkt_rep_qual
WHERE MAX(SALES_REP_QUALITY) = 2 ;
ERROR 1111 (HY000): Invalid use of group function
If you change that to:
SELECT DISTINCT geo_physician_key AS encount_1to2, product, month
FROM sales_mkt_rep_qual
WHERE (geo_physician_key,month,product)
NOT IN (
SELECT geo_physician_key, month, product
FROM sales_mkt_rep_qual
WHERE sales_rep_quality >2 );
you see the detailed result:
+--------------+---------+-------+
| encount_1to2 | product | month |
+--------------+---------+-------+
| 2 | b | 8 |
| 1 | a | 9 |
| 3 | a | 9 |
+--------------+---------+-------+
Now, you can introduce the counting:
SELECT COUNT(distinct geo_physician_key ) AS no_of_physicians,product, month
FROM sales_mkt_rep_qual
WHERE (geo_physician_key,month,product)
NOT IN (
SELECT geo_physician_key, month, product
FROM sales_mkt_rep_qual WHERE sales_rep_quality >2 )
GROUP BY month, product;
+------------------+---------+-------+
| no_of_physicians | product | month |
+------------------+---------+-------+
| 1 | b | 8 |
| 2 | a | 9 |
+------------------+---------+-------+
If that still isn't what you are looking for, give more specific table structure and data example.
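The same counts can also be obtained without NOT IN, by grouping per physician first and keeping only the groups whose highest quality is at most 2. The sketch below checks this on the example table above, assuming SQLite through Python's `sqlite3` in place of MySQL; the derived-table alias `ok` is arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sales_mkt_rep_qual (
        geo_physician_key INT, product TEXT, month INT, sales_rep_quality INT
    )
""")
conn.executemany(
    "INSERT INTO sales_mkt_rep_qual VALUES (?, ?, ?, ?)",
    [(1, "a", 8, 1), (1, "a", 8, 2), (1, "a", 8, 3),
     (2, "b", 8, 2), (2, "b", 8, 1), (2, "b", 9, 2),
     (1, "a", 9, 2), (2, "b", 9, 3), (3, "a", 9, 2)],
)

# Inner query: one row per (physician, product, month) whose worst score is <= 2.
# Outer query: count those physicians per product and month.
counts = conn.execute("""
    SELECT product, month, COUNT(*) AS no_of_physicians
    FROM (
        SELECT geo_physician_key, product, month
        FROM sales_mkt_rep_qual
        GROUP BY geo_physician_key, product, month
        HAVING MAX(sales_rep_quality) <= 2
    ) AS ok
    GROUP BY product, month
    ORDER BY month, product
""").fetchall()

print(counts)  # [('b', 8, 1), ('a', 9, 2)]
```

Putting the aggregate in HAVING rather than WHERE is what avoids the "Invalid use of group function" error shown earlier.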
Try this:
SELECT count(*)
FROM (SELECT physician_key
FROM my_table
WHERE month = desired_month
GROUP BY physician_key
HAVING max(quality) <= 2) AS t
Actually I want the data to be like the output below:
+--------------+---------+-------+
| encount_1to2 | product | MONTH |
+--------------+---------+-------+
| 2 | b | 8 |
+--------------+---------+-------+
and for the criterion SALES_REP_QUALITY <= 2, isn't there a possibility that, while selecting the distinct geo physician key, it might pick one of the first 2 rows because they match the criterion? That's the reason I used Thanix's approach of the max function with group by product and month, so that the aggregate function is applied to every product within a month.