Group by various column (with various joins) but sum distinct other column - mysql

I have to do some reporting involving various tables, with a couple of SUMs, COUNTs, etc., and everything works. The last thing I have to resolve is summing one column only once per distinct value of another column that is not among the grouped columns.
I'll give you an example (stripped down from what I have) so you can understand the tongue-twister in the previous paragraph.
Suppose I have a query with a couple of joins that get me this result, or a temporary table, or whatever:
(this is a trimmed-down version; in the original I have many more columns and GROUP BYs)
APP_ID CAT_ID CAT_DESCRIP APP_START APP_END DETAIL_ID DET_QTY DETAIL_PRICE
1 1 Categ One 900 960 1 10 150.00
1 1 Categ One 900 960 2 8 20.00
1 1 Categ One 900 960 3 12 30.00
1 1 Categ One 900 960 4 5 100.00
2 2 Categ Two 600 720 5 12 150.00
2 2 Categ Two 600 720 6 10 50.00
3 2 Categ Two 1200 1260 7 5 20.00
I need to get something like this (the last column, TimeInMinutesByCategory, is the important one):
SELECT
CAT_ID,
CAT_DESCRIP,
SUM(DET_QTY) as TotalQTY,
SUM(DETAIL_PRICE) as TotalPrice,
COUNT(DISTINCT APP_ID) as CountOfApps,
(GET THE SUM OF (APP_END - APP_START) ONLY ONE TIME BY APP_ID INTO THIS CATEG) as TimeInMinutesByCategory
FROM
MyTable
GROUP BY
CAT_ID
And the result has to give me this:
CAT_ID CAT_DESCRIP TotalQTY TotalPrice CountOfApps TimeInMinutesByCategory
1 Categ One 35 300.00 1 60
2 Categ Two 27 220.00 2 180
Thanks for your help!

I think this will do the job.
The idea is to summarize the data in a subquery by app_id and cat_id first. Within each app_id and cat_id pair, app_start and app_end are the same on every detail row, so MAX() is essentially just deduping, and MAX(app_end) - MAX(app_start) gives the duration once per app. The quantity and price sums are computed in the same subquery.
Then summarize the subquery by category id. Joining the subquery back to the detail rows would repeat each app's duration once per detail row and inflate the sum, so the outer query reads from the subquery only.
SELECT
    b.CAT_ID,
    b.CAT_DESCRIP,
    SUM(b.AppQTY) AS TotalQTY,
    SUM(b.AppPrice) AS TotalPrice,
    COUNT(DISTINCT b.APP_ID) AS CountOfApps,
    SUM(b.AppMinutes) AS TimeInMinutesByCategory
FROM (
    -- one row per app and category
    SELECT APP_ID, CAT_ID, MAX(CAT_DESCRIP) AS CAT_DESCRIP,
           SUM(DET_QTY) AS AppQTY,
           SUM(DETAIL_PRICE) AS AppPrice,
           MAX(APP_END) - MAX(APP_START) AS AppMinutes
    FROM MyTable
    GROUP BY APP_ID, CAT_ID) AS b
GROUP BY
    b.CAT_ID, b.CAT_DESCRIP
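To check the "aggregate once per APP_ID, then once per category" idea against the sample rows from the question, here is a minimal sketch in Python using an in-memory SQLite database. SQLite stands in for MySQL here, and the alias per_app and the App* column names are my own; treat it as a quick test harness, not a definitive implementation.

```python
import sqlite3

# Sample rows from the question:
# (APP_ID, CAT_ID, CAT_DESCRIP, APP_START, APP_END, DETAIL_ID, DET_QTY, DETAIL_PRICE)
rows = [
    (1, 1, "Categ One", 900, 960, 1, 10, 150.00),
    (1, 1, "Categ One", 900, 960, 2, 8, 20.00),
    (1, 1, "Categ One", 900, 960, 3, 12, 30.00),
    (1, 1, "Categ One", 900, 960, 4, 5, 100.00),
    (2, 2, "Categ Two", 600, 720, 5, 12, 150.00),
    (2, 2, "Categ Two", 600, 720, 6, 10, 50.00),
    (3, 2, "Categ Two", 1200, 1260, 7, 5, 20.00),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (APP_ID INTEGER, CAT_ID INTEGER, CAT_DESCRIP TEXT, "
    "APP_START INTEGER, APP_END INTEGER, DETAIL_ID INTEGER, "
    "DET_QTY INTEGER, DETAIL_PRICE REAL)"
)
conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)

# Inner query: one row per (APP_ID, CAT_ID), so each app's duration appears once.
# Outer query: roll those per-app rows up to one row per category.
query = """
SELECT per_app.CAT_ID,
       per_app.CAT_DESCRIP,
       SUM(per_app.AppQty)     AS TotalQTY,
       SUM(per_app.AppPrice)   AS TotalPrice,
       COUNT(*)                AS CountOfApps,
       SUM(per_app.AppMinutes) AS TimeInMinutesByCategory
FROM (
    SELECT APP_ID, CAT_ID, MAX(CAT_DESCRIP) AS CAT_DESCRIP,
           SUM(DET_QTY) AS AppQty,
           SUM(DETAIL_PRICE) AS AppPrice,
           MAX(APP_END) - MAX(APP_START) AS AppMinutes
    FROM MyTable
    GROUP BY APP_ID, CAT_ID
) AS per_app
GROUP BY per_app.CAT_ID, per_app.CAT_DESCRIP
ORDER BY per_app.CAT_ID
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

On the sample data this gives 60 minutes for category 1 and 180 minutes for category 2, matching the result the question asks for.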

Related

SQL - Max value from a group by when creating a new field

I have a database with a table called BOOKINGS containing the following values
main-id place-id start-date end-date
1 1 2018-8-1 2018-8-8
2 2 2018-6-6 2018-6-9
3 3 2018-5-5 2018-5-8
4 4 2018-4-4 2018-4-5
5 5 2018-3-3 2018-3-10
5 1 2018-1-1 2018-1-6
4 2 2018-2-1 2018-2-10
3 3 2018-3-1 2018-3-28
2 4 2018-4-1 2018-4-6
1 5 2018-5-1 2018-5-15
1 3 2018-6-1 2018-8-8
1 4 2018-7-1 2018-7-6
1 1 2018-8-1 2018-8-18
1 2 2018-9-1 2018-9-3
1 5 2018-10-1 2018-10-6
2 5 2018-11-1 2018-11-5
2 3 2018-12-1 2018-12-25
2 2 2018-2-2 2018-2-19
2 4 2018-4-4 2018-4-9
2 1 2018-5-5 2018-5-23
What I need to do is, for each main-id, find the largest total number of days across its place-ids. Basically, I need to determine where each main-id has spent the most time.
This information must then be put into a view, so unfortunately I can't use temporary tables.
The query that gets me the closest is
CREATE VIEW `MOSTTIME` (`main-id`,`place-id`,`total`) AS
SELECT `BOOKINGS`.`main-id`, `BOOKINGS`.`place-id`, SUM(DATEDIFF(`end-date`, `start-date`)) AS `total`
FROM `BOOKINGS`
GROUP BY `BOOKINGS`.`main-id`,`BOOKINGS`.`place-id`
Which yields:
main-id place-id total
1 1 24
1 2 18
1 5 5
2 1 2
2 2 20
2 4 9
3 1 68
3 2 24
3 3 30
4 1 5
4 2 10
4 4 1
5 1 19
5 2 4
5 5 7
What I need is then the max total for each distinct main-id:
main-id place-id total
1 1 24
2 2 20
3 1 68
4 2 10
5 1 19
I've dug through a large amount of similar posts that recommend things like self joins; however, due to the fact that I have to create the new field total using an aggregate function (SUM) and another function (DATEDIFF) rather than just querying an existing field, my attempts at implementing those solutions have been unsuccessful.
I am hoping that my query that got me close will only require a small modification to get the correct solution.
Having a hyphen (-) in a column name is a really bad idea, because it is also the minus operator. Do consider replacing it with an underscore (_).
One possible way is to use Derived Tables. One Derived Table is used to determine the total on a group of main id and place id. Another Derived Table is used to get maximum value out of them based on main id. We can then join back to get only the row corresponding to the maximum value.
CREATE VIEW `MOSTTIME` (`main-id`,`place-id`,`total`) AS
SELECT b1.main_id, b1.place_id, b1.total
FROM
(
SELECT `main-id` AS main_id,
`place-id` AS place_id,
SUM(DATEDIFF(`end-date`, `start-date`)) AS total
FROM BOOKINGS
GROUP BY main_id, place_id
) AS b1
JOIN
(
SELECT dt.main_id, MAX(dt.total) AS max_total
FROM
(
SELECT `main-id` AS main_id,
`place-id` AS place_id,
SUM(DATEDIFF(`end-date`, `start-date`)) AS total
FROM BOOKINGS
GROUP BY main_id, place_id
) AS dt
GROUP BY dt.main_id
) AS b2
ON b1.main_id = b2.main_id AND
b1.total = b2.max_total
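As a rough illustration of the two-derived-table approach, here is a small Python sketch against an in-memory SQLite database. Everything in it is an assumption made for the demo: the underscore column names, the ISO-formatted dates, the handful of hypothetical rows (not the question's data), and julianday() arithmetic standing in for MySQL's DATEDIFF().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bookings (main_id INTEGER, place_id INTEGER, "
    "start_date TEXT, end_date TEXT)"
)
# Hypothetical rows, chosen so the totals are easy to verify by hand.
conn.executemany(
    "INSERT INTO bookings VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2018-08-01", "2018-08-08"),  # 7 days
        (1, 1, "2018-08-10", "2018-08-15"),  # 5 days -> (1, 1) totals 12
        (1, 2, "2018-09-01", "2018-09-03"),  # 2 days
        (2, 2, "2018-06-06", "2018-06-09"),  # 3 days
        (2, 4, "2018-04-01", "2018-04-06"),  # 5 days
        (2, 4, "2018-04-10", "2018-04-11"),  # 1 day  -> (2, 4) totals 6
    ],
)

# Derived table 1: total days per (main_id, place_id).
totals = """
    SELECT main_id, place_id,
           SUM(CAST(julianday(end_date) - julianday(start_date) AS INTEGER)) AS total
    FROM bookings
    GROUP BY main_id, place_id
"""

# Derived table 2 (b2): the maximum total per main_id.
# Joining b1 to b2 keeps only the rows that hit that maximum.
query = f"""
SELECT b1.main_id, b1.place_id, b1.total
FROM ({totals}) AS b1
JOIN (
    SELECT dt.main_id, MAX(dt.total) AS max_total
    FROM ({totals}) AS dt
    GROUP BY dt.main_id
) AS b2
  ON b1.main_id = b2.main_id AND b1.total = b2.max_total
ORDER BY b1.main_id
"""
most_time = conn.execute(query).fetchall()
print(most_time)
```

Note that if two places tie for the maximum within one main_id, the join returns both rows.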
In MySQL 8+, you can use the ROW_NUMBER() window function instead:
CREATE VIEW `MOSTTIME` (`main-id`,`place-id`,`total`) AS
SELECT b.main_id, b.place_id, b.total
FROM
(
  SELECT dt.main_id,
         dt.place_id,
         dt.total,
         ROW_NUMBER() OVER (PARTITION BY dt.main_id
                            ORDER BY dt.total DESC) AS row_num
  FROM
  (
    SELECT `main-id` AS main_id,
           `place-id` AS place_id,
           SUM(DATEDIFF(`end-date`, `start-date`)) AS total
    FROM BOOKINGS
    GROUP BY main_id, place_id
  ) AS dt
) AS b
WHERE b.row_num = 1

Find the number of orders related to each order size in MySQL

Need some help with a MySQL query to be used in a larger database. Simplified here, I need to find the number of orders related to each order size.
I've been trying to get the query to work with a lot of combinations like: COUNT(DISTINCT item) or GROUP_CONCAT(DISTINCT order_id), GROUP BYs, ORDER BYs, HAVING COUNT(DISTINCT item_id), etc. but it's not turning out like I really need it to. Any help toward getting me going in the right direction would be greatly appreciated.
In this example table named items, the person with an order_id of 1 ordered three items, the person with an order_id of 4 ordered only one item, the person with an order_id of 5 ordered two items, etc. At the moment, they can only order up to three items max, but in the future, more items could be added so the query needs to be written in a way that can scale to 4 items, 5 items, etc.
Table name is: items
item_id order_id item
-------------------------------
1 1 apple
2 1 orange
3 1 grape
4 2 grape
5 3 apple
6 3 orange
7 4 apple
8 5 orange
9 5 apple
10 6 apple
11 6 orange
12 6 grape
13 7 orange
14 8 grape
In this example, the query result would be:
Number_of_Orders Total_Order_Size
----------------------------------------
4 1
2 2
2 3
You have to group by twice.
select item_count,count(*)
from (select order_id,count(*) as item_count
from items
group by order_id
) t
group by item_count
You can use two levels of aggregation:
select num_items, count(*) as num_orders
from (select order_id, count(*) as num_items
from items
group by order_id
) o
group by num_items
order by num_items;
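To see the two levels of aggregation at work on the question's sample table, here is a minimal Python sketch using in-memory SQLite (used here only as a stand-in for MySQL). The output column aliases follow the question's expected result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (item_id INTEGER PRIMARY KEY, order_id INTEGER, item TEXT)"
)
# Rows copied from the question's example table.
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [
        (1, 1, "apple"), (2, 1, "orange"), (3, 1, "grape"),
        (4, 2, "grape"),
        (5, 3, "apple"), (6, 3, "orange"),
        (7, 4, "apple"),
        (8, 5, "orange"), (9, 5, "apple"),
        (10, 6, "apple"), (11, 6, "orange"), (12, 6, "grape"),
        (13, 7, "orange"),
        (14, 8, "grape"),
    ],
)

# Level 1 (inner): number of items in each order.
# Level 2 (outer): number of orders that share each item count.
query = """
SELECT COUNT(*) AS Number_of_Orders, num_items AS Total_Order_Size
FROM (
    SELECT order_id, COUNT(*) AS num_items
    FROM items
    GROUP BY order_id
) AS o
GROUP BY num_items
ORDER BY num_items
"""
order_sizes = conn.execute(query).fetchall()
print(order_sizes)
```

Because the outer query groups on whatever counts the inner query produces, nothing needs to change when orders with 4 or 5 items start appearing.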

Combining results of two different tables in MYSQL

I have to combine two tables with different sets of columns by a 'salesperson' column.
The problem with the query I've got so far is that some salespeople names are duplicated, and some from the right table are missing.
Transactions table
salesperson, Profit, Units
John 100 1
John 50 1
Carl 200 2
Matt 300 3
Connections table
salesperson, Amount
Carl 100
Lynda 200
Lucy 300
Combined table
salesperson, (Amount+Profit), Units(sum)
Carl 300 2
John 150 2
Matt 300 3
Lynda 200 0
Lucy 300 0
This is what I've got so far
SELECT t.salesperson, SUM(t.profit) + SUM(c.amount), SUM(t.units)
FROM transactions AS t
FULL OUTER JOIN connections as c ON t.salesperson = c.salesperson
GROUP BY t.salesperson
ORDER BY t.salesperson ASC
Any help would be greatly appreciated.
SELECT salesperson, SUM(total), SUM(Units)
FROM
(
SELECT salesperson, Profit as total, Units
FROM Transactions
UNION ALL
SELECT salesperson, Amount as total, 0 as Units
FROM Connections
) T
GROUP BY salesperson
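The UNION ALL approach can be tried on the question's sample data with a short Python sketch over in-memory SQLite (a stand-in for MySQL). Profit comes from the transactions table and Amount from the connections table, as in the question; the column aliases total_amount and total_units are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (salesperson TEXT, profit INTEGER, units INTEGER)")
conn.execute("CREATE TABLE connections (salesperson TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [("John", 100, 1), ("John", 50, 1), ("Carl", 200, 2), ("Matt", 300, 3)],
)
conn.executemany(
    "INSERT INTO connections VALUES (?, ?)",
    [("Carl", 100), ("Lynda", 200), ("Lucy", 300)],
)

# Stack both tables into one (salesperson, total, units) shape, then aggregate.
# Connections rows contribute 0 units so they do not affect the unit sum.
query = """
SELECT salesperson, SUM(total) AS total_amount, SUM(units) AS total_units
FROM (
    SELECT salesperson, profit AS total, units FROM transactions
    UNION ALL
    SELECT salesperson, amount AS total, 0 AS units FROM connections
) AS combined_rows
GROUP BY salesperson
ORDER BY salesperson
"""
combined = conn.execute(query).fetchall()
for row in combined:
    print(row)
```

Every salesperson from either table appears exactly once, which avoids both the duplicates and the missing names that a join can produce.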

mysql get the columns sum and also get the distinct values at a time

I have a table like this:
id project_id client_id price
1 1 1 200
2 2 1 123
3 2 1 100
4 1 1 87
5 1 1 143
6 1 1 100
7 3 3 123
8 3 3 99
9 4 3 86
10 4 3 43
11 4 3 145
12 4 3 155
Now I want to sum the price column for rows with the same client_id.
For that I wrote this query:
Select `project_id`, SUM(`price`) FROM `table-name` GROUP BY `client_id`
This sums the price, but I only get two project_id values in the result. I want the result to list all the distinct projects for each client_id, with the price summed per client.
Can someone tell me how to do this? Any help and suggestions would be much appreciated. Thanks
You should not have "bare" columns in the SELECT list of a GROUP BY query that are not in the GROUP BY clause.
If you want the list of projects, you can get them in a list like this:
SELECT client_id, GROUP_CONCAT(project_id), SUM(price)
FROM `table-name`
GROUP BY client_id;
You only have two clients, which is why you are getting only two records. You can group by two columns:
Select `project_id`, SUM(`price`) FROM `table-name` GROUP BY `client_id`, `project_id`
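To compare grouping by client_id alone with grouping by client_id and project_id, here is a small Python sketch using in-memory SQLite as a stand-in for MySQL. The table name prices is hypothetical (the question only says `table-name`); the rows are the question's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prices (id INTEGER PRIMARY KEY, project_id INTEGER, "
    "client_id INTEGER, price INTEGER)"
)
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, 200), (2, 2, 1, 123), (3, 2, 1, 100), (4, 1, 1, 87),
        (5, 1, 1, 143), (6, 1, 1, 100), (7, 3, 3, 123), (8, 3, 3, 99),
        (9, 4, 3, 86), (10, 4, 3, 43), (11, 4, 3, 145), (12, 4, 3, 155),
    ],
)

# One row per client: the price summed over all of that client's projects.
per_client = conn.execute(
    "SELECT client_id, SUM(price) FROM prices "
    "GROUP BY client_id ORDER BY client_id"
).fetchall()

# One row per (client, project): every distinct project is listed.
per_project = conn.execute(
    "SELECT client_id, project_id, SUM(price) FROM prices "
    "GROUP BY client_id, project_id ORDER BY client_id, project_id"
).fetchall()

print(per_client)
print(per_project)
```

Grouping by both columns lists every project, but the sums are then per project rather than per client, so which query fits depends on which total you actually want.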

Access Totals Query Not Necessarily Returning First Record

I have a table of data like this:
id user_id A B C
=====================
1 15 1 2 3
2 15 1 2 5
3 20 1 3 9
4 20 1 3 7
I need to remove duplicate user ids and keep the record that sorts lowest when sorting by A then B then C. So using the above table, I set up a temp query (qry_temp) that simply does the sort--first on user_id, then on A, then on B, then on C. It returns the following:
id user_id A B C
====================
1 15 1 2 3
2 15 1 2 5
4 20 1 3 7
3 20 1 3 9
Then I wrote a Totals Query based on qry_temp that just had user_id (Group By) and then id (First), and I assumed this would return the following:
user_id id
===========
15 1
20 4
But it doesn't seem to do that--instead it appears to be just returning the lowest id in a group of duplicate user ids (so I get 1 and 3 instead of 1 and 4). Shouldn't the Totals query use the order of the query it's based upon? Is there a property setting in the query that might impact this or another way to get what I need? If it helps, here is the SQL:
SELECT qry_temp.user_id, First(qry_temp.ID) AS FirstOfID
FROM qry_temp
GROUP BY qry_temp.user_id;
You need a different type of query, for example:
SELECT tmp.id,
tmp.user_id,
tmp.a,
tmp.b,
tmp.c
FROM tmp
WHERE (( ( tmp.id ) IN (SELECT TOP 1 id
FROM tmp t
WHERE t.user_id = tmp.user_id
ORDER BY t.a,
t.b,
t.c,
t.id) ));
Where tmp is the name of your table. First, Last, Min and Max are not dependent on a sort order. In relational databases, sort orders are quite ephemeral.
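The same "pick the first row per group with a correlated subquery" idea can be tried outside Access. Below is a minimal Python sketch using in-memory SQLite; SQLite has no TOP, so LIMIT 1 is used instead, and the alias o for the outer table is my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tmp (id INTEGER PRIMARY KEY, user_id INTEGER, "
    "a INTEGER, b INTEGER, c INTEGER)"
)
# Rows from the question's table.
conn.executemany(
    "INSERT INTO tmp VALUES (?, ?, ?, ?, ?)",
    [(1, 15, 1, 2, 3), (2, 15, 1, 2, 5), (3, 20, 1, 3, 9), (4, 20, 1, 3, 7)],
)

# For each outer row, the subquery finds the id that sorts lowest by
# (a, b, c, id) among rows with the same user_id; only that row is kept.
query = """
SELECT o.id, o.user_id, o.a, o.b, o.c
FROM tmp AS o
WHERE o.id = (
    SELECT t.id
    FROM tmp AS t
    WHERE t.user_id = o.user_id
    ORDER BY t.a, t.b, t.c, t.id
    LIMIT 1
)
ORDER BY o.user_id
"""
lowest_rows = conn.execute(query).fetchall()
print(lowest_rows)
```

The explicit ORDER BY inside the subquery is what makes the choice deterministic; it does not rely on the order of any intermediate query.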