How to group by count of related records - mysql

I have two tables, invoices and products.
invoices: id, store
products: id, invoice_id
I want a result set that shows how many invoices exist for each quantity of products.
I mean, if I have 2 invoices with 3 products each on store A, it will show Store: A, Products qty: 3, Number of invoices (with three products): 2
Another example:
| store | products_qty | count |
| A | 1 | 10 |
| A | 2 | 7 |
| A | 5 | 12 |
| B | 5 | 12 |
Meaning, store A has 10 invoices with 1 product. 7 with 2 products, and 12 with 5 products...
I've tried with something like:
SELECT store, count(p.id), count(i.id) FROM invoices i
LEFT JOIN products p ON (p.invoice_id = i.id)
GROUP BY store, count(i.id)
however my GROUP BY clause is not valid; it fails with "Invalid use of group function".
How can I accomplish this?

I was able to do it using a subquery; I wonder if it is possible without one:
SELECT store, products_qty, count(*) FROM (
SELECT store, count(p.id) as products_qty, count(i.id) as invoices_count
FROM invoices i
LEFT JOIN products p ON (p.invoice_id = i.id)
GROUP BY store, i.id
) AS temp GROUP BY store, products_qty;
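For anyone who wants to try the two-level aggregation above, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for MySQL, and the sample rows are made up for illustration (they are not the asker's data). The inner query produces one row per invoice with its product count; the outer query counts invoices per (store, product count).

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (id INTEGER PRIMARY KEY, store TEXT);
    CREATE TABLE products (id INTEGER PRIMARY KEY, invoice_id INTEGER);
    INSERT INTO invoices (id, store) VALUES (1, 'A'), (2, 'A'), (3, 'A'), (4, 'B');
    -- invoice 1 and 2 have three products each, invoice 3 has one, invoice 4 has two
    INSERT INTO products (invoice_id)
    VALUES (1), (1), (1), (2), (2), (2), (3), (4), (4);
""")

rows = conn.execute("""
    SELECT store, products_qty, COUNT(*) AS invoices_count
    FROM (
        -- one row per invoice, with the number of products on it
        SELECT i.store, COUNT(p.id) AS products_qty
        FROM invoices i
        LEFT JOIN products p ON p.invoice_id = i.id
        GROUP BY i.store, i.id
    ) AS per_invoice
    GROUP BY store, products_qty
    ORDER BY store, products_qty
""").fetchall()

for row in rows:
    print(row)
```

Store A ends up with one invoice holding 1 product and two invoices holding 3 products, which is the shape asked for in the question.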

Related

SQL left join: how to return the newest from tableB and grouped by another field

I've been trying for two days, without luck.
I have the following simplified tables in my database:
customers:
| id | name |
| 1 | andrea |
| 2 | marco |
| 3 | giovanni |
access:
| id | name_id | date |
| 1 | 1 | 5000 |
| 2 | 1 | 4000 |
| 3 | 2 | 1500 |
| 4 | 2 | 3000 |
| 5 | 2 | 1000 |
| 6 | 3 | 6000 |
| 7 | 3 | 2000 |
I want to return all the names with their last access date.
At first I tried simply with
SELECT * FROM customers LEFT JOIN access ON customers.id =
access.name_id
But I got 7 rows instead of the 3 I expected. So I understood that I need to use a GROUP BY statement, as follows:
SELECT * FROM customers LEFT JOIN access ON customers.id =
access.name_id GROUP BY customers.id
As far as I know, GROUP BY keeps an arbitrary row from each group. In fact, I got unordered access dates in several tests.
Instead I need to group every customer id with its corresponding latest access! How can this be done?
You have to get the latest date per name_id from the access table with a GROUP BY, then join this result with the customers table. Here is the query:
select c.id, c.name, a.last_access_date from customers c left join
(select name_id, max(date) last_access_date from access group by name_id) a
on c.id=a.name_id;
Here is a DEMO on sqlfiddle.
I think this is what you'd like to achieve:
SELECT c.id, c.name, max(a.date) last_access
FROM customers c
LEFT JOIN access a ON c.id = a.name_id
GROUP BY c.id, c.name
The LEFT JOIN returns every row in the customers table regardless of whether the join condition (c.id = a.name_id) is satisfied. This means that you might get some NULL entries.
Example:
Simply add a new row to the customers table (id: 4, name: manuela). The output will have 4 rows, and the new row will be (id: 4, name: manuela, last_access: NULL).
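The LEFT JOIN plus MAX() behaviour described above can be checked with a short sketch. It assumes an in-memory SQLite database in place of MySQL, loads the tables from the question, and adds the extra customer (id 4, manuela) who has no access rows.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE access (id INTEGER PRIMARY KEY, name_id INTEGER, date INTEGER);
    INSERT INTO customers VALUES
        (1, 'andrea'), (2, 'marco'), (3, 'giovanni'), (4, 'manuela');
    INSERT INTO access VALUES
        (1, 1, 5000), (2, 1, 4000), (3, 2, 1500), (4, 2, 3000),
        (5, 2, 1000), (6, 3, 6000), (7, 3, 2000);
""")

rows = conn.execute("""
    SELECT c.id, c.name, MAX(a.date) AS last_access
    FROM customers c
    LEFT JOIN access a ON c.id = a.name_id
    GROUP BY c.id, c.name
    ORDER BY c.id
""").fetchall()

for row in rows:
    print(row)  # the customer without access rows gets None (NULL)
```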
I would do this using a correlated subquery in the ON clause:
SELECT a.*, c.*
FROM customers c LEFT JOIN
access a
ON c.id = a.name_id AND
a.DATE = (SELECT MAX(a2.date) FROM access a2 WHERE a2.name_id = a.name_id);
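A sketch of the correlated-subquery variant, again assuming in-memory SQLite as a stand-in for MySQL and using the question's data. Unlike the MAX() with GROUP BY form, this returns the whole latest access row, so the access id is available too. Note that if two access rows for the same customer share the maximum date, both would be returned.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE access (id INTEGER PRIMARY KEY, name_id INTEGER, date INTEGER);
    INSERT INTO customers VALUES (1, 'andrea'), (2, 'marco'), (3, 'giovanni');
    INSERT INTO access VALUES
        (1, 1, 5000), (2, 1, 4000), (3, 2, 1500), (4, 2, 3000),
        (5, 2, 1000), (6, 3, 6000), (7, 3, 2000);
""")

rows = conn.execute("""
    SELECT c.name, a.id AS access_id, a.date
    FROM customers c
    LEFT JOIN access a
           ON c.id = a.name_id
          -- keep only the access row holding this customer's latest date
          AND a.date = (SELECT MAX(a2.date)
                        FROM access a2
                        WHERE a2.name_id = a.name_id)
    ORDER BY c.id
""").fetchall()

for row in rows:
    print(row)
```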
If this statement is true:
I need to group every customer id with its corresponding latest access! How can this be done?
Then you can simply do:
select a.name_id, max(a.date)
from access a
group by a.name_id;
You do not need the customers table because:
All customers are in access, so the left join is not necessary.
You need no columns from customers.

MySQL SUM of one column, DISTINCT of ID column

I'm trying to create a summary report of our orders but having trouble extracting all my required data in a single query.
The data I'd like to extract:
subtotal - SUM of all sale prices
delivery total - SUM of all orders deliveryTotal
orders - COUNT of DISTINCT orderIds
quantity - SUM of all quantity ordered
Orders table (simplified for this example)
| orderId | deliveryTotal | total |
|---------|---------------|-------|
| 1 | 5 | 15 |
| 2 | 5 | 15 |
| 3 | 7.50 | 27.50 |
Order items table
| orderItemId | orderId | productId | salePrice | quantity |
|-------------|---------|-----------|-----------|----------|
| 1 | 1 | 1 | 10 | 1 |
| 2 | 2 | 1 | 10 | 1 |
| 3 | 3 | 1 | 10 | 1 |
| 4 | 3 | 2 | 10 | 1 |
My current query for extracting this data is
SELECT
SUM(i.salePrice * i.quantity) as subtotal,
SUM(DISTINCT o.deliveryTotal) as deliveryTotal,
COUNT(DISTINCT o.orderId) as orders,
SUM(i.quantity) as quantity
FROM orderItems i
INNER JOIN orders o ON o.orderId = i.orderId
Which results in a correct subtotal, order count and quantity sum. But delivery total is returned as 12.50 when I'm after 17.50. If I do SUM(o.deliveryTotal) it will return 25.
EDIT: Desired results
| subtotal | deliveryTotal | orders | quantity |
|----------|---------------|--------|----------|
| 40.00 | 17.50 | 3 | 4 |
https://tiaashish.wordpress.com/2014/01/31/mysql-sum-for-distinct-rows-with-left-join/
Here is a blog post that shows exactly what I was looking for. Maybe this can help others too.
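The formula is something like this (note that it is exact only when every order joins to the same number of item rows; on the sample data above it gives 25 * 3 / 4 = 18.75 rather than 17.50):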
SUM(o.deliveryTotal) * COUNT(DISTINCT o.orderId) / COUNT(*)
Because of the join, the SUM(DISTINCT deliveryTotal) aggregate is being applied to a rowset including the values 5, 5, 7.5, 7.5 (distinct 5 + 7.5 = 12.5).
The rows your SUM() acted on become more apparent if you simply do
SELECT o.*
FROM orderItems i
INNER JOIN orders o ON o.orderId = i.orderId
Instead you are asking for the SUM() of all the values in deliveryTotal, irrespective of their position in the join with orderItems. That means you need to apply the aggregate at a different level.
Since you are not intending to add a GROUP BY later, the easiest way to do that is to use a subselect whose purpose is only to get the SUM() across the whole table.
SELECT
SUM(i.salePrice * i.quantity) as subtotal,
-- deliveryTotal sum as a subselect
(SELECT SUM(deliveryTotal) FROM orders) as deliveryTotal,
COUNT(DISTINCT o.orderId) as orders,
SUM(i.quantity) as quantity
FROM orderItems i
INNER JOIN orders o ON o.orderId = i.orderId
Subselects are usually discouraged, but here the subselect carries no significant performance penalty; it is no different from the alternative of using a join. The calculation has to be done as a separate aggregate from the existing join no matter what. Other methods would place a subquery CROSS JOIN in the FROM clause, which does the same thing we did here with the subselect. Performance would be the same.
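To see the row multiplication and the subselect fix side by side, here is a small sketch. It assumes an in-memory SQLite database standing in for MySQL and loads the sample orders and order items from the question.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (orderId INTEGER PRIMARY KEY, deliveryTotal REAL, total REAL);
    CREATE TABLE orderItems (orderItemId INTEGER PRIMARY KEY, orderId INTEGER,
                             productId INTEGER, salePrice REAL, quantity INTEGER);
    INSERT INTO orders VALUES (1, 5, 15), (2, 5, 15), (3, 7.5, 27.5);
    INSERT INTO orderItems VALUES
        (1, 1, 1, 10, 1), (2, 2, 1, 10, 1), (3, 3, 1, 10, 1), (4, 3, 2, 10, 1);
""")

# Aggregating deliveryTotal over the join: order 3 appears twice (two items).
over_join = conn.execute("""
    SELECT SUM(o.deliveryTotal),
           SUM(DISTINCT o.deliveryTotal),
           SUM(o.deliveryTotal) * COUNT(DISTINCT o.orderId) / COUNT(*)
    FROM orderItems i
    INNER JOIN orders o ON o.orderId = i.orderId
""").fetchone()

# Aggregating deliveryTotal in its own subselect, outside the join.
with_subselect = conn.execute("""
    SELECT SUM(i.salePrice * i.quantity) AS subtotal,
           (SELECT SUM(deliveryTotal) FROM orders) AS deliveryTotal,
           COUNT(DISTINCT o.orderId) AS orders,
           SUM(i.quantity) AS quantity
    FROM orderItems i
    INNER JOIN orders o ON o.orderId = i.orderId
""").fetchone()

print(over_join)       # plain SUM, SUM(DISTINCT), and the ratio formula
print(with_subselect)  # subtotal, deliveryTotal, orders, quantity
```

On this data none of the three aggregates computed over the join gives 17.50; only the subselect does.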
Select per order in the inner select and then sum it up:
Select
SUM(subtotal) as subtotal,
sum(deliveryTotal) as deliveryTotal,
count(1) as orders,
sum(quantity) as quantity
from (
SELECT
SUM(i.salePrice * i.quantity) as subtotal,
o.deliveryTotal as deliveryTotal,
SUM(i.quantity) as quantity
FROM orders o
INNER JOIN orderItems i ON o.orderId = i.orderId
group by o.orderId) as sub
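A sketch of the per-order approach above, assuming in-memory SQLite in place of MySQL and the question's sample data. The inner query collapses the items to one row per order, so each deliveryTotal is counted once by the outer SUM(). One thing to keep in mind: because of the INNER JOIN, an order with no items would be left out of every total.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (orderId INTEGER PRIMARY KEY, deliveryTotal REAL, total REAL);
    CREATE TABLE orderItems (orderItemId INTEGER PRIMARY KEY, orderId INTEGER,
                             productId INTEGER, salePrice REAL, quantity INTEGER);
    INSERT INTO orders VALUES (1, 5, 15), (2, 5, 15), (3, 7.5, 27.5);
    INSERT INTO orderItems VALUES
        (1, 1, 1, 10, 1), (2, 2, 1, 10, 1), (3, 3, 1, 10, 1), (4, 3, 2, 10, 1);
""")

summary = conn.execute("""
    SELECT SUM(subtotal), SUM(deliveryTotal), COUNT(*), SUM(quantity)
    FROM (
        -- one row per order: item totals plus that order's delivery charge
        SELECT SUM(i.salePrice * i.quantity) AS subtotal,
               o.deliveryTotal AS deliveryTotal,
               SUM(i.quantity) AS quantity
        FROM orders o
        INNER JOIN orderItems i ON o.orderId = i.orderId
        GROUP BY o.orderId, o.deliveryTotal
    ) AS per_order
""").fetchone()

print(summary)
```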
The query below returns exactly what you need:
SELECT SUM(conctable.subtotal),
SUM(conctable.deliveryTotal),
SUM(conctable.orders),
SUM(conctable.quantity) from
(SELECT SUM(i.salePrice * i.quantity) as subtotal,
o.deliveryTotal as deliveryTotal,
COUNT(DISTINCT o.orderId) as orders,
SUM(i.quantity) as quantity
FROM orderItems i
JOIN orders o ON o.orderId = i.orderId group by i.orderid) as conctable;

MySQL Aggregate Function with group by and join

I have the following table schemas and I want to get the sum of the amount column for each category and the count of employees in the corresponding categories.
employees
id | name | category
1 | SC | G 1.2
2 | BK | G 2.2
3 | LM | G 2.2
payroll_histories
id | employee_id | amount
1 | 1 | 1000
2 | 1 | 500
3 | 2 | 200
4 | 2 | 100
5 | 3 | 300
Output table should look like this:
category | total | count
G 1.2 | 1500 | 1
G 2.2 | 600 | 2
I have tried the query below. It sums and groups correctly, but I cannot get the count to work.
SELECT
employee_id,
category,
SUM(amount) from payroll_histories,employees
WHERE employees.id=payroll_histories.employee_id
GROUP BY category;
I have also tried COUNT(category), but that does not work either.
You are, I believe, seeking two different summaries of your data. One is a sum of salaries by category, and the other is a count of employees, also by category.
You need to use, and then join, separate aggregate queries to get this.
SELECT a.category, a.amount, b.cnt
FROM (
SELECT e.category, SUM(p.amount) amount
FROM employees e
JOIN payroll_histories p ON e.id = p.employee_id
GROUP BY e.category
) a
JOIN (
SELECT category, COUNT(*) cnt
FROM employees
GROUP BY category
) b ON a.category = b.category
The general principle here is to avoid trying to use just one aggregate query to aggregate more than one kind of detail entity. Your amount aggregates payroll totals, whereas your count aggregates employees.
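The "aggregate separately, then join" principle can be tried out with the sketch below. It assumes in-memory SQLite as a stand-in for MySQL and uses the question's employees and payroll_histories rows.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE payroll_histories (id INTEGER PRIMARY KEY,
                                    employee_id INTEGER, amount INTEGER);
    INSERT INTO employees VALUES (1, 'SC', 'G 1.2'), (2, 'BK', 'G 2.2'), (3, 'LM', 'G 2.2');
    INSERT INTO payroll_histories VALUES
        (1, 1, 1000), (2, 1, 500), (3, 2, 200), (4, 2, 100), (5, 3, 300);
""")

rows = conn.execute("""
    SELECT a.category, a.amount, b.cnt
    FROM (
        -- payroll totals per category
        SELECT e.category, SUM(p.amount) AS amount
        FROM employees e
        JOIN payroll_histories p ON e.id = p.employee_id
        GROUP BY e.category
    ) a
    JOIN (
        -- head count per category, computed without touching payroll rows
        SELECT category, COUNT(*) AS cnt
        FROM employees
        GROUP BY category
    ) b ON a.category = b.category
    ORDER BY a.category
""").fetchall()

for row in rows:
    print(row)
```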
Alternatively, for your specific case, this query will also work. But it doesn't generalize well or necessarily perform well.
SELECT e.category, SUM(p.amount) amount, COUNT(DISTINCT e.id) cnt
FROM employees e
JOIN payroll_histories p ON e.id = p.employee_id
GROUP BY e.category
The COUNT(DISTINCT....) will fix the combinatorial explosion that comes from the join.
(Pro tip: use the explicit join rather than the outmoded table,table WHERE form of the join. It's easier to read.)
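To make the COUNT(DISTINCT ...) point concrete, the sketch below puts COUNT(*) and COUNT(DISTINCT e.id) next to each other in the single-query version. It assumes in-memory SQLite in place of MySQL and the question's data. COUNT(*) counts payroll rows, one per joined row, while COUNT(DISTINCT e.id) counts employees.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE payroll_histories (id INTEGER PRIMARY KEY,
                                    employee_id INTEGER, amount INTEGER);
    INSERT INTO employees VALUES (1, 'SC', 'G 1.2'), (2, 'BK', 'G 2.2'), (3, 'LM', 'G 2.2');
    INSERT INTO payroll_histories VALUES
        (1, 1, 1000), (2, 1, 500), (3, 2, 200), (4, 2, 100), (5, 3, 300);
""")

rows = conn.execute("""
    SELECT e.category,
           SUM(p.amount)        AS amount,
           COUNT(*)             AS joined_rows,  -- one per payroll row
           COUNT(DISTINCT e.id) AS employees     -- one per employee
    FROM employees e
    JOIN payroll_histories p ON e.id = p.employee_id
    GROUP BY e.category
    ORDER BY e.category
""").fetchall()

for row in rows:
    print(row)
```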

Duplicated Data When Joining 4 Tables in MySql [duplicate]

I have 4 tables, with the relevant columns summarized here:
customers:
id
name
credits:
id
customer_id # ie customers.id
amount
sales:
id
customer_id # ie customers.id
sales_items:
id
sale_id # ie sales.id
price
discount
The idea is that customers lists all of our customers, credits lists each time they have paid us, sales lists each time they have bought things from us (but not what things they bought) and sales_items lists all of the items they bought at each of those sales. So you can see that credits and sales both relate back to customers, but sales_items only relates back to sales.
As an example dataset, consider:
customers:
id | name
5 | Carter
credits:
id | customer_id | amount
1 | 5 | 100
sales:
id | customer_id
3 | 5
sales_items:
id | sale_id | price | discount
7 | 3 | 5 | 0
8 | 3 | 0 | 0
9 | 3 | 10 | 0
I have tried this in MySQL:
SELECT c.*,
SUM( cr.amount ) AS paid,
SUM( i.price + i.discount ) AS bought
FROM customers AS c
LEFT JOIN sales AS s ON s.customer_id = c.id
LEFT JOIN sales_items AS i ON i.sale_id = s.id
LEFT JOIN credits AS cr ON cr.customer_id = c.id
WHERE c.id = 5
But it returns:
id | name | paid | bought
5 | Carter | 300 | 15
If I omit the SUM() functions, it returns:
id | name | paid | bought
5 | Carter | 100 | 5
5 | Carter | 100 | 0
5 | Carter | 100 | 10
So it looks like it's returning one row for every record matched in sales_items, but it's filling in the amount column with same value from credits each time. I see that this is happening, but I'm not understanding why it's happening.
So, two questions:
1. What is happening that it's smearing that one value through all of the rows?
2. What SQL can I throw at MySQL so that I can get this back:
id | name | paid | bought
5 | Carter | 100 | 15
I know that I could break it all up in subqueries, but is there a way to do it just with joins? I was hoping to learn a thing or two about joins as I tackled this problem. Thank you.
Edit: I created an SQL Fiddle for this: http://sqlfiddle.com/#!2/0051b/1/0
select c.id, c.name, sum(i.price + i.discount) AS bought, cr.amount AS paid
from customers c, credits cr, sales s, sales_items i
where s.customer_id = c.id
and i.sale_id = s.id
and cr.customer_id = c.id and c.id = 5
group by c.id, c.name, cr.amount;
I'm not very sure, but try this. Use group by; that is surely the solution.
Please try this:
SELECT c.*,
( SELECT SUM( cr.amount ) FROM credits cr
WHERE cr.customer_id = c.id ) AS paid,
SUM( i.price + i.discount ) AS bought
FROM customers AS c INNER JOIN sales s ON s.customer_id = c.id
INNER JOIN sales_items i ON i.sale_id = s.id
WHERE c.id = 5 GROUP BY c.id
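On the "why" part of the question: each sales_items row joins to the same credits row, so the single 100 credit appears once per item and SUM() adds it three times. The sketch below reproduces that and then shows one common way around it, which is to aggregate each child table in its own derived table before joining. This is not taken from the answers above; it assumes in-memory SQLite as a stand-in for MySQL, and the aliases paid_by_customer and bought_by_customer are made up for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE credits (id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER);
    CREATE TABLE sales (id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE sales_items (id INTEGER PRIMARY KEY, sale_id INTEGER,
                              price INTEGER, discount INTEGER);
    INSERT INTO customers VALUES (5, 'Carter');
    INSERT INTO credits VALUES (1, 5, 100);
    INSERT INTO sales VALUES (3, 5);
    INSERT INTO sales_items VALUES (7, 3, 5, 0), (8, 3, 0, 0), (9, 3, 10, 0);
""")

# The original query: the credit row is repeated once per sales item.
naive = conn.execute("""
    SELECT c.id, c.name, SUM(cr.amount) AS paid, SUM(i.price + i.discount) AS bought
    FROM customers c
    LEFT JOIN sales s ON s.customer_id = c.id
    LEFT JOIN sales_items i ON i.sale_id = s.id
    LEFT JOIN credits cr ON cr.customer_id = c.id
    WHERE c.id = 5
    GROUP BY c.id, c.name
""").fetchone()

# Each child table is reduced to one row per customer before the join.
pre_aggregated = conn.execute("""
    SELECT c.id, c.name, p.paid, b.bought
    FROM customers c
    LEFT JOIN (
        SELECT customer_id, SUM(amount) AS paid
        FROM credits
        GROUP BY customer_id
    ) AS p ON p.customer_id = c.id          -- hypothetical alias: paid_by_customer
    LEFT JOIN (
        SELECT s.customer_id, SUM(i.price + i.discount) AS bought
        FROM sales s
        JOIN sales_items i ON i.sale_id = s.id
        GROUP BY s.customer_id
    ) AS b ON b.customer_id = c.id          -- hypothetical alias: bought_by_customer
    WHERE c.id = 5
""").fetchone()

print(naive)
print(pre_aggregated)
```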

Exclusive mysql select query, two tables

I have the following tables (they all got more columns but I'm just showing the ones of interest):
Product:
| id_product | id_supplier |
| 12 | 2 |
| 32 | 4 |
| 56 | 2 |
| 10 | 1 |
Order details:
| id_order | id_product |
| 1 | 56 |
| 2 | 32 |
| 2 | 56 |
| 4 | 56 |
| 3 | 12 |
Orders:
| id_order |
| 1 |
| 2 |
| 3 |
| 4 |
What I want to do is select all orders which have products from ONLY one or more suppliers. So let's say I want all orders that only have products from the supplier with id 2 (id_supplier = 2): I should get the orders with id 1, 3 and 4.
If I want all orders that ONLY have products from the supplier with id 4 (id_supplier = 4) I should get an empty result.
If I want all orders that ONLY have products from the suppliers with id 2 AND 4 I should get the order with id 2.
I've read the following question: mySQL exclusive records but I can't get a grip of that query to work when I have two tables like I have. I just need another pair of eyes to help me out here! :)
Do you have any idea on how I'll do this?
EDIT: To clarify, I want to fetch all orders that ONLY contain products from one or more specified suppliers. Orders with products from suppliers other than those specified should not be included.
Per the clarifications above, I think THIS is what you want, and it can be done with a join.
select
od.id_order,
sum( if( p.id_supplier in ( 2, 4 ), 1, 0 )) as HasSupplierLookingFor,
sum( if( p.id_supplier in ( 2, 4 ), 0, 1 )) as HasOtherSuppliers
from
order_Details od
join product p
on od.id_product = p.id_product
group by
od.id_order
having
HasSupplierLookingFor > 0
AND HasOtherSuppliers = 0
Sometimes answering a question that is somewhat ambiguous as presented leads to misdirected answers. This query works on a per-order basis: it joins to the products to find the suppliers and groups by the order id.
For each product ordered, the first SUM() asks whether it is from one of the suppliers you ARE looking for; if so, it adds 1, otherwise 0. The next SUM() asks the same thing, but if it IS one of those suppliers it adds zero, so all OTHER suppliers get the 1.
So the HAVING clause looks for any order in which at least one product came from your suppliers AND no other suppliers are represented.
So you could have an order with 30 items, and 20 from supplier 2, and 10 from supplier 4. The HasSupplierLookingFor would = 30, and HasOtherSuppliers = 0, the order would be included.
Another order could have 5 items. One from supplier 2, and 4 others from supplier 9. This would have HasSupplierLookingFor = 1, and HasOtherSuppliers = 4, thus exclude this as a qualified order.
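The conditional-aggregation idea above can be run against the question's sample data with the sketch below. It assumes in-memory SQLite as a stand-in for MySQL, writes IF(...) as the equivalent CASE WHEN, repeats the SUM() expressions in HAVING instead of relying on aliases, and uses a made-up helper name, orders_only_from.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id_product INTEGER PRIMARY KEY, id_supplier INTEGER);
    CREATE TABLE order_details (id_order INTEGER, id_product INTEGER);
    INSERT INTO product VALUES (12, 2), (32, 4), (56, 2), (10, 1);
    INSERT INTO order_details VALUES (1, 56), (2, 32), (2, 56), (4, 56), (3, 12);
""")


def orders_only_from(suppliers):
    """Hypothetical helper: ids of orders whose products all come from `suppliers`."""
    marks = ",".join("?" * len(suppliers))  # e.g. "?,?" for two suppliers
    sql = f"""
        SELECT od.id_order
        FROM order_details od
        JOIN product p ON od.id_product = p.id_product
        GROUP BY od.id_order
        HAVING SUM(CASE WHEN p.id_supplier IN ({marks}) THEN 1 ELSE 0 END) > 0
           AND SUM(CASE WHEN p.id_supplier IN ({marks}) THEN 0 ELSE 1 END) = 0
        ORDER BY od.id_order
    """
    # The supplier list is bound twice, once per IN (...) clause.
    return [row[0] for row in conn.execute(sql, list(suppliers) * 2)]


print(orders_only_from([2]))
print(orders_only_from([4]))
print(orders_only_from([2, 4]))
```

With this query, asking for suppliers 2 and 4 together returns every order made up solely of products from 2 and/or 4, so orders 1, 3 and 4 qualify alongside order 2. If an order must contain all of the listed suppliers, an extra condition such as a COUNT(DISTINCT p.id_supplier) check would be needed.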
You should inner join all those tables, like this:
SELECT o.* from Orders o
INNER JOIN Details d ON o.id_order = d.id_order
INNER JOIN Products p ON d.id_product = p.id_product
WHERE p.id_supplier = 4
That will give you the orders which include products from that supplier.
SELECT o.id_order
FROM Orders o
INNER JOIN `Order details` od
ON o.id_order = od.id_order
INNER JOIN Product p
ON p.id_product = od.id_product
WHERE p.id_supplier IN (2,4)
The (2,4) are the suppliers you want to fetch. You can also ask for only one by writing (2).