MySQL many-to-one and GROUP BY giving unusual results

I'm having difficulty with the following fairly simple setup:
CREATE TABLE IF NOT EXISTS invoices (
id int(11) NOT NULL auto_increment,
PRIMARY KEY (id)
);
CREATE TABLE IF NOT EXISTS invoices_items (
id int(11) NOT NULL auto_increment,
invoice_id int(11) NOT NULL,
description text NOT NULL,
amount decimal(10,2) NOT NULL default '0.00',
PRIMARY KEY (id)
);
CREATE TABLE IF NOT EXISTS invoices_payments (
id int(11) NOT NULL auto_increment,
invoice_id int(11) NOT NULL,
amount decimal(10,2) NOT NULL default '0.00',
PRIMARY KEY (id)
);
Some data:
INSERT INTO invoices (id) VALUES (1);
INSERT INTO invoices_items (id, invoice_id, description, amount) VALUES
(1, 1, 'Item 1', '750.00'),
(2, 1, 'Item 2', '750.00'),
(3, 1, 'Item 3', '50.00'),
(4, 1, 'Item 4', '150.00');
INSERT INTO invoices_payments (id, invoice_id, amount) VALUES
(1, 1, '50.00'),
(2, 1, '1650.00');
And the SQL yielding unusual results:
select invoices.id,
ifnull(sum(invoices_payments.amount),0) as payments_total,
ifnull(count(invoices_items.id),0) as item_count
from invoices
left join invoices_items on invoices_items.invoice_id=invoices.id
left join invoices_payments on invoices_payments.invoice_id=invoices.id
group by invoices.id
This results in the (erroneous) output:
id payments_total item_count
1 6800.00 8
Now, given that there are in fact only four invoices_items rows, I don't understand why MySQL is not grouping properly.
EDIT
I know I can do something like this:
select x.*, ifnull(sum(invoices_payments.amount),0) as payments_total from (
select invoices.id,
ifnull(count(invoices_items.id),0) as item_count
from invoices
left join invoices_items on invoices_items.invoice_id=invoices.id
group by invoices.id
) as x left join invoices_payments on invoices_payments.invoice_id=x.id
group by x.id
But I want to know if I'm doing something wrong in the first query. I can't immediately see why the first query is giving incorrect results! :(

Your join logic is incorrect. In your join, you specify invoices_items.invoice_id = invoices.id. You also specify invoices_payments.invoice_id = invoices.id. Because of transitivity, you end up with:
invoices_items.invoice_id = invoices.id
invoices_payments.invoice_id = invoices.id
invoices_items.invoice_id = invoices_payments.invoice_id
The sum of the 2 invoice payments is $1700. For every invoice payment, there are 4 invoices_items rows that satisfy the above relations, so each payment is summed 4 times: $1700 * 4 = $6800.
For every invoice item, there are 2 invoice payments that satisfy the above relations, so each item is counted twice: 4 invoice items * 2 = 8 count.
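The arithmetic above can be checked with a small script. This is only a sketch: it assumes an in-memory SQLite database as a stand-in for MySQL (the join semantics are the same here), and the table definitions are trimmed to the columns the query uses.

```python
import sqlite3

# Assumption: SQLite (in memory) stands in for MySQL; the join fan-out is identical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoices (id INTEGER PRIMARY KEY);
CREATE TABLE invoices_items (id INTEGER PRIMARY KEY, invoice_id INTEGER, amount REAL);
CREATE TABLE invoices_payments (id INTEGER PRIMARY KEY, invoice_id INTEGER, amount REAL);
INSERT INTO invoices (id) VALUES (1);
INSERT INTO invoices_items (id, invoice_id, amount) VALUES
  (1, 1, 750), (2, 1, 750), (3, 1, 50), (4, 1, 150);
INSERT INTO invoices_payments (id, invoice_id, amount) VALUES
  (1, 1, 50), (2, 1, 1650);
""")

JOINS = """
FROM invoices
LEFT JOIN invoices_items ON invoices_items.invoice_id = invoices.id
LEFT JOIN invoices_payments ON invoices_payments.invoice_id = invoices.id
"""

# Rows produced by the double join before any grouping: 4 items x 2 payments.
joined_rows = conn.execute("SELECT COUNT(*) " + JOINS).fetchone()[0]

# Aggregating over those rows sums every payment 4 times and counts every item twice.
inflated = conn.execute(
    "SELECT invoices.id, SUM(invoices_payments.amount), COUNT(invoices_items.id), "
    "COUNT(DISTINCT invoices_items.id) " + JOINS + " GROUP BY invoices.id"
).fetchone()

print(joined_rows)  # 8
print(inflated)     # (1, 6800.0, 8, 4)
```

COUNT(DISTINCT ...) repairs the item count, but there is no equally safe trick for the sum (SUM(DISTINCT amount) would collapse two payments of the same amount), which is why aggregating each table separately is the more robust fix.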

Two tables each have a many-to-one relationship with invoices, and joining both at once pairs every item with every payment of the same invoice. Your count is the size of that Cartesian product.
The payments should be applied to the invoice, not the invoice items. Get the invoice total first, then join the payments to it.
This may be similar to what you are looking for:
SELECT
invoice_total.invoice_id,
invoice_total.amount as invoice_amount,
payments_total.amount as total_paid
FROM
(
SELECT
invoice_id,
SUM(amount) as amount
FROM
invoices_items
GROUP BY
invoice_id
) invoice_total
INNER JOIN
(
SELECT
invoice_id,
SUM(amount) as amount
FROM
invoices_payments
GROUP BY
invoice_id
) payments_total
ON invoice_total.invoice_id = payments_total.invoice_id;

Edit:
Ah, sorry, I see your point now. The reason you're getting unexpected results is that this query:
SELECT *
FROM invoices
LEFT JOIN invoices_items ON invoices_items.invoice_id = invoices.id
LEFT JOIN invoices_payments ON invoices_payments.invoice_id = invoices.id;
results in this:
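| id | id | invoice_id | description | amount | id | invoice_id | amount |
|----|----|------------|-------------|--------|----|------------|---------|
| 1 | 1 | 1 | Item 1 | 750.00 | 1 | 1 | 50.00 |
| 1 | 1 | 1 | Item 1 | 750.00 | 2 | 1 | 1650.00 |
| 1 | 2 | 1 | Item 2 | 750.00 | 1 | 1 | 50.00 |
| 1 | 2 | 1 | Item 2 | 750.00 | 2 | 1 | 1650.00 |
| 1 | 3 | 1 | Item 3 | 50.00 | 1 | 1 | 50.00 |
| 1 | 3 | 1 | Item 3 | 50.00 | 2 | 1 | 1650.00 |
| 1 | 4 | 1 | Item 4 | 150.00 | 1 | 1 | 50.00 |
| 1 | 4 | 1 | Item 4 | 150.00 | 2 | 1 | 1650.00 |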
As you can see you get every invoices_items record once each for every invoices_payments record. You're going to have to grab (i.e. group) them separately.
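Note that the GROUP BY clause in your initial query only looks redundant because the sample data holds a single invoice. Keep it, so that you get one row per invoice once there are several.
Here's what you need:
SELECT
invoices.id,
-- IFNULL is applied outside the subquery so that invoices without payments show 0
IFNULL(payments_total.payments_total, 0) AS payments_total,
-- COUNT never returns NULL, so it needs no IFNULL
COUNT(invoices_items.id) AS item_count
FROM invoices
LEFT JOIN invoices_items ON invoices.id = invoices_items.invoice_id
LEFT JOIN (
SELECT invoice_id,
SUM(invoices_payments.amount) AS payments_total
FROM invoices_payments
GROUP BY invoice_id
) AS payments_total ON invoices.id = payments_total.invoice_id
GROUP BY invoices.id, payments_total.payments_total
;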
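As a rough check of the "aggregate each child table on its own, then join" idea, here is a sketch that runs it against two invoices. It assumes SQLite as a stand-in for MySQL; invoice 2 and its single item are hypothetical rows added only to show an invoice with no payments.

```python
import sqlite3

# Assumption: SQLite (in memory) stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoices (id INTEGER PRIMARY KEY);
CREATE TABLE invoices_items (id INTEGER PRIMARY KEY, invoice_id INTEGER, amount REAL);
CREATE TABLE invoices_payments (id INTEGER PRIMARY KEY, invoice_id INTEGER, amount REAL);
INSERT INTO invoices (id) VALUES (1), (2);  -- invoice 2 is hypothetical
INSERT INTO invoices_items (id, invoice_id, amount) VALUES
  (1, 1, 750), (2, 1, 750), (3, 1, 50), (4, 1, 150),
  (5, 2, 20);                               -- hypothetical item, nothing paid yet
INSERT INTO invoices_payments (id, invoice_id, amount) VALUES
  (1, 1, 50), (2, 1, 1650);
""")

# Each child table is reduced to one row per invoice before it is joined,
# so the two one-to-many relationships can no longer multiply each other.
totals = conn.execute("""
SELECT i.id,
       IFNULL(p.payments_total, 0) AS payments_total,
       IFNULL(it.item_count, 0)    AS item_count
FROM invoices i
LEFT JOIN (SELECT invoice_id, COUNT(*) AS item_count
           FROM invoices_items GROUP BY invoice_id) it ON it.invoice_id = i.id
LEFT JOIN (SELECT invoice_id, SUM(amount) AS payments_total
           FROM invoices_payments GROUP BY invoice_id) p ON p.invoice_id = i.id
ORDER BY i.id
""").fetchall()

print(totals)  # [(1, 1700.0, 4), (2, 0, 1)]
```

Because both derived tables already have one row per invoice, the outer query needs no GROUP BY at all.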

Related

Mysql query to get customer count from table

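I have the following table:
| customer_id | id | product_type | serial_number | parent_prod_id |
|-------------|-----|--------------|---------------|----------------|
| 123 | 200 | Camera | 3222333 | 200 |
| 123 | 201 | InstaCam | 3322322 | 200 |
| 123 | 202 | InstaCam | 4332233 | 200 |
| 125 | 200 | Camera | 3222333 | 200 |
| 126 | 200 | Camera | 3222333 | 200 |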
My query should return the customer count for each product type, but if a customer purchased a product such as InstaCam and also purchased its parent product (Camera, referenced by parent_prod_id), that customer must not be counted for InstaCam. In the above table, Camera was purchased by three different customers, with customer ids 123, 125 and 126. InstaCam was purchased by one of those customers (123), and the parent_prod_id of InstaCam is the id of Camera, so that customer should not be counted again for the InstaCam product and its customer count is 0.
Expected output:
| serial_number | product_type | customer_count |
|---------------|--------------|----------------|
| 3222333 | Camera | 3 |
| 3322322 | InstaCam | 0 |
| 4332233 | InstaCam | 0 |
I have tried many solutions for hours with no luck. Any help would be greatly appreciated. Thank you.
This should work. Basically, this query sums the cases that are valid for your requirements.
These cases are:
The product is a parent => 1
The product is a child but the customer has not bought the parent => 1
Else => 0 (not summed)
With this classification, you can then add up the occurrences.
select d.serial_number, d.product_type, sum(counter) as customer_count
from (
select *,
case
when y.id = y.parent_prod_id then 1
when not exists (
select 1
from your_data yy
where y.customer_id=yy.customer_id
and yy.id = y.parent_prod_id
) then 1
else 0
end counter
from your_data y
) d
group by d.serial_number, d.product_type
You can do it with simple join and conditional aggregation.
Schema and insert statements:
create table yourtable(customer_id int, id int, product_type varchar(50), serial_number int, parent_prod_id int);
insert into yourtable values(123,200, 'Camera', 3222333,200);
insert into yourtable values(123,201, 'InstaCam', 3322322,200);
insert into yourtable values(123,202, 'InstaCam', 4332233,200);
insert into yourtable values(125,200, 'Camera', 3222333,200);
insert into yourtable values(126,200, 'Camera', 3222333,200);
Query:
select a.serial_number, a.product_type,
sum(case when a.id = b.id then 1 else 0 end) as customer_count
from yourtable a
left join yourtable b on a.parent_prod_id = b.id and a.customer_id = b.customer_id
group by a.serial_number, a.product_type
Output:
| serial_number | product_type | customer_count |
|---------------|--------------|----------------|
| 3222333 | Camera | 3 |
| 3322322 | InstaCam | 0 |
| 4332233 | InstaCam | 0 |
To solve this, you will need to join the table to itself and compare sales.
First let's make the table and populate it with the supplied data:
DROP TABLE IF EXISTS `Sales`;
CREATE TABLE IF NOT EXISTS `Sales` (
`customer_id` int(11) UNSIGNED NOT NULL ,
`id` int(11) UNSIGNED NOT NULL ,
`product_type` varchar(80) NOT NULL DEFAULT '',
`serial_number` varchar(40) NOT NULL DEFAULT '',
`parent_prod_id` int(11) UNSIGNED NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
INSERT INTO `Sales` (`customer_id`, `id`, `product_type`, `serial_number`, `parent_prod_id`)
VALUES (123, 200, 'Camera', '3222333', 200),
(123, 201, 'InstaCam', '3322322', 200),
(123, 202, 'InstaCam', '4332233', 200),
(125, 200, 'Camera', '3222333', 200),
(126, 200, 'Camera', '3222333', 200);
To get the results you seek, we can use a query like this:
SELECT s.`serial_number`, s.`product_type`,
COUNT(DISTINCT CASE WHEN pp.`id` IS NOT NULL THEN NULL ELSE s.`customer_id` END) as `customer_count`
FROM `Sales` s LEFT OUTER JOIN `Sales` pp ON s.`customer_id` = pp.`customer_id`
AND s.`parent_prod_id` = pp.`id`
AND s.`id` <> pp.`id`
GROUP BY s.`serial_number`, s.`product_type`;
This will give you a result like this:
| serial_number | product_type | customer_count |
|---------------|--------------|----------------|
| 3222333 | Camera | 3 |
| 3322322 | InstaCam | 0 |
| 4332233 | InstaCam | 0 |
Now to test this, let's add a record for a customer who bought only an InstaCam:
INSERT INTO `Sales` (`customer_id`, `id`, `product_type`, `serial_number`, `parent_prod_id`)
VALUES (131, 201, 'InstaCam', '3322322', 200);
Run the same query as before, and you'll get this:
| serial_number | product_type | customer_count |
|---------------|--------------|----------------|
| 3222333 | Camera | 3 |
| 3322322 | InstaCam | 1 |
| 4332233 | InstaCam | 0 |
Next time I answer a question, I'll make sure I have a cup of coffee first 🤪
You can use DISTINCT on customer_id, such as SELECT COUNT(DISTINCT customer_id), to count each customer only once.
https://www.mysqltutorial.org/mysql-distinct.aspx
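To see what COUNT(DISTINCT customer_id) does on the question's data, here is a small sketch. It assumes SQLite as a stand-in for MySQL and a table named sales holding the five rows from the question. The second query adds a self-join so that customers who also bought the parent product are left out of the count.

```python
import sqlite3

# Assumption: SQLite (in memory) stands in for MySQL; "sales" holds the question's rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (customer_id INTEGER, id INTEGER, product_type TEXT,
                    serial_number INTEGER, parent_prod_id INTEGER);
INSERT INTO sales VALUES
  (123, 200, 'Camera',   3222333, 200),
  (123, 201, 'InstaCam', 3322322, 200),
  (123, 202, 'InstaCam', 4332233, 200),
  (125, 200, 'Camera',   3222333, 200),
  (126, 200, 'Camera',   3222333, 200);
""")

# DISTINCT on its own: each customer is counted once per product.
distinct_only = conn.execute("""
SELECT serial_number, product_type, COUNT(DISTINCT customer_id)
FROM sales
GROUP BY serial_number, product_type
ORDER BY serial_number
""").fetchall()

# DISTINCT plus a self-join: a customer is skipped when the same customer
# also has a row for the parent product (pp.id IS NOT NULL).
parent_excluded = conn.execute("""
SELECT s.serial_number, s.product_type,
       COUNT(DISTINCT CASE WHEN pp.id IS NULL THEN s.customer_id END)
FROM sales s
LEFT JOIN sales pp ON pp.customer_id = s.customer_id
                  AND pp.id = s.parent_prod_id
                  AND pp.id <> s.id
GROUP BY s.serial_number, s.product_type
ORDER BY s.serial_number
""").fetchall()

print(distinct_only)    # [(3222333, 'Camera', 3), (3322322, 'InstaCam', 1), (4332233, 'InstaCam', 1)]
print(parent_excluded)  # [(3222333, 'Camera', 3), (3322322, 'InstaCam', 0), (4332233, 'InstaCam', 0)]
```

So DISTINCT by itself still reports 1 for each InstaCam; it only matches the expected output once the parent purchase is excluded as well.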

How to get multiple columns on subquery or group by

I have two tables in MySQL. The first contains an ID and the name of some products. I have to get the cheapest combination of brand/market for each product. So, I've inserted some items into both tables:
UPDATE: Inserted new product (bed) with no 'Product_Brand_Market' to test LEFT JOIN.
UPDATE: Changed some product prices for better testing.
CREATE TABLE Product(
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(20) NOT NULL);
CREATE TABLE Product_Brand_Market(
product INT UNSIGNED,
market INT UNSIGNED, /*this will be a FOREIGN KEY*/
brand INT UNSIGNED, /*this will be a FOREIGN KEY*/
price DECIMAL(10,2) UNSIGNED NOT NULL,
PRIMARY KEY(product, market, brand),
CONSTRAINT FOREIGN KEY (product) REFERENCES Product(id));
INSERT INTO Product
(name) VALUES
('Chair'), /*will get id=1*/
('Table'), /*will get id=2*/
('Bed'); /*will get id=3*/
INSERT INTO Product_Brand_Market
(product, market, brand, price) VALUES
(1, 1, 1, 8.00), /*cheapest chair (brand=1, market=1)*/
(1, 1, 2, 8.50),
(1, 2, 1, 9.00),
(1, 2, 2, 9.50),
(2, 1, 1, 11.50),
(2, 1, 2, 11.00),
(2, 2, 1, 10.50),
(2, 2, 2, 10.00); /*cheapest table (brand=2, market=2)*/
/*no entries for bed, must return null*/
And tried the following code to get the desired values:
UPDATE: Changed INNER JOIN for LEFT JOIN.
SELECT p.id product, MIN(pbm.price) price, pbm.brand, pbm.market
FROM Product p
LEFT JOIN Product_Brand_Market pbm
ON p.id = pbm.product
GROUP BY p.id;
The returned price is OK, but I'm getting the wrong keys:
| product | price | brand | market |
|---------|-------|-------|--------|
| 1 | 8 | 1 | 1 |
| 2 | 10 | 1 | 1 |
| 3 | null | null | null |
So the only way I could think to solve it is with subqueries, but I had to use two subqueries to get both brand and market:
SELECT
p.id product,
(
SELECT pbm.brand
FROM Product_Brand_Market pbm
WHERE p.id = pbm.product
ORDER BY pbm.price
LIMIT 1
) as brand,
(
SELECT pbm.market
FROM Product_Brand_Market pbm
WHERE p.id = pbm.product
ORDER BY pbm.price
LIMIT 1
) as market
FROM Product p;
It returns the desired table:
| product | brand | market |
|---------|-------|--------|
| 1 | 1 | 1 |
| 2 | 2 | 2 |
| 3 | null | null |
But I want to know if I really should use these two similar subqueries or if there is a better way to do that in MySQL. Any ideas?
Use a correlated subquery with LIMIT 1 in the WHERE clause:
SELECT product, brand, market
FROM Product_Brand_Market pbm
WHERE (pbm.brand, pbm.market) = (
SELECT pbm1.brand, pbm1.market
FROM Product_Brand_Market pbm1
WHERE pbm1.product = pbm.product
ORDER BY pbm1.price ASC
LIMIT 1
)
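This will return only one row per product, even if two or more rows share the same lowest price. Which of the tied rows is picked is not deterministic unless you add a tie-breaker (for example market and brand) to the ORDER BY.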
Demo: http://rextester.com/UIC44628
Update:
To get all products even if they have no entries in the Product_Brand_Market table, you will need a LEFT JOIN. Note that the condition should be moved to the ON clause.
SELECT p.id as product, pbm.brand, pbm.market
FROM Product p
LEFT JOIN Product_Brand_Market pbm
ON pbm.product = p.id
AND (pbm.brand, pbm.market) = (
SELECT pbm1.brand, pbm1.market
FROM Product_Brand_Market pbm1
WHERE pbm1.product = pbm.product
ORDER BY pbm1.price ASC
LIMIT 1
);
Demo: http://rextester.com/MGXN36725
The following query might make better use of your PK for the JOIN:
SELECT p.id as product, pbm.brand, pbm.market
FROM Product p
LEFT JOIN Product_Brand_Market pbm
ON (pbm.product, pbm.market, pbm.brand) = (
SELECT pbm1.product, pbm1.market, pbm1.brand
FROM Product_Brand_Market pbm1
WHERE pbm1.product = p.id
ORDER BY pbm1.price ASC
LIMIT 1
);
An index on Product_Brand_Market(product, price) should also help to improve the performance of the subquery.
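For comparison, here is a sketch of a different technique that is not used in the answers above: ranking rows with the ROW_NUMBER() window function. It assumes MySQL 8.0+ (or, as run here, SQLite 3.25+ as a stand-in); the index name idx_pbm_product_price is a hypothetical name for the index suggested above.

```python
import sqlite3

# Assumption: SQLite 3.25+ (in memory) stands in for MySQL 8.0+, which also has ROW_NUMBER().
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE Product_Brand_Market (
  product INTEGER, market INTEGER, brand INTEGER, price REAL NOT NULL,
  PRIMARY KEY (product, market, brand));
-- hypothetical index name, following the (product, price) suggestion
CREATE INDEX idx_pbm_product_price ON Product_Brand_Market (product, price);
INSERT INTO Product (id, name) VALUES (1, 'Chair'), (2, 'Table'), (3, 'Bed');
INSERT INTO Product_Brand_Market (product, market, brand, price) VALUES
  (1, 1, 1, 8.00), (1, 1, 2, 8.50), (1, 2, 1, 9.00), (1, 2, 2, 9.50),
  (2, 1, 1, 11.50), (2, 1, 2, 11.00), (2, 2, 1, 10.50), (2, 2, 2, 10.00);
""")

# Rank the offers of each product by price; market and brand act as tie-breakers
# so the chosen row is deterministic. Only rank 1 is joined back to Product.
cheapest = conn.execute("""
SELECT p.id, r.brand, r.market, r.price
FROM Product p
LEFT JOIN (
  SELECT product, brand, market, price,
         ROW_NUMBER() OVER (PARTITION BY product
                            ORDER BY price, market, brand) AS rn
  FROM Product_Brand_Market
) r ON r.product = p.id AND r.rn = 1
ORDER BY p.id
""").fetchall()

print(cheapest)  # [(1, 1, 1, 8.0), (2, 2, 2, 10.0), (3, None, None, None)]
```

This returns brand, market and price from the same row in one pass, and the LEFT JOIN keeps products such as Bed that have no offers.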

Joining table to union of two tables?

I have two tables: orders and oldorders. Both are structured the same way. I want to union these two tables and then join them to another table: users. Previously I only had orders and users; now I am trying to shoehorn oldorders into my current code.
SELECT u.username, COUNT(user) AS cnt
FROM orders o
LEFT JOIN users u
ON u.userident = o.user
WHERE shipped = 1
AND total != 0
GROUP BY user
This finds the number of nonzero-total orders each user has made in table orders, but I want to do this on the union of orders and oldorders. How can I accomplish this?
create table orders (
user int,
shipped int,
total decimal(4,2)
);
insert into orders values
(5, 1, 28.21),
(5, 1, 24.12),
(5, 1, 19.99),
(5, 1, 59.22);
create table users (
username varchar(100),
userident int
);
insert into users values
("Bob", 5);
Output for this is:
+----------+-----+
| username | cnt |
+----------+-----+
| Bob | 4 |
+----------+-----+
After creating the oldorders table:
create table oldorders (
user int,
shipped int,
total decimal(4,2)
);
insert into oldorders values
(5, 1, 62.94),
(5, 1, 53.21);
The expected output when run on the union of the two tables is:
+----------+-----+
| username | cnt |
+----------+-----+
| Bob | 6 |
+----------+-----+
Just not sure where or how to shoehorn a union into there. Instead of running the query on orders, it needs to be on orders union oldorders. It can be assumed there is no intersect between the two tables.
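You just need to union this way:
SELECT u.username, COUNT(user) AS cnt
FROM
(
SELECT * FROM orders
UNION ALL
SELECT * FROM oldorders
) o
LEFT JOIN users u ON u.userident = o.user
WHERE shipped = 1
AND total != 0
GROUP BY user;
First get the combined orders using UNION ALL between the orders and oldorders tables. Use UNION ALL rather than plain UNION: UNION removes duplicate rows, so two orders with the same user, shipped and total values would be counted only once.
The rest of the work is exactly the same as what you did.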
Note:
LEFT JOIN doesn't make much sense in this direction. For orders whose user doesn't exist, you will get a row with a NULL username, which doesn't carry any useful information.
If you want <user, total orders> for all users, including users who have not ordered yet, then you need to change the order of the LEFT JOIN so that users is on the left.
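Here is a sketch of that reversed join, which also shows how UNION and UNION ALL differ when two orders happen to be identical. It assumes SQLite as a stand-in for MySQL; the user Alice and the duplicated 19.99 order are hypothetical rows added for the demonstration.

```python
import sqlite3

# Assumption: SQLite (in memory) stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (username TEXT, userident INTEGER);
CREATE TABLE orders (user INTEGER, shipped INTEGER, total REAL);
CREATE TABLE oldorders (user INTEGER, shipped INTEGER, total REAL);
INSERT INTO users VALUES ('Bob', 5), ('Alice', 7);        -- Alice is hypothetical
INSERT INTO orders VALUES (5, 1, 19.99), (5, 1, 19.99);   -- two identical orders
INSERT INTO oldorders VALUES (5, 1, 62.94);
""")

def count_orders(set_op):
    """Count shipped, nonzero orders per user; set_op is 'UNION' or 'UNION ALL'."""
    # users is on the left, so users without orders are kept with a count of 0.
    # The filters live inside the derived table; in the outer WHERE they would
    # discard the NULL-extended rows of users who have no orders.
    sql = f"""
    SELECT u.username, COUNT(o.user) AS cnt
    FROM users u
    LEFT JOIN (
      SELECT user, shipped, total FROM orders WHERE shipped = 1 AND total != 0
      {set_op}
      SELECT user, shipped, total FROM oldorders WHERE shipped = 1 AND total != 0
    ) o ON o.user = u.userident
    GROUP BY u.username
    ORDER BY u.username
    """
    return conn.execute(sql).fetchall()

with_union = count_orders("UNION")
with_union_all = count_orders("UNION ALL")

print(with_union)      # [('Alice', 0), ('Bob', 2)]
print(with_union_all)  # [('Alice', 0), ('Bob', 3)]
```

Plain UNION merges the two identical 19.99 orders into one row, so Bob is undercounted; UNION ALL keeps all three.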

sql query with calculation from subquery

I have two tables (well, I have more, but I've simplified it for this question):
Invoice
invoiceID 10
invoiceNo 1234
invoiceAmount 1000
invoiceStatus 2
Payments
paymentID 3
invoiceID 10
paymentAmount 500
paymentMethod 3
Now I need a query that gives me some values from table Invoice but also a calculation based on values from Payments for a certain invoiceID. What I would like to get is:
Invoice number, invoice amount and remaining amount to pay
-------------- --------------- -----------------------
1234 1000 500
Can you help me finish the query with a subquery that actually works? My attempt so far:
select i.invoiceNo as 'Invoice Number', i.invoiceAmount as 'Invoice amount' (i.invoiceAmount - totallyPayed) as reminingToPay
from Invoice i
left join Payments p on (p.invoiceID = i.invoiceID)
where
i.invoiceStatus = 2
and totallyPayed = (select sum(p.PaymentAmount) from Payments where p.paymentMethod in (1,2,3))
You could do:
SELECT i.invoiceNo AS 'Invoice Number',
i.invoiceAmount AS 'Invoice amount',
(i.invoiceAmount - COALESCE(p.totalPayed,0)) AS remainingToPay
FROM Invoice i
LEFT JOIN (
SELECT invoiceID,
SUM(paymentAmount) AS totalPayed
FROM Payments
WHERE paymentMethod IN (1, 2, 3)
GROUP BY invoiceID
) p
ON p.invoiceID = i.invoiceID
WHERE i.invoiceStatus = 2
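First you get the sum of paymentAmount from the Payments table for each invoiceID, and then you join that to your Invoice table to get remainingToPay. The LEFT JOIN keeps invoices that have no payments yet, and COALESCE turns their missing total into 0.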
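A quick sketch to check the behaviour, assuming SQLite as a stand-in for MySQL. Invoice 11 (no payments) and payment 4 (a method outside the 1, 2, 3 list) are hypothetical rows added to exercise the LEFT JOIN and the method filter.

```python
import sqlite3

# Assumption: SQLite (in memory) stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Invoice (invoiceID INTEGER PRIMARY KEY, invoiceNo INTEGER,
                      invoiceAmount INTEGER, invoiceStatus INTEGER);
CREATE TABLE Payments (paymentID INTEGER PRIMARY KEY, invoiceID INTEGER,
                       paymentAmount INTEGER, paymentMethod INTEGER);
INSERT INTO Invoice VALUES (10, 1234, 1000, 2),
                           (11, 1235, 200, 2);   -- hypothetical, nothing paid yet
INSERT INTO Payments VALUES (3, 10, 500, 3),
                            (4, 10, 100, 4);     -- hypothetical, method 4 is filtered out
""")

# Payments are summed per invoice first, then joined, so each invoice appears once.
remaining = conn.execute("""
SELECT i.invoiceNo,
       i.invoiceAmount,
       i.invoiceAmount - COALESCE(p.totalPayed, 0) AS remainingToPay
FROM Invoice i
LEFT JOIN (
  SELECT invoiceID, SUM(paymentAmount) AS totalPayed
  FROM Payments
  WHERE paymentMethod IN (1, 2, 3)
  GROUP BY invoiceID
) p ON p.invoiceID = i.invoiceID
WHERE i.invoiceStatus = 2
ORDER BY i.invoiceNo
""").fetchall()

print(remaining)  # [(1234, 1000, 500), (1235, 200, 200)]
```

Without COALESCE, the unpaid invoice would show NULL instead of its full amount, because subtracting NULL yields NULL.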

How to find if a list/set is contained within another list

I have a list of product IDs and I want to find out which orders contain all those products. Orders table is structured like this:
order_id | product_id
----------------------
1 | 222
1 | 555
2 | 333
Obviously I can do it with some looping in PHP but I was wondering if there is an elegant way to do it purely in mysql.
My ideal fantasy query would be something like:
SELECT order_id
FROM orders
WHERE (222,555) IN GROUP_CONCAT(product_id)
GROUP BY order_id
Is there any hope or should I go read Tolkien? :) Also, out of curiosity, if not possible in mysql, is there any other database that has this functionality?
You were close
SELECT order_id
FROM orders
WHERE product_id in (222,555)
GROUP BY order_id
HAVING COUNT(DISTINCT product_id) = 2
Regarding your "out of curiosity" question: in relational algebra this is achieved simply with division. AFAIK no RDBMS has implemented an extension that makes it as simple in SQL.
I have a preference for doing set comparisons only in the having clause:
select order_id
from orders
group by order_id
having sum(case when product_id = 222 then 1 else 0 end) > 0 and
sum(case when product_id = 555 then 1 else 0 end) > 0
What this is saying is: get me all orders where the order has at least one product 222 and at least one product 555.
I prefer this for two reasons. The first is generalizability. You can arrange more complicated conditions, such as 222 or 555 (just by changing the "and" to an "or"), or 333 and 555, or 222 without 555.
Second, when you create the query, you only have to put the condition in one place, in the having clause.
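The variations mentioned above can be sketched by swapping only the HAVING condition. This assumes SQLite as a stand-in for MySQL; order 3 (product 222 only) is a hypothetical row added so that the three conditions give different answers, and orders_having is a hypothetical helper.

```python
import sqlite3

# Assumption: SQLite (in memory) stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, product_id INTEGER);
INSERT INTO orders VALUES (1, 222), (1, 555), (2, 333),
                          (3, 222);   -- hypothetical order with only product 222
""")

def orders_having(condition):
    """Return the order ids whose group satisfies the given HAVING condition."""
    # condition is built from the fixed strings below, never from user input.
    sql = f"""
    SELECT order_id FROM orders
    GROUP BY order_id
    HAVING {condition}
    ORDER BY order_id
    """
    return [row[0] for row in conn.execute(sql)]

has_222 = "SUM(CASE WHEN product_id = 222 THEN 1 ELSE 0 END) > 0"
has_555 = "SUM(CASE WHEN product_id = 555 THEN 1 ELSE 0 END) > 0"

both = orders_having(f"{has_222} AND {has_555}")
either = orders_having(f"{has_222} OR {has_555}")
only_222 = orders_having(f"{has_222} AND NOT ({has_555})")

print(both)      # [1]
print(either)    # [1, 3]
print(only_222)  # [3]
```

Each product test is one self-contained expression, so combining them with AND, OR and NOT reads much like the requirement itself.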
Assuming your database is properly normalized, i.e. there's no duplicate product on a given order:
MySQL-specific:
select order_id
from orders
group by order_id
having sum(product_id in (222,555)) = 2
Standard SQL:
select order_id
from orders
group by order_id
having sum(case when product_id in (222,555) then 1 end) = 2
If it has duplicates:
CREATE TABLE tbl
(`order_id` int, `product_id` int)
;
INSERT INTO tbl
(`order_id`, `product_id`)
VALUES
(1, 222),
(1, 555),
(2, 333),
(1, 555)
;
Do this then:
select order_id
from tbl
group by order_id
having count(distinct case when product_id in (222,555) then product_id end) = 2
Live test: http://www.sqlfiddle.com/#!2/fa1ad/5
CREATE TABLE orders
( order_id INTEGER NOT NULL
, product_id INTEGER NOT NULL
);
INSERT INTO orders(order_id,product_id) VALUES
(1, 222 ) , (1, 555 ) , (2, 333 )
, (3, 222 ) , (3, 555 ) , (3, 333 ); -- order#3 has all the products
CREATE TABLE products AS (SELECT DISTINCT product_id FROM orders);
SELECT *
FROM orders o1
--
-- There should not exist a product
-- that is not part of our order.
--
WHERE NOT EXISTS (
SELECT *
FROM products pr
WHERE 1=1
-- extra clause: only want products from a literal list
AND pr.product_id IN (222,555,333)
-- ... that is not part of our order...
AND NOT EXISTS ( SELECT *
FROM orders o2
WHERE o2.product_id = pr.product_id
AND o2.order_id = o1.order_id
)
);
Result:
order_id | product_id
----------+------------
3 | 222
3 | 555
3 | 333
(3 rows)
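The double NOT EXISTS above reads as "there is no wanted product that this order lacks". Here is a sketch that returns just the order ids and takes the product list as a parameter. It assumes SQLite as a stand-in for MySQL; the wanted table and the orders_with_all helper are hypothetical names introduced for the example.

```python
import sqlite3

# Assumption: SQLite (in memory) stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER NOT NULL, product_id INTEGER NOT NULL);
INSERT INTO orders (order_id, product_id) VALUES
  (1, 222), (1, 555), (2, 333), (3, 222), (3, 555), (3, 333);
CREATE TABLE wanted (product_id INTEGER NOT NULL);  -- hypothetical helper table
""")

def orders_with_all(product_ids):
    """Return ids of orders that contain every product in product_ids."""
    conn.execute("DELETE FROM wanted")
    conn.executemany("INSERT INTO wanted (product_id) VALUES (?)",
                     [(pid,) for pid in product_ids])
    # Keep an order when no wanted product is missing from it (relational division).
    sql = """
    SELECT DISTINCT o1.order_id
    FROM orders o1
    WHERE NOT EXISTS (
      SELECT 1 FROM wanted w
      WHERE NOT EXISTS (
        SELECT 1 FROM orders o2
        WHERE o2.order_id = o1.order_id
          AND o2.product_id = w.product_id))
    ORDER BY o1.order_id
    """
    return [row[0] for row in conn.execute(sql)]

two_products = orders_with_all([222, 555])
three_products = orders_with_all([222, 555, 333])

print(two_products)    # [1, 3]
print(three_products)  # [3]
```

One property of this form to keep in mind: with an empty product list every order qualifies, because no wanted product can be missing.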