What is the relationship between customer and order? - mysql

I'm confused about what the relationship between a customer and an order is.
Most websites say it's one-to-many because one customer can place many orders.
But if we look at it physically, it's many-to-many, because there are customers and there are products. Here are the tables as I see them in SQL:
Customer
Cid   Cname
----  -----
1001  A
1002  B
1003  c
1005  D
Product
Pid  Pname
---  ----------
P1   Soap
p2   Dettol
p3   Toothpaste
p4   sanitizer
Relationship table (many customers can order many products), many-to-many:
orders
cid   pid  Pname
----  ---  ----------
1001  p1   Soap
1001  p3   Toothpaste
1002  p1   Soap
1003  p3   Toothpaste
1005  p4   sanitizer
1005  p1   Soap
How do you consider this one-to-many?
In real life, many customers can order many products.
If you know the answer, please show why it is one-to-many.
I am expecting a many-to-many relationship.

Customer and Order should be one-to-many, not many-to-many. Think of it this way: a customer can place many orders, but can a single order be placed by more than one customer? If you're talking about products, that's a different entity called Product, which contains only information about a product. The relationship between Product and Order is many-to-many, since one order can contain many products and one product can appear in many orders.
In summary, Customer to Order is one-to-many, and Product to Order is many-to-many. Hope that makes sense.
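Here is a minimal sketch of those two relationships as MySQL tables, using a few of the sample rows from the question. The names orders.order_id and order_items are illustrative assumptions, and so is the grouping of customer 1001's two products into a single order.

```sql
-- Hypothetical schema: table and column names are illustrative only.
CREATE TABLE customers (
    cid   INT PRIMARY KEY,
    cname VARCHAR(50) NOT NULL
);

CREATE TABLE products (
    pid   VARCHAR(10) PRIMARY KEY,
    pname VARCHAR(50) NOT NULL
);

-- One-to-many: each order row points at exactly one customer,
-- while a customer may appear on many order rows.
CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    cid      INT NOT NULL,
    FOREIGN KEY (cid) REFERENCES customers (cid)
);

-- Many-to-many between orders and products, resolved by a junction table.
CREATE TABLE order_items (
    order_id INT NOT NULL,
    pid      VARCHAR(10) NOT NULL,
    PRIMARY KEY (order_id, pid),
    FOREIGN KEY (order_id) REFERENCES orders (order_id),
    FOREIGN KEY (pid) REFERENCES products (pid)
);

INSERT INTO customers   VALUES (1001, 'A'), (1002, 'B');
INSERT INTO products    VALUES ('p1', 'Soap'), ('p3', 'Toothpaste');
INSERT INTO orders      VALUES (1, 1001), (2, 1002);
INSERT INTO order_items VALUES (1, 'p1'), (1, 'p3'), (2, 'p1');

-- Which customer bought which product, reached through the order.
SELECT c.cname, o.order_id, p.pname
FROM customers c
JOIN orders o       ON o.cid = c.cid
JOIN order_items oi ON oi.order_id = o.order_id
JOIN products p     ON p.pid = oi.pid
ORDER BY o.order_id, p.pid;
```

Seen this way, the "orders" table in the question is really the customer-to-product view you get after joining through orders, which is why it looks many-to-many.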

One-to-many means that for each one customer there can be many orders. It doesn't mean that there's only one customer in the entire table.
Many-to-many means that you have a one-to-many relationship in both directions: each customer can have many orders, and each order can have many customers. That's not normally allowed in an order database; each customer gets a different order.
The relationship between orders and products will be many-to-many, since many customers can order the same product. But each customer's purchase has to go through a different order.

This concept is easiest to understand by thinking of yourself (one) as a customer making purchases. You can buy many things from more than one source, which makes it one-to-many. The same product you purchased is also stocked in quantity at the source you got yours from and can be sold to many other people, which makes it many-to-many.

MySQL - When shouldn't I Join tables? Combinatorial Explosion of values

I am working on a database called classicmodels, which I found at: https://www.mysqltutorial.org/mysql-sample-database.aspx/
I realized that when I executed an Inner Join between 'payments' and 'orders' tables, a 'cartesian explosion' occurred. I understand that these two tables are not meant to be joined. However, I would like to know if it is possible to identify this just by looking at the relational schema or if I should check the tables one by one.
For instance, the customer number '141' appears 26 times in the 'orders table', which I found by using the following code:
SELECT
customerNumber,
COUNT(customerNumber)
FROM
orders
WHERE customerNumber=141
GROUP BY customerNumber;
And the same customer number (141) appears 13 times in the payments table:
SELECT
customerNumber,
COUNT(customerNumber)
FROM
payments
WHERE customerNumber=141
GROUP BY customerNumber;
Finally, I executed an Inner Join between 'payments' and 'orders' tables, and selected only the rows with customer number '141'. MySQL returned 338 rows, which is the result of 26*13. So, my query is multiplying the number of times this 'customer n°' appears in 'orders' table by the number of times it appears in 'payments'.
SELECT
o.customernumber,
py.amount
FROM
customers c
JOIN
orders o ON c.customerNumber=o.customerNumber
JOIN
payments py ON c.customerNumber=py.customerNumber
WHERE o.customernumber=141;
My question is the following:
1 ) Is there a way to look at the relational schema and identify if a Join can be executed (without generating a combinatorial explosion)? Or should I check table by table to understand how the relationship between them is?
Important Note: I realized that there are two asterisks in the payments table's representation in the relational schema below. Maybe this means that this table has a composite primary key (customerNumber+checkNumber). The problem is that 'checkNumber' does not appear in any other table.
This is the database's relational schema provided by the 'MySQL Tutorial' website:
Thank you for your attention!
This is called "combinatorial explosion" and it happens when rows in one table each join to multiple rows in other tables.
(It's not "overestimation" or any sort of estimation. It's counting data items multiple times when it should only count them once.)
It's a notorious pitfall of summarizing data in one-to-many relationships. In your example each customer may have no orders, one order, or more than one. Independently, they may have no payments, one, or many.
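To make the multiplication concrete, here is a toy reproduction. The demo_orders and demo_payments tables and their rows are made up for illustration; they are not part of classicmodels.

```sql
-- Toy tables (hypothetical names and data) to reproduce the row multiplication.
CREATE TABLE demo_orders (
    customerNumber INT,
    orderNumber    INT,
    total          DECIMAL(10,2)
);
CREATE TABLE demo_payments (
    customerNumber INT,
    checkNumber    VARCHAR(20),
    amount         DECIMAL(10,2)
);

INSERT INTO demo_orders   VALUES (141, 1, 100.00), (141, 2, 200.00);
INSERT INTO demo_payments VALUES (141, 'A1', 50.00), (141, 'A2', 60.00), (141, 'A3', 70.00);

-- 2 orders x 3 payments = 6 joined rows, so every payment is counted twice.
SELECT COUNT(*)      AS joined_rows,
       SUM(p.amount) AS inflated_payments
FROM demo_orders o
JOIN demo_payments p ON p.customerNumber = o.customerNumber
WHERE o.customerNumber = 141;
```

With these rows the join yields 6 rows and a payment sum of 360.00, although the customer actually paid 180.00. The same arithmetic gives 26 x 13 = 338 rows in the question.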
The trick is this: use subqueries so your top-level query avoids joining one-to-many relationships serially. The query you showed us does exactly that kind of serial join: customers to orders, then customers to payments.
You can use this subquery to get a result set with just one row per customer. (Try it.)
SELECT customernumber,
SUM(amount) amount
FROM payments
GROUP BY customernumber
Likewise you can get the value of all orders for each customer with this
SELECT o.customernumber,
SUM(od.quantityOrdered * od.priceEach) amount
FROM orders o
JOIN orderdetails od ON o.orderNumber = od.orderNumber
GROUP BY o.customernumber
This JOIN won't explode in your face because a customer can have multiple orders, and each order can have multiple details. So it's a strict hierarchical rollup.
Now, we can use these subqueries in the main query.
SELECT c.customernumber, p.payments, o.orders
FROM customers c
LEFT JOIN (
SELECT o.customernumber,
SUM(od.quantityOrdered * od.priceEach) orders
FROM orders o
JOIN orderdetails od ON o.orderNumber = od.orderNumber
GROUP BY o.customernumber
) o ON c.customernumber = o.customernumber
LEFT JOIN (
SELECT customernumber,
SUM(amount) payments
FROM payments
GROUP BY customernumber
) p ON c.customernumber = p.customernumber
Takehome tricks:
A subquery IS a table (a virtual table) that can be used wherever you might mention a table or a view.
The GROUP BY stuff in this query happens separately in two subqueries, so no combinatorial explosions.
All three participants in the toplevel JOIN have either one or zero rows per customernumber.
The LEFT JOINs are there so we can still see customers with (importantly for a business) no orders or no payments. With the ordinary inner JOIN, rows have to match both sides of the ON conditions or they're omitted from the resultset.
Pro tip: Format your SQL queries fanatically carefully. They are really verbose (Adm. Grace Hopper would be proud), which means they get quite long and nested, putting the Structured in Structured Query Language. If you, or anybody else, are going to reason about them in the future, the structure must be easy to grasp.
Pro tip 2: The data engineer who designed this database did a really good job thinking it through and documenting it. Aspire to this level of quality. (Rarely reached in the real world.)
In this particular case, your approach should depend on the accounting style the database supports, and this does not appear to be "open item" accounting; that is, when an order is raised for 1000, there does not need to be a payment of 1000 against it. This may seem unusual, because most consumers are familiar with open-item ordering from Amazon: you buy a 500 dollar TV and a 500 dollar games console, the order is a thousand dollars, and you pay for it, the payment going against the order. However, you're also familiar with "balance forward" accounting if you paid for that order with your credit card: you make similar purchases every day for a month, and then you get a statement from your bank saying you owe 31000 and you pay a lump of money, which doesn't even have to be 31k. You aren't expected to make 31 payments of 1000 to your bank at the end of the month. Your bank allocates the payment to the oldest items on the account (if they're nice, or the newest items if they're not) and may eventually charge you interest on unpaid transactions.
1 ) Is there a way to look at the relational schema and identify if a Join can be executed
Yes, you can tell by looking at the schema: a customer has many orders and a customer makes many payments, but there is no relation between the order and payment tables at all, so we can see there is no attempt to directly attach a payment to an order. You can see that customer is a parent table of payment and order, and therefore has a relationship with each of them, but they do not relate to each other. If you had Person, Car and Address tables, a person has many addresses during their life, and many cars, but that doesn't mean there is a relationship between cars and addresses.
In such a case it simply doesn't make sense to join payments to customers to orders because they do not relate that way. If you want to make such a join and not suffer a Cartesian explosion then you absolutely have to sum one side or the other (or both) to ensure that your joins are 1:1 and 1:M (or 1:1 and 1:1). You cannot arrange a join that is a pair of 1:M.
Going back to the car/person/address example to make any meaningful joins, you have to build more information into the question and arrange the join to create the answer. Perhaps the question is "what cars did they own while they lived at" - this flattens the Person:Address relationship to 1:1 but leaves Person:Car as 1:M so they might have owned many cars during their time in that house. "What was the newest car they owned while living at..." might be 1:1 on both sides if there is a clear winner for "newest" (though if they bought two cars manufactured at identical times...)
Which side you sum in your orders case will depend on what you want to know, but here I'd say you usually want to know "which orders haven't been paid for". That means summing all payments, computing a rolling sum over the orders, and then finding the point at which the rolling sum exceeds the sum of payments. The orders from that point on are the unpaid orders.
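A rough sketch of that rolling-sum idea against classicmodels. It assumes MySQL 8.0 or later (for CTEs and window functions), that orders has an orderDate column, and that payments are allocated to the oldest orders first; none of that is stated in the question, so treat it as one possible reading rather than the intended accounting rule.

```sql
-- Sketch only: assumes MySQL 8.0+ and oldest-first allocation of payments.
WITH order_totals AS (
    -- One row per order with its value (a strict hierarchical rollup).
    SELECT o.customerNumber,
           o.orderNumber,
           o.orderDate,
           SUM(od.quantityOrdered * od.priceEach) AS order_total
    FROM orders o
    JOIN orderdetails od ON od.orderNumber = o.orderNumber
    GROUP BY o.customerNumber, o.orderNumber, o.orderDate
),
running AS (
    -- Rolling sum of order values per customer, oldest order first.
    SELECT ot.*,
           SUM(order_total) OVER (
               PARTITION BY customerNumber
               ORDER BY orderDate, orderNumber
           ) AS running_total
    FROM order_totals ot
),
paid AS (
    -- One row per customer with everything they have paid.
    SELECT customerNumber, SUM(amount) AS total_paid
    FROM payments
    GROUP BY customerNumber
)
-- Orders at or after the point where the rolling sum passes the payments.
SELECT r.customerNumber, r.orderNumber, r.orderDate, r.order_total
FROM running r
LEFT JOIN paid p ON p.customerNumber = r.customerNumber
WHERE r.running_total > COALESCE(p.total_paid, 0)
ORDER BY r.customerNumber, r.orderDate, r.orderNumber;
```

Each CTE has at most one row per order or per customer before the final join, so no pair of one-to-many joins is ever chained. Note that the first order returned for a customer may be partly paid rather than wholly unpaid.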
Take a look again at your database graph (the one that was present in the first iteration of your question). The lines between tables have 3 angled legs on one end; that's the many end. You can start at any table in the graph and join to other tables by walking along the relationships. If you're going from the many end to the one end, and assuming you've picked out a single row in the start table (a single order), you can always walk to any other table in the many-to-one direction without increasing your row count. If you walk the other way you potentially increase your row count. If you split and walk two ways that both increase row count, you get a Cartesian explosion. Of course, you don't have to join only on relation lines, but that's out of scope for the question.
ps: this is easier to see on the db diagram than the ERD in the question because the database purely concerns itself with the columns that are foreign keyed. The ERD is saying a customer has zero or one payments with a particular check number but the database will only be concerned with "the customer ID appears once in the customer table and multiple times in the payment table" because only part of the compound primary key of payment is keyed to the customer table. In other words, the ERD is concerned with business logic relations too, but the db diagram is purely how tables relate and they aren't necessarily aligned. For this reason the db diagrams are probably easier to read when walking round for join strategies
After seeing the answers of Caius Jard and O.Jones (please, check their replies), which kindly helped me to clarify this doubt, I decided to create a table to identify which customers paid for all orders they made and which ones did not. This creates a pertinent reason to join 'orders', 'orderdetails', 'payments' and 'customers' tables, because some orders may have been cancelled or still may be 'On Hold', as we can see in their corresponding 'status' in the 'orders' table. Also, this enables us to execute this join without generating a 'combinatorial explosion'.
I did this by using the CASE statement, which registers when py.amount and amount_in_orders match, don't match or when they are NULL (customers which did not make orders or payments):
SELECT
c.customerNumber,
py.amount,
amount_in_orders,
CASE
WHEN py.amount=amount_in_orders THEN 'Match'
WHEN py.amount IS NULL AND amount_in_orders IS NULL THEN 'NULL'
ELSE 'Don''t Match'
END AS `Match`
FROM
customers c
LEFT JOIN(
SELECT
o.customerNumber, SUM(od.quantityOrdered*od.priceEach) AS amount_in_orders
FROM
orders o
JOIN orderdetails od ON o.orderNumber=od.orderNumber
GROUP BY o.customerNumber
) o ON c.customerNumber=o.customerNumber
LEFT JOIN(
SELECT customernumber, SUM(amount) AS amount
FROM payments
GROUP BY customerNumber
) py ON c.customerNumber=py.customerNumber
ORDER BY py.amount DESC;
The query returned 122 rows. The images below are fractions of the generated output, so you can visualize what happened:
For instance, we can see that the customers identified by the numbers '141', '124', '119' and '496' did not pay for all the orders they made. Maybe some of the orders were cancelled, or maybe the customers simply have not paid for them yet.
And this image shows some of the columns (not all of them) that are NULL:

mySQL: how to go along 3 different tables, referring to different column in each table but using primary keys

I'm trying to find out which schools had students who did not complete their exams in 2018. I've got 3 tables set up: ExamInfo, ExamEntry and Students. I'm going to try to use the ExamInfo table to get information from the Students table, though I obviously only want information for students who did not complete their exam in 2018. Note: I'm looking for students who attended but did not complete the exam; for this particular exam, you can treat a completed exam as a passed exam.
Within ExamInfo I have the columns:
ExamInfo_Date --when exam took place, using to get year() condition
ExamInfo_ExamNo --unique student exam ID used to connect with other tables
ExamInfo_Completed --1 if completed, 0 if not.
...
Within ExamEntry I have the related columns:
ExamEntry_ExamNo --connected to ExamInfo table
ExamEntry_StudentId --unique studentId used to connect to Students table
ExamEntry_Date -- this is same as ExamInfo_Date if any relevance.
...
Within Students I have following columns:
Students_Id --this is related to ExamEntry_StudentId, PRIMARY KEY
Students_School --this is the school of which I wish to be my output.
...
I want my output to simply be a list of all schools that had students who did not complete their exams in 2018. My issue is getting from the ExamInfo table to the schools whose students did not complete their exam.
So far I've:
SELECT a.Students_School, YEAR(l.ExamInfo_Date), l.ExamInfo_Completed
FROM ExamInfo l ??JOIN?? Students a
WHERE YEAR(l.ExamInfo_Date) = 2018
AND l.ExamInfo_Completed = 0
;
I'm not even sure if going through the ExamEntry table is necessary. I'm sure I'm meant to use a join, though unsure of how to appropriately use it. Also, with my 3 different SELECT columns, I only wish for Students_School column to be output:
Students_School
---------------
Applederry
Barnet Boys
...
Clearly, you need a JOIN -- two in fact. Your schema has exams, students, and a junction/association table that represents the many-to-many relationship between these entities.
So, I would expect the FROM clause to look like:
FROM ExamInfo e JOIN
ExamEntry ee
ON ee.ExamEntry_ExamNo = e.ExamInfo_ExamNo JOIN
Students s
ON ee.ExamEntry_StudentId = s.Students_Id
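Putting that FROM clause together with the filters from the question, the whole query might look like the sketch below. It uses only the column names listed in the question and assumes ExamInfo_Date is a DATE or DATETIME column.

```sql
-- Sketch using the column names given in the question.
-- DISTINCT collapses the result to one row per school.
SELECT DISTINCT s.Students_School
FROM ExamInfo e
JOIN ExamEntry ee ON ee.ExamEntry_ExamNo = e.ExamInfo_ExamNo
JOIN Students s   ON s.Students_Id = ee.ExamEntry_StudentId
WHERE e.ExamInfo_Completed = 0
  AND e.ExamInfo_Date >= '2018-01-01'   -- assumes a DATE/DATETIME column
  AND e.ExamInfo_Date <  '2019-01-01'
ORDER BY s.Students_School;
```

The date range is equivalent to YEAR(e.ExamInfo_Date) = 2018 but leaves the column bare, so an index on ExamInfo_Date can be used if one exists.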

How can I make sure logically that my many to many relationships are created right?

I have 13 different tables. I have a Customer table and a Sale table, which is connected to the Customer table because a customer can buy a Sale; that same Sale table is also connected to an Employee table because an Employee can sell a Sale. Then I have an Order table connected to a Vendor table, because a single order can be placed with a Vendor and many different orders can be placed with many different vendors. The Employee table is also connected to a Shop table, because an Employee can work at a single shop and many employees can work for different shops. The Employee and Shop tables are connected by EmployeeShop, which has foreign keys to both the Employee and Shop tables. The rest of the relationships are shown in the picture below. My problem is that I am not sure that some of the connections are right, for example the Ingredient table that is connected to OrderLineItem, and also the Order and Vendor tables, which are connected by OrderVendor. Any help or advice would be useful for moving my model forward to a physical database.
Here is the EER model
Let's try that again:
ingredients(ingredient_id*,ingredient)
order_details (order_id*,ingredient_id*,qty)
orders (order_id*, vendor_id, date)
vendors(vendor_id*,vendor details, etc.)
[ * = (component of) PRIMARY KEY ].
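Spelled out as MySQL DDL, that outline could look something like this. The data types, vendor_name (standing in for "vendor details, etc.") and order_date (instead of date) are my assumptions.

```sql
-- Sketch of the outline above; types and two column names are assumed.
CREATE TABLE vendors (
    vendor_id   INT PRIMARY KEY,
    vendor_name VARCHAR(100) NOT NULL   -- stands in for "vendor details, etc."
);

CREATE TABLE ingredients (
    ingredient_id INT PRIMARY KEY,
    ingredient    VARCHAR(100) NOT NULL
);

-- Each order goes to exactly one vendor, so no OrderVendor table is needed.
CREATE TABLE orders (
    order_id   INT PRIMARY KEY,
    vendor_id  INT NOT NULL,
    order_date DATE NOT NULL,           -- "date" in the outline
    FOREIGN KEY (vendor_id) REFERENCES vendors (vendor_id)
);

-- Junction table: many ingredients per order, many orders per ingredient.
CREATE TABLE order_details (
    order_id      INT NOT NULL,
    ingredient_id INT NOT NULL,
    qty           INT NOT NULL,
    PRIMARY KEY (order_id, ingredient_id),
    FOREIGN KEY (order_id) REFERENCES orders (order_id),
    FOREIGN KEY (ingredient_id) REFERENCES ingredients (ingredient_id)
);
```

Only the many-to-many pair (orders and ingredients) gets a junction table; the one-to-many pair (vendors and orders) is just a foreign key on the many side.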

Inventory/Stock Monitoring Database Schema

I tried my best, asked IRC for help but still kinda hard.
Here's what I only got so far
I don't know how would I link medicines to products.
Here's the logic:
Many brands can make many products.
Products can have many brands, and have a name and an image.
Products have a type (it may be Medicine, Soap, etc.).
If a product is a medicine, I want it to be attached to the medicines table, as its attributes will be inserted there.
examples:
Brand A - Product A - image1 - Medicine - 250 - mg
Brand B - Product A - image2 - Medicine - 250 - mg
Brand B - Product A - image2 - Medicine - 500 - mg
Brand B - Product B - image 3 - Soap
Brand B - Product C - image 4- Soap
-- EDIT --
I thought of adding medicine_fk on the products table. It'll be null if it's not a medicine. But thinking about its flexibility, what if in the future there will be more product types?
A good example (bad type tho xD)
I'll be needing car_fk in the product table? that points to car table?
How should I do that?
-- EDIT --
My mind is so stressed about this one that I forgot I should instead put product_id on the car table, the medicine table, and any other product-type tables.
-- EDIT --
At first I wondered how to find all the tables related to a product if a user ever added a new product type.
I thought of making yet another table, and also of dynamically creating a new table for each product type, but that seems an ugly way.
With the help of IRC people I ended up with this:
Are there any possible errors with that?
-- EDIT --
My FINAL table structure is the same as above only without the sub tables.
I removed subattributes and sub categories.
I added a parent_id column to the categories and attributes tables instead.
Much better and I assumed this answered my question.
Firstly, in your diagram the relationship between PRODUCT_TYPES and PRODUCTS is depicted as one-to-one. Shouldn't this relationship be one-to-many? (One product type can have many products; for example, Medicine can have Aspirin, Paracetamol, etc.)
Secondly, the approach you are following is correct. Just introduce your PRODUCT_ID in your medicines table and your CAR table. It should be NOT NULL. Since PRODUCTS is your parent table it should not have the keys from other tables.
Alternatively, you could make the ID in your medicines table a foreign key to the ID column in your PRODUCTS table.
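As a sketch of the first approach (PRODUCT_ID as a NOT NULL foreign key in the child table), something like the following. The column names and types are guesses, since the diagram itself isn't reproduced here.

```sql
-- Hypothetical columns; only the key layout is the point.
CREATE TABLE product_types (
    id   INT PRIMARY KEY,
    name VARCHAR(50) NOT NULL
);

-- One product type has many products.
CREATE TABLE products (
    id              INT PRIMARY KEY,
    product_type_id INT NOT NULL,
    name            VARCHAR(100) NOT NULL,
    FOREIGN KEY (product_type_id) REFERENCES product_types (id)
);

-- The child table carries the key of the parent, never the reverse.
-- A separate id lets one product have several rows (e.g. 250 mg and 500 mg).
CREATE TABLE medicines (
    id         INT PRIMARY KEY,
    product_id INT NOT NULL,
    dosage     INT NOT NULL,
    unit       VARCHAR(10) NOT NULL,
    FOREIGN KEY (product_id) REFERENCES products (id)
);
```

With the alternative approach, medicines.id would itself be the foreign key to products.id, which limits each product to a single medicines row.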
Keep in mind that relationships should be 1:many or many:many, never 1:1. Only in the many:many case do you need an extra table.
"Sub" attributes and categories will be a nightmare to deal with; can you get rid of them?
What will the SELECTs be? To some extent, they drive what the tables look like.
A table with a thousand rows does not need its fields "normalized".

selecting unique combination from a many to many relationship MySQL

I have a database which has a many to many structure between products and their location in our warehouse (due to the random nature of the products being bought)
I need to match a unique location for the same product to multiple customers who order it.
DB SCHEMA
2 tables
Customer:Cust_ID, Prod_ID
Product:Prod_ID,Locat_ID
There are many locations for the same product and many products in the same location.
Customer:
C1,P1
C2,P1
Product:
P1,L1
P1,L2
P1,L3
SELECT
Customer.Cust_ID,
Customer.Prod_ID,
Product.Locat_ID
FROM Customer JOIN Product ON Customer.Prod_ID = Product.Prod_ID
Gives
C1,P1,L1
C1,P1,L2
C1,P1,L3
C2,P1,L1
C2,P1,L2
C2,P1,L3
If I group by Cust_ID I get
C1,P1,L1
C2,P1,L1
but I need
C1,P1,L1
C2,P1,L2
Thanks for any insights
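One possible way to get that pairing, offered as a sketch: number the customers and the locations within each product, then join on the matching position. This assumes MySQL 8.0 or later for ROW_NUMBER(), and the alphabetical ordering of customers and locations is an arbitrary choice of mine.

```sql
-- Sample data from the question.
CREATE TABLE Customer (Cust_ID VARCHAR(10), Prod_ID VARCHAR(10));
CREATE TABLE Product  (Prod_ID VARCHAR(10), Locat_ID VARCHAR(10));
INSERT INTO Customer VALUES ('C1', 'P1'), ('C2', 'P1');
INSERT INTO Product  VALUES ('P1', 'L1'), ('P1', 'L2'), ('P1', 'L3');

-- The n-th customer of a product is matched with its n-th location.
SELECT c.Cust_ID, c.Prod_ID, p.Locat_ID
FROM (
    SELECT Cust_ID, Prod_ID,
           ROW_NUMBER() OVER (PARTITION BY Prod_ID ORDER BY Cust_ID) AS rn
    FROM Customer
) c
JOIN (
    SELECT Prod_ID, Locat_ID,
           ROW_NUMBER() OVER (PARTITION BY Prod_ID ORDER BY Locat_ID) AS rn
    FROM Product
) p ON p.Prod_ID = c.Prod_ID AND p.rn = c.rn
ORDER BY c.Cust_ID;
```

On the sample data this returns C1,P1,L1 and C2,P1,L2. If a product has more customers than locations, the extra customers drop out of the inner join, and locations are only kept unique within a single product, not across products.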