Database design: many2many relationship for self referencing FK - mysql

My question can be demonstrated using the classic "employees" self referencing table, where manager_id is the FK related to the employee_id (PK). Another table is "authorizations".
What if authorizations are only relevant to managers and not to non-managers?
Assuming I create a junction table of "manager authorizations", can this table connect to employees.manager_id even though it's not unique?
Or must I separate managers to another table, even though they have the exact same attributes as non-managers?

Look at this structure: Employees Structure. It handles a complex situation (a manager is responsible for one or more departments) and its records are historical (from_date, to_date).
Try to modify this structure as follows:
departments becomes authorizations
dept_emp becomes the associative table auth_emp
remove dept_manager
A FK should refer to a PK (or at least to a unique key), so connecting auth_emp to employees through manager_id is not a sound option: manager_id is neither unique nor always set (think of NULL references).
Whether to split the employees who are managers into a separate table depends on your requirements. Think about this: when does an employee become a manager? Is an employee a manager as soon as at least one other employee refers to them?
If you want to enforce the constraint that an employee_id registered in auth_emp refers to a manager, you can use a trigger on the employees table. Obviously, the trigger logic depends on your requirements.
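A minimal sketch of that idea, under stated assumptions: it uses SQLite through Python's sqlite3 module only so that it is self-contained (in MySQL the trigger body would use SIGNAL rather than RAISE), the column names beyond employee_id and manager_id are made up, and "manager" is assumed to mean "at least one other employee reports to this person". The trigger here sits on auth_emp and only guards inserts; changes in employees (for example a manager losing their last report) would need their own trigger on the employees table.

```python
import sqlite3

# Hypothetical schema: only employee_id / manager_id come from the question.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employees (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    manager_id  INTEGER REFERENCES employees(employee_id)  -- self-referencing FK
);
CREATE TABLE authorizations (
    auth_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
-- Junction table: it references the PK employee_id, never manager_id.
CREATE TABLE auth_emp (
    employee_id INTEGER NOT NULL REFERENCES employees(employee_id),
    auth_id     INTEGER NOT NULL REFERENCES authorizations(auth_id),
    PRIMARY KEY (employee_id, auth_id)
);
-- Assumed rule: a manager is an employee that somebody reports to.
CREATE TRIGGER auth_emp_only_managers
BEFORE INSERT ON auth_emp
BEGIN
    SELECT RAISE(ABORT, 'employee is not a manager')
    WHERE NOT EXISTS (
        SELECT 1 FROM employees WHERE manager_id = NEW.employee_id
    );
END;
""")

conn.executemany("INSERT INTO employees VALUES (?, ?, ?)",
                 [(1, "Ann", None), (2, "Bob", 1)])
conn.execute("INSERT INTO authorizations VALUES (10, 'approve_budget')")

conn.execute("INSERT INTO auth_emp VALUES (1, 10)")      # Ann manages Bob: accepted
try:
    conn.execute("INSERT INTO auth_emp VALUES (2, 10)")  # Bob manages nobody
    rejected = False
except sqlite3.DatabaseError:
    rejected = True

granted = conn.execute("SELECT COUNT(*) FROM auth_emp").fetchone()[0]
```

Here rejected ends up True and only Ann's authorization is stored.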

Related

Constraints on a child table in MySQL

I have this situation:
MANAGER (ManagerID, Salary, .... , email)
PROJECT (ProjectID, ..., Date)
Since there is an M:N relationship between Manager and Project, I'll have a third table:
Manager_has_Project( ManagerID, ProjectID )
where ( ManagerID, ProjectID ) is the compound PK for Manager_has_Project
Let's suppose we have to delete from our database a Manager who has created some projects: SQL won't let us do that. We could add "ON DELETE CASCADE" to the FK ManagerID in the child table, but in that case we would lose information such as how many managers worked on a project. The alternative is "ON DELETE SET NULL" but, since ManagerID is part of the compound PK of Manager_has_Project, it cannot be set to NULL.
What would you recommend?
If you want to keep the information, use soft deletes rather than actually removing the rows.
That is, add a column, say is_deleted or deletion_datetime that indicates that a Manager has been deleted. Then you can keep all the information, even about "deleted" managers.
You can use views so "normal" queries would only return managers who are not deleted.
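A small sketch of the soft-delete approach, assuming a deletion_datetime column and a hypothetical view named active_manager. SQLite via Python's sqlite3 module is used only to keep the example self-contained; the same table and view definitions should carry over to MySQL with minor type changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE manager (
    manager_id        INTEGER PRIMARY KEY,
    email             TEXT NOT NULL,
    deletion_datetime TEXT              -- NULL means "not deleted"
);
CREATE TABLE project (
    project_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE manager_has_project (
    manager_id INTEGER NOT NULL REFERENCES manager(manager_id),
    project_id INTEGER NOT NULL REFERENCES project(project_id),
    PRIMARY KEY (manager_id, project_id)
);
-- "Normal" queries read from the view and never see deleted managers.
CREATE VIEW active_manager AS
    SELECT manager_id, email
    FROM manager
    WHERE deletion_datetime IS NULL;
""")

conn.executemany("INSERT INTO manager (manager_id, email) VALUES (?, ?)",
                 [(1, "a@example.com"), (2, "b@example.com")])
conn.execute("INSERT INTO project VALUES (100, 'Apollo')")
conn.executemany("INSERT INTO manager_has_project VALUES (?, ?)",
                 [(1, 100), (2, 100)])

# Soft delete: mark the row instead of removing it, so no FK is violated.
conn.execute(
    "UPDATE manager SET deletion_datetime = datetime('now') WHERE manager_id = 1"
)

active = [row[0] for row in conn.execute("SELECT manager_id FROM active_manager")]
managers_on_project = conn.execute(
    "SELECT COUNT(*) FROM manager_has_project WHERE project_id = 100"
).fetchone()[0]
```

After the update, active contains only manager 2, while the project still counts two managers.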

How are these tables related?

Say I run an online business where you can order products from my website, and I have a database with two tables:
Table order with the fields order_number, customer_ID, address
Table customer with the fields customer_ID, first_name, last_name
To get a full, detailed 'report' of the order, I would perform a LEFT JOIN to concatenate data from the order table to include the customer's first and last names along with their address and order number.
My question is, how are these tables related? Are they at all? What would an entity relationship diagram look like? Separately they don't seem to interact and act more like lookup tables for each other.
An order would always have a customer, no? So it is not a left but inner join.
What links them is the customer_id. So your SQL is simply:
select o.order_number, o.customer_ID, o.address,
c.first_name, c.last_name
from orders o
inner join customer c on o.customer_ID = c.customer_ID;
Entity relationship:
Order                            Customer
Customer_Id  0...N >---+ 1       Customer_Id
...                              ...
This EF relation is from MS SQL Server's sample database Northwind. In that sample database, just like in yours, there are Customers and Orders. The Customers and Orders tables are related via the CustomerId fields in both tables (it is the primary key in Customers and a foreign key in Orders). When you model that as an entity relation, you get the above diagram. The Customer entity has an "Orders" navigation property (via CustomerId) that points to a particular Customer's Orders, and the Order entity has a navigation property that points to its Customer (again via CustomerId). The relation is 1 to 0 or many (1 - *), meaning a Customer might have 0 or more Orders.
When you do the join from the Customer side, you use a LEFT join if you want to see all Customers regardless of whether they have Orders (0 or more Orders). If you want to see only those with Orders, you use an inner join.
When you do the join from Orders' side, then an Order must have a Customer so it can't be a LEFT join. It is an INNER join.
You can check the relation from both sides using the connecting CustomerId field.
You wouldn't have a separate table for "OrderId, CustomerId" as it is not many-to-many relation (it would be pure redundancy and would create normalization anomalies).
Hope it is more clear now.
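The difference between the two joins can be sketched as follows. The sample rows are invented, the table is named orders as in the query above, and SQLite through Python's sqlite3 module is used purely so the example runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_ID INTEGER PRIMARY KEY,
    first_name  TEXT,
    last_name   TEXT
);
CREATE TABLE orders (
    order_number INTEGER PRIMARY KEY,
    customer_ID  INTEGER NOT NULL REFERENCES customer(customer_ID),
    address      TEXT
);
-- Invented sample data: Eddie has one order, Mia has none.
INSERT INTO customer VALUES (1, 'Eddie', 'Jones'), (2, 'Mia', 'Lopez');
INSERT INTO orders VALUES (500, 1, '1 Main St');
""")

# From the orders side: every order has a customer, so INNER JOIN is enough.
inner_rows = conn.execute("""
    SELECT o.order_number, c.first_name
    FROM orders o
    INNER JOIN customer c ON o.customer_ID = c.customer_ID
""").fetchall()

# From the customer side: LEFT JOIN keeps customers without any order.
left_rows = conn.execute("""
    SELECT c.first_name, o.order_number
    FROM customer c
    LEFT JOIN orders o ON o.customer_ID = c.customer_ID
    ORDER BY c.customer_ID
""").fetchall()
```

With this data, inner_rows holds only Eddie's order, while left_rows also contains Mia paired with a NULL order number.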
In the entity-relationship model, we don't relate tables. Instead, we relate entities which are represented by their key values. In your example, the customer entity (represented by customer_ID) has a one-to-many relationship with the order entity (represented by order_number) and this relationship is recorded in the order table (where order_number is associated with customer_ID). If we were to strictly follow the ER model, we would record (order_number, customer_ID) (a relationship relation) in a separate table from (order_number, address) (an entity relation).
If there's a foreign key constraint on order.customer_ID referencing customer.customer_ID, that's a subset relation that ensures that every order's customer is a known customer. Thus, the relation between the tables and the relation between the entities are not the same thing.
The relational model allows us to relate (join) tables in any way we need. The obvious join for your example would be on the shared domain customer_ID, but I could just as easily find orders which contain a customer's last_name in its delivery address.
An ER diagram for your example could look like this:
My question is, how are these tables related? Are they at all? What
would an entity relationship diagram look like? Separately they don't
seem to interact and act more like lookup tables for each other.
When doing data modeling with an ER diagram, especially at the conceptual level, we should not start by thinking of tables, primary keys, foreign keys and so on; those belong to the model that comes after the conceptual one, the logical model.
By modeling entities in ER-Diagram, we must always recognize them by their attributes/properties. The entities become concrete when we can find properties in them. I feel a lack of context here, so I'll proceed with a guess:
If you need to persist products in your database, and you need to
save the attributes/properties of them, then we have a Products
entity.
If you need to persist clients/users in your database along with
their properties/attributes, then we have a Customers entity.
According to what you said, customers are able to purchase products. So, you have to relate these two entities in some way. With this in mind, stop and think about the following:
You need to persist purchases/orders.
Purchases/orders show which users bought which products.
By linking products and customers, what would we get? The purchasing/orders event, right?
We would then have two entities relating to each other to form the purchasing (or "orders", you name it) event. This event exists because of the business rules at hand (your rules). You, as the modeler, need to persist the orders, and you know that products and customers relate in some way, so you can create a relationship between the two entities to represent the orders event.
As has been well discussed in other answers, there is the need of separation of attributes here. If the field "address" is where a user lives, not where the product will be delivered, then this attribute must exist in the Customers entity.
If this field/attribute/property is the final location of delivery, it should remain in the relationship orders created between the two entities, customers and products.
Let's talk now about cardinality. If a customer can purchase/order more than one product, and the same product can be purchased by more than one customer, then we have an N-N relationship here. I believe this is your case.
Based on everything that was said, we find the following conceptual model:
By decomposing this model, and developing the logic model, we would have:
Now, why do we end up with this kind of model?
N-N relationships can carry properties/attributes of their own (if needed), so we can have attributes/properties in the relationship Orders, resulting in an ORDERS table with the fields shown. This table represents the purchases/orders made by users/customers.
And why does "Products from orders" exist?
We need to say which products were purchased in which purchases/orders, and in particular how many of the same type were acquired. In N-N relationships, some properties induce the appearance of new relationships, which later become tables. In such situations, the choice is at the discretion of the modeler.
In the "Products from orders" entity we have a composite primary key, made up of foreign keys.
With this type of model, you can see:
Which users made purchases/orders.
Which purchases/orders were made by which users.
Which products belong to which purchases/orders.
Which products have been acquired by which users.
How many products of a type were purchased/ordered.
Using the date field, you can find out how many purchases were made in periods:
Weeks.
Months.
Years.
Quarters.
Etc.
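Since the diagrams are not reproduced here, the following is only a guess at what such a logical model could look like. All table and column names (customers, products, orders, products_from_orders, quantity, order_date) and all rows are assumptions made for illustration, and SQLite via Python's sqlite3 module stands in for MySQL so the sketch is runnable as is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    first_name  TEXT,
    last_name   TEXT
);
CREATE TABLE products (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE orders (
    order_number INTEGER PRIMARY KEY,
    customer_id  INTEGER NOT NULL REFERENCES customers(customer_id),
    address      TEXT,               -- delivery address of this order
    order_date   TEXT NOT NULL       -- ISO date, e.g. 2024-01-15
);
-- "Products from orders": composite PK made up of two foreign keys.
CREATE TABLE products_from_orders (
    order_number INTEGER NOT NULL REFERENCES orders(order_number),
    product_id   INTEGER NOT NULL REFERENCES products(product_id),
    quantity     INTEGER NOT NULL,
    PRIMARY KEY (order_number, product_id)
);
INSERT INTO customers VALUES (1, 'Eddie', 'Jones');
INSERT INTO products VALUES (7, 'Lamp'), (8, 'Desk');
INSERT INTO orders VALUES
    (500, 1, '1 Main St', '2024-01-15'),
    (501, 1, '1 Main St', '2024-02-03');
INSERT INTO products_from_orders VALUES (500, 7, 2), (500, 8, 1), (501, 7, 3);
""")

# How many units of each product were ordered.
units_per_product = conn.execute("""
    SELECT p.name, SUM(i.quantity)
    FROM products_from_orders i
    JOIN products p ON p.product_id = i.product_id
    GROUP BY p.name
    ORDER BY p.name
""").fetchall()

# How many orders were placed per month, using the date field.
orders_per_month = conn.execute("""
    SELECT strftime('%Y-%m', order_date) AS month, COUNT(*)
    FROM orders
    GROUP BY month
    ORDER BY month
""").fetchall()
```

The same grouping idea extends to weeks, quarters or years by changing the expression applied to order_date.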
If you are interested, see also:
E-R diagram confusion
If you have any questions, please comment and I will answer. I hope I've helped a bit.
Cheers.
"Related" is a vague and confusing term. This is because "relationship" gets used in two different ways: as "association" (between values) and as "foreign key" (between tables).
You do not need to know anything about how tables are "related" by foreign keys or common columns in order to query. What matters is how values in a query's result rows are related/associated by the query. The important relationships/associations are (represented by) the tables.
From a relational perspective, there would be a foreign key from Order referencing Customer on customer_id.
A Chen-style ER diagram would have Order and Customer entity types (boxes) and an Orders relationship type (diamond) with participations (lines) from Orders to Order and to Customer. It would have a table for each type. This is an example of perfectly sensible relational design that the ER model, with its artificial and perverse distinction between entities and relationships, cannot capture. People will use a database design like yours to implement such an ER design, though. Even though the ER design isn't, then, the design.
--
When using a database, values are used to identify entities or as property names and magnitudes. Each base table holds the rows of values that participate in some relationship/association given by the DBA. Each query result holds the rows of values that participate in a relation/association that is a combination of conditions and the relationships/associations of the base tables it mentions. All you need to know to query is the relationships/associations of tables.
Order -- order ORDER_NUMBER is by customer CUSTOMER_ID to address ADDRESS
Customer -- customer CUSTOMER_ID is named FIRST_NAME LAST_NAME
Order JOIN (Customer WHERE FIRST_NAME='Eddie')
-- order ORDER_NUMBER is by customer CUSTOMER_ID to address ADDRESS
   AND customer CUSTOMER_ID is named FIRST_NAME LAST_NAME
   AND FIRST_NAME='Eddie'
Sometimes values for a list of columns in a row of one table must also appear as values for a list of columns in another table or that table. This is because if the listed values and others satisfy one table's relationship/association then the listed values and yet others will satisfy the other table's relationship/association.
/* IF there exist ORDER_NUMBER & ADDRESS satisfying
order ORDER_NUMBER is by customer CUSTOMER_ID to address ADDRESS
THEN there exist FIRST_NAME & LAST_NAME satisfying
customer CUSTOMER_ID is named FIRST_NAME LAST_NAME
*/
FOREIGN KEY Order(customer_id) REFERENCES Customer(customer_id)
This means that those particular tables and column lists satisfy the (one and only) relationship/association "foreign key in this database". (And a database will have a meta-data table for its foreign key relationship/association.)
We say that "there is a foreign key" from the first table's list of columns to the second table's list of columns. There's only one foreign key relationship/association for the database. When there is a foreign key its tables and column lists satisfy this database's foreign key relationship. But a foreign key isn't a relationship/association. People call foreign keys "relationships", but they are not.
But foreign keys don't matter to querying! (Ditto for any other constraint.) A query's result holds rows whose values (entity & property info) are related/associated by that query's relationship/association, built from conditions and base table relationships/associations.
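To illustrate that last point, here is a small sketch in which no foreign key is declared at all, yet the join still returns exactly the rows that satisfy both tables' predicates plus the condition. The rows are invented, and SQLite via Python's sqlite3 module is used only to make the example self-contained. Order 501 deliberately names an unknown customer, which a foreign key would have forbidden.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- No FOREIGN KEY is declared anywhere in this schema.
CREATE TABLE "Order" (
    order_number INTEGER PRIMARY KEY,
    customer_id  INTEGER,
    address      TEXT
);
CREATE TABLE Customer (
    customer_id INTEGER PRIMARY KEY,
    first_name  TEXT,
    last_name   TEXT
);
INSERT INTO Customer VALUES (1, 'Eddie', 'Jones'), (2, 'Mia', 'Lopez');
INSERT INTO "Order" VALUES
    (500, 1, '1 Main St'),
    (501, 99, '9 Nowhere Rd');   -- customer 99 is unknown
""")

# "order ORDER_NUMBER is by customer CUSTOMER_ID to address ADDRESS
#  AND customer CUSTOMER_ID is named FIRST_NAME LAST_NAME
#  AND FIRST_NAME = 'Eddie'"
rows = conn.execute("""
    SELECT order_number, customer_id, address, first_name, last_name
    FROM "Order" NATURAL JOIN Customer
    WHERE first_name = 'Eddie'
""").fetchall()
```

The constraint would change which rows may be stored, not what the query means: order 501 simply does not appear because no Customer row satisfies the second predicate for it.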

Is it necessary to bring the primary key of non repeating table while normalizing the database from UNF to 1NF

My UNF is
database(
    manager_id,
    manager_name,
    {supplier_id,
     supplier_name,
     {order_id,
      order_quantity}},
    {purchase_id,
     purchase_date}
)
Here manager_name, supplier_id, order_id and purchase_id are primary keys.
During normalization there will be one table called purchase. Is it necessary to make manager_name a foreign key?
How can I normalize this database?
This is a part of my college project on database. Normalization is really confusing.
First consider splitting things out by things that naturally go together. In this case you have manager information, supplier information, order information and purchase information. I personally would want to know the difference between an order and a purchase because that is not clear to me.
So you have at least four tables for those separate pieces of information. (Depending on the other fields you might need, suppliers and managers could be in the same table with an additional field such as person_type to distinguish them; in that case you would want a lookup table that holds the valid person type values.) Then you need to see how these things relate to each other. Are they in a one-to-one, a one-to-many or a many-to-many relationship? In a one-to-one relationship, the FK also needs a unique constraint or index to maintain the uniqueness. In a many-to-many relationship you will need an additional junction table that contains both ids.
Otherwise, in the simplest case, the child table purchase would have FKs to the manager, supplier, and order tables.
Manager name should under no circumstances be a primary key. Many people have the same name. Use Manager ID as the key because it is unique where name is not. In general I prefer to separate out the names into First, middle and last so that you can sort on last name easily. However in some cultures this doesn't work so well.

redundant foreign key?

I'm currently working on a database with the following requirements: There are accounts of different types. They can have different payment plans depending on the type. The whole thing is like a hierarchy with 2 levels. For instance, if the account types were A and B, then the payment plans could be A1, A2, B1, and B2, where Ax would only be valid for account type A and so on.
So far I've got the following setup:
A table account_types with id and name. A table payment_plans with account_type_id and id, both of them are part of the PK. And a table accounts with type_id and plan_id. I guess it's obvious what references what.
My problem is this: accounts.type_id is a FK and accounts.type_id + accounts.plan_id is a composite FK. I don't know if this is an optimal solution. The type_id is kind of redundant, since it's implicitly defined by the plan_id, but only because of the constraints in place in the payment_plans table. So what would be best practice here? I could get rid of type_id entirely, but then it would take another join with the payment_plans table just to determine the type of an account.
Thanks for your input in advance. Suggestions involving a completely different structure are also welcome. ;-)
I would have these tables:
account_types
- id: PK
- name
payment_plans
- id: PK
- account_type_id: FK to account_types
accounts
- whatever columns an account needs
accounts2payment_plans
- account_id: FK to accounts and part of PK
- payment_plan_id: FK to payment_plans and part of PK
The reasoning behind this is the following:
Each account type can have multiple payment plans, therefore the payment_plans table needs to reference the account_type table. With this reference, the type of a payment plan is defined.
Each account can have multiple payment plans, but each payment plan can be used by multiple accounts, that's why the mapping table accounts2payment_plans is needed. Through the mapping from account to payment plans all account types of an account are implicitly defined.
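The tables above could be sketched like this. The name column on payment_plans, the owner column on accounts and all sample rows are assumptions made for illustration, and SQLite via Python's sqlite3 module is used only so the sketch runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE account_types (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE payment_plans (
    id              INTEGER PRIMARY KEY,
    account_type_id INTEGER NOT NULL REFERENCES account_types(id),
    name            TEXT NOT NULL          -- assumed column
);
CREATE TABLE accounts (
    id    INTEGER PRIMARY KEY,
    owner TEXT NOT NULL                    -- assumed column
);
CREATE TABLE accounts2payment_plans (
    account_id      INTEGER NOT NULL REFERENCES accounts(id),
    payment_plan_id INTEGER NOT NULL REFERENCES payment_plans(id),
    PRIMARY KEY (account_id, payment_plan_id)
);
INSERT INTO account_types VALUES (1, 'A'), (2, 'B');
INSERT INTO payment_plans VALUES (11, 1, 'A1'), (12, 1, 'A2'), (21, 2, 'B1');
INSERT INTO accounts VALUES (1, 'alice');
INSERT INTO accounts2payment_plans VALUES (1, 11), (1, 12);
""")

# The type of an account is derived through its payment plans.
types_of_account_1 = conn.execute("""
    SELECT DISTINCT t.name
    FROM accounts2payment_plans m
    JOIN payment_plans p ON p.id = m.payment_plan_id
    JOIN account_types t ON t.id = p.account_type_id
    WHERE m.account_id = 1
""").fetchall()
```

Note that nothing in this sketch stops one account from being mapped to plans of different account types; whether that is acceptable depends on the requirements.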

Normalization of database for timesheet tool and ensure data integrity

I'm creating a timesheet application. I have the following entities (amongst others):
Company
Employee = an employee associated with a company
Client = a client associated with a company
So far I have the following (abbreviated) database setup:
Company
- id
- name
Employee
- id
- companyId (FK to Company.id)
- name
Client
- id
- companyId (FK to Company.id)
- name
Now, I want an employee to be associated with a client, but only if that client is associated with the company the employee works for. How would you guarantee this data integrity on a database level? Or should I just depend on the application to guarantee this data integrity?
I thought about creating a many to many table like this:
EmployeeClient
- employeeId (FK to Employee.id)
- companyId \ (combined FK to Client.companyId, Client.id)
- clientId /
Thus, when I insert a client for an employee along with the employee's company id, the database should prevent this when the client is not associated with the employee's company id. Does this make sense? Because this still doesn't guarantee the employee is associated with the company. How do you deal with these things?
UPDATE
The scenario is as follows:
A company has multiple employees. Employees will only be linked to one company.
A company has multiple clients also. Clients will only be linked to one company.
(Company is a sandbox, so to speak).
An employee of a company can be linked to a client of its company, but only if the client is part of the company's clientele.
In other words:
The application will allow a company to create/add employees and create/add clients (hence the companyId FK in the Employee and Client tables). Next, the company will be allowed to assign certain clients to certain of its employees (EmployeeClient table).
Imagine an employee working on projects for a few clients for which s/he can write billable hours, but the employee must not be allowed to write billable hours for clients they are not assigned to by their employer (the company). So, employees will not automatically have access to all their company's clients, but only to those that the company has selected for them. Hopefully this has shed some more light on the matter.
If you want to do it from the database level then I would put the logic in a stored procedure. The stored proc code will then associate the two if applicable but this means that (given you put the foreign key to the employee in the client table) a client is only associated with one employee. Is this what you want?
Also take note though that an employee in your table is indirectly associated with all such clients via its company association. If all employees are automatically associated with all new clients of their company then perhaps you just want to write a query that checks for this.
(This is not an answer, but it didn't really fit in as a question comment.)
The data presented for your design question begs a number of questions:
Are employees to be associated with companies and clients? Or...
Are employees only associated with clients, and (thus) the company associated with that client?
If employees and clients are associated with companies, is an employee thus associated with all clients of that company, or must you pick and choose?
Update
As far as data modelling is concerned, it seems like all you need to do is expand the foreign key in EmployeeClient into Employee like so:
EmployeeClient
- companyId
- employeeId
- clientId
Compound primary key on all three columns.
Foreign key on (companyId, clientId) into Client
Foreign key on (companyId, employeeId) into Employee
Thus, all relations defined in EmployeeClient require both Client and Employee to share the same companyId.
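A minimal sketch of this design, assuming the table and column names from the question. For a composite foreign key to be accepted, the referenced pair (companyId, id) has to be declared unique in Employee and in Client, which is an addition not spelled out above. SQLite via Python's sqlite3 module is used only to keep the example self-contained; the helper try_assign and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Company (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE Employee (
    id        INTEGER PRIMARY KEY,
    companyId INTEGER NOT NULL REFERENCES Company(id),
    name      TEXT NOT NULL,
    UNIQUE (companyId, id)          -- target of the composite FK
);
CREATE TABLE Client (
    id        INTEGER PRIMARY KEY,
    companyId INTEGER NOT NULL REFERENCES Company(id),
    name      TEXT NOT NULL,
    UNIQUE (companyId, id)          -- target of the composite FK
);
CREATE TABLE EmployeeClient (
    companyId  INTEGER NOT NULL,
    employeeId INTEGER NOT NULL,
    clientId   INTEGER NOT NULL,
    PRIMARY KEY (companyId, employeeId, clientId),
    FOREIGN KEY (companyId, employeeId) REFERENCES Employee(companyId, id),
    FOREIGN KEY (companyId, clientId)   REFERENCES Client(companyId, id)
);
INSERT INTO Company VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO Employee VALUES (10, 1, 'Ann');
INSERT INTO Client VALUES (20, 1, 'Client of Acme'), (21, 2, 'Client of Globex');
""")


def try_assign(company_id, employee_id, client_id):
    """Hypothetical helper: return True if the database accepts the link."""
    try:
        conn.execute("INSERT INTO EmployeeClient VALUES (?, ?, ?)",
                     (company_id, employee_id, client_id))
        return True
    except sqlite3.IntegrityError:
        return False


same_company = try_assign(1, 10, 20)         # Ann and client 20 are both at Acme
wrong_client = try_assign(1, 10, 21)         # client 21 belongs to Globex
wrong_employee = try_assign(2, 10, 21)       # Ann does not work for Globex
```

Only the first assignment is stored; the other two fail because one of the (companyId, id) pairs does not exist in the referenced table.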