Database Design: Circular reference and how to correct it - mysql

Is this a circular reference? If so, how can I improve my model?

You don't have any circular references. I interpret the data model to say:
An Item belongs to exactly 1 Client
An Item belongs to 0 or 1 Employee
An Employee belongs to exactly 1 Client
A circular reference would arise if you added: An Employee belongs to exactly 1 Item.
In the comments, you said that an item always belongs to the same client as its employee, but not all items belong to an employee.
There are a few ways to model this.
What I would avoid is having ClientID as a not-null foreign key relationship on Item - this duplicates the logic that "an item without an explicit client ID inherits the client ID from its employee". It's not expressive (people reading the schema would not be able to figure that out), and opens up bugs.
One option is to make the cardinality of both item->employee and item->client optional (i.e. 0..1). Your convention would then be: if an item has a client relationship, it may not have an employee relationship, and if an item has an employee relationship, it may not have an explicit client relationship; the client is determined by the employee. You can't cleanly express this in your schema, and would have to build this into your data access code.
Another option is to create two types of item, one with a clientID foreign key relationship, and one with an employeeId foreign key relationship. This is much more expressive from a schema point of view - presumably there is some business concept you can use to name the tables. However, if Item has lots of attributes, you're duplicating a lot.
Finally, you could store the relationship of items to either client or employee in separate joining tables:
Item
-------
ItemID
...

ItemEmployees
-----------------
ItemID
EmployeeID

ItemClients
----------
ItemID
ClientID

This avoids the duplication of attributes on Item, but is less expressive because it uses a pattern more commonly used for many-to-many relationships, and doesn't explicitly declare "either or".
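As a rough illustration of the joining-table option, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL. The Client and Employee columns and the sample rows are assumptions made up for the example. Making ItemID the primary key of each joining table limits an item to at most one employee and at most one explicit client, but the "either or" rule itself is still not enforced by the schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Client   (ClientID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Employee (EmployeeID INTEGER PRIMARY KEY,
                       ClientID INTEGER NOT NULL REFERENCES Client(ClientID));
CREATE TABLE Item     (ItemID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
-- ItemID as primary key: at most one row per item in each joining table.
CREATE TABLE ItemEmployees (
    ItemID     INTEGER PRIMARY KEY REFERENCES Item(ItemID),
    EmployeeID INTEGER NOT NULL REFERENCES Employee(EmployeeID)
);
CREATE TABLE ItemClients (
    ItemID   INTEGER PRIMARY KEY REFERENCES Item(ItemID),
    ClientID INTEGER NOT NULL REFERENCES Client(ClientID)
);
""")

# Hypothetical sample data: item 100 belongs to an employee of client 1,
# item 101 belongs directly to client 2.
conn.executemany("INSERT INTO Client VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])
conn.execute("INSERT INTO Employee VALUES (10, 1)")
conn.executemany("INSERT INTO Item VALUES (?, ?)", [(100, "Laptop"), (101, "Licence")])
conn.execute("INSERT INTO ItemEmployees VALUES (100, 10)")
conn.execute("INSERT INTO ItemClients VALUES (101, 2)")

# The effective client is the explicit one if present, else the employee's.
rows = conn.execute("""
SELECT i.ItemID, COALESCE(ic.ClientID, e.ClientID) AS ClientID
FROM Item i
LEFT JOIN ItemClients   ic ON ic.ItemID = i.ItemID
LEFT JOIN ItemEmployees ie ON ie.ItemID = i.ItemID
LEFT JOIN Employee      e  ON e.EmployeeID = ie.EmployeeID
ORDER BY i.ItemID
""").fetchall()
print(rows)
```

The COALESCE in the query is where the "client is determined by the employee" convention lives, so every reader of the data has to go through a query (or a view) like this one.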

Related

Can a many to one relationship be represented in a logical ER diagram?

I have a particular problem from my assignment which goes like this:
"Each product making up a set is supplied by a single supplier and is given a unique ID. Products are always sold as part of a set, never on their own."
Based on this, I assumed that many Products make up one Package (aka set), but I don't know if I'm right. If so, how can I visually show a many-to-one relationship in an ER diagram?
I have constructed my own conceptual and logical ER diagrams; I just need to know if I'm right or wrong so that I can continue with the rest.
Here's a breakdown of the assignment and what I get from it:
Each product making up a set is supplied by a single supplier and is given a unique ID. Products are always sold as part of a set, never on their own.
From this I get that we have these entities:
Product
Supplier
Package (Set)
You should know that each Entity needs its own primary key. Pros will call this either id or product_id. Some ORMs work best out of the box if you name the pk for each table 'id', especially when it is a simple sequence number.
It's also better not to do what you are doing with attribute names. In SQL, people stick with either all-uppercase or all-lowercase naming rather than camelCase. Also, I'd suggest that you don't name the price attribute pPrice just because it's in the Package table. Just name it price, because it can be referred to as Package.price if you need to tell it apart from some other table that also contains a price column.
The important thing to understand is that the relationship between Package and Product is Many to Many:
One Product can be part of Many Packages.
One Package can contain Many Products
In order to create entities for a Many to Many relationship, you need a table that sits between the 2 tables and will have foreign keys to both tables in it. Typically people will pick whatever they consider the dominant side -- I would probably use Package, and name the table "PackageProduct" to reinforce the idea that this table lets me package products together and sell or distribute them.
PackageProduct
--------------
id (pk)
package_id (foreign key to Package table)
product_id (foreign key to Product table)
You also need a supplier table, but you were informed that the relationship between Package and supplier is that a Package can have one and only one Supplier.
This is code for: create a one to many relationship between Supplier and Package. In doing this, Package will have a foreign key in it that stores the Supplier.id (or supplier_id)
So to conclude you should have these entities (tables):
Package
Product
Supplier
PackageProduct
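To make the four tables concrete, here is a minimal sketch using Python's built-in sqlite3 module. The name and price columns, the UNIQUE constraint on the pair of foreign keys, and the sample tea-set data are assumptions added for illustration; only the table names and the package_id, product_id and supplier_id keys come from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Supplier (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE Product  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE Package (
    id          INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    price       REAL NOT NULL,
    supplier_id INTEGER NOT NULL REFERENCES Supplier(id)  -- one supplier per package
);
CREATE TABLE PackageProduct (
    id         INTEGER PRIMARY KEY,
    package_id INTEGER NOT NULL REFERENCES Package(id),
    product_id INTEGER NOT NULL REFERENCES Product(id),
    UNIQUE (package_id, product_id)  -- assumption: a product appears once per package
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO Supplier VALUES (1, 'Acme')")
conn.executemany("INSERT INTO Product VALUES (?, ?)",
                 [(1, "Cup"), (2, "Saucer"), (3, "Spoon")])
conn.executemany("INSERT INTO Package VALUES (?, ?, ?, ?)",
                 [(1, "Tea set", 20.0, 1), (2, "Coffee set", 15.0, 1)])
conn.executemany("INSERT INTO PackageProduct (package_id, product_id) VALUES (?, ?)",
                 [(1, 1), (1, 2), (1, 3), (2, 1), (2, 3)])

# One package contains many products ...
products_in_tea_set = [r[0] for r in conn.execute("""
    SELECT p.name FROM Product p
    JOIN PackageProduct pp ON pp.product_id = p.id
    WHERE pp.package_id = 1 ORDER BY p.name""")]

# ... and one product can be part of many packages.
packages_with_cup = [r[0] for r in conn.execute("""
    SELECT k.name FROM Package k
    JOIN PackageProduct pp ON pp.package_id = k.id
    WHERE pp.product_id = 1 ORDER BY k.name""")]

print(products_in_tea_set, packages_with_cup)
```

Both directions of the many-to-many relationship are answered by joining through PackageProduct, which is why neither Package nor Product needs a foreign key to the other.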
ERD
Here's an ERD rendered with "Relational" format which I find a bit more descriptive, as the many sides of the connections use the crowsfoot, so it's very obvious.
According to your description, your schema will have a one-to-many relationship, i.e. a single package comprises many products.
You can also find out your ERD diagram

How to structure a Bill of Materials that has multiple options

I am stuck trying to develop a Bill of Materials in Access. I have a table called IM_Item_Registry where I have the Item_Code and a boolean for whether it's a component. Where I'm stuck is that past sins of the company created several part numbers for the same ingredient from different vendors. A product may use ingredient 1 at the beginning of a run and ingredient 2 at the end of a run depending on inventory, and it may switch from job to job (lack of discipline and random purchasing based on price). It's creating a headache for me because the ingredients typically have different inclusion rates. How would I go about adding in the flexibility to use both? Or would it just be easier to make multiple versions and then select a version upon scheduling?
I know this is loaded and I can include more detail if needed but I appreciate your help I've been researching on how to do this for a couple weeks now.
EDIT (3/28/2019)
This is for an injection molding company.
IM_Item_Registry (Fields: Item_Code, Category (raw, manufactured, customer supplied, assembly component), Description, Component (boolean), Active (boolean), Unit of Measure).
For example, bill of materials 100011 produces a component; let's call it a handle. Bill 100011 uses raw resin 700049 at 98% inclusion and raw color 600020 at 2% inclusion. However, we may run out of raw color 600020 and have to run with 600051 instead, which would change 700049 to 98.5% inclusion because 600051 requires only 1.5% inclusion to achieve the same color.
I would like to create a table that gives a general term for interchangeable items; let's say 600020 and 600051 are both "yellow color additive". I would then create a "ghost" number that calls for either 600020 or 600051 and gives both formulation recipes. When production starts, operators would scan in which color they actually used, creating the production BOM themselves and recording which color was used and how much. Is there a way to structure this in an Access database?
I'm assuming I would need the item_registry table, a BoM table (fields: BOM#, ParentID, Ghost_ID), and a components table (fields: Ghost_ID, Item_Code, Inclusion Rate).
Database normalization is the guiding principle for designing efficient, useful tables and relationships in a relational database. Access forms, subforms, reports, etc. require properly normalized tables to work as intended. There are various levels of normalization, but the common idea is to avoid duplication of data between rows and columns of data. Having duplicate data requires a lot of overhead in storage and in ensuring that actions on the database do not create inconsistent states (contradictory data values). Well-normalized tables allow useful constraints to be defined between data columns and/or rows to ensure that data is valid.
The [BoM] table as proposed in the question is not normalized. But before we get to that, the ParentID was not defined and it's not clear what it represents. Instead, to help show why it's not normalized, let me add a [Product] column to the [BoM] table. Then if such a handle has two alternative lists of components (ghosts?), the table would look like
BOMID  Product        GhostID
-----  -------------  -------
1      Handle         1
1      Handle         2
See the duplication? And now if the product is renamed, for instance to "Bronze Handle", then both rows need to be updated for a single conceptual element. It also introduces the possibility of having contradictory data like
BOMID  Product        GhostID
-----  -------------  -------
1      Handle         1
1      Bronze Handle  2
Enough said about that, since I've already gone on too much about normalization concepts here. Following is a basic normalized schema which would serve you better, but notice that it's not too different from what you proposed in the question. The only real difference is that the BoM table is normalized by splitting its columns (and purpose) into another table.
I do not list all columns here, only primary and foreign keys and a few other meaningful columns. PK = Primary Key (unique, non-null key), FK = Foreign Key. Proper indices should be defined on the PK and FK columns AND relationships defined with appropriate constraints.
Table: [IM_Item_Registry]
Item_Code (PK)
Table: [BOM]
BOMID (PK)
ProductID (FK)
Table: [BOM_Option]
OptionID (PK)
BOMID (FK)
Primary (boolean) - flags the primary/usual list of components
Description
Table: [Option_Items]
OptionID (FK; part of composite PK)
Item_Code (FK; part of composite PK)
Inclusion_Rate
The [BOM].[ProductID] column alludes to another table with details of the product which should be defined separately from the Bill of Material. If this database really is super-simplistic, then it could just be a string field [Product] containing the name, but I assume there are more useful details to store. Perhaps this is what the ParentID also alluded to? (I suggest choosing names that are not so abstract like "parent" and "ghost", hence my choice of the word "option".)
Really, since each BOM should be limited to a single primary option, it would fulfill proper normalization to replace the Primary flag on [BOM_Option] with another table like
Table: [BOM_Primary]
[BOMID] (FK and PK) - Primary key so only one primary option can be defined at once
[OptionID] (FK)
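The schema above can be tried out with a small sketch. This one uses Python's built-in sqlite3 module instead of Access, so the DDL syntax is an assumption and would need adapting; it uses the [BOM_Primary] table in place of the Primary flag, and the descriptions and the ProductID value are made up. The item codes and inclusion rates are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE IM_Item_Registry (Item_Code INTEGER PRIMARY KEY, Description TEXT);
CREATE TABLE BOM (BOMID INTEGER PRIMARY KEY,
                  ProductID INTEGER NOT NULL);  -- product table omitted here
CREATE TABLE BOM_Option (
    OptionID    INTEGER PRIMARY KEY,
    BOMID       INTEGER NOT NULL REFERENCES BOM(BOMID),
    Description TEXT
);
CREATE TABLE Option_Items (
    OptionID       INTEGER NOT NULL REFERENCES BOM_Option(OptionID),
    Item_Code      INTEGER NOT NULL REFERENCES IM_Item_Registry(Item_Code),
    Inclusion_Rate REAL NOT NULL,
    PRIMARY KEY (OptionID, Item_Code)
);
CREATE TABLE BOM_Primary (
    BOMID    INTEGER PRIMARY KEY REFERENCES BOM(BOMID),  -- one primary per BOM
    OptionID INTEGER NOT NULL REFERENCES BOM_Option(OptionID)
);
""")

conn.executemany("INSERT INTO IM_Item_Registry VALUES (?, ?)",
                 [(700049, "Raw resin"), (600020, "Yellow color"),
                  (600051, "Yellow color, other vendor")])
conn.execute("INSERT INTO BOM VALUES (100011, 1)")  # ProductID 1 is hypothetical
conn.executemany("INSERT INTO BOM_Option VALUES (?, ?, ?)",
                 [(1, 100011, "Yellow via 600020"), (2, 100011, "Yellow via 600051")])
conn.executemany("INSERT INTO Option_Items VALUES (?, ?, ?)",
                 [(1, 700049, 98.0), (1, 600020, 2.0),
                  (2, 700049, 98.5), (2, 600051, 1.5)])
conn.execute("INSERT INTO BOM_Primary VALUES (100011, 1)")

# Sanity check: each option's inclusion rates should add up to 100%.
totals = conn.execute("""
    SELECT OptionID, SUM(Inclusion_Rate) FROM Option_Items
    GROUP BY OptionID ORDER BY OptionID""").fetchall()

# Production scans the color actually used; look up the matching recipe.
scanned_item = 600051
recipe = conn.execute("""
    SELECT other.Item_Code, other.Inclusion_Rate
    FROM Option_Items scanned
    JOIN BOM_Option o       ON o.OptionID = scanned.OptionID
    JOIN Option_Items other ON other.OptionID = o.OptionID
    WHERE o.BOMID = ? AND scanned.Item_Code = ?
    ORDER BY other.Item_Code""", (100011, scanned_item)).fetchall()
print(totals, recipe)
```

Looking up the recipe from the scanned item works here because each color appears in only one option of the BOM; if the same item could appear in several options, the scan alone would not identify the option.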

Creating a table for each student. Is it considered a bad practice?

I have hit a road bump where I need to list all the current courses for students and instructors. I have two tables: one is called students and the other is called courses. I was thinking of creating a field for students called courses and then separating entries with a comma so I can use the WHERE IN clause, but creating a table for each student is much easier.
As you have a many-to-many mapping, consider using a linking table with student_id and course_id columns.
"I was thinking of creating a field for students called courses and then separating entries with a comma"
Bad idea, and you're certainly not the first to have it.
"creating a table for each student is much easier"
Worse idea, and you're certainly not the first to have it.
Don't create database structures that require you to parse information from disorganized blobs. And definitely don't create database structures that require you to change the structure every time data changes.
What you're describing, the relationship between Student and Course, is called a many-to-many relationship. To achieve it, all you need is a "linking table" between the two entities. Consider something like this:
Student
----------
ID (PK)
Name
Course
----------
ID (PK)
Name
Simple enough representation of those two entities. Now all you need is a third table to connect them in a many-to-many relationship:
StudentCourse
----------
ID (PK)
StudentID (FK)
CourseID (FK)
A few things to note:
The name of the table doesn't have to follow this convention, this is just a common practice. You can call it anything you like. Enrollment might be a good name for this as it grows into its own entity.
This doesn't need its own ID (PK), its primary key could be a composite of the two foreign keys (since each pair thereof should also be unique in this domain).
This can quickly grow into its own entity if it has more data than just the relationship. For example, if there is specific information about a student's enrollment in a course which is specific to the combination of the two, but not specific to either entity itself. A registration number of some kind, a date/time of enrollment, etc. This table would become its own entity alongside the other two and be more than just a structural linking table.
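A minimal sketch of the linking table, using Python's built-in sqlite3 module; the sample students and courses are made up. It uses the composite primary key variant mentioned in the notes above instead of a separate ID column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Student (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Course  (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE StudentCourse (
    StudentID INTEGER NOT NULL REFERENCES Student(ID),
    CourseID  INTEGER NOT NULL REFERENCES Course(ID),
    PRIMARY KEY (StudentID, CourseID)  -- each pair may appear only once
);
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO Student VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.executemany("INSERT INTO Course VALUES (?, ?)", [(1, "Algebra"), (2, "Biology")])
conn.executemany("INSERT INTO StudentCourse VALUES (?, ?)", [(1, 1), (1, 2), (2, 2)])


def courses_for(student_id):
    """Return the names of all courses a student is enrolled in."""
    rows = conn.execute("""
        SELECT c.Name FROM Course c
        JOIN StudentCourse sc ON sc.CourseID = c.ID
        WHERE sc.StudentID = ? ORDER BY c.Name""", (student_id,))
    return [r[0] for r in rows]


# Enrolling the same student in the same course twice is rejected by the key.
try:
    conn.execute("INSERT INTO StudentCourse VALUES (1, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(courses_for(1), courses_for(2), duplicate_rejected)
```

Adding a student or an enrollment is just another row; nothing about the table structure changes as the data grows, which is the point of the advice above.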

Database Hierarchy Structure - Different Node Representation

I am looking for some feedback/guidance on modeling a hierarchy structure within a relational database. My requirement states that I need to have a tree structure, where every node within the tree can represent a different type of data. For example:
Organization
Department 1
Employee 1
Employee 2
Office Equipment 1
Office Equipment 2
Department 1
Team 1
Office Equipment 3
In the example above, Organization, Department, Employee, Office Equipment, and Team could all be different tables within the database and have different properties associated with them. Additionally, things like Office Equipment may not necessarily be required to be associated to a department - it could be associated to a Team or the Organization.
I have two ideas surrounding modeling this:
The first idea is to have a hierarchy table like below:
hierarchys
hierarchy_id (INT, NOT NULL)
parent_hierarchy_id (INT, NOT NULL)
organization_id (INT, NULL)
department_id (INT, NULL)
team_id (INT, NULL)
office_equipment (INT, NULL)
In the table above, each of the columns would be a nullable field with a foreign key reference to its respective table. The idea would be that only one column from every row would be populated.
My second idea is to have a single table like below:
hierarchys
hierarchy_id (INT, NOT NULL)
parent_hierarchy_id (INT, NOT NULL)
type (INT, NOT NULL)
In this case, the table above would manage the hierarchy structure, and each "node table" would have a hierarchy_id which would have a foreign key reference back to the hierarchy table (i.e. organizations would have a hierarchy_id column). The type column would be a lookup to represent which type of node is being represented (i.e. Organization, Employee, etc).
I see pros and cons in both approaches.
Some additional information:
I would like to keep in mind maintainability of this table - there will be additions, deletions, changes, etc.
I will have to display this data on a user interface, which will likely just display an icon to represent the node type, and the name.
I will have to perform some aggregations across the tree for different data requests.
This structure will be backed by a MySQL database.
Does anyone have an experience with a similar scenario? I have searched quite a bit for information and guidance on this approach, but have not been able to find any information. I have a feeling there is a specific term for what I am looking for that I am failing to use.
Thank you in advance for the community's help.
You may want to look into "nested sets". This is a model for representing subsets of an ordered set by two limits, which we can call "left" and "right". In this model, (6,7) is a subset of (5,10) because it is "nested" inside of it. If you use nested sets together with your design of having a separate table for the hierarchy, you'll end up with four columns in your hierarchy table: leftID, rightID, ObjectID (an FK), and level.
There is a good description of the nested set model on Wikipedia.
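To show how the left/right limits are queried, here is a minimal sketch using Python's built-in sqlite3 module (the same SQL should carry over to MySQL with little change). The name column is a stand-in for the ObjectID foreign key, and the numbering of the sample tree is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hierarchy (
    name    TEXT NOT NULL,             -- stand-in for the ObjectID foreign key
    leftID  INTEGER NOT NULL UNIQUE,
    rightID INTEGER NOT NULL UNIQUE,
    level   INTEGER NOT NULL,
    CHECK (leftID < rightID)
);
-- A node's (leftID, rightID) interval encloses the intervals of its descendants.
INSERT INTO hierarchy VALUES
    ('Organization', 1, 10, 0),
    ('Department 1', 2, 7, 1),
    ('Employee 1',   3, 4, 2),
    ('Employee 2',   5, 6, 2),
    ('Team 1',       8, 9, 1);
""")


def descendants(name):
    """All nodes nested strictly inside the named node, in tree order."""
    rows = conn.execute("""
        SELECT child.name
        FROM hierarchy parent
        JOIN hierarchy child
          ON child.leftID > parent.leftID AND child.rightID < parent.rightID
        WHERE parent.name = ? ORDER BY child.leftID""", (name,))
    return [r[0] for r in rows]


def ancestors(name):
    """All nodes whose interval encloses the named node, root first."""
    rows = conn.execute("""
        SELECT parent.name
        FROM hierarchy child
        JOIN hierarchy parent
          ON parent.leftID < child.leftID AND parent.rightID > child.rightID
        WHERE child.name = ? ORDER BY parent.leftID""", (name,))
    return [r[0] for r in rows]


# Number of descendants follows from the interval width alone.
subtree_size = conn.execute(
    "SELECT (rightID - leftID - 1) / 2 FROM hierarchy WHERE name = 'Organization'"
).fetchone()[0]

print(descendants("Department 1"), ancestors("Employee 2"), subtree_size)
```

Reads like these need no recursion, which suits the aggregation requirement; the trade-off is that inserting or moving a node means renumbering the limits of other rows.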
I have encountered similar situations throughout different projects, and the approach I've taken in those cases was very similar to your second solution.
I am also a bit biased towards how some Ruby on Rails gems do things, but you can easily figure out how you would implement these techniques with plain SQL and some application logic. So I'm giving you one alternative to your solution:
Using "Multi Table Inheritance" (Implemented in Heritage: https://github.com/dipth/Heritage). In this scenario you would have a Node table which forms the basis of your hierarchy with:
Node (id, parent_node_id, heir_type, heir_id)
Where the heir_type is the name of the table holding the details for the node (e.g., Organization, Employee, team, etc.), and the heir_id is the id of the object in that table.
Then each type of node would have its own table and its own unique id, e.g.:
Organization(id, name, address)
Having the rest of the tables independently from the hierarchy (i.e., strong entities) makes your model more flexible to new additions. Also having a separate table with its own unique id to handle the hierarchy makes it easier to render the hierarchy without having to deal with parent types etc. This model is also more flexible in the sense that one entity can be part of many different branches of the hierarchy (e.g., Employee 1 could be a member of Team 1 and Team 2 at the same time.)
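The Node design can be sketched in plain SQL without Rails. The example below uses Python's built-in sqlite3 module and a recursive query to walk the tree; the Team and Employee tables, the NULL parent for the root, and the sample rows are assumptions for illustration. Note that heir_type plus heir_id is a polymorphic reference, so the database cannot declare a foreign key for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Organization (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
CREATE TABLE Team         (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Employee     (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Node (
    id             INTEGER PRIMARY KEY,
    parent_node_id INTEGER REFERENCES Node(id),  -- assumption: NULL marks the root
    heir_type      TEXT NOT NULL,                -- name of the detail table
    heir_id        INTEGER NOT NULL              -- id of the row in that table
);
INSERT INTO Organization VALUES (1, 'Acme', '1 Main St');
INSERT INTO Team VALUES (1, 'Team 1'), (2, 'Team 2');
INSERT INTO Employee VALUES (1, 'Employee 1');
INSERT INTO Node VALUES
    (1, NULL, 'Organization', 1),
    (2, 1, 'Team', 1),
    (3, 2, 'Employee', 1),
    (4, 1, 'Team', 2),
    (5, 4, 'Employee', 1);
""")

DETAIL_TABLES = {"Organization", "Team", "Employee"}  # whitelist of table names


def node_name(heir_type, heir_id):
    """Fetch the display name from the detail table a node points at."""
    if heir_type not in DETAIL_TABLES:
        raise ValueError(f"unknown heir_type: {heir_type}")
    row = conn.execute(
        f"SELECT name FROM {heir_type} WHERE id = ?", (heir_id,)).fetchone()
    return row[0]


# Walk the tree from the root, tracking depth for indentation.
walk = conn.execute("""
WITH RECURSIVE tree(id, heir_type, heir_id, depth) AS (
    SELECT id, heir_type, heir_id, 0 FROM Node WHERE parent_node_id IS NULL
    UNION ALL
    SELECT n.id, n.heir_type, n.heir_id, t.depth + 1
    FROM Node n JOIN tree t ON n.parent_node_id = t.id
)
SELECT heir_type, heir_id, depth FROM tree ORDER BY id
""").fetchall()

outline = [("  " * depth) + node_name(t, i) for t, i, depth in walk]
print("\n".join(outline))
```

Employee 1 shows up under both teams because two Node rows point at the same Employee row, which is the flexibility described above. Ordering by id gives a sensible outline only because the sample ids happen to be in tree order.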
Your solution has one mistake: "hierarchys" is misspelled :P JK. More seriously, the hierarchys table has no unique id. It looks like the unique id is a composite key (hierarchy_id, type). The parent_hierarchy_id does not capture the type of the parent, so it may match multiple nodes and lead to inconsistencies.
If you'd like me to elaborate more, let me know.

Connecting Two Items in a Database - Best method?

In a MySQL Database, I have two tables: Users and Items
The idea is that Users can create as many Items as they want, each with unique IDs, and they will be connected so that I can display all of the Items from a particular user.
Which is the better method in terms of performance and clarity? Is there even a real difference?
Each User will contain a column with a list of Item IDs, and the query will retrieve all matching Item rows.
Each Item will contain a column with the User's ID that created it, and the query will call for all Items with a specific User ID.
Let me just clarify why approach 2 is superior...
Approach 1 means you'd be packing several distinct pieces of information into the same database field. That violates the principle of atomicity and therefore 1NF. As a consequence:
Indexing won't work (bad for performance).
FOREIGN KEYs and type safety won't work (bad for data integrity).
Indeed, approach 2 is the standard way to represent such a "one to many" relationship.
The 2nd approach is better, because it defines a one-to-many relationship from the USER table to the ITEM table.
You can create a foreign key on the USERID column of the ITEM table which refers to the USERID column in the USER table.
You can easily join both tables, and an index can also be used for that query.
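A minimal sketch of approach 2, using Python's built-in sqlite3 module as a stand-in for MySQL; the column names, index name, and sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.executescript("""
CREATE TABLE Users (UserID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Items (
    ItemID INTEGER PRIMARY KEY,
    Title  TEXT NOT NULL,
    UserID INTEGER NOT NULL REFERENCES Users(UserID)  -- the creating user
);
CREATE INDEX idx_items_userid ON Items(UserID);  -- speeds up per-user lookups
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO Users VALUES (?, ?)", [(1, "alice"), (2, "bob")])
conn.executemany("INSERT INTO Items VALUES (?, ?, ?)",
                 [(1, "Lamp", 1), (2, "Desk", 1), (3, "Chair", 2)])

# All items created by one user: a single indexed lookup plus a join.
items_for_alice = [r[0] for r in conn.execute("""
    SELECT i.Title FROM Items i
    JOIN Users u ON u.UserID = i.UserID
    WHERE u.Name = ? ORDER BY i.Title""", ("alice",))]

# The foreign key rejects an item that points at a user who does not exist.
try:
    conn.execute("INSERT INTO Items VALUES (4, 'Ghost item', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(items_for_alice, orphan_rejected)
```

With a comma-separated list of item IDs in the Users table, neither the index nor the foreign key check would be available.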
As long as an item doesn't have multiple owners, it's a one-to-many relationship. This typically gets reduced to the second approach you mention, e.g. having a user or created_by column in the Items table.
If a User can have one or more Items but each Item is owned by only a single User, then you have a classic One-To-Many relationship.
The first option, cramming a list of related IDs into a single field, is exactly the wrong way to do it.
Assign a unique identifier field to each table (called the primary key). And add an extra field to the Item table, a foreign key, the id of the User that owns that item.
Like this ERD (entity-relationship diagram)…
You have some learning to do about relational database design and normalization.