database design: X:X to 1:many vs X:X to 0:many - mysql

(edited 1/5 10:22hr. added some explanation about my notation. And added some additional information I received)
I am doing a course on database design and currently we're doing ERDs and designing databases in MySQL Workbench. Think 1st, 2nd and 3rd NF, creating schemas, tables, constraints, etc.
Most of it is pretty clear to me.
However there's one aspect where things remain unclear: the X:X to 1:many relationship vs the X:X to 0:many relationship (meaning: whatever to 0:many, vs whatever to 1:many, etc).
In some cases it's obvious, in others not so much. Whenever it's unclear to me, it's mostly something like this:
Example :
An artist has 1 to many paintings. A painting has 1 and only 1 artist.
Relationship:
|artist| 1:1 -------- 1:many |painting|
the same in another notation
|artist| ||------------ 1< |painting|
This seems fair, but... then there's the thought: I could be a new artist who has not produced a painting yet.
Or: I could be entering a new artist into an artist table without having entered his paintings yet (which could lead to a practical issue).
Another example:
A workshop has 1 to many participants. A participant enters 0-to-many workshops.
Relationship:
|workshop| many:0 ------- 1:many |participant|
Okay. However: a workshop could have 0 participants (no one wants to participate, probably leading to cancellation).
Or: I could be entering a new workshop into a table, not having added any participants yet.
Another example:
An event is held at 1 and only 1 location. A location has 1 to many events.
Relationship: |event| many:1 -------- 1:1 |location|
However, maybe you're entering a new (future) location, and there have not been events there yet.
Long story short: I am having a hard time establishing the minimum cardinality in cases like the ones above.
Also, when I'm designing a db and get Workbench to forward engineer the SQL for creating the tables (based on my ERD), there doesn't seem to be any difference between a X to 1/many vs a X to 0/many variant. Which makes me wonder: what's the actual (practical) effect or implication of doing one or the other? Maybe the implications (further down the road) make for an easier choice?
Can anyone explain this matter in a simple (fool-proof) way?
Looking forward to understanding!
Addition 1/5:
I've talked about my question/issue with a teacher. He agreed with me that certain minimum cardinalities could lead to a deadlock:
a row cannot be inserted into one table without there being an occurrence in the other, and vice versa.
He explained to me that the ERD is a logical model, not per se a physical model. In other words, the ERD's minimum cardinality is not necessarily meant for technical implementation.
Well, if that is the case, I understand his point. Usually an artist has at least one painting. A workshop normally has at least one participant. A location usually has at least one event. So on a logical level, that seems fine.
On a technical/implementation level, it is another deal. You should be able to enter an artist, workshop or location without there already being occurrences in another table.
My question now is:
is this true? Is an ERD a logical model, not a technical model?
and if that is so, WHAT is the reason for adding the minimum cardinality? It seems of little use.

Let's continue with your artist::paintings relationship, I think that's the clearest. When we say 1 to many, we often mean 0/1 to 0/many, but not all the 4 permutations work or are meaningful.
How does "I can have no artists with no paintings" sound? That's the zero-to-zero permutation. It does not sound wrong, but it's a degenerate case that is of no use to us.
1 artist to 0 paintings is OK, as matbailie described, and presents no problems.
1 artist to many paintings is OK, and is the main use case.
0 artists to many paintings is just not correct and should not be supported in the model. A painting must have an artist.
Because cases 1 & 4 don't work, it is not really correct to say 0/1 to 0/many. It is more correct to characterize the cardinality as 1 to 0/many, which encompasses cases 2 & 3 above.
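As a rough illustration of what "1 to 0/many" means at the table level, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table and column names are assumptions made for the example. The NOT NULL foreign key is what enforces "a painting must have an artist"; nothing in the DDL requires an artist to have a painting.

```python
import sqlite3

# SQLite stands in for MySQL here; table and column names are illustrative.
con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
con.executescript("""
CREATE TABLE artist (
    artist_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE painting (
    painting_id INTEGER PRIMARY KEY,
    title       TEXT NOT NULL,
    -- NOT NULL + REFERENCES: every painting points at exactly one existing artist
    artist_id   INTEGER NOT NULL REFERENCES artist(artist_id)
);
""")

# Case 2: an artist with 0 paintings is accepted.
con.execute("INSERT INTO artist (artist_id, name) VALUES (1, 'New Artist')")


def try_insert(sql):
    """Return True if the statement succeeds, False if a constraint rejects it."""
    try:
        con.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Case 4: a painting without an artist is rejected (NULL or unknown artist id).
no_artist = try_insert("INSERT INTO painting (title, artist_id) VALUES ('Orphan', NULL)")
bad_artist = try_insert("INSERT INTO painting (title, artist_id) VALUES ('Ghost', 99)")
# Case 3: a painting for an existing artist is accepted.
ok = try_insert("INSERT INTO painting (title, artist_id) VALUES ('First work', 1)")

print(no_artist, bad_artist, ok)  # False False True
```

Note that a plain foreign key like this has no way to say "at least one painting per artist", which is probably why the forward-engineered SQL looks the same for the 0/many and 1/many variants.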
You might say, 'but there are cases where we don't know the artist so we should leave an opportunity to have paintings with no artists'. This statement feels to me like you are leaving the realm of the ERD and entering into physical design. From a design standpoint you could just as easily say there is an artist we just don't know who, so let's create an UNKNOWN ARTIST record and connect those paintings to it.
Your second example (workshops::participants) looks more like a 0/many to 0/many. If you run through the same 4 permutations, they all look credible, though the 0-to-0 case still seems kind of ludicrous.
Your last example is another 1 to 0/many because events without locations cannot be held. When you get to the physical level you can talk about the best way to handle virtual events.
So, none of your examples seem to show a true zero-to-many (which is more accurately stated 0/1 to 0/many). I'm thinking they are pretty rare, if they exist at all. It would have to be associated with some kind of optional activity where if you did enroll in it there were a constrained set of choices.

But... A participant can be in 0-N workshops. So that needs to be many-to-many.
In general "zero" is a degenerate case of "1" or "N"; it is rarely worth worrying about.
I like to start by identifying the "entities" in the model. Your participant, workshop, painting, event, artist, location are excellent examples of such. Then I look for "relations" between obvious pairs.
I say there are only 3 cases. And I like to remember that the two "entities" are manifested as two "database tables":
1:1 -- At which point I ask why the two entities (database tables) are not merged together. The two tables share the same unique key.
1:many -- Represented by the id of the "1" as a column in the "many" table.
Many:many -- This needs a link table between the two tables. This linking table has two ids.
"Id" means a unique key for a table. It is usually a number, but that is not a requirement.
For "0"...
1:0 or many:0 -- You may need a LEFT JOIN to provide NULLs when the entry on the "0" side is missing
Many:many -- If either id is non-existent (the zero case), then there are no rows for that relationship.
Then comes defining the INDEXes for efficient access. And, optionally, FOREIGN KEYs for integrity. The indexes that represent how two entities are related are prime candidates for FKs. Other INDEXes should be added to optimize WHERE clauses in SQL queries.
In all cases, the id/FK/index may be "composite" -- meaning that it is two or more columns that are used for a single id/FK/index.
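The many:many case and the "zero" case can be sketched together. The example below is only an illustration: it uses Python's sqlite3 module in place of MySQL, and the table names, column names and sample rows are made up. The link table holds two ids with a composite primary key, and the LEFT JOIN keeps a workshop that has no rows in the link table.

```python
import sqlite3

# SQLite stands in for MySQL; all names and sample rows are illustrative.
con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.executescript("""
CREATE TABLE workshop    (workshop_id    INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE participant (participant_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);

-- Link table: one row per (workshop, participant) pair, composite primary key.
CREATE TABLE workshop_participant (
    workshop_id    INTEGER NOT NULL REFERENCES workshop(workshop_id),
    participant_id INTEGER NOT NULL REFERENCES participant(participant_id),
    PRIMARY KEY (workshop_id, participant_id)
);

INSERT INTO workshop    VALUES (1, 'Pottery'), (2, 'Welding');
INSERT INTO participant VALUES (10, 'Ann'), (11, 'Bob');
INSERT INTO workshop_participant VALUES (1, 10), (1, 11);  -- nobody joined Welding
""")

# LEFT JOIN keeps workshops without participants; COUNT(column) skips the NULLs.
rows = con.execute("""
    SELECT w.title, COUNT(wp.participant_id) AS participants
    FROM workshop AS w
    LEFT JOIN workshop_participant AS wp ON wp.workshop_id = w.workshop_id
    GROUP BY w.workshop_id, w.title
    ORDER BY w.workshop_id
""").fetchall()

print(rows)  # [('Pottery', 2), ('Welding', 0)]
```

With an inner JOIN instead of the LEFT JOIN, the 'Welding' row would simply disappear from the result.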

Related

Exist practices/guidelines for creation of non normalized tables during the normalization process?

I started to teach myself the basics of databases and I am currently working through the 1st to 3rd normal forms. What I understand so far is the wish to remove redundancy, to make my databases less prone to inconsistency when data changes, as well as to save space by eliminating as many duplicates as possible.
For example if we have a table with the following columns:
CD_ID
title
artist
year
and change the design to have multiple tables where the first (CD) contains:
CD_ID
title
artist_ID
the second (artist) contains:
artist_ID
artist
year
I see that in the original table the year is transitively dependent on the ID via the artist. So we wanna get rid of that and create a table for the artists so our new CD table is now in third normal form.
But to do so I created another table (the artist table) which again is not in third normal form as far as I understand it, as we have the same type of transitive dependency as before, just in another table.
Is this correct, and if yes, should I also normalize the artist table to be in 3rd NF? When do I stop?
TL;DR You need to follow a published algorithm to decompose to a given normal form.
PS You didn't get Artist from the original CD via normalization, since you introduced a new column. But assume table Artist has the obvious meaning. Why do you think it "again is not in third normal form as far as I understand it"? If artist -> year in the original CD then it also does in Artist. But then {artist} is, with {artist_id}, a CK (candidate key) of Artist, and Artist is in 3NF (and 5NF).
From your question's original version plus the current one, you have a proposed base table CD with columns cd_id, title, group & year, holding tuples where cd cd_id titled title was made by group group that formed in year year. Column cd_id is unique, hence is a CK. FD {group} -> year also holds.
Normalization does not introduce new column names. It replaces a proposed base table by others, each with a smaller subset of its columns, that always join to what its value would have been. Normalization up to BCNF is based on FDs (functional dependencies), which are also what determine the CKs of a base table. So your question does not contain a decomposition. A possible decomposition reminiscent of your question, which might or might not have any particular properties, would be to tables with column sets {cd_id, title, group} and {group, year}.
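To make "always join to what its value would have been" concrete, here is a small sketch in plain Python. The sample rows are invented for illustration and are chosen so that {group} -> year holds. It projects the table onto {cd_id, title, group} and {group, year} and checks that the natural join of the two projections gives back the original rows.

```python
# Invented sample rows for the proposed CD table: (cd_id, title, group, year).
# They are chosen so that the FD {group} -> year holds.
cd = {
    (1, "Alpha", "The Foos", 1990),
    (2, "Beta", "The Foos", 1990),
    (3, "Gamma", "Bar Band", 2001),
}

# Project onto the two column sets of the decomposition.
cd_group = {(cd_id, title, group) for cd_id, title, group, _ in cd}
group_year = {(group, year) for _, _, group, year in cd}

# Natural join of the projections on the shared column "group".
rejoined = {
    (cd_id, title, group, year)
    for cd_id, title, group in cd_group
    for other_group, year in group_year
    if other_group == group
}

print(rejoined == cd)  # True
```

A check on one sample value proves nothing about every possible value of the table; that is what the FDs and a proven decomposition algorithm are for.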
Other FDs hold in the original. Some hold because of what the columns are; some hold because of the CK; some hold because {group} -> year holds; in general, certain ones hold because all three of those do. And maybe others hold because of what tuples are supposed to go into the relation and what situations can arise. You need to decide for every possible FD whether it holds.
Of course, you might have been told that the only ones that hold are the ones that have to hold under those circumstances. But you won't have been told that the only FD that holds is {group} -> year, because there are trivial FDs and every superset of a CK functionally determines every set of columns.
One definition of 3NF is that a relation is in 2NF and no non-prime column is transitively functionally dependent on any CK. (Notice each condition involves other definitions.) If you want to use this to find out whether your relation is in 3NF then you next need to find out what all the CKs are. You can do this fastest via an appropriate algorithm, but you can just see which sets of columns functionally determine every column but don't contain a smaller such set, since those are the CKs. Then check the two conditions in the definition.
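The brute-force way of "just seeing" which column sets are CKs can be sketched as follows. This is a minimal illustration rather than an efficient algorithm, and the FD list is an assumption: it contains only the FDs given by the CK {cd_id} and by {group} -> year. The closure function computes every column functionally determined by a set of columns; a CK is a set whose closure is all columns and which contains no smaller such set.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every column functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(columns, fds):
    """Brute force: try column sets from smallest to largest."""
    keys = []
    for size in range(1, len(columns) + 1):
        for combo in combinations(sorted(columns), size):
            subset = set(combo)
            if any(key <= subset for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(subset, fds) == columns:
                keys.append(subset)
    return keys


columns = {"cd_id", "title", "group", "year"}
# Assumed FDs: cd_id is unique, and group determines year.
fds = [
    ({"cd_id"}, {"title", "group", "year"}),
    ({"group"}, {"year"}),
]

print(closure({"group"}, fds) == {"group", "year"})  # True
print(candidate_keys(columns, fds))                  # [{'cd_id'}]
```

With {cd_id} as the only CK, year is a non-prime column that depends on cd_id transitively via group, which is the 3NF violation under discussion.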
If you want to normalize to 3NF then you need to follow an algorithm for decomposing to 3NF. You don't explain what process you think you should follow. But if you aren't following a proven algorithm then whatever components you pick might or might not always join to the original and might or might not each be in any particular higher normal form. Note that examples of decompositions you have seen are not presentations of decomposition algorithms.
The NF (normal form) definitions give conditions that a relation must meet to be in that NF. They don't tell you how to nonloss decompose (preserving FDs when possible) to relations in higher NFs. People have worked out algorithms for producing decompositions to particular NFs. (And decomposing to a given NF doesn't in general involve first decomposing to lower NFs. Going through lower NFs can actually prevent good higher-NF decompositions of the original from being generated when you get to decomposing per a higher NF.)
You may also not realize that when some FDs hold, certain other ones must hold. The latter can be determined via Armstrong's axioms from the former. So just because you decomposed to get rid of a particular FD whose presence violates a particular NF doesn't mean there weren't a bunch of other ones that violated it that you didn't deal with. They can be present in the new components. Or they can be not present in problematic ways, so that you have not "preserved" them when you could have, leading to poor designs.
Learn about specific NF algorithms, and for that matter NFs and normalization itself, in a college/university textbook/course/presentation. Many are online.

Is there a more efficient way to handle multi-valued attributes other than creating a relationship table?

I have three tables, tbl_school, tbl_courses and tbl_branches.
Each course can be taught in one or more branches of a school.
tbl_school has got:
id
school_name
total_branches
...
tbl_courses:
id
school_id
course_title
....
tbl_branches:
id
school_id
city
area
address
When I want to list all the branches of a school, it is a pretty straight forward JOIN.
However, each course will be taught in one or more branches or all the branches of the school and I need to store this information. Since there is a one-to-many relationship between tbl_courses and tbl_branches, I will have to create a new relationship table that maps each course record to its respective branches.
When my users want to filter a course by city or area, this relationship table will be used.
I would like to know if this is the right approach or is there something better for my problem?
I was planning to store a JSON of branches of courses which would eliminate the relationship table and query would be much easier to find the city or area pattern in JSON string.
I am new to design patterns so kindly bear with me.
Issues
The table description you have given has a few errors, which need to be corrected first, after which my proposal will make more sense.
The use of a table prefix, especially tbl_, is incorrect. All the tables are tbl_s. If you do use a prefix, it is to group tables by Subject Area. Further, SQL allows a table qualifier when referring to any table in the code:
`... WHERE table_name.column_name = 'something' ...`
If you would like some advice re Naming Convention, please review this Answer.
Use singular, because the table name is supposed to refer to a row (relation), not to the content (we know it contains many rows). Then all the English used re the table_name makes sense. (Eg. refer my Predicates.)
You have some incorrect or extraneous columns. It is easier to give you a Data Model, than to explain each item. A couple of items do need explanation:
school.total_branches is a duplicate, because that value can easily be derived (by COUNT() of the Branches). It breaks Normalisation rules, and introduces an Update Anomaly, which can get "out of synch".
course.school_id is incorrect, given that each Branch may or may not teach a Course. That relation is 1 Course to many Branches, it should be in the new table you are contemplating.
By JSON, if you mean construct an array on the client instead of keeping the relations in the database, then no, definitely not. Data and relationships to data, should be implemented in the database. For many reasons, the most important of which is Integrity. Following that, you may easily drag it into the client, and keep it there for stream-performance purposes.
The table you are thinking about is an Associative Table, an ordinary Relational construct to relate ("map", "link") two parent tables, here Courses to Branches.
Data duplication is not prevented. Refer to the Keys in the Data Model.
ID columns you have do not provide row uniqueness, which the Relational Model demands. If that is not clear to you please read this Answer.
Solution
Here is the model.
Proposed School Data Model
Please review and comment.
I need to ensure that you understand the notation in IDEF1X models: unlike non-standard diagrams, every little notch, tick and line means something very specific. If not, please go to the IDEF1X Notation link at the bottom right of the model.
Please check the Predicates carefully, they (a) explain the model, and (b) are used to verify it. It is a feedback loop. They have two separate benefits.
If you would like more information on Predicates, why they are relevant, please go to this Answer and read the Predicate section.
If you wish to thoroughly understand Predicates, with a view to understanding Data Modelling, consider that Data Model (latest version is linked at the top of the Answer) against those Predicates. Ie. see if you understand a database that you have never seen before, via the model plus Predicates.
The Relational Keys I have given provide the row uniqueness that is required for Relational databases, duplicate data must be prevented. Note that ID columns are simply not needed. The Relational Keys provide:
Data Integrity
Relational access to data (notice the ease of, and unlimited, joins)
Relational speed
None of which a Record Filing System (characterised by ID columns) has.
Column description:
I have implemented two address_lines. Obviously, that should not include city because that is a separate column.
I presume area means something like borough or county or the area that the school branch operates in. If it is a fixed geographic administrative region (my first two descriptors) then it requires a formal structure. If not (my third descriptor), ie. it is loose, or (eg) it spans counties, then a simple Lookup table is enough.
If you use formal administrative regions, then city must move into that structure.
Your approach with an additional table seems the simplest and most straightforward to me. I would not mix JSON in this.
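For illustration, the additional table might look roughly like the sketch below. It uses Python's sqlite3 module as a stand-in for MySQL and keeps the id columns from the question. The table name course_branch, the helper courses_in_city and the sample rows are all hypothetical. Filtering courses by city becomes an ordinary join, and the branch count is derived with COUNT() instead of being stored in total_branches.

```python
import sqlite3

# SQLite stands in for MySQL; course_branch and the sample rows are hypothetical.
con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.executescript("""
CREATE TABLE school (id INTEGER PRIMARY KEY, school_name TEXT NOT NULL);
CREATE TABLE course (
    id           INTEGER PRIMARY KEY,
    school_id    INTEGER NOT NULL REFERENCES school(id),
    course_title TEXT NOT NULL
);
CREATE TABLE branch (
    id        INTEGER PRIMARY KEY,
    school_id INTEGER NOT NULL REFERENCES school(id),
    city      TEXT NOT NULL,
    area      TEXT NOT NULL
);
-- Relationship table: which course is taught at which branch.
CREATE TABLE course_branch (
    course_id INTEGER NOT NULL REFERENCES course(id),
    branch_id INTEGER NOT NULL REFERENCES branch(id),
    PRIMARY KEY (course_id, branch_id)  -- the same pair cannot be stored twice
);

INSERT INTO school VALUES (1, 'Example School');
INSERT INTO course VALUES (1, 1, 'Algebra'), (2, 1, 'Biology');
INSERT INTO branch VALUES (1, 1, 'Pune', 'North'), (2, 1, 'Mumbai', 'West');
INSERT INTO course_branch VALUES (1, 1), (1, 2), (2, 2);
""")


def courses_in_city(city):
    """Titles of courses taught in at least one branch in the given city."""
    rows = con.execute("""
        SELECT DISTINCT c.course_title
        FROM course AS c
        JOIN course_branch AS cb ON cb.course_id = c.id
        JOIN branch AS b ON b.id = cb.branch_id
        WHERE b.city = ?
        ORDER BY c.course_title
    """, (city,)).fetchall()
    return [title for (title,) in rows]


# The branch count is derived on demand, not stored in the school table.
branch_count = con.execute(
    "SELECT COUNT(*) FROM branch WHERE school_id = ?", (1,)
).fetchone()[0]

print(courses_in_city("Pune"))    # ['Algebra']
print(courses_in_city("Mumbai"))  # ['Algebra', 'Biology']
print(branch_count)               # 2
```

Compared with a JSON string of branches, the foreign keys and composite primary key let the database reject a reference to a non-existent branch and a duplicate course/branch pair.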

Restructure Inventory Management Database (2 to 3 Tables; Development Stage)

I’m developing a database. I’d appreciate some help restructuring 2 to 3 tables so the database is both compliant with the first 3 normal forms; and practical to use and to expand on / add to in the future. I want to invest time now to reduce effort / and confusion later.
PREAMBLE
Please be aware that I'm both a nube, and an amateur, though I have a certain amount of experience and skill and an abundance of enthusiasm!
BACKGROUND TO PROJECT
I am writing a small (though ambitious!) web application (using PHP and AJAX to a MySQL database). It is essentially an inventory management system, for recording and viewing the current location of each individual piece of equipment, and its maintenance history. If relevant, transactions will be very low (probably less than 100 a day, but with a possibility of simultaneous connections / operations). Row count will also be very low (maybe a few thousand).
It will deal with many completely different categories of equipment, eg bikes and lamps (to take random examples). Each unit of equipment will have its details or specifications recorded in the database. For a bike, an important specification might be frame colour, whereas a lamp it might require information regarding lampshade material.
Since the categories of equipment have so little in common, I think the most logical way to store the information is 1 table per category. That way, each category can have columns specific to that category.
I intend to store a list of categories in a separate table. Each category will have an id which is unique to that category. (Depending on the final design, this may function as a lookup table and / or as a table to run queries against.) There are likely to be very few categories (perhaps 10 to 20), unless the system is particularly successful and it expands.
A list of bikes will be held in the bikes table.
Each bike will have an id which is unique to that bike (eg bike 0001).
But the same id will exist in the lamp table (ie lamp 0001).
With my application, I want the user to select (from a dropdown list) the category type (eg bike).
They will then enter the object's numeric id (eg 0001).
The combination of these two ids is sufficient information to uniquely identify an object.
Images:
Current Table Design
Proposed Additional Table
PROBLEM
My gut feeling is that there should be an “overarching table” that encompasses every single article of equipment no matter what category it comes from. This would be far simpler to query against than god knows how many mini tables. But when I try to construct it, it seems like it will break various normal forms. Eg introducing redundancy, possibility of inconsistency, referential integrity problems etc. It also begins to look like a domain table.
Perhaps the overarching table should be a query or view rather than an entity?
Could you please have a look at the screenshots and let me know your opinion. Thanks.
For various reasons, I’d prefer to use surrogate keys rather than natural keys if possible. Ideally, I’d prefer to have that surrogate key in a single column.
Currently, the bike (or lamp) table uses just the first column as its primary key. Should I expand this to a composite key including the Equipment_Category_ID column too? Then make the Equipment_Article table into a view joining on these two columns (iteratively for each equipment category). Optionally Bike_ID and Lamp_ID columns could be renamed to something generic like Equipment_Article_ID. This might make the query simpler, but is there a risk of losing specificity? It would / could still be qualified by the table name.
Speaking of redundancy, the Equipment_Category_ID in the current lamp or bike tables seems a bit redundant (if every item / row in that table has the same value in that column).
It all still sounds messy! But surely this must be very common problem for eg online electronics stores, rental shops, etc. Hopefully someone will say oh that old chestnut! Fingers crossed! Sorry for not being concise, but I couldn't work out what bits to leave out. Most of it seems relevant, if a bit chatty. Thanks in advance.
UPDATE 27/03/2014 (Reply to #ElliotSchmelliot)
Hi Elliot.
Thanks for your reply and for pointing me in the right direction. I studied OOP (in Java) but wasn't aware that something similar was possible in SQL. I read the link you sent with interest, and the rest of the site/book looks like a great resource.
Does MySQL InnoDB Support Specialization & Generalization?
Unfortunately, after 3 hours searching and reading, I still can't find the answer to this question. Keywords I'm searching with include: MySQL + (inheritance | EER | specialization | generalization | parent | child | class | subclass). The only positive result I found is here: http://en.wikipedia.org/wiki/Enhanced_entity%E2%80%93relationship_model. It mentions MySQL Workbench.
Possible Redundancy of Equipment_Category (Table 3)
Yes and No. Because this is a lookup table, it currently has a function. However because every item in the Lamp or the Bike table is of the same category, the column itself may be redundant; and if it is then the Equipment_Category table may be redundant... unless it is required elsewhere. I had intended to use it as the RowSource / OptionList for a webform dropdown. Would it not also be handy to have Equipment_Category as a column in the proposed Equipment parent table. Without it, how would one return a list of all Equipment_Names for the Lamp category (ignoring distinct for the moment).
Implementation
I have no way of knowing what new categories of equipment may need to be added in future, so I’ll have to limit attributes included in the superclass / parent to those I am 100% sure would be common to all (or allow nulls I suppose); sacrificing duplication in many child tables for increased flexibility and hopefully simpler maintenance in the long run. This is particularly important as we will not have professional IT support for this project.
Changes really do have to be automated. So I like the idea of the stored procedure. And the CreateBike example sounds familiar (in principle if not in syntax) to creating an instance of a class in Java.
Lots to think about and to teach myself! If you have any other comments, suggestions etc, they'd be most welcome. And, could you let me know what software you used to create your UML diagram. Its styling is much better than those that I've used.
Cheers!
You sound very interested in this project, which is always awesome to see!
I have a few suggestions for your database schema:
You have individual tables for each Equipment entity i.e. Bike or Lamp. Yet you also have an Equipment_Category table, purely for identifying a row in the Bike table as a Bike or a row in the Lamp table as a Lamp. This seems a bit redundant. I would assume that each row of data in the Bike table represents a Bike, so why even bother with the category table?
You mentioned that your "gut" feeling is telling you to go for an overarching table for all Equipment. Are you familiar with the practice of generalization and specialization in database design? What you are looking for here is specialization (also called "top-down".) I think it would be a great idea to have an overarching or "parent" table that represents Equipment. Then, each sub-entity such as Bike or Lamp would be a child table of Equipment. A parent table only has the fields that all child tables share.
With these suggestions in mind, here is how I might alter your schema:
In the above schema, everything starts as Equipment. However, each Equipment can be specialized into Lamp, Bike, etc. The Equipment entity has all of the common fields. Lamp and Bike each have fields specific to their own type. When creating an entity, you first create the Equipment, then you create the specialized entity. For example, say we are adding the "BMX 200 Ultra" bike. We first create a record in the Equipment table with the generic information (equipmentName, dateOfPurchase, etc.) Then we create the specialized record, in this case a Bike record with any additional bike-specific fields (wheelType, frameColor, etc.) When creating the specialized entities, we need to make sure to link them back to the parent. This is why both the Lamp and Bike entities have a foreign key for equipmentID.
An easy and effective way to add specialized entities is to create a stored procedure. For example, let's say we have a stored procedure called CreateBike that takes in parameters bikeName, dateOfPurchase, wheelType, and frameColor. The stored procedure knows we are creating a Bike, and therefore can easily create the Equipment record, insert the generic equipment data, create the bike record, insert the specialized bike data, and maintain the foreign key relationship.
Using specialization will make your transactional life very simple. For example, if you want all Equipment purchased before 1/1/14, no joins are needed. If you want all Bikes with a frameColor of blue, no joins are needed. If you want all Lamps made of felt, no joins are needed. The only time you will need to join a specialized table back to the Equipment table is if you want data both from the parent entity and the specialized entity. For example, show all Lamps that use 100 Watt bulbs and are named "Super Lamp."
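The parent/child layout and the CreateBike idea can be sketched as below. This is an illustration under assumptions, not the schema from the diagram: it runs on SQLite through Python's sqlite3 module, which has no stored procedures, so a Python function create_bike stands in for CreateBike. Using equipmentID as both the primary key and the foreign key of Bike is one possible choice, and the sample values are invented.

```python
import sqlite3

# SQLite stands in for the real server; column types and sample values are assumed.
con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.executescript("""
CREATE TABLE Equipment (
    equipmentID    INTEGER PRIMARY KEY,
    equipmentName  TEXT NOT NULL,
    dateOfPurchase TEXT NOT NULL  -- ISO date string, e.g. '2013-06-01'
);
CREATE TABLE Bike (
    -- Same id as the parent row: at most one Bike row per Equipment row.
    equipmentID INTEGER PRIMARY KEY REFERENCES Equipment(equipmentID),
    wheelType   TEXT NOT NULL,
    frameColor  TEXT NOT NULL
);
""")


def create_bike(bike_name, date_of_purchase, wheel_type, frame_color):
    """Stand-in for the CreateBike procedure: parent row first, then child row."""
    with con:  # one transaction: both inserts commit together or roll back together
        cur = con.execute(
            "INSERT INTO Equipment (equipmentName, dateOfPurchase) VALUES (?, ?)",
            (bike_name, date_of_purchase),
        )
        equipment_id = cur.lastrowid
        con.execute(
            "INSERT INTO Bike (equipmentID, wheelType, frameColor) VALUES (?, ?, ?)",
            (equipment_id, wheel_type, frame_color),
        )
    return equipment_id


bike_id = create_bike("BMX 200 Ultra", "2013-06-01", "20 inch", "blue")

# Child-only question: no join needed.
blue_bikes = con.execute(
    "SELECT equipmentID FROM Bike WHERE frameColor = ?", ("blue",)
).fetchall()

# Question that needs parent and child data: join back to Equipment.
blue_bike_names = con.execute("""
    SELECT e.equipmentName
    FROM Equipment AS e
    JOIN Bike AS b ON b.equipmentID = e.equipmentID
    WHERE b.frameColor = ?
""", ("blue",)).fetchall()

print(bike_id, blue_bikes, blue_bike_names)  # 1 [(1,)] [('BMX 200 Ultra',)]
```

In MySQL the same two inserts could live inside a stored procedure; the point of the sketch is only the order of operations and the shared transaction.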
Hope this helps and best of luck!
Edit
Specialization and Generalization, as mentioned in your provided source, are part of the Enhanced Entity Relationship (EER) model, which helps define a conceptual data model for your schema. As such, they do not need to be "supported" per se; they are more of a design technique. Therefore any database schema naturally supports specialization and generalization as long as the designer implements it.
As far as your Equipment_Category table goes, I see where you are coming from. It would indeed make it easy to have a dropdown of all categories. However, you could simply have a static table (only contains Strings that represent each category) to help with this population, and still keep your Equipment tables separate. You mentioned there will only be around 10-20 categories, so I see no reason to have a bridge between Equipment and Equipment_Category. The fewer joins the better. Another option would be to include an "equipmentCategory" field in the Equipment table instead of having a whole table for it. Then you could simply query for all unique equipmentCategory values.
I agree that you will want to keep your Equipment table to guaranteed common values between all children. Definitely. If things get too complicated and you need more defined entities, you could always break entities up again down the road. For example maybe half of your Bike entities are RoadBikes and the other half are MountainBikes. You could always continue the specialization break down to better get at those unique fields.
Stored Procedures are great for automating common queries. On top of that, parametrization provides an extra level of defense against security threats such as SQL injections.
I use SQL Server. The diagram I created is straight out of SQL Server Management Studio (SSMS). You can simply expand a database, right click on the Database Diagrams folder, and create a new diagram with your selected tables. SSMS does the rest for you. If you don't have access to SSMS I might suggest trying out Microsoft Visio or if you have access to it, Visual Paradigm.

'Many to two' relationship

I am wondering about a 'many to two' relationship. The child can be linked to either of two parents, but not both. Is there any way to reinforce this? Also I would like to prevent duplicate entries in the child.
A real world example would be phone numbers, users and companies. A company can have many phone numbers, a user can have many phone numbers, but ideally the user shouldn't provide the same phone number as the company as there would be duplicate content in the DB.
This question shows that you don't fully understand entity relationships (no rudeness intended). There are four (technically only three) types, listed below:
One to One
One to Many
Many to One
Many to Many
One to One (1:1):
In this case a table has been broken up into two parts for purposes of complying with normalisation, or more usually the open closed principle.
Normalisation compliance: You might have a business rule that each customer has only one account. Technically, you could in this case say customer and account could all be in the same table, but this breaks the rules of normalisation, so you split them and make a 1:1.
Open-closed principle compliance: A customer table might have id, first & last names, and address. Later someone decides to add a date of birth and with it the ability to calculate age, along with a bunch of other much needed fields. This is an oversimplified example of one to one, but you get the idea: the main use for it is to extend your database without breaking existing code. Much code written (sadly) is tightly coupled to the database, so changes in the structure of a table will break the code. Adding a 1:1 like this will extend the table to meet new requirements without modifying the original, thereby allowing old code to continue functioning normally and new code to make use of the new db features.
The downside of normalisation and of extending tables using 1:1 relationships in this way is performance. Often on heavily used systems, the first target for increasing database performance is de-normalising: combining such tables into a single table and optimising the indexes, thus removing the need to use joins and read from multiple tables. Normalisation / de-normalisation is neither a good nor a bad thing, as it depends on the needs of the system. Most systems start off normalised and change back when needed, but this change needs to be done very carefully. As mentioned, if code is tightly coupled to the DB structure, the change will almost definitely cause the system to fail unless the system is properly designed with separation of concerns in mind: when you combine two tables, one ceases to exist, and all the code that refers to that now nonexistent table fails until it is modified. In DB terms, imagine relationships connected to either of the tables in the 1:1: when you remove those tables, this breaks the relationships, and so the structure has to be greatly modified to compensate. Unfortunately, such bad designs are much easier to spot in the DB world than in the software world in most cases, and you don't usually notice something went wrong in code until it all falls apart.
It is the closest thing you can get to inheritance in object oriented programming. But it's not quite the same.
One to Many (1:M) / Many to One (M:1):
These two relationships (hence why 4 become 3) are the most popular relationship types. They are the same type of relationship; the only thing that changes is your point of view. For example, a customer has many phone numbers or, alternatively, many phone numbers can belong to a customer.
In object-oriented programming this would be considered composition. It's not inheritance; you are saying one item is composed of many parts. This is usually represented with arrays / lists / collections etc. inside classes, as opposed to an inheritance structure.
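As a rough sketch of the customer / phone number example, again using Python's sqlite3 module as a runnable stand-in for MySQL: the foreign key lives on the "many" side. The table and column names are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked

conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# The "many" side carries the foreign key. NOT NULL means every phone number
# belongs to exactly one customer.
conn.execute("""
    CREATE TABLE phone_number (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        number      TEXT NOT NULL
    )
""")

conn.execute("INSERT INTO customer VALUES (1, 'Ada')")
conn.executemany(
    "INSERT INTO phone_number (customer_id, number) VALUES (?, ?)",
    [(1, "555-0100"), (1, "555-0101")],
)

# One-to-many view: all numbers of one customer.
numbers = [
    row[0]
    for row in conn.execute(
        "SELECT number FROM phone_number WHERE customer_id = ? ORDER BY number",
        (1,),
    )
]

# Many-to-one view: the single owner of one number.
owner = conn.execute("""
    SELECT c.name
    FROM phone_number p
    JOIN customer c ON c.id = p.customer_id
    WHERE p.number = ?
""", ("555-0101",)).fetchone()[0]

# A number pointing at a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO phone_number (customer_id, number) VALUES (99, '555-0199')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Note that nothing in this schema forces a customer to have at least one phone number; a plain foreign key only constrains the "many" side.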
Many to Many (M:M):
This type of relationship cannot be implemented directly with current technology. For this reason we need to break it down into two one-to-many relationships with an "association" table joining them. The many side of both one-to-many relationships is always on the association / link table.
For your example, the person who said you need a many to many is correct, because a two to many is effectively a many (meaning more than one) to many relationship. This is the only way you would get your system to work, unless you are intending to research the field of relational calculus to find some new type of relationship that would allow this.
Also, for such relationships (m2m) you have two choices. Either create a compound key in the linker table, so that the combination of fields becomes a unique entry (if you are interested in db optimisation, this is the slower choice, but it takes less space). Alternatively, add a third, auto-generated id column and make that the primary key (for db optimisation, this is the faster choice, but it takes more space).
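Both linker-table variants could look something like the sketch below, written with Python's sqlite3 module so it runs as-is. The workshop / participant tables and the names enrolment and enrolment_surrogate are illustrative assumptions, and the UNIQUE constraint in the second variant is an addition of this sketch (without it, the surrogate id alone would not stop the same pair being entered twice).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked

conn.executescript("""
    CREATE TABLE workshop    (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE participant (id INTEGER PRIMARY KEY, name  TEXT NOT NULL);

    -- Choice 1: compound primary key on the two foreign keys.
    CREATE TABLE enrolment (
        workshop_id    INTEGER NOT NULL REFERENCES workshop(id),
        participant_id INTEGER NOT NULL REFERENCES participant(id),
        PRIMARY KEY (workshop_id, participant_id)
    );

    -- Choice 2: auto-generated surrogate id as the primary key.
    CREATE TABLE enrolment_surrogate (
        id             INTEGER PRIMARY KEY,
        workshop_id    INTEGER NOT NULL REFERENCES workshop(id),
        participant_id INTEGER NOT NULL REFERENCES participant(id),
        UNIQUE (workshop_id, participant_id)
    );
""")

conn.executemany("INSERT INTO workshop VALUES (?, ?)",
                 [(1, "ERD modelling"), (2, "SQL basics")])
conn.executemany("INSERT INTO participant VALUES (?, ?)",
                 [(1, "Ann"), (2, "Bob")])
conn.executemany("INSERT INTO enrolment VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 1)])

# Many participants per workshop ...
participants_in_w1 = [r[0] for r in conn.execute("""
    SELECT p.name FROM enrolment e
    JOIN participant p ON p.id = e.participant_id
    WHERE e.workshop_id = 1 ORDER BY p.name
""")]

# ... and many workshops per participant, through the same linker table.
workshops_of_ann = [r[0] for r in conn.execute("""
    SELECT w.title FROM enrolment e
    JOIN workshop w ON w.id = e.workshop_id
    WHERE e.participant_id = 1 ORDER BY w.title
""")]

# The compound key rejects a repeated (workshop, participant) pair.
try:
    conn.execute("INSERT INTO enrolment VALUES (1, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# With the surrogate variant the database hands out the id itself.
surrogate_id = conn.execute(
    "INSERT INTO enrolment_surrogate (workshop_id, participant_id) VALUES (1, 1)"
).lastrowid
```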
In your example specifically above...
A real world example would be phone numbers, users and companies. A company can have many phone numbers, a user can have many phone numbers, but ideally the user shouldn't provide the same phone number as the company as there would be duplicate content in the DB.
This would be a many to many relationship with the phone number table as the linker table between companies and users. As explained, to ensure no phone number is repeated, you simply set it as the primary key or use another primary key and set the phone number field to unique.
For those kinds of questions, it is really down to how you phrase them. What is confusing you, and how to get past it to see the solution, is simple: rephrase the problem as follows. Start by asking whether it is a one to one; if the answer is no, move on. Next ask whether it is a one to many; if the answer is no, move on. The only other option remaining is many to many. Be careful though: make sure you have considered the first two questions carefully before moving on. Inexperienced database people often overcomplicate issues by defining a one to many as a many to many. Once again, the most popular type of relationship by far is one to many (I would say 90%), with many to many and one to one splitting the remaining 10% 7/3 respectively. But those figures are just my personal perspective, so don't go quoting them as industry standard statistics. My point is to make extra, extra sure it is definitely not a one to many before choosing many to many. It is worth the extra effort.
So now, to find the linker table between the two, decide which two are your main tables and what fields need to be shared between them. In this case, the company and user tables both need to share the phone. Hence you need to make a new phone table as the linker.
The warning alarm of misunderstanding should go off as soon as you decide none of the 3 are working for you. This should be enough to tell you that you simply are not phrasing the relationship question correctly. You will get better at it as time passes, but it is an essential skill and really should be mastered as soon as possible for your own sanity.
Of course, you could also go to an object-oriented database, which will allow a range of other relationships called "hierarchical" relationships. That's great if you are thinking of becoming a programmer too. But I wouldn't recommend this, as it is going to make your head hurt when you start finding ways to combine the various types of relationships, especially given there is not much need, since nearly all databases in the world consist of just those 3 types of relationships unless they are something super duper special.
Hope this was a reasonable answer. Thanks for taking the time to read it.
Just make phone number a key in your contact numbers table.
For your phone number example, you would put the phone number in a table by itself, with an ID.
Then you link to that phone_id from each of users and companies.
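A minimal sketch of this phone_id suggestion, using Python's sqlite3 module as a stand-in for MySQL. The column names and sample rows are assumptions; the UNIQUE constraint on number plays the role of "make phone number a key".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked

conn.executescript("""
    -- Phone numbers live in their own table; UNIQUE stops duplicates.
    CREATE TABLE phone (
        id     INTEGER PRIMARY KEY,
        number TEXT NOT NULL UNIQUE
    );
    -- Users and companies each point at a phone row via phone_id.
    CREATE TABLE users (
        id       INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        phone_id INTEGER REFERENCES phone(id)
    );
    CREATE TABLE companies (
        id       INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        phone_id INTEGER REFERENCES phone(id)
    );
""")

conn.execute("INSERT INTO phone VALUES (1, '555-0100')")
conn.execute("INSERT INTO users VALUES (1, 'Ada', 1)")
conn.execute("INSERT INTO companies VALUES (1, 'Acme', 1)")

# The same number cannot be stored a second time.
try:
    conn.execute("INSERT INTO phone (number) VALUES ('555-0100')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# A user and a company sharing a number both reference the same phone row.
shared = conn.execute("""
    SELECT u.name, c.name, p.number
    FROM phone p
    JOIN users u     ON u.phone_id = p.id
    JOIN companies c ON c.phone_id = p.id
""").fetchall()
```

As written, each user and each company can reference only one phone row; allowing several numbers per user would need a linker table instead of a phone_id column.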
For your parents example, you don't link the child to parent - instead you link the parent to the child. OR, you put both parents in the same table, and the child just links to one of them.

Setting up database for 'business_owners' and 'customers'

I'm setting up a database that will have 'business_owners' and 'customers'. I could set this up in a couple days but wanted to see what your opinion is on best practice.
I could have two tables, 'business_owners' and 'customers', each with name, email etc. or...
I could do one table 'Users' and have a user_type as 'business_owner' or 'customer' and just use that type to determine what to show.
I'm thinking the second option is best, any feedback?
Rule of thumb:
If you have more than one table with identical (or near identical) columns, they should be condensed into a single table. Use a type code or similar to distinguish between them as necessary, and work out the business rules for columns that depend on the type code.
Answer:
The second option is the best approach. It's the most scalable, and will be the easiest to work with if you ever need to use resultsets that include both business owners & customers.
It depends on the difference between the two types. If they share exactly the same attributes aside from their role as either a 'customer' or a 'business owner', I would suggest going for the second option, to avoid the overkill of having identical columns in 2 separate tables.
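The single-table option could be sketched like this, using Python's sqlite3 module so the example runs on its own. The CHECK constraint that limits user_type to the two known values is an assumption added for illustration, as are the sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table for both kinds of user; user_type is the type code.
conn.execute("""
    CREATE TABLE users (
        id        INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        email     TEXT NOT NULL,
        user_type TEXT NOT NULL
            CHECK (user_type IN ('business_owner', 'customer'))
    )
""")

conn.executemany(
    "INSERT INTO users (name, email, user_type) VALUES (?, ?, ?)",
    [
        ("Bo", "bo@example.com", "business_owner"),
        ("Cy", "cy@example.com", "customer"),
        ("Di", "di@example.com", "customer"),
    ],
)

# Filter on the type code to decide what to show ...
owners = [r[0] for r in conn.execute(
    "SELECT name FROM users WHERE user_type = 'business_owner' ORDER BY name"
)]

# ... or query both kinds at once, with no UNION across two tables.
everyone = conn.execute(
    "SELECT name, user_type FROM users ORDER BY name"
).fetchall()

# An unknown type code is rejected by the CHECK constraint.
try:
    conn.execute(
        "INSERT INTO users (name, email, user_type) VALUES ('Ed', 'ed@example.com', 'admin')"
    )
    bad_type_rejected = False
except sqlite3.IntegrityError:
    bad_type_rejected = True
```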
How would you model this in an object model? Would you set up a single superclass, call it "stakeholders", that captures the properties of both business-owners and customers? Would you then set up specialized subclasses, "business-owner" and "customer" that extend the definition of stakeholders? If so, read on.
Your case looks like an instance of the Gen-Spec design pattern. Gen-spec is familiar to object oriented programmers through the superclass-subclass hierarchy. Unfortunately, introductions to relational database design tend to skip over how to design tables for the Gen-Spec situation. Fortunately, it’s well understood. A web search on “Relational database generalization specialization” will yield several articles on the subject. Some of your hits will be previous questions here on SO. Here is one article that discusses Gen-Spec in terms of Object Relational Mapping.
The trick is in the way the PK for the subclass (specialized) tables gets assigned. It’s not generated by some sort of autonumber feature. Instead, it’s a copy of the PK in the superclass (generalized) table, and is therefore an FK reference to it.
Thus, if the case were vehicles, trucks and sedans, every truck or sedan would have an entry in the vehicles table, trucks would also have an entry in the trucks table, with a PK that’s a copy of the corresponding PK in the vehicles table. Similarly for sedans and the sedan table. It’s easy to figure out whether a vehicle is a truck or a sedan by just doing joins, and you usually want to join the data in that kind of query anyway.
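The vehicles / trucks / sedans layout could be sketched as follows, using Python's sqlite3 module as a stand-in for MySQL. The non-key columns (make, payload_kg, seats) are invented for the example; the point is only that the subclass tables reuse the vehicles PK as both their PK and an FK.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked

conn.executescript("""
    -- Generalized (superclass) table: one row per vehicle of any kind.
    CREATE TABLE vehicles (
        id   INTEGER PRIMARY KEY,
        make TEXT NOT NULL
    );
    -- Specialized tables: the PK is a copy of vehicles.id, supplied
    -- explicitly on insert, and doubles as the FK reference.
    CREATE TABLE trucks (
        vehicle_id INTEGER PRIMARY KEY REFERENCES vehicles(id),
        payload_kg INTEGER NOT NULL
    );
    CREATE TABLE sedans (
        vehicle_id INTEGER PRIMARY KEY REFERENCES vehicles(id),
        seats      INTEGER NOT NULL
    );
""")

conn.executemany("INSERT INTO vehicles VALUES (?, ?)",
                 [(1, "Volvo"), (2, "Toyota")])
conn.execute("INSERT INTO trucks VALUES (1, 12000)")
conn.execute("INSERT INTO sedans VALUES (2, 5)")

# Joins reveal which kind each vehicle is.
kinds = conn.execute("""
    SELECT v.id, v.make,
           CASE WHEN t.vehicle_id IS NOT NULL THEN 'truck'
                WHEN s.vehicle_id IS NOT NULL THEN 'sedan'
           END AS kind
    FROM vehicles v
    LEFT JOIN trucks t ON t.vehicle_id = v.id
    LEFT JOIN sedans s ON s.vehicle_id = v.id
    ORDER BY v.id
""").fetchall()

# A truck row without a matching vehicles row is rejected by the FK.
try:
    conn.execute("INSERT INTO trucks VALUES (99, 1000)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Nothing in this sketch stops the same vehicle from appearing in both trucks and sedans; enforcing that would need an extra rule on top.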