Do I create an ER Diagram before or after normalization? - mysql

I'm learning about creating databases in MySQL, and one of the theoretical parts is developing an ER diagram.
Do I really need one when making my own project? And if I do create one, should I create it after normalizing the relations?

1). You're not required to do it, but it can definitely help you keep a clear overview of your schema.
2). I'd just start by making an ER diagram and update it after, or during, normalization. You could use a tool like MySQL Workbench to easily create and manage ER diagrams.

Normalization is part of database refinement and has to be carried out before drawing the Entity Relationship Diagram.
In the ERD technique, we identify the primary keys and foreign keys and use them to express the relationships among the entities.

Related

Another snowflake schema with many-to-many dimension

So there's already a question about SnowFlake Diagram and Many to Many relationships, but mine was a little bit different. Take a look at this schema.
draw.io (sorry I can't upload image to Imgur)
This is a simple star schema. I want to capture/retrieve some metric that is identifiable by a user and a team, so the above schema makes sense. But there's a many-to-many relationship between dim_user and dim_team, and of course everyone wants to avoid many-to-many relationships. A common approach is to create a bridge table between dim_user and dim_team. But then this doesn't look like a snowflake schema: the fact table is connected to 2 dimension tables that themselves have a relationship.
In my mind this is fine, but since everything I can find about snowflake schemas shows only one of those 2 dimensions connected to the fact table, I'm afraid this is a design flaw. Any thoughts about this?
Just merge the Team attributes into the User Dim.
You can still keep the Team Dim as it is, if you have fact tables that are at the Team, rather than User, grain.

How do I create relationships between a single table?

I have the above tables in a database I'm designing right now, in MySQL. The primary purpose of the database is to create Bills of Materials for a database and enforce revision control on these Bills of Materials.
The Parts table follows Single Table Inheritance and has 3 different types of parts: Connectors, Terminals, Seals.
Brief description of each type:
Connectors: These are automotive-grade connectors used in the manufacturing of automotive wiring harnesses.
Terminals: A connector has crimped wires inserted into it. A terminal is crimped onto a wire in order to create a solder-less joint. These terminals then mate with their counterparts when the connector is mated with its counterpart in a vehicle.
Seals: These are a special type of seal that is inserted onto the wire in order to prevent water/dust from getting through to the interconnection.
A connector can be used with multiple types of terminal and a terminal can also be used with multiple types of connectors.
The relationship between a connector and a seal is similar. A seal and terminal have no relationship.
What I'm aiming for:
If the user is browsing some part, I would like the view to show all its related/associated parts. For instance, if Connector id 1 can be used with 5 different types of terminals, I would like all these terminals to be shown in the view.
Similarly, when a Terminal is being viewed, I would like all the different connectors that it can be used with to be shown as well.
Furthermore, a Connector can have substitute parts and I would like to relate that as well. This is a one-to-many relationship as a connector can have multiple substitutes.
And finally, a Connector can have multiple counter-parts and I would like these to be related as well.
I'm new to database design and I'm having trouble seeing the forest through the trees. Personally, I think I should ditch the Single Table idea and go with separate tables for Connectors, Seals and Terminals and draw up relationships between them.
That still doesn't answer how I can show substitutes and counter-part connectors.
If you are looking for an alternative to single table inheritance, look up class table inheritance. In your case, this would mean four tables in place of parts: Parts, Connectors, Seals, and Terminals. If each of the last three tables uses a shared primary key off of the parts table, you'll enforce the 1-to-1 nature of the "is-a" relationship, but not its mutual exclusivity (nothing stops one part from appearing in two subtype tables).
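As a rough illustration of the shared-primary-key idea, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere; the column names (part_number, cavities, wire_gauge, colour) are made up for the example and are not taken from the question's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE parts (
    part_id     INTEGER PRIMARY KEY,
    part_number TEXT NOT NULL UNIQUE      -- hypothetical shared attribute
);
-- Each subtype table reuses parts.part_id as its own primary key,
-- so a part can have at most one row in any given subtype table.
CREATE TABLE connectors (
    part_id  INTEGER PRIMARY KEY REFERENCES parts(part_id),
    cavities INTEGER NOT NULL             -- hypothetical subtype attribute
);
CREATE TABLE terminals (
    part_id    INTEGER PRIMARY KEY REFERENCES parts(part_id),
    wire_gauge TEXT NOT NULL              -- hypothetical subtype attribute
);
CREATE TABLE seals (
    part_id INTEGER PRIMARY KEY REFERENCES parts(part_id),
    colour  TEXT NOT NULL                 -- hypothetical subtype attribute
);
""")

conn.execute("INSERT INTO parts VALUES (1, 'C-100')")
conn.execute("INSERT INTO connectors VALUES (1, 6)")

# A second connector row for the same part is rejected: 1-to-1 is enforced.
try:
    conn.execute("INSERT INTO connectors VALUES (1, 8)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# A subtype row with no parent row in parts is rejected as well.
try:
    conn.execute("INSERT INTO seals VALUES (99, 'red')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# But nothing stops part 1 from ALSO being registered as a terminal:
# mutual exclusivity is not enforced by the shared key alone.
conn.execute("INSERT INTO terminals VALUES (1, '18 AWG')")
overlap_allowed = conn.execute(
    "SELECT COUNT(*) FROM terminals WHERE part_id = 1"
).fetchone()[0] == 1
```

If mutual exclusivity matters, it would need an extra mechanism (for example a type column on parts that the subtype tables also reference), which this sketch deliberately leaves out.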
More to the point, you have not modelled the relationships between parts and other parts that you outlined in your verbal description. Those relationships may be inherent in the BOMs you create, but they aren't obvious here. If you model those relationships, will it make the database more functional? I don't know, offhand.
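For what it's worth, one possible way to model the part-to-part relationships described in the question is with association tables. The sketch below is only an assumption about how that could look: the table names connector_fits and connector_substitutes, and all sample data, are invented for illustration, and sqlite3 is used in place of MySQL so the example is runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE parts (
    part_id   INTEGER PRIMARY KEY,
    part_type TEXT NOT NULL
              CHECK (part_type IN ('connector', 'terminal', 'seal')),
    name      TEXT NOT NULL
);
-- Many-to-many: which terminals or seals can be used with which connector.
-- Note: these tables do not check that connector_id really is a connector.
CREATE TABLE connector_fits (
    connector_id   INTEGER NOT NULL REFERENCES parts(part_id),
    fitted_part_id INTEGER NOT NULL REFERENCES parts(part_id),
    PRIMARY KEY (connector_id, fitted_part_id)
);
-- Self-referencing: a connector and the parts that can substitute for it.
CREATE TABLE connector_substitutes (
    connector_id  INTEGER NOT NULL REFERENCES parts(part_id),
    substitute_id INTEGER NOT NULL REFERENCES parts(part_id),
    PRIMARY KEY (connector_id, substitute_id),
    CHECK (connector_id <> substitute_id)
);
""")
conn.executemany("INSERT INTO parts VALUES (?, ?, ?)", [
    (1, "connector", "Connector A"),
    (2, "connector", "Connector B"),
    (3, "terminal", "Terminal X"),
    (4, "terminal", "Terminal Y"),
    (5, "seal", "Seal S"),
])
conn.executemany("INSERT INTO connector_fits VALUES (?, ?)",
                 [(1, 3), (1, 4), (1, 5), (2, 3)])
conn.execute("INSERT INTO connector_substitutes VALUES (1, 2)")


def parts_for_connector(connector_id):
    """Names of all terminals/seals that fit the given connector."""
    rows = conn.execute(
        "SELECT p.name FROM connector_fits f "
        "JOIN parts p ON p.part_id = f.fitted_part_id "
        "WHERE f.connector_id = ? ORDER BY p.name", (connector_id,))
    return [name for (name,) in rows]


def connectors_for_part(part_id):
    """Names of all connectors the given terminal/seal can be used with."""
    rows = conn.execute(
        "SELECT p.name FROM connector_fits f "
        "JOIN parts p ON p.part_id = f.connector_id "
        "WHERE f.fitted_part_id = ? ORDER BY p.name", (part_id,))
    return [name for (name,) in rows]


def substitutes_of(connector_id):
    """Names of all registered substitutes for the given connector."""
    rows = conn.execute(
        "SELECT p.name FROM connector_substitutes s "
        "JOIN parts p ON p.part_id = s.substitute_id "
        "WHERE s.connector_id = ? ORDER BY p.name", (connector_id,))
    return [name for (name,) in rows]
```

Counter-part connectors could be handled with another self-referencing table of the same shape as connector_substitutes.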
If you are having trouble seeing the forest for the trees, it may be that you are trying to design the solution without having analyzed the problem. I have found it handy to analyze the subject matter using the ER model. This breaks the subject matter down into entities, relationships, and attributes, but doesn't say anything about table design. Hint: your model is a relational model, not an ER model, even though you use ER diagramming conventions. Again, in my experience, once I have an ER model that describes the problem, it's easy (although tedious) to devise a relational model that designs the solution.

Relational Database & MyIsam

Just coming out of University, I have been taught the 'right' way of designing databases. e.g. database normalisation, how to structure tables etc.
Now I am faced with something which they didn't teach me at University...
It appears that I have a choice of 2 database engines - MyISAM or InnoDB.
I know that I can build a relational database with InnoDB storage engine, however as far as I can see, I cannot build a relational database with the MyISAM storage engine as I cannot link the tables.
So - my question - And please tell me if I am just being dumb or just missing a trick...
If I can't build a relational database with MyISAM, then what is it good for?
How do I ensure database integrity with MyISAM?
Do most people use MyISAM or InnoDB?
How do I enforce constraints between two MyISAM tables?
E.g. if I am building a small online store, I will have one table for products and one table for categories. A product must belong to 1 category. How would I build this using MyISAM?
It is true that the current version of MySQL will not enforce foreign key constraints that are defined on MyISAM tables, but that does not mean one cannot create relations between such tables (which are, after all, just a matter of holding in one table data that identifies a related record in another table): one must just be more careful to manage them properly.
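To make "managing them properly" a bit more concrete, here is a minimal sketch of application-side integrity checks for the products/categories example. It is an illustration only: sqlite3 tables without any FOREIGN KEY clause stand in for MyISAM tables, and the helper names add_product and orphaned_products are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (
    category_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
-- No FOREIGN KEY clause: category_id is just a number the engine never checks,
-- which mimics an engine that does not enforce foreign keys.
CREATE TABLE products (
    product_id  INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    category_id INTEGER NOT NULL
);
""")


def add_product(name, category_id):
    """Insert a product only if its category exists (checked by the application)."""
    found = conn.execute(
        "SELECT 1 FROM categories WHERE category_id = ?", (category_id,)
    ).fetchone()
    if found is None:
        raise ValueError(f"unknown category {category_id}")
    conn.execute(
        "INSERT INTO products (name, category_id) VALUES (?, ?)",
        (name, category_id))


def orphaned_products():
    """Audit query: products whose category_id matches no category row."""
    rows = conn.execute("""
        SELECT p.name
        FROM products p
        LEFT JOIN categories c ON c.category_id = p.category_id
        WHERE c.category_id IS NULL
        ORDER BY p.name""")
    return [name for (name,) in rows]


conn.execute("INSERT INTO categories VALUES (1, 'Books')")
add_product("Novel", 1)

# Any code path that bypasses add_product can still create a dangling reference.
conn.execute("INSERT INTO products (name, category_id) VALUES ('Mystery item', 42)")
```

The check-then-insert in add_product is not atomic, and deletes from categories need the same care, so a periodic audit like orphaned_products is a sensible complement.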
If enforced ACID compliance is important to you, then InnoDB is the way to go; if you can sacrifice such compliance in return for improved performance in certain situations, then MyISAM may be worth a look. You can even mix and match both storage engines within the same database to achieve a balance, if required.
There are a lot of resources discussing the pros and cons of MyISAM vs InnoDB—just search on Google (or this site) and you will find!
The relational model was developed in 1969-1970 in order to help clarify the case for building databases that conformed to that model. There is no particular reason why the relational model should be used to model data that is eventually going to go into a hierarchical database. However, it might be useful to use the relational model to help describe the data as it comes out of a hierarchical database, and before it is delivered to the DBMS client.
It's important to realize that the relational model of data is a design tool, and not really an analysis tool. The ER model of data was invented precisely for data analysis without tilting the analysis towards one particular implementation, such as a relational or SQL implementation. In the ER model as such, there are no foreign keys. It's important to understand that foreign keys are a feature of the solution, not a feature of the problem.
Maybe, in order to build a decent database using MyISAM, you need to relearn the correct way to build a database. What you learned in university may have been how to build a relational database, presuming that no students would ever have to build a hierarchical database.
Caveat: most hierarchical DBMSes have had a "relational layer" plastered on top of them, so that people who think in relational terms can use the tool without having to leave the relational model to one side. And relational DBMSes are "better" than non-relational ones, in at least some ways. The arguments made in 1969-1970 are still largely valid.

Steps to design a well organized and normalized Relational Database

I just started making a database for my website, so I am re-reading Database Systems - Design, Implementation and Management (9th Edition), but I notice there is no single step-by-step process described in the book for creating a well-organized and normalized database. The book seems to be a little all over the place, and although the normalization process is all in one place, the steps leading up to it are not.
I thought it would be very useful to have all the steps in one list, but I cannot find anything like that online or anywhere else. I realize an answer explaining all of the steps would be quite extensive, but anything I can get on this subject will be greatly appreciated, including the order of the steps before normalization and links with suggestions.
Although I am semi-familiar with the process, I took a long break (about 1 year) from designing any databases, so I would like everything described in detail.
I am especially interested in:
What's a good approach to begin modeling a database (or how to list business rules so it's not confusing)
I would like to use ER or EER (extended entity relationship model) and I would like to know
how to model subtypes and supertypes correctly using EER (disjoint and overlapping), as well as how to write down the business rules for them so you know that it's a subtype, if there is any common way of doing that
(I am already familiar with the normalization process, but an answer can include tips about it as well)
Still need help with:
Writing down business rules (including business rules for subtypes and supertypes in EER)
How to use subtypes and supertypes in EER correctly (how to model them)
Any other suggestions will be appreciated.
I would recommend these videos (about 9 of them) on E/R modeling:
http://www.youtube.com/watch?v=q1GaaGHHAqM
EDIT:
"how extensive must the diagrams for this model be ? must they include all the entities and attributes?? "
Yes. Actually, you have ER modeling and extended ER modeling.
The idea is to do the extended ER modeling, because there you not only specify the entities, you also specify the PKs, the FKs and the cardinality. Take a look at this link (see the graphics and the difference between both models).
There are two ways of modeling: one is the real scenario and the other is the real structure of the DB, i.e.:
When you create an E-ER model you draw the relationship and cardinality for ALL entities, but when you go on to create the DB it is not necessary to create relation tables for 1:N cardinality (the table on the N side gets an FK to the table on the 1 side, so you don't need a relation table in the DB), and when you have 1:1 cardinality you know that one of your entities can absorb the other.
Look at this graphic: only the N:M relation tables were created (when you see 2 or more FKs, that's a relation table).
But remember those are just "rules" and you can break them if your design needs to, for performance, security, etc.
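A small sketch of the mapping rule described above: a 1:N relationship becomes a foreign key on the "N" side, while an N:M relationship gets its own relation table with two foreign keys. The customer/orders/product tables are invented for the example, and sqlite3 is used instead of MySQL so it runs as-is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- 1:N (one customer, many orders): the "N" side carries the FK, no extra table.
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
);
-- N:M (an order has many products, a product appears in many orders):
-- a relation table holding two FKs.
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE order_product (
    order_id   INTEGER NOT NULL REFERENCES orders(order_id),
    product_id INTEGER NOT NULL REFERENCES product(product_id),
    quantity   INTEGER NOT NULL DEFAULT 1,
    PRIMARY KEY (order_id, product_id)
);
""")
conn.execute("INSERT INTO customer VALUES (1, 'Ada')")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(10, 1), (11, 1)])
conn.executemany("INSERT INTO product VALUES (?, ?)", [(100, "Pen"), (101, "Ink")])
conn.executemany("INSERT INTO order_product VALUES (?, ?, ?)",
                 [(10, 100, 2), (10, 101, 1), (11, 100, 5)])

# Four tables in total: three entities plus one N:M relation table.
tables = [name for (name,) in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

orders_for_customer = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE customer_id = 1").fetchone()[0]
orders_with_pen = [oid for (oid,) in conn.execute(
    "SELECT order_id FROM order_product WHERE product_id = 100 ORDER BY order_id")]
```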
About tools, there are a lot of them, but I recommend Workbench, because you can use it to connect to your DBs (if you are on MySQL) and create E/R model designs with attributes, and it will auto-create the N:M relation tables.
EDIT 2:
Here are some links that explain this a little better; it would take a lot of lines and be harder for me to explain here myself. Please review these links and let me know if you have questions:
type and subtype:
http://www.siue.edu/~dbock/cmis450/4-eermodel.htm
business rules (integrity constraints):
http://www.deeptraining.com/litwin/dbdesign/FundamentalsOfRelationalDatabaseDesign.aspx (please take a special look at this one; I think it will help you with all this info)
http://www.google.com/url?sa=t&rct=j&q=database%20design%20integrity%20constraints&source=web&cd=1&ved=0CFYQFjAA&url=http%3A%2F%2Fcs-people.bu.edu%2Frkothuri%2Flect12-constraints.ppt&ei=2aLDT-X4Koyi8gTKhZWnCw&usg=AFQjCNEvXGr7MurxM-YCT0-rU0htqt6yuA&cad=rja
I have reread the book and some articles online and have created a short list of steps for designing a decent database (of course, you need to understand the basics of database design first). The steps are described in greater detail below:
(A lot of the steps are described in the book Database Systems - Design, Implementation and Management (9th Edition), and that's what the page numbers refer to, but I will try to describe as much as I can here and will edit this answer in the following days to make it more complete.)
1. Create a detailed narrative of the organization’s description of operations.
2. Identify the business rules based on the description of operations.
3. Identify the main entities and relationships from the business rules.
4. Translate entities/relationships to an EER model
5. Check naming conventions
6. Map the EER model to a logical model (pg 400)*
7. Normalize the logical model (pg 179)
8. Improve the DB design (pg 187)
9. Validate the logical model integrity constraints (pg 402) (like length etc.)
10. Validate the logical model against user requirements
11. Translate tables to MySQL code (in Workbench, translate the EER model to an SQL file using the export function, then load it into MySQL)
*You can possibly skip this step if you are using Workbench and work off the ER model that you design there.
1. Describe the workings of the company in great detail. If you are creating a personal project, describe it in detail; if you are working with a company, ask for documents describing the company and interview the employees for information (interviews might generate inconsistent information, so make sure to check with supervisors which information is more important for the design).
2. Look at the gathered information and start generating rules from it, making sure to fill any gaps in your knowledge. Confirm with supervisors in the company before moving on.
3. Identify the main entities and relationships from the business rules. Keep in mind that during the design process, the database designer does not depend simply on interviews to help define entities, attributes, and relationships. A surprising amount of information can be gathered by examining the business forms and reports that an organization uses in its daily operations. (pg 123)
4. If the database is complex you can break down the ERD design into the following substeps:
i) Create External Models (pg 46)
ii) Combine External Models to form Conceptual Model (pg 48)
Use the following iterative steps for the design (or for each substep):
I. Develop the initial ERD.
II. Identify the attributes and primary keys that adequately describe the entities.
III. Revise and review the ERD.
IV. Repeat these steps until the output is satisfactory.
You may also use entity clustering to further simplify your design process.
Describing the database through an ERD:
Use solid lines to connect weak entities (weak entities are those which cannot exist without a parent entity and contain the parent's PK in their own PK).
Use dashed lines to connect strong entities (strong entities are those which can exist independently of any other entity).
5. Check if your names follow your naming conventions. I used to have suggestions for naming conventions here but people didn't really like them. I suggest following your own standards or looking up some naming conventions online. Please post a comment if you found some naming conventions that are very useful.
6. Logical design generally involves translating the ER model into a set of relations (tables), columns, and constraint definitions.
Translate the ER model to the logical model using these steps:
Map strong entities (entities that don't need other entities to exist)
Map supertype/subtype relationships
Map weak entities
Map binary relationships
Map higher degree relationships
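As a hedged illustration of the first and third mapping steps, the sketch below maps one strong entity and one weak entity to tables. The invoice/invoice_line example is an assumption made up for this purpose (it is not from the book), and sqlite3 stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Strong entity: exists on its own and has its own single-column PK.
CREATE TABLE invoice (
    invoice_id INTEGER PRIMARY KEY,
    issued_on  TEXT NOT NULL
);
-- Weak entity: cannot exist without its parent, and the parent's PK
-- is part of its own (composite) PK.
CREATE TABLE invoice_line (
    invoice_id  INTEGER NOT NULL
                REFERENCES invoice(invoice_id) ON DELETE CASCADE,
    line_no     INTEGER NOT NULL,
    description TEXT NOT NULL,
    PRIMARY KEY (invoice_id, line_no)
);
""")
conn.execute("INSERT INTO invoice VALUES (1, '2024-01-15')")
conn.executemany("INSERT INTO invoice_line VALUES (?, ?, ?)",
                 [(1, 1, "Widget"), (1, 2, "Gadget")])

lines_before = conn.execute("SELECT COUNT(*) FROM invoice_line").fetchone()[0]

# Deleting the parent removes the dependent rows as well.
conn.execute("DELETE FROM invoice WHERE invoice_id = 1")
lines_after = conn.execute("SELECT COUNT(*) FROM invoice_line").fetchone()[0]
```

Whether to cascade deletes is a design choice; the composite key containing the parent's key is the part that reflects the weak-entity definition.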
7. Normalize the logical model. You may also denormalize the logical model in order to gain some desired characteristics (such as improved performance).
8. Refine attribute atomicity: It is generally good practice to pay attention to the atomicity requirement. An atomic attribute is one that cannot be further subdivided. Such an attribute is said to display atomicity. By improving the degree of atomicity, you also gain querying flexibility.
Refine primary keys as required for data granularity: Granularity refers to the level of detail represented by the values stored in a table’s row. Data stored at their lowest level of granularity are said to be atomic data, as explained earlier. For example, imagine an ASSIGN_HOURS attribute that represents the hours worked by a given employee on a given project. However, are those values recorded at their lowest level of granularity? In other words, does ASSIGN_HOURS represent the hourly total, daily total, weekly total, monthly total, or yearly total? Clearly, ASSIGN_HOURS requires more careful definition. In this case, the relevant question would be as follows: For what time frame (hour, day, week, month, and so on) do you want to record the ASSIGN_HOURS data?
For example, assume that the combination of EMP_NUM and PROJ_NUM is an acceptable (composite) primary key in the ASSIGNMENT table. That primary key is useful in representing only the total number of hours an employee worked on a project since its start. Using a surrogate primary key such as ASSIGN_NUM provides lower granularity and yields greater flexibility. For example, assume that the EMP_NUM and PROJ_NUM combination is used as the primary key, and then an employee makes two “hours worked” entries in the ASSIGNMENT table. That action violates the entity integrity requirement. Even if you add the ASSIGN_DATE as part of a composite PK, an entity integrity violation is still generated if any employee makes two or more entries for the same project on the same day. (The employee might have worked on the project a few hours in the morning and then worked on it again later in the day.) The same data entry yields no problems when ASSIGN_NUM is used as the primary key.
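The ASSIGNMENT example can be tried out directly. The sketch below is a minimal demonstration using sqlite3 rather than MySQL; the column names come from the text above, while the two table names and the sample values are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Variant 1: composite primary key, including the date.
CREATE TABLE assignment_composite (
    EMP_NUM      INTEGER NOT NULL,
    PROJ_NUM     INTEGER NOT NULL,
    ASSIGN_DATE  TEXT    NOT NULL,
    ASSIGN_HOURS REAL    NOT NULL,
    PRIMARY KEY (EMP_NUM, PROJ_NUM, ASSIGN_DATE)
);
-- Variant 2: surrogate primary key generated by the database.
CREATE TABLE assignment_surrogate (
    ASSIGN_NUM   INTEGER PRIMARY KEY,
    EMP_NUM      INTEGER NOT NULL,
    PROJ_NUM     INTEGER NOT NULL,
    ASSIGN_DATE  TEXT    NOT NULL,
    ASSIGN_HOURS REAL    NOT NULL
);
""")

# Same employee, same project, same day: morning and afternoon entries.
morning = (101, 7, "2024-03-01", 3.0)
afternoon = (101, 7, "2024-03-01", 2.5)

conn.execute("INSERT INTO assignment_composite VALUES (?, ?, ?, ?)", morning)
try:
    conn.execute("INSERT INTO assignment_composite VALUES (?, ?, ?, ?)", afternoon)
    composite_accepts_both = True
except sqlite3.IntegrityError:
    # The second entry collides with the first on the composite key.
    composite_accepts_both = False

for entry in (morning, afternoon):
    conn.execute(
        "INSERT INTO assignment_surrogate "
        "(EMP_NUM, PROJ_NUM, ASSIGN_DATE, ASSIGN_HOURS) VALUES (?, ?, ?, ?)",
        entry)

# With the surrogate key both entries are stored and can be totalled later.
total_hours = conn.execute(
    "SELECT SUM(ASSIGN_HOURS) FROM assignment_surrogate "
    "WHERE EMP_NUM = 101 AND PROJ_NUM = 7").fetchone()[0]
```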
Try to answer questions such as: "Who will be allowed to use the tables, and what portion(s) of the table(s) will be available to which users?"
Please feel free to leave suggestions or links to better descriptions in the comments below; I will add them to my answer.
One aspect of your question touched on representing subclass-superclass relationships in SQL tables. Martin Fowler discusses three ways to design this, of which my favorite is Class Table Inheritance. The tricky part is arranging for the Id field to propagate from superclasses to subclasses. Once you get that done, the joins you will typically want to do are slick, easy, and fast.
There are six main steps in designing any database:
1. Requirements Analysis
2. Conceptual Design
3. Logical Design
4. Schema Refinement
5. Physical Design
6. Application & Security Design.

The concept of implementing key/value stores with relational database languages

I want to get my feet wet with the concept of implementing key/value stores with relational database languages (like MySQL and SQL Server).
However this is one of the times when Google isn't good enough.
Does anyone know of any good info / good links regarding the concept of implementing key/value stores with relational database languages?
Wiki for EAV http://en.wikipedia.org/wiki/Entity-attribute-value_model. SO answer with link to whitepaper called "Best Practices for Semantic Data Modeling for Performance and Scalability" EAV over SQL Server
The primary reason to implement a key/value schema in a relational database is the flexibility of having a small key/value sub-schema alongside a main relational schema, or vice versa. This lets you serve key/value lookups for some portion of the application and a traditional relational model for the rest, without having to maintain multiple databases.
In fact, we have implemented such designs for some of our customers, either because the customer is committed to a specific relational DB or for the reasons mentioned above. You can always create a key/value store in a relational database, but not the other way around.
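As a minimal sketch of what "a key/value store in a relational database" can look like, the example below keeps one two-column table next to whatever relational tables the application already has. The table name kv_store and the put/get helpers are hypothetical, and sqlite3 is used so the example runs without a server. INSERT OR REPLACE is SQLite syntax; in MySQL the equivalent would be REPLACE INTO or INSERT ... ON DUPLICATE KEY UPDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One narrow table acts as the key/value sub-schema.
conn.execute("CREATE TABLE kv_store (k TEXT PRIMARY KEY, v TEXT NOT NULL)")


def put(key, value):
    """Store a value, overwriting any existing value for the same key."""
    conn.execute(
        "INSERT OR REPLACE INTO kv_store (k, v) VALUES (?, ?)", (key, value))


def get(key, default=None):
    """Return the value stored under key, or default if the key is absent."""
    row = conn.execute(
        "SELECT v FROM kv_store WHERE k = ?", (key,)).fetchone()
    return default if row is None else row[0]


put("theme", "dark")
put("theme", "light")   # overwrites the previous value
put("language", "en")
```

Values here are plain text; structured values would need a serialization convention (for example JSON), which is outside the scope of this sketch.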