My organization works on Spring MVC with Hibernate. We have always specified foreign key constraints in the mapping file, for example for Person and its ContactList in Person.hbm.xml:
<set name="ContactList">
<key column="PersonId" foreign-key="Fk_Peson_Contact"/>
<one-to-many class="Sample.Model.Contact"/>
</set>
This mapping creates a one-to-many relationship between Person and Contact, and keeps PersonId as the foreign key column in the Contact table.
But now the organization has decided not to declare any relationships in the mapping files. In the case above, the Person mapping will have no one-to-many mapping; instead, the Contact mapping will have a property <property name="FK_PersonID"/>, which creates a column to hold the person ID. In this scenario the Person and Contact tables look the same, but the difference is that there is no relationship between Person and Contact, because no mapping is specified.
In that case, if we want to fetch a person's contactList we have to fire two queries: one to fetch the person and another for its contactList. Suppose we want to fetch a personList with each person's contactList: then we have to loop over the personList and fetch the contactList for each person, which fires a large number of queries.
When I asked why we don't specify the relationship, a senior said:
If the foreign key is in the DB, we can't do slicing and partitioning.
When we fire a join query, the DB takes more time to execute it, which may slow down the DB server.
But my questions are:
If I loop over the personList, it will fire one query per person to fetch the contacts. Is that feasible?
Can such looping slow down the application or the application server?
What if I want to fetch the personList with its contactList, AddressList and QualificationList? Does this cause the n+1 issue?
Which scenario is more beneficial: specifying the mapping or not?
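To make the n+1 question concrete, here is a minimal sketch that counts queries for both approaches. It uses Python's built-in sqlite3 with an in-memory database purely as a stand-in for the real DB and for Hibernate; the table layout, the Name and Phone columns, the index name and the helper names are assumptions for illustration. Note that the single-query version works even though Contact has no FOREIGN KEY constraint: the join only needs the FK_PersonID column.

```python
import sqlite3

# In-memory SQLite stands in for the real database (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person  (PersonId INTEGER PRIMARY KEY, Name TEXT);
    -- No FOREIGN KEY constraint: FK_PersonID is a plain column, as in the question.
    CREATE TABLE Contact (ContactId INTEGER PRIMARY KEY,
                          FK_PersonID INTEGER, Phone TEXT);
    CREATE INDEX idx_contact_person ON Contact(FK_PersonID);
""")
conn.executemany("INSERT INTO Person VALUES (?, ?)",
                 [(i, f"person{i}") for i in range(1, 101)])
conn.executemany("INSERT INTO Contact (FK_PersonID, Phone) VALUES (?, ?)",
                 [(i, f"555-{i:04d}-{n}") for i in range(1, 101) for n in range(2)])

query_count = 0

def query(sql, params=()):
    """Run one SELECT and count it, so both strategies can be compared."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()

def load_with_loop():
    """One query for the persons, then one more query per person (n+1)."""
    result = {}
    for (person_id,) in query("SELECT PersonId FROM Person"):
        rows = query("SELECT Phone FROM Contact WHERE FK_PersonID = ? "
                     "ORDER BY ContactId", (person_id,))
        result[person_id] = [phone for (phone,) in rows]
    return result

def load_with_join():
    """A single query; the rows are grouped per person in application code."""
    result = {}
    rows = query("""
        SELECT p.PersonId, c.Phone
        FROM Person p LEFT JOIN Contact c ON c.FK_PersonID = p.PersonId
        ORDER BY p.PersonId, c.ContactId
    """)
    for person_id, phone in rows:
        contacts = result.setdefault(person_id, [])
        if phone is not None:
            contacts.append(phone)
    return result

query_count = 0
looped = load_with_loop()
loop_queries = query_count

query_count = 0
joined = load_with_join()
join_queries = query_count

print(loop_queries, join_queries)
```

With 100 persons the loop issues 101 queries and the join issues 1, and both return the same data. Each extra collection fetched per person in a loop (addresses, qualifications) adds another n queries.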
Well, in one of the early demos of partitioning in (MS) SQL Server, Microsoft showed a query that selected from a partitioned table with data populated across several partitions, and it used its indexes and stats to not even bother accessing partitions that had no (relevant) data. Of course, if you don't have an FK constraint then it can't use the index and stats to know that the FK conforms. I'd be fairly sure this is covered by Kalen Delaney in her book (for the relevant version of MSSQL). The key is that the stats contain the necessary information to prevent unnecessary seeks. The demo actually used a union to easily show the effect.
Here I have an EER diagram I am creating in the MySQL editor.
I want Person to relate to Employee. I want some Person entries to be employees, customers, etc.
How would I implement this? Would it be a shared key? Join the tables? Related data?
I tried making EmployeeID a foreign key to PersonID. The tables still don't "relate". (There's no way to tell what employee X's name, DOB, etc. is.)
I encourage you to try again in the direction of making employeeID refer to personID, as a foreign key. You may not have implemented the whole idea.
The technique has a name: Shared Primary Key. If you search on this phrase, you will find articles describing what you have to do when adding a new employee. Without taking the correct steps when a new employee is added, you break the relationship between the two tables.
There is a tag with this name here on Stack Overflow: shared-primary-key
What you are basically doing is setting up a situation where employee is a subclass of person. In an object-oriented system, this would be easy. SQL is not object oriented at the base level.
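A minimal sketch of the shared primary key idea, assuming Python's built-in sqlite3 as a stand-in for MySQL. EmployeeID and PersonID come from the question; the Name, DOB and Salary columns and the add_employee helper are hypothetical. The point is that Employee.EmployeeID is both the primary key of Employee and a foreign key to Person, so a join recovers the employee's name and date of birth.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE Person (
        PersonID INTEGER PRIMARY KEY,
        Name     TEXT NOT NULL,
        DOB      TEXT
    );
    -- EmployeeID is the primary key AND a foreign key to Person.
    CREATE TABLE Employee (
        EmployeeID INTEGER PRIMARY KEY REFERENCES Person(PersonID),
        Salary     INTEGER
    );
""")

def add_employee(name, dob, salary):
    """Insert the Person row first, then reuse its key for the Employee row."""
    with conn:  # both inserts succeed or fail together
        cur = conn.execute("INSERT INTO Person (Name, DOB) VALUES (?, ?)",
                           (name, dob))
        person_id = cur.lastrowid
        conn.execute("INSERT INTO Employee (EmployeeID, Salary) VALUES (?, ?)",
                     (person_id, salary))
    return person_id

emp_id = add_employee("Ada", "1990-01-01", 50000)

# The shared key is what lets employee X's name and DOB be looked up.
row = conn.execute("""
    SELECT p.Name, p.DOB, e.Salary
    FROM Employee e JOIN Person p ON p.PersonID = e.EmployeeID
    WHERE e.EmployeeID = ?
""", (emp_id,)).fetchone()

# An Employee row without a matching Person is rejected.
try:
    with conn:
        conn.execute("INSERT INTO Employee (EmployeeID, Salary) VALUES (999, 1)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Customers or other roles would follow the same pattern with their own table keyed by PersonID.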
I was unable to find a clear answer of how to create an IS_A relationship in Access.
There was the same question here, but without a concise answer:
IS_A relationship primary key validation rules
I have the entity Employee, and two sub-entities Loan_Officer and Branch_Manager. It's a school example of an IS_A relationship really.
I've managed to create a relationship, but there needs to be a constraint that an employee must be either a Loan Officer or a Branch Manager, but cannot be both. Now, I can't figure out how to do this, because whatever I do, I can assign the same Employee_ID in both sub-entity tables at once.
I've connected the tables via the PK, as it's shown here:
Now, this table design is just something I've done, in order to be able to connect them via a one-to-one relationship. I had to set the PK of Loan_Officer to "Number" and not "AutoNumber", in order to be able to connect them. The other option is to have a separate PK in Loan_Officer, like "Loan_Officer_ID", and a foreign key, "Employee_ID" in the Loan_Officer table, but the results are again the same (also according to the ER Diagram, the sub-entities don't have a separate PK).
You can't. This is not a feature of the Access database.
You can create CHECK constraints to check for such conditions, but those don't offer features to cascade operations.
See this answer for an example on how to create a CHECK constraint.
There is no such thing as an 'Is A' relationship in databases between tables. This is instead a field in the Employee table or Employee History Table.
The issue of 'can't be both' is a matter of validation logic. Where this validation logic is applied is probably at the form object level (during data entry), not the table level (no data should ever be entered directly into tables by end users).
Look into Access Data Macros. They can be used like SQL triggers, firing when a record is INSERTed, UPDATEd, DELETEd, etc.
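As a rough illustration of the trigger-style approach, here is a sketch using Python's built-in sqlite3. This is not Access and not a Data Macro; SQLite is only a stand-in to show the logic, and the trigger names and the try_insert helper are made up. It enforces only the "cannot be both" half of the requirement on INSERT; it does not force every employee to have a subtype, and it does not cover UPDATEs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Employee (Employee_ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Loan_Officer (
        Employee_ID INTEGER PRIMARY KEY REFERENCES Employee(Employee_ID)
    );
    CREATE TABLE Branch_Manager (
        Employee_ID INTEGER PRIMARY KEY REFERENCES Employee(Employee_ID)
    );

    -- Reject a Loan_Officer row if the employee is already a Branch_Manager.
    CREATE TRIGGER loan_officer_exclusive BEFORE INSERT ON Loan_Officer
    WHEN EXISTS (SELECT 1 FROM Branch_Manager
                 WHERE Employee_ID = NEW.Employee_ID)
    BEGIN
        SELECT RAISE(ABORT, 'employee is already a branch manager');
    END;

    -- And the mirror image for Branch_Manager.
    CREATE TRIGGER branch_manager_exclusive BEFORE INSERT ON Branch_Manager
    WHEN EXISTS (SELECT 1 FROM Loan_Officer
                 WHERE Employee_ID = NEW.Employee_ID)
    BEGIN
        SELECT RAISE(ABORT, 'employee is already a loan officer');
    END;
""")

def try_insert(table, employee_id):
    """Return True if the row was inserted, False if the database rejected it."""
    try:
        with conn:
            conn.execute(f"INSERT INTO {table} (Employee_ID) VALUES (?)",
                         (employee_id,))
        return True
    except sqlite3.DatabaseError:
        return False

with conn:
    conn.execute("INSERT INTO Employee VALUES (1, 'Ann')")

first = try_insert("Loan_Officer", 1)     # accepted
second = try_insert("Branch_Manager", 1)  # rejected by the trigger
```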
I'm unsure the best route to take for this example:
A table that holds information for a job; salary, dates of employment etc. The field I am wondering how best to store is 'job_title'.
Job title is going to be used as part of an auto-complete field so I'll be using a query to fetch results.
The same job title will be used by multiple jobs in the DB.
Job title is going to be a large part of many queries in the application.
A single job only ever has one title.
1. Should I have two tables, job and job_title, with the job table referencing the job_title table for its name?
2. Should I have two tables, job and job_title, but store the title as a direct value in job, with job_title just storing a list of all preexisting values (somewhat redundant)?
3. Or should I not use a reference table at all? Other suggestions are welcome.
What is your choice of design in this situation, and how would it change in a one to many design?
This is an example, the actual design is much larger however I think this well conveys the issue.
Update, To clarify:
A User (outside the scope of the question) has many Jobs; a Job (start/end date, {job title}) has a title; a title has a name (e.g. 'Web Developer').
Your option 1 is the best design choice. Create the two tables along these lines:
jobs (job_id PK, title_id FK not null, start_date, end_date, ...)
job_titles (title_id PK, title)
The PKs should have clustered indexes; jobs.title_id and job_titles.title should have nonclustered or secondary indexes; job_titles.title should have a unique constraint.
This relationship can be modeled as 1-to-1 or 1-to-many (one title, many jobs). To enforce 1-to-1 modeling, apply a unique constraint to jobs.title_id. However, you should not model this as a 1-to-1 relationship, because it's not. You even say so yourself: "The same job title will be used by multiple jobs in the DB" and "A single job only ever has one title." An entry in the jobs table represents a certain position held by a certain user during a certain period of time. Because this is a 1-to-many relationship, a separate table is the correct way to model the data.
Here's a simple example of why this is so. Your company only has one CEO, but what happens if the current one steps down and the board appoints a new one? You'll have two entries in jobs which both reference the same title, even though there's only one CEO "position" and the two users' job date ranges don't overlap. If you enforce a 1-to-1 relationship, modeling this data is impossible.
Why these particular indexes and constraints?
The ID columns are PKs and clustered indexes for hopefully obvious reasons; you use these for joins
jobs.title_id is an FK for hopefully obvious data integrity reasons
jobs.title_id is not null because every job should have a title
jobs.title_id needs an index in order to speed up joins
job_titles.title has an index because you've indicated you'll be querying based on this column (though I wouldn't query in such a fashion, especially since you've said there will be many titles; see below)
job_titles.title has a unique constraint because there's no reason to have duplicates of the same title. You can (and will) have multiple jobs with the same title, but you don't need two entries for "CEO" in job_titles. Enforcing this uniqueness will preserve data integrity useful for reporting purposes (e.g. plot the productivity of IT's web division based on how many "web developer" jobs are filled)
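The two tables and their constraints can be sketched as follows, using Python's built-in sqlite3 as a stand-in database (an assumption; the index name is also made up). SQLite has no separate clustered/nonclustered distinction, so the INTEGER PRIMARY KEY columns play the role of the clustered PKs here, and the UNIQUE constraint on title also provides the index on that column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE job_titles (
        title_id INTEGER PRIMARY KEY,
        title    TEXT NOT NULL UNIQUE      -- unique constraint, also indexed
    );
    CREATE TABLE jobs (
        job_id     INTEGER PRIMARY KEY,
        title_id   INTEGER NOT NULL REFERENCES job_titles(title_id),
        start_date TEXT,
        end_date   TEXT
    );
    CREATE INDEX idx_jobs_title_id ON jobs(title_id);  -- speeds up joins
""")

def rejected(sql, params=()):
    """Return True if the database refuses the statement."""
    try:
        with conn:
            conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True

with conn:
    conn.execute("INSERT INTO job_titles VALUES (1, 'CEO')")
    # Two successive CEOs: two jobs rows, one shared title (1-to-many).
    conn.execute("INSERT INTO jobs (title_id, start_date, end_date) "
                 "VALUES (1, '2010-01-01', '2015-06-30')")
    conn.execute("INSERT INTO jobs (title_id, start_date, end_date) "
                 "VALUES (1, '2015-07-01', NULL)")

ceo_jobs = conn.execute(
    "SELECT COUNT(*) FROM jobs j JOIN job_titles t ON t.title_id = j.title_id "
    "WHERE t.title = 'CEO'").fetchone()[0]

duplicate_title_rejected = rejected("INSERT INTO job_titles (title) VALUES ('CEO')")
untitled_job_rejected = rejected("INSERT INTO jobs (title_id) VALUES (NULL)")
unknown_title_rejected = rejected("INSERT INTO jobs (title_id) VALUES (42)")
```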
Remarks:
Job title is going to be used as part of an auto-complete field so I'll be using a query to fetch results.
As I mentioned before, use key-value pairs here. Fetch a list of them into memory in your app, and query that list for your autocomplete values. Then send the ID off to the DB for your actual SQL query. The queries will perform better that way; even with indexes, searching integers is generally quicker than searching strings.
You've said that titles will be user created. Put some input sanitation and validation process in place, because you don't want redundant entries like "WEB DEVELOPER", "web developer", "web developer", etc. Validation should occur at both the application and DB levels; the unique constraint is part (but not all) of this. Prodigitalson's remark about separate machine and display columns is related to this issue.
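A minimal sketch of the in-memory autocomplete plus title normalization described in these remarks. The TitleIndex class, the normalize function and the sample data are all hypothetical; in a real application the (title_id, title) pairs would be fetched from job_titles once and cached.

```python
import bisect

def normalize(title):
    """Collapse whitespace and ignore case so near-duplicates compare equal."""
    return " ".join(title.split()).casefold()

class TitleIndex:
    """In-memory key-value list of job titles for autocomplete lookups."""

    def __init__(self, pairs):
        # pairs: iterable of (title_id, title); the first spelling of a title wins
        self._by_key = {}
        for title_id, title in pairs:
            self._by_key.setdefault(normalize(title), (title_id, title))
        self._keys = sorted(self._by_key)

    def find(self, title):
        """Return the existing (title_id, title) for a title, or None."""
        return self._by_key.get(normalize(title))

    def suggest(self, prefix, limit=10):
        """Return up to `limit` (title_id, title) pairs matching the prefix."""
        key = normalize(prefix)
        start = bisect.bisect_left(self._keys, key)
        matches = []
        for candidate in self._keys[start:]:
            if not candidate.startswith(key) or len(matches) >= limit:
                break
            matches.append(self._by_key[candidate])
        return matches

index = TitleIndex([
    (1, "Web Developer"),
    (2, "Web Designer"),
    (3, "CEO"),
    (4, "WEB   DEVELOPER"),  # redundant spelling, ignored by the index
])

suggestions = index.suggest("web d")
existing = index.find("  web developer ")
```

The application would then send only the chosen title_id to the database, and could call find before inserting a user-created title to avoid storing a redundant spelling.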
Edited after the clarification:
A table like this is enough: just add the job_title_id column as a foreign key in the main member table.
---- "job_title" table ---- (store the job_title)
1. pk - job_title_id
2. unique - job_title_name <- index this
__ original answer __
You need to clarify what the job_title is going to represent:
a person that hold this position?
the division/department that has this position?
A certain set of attributes? like Sales always has a commission
or just a string of what was it called?
From what I read so far, you just need the "job_title" as some sort of dimension - make the id for it, make the string searchable - and that's it
example
---- "employee" table ---- (store employee info)
1. pk - employee_id
2. fk - job_title_id
3. other attribute (contract_start_date, salary, sex, ... so on ...)
---- "job_title" table ---- (store the job_title)
1. pk - job_title_id
2. unique - job_title_name <- index this
---- "employee_job_title_history" table ---- (We can check the employee job history here)
1. pk - employee_id
2. pk - job_title_id
3. pk - is_effective
4. effective_date [edited: this needs to be part of the PK too - thanks to KM.]
I still think you need to provide us a use-case - that will greatly improve both of our understanding I believe
If there are only a few fixed job titles, you might want to use an enum in your database.
See http://dev.mysql.com/doc/refman/5.0/en/enum.html
If that's not supported by your version of MySQL, simply encode it with a numerical index and resolve it to a human-readable form in your queries.
I'm using MySQL / InnoDB, and using foreign keys to preserve relationships across tables. In the following scenario (depicted below), a 'manager' is associated with a 'recordLabel', and an 'artist' is also associated with a 'recordLabel'. When an 'album' is created, it is associated with an 'artist' and a 'manager', but both the artist and the manager need to be associated with the same recordLabel. How can I guarantee that relationship with the current table setup, or do I need to redesign the tables?
You cannot achieve this result using pure DRI - Declarative Referential Integrity, or the linking of foreign keys to ensure the schema's referential integrity.
There are 2 ways to solve this problem:
Consider the requirement a database problem, and use a trigger on INSERT and UPDATE to validate the requirements, and fail otherwise.
Consider the nested link a business logic requirement, and implement it in your business logic in PHP/C#/whatever.
As a side note, I think the structure is rather strange from a practical perspective. As far as I know, an Artist is signed to a RecordLabel and assigned a Manager separately (either from the label or individually; many artists retain their own manager when switching to another label). Linking the Manager to the Album as well only makes sense for recording historic managers, so that you can retrieve who the artist's manager was when the album was released, but that automatically means your requirement is invalid if the artist switches labels and/or managers later on. I therefore think it is wrong, from a practical data view, to enforce this link.
What you do is add the recordLabel id to the albums table. Then you put two two-column indexes on albums: (recordLabel_id, artist_id) and (recordLabel_id, managers_id).
Because the recordLabel_id can only have one value in each row of the albums table, you will have ensured integrity.
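One reading of this suggestion is that the two column pairs act as composite foreign keys into artists and managers. The sketch below shows that reading with Python's built-in sqlite3 as a stand-in for MySQL/InnoDB; the table and column names are adapted from the question and answer and are assumptions, as is the helper function. Each parent table needs a UNIQUE constraint on (recordLabel_id, id) so the pair can be referenced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE recordLabels (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE artists (
        id             INTEGER PRIMARY KEY,
        recordLabel_id INTEGER NOT NULL REFERENCES recordLabels(id),
        UNIQUE (recordLabel_id, id)        -- target of the composite FK
    );
    CREATE TABLE managers (
        id             INTEGER PRIMARY KEY,
        recordLabel_id INTEGER NOT NULL REFERENCES recordLabels(id),
        UNIQUE (recordLabel_id, id)
    );
    CREATE TABLE albums (
        id             INTEGER PRIMARY KEY,
        recordLabel_id INTEGER NOT NULL,
        artist_id      INTEGER NOT NULL,
        manager_id     INTEGER NOT NULL,
        -- The same recordLabel_id value takes part in both keys, so the
        -- artist and the manager must belong to the same label.
        FOREIGN KEY (recordLabel_id, artist_id)
            REFERENCES artists (recordLabel_id, id),
        FOREIGN KEY (recordLabel_id, manager_id)
            REFERENCES managers (recordLabel_id, id)
    );
""")

with conn:
    conn.executemany("INSERT INTO recordLabels VALUES (?, ?)",
                     [(1, "Label A"), (2, "Label B")])
    conn.execute("INSERT INTO artists VALUES (10, 1)")   # artist on label 1
    conn.execute("INSERT INTO managers VALUES (20, 1)")  # manager on label 1
    conn.execute("INSERT INTO managers VALUES (21, 2)")  # manager on label 2

def add_album(label_id, artist_id, manager_id):
    """Return True if the album row is accepted, False if an FK rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO albums (recordLabel_id, artist_id, manager_id) "
                "VALUES (?, ?, ?)", (label_id, artist_id, manager_id))
        return True
    except sqlite3.IntegrityError:
        return False

same_label = add_album(1, 10, 20)   # artist and manager both on label 1
mixed_label = add_album(1, 10, 21)  # manager 21 belongs to label 2
wrong_label = add_album(2, 10, 21)  # artist 10 does not belong to label 2
```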
Why can't I just leave those relationships out?
What's the point of them?
I can still run queries and treat them like a relationship myself...
Yes, you can always leave the foreign key constraints out, but then you will be responsible for the integrity of your data. If you use foreign key constraints, you won't have to worry about referential integrity among tables. You can read more about referential integrity on Wikipedia. I will also try to explain it with an example below.
Think of a shopping cart scenario. You have three tables: item, shopping_cart and shopping_cart_item. You can choose not to define any relationship between these tables; that's fine for any SQL solution. When a user starts shopping, you create a shopping cart by adding a shopping_cart entry. As the user adds items to the cart, you save this information by adding rows to the shopping_cart_item table.
One problem may occur at this step: if you have buggy code that assigns incorrect shopping_cart_ids to shopping_cart_items, then you will definitely end up with incorrect data! Yes, you can have this case even with a foreign key constraint if the assigned id actually exists in the shopping_cart table. But this error is more detectable when a foreign key exists, since the shopping_cart_item record is not inserted when the foreign key constraint fails.
Let's continue with the assumption that your code is not buggy and you won't have the first type of referential integrity problem. Then suddenly a user wants to stop shopping and delete the cart, and you choose to implement this case by deleting the shopping_cart and shopping_cart_item entries. You will have to delete entries in both tables with two separate queries. If something goes wrong after you delete the shopping_cart entries, you will again have a referential integrity problem: you will have shopping_cart_items that are not related to any shopping_cart. You will then have to introduce transaction management, try to provide meaningful data to your business logic about the error that happened in the data access layer, etc.
In this type of scenario, foreign keys can be a lifesaver. You can define a foreign key constraint that prevents insertion of rows that reference a nonexistent parent, and you can define cascade operations that automatically delete related data.
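A small sketch of both behaviours in the shopping cart scenario, using Python's built-in sqlite3 as a stand-in database. The three table names come from the answer; the column names are assumptions. The foreign key rejects a cart item that points at a missing cart, and ON DELETE CASCADE removes the cart's items when the cart is deleted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce FKs
conn.executescript("""
    CREATE TABLE item (item_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE shopping_cart (shopping_cart_id INTEGER PRIMARY KEY);
    CREATE TABLE shopping_cart_item (
        shopping_cart_id INTEGER NOT NULL
            REFERENCES shopping_cart(shopping_cart_id) ON DELETE CASCADE,
        item_id INTEGER NOT NULL REFERENCES item(item_id),
        PRIMARY KEY (shopping_cart_id, item_id)
    );
""")

with conn:
    conn.executemany("INSERT INTO item VALUES (?, ?)",
                     [(1, "book"), (2, "pen")])
    conn.execute("INSERT INTO shopping_cart VALUES (100)")
    conn.executemany("INSERT INTO shopping_cart_item VALUES (?, ?)",
                     [(100, 1), (100, 2)])

# Buggy code assigns a cart id that does not exist: the insert is refused.
try:
    with conn:
        conn.execute("INSERT INTO shopping_cart_item VALUES (999, 1)")
    bad_insert_rejected = False
except sqlite3.IntegrityError:
    bad_insert_rejected = True

# Deleting the cart is a single statement; its items are removed with it.
with conn:
    conn.execute("DELETE FROM shopping_cart WHERE shopping_cart_id = 100")

remaining_items = conn.execute(
    "SELECT COUNT(*) FROM shopping_cart_item").fetchone()[0]
```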
If there is anything unclear, just leave a comment and I can improve the answer.
Apart from what the others have said about why you technically want (actually: need) them:
foreign key constraints also document your model.
When looking at a model without FK constraints you have no idea which table relates to which. But with FK constraints in place you immediately see how things belong together.
You create FOREIGN KEYs to instruct the database engine to ensure that you never perform an action on the database that creates invalid records.
So, if you create a FOREIGN KEY relationship between users.id and visits.userid the engine will refuse to perform any actions that result in a userid value in visits that does not exist in users. This might be adding an unknown userid to visits, removing an id from users that already exists in visits, or updating either field to "break" the relationship.
That is why PRIMARY and FOREIGN KEYs are referred to as referential integrity constraints. They tell your database engine how to keep your data correct.
It doesn't allow you to enter an id which does not exist in another table. For example, if you have products and you keep an owner id, then by creating a foreign key from the owner id to the id field of the owners table, you do not allow users to create a product record whose owner id does not exist in the owners table. This is called referential integrity.
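The three refused actions on users.id and visits.userid can be sketched like this, with Python's built-in sqlite3 as a stand-in engine. The extra columns and the rejected helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE visits (
        visit_id INTEGER PRIMARY KEY,
        userid   INTEGER NOT NULL REFERENCES users(id)
    );
""")

with conn:
    conn.execute("INSERT INTO users VALUES (1, 'alice')")
    conn.execute("INSERT INTO visits (userid) VALUES (1)")

def rejected(sql, params=()):
    """Return True if the engine refuses the statement on integrity grounds."""
    try:
        with conn:
            conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True

# Adding an unknown userid to visits.
unknown_user = rejected("INSERT INTO visits (userid) VALUES (?)", (42,))
# Removing an id from users that is still used in visits.
delete_parent = rejected("DELETE FROM users WHERE id = ?", (1,))
# Updating either field so that the relationship would break.
update_parent = rejected("UPDATE users SET id = 2 WHERE id = 1")
update_child = rejected("UPDATE visits SET userid = 42 WHERE userid = 1")
```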
The foreign key constraint helps you ensure referential integrity.
If you delete a row in one table, MySQL can automatically delete all rows in other tables that refer to the deleted row via the foreign key. You can also make it reject the delete command.
Also, when you try to insert a row whose foreign key refers to a row that does not exist, MySQL rejects the insert, so the foreign key never refers to nothing.
That is what referential integrity is all about.
Databases can be affected by more than just the application. Not all data changes go through the application, even if they are supposed to. People change stuff directly on the database all the time. Rules that need to apply to all data all the time belong on the database. Suppose you can update the prices of your stock. That's great for updating an individual price. But what happens when the boss decides to raise all prices by 15%? No one is going to go through and change 10,000 prices one at a time through the GUI; they are going to write a quick SQL script to do the update. Or suppose two suppliers merge into one company and you want to change all of their items to belong to the new company. Those kinds of changes happen to databases every day, and they too need to follow the rules for data integrity.
New developers may not know about all the places where the foreign key relationships should exist and thus make mistakes which cause the data to be no longer useful.
Databases without foreign key constraints have close to a 100% chance of having bad data in them. Do you really want to have orders where you can't identify who the customers were?
The FKs will prevent you from deleting a customer who has orders, for instance. Or, if you use a natural key of company_name and the name changes, all related records must be changed along with the key.
Or suppose you decide to put a new GUI together and dump the old one, then you might have to figure out all the FK relationships again (because you are using a different datalayer or ORM) and the chances are you might miss some.
It is irresponsible in the extreme to not put in FK relationships. You are risking the lifeblood of your company's business because you think it is a pain to do. I'd fire you if you suggested not using FKs because I would know I couldn't trust my company's data to you.