Understanding Primary Key and Relationship Databases with MySQL (phpMyAdmin)

Forewarning - Something could be wrong from the start. More than one thing. Any additional help outside of the question I ask would be much appreciated :)
Also, admin_tbl can be ignored. It is simply usernames/passwords for administrators for the site.
Aim (and question): To create two relationship tables. Using my HTML Form I want to be able to insert information such as:
albumName - Exploration
name - Dead Sea
timeDate - 2015-12-21
comment - A beautiful landscape of the Dead Sea in Israel.
Upload Image - (button to choose image)
I have managed to create code (php and sql files) to upload the information into the database and upload the image into a 'Uploads' folder. However, the aim is for each album to have an ID (albumID), and to be able to have multiple images be uploaded and be a part of one album (along with having multiple albums of course, each with different images in). Can you help me do this?
What I have so far:
'image_tbl' - - Table which stores image path and the information from the form (including Album Name)
'album_tbl' - - Table which stores albumID and then albumName
Problems (I have run into):
image_tbl - As you can see each image I insert into the database is given an id, which is auto-incremented. Apparently, the auto-incremented id (unsure of official term) has to be the Primary Key. However, to have albumName (in this table) have a relationship in album_tbl doesn't it need to be the Primary Key?
album_tbl - In this table I have albumID and albumName. What I want is that when a form is inserted with an albumName (e.g. 'Exploration), for it to be also given the albumID. Therefore if I printed a specific 'albumID' every image placed into that would be shown.
Hope everything is understandable.
P.s. If you need any additional information just ask, and if you have recommendations to do this a lot easier/more efficiently, just tell me!
P.p.s If it can quickly be done, can someone explain what the 'Index' option is in the context of my tables (so it can easily be understood), thanks!

TL;DR You don't need to declare a "relationship", ie a foreign key (FK), in order to query. But it's a good idea. When you do, a FK can reference a primary key (PK) or any other UNIQUE column(s).
PKs & FKs are erroneously called "relationships" in some methods and products. Application relationships are represented by tables. (Base tables and query results.) PKs & FKs are constraints: they tell the DBMS that only certain situations can arise, so it can notice when you make certain errors. They are not relationships, they are statements true in & of every database state and application situation. You do not need to know constraints to update and query a database.
Just know what every table means. Base tables have DBA-given meanings that tell you what their rows mean. Queries also have meanings that tell you what their rows mean. Query meanings are combined from base table meanings in parallel with how their result values are combined from base table values and conditions.
image_tbl -- image [Id] is in an album named [albumName], is named [name], is dated [timeDate] & has comment [comment]
album_tbl -- album [albumID] is named [albumName]
You do not have to declare any PKs/UNIQUEs or FKs! But it is a good idea because then the DBMS can disallow impossible/erroneous updates. A PK/UNIQUE says that a subrow value for its columns must appear only once. A FK says that a subrow value for its columns must appear as a PK/UNIQUE subrow value in its referenced table. The fact that these limitations hold on base tables means that certain limitations hold on query results. But the meanings of those query results are per the query's table & condition combinations, independent of those limitations. Eg whether or not album names are unique,
image_tbl JOIN album_tbl USING (albumName) -- image [Id] is in an album named [albumName], is named [name], is dated [timeDate] and has comment [comment] AND album [albumID] is named [albumName]
The only problem here is that if album names are not unique then knowing an image's album's name isn't going to tell you which album it's in; you only know that it's in an album with that name. On the other hand if album names are unique, you don't need album_tbl albumID.
So if album names are unique, declare albumName UNIQUE in album_tbl. Then in image_tbl identify the album by some PK/UNIQUE column of album_tbl. Since albumID is presumably present just for the purpose of identifying albums, normally we would expect it to be chosen. Then in image_tbl declare that column as a FK referencing album_tbl.
PS Indexes generally speed up querying at the cost of some time and space. A primary key declaration in a table declaration automatically declares an index. It's a good idea to index PK, UNIQUE and FK column sets.
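A minimal sketch of the declarations described above, using Python's built-in sqlite3 module as a stand-in so it can run anywhere (MySQL syntax differs slightly, eg AUTO_INCREMENT instead of AUTOINCREMENT). The path column name and the index name are assumptions; the other names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to
conn.executescript("""
CREATE TABLE album_tbl (
    albumID   INTEGER PRIMARY KEY AUTOINCREMENT,
    albumName TEXT NOT NULL UNIQUE            -- album names declared unique
);
CREATE TABLE image_tbl (
    id       INTEGER PRIMARY KEY AUTOINCREMENT,
    albumID  INTEGER NOT NULL REFERENCES album_tbl(albumID),  -- the FK
    name     TEXT NOT NULL,
    timeDate TEXT,
    comment  TEXT,
    path     TEXT                             -- assumed column for the image path
);
CREATE INDEX image_album_idx ON image_tbl(albumID);  -- index the FK column
""")

album_id = conn.execute(
    "INSERT INTO album_tbl (albumName) VALUES ('Exploration')"
).lastrowid
conn.execute(
    "INSERT INTO image_tbl (albumID, name, timeDate, comment) VALUES (?, ?, ?, ?)",
    (album_id, "Dead Sea", "2015-12-21",
     "A beautiful landscape of the Dead Sea in Israel."),
)

def try_insert(sql, params=()):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

# Both of these violate a declared constraint, so the DBMS refuses them.
duplicate_album_ok = try_insert("INSERT INTO album_tbl (albumName) VALUES ('Exploration')")
orphan_image_ok = try_insert("INSERT INTO image_tbl (albumID, name) VALUES (999, 'Nowhere')")
```

The point of the two rejected inserts is that the constraints do not change what queries mean; they only let the DBMS catch updates that could never describe a valid situation.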

Welcome to relational databases.
The idea with relational databases is to have all data relate to each other but to avoid replication of data. This is called Normalization.
In your example you are replicating "Album Name" in both tables, which is not necessary. What you want to do is change the column Album Name (in image_tbl) to Album ID and add a foreign key from image_tbl.album_id to album_tbl.album_id.
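To connect this back to the upload form in the question, here is a rough sketch of the insert flow once image_tbl stores album_id instead of the album name. It uses Python's sqlite3 module as a stand-in for PHP and MySQL; the helper names, the sample rows other than "Dead Sea", and the file paths are all made up for illustration. INSERT OR IGNORE is SQLite syntax (MySQL spells it INSERT IGNORE).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE album_tbl (
    album_id   INTEGER PRIMARY KEY,
    album_name TEXT NOT NULL UNIQUE
);
CREATE TABLE image_tbl (
    id       INTEGER PRIMARY KEY,
    album_id INTEGER NOT NULL REFERENCES album_tbl(album_id),
    name     TEXT NOT NULL,
    path     TEXT NOT NULL
);
""")

def album_id_for(album_name):
    """Look up the album's id, creating the album row on first use."""
    conn.execute("INSERT OR IGNORE INTO album_tbl (album_name) VALUES (?)", (album_name,))
    row = conn.execute(
        "SELECT album_id FROM album_tbl WHERE album_name = ?", (album_name,)
    ).fetchone()
    return row[0]

def add_image(album_name, image_name, path):
    """What the form handler would do: resolve the album, then store the image."""
    conn.execute(
        "INSERT INTO image_tbl (album_id, name, path) VALUES (?, ?, ?)",
        (album_id_for(album_name), image_name, path),
    )

def images_in_album(album_id):
    """All image names that belong to one album, in upload order."""
    rows = conn.execute(
        "SELECT name FROM image_tbl WHERE album_id = ? ORDER BY id", (album_id,)
    )
    return [r[0] for r in rows]

add_image("Exploration", "Dead Sea", "Uploads/dead_sea.jpg")
add_image("Exploration", "Masada", "Uploads/masada.jpg")
add_image("Family", "Picnic", "Uploads/picnic.jpg")
exploration_images = images_in_album(album_id_for("Exploration"))
```

The album name is stored once, in album_tbl, and every image row only carries the id.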

Related

Normalize two tables with same primary key to 3NF

I have two tables currently with the same primary key, can I have these two tables with the same primary key?
Also, are all the tables in 3rd normal form?
Ticket:
-------------------
Ticket_id* PK
Flight_name* FK
Names*
Price
Tax
Number_bags
Travel class:
-------------------
Ticket id * PK
Customer_5star
Customer_normal
Customer_2star
Airmiles
Lounge_discount
ticket_economy
ticket_business
ticket_first
food allowance
drink allowance
the rest of the tables in the database are below
Passengers:
Names* PK
Credit_card_number
Credit_card_issue
Ticket_id *
Address
Flight:
Flight_name* PK
Flight_date
Source_airport_id* FK
Dest_airport_id* FK
Source
Destination
Plane_id*
Airport:
Source_airport_id* PK
Dest_airport_id* PK
Source_airport_country
Dest_airport_country
Pilot:
Pilot_name* PK
Plane id* FK
Pilot_grade
Month
Hours flown
Rate
Plane:
Plane_id* PK
Pilot_name* FK
This is not meant as an answer but it became too long for a comment...
Not to sound harsh, but your model has some serious flaws and you should probably take it back to the drawing board.
Consider what would happen if a Passenger buys a second Ticket for instance. The Passenger table should not hold any reference to tickets. Maybe a passenger can have more than one credit card though? Shouldn't Credit Cards be in their own table? The same applies to Addresses.
Why does the Airport table hold information that really is about destinations (or paths/trips)? You already record trip information in the Flights table. It seems to me that the Airport table should hold information pertaining to a particular airport (like name, location?, IATA code et cetera).
Can a Pilot just be associated with one single Plane? Doesn't sound very likely. The pilot table should not hold information about planes.
And the Planes table should not hold information on pilots as a plane surely can be connected to more than one pilot.
And so on... there are most likely other issues too, but these pointers should give you something to think about.
The only tables that sort of look ok to me are Ticket and Flight.
Re same primary key:
Yes there can be multiple tables with the same primary key. Both in principle and in good practice. We declare a primary or other unique column set to say that those columns (and supersets of them) are unique in a table. When that is the case, declare such column sets. This happens all the time.
Eg: A typical reasonable case is "subtyping"/"subtables", where entities of a kind identified by a candidate key of one table are always or sometimes also of the kind identified by the same values in another table. (If always then the one table's candidate key values are also in the other table's. And so we would declare a foreign key from the one to the other. We would say the one table's kind of entity is a subtype of the other's.) On the other hand sometimes one table is used with attributes of both kinds and attributes inapplicable to one kind are not used. (Ie via NULL or a tag indicating kind.)
Whether you should have cases of the same primary key depends on other criteria for good design as applied to your particular situation. You need to learn design including normalization.
Eg: All keys simple and 3NF implies 5NF, so if your two tables have the same set of values as only & simple primary key in every state and they are both in 3NF then their join contains exactly the same information as they do separately. Still, maybe you would keep them separate for clarity of design, for likelihood of change or for performance based on usage. You didn't give that information.
Re normal forms:
Normal forms apply to tables. The highest normal form of a table is a property independent of any other table. (Although you might choose that form based on what forms & tables are alternatives.)
In order to normalize or determine a table's highest normal form one needs to know (in general) all the functional dependencies in it. (For normal forms above BCNF, also join dependencies.) You didn't give them. They are determined by what the meaning of the table is (ie how to determine what rows go in it in any given situation) and the possible situations that can arise. You didn't give them. Your expectation that we could tell you about the normal forms your tables are in without giving such information suggests that you do not understand normalization and need to educate yourself about it.
Proper design also needs this information and in general all valid states that can arise from situations that arise. Ie constraints among given tables. You didn't give them.
Having two tables with the same key goes against the idea of removing redundancy in normalization.
Excluding that, are these tables in 1NF and 2NF?
Judging by the Names field, I'd suggest that table1 is not. If multiple names can belong to one ticket, then you need a new table, most likely with a composite key of ticket_id,name.
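As a rough illustration of that last suggestion, here is a sketch of a separate table keyed on (ticket_id, name), written with Python's sqlite3 module so it can be run as-is. The table name ticket_name, the trimmed-down ticket table and the sample passenger names are assumptions, not part of the original model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE ticket (
    ticket_id INTEGER PRIMARY KEY,
    price     REAL
);
-- One row per (ticket, name) pair instead of a multi-valued Names column.
CREATE TABLE ticket_name (
    ticket_id INTEGER NOT NULL REFERENCES ticket(ticket_id),
    name      TEXT NOT NULL,
    PRIMARY KEY (ticket_id, name)             -- composite key
);
""")

conn.execute("INSERT INTO ticket (ticket_id, price) VALUES (1, 250.0)")
conn.executemany(
    "INSERT INTO ticket_name (ticket_id, name) VALUES (?, ?)",
    [(1, "A. Smith"), (1, "B. Jones")],
)

# The composite key rejects the same name being listed twice on one ticket.
try:
    conn.execute("INSERT INTO ticket_name (ticket_id, name) VALUES (1, 'A. Smith')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

names_on_ticket = [
    r[0] for r in conn.execute(
        "SELECT name FROM ticket_name WHERE ticket_id = 1 ORDER BY name"
    )
]
```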

DB design for one-to-one single column table

I'm unsure the best route to take for this example:
A table that holds information for a job; salary, dates of employment etc. The field I am wondering how best to store is 'job_title'.
Job title is going to be used as part of an auto-complete field so I'll be using a query to fetch results.
The same job title will be used by multiple jobs in the DB.
Job title is going to be a large part of many queries in the application.
A single job only ever has one title.
1. Should I have 2 tables, job and job_title, with the job table referencing the job_title table for its name?
2. Should I have 2 tables, job and job_title, but store title as a direct value in job, with job_title just storing a list of all preexisting values (somewhat redundant)?
3 . Or should I not use a reference table at all / other suggestion.
What is your choice of design in this situation, and how would it change in a one to many design?
This is an example, the actual design is much larger however I think this well conveys the issue.
Update, To clarify:
A User (outside scope of question) has many Jobs; a job (start/end date, {job title}) has a title; a title has a name (ie. 'Web Developer').
Your option 1 is the best design choice. Create the two tables along these lines:
jobs (job_id PK, title_id FK not null, start_date, end_date, ...)
job_titles (title_id PK, title)
The PKs should have clustered indexes; jobs.title_id and job_titles.title should have nonclustered or secondary indexes; job_titles.title should have a unique constraint.
This relationship can be modeled as 1-to-1 or 1-to-many (one title, many jobs). To enforce 1-to-1 modeling, apply a unique constraint to jobs.title_id. However, you should not model this as a 1-to-1 relationship, because it's not. You even say so yourself: "The same job title will be used by multiple jobs in the DB" and "A single job only ever has one title." An entry in the jobs table represents a certain position held by a certain user during a certain period of time. Because this is a 1-to-many relationship, a separate table is the correct way to model the data.
Here's a simple example of why this is so. Your company only has one CEO, but what happens if the current one steps down and the board appoints a new one? You'll have two entries in jobs which both reference the same title, even though there's only one CEO "position" and the two users' job date ranges don't overlap. If you enforce a 1-to-1 relationship, modeling this data is impossible.
Why these particular indexes and constraints?
The ID columns are PKs and clustered indexes for hopefully obvious reasons; you use these for joins
jobs.title_id is an FK for hopefully obvious data integrity reasons
jobs.title_id is not null because every job should have a title
jobs.title_id needs an index in order to speed up joins
job_titles.title has an index because you've indicated you'll be querying based on this column (though I wouldn't query in such a fashion, especially since you've said there will be many titles; see below)
job_titles.title has a unique constraint because there's no reason to have duplicates of the same title. You can (and will) have multiple jobs with the same title, but you don't need two entries for "CEO" in job_titles. Enforcing this uniqueness will preserve data integrity useful for reporting purposes (e.g. plot the productivity of IT's web division based on how many "web developer" jobs are filled)
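Put together, the tables, indexes and constraints listed above might look something like this. It is only a sketch using Python's sqlite3 module; the index name and the sample dates are made up, and how "clustered" maps onto your engine is a separate matter (in MySQL's InnoDB the primary key is the clustered index).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE job_titles (
    title_id INTEGER PRIMARY KEY,
    title    TEXT NOT NULL UNIQUE             -- unique constraint, which also indexes title
);
CREATE TABLE jobs (
    job_id     INTEGER PRIMARY KEY,
    title_id   INTEGER NOT NULL REFERENCES job_titles(title_id),  -- FK, not null
    start_date TEXT,
    end_date   TEXT
);
CREATE INDEX jobs_title_idx ON jobs(title_id);  -- secondary index to speed up joins
""")

conn.execute("INSERT INTO job_titles (title_id, title) VALUES (1, 'CEO')")
# Two different people hold the one CEO position in non-overlapping periods.
conn.executemany(
    "INSERT INTO jobs (title_id, start_date, end_date) VALUES (?, ?, ?)",
    [(1, "2010-01-01", "2015-06-30"), (1, "2015-07-01", None)],
)

ceo_jobs = conn.execute(
    "SELECT COUNT(*) FROM jobs JOIN job_titles USING (title_id) WHERE title = 'CEO'"
).fetchone()[0]

# A second 'CEO' row in job_titles is refused by the unique constraint.
try:
    conn.execute("INSERT INTO job_titles (title) VALUES ('CEO')")
    duplicate_title_rejected = False
except sqlite3.IntegrityError:
    duplicate_title_rejected = True
```

Note that nothing makes jobs.title_id unique, which is exactly what allows the two CEO rows from the example above.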
Remarks:
Job title is going to be used as part of an auto-complete field so I'll be using a query to fetch results.
As I mentioned before, use key-value pairs here. Fetch a list of them into memory in your app, and query that list for your autocomplete values. Then send the ID off to the DB for your actual SQL query. The queries will perform better that way; even with indexes, searching integers is generally quicker than searching strings.
You've said that titles will be user created. Put some input sanitation and validation process in place, because you don't want redundant entries like "WEB DEVELOPER", "web developer", "web developer", etc. Validation should occur at both the application and DB levels; the unique constraint is part (but not all) of this. Prodigitalson's remark about separate machine and display columns is related to this issue.
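Here is one way the two remarks above could look in application code: the (id, title) pairs are fetched once, autocomplete runs against the in-memory list, and titles are normalized before comparison. This is only a sketch in plain Python; the function names and the normalization rule (collapse whitespace, fold case) are assumptions, and a real app would pick its own rules.

```python
def normalize_title(raw):
    """Collapse runs of whitespace and fold case so variants compare equal."""
    return " ".join(raw.split()).casefold()

def build_lookup(rows):
    """rows: (title_id, title) pairs fetched once from the job_titles table."""
    return sorted((normalize_title(title), title_id, title) for title_id, title in rows)

def autocomplete(lookup, prefix, limit=10):
    """Return up to `limit` (title_id, title) pairs whose title starts with prefix."""
    key = normalize_title(prefix)
    hits = [(title_id, title) for norm, title_id, title in lookup if norm.startswith(key)]
    return hits[:limit]

# Pretend these came from: SELECT title_id, title FROM job_titles
titles = [(1, "Web Developer"), (2, "Web Designer"), (3, "CEO")]
lookup = build_lookup(titles)

matches = autocomplete(lookup, "  WEB d")
same_title = normalize_title("WEB   DEVELOPER") == normalize_title("web developer")
```

The id from the chosen match is what gets sent to the DB for the actual query. The same normalize_title rule can be applied before inserting a user-created title, so the unique constraint sees one spelling per title.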
Edited: after getting the clarification
A table like this is enough - just add the job_title_id column as foreign key in the main member table
---- "job_title" table ---- (store the job_title)
1. pk - job_title_id
2. unique - job_title_name <- index this
__ original answer __
You need to clarify what the job_title is going to represent:
a person that holds this position?
the division/department that has this position?
A certain set of attributes? like Sales always has a commission
or just a string of what was it called?
From what I read so far, you just need the "job_title" as some sort of dimension - make the id for it, make the string searchable - and that's it
example
---- "employee" table ---- (store employee info)
1. pk - employee_id
2. fk - job_title_id
3. other attribute (contract_start_date, salary, sex, ... so on ...)
---- "job_title" table ---- (store the job_title)
1. pk - job_title_id
2. unique - job_title_name <- index this
---- "employee_job_title_history" table ---- (We can check the employee job history here)
1. pk - employee_id
2. pk - job_title_id
3. pk - is_effective
4. effective_date [edited: this need to be PK too - thanks to KM.]
I still think you need to provide us a use-case - that will greatly improve both of our understanding I believe
If there are only a few fixed job titles you might want to use an enum in your database.
See http://dev.mysql.com/doc/refman/5.0/en/enum.html
If that's not supported by your version of mysql simply encode it with a numerical index and resolve it to a human readable form in your queries.

Database design - unique data from multiple sources

I have a table that stores information about pictures for automobiles. In a nutshell it contains the fields 'id', 'auto_id', 'name', and 'path'. So it is linked to a particular automobile entry through the 'auto_id' field.
Now consider that I want to add pictures for houses. Would it be better to just create another table similar to this one, or add a field in the existing table to point out the type of picture it is? Or is there an altogether better way to address this type of issue?
Edit: I apologize for the wording. It's obviously a simple problem, I just don't know how to best form it into a coherent question. Thanks for the patience and any help.
I would just model your picture table as:
id name path
and have a join table, subjects:
picture_id subject_id subject_type
Where picture_id is a FK to pictures, and subject_id is a FK to the specific subject indicated by subject_type (automobile, house etc)
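A minimal sketch of that layout, using Python's sqlite3 module. The join table is called picture_subjects here and the sample rows are invented. One caveat worth stating: because subject_id points at a different table depending on subject_type, it cannot be declared as an ordinary FK, so only picture_id gets a declared constraint in this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE pictures (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    path TEXT NOT NULL
);
CREATE TABLE picture_subjects (
    picture_id   INTEGER NOT NULL REFERENCES pictures(id),
    subject_id   INTEGER NOT NULL,   -- id in the automobile or house table
    subject_type TEXT NOT NULL CHECK (subject_type IN ('automobile', 'house')),
    PRIMARY KEY (picture_id, subject_id, subject_type)
);
""")

conn.executemany(
    "INSERT INTO pictures (id, name, path) VALUES (?, ?, ?)",
    [(1, "Front view", "img/car_front.jpg"),
     (2, "Kitchen", "img/kitchen.jpg"),
     (3, "Garden", "img/garden.jpg")],
)
conn.executemany(
    "INSERT INTO picture_subjects (picture_id, subject_id, subject_type) VALUES (?, ?, ?)",
    [(1, 5, "automobile"), (2, 7, "house"), (3, 7, "house")],
)

def pictures_for(subject_type, subject_id):
    """Names of all pictures attached to one car or one house."""
    rows = conn.execute(
        """SELECT p.name
           FROM pictures p
           JOIN picture_subjects ps ON ps.picture_id = p.id
           WHERE ps.subject_type = ? AND ps.subject_id = ?
           ORDER BY p.id""",
        (subject_type, subject_id),
    )
    return [r[0] for r in rows]

house_pictures = pictures_for("house", 7)
```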
Well, now given also that:
The 'id' field was just a unique identifier for the picture. The 'auto_id' field was the foreign key for the automobile the picture was linked to.
I would like to have it set up to where one can have as many photos linked to cars or houses as one would like.
My suggested solution is:
Where
picture is your original pictures table, only that auto_id now is subject_id.
subject is your original automobiles table, now it will store both cars and houses records. Add a new field type_id as foreign key to subject types table.
subject_type is a new table where you'll store all possible subjects (not limited to cars and houses only, thinking to future expansion of your subject types).
(Sorry if this diagram doesn't reflect your real number/name/datatype of tables/columns, it's just that I don't know that info)
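Since the three tables are only described in words here, this is a rough sketch of how they could be declared, using Python's sqlite3 module. The description column and all sample rows are placeholders, since the real columns of the automobiles table are not known.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE subject_type (
    type_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE              -- 'car', 'house', later maybe more
);
CREATE TABLE subject (
    subject_id  INTEGER PRIMARY KEY,
    type_id     INTEGER NOT NULL REFERENCES subject_type(type_id),
    description TEXT                          -- placeholder for the real columns
);
CREATE TABLE picture (
    id         INTEGER PRIMARY KEY,
    subject_id INTEGER NOT NULL REFERENCES subject(subject_id),  -- was auto_id
    name       TEXT,
    path       TEXT
);
""")

conn.executemany("INSERT INTO subject_type VALUES (?, ?)", [(1, "car"), (2, "house")])
conn.executemany(
    "INSERT INTO subject VALUES (?, ?, ?)",
    [(10, 1, "Red hatchback"), (20, 2, "Lake cottage")],
)
conn.executemany(
    "INSERT INTO picture (subject_id, name, path) VALUES (?, ?, ?)",
    [(10, "Side view", "img/a.jpg"), (20, "Porch", "img/b.jpg"), (20, "Dock", "img/c.jpg")],
)

def pictures_of_type(type_name):
    """Picture names for every subject of the given type."""
    rows = conn.execute(
        """SELECT p.name
           FROM picture p
           JOIN subject s      ON s.subject_id = p.subject_id
           JOIN subject_type t ON t.type_id = s.type_id
           WHERE t.name = ?
           ORDER BY p.id""",
        (type_name,),
    )
    return [r[0] for r in rows]

house_pictures = pictures_of_type("house")
```

Unlike a type tag stored next to the picture, every column here is a real FK, so the DBMS can check all of them.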
Given that only the subject (car or house) is what varies, and you may want to increase the number of subjects in the future, I suggest you keep everything in one single table, just adding an extra field to store the photo subject.
BTW: what is the difference between id and auto_id fields? If there is none, I'd suggest to get rid of auto_id column

Shared Primary Key

I would guess this is a semi-common question but I can't find it in the list of past questions. I have a set of tables for products which need to share a primary key index. Assume something like the following:
product1_table:
id,
name,
category,
...other fields
product2_table:
id,
name,
category,
...other fields
product_to_category_table:
product_id,
category_id
Clearly it would be useful to have a shared index between the two product tables. Note, the idea of keeping them separate is because they have largely different sets of fields beyond the basics, however they share a common categorization.
UPDATE:
A lot of people have suggested table inheritance (or gen-spec). This is an option I'm aware of but given in other database systems I could share a sequence between tables I was hoping MySQL had a similar solution. I shall assume it doesn't based on the responses. I guess I'll have to go with table inheritance... Thank you all.
It's not really common, no. There is no native way to share a primary key. What I might do in your situation is this:
product_table
id
name
category
general_fields...
product_type1_table:
id
product_id
product_type1_fields...
product_type2_table:
id
product_id
product_type2_fields...
product_to_category_table:
product_id
category_id
That is, there is one master product table that has entries for all products and has the fields that generalize between the types, and type-specified tables with foreign keys into the master product table, which have the type-specific data.
A better design is to put the common columns in one products table, and the special columns in two separate tables. Use the product_id as the primary key in all three tables, but in the two special tables it is, in addition, a foreign key back to the main products table.
This simplifies the basic product search for ids and names by category.
Note, also that your design allows each product to be in one category at most.
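A small sketch of the three-table layout with product_id as the primary key everywhere, using Python's sqlite3 module. The table names product_type1 and product_type2, the type-specific columns and the sample product are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE products (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
-- In the special tables product_id is both the PK and a FK to products.
CREATE TABLE product_type1 (
    product_id  INTEGER PRIMARY KEY REFERENCES products(product_id),
    type1_field TEXT
);
CREATE TABLE product_type2 (
    product_id  INTEGER PRIMARY KEY REFERENCES products(product_id),
    type2_field TEXT
);
""")

conn.execute("INSERT INTO products (product_id, name) VALUES (1, 'Kettle')")
conn.execute("INSERT INTO product_type1 (product_id, type1_field) VALUES (1, '1.7 litres')")

# A special-table row without a matching row in products is rejected.
try:
    conn.execute("INSERT INTO product_type1 (product_id, type1_field) VALUES (99, 'no parent')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

row = conn.execute(
    """SELECT p.name, t.type1_field
       FROM products p JOIN product_type1 t USING (product_id)"""
).fetchone()
```

Because the special tables reuse the parent's key, a product's common and special columns are one join apart and no second id column is needed.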
It seems you are looking for table inheritance.
You could use a common table product with attributes common to both product1 and product2, plus a type attribute which could be either "product2" or "product1"
Then tables product1 and product2 would have all their specific attributes and a reference to the parent table product.
product:
id,
name,
category,
type
product1_table:
id,
#product_id,
product1_specific_fields
product2_table:
id,
#product_id,
product2_specific_fields
First let me state that I agree with everything that Chaos, Larry and Phil have said.
But if you insist on another way...
There are two reasons for your shared PK. One, uniqueness across the two tables, and two, to complete referential integrity.
I'm not sure exactly what "sequence" features the auto_increment columns support. It seems like there is a system setting to define the increment-by value, but nothing per column.
What I would do in Oracle is just share the same sequence between the two tables. Another technique would be to set a STEP value of 2 in the auto_increment and start one at 1 and the other at 2. Either way, you're generating unique values between them.
You could create a third table that has nothing but the PK Column. This column could also provide the Autonumbering if there's no way of creating a skipping autonumber within one server. Then on each of your data tables you'd add CRUD triggers. An insert into either data table would first initiate an insert into the pseudo index table (and return the ID for use in the local table). Likewise a delete from the local table would initiate a delete from the pseudo index table. Any children tables which need to point to a parent point to this pseudo index table.
Note this will need to be a per row trigger and will slow down crud on these tables. But tables like "product" tend NOT to have a very high rate of DML in the first place. Anyone who complains about the "performance impact" is not considering scale.
Please note, this is provided as a functioning alternative and not my recommendation as the best way
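For what it's worth, the pseudo index table idea can be sketched like this with Python's sqlite3 module. Note one swap: the insert side is done in application code (the insert_product helper, which is made up) rather than in an insert trigger, because that is simpler to show portably; only the delete side uses triggers. Table and trigger names are assumptions too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- The table that has nothing but the PK column and hands out the ids.
CREATE TABLE product_ids (id INTEGER PRIMARY KEY AUTOINCREMENT);

CREATE TABLE product1 (
    id   INTEGER PRIMARY KEY REFERENCES product_ids(id),
    name TEXT
);
CREATE TABLE product2 (
    id   INTEGER PRIMARY KEY REFERENCES product_ids(id),
    name TEXT
);

-- Deleting a product also removes its entry from the id table.
CREATE TRIGGER product1_release AFTER DELETE ON product1
BEGIN
    DELETE FROM product_ids WHERE id = OLD.id;
END;
CREATE TRIGGER product2_release AFTER DELETE ON product2
BEGIN
    DELETE FROM product_ids WHERE id = OLD.id;
END;
""")

def insert_product(table, name):
    """Reserve an id in product_ids, then use it in the chosen product table."""
    if table not in ("product1", "product2"):
        raise ValueError("unknown product table")
    new_id = conn.execute("INSERT INTO product_ids DEFAULT VALUES").lastrowid
    conn.execute(f"INSERT INTO {table} (id, name) VALUES (?, ?)", (new_id, name))
    return new_id

a = insert_product("product1", "Widget")
b = insert_product("product2", "Gadget")
c = insert_product("product1", "Sprocket")

conn.execute("DELETE FROM product1 WHERE id = ?", (a,))
remaining_ids = conn.execute("SELECT COUNT(*) FROM product_ids").fetchone()[0]
```

Child tables such as product_to_category_table would then reference product_ids(id) rather than either product table.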
You can't "share" a primary key.
Without knowing all the details, my best advice is to combine the tables into a single product table. Having optional fields that are populated for some products and not others is not necessarily a bad design.
Another option is to have a sort of inheritance model, where you have a single product table, and then two product "subtype" tables, which reference the main product table and have their own specialized set of fields. Querying this model is more painful than a single table IMHO, which is why I see it as the less-desirable option.
Your explanation is a little vague but, from my basic understanding I would be tempted to do this
The product table contains common fields
product
-------
product_id
name
...
The product_extra1 table and the product_extra2 table contain different fields. These tables have a one-to-one relationship enforced between product.product_id and product_extra1.product_id etc. Enforce the one-to-one relationship by setting the product_id in the foreign key tables (product_extra1, etc.) to be unique using a unique constraint.
You will need to decide on the business rules as to how this data is populated.
product_extra1
---------------
product_id
extra_field1
extra_field2
....
product_extra2
---------------
product_id
different_extra_field1
different_extra_field2
....
Based on what you have above the product_category table is an intersecting table (1 to many - many to 1) which would imply that each product can be related to many categories
This can now stay the same.
This is yet another case of gen-spec.
See previous discussion

Database Formatting for Album Tracks

I would like to store an album's track names in a single field in a database.
The number of tracks is arbitrary for each album.
Each album is one record in the table.
Each track must be linked to a specific URL which also should be stored in the database somewhere.
Is it possible to do this by storing them in a single field, or is a relational table for the track names/urls the only way to go?
Table: Album
ID/PK (your choice of primary key philosophy)
AlbumName
Table: Track
ID/PK (optional, could make AlbumFK, TrackNumber the primary key)
AlbumFK REFERENCES (Album.PK)
TrackNumber
TrackName
TrackURL
It's entirely possible, you could store the field as comma-separated or XML data for example.
Whether it's sensible is another question - if you ever want to query how many albums have more than 10 tracks for example you aren't going to be able to write an SQL query for that and you'll have to resort to pulling the data back into your application and dissecting it there which is not ideal.
Another option is to store the data in a separate "tracks" table (i.e. normalised), but also provide a view on those tables that gives the data as a single field in a denormalised manner. Then you get the benefit of properly structured data and the ability to query the data as a single field from the view.
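To make the last two points concrete, here is a sketch with a normalised track table, the "more than 10 tracks" query, and a view that exposes the track names as one field. It uses Python's sqlite3 module; the view name, album names, track names and example.com URLs are all placeholders. group_concat exists in both SQLite and MySQL (as GROUP_CONCAT), though the separator syntax differs, and the order of names inside the concatenated field is not guaranteed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE album (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE track (
    album_id     INTEGER NOT NULL REFERENCES album(id),
    track_number INTEGER NOT NULL,
    name         TEXT NOT NULL,
    url          TEXT NOT NULL,
    PRIMARY KEY (album_id, track_number)
);
-- Denormalised, read-only presentation: one row per album, names in one field.
CREATE VIEW album_track_list AS
SELECT a.id AS album_id,
       a.name AS album_name,
       group_concat(t.name, ', ') AS track_names
FROM album a
JOIN track t ON t.album_id = a.id
GROUP BY a.id, a.name;
""")

conn.executemany("INSERT INTO album (id, name) VALUES (?, ?)",
                 [(1, "Short EP"), (2, "Long Player")])
tracks = [(1, n, f"EP track {n}", f"http://example.com/ep/{n}") for n in range(1, 3)]
tracks += [(2, n, f"LP track {n}", f"http://example.com/lp/{n}") for n in range(1, 13)]
conn.executemany(
    "INSERT INTO track (album_id, track_number, name, url) VALUES (?, ?, ?, ?)", tracks
)

# The question that is awkward to answer with a comma-separated field:
big_albums = [
    r[0] for r in conn.execute(
        """SELECT a.name
           FROM album a JOIN track t ON t.album_id = a.id
           GROUP BY a.id, a.name
           HAVING COUNT(*) > 10"""
    )
]

ep_field = conn.execute(
    "SELECT track_names FROM album_track_list WHERE album_id = 1"
).fetchone()[0]
```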
Conventional approach would be to have one table with a row for each track (with any meta data). Have another table for each Album, and a third table that records the association for which tracks are on which album(s) and in which order.
Use two tables, one for albums, and one for tracks.
Album
-----
Id
Name
Artist
etc...
Track
-----
Id
AlbumId(Foreign Key to Album Table)
Name
URL
You could also augment this with a third table that joined the trackId and AlbumId fields (so don't have the AlbumId in the Track table). The advantage of this second approach would be that it would allow you to reuse a recording when it appeared on many albums (such as compilations).
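A quick sketch of that third-table variant, again with Python's sqlite3 module. The join table name album_track, the position column (for track order) and the sample recordings are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE album (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE track (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    url  TEXT NOT NULL
);
-- Which track is on which album, and where; a track may appear on many albums.
CREATE TABLE album_track (
    album_id INTEGER NOT NULL REFERENCES album(id),
    track_id INTEGER NOT NULL REFERENCES track(id),
    position INTEGER NOT NULL,
    PRIMARY KEY (album_id, position)
);
""")

conn.executemany("INSERT INTO album (id, name) VALUES (?, ?)",
                 [(1, "Debut"), (2, "Greatest Hits")])
conn.executemany("INSERT INTO track (id, name, url) VALUES (?, ?, ?)",
                 [(1, "Opening Song", "http://example.com/t/1"),
                  (2, "Second Song", "http://example.com/t/2")])
conn.executemany("INSERT INTO album_track (album_id, track_id, position) VALUES (?, ?, ?)",
                 [(1, 1, 1), (1, 2, 2),   # both tracks on the first album
                  (2, 2, 1)])             # one of them reused on the compilation

def track_names(album_id):
    """Track names for an album in running order."""
    rows = conn.execute(
        """SELECT t.name
           FROM album_track at
           JOIN track t ON t.id = at.track_id
           WHERE at.album_id = ?
           ORDER BY at.position""",
        (album_id,),
    )
    return [r[0] for r in rows]

albums_with_second_song = conn.execute(
    "SELECT COUNT(*) FROM album_track WHERE track_id = 2"
).fetchone()[0]
```

The recording "Second Song" is stored once in track and simply referenced from both albums.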
The Wikipedia article on Database Normalization makes a reasonable effort to explain the purpose of normalization ... and the sorts of anomalies that the normalization rules are intended to prevent.