I would like to store an album's track names in a single field in a database.
The number of tracks is arbitrary for each album.
Each album is one record in the table.
Each track must be linked to a specific URL which also should be stored in the database somewhere.
Is it possible to do this by storing them in a single field, or is a relational table for the track names/urls the only way to go?
Table: Album
ID/PK (your choice of primary key philosophy)
AlbumName
Table: Track
ID/PK (optional, could make AlbumFK, TrackNumber the primary key)
AlbumFK REFERENCES (Album.PK)
TrackNumber
TrackName
TrackURL
It's entirely possible, you could store the field as comma-separated or XML data for example.
Whether it's sensible is another question. If you ever want to query, for example, how many albums have more than 10 tracks, you won't be able to write a simple SQL query for that; you'll have to resort to pulling the data back into your application and dissecting it there, which is not ideal.
Another option is to store the data in a separate "tracks" table (i.e. normalised), but also provide a view on those tables that gives the data as a single field in a denormalised manner. Then you get the benefit of properly structured data and the ability to query the data as a single field from the view.
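To make the two points above concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (album, track, album_tracklist) and the example.com URLs are made up for illustration, and group_concat is SQLite's aggregate; MySQL has GROUP_CONCAT with slightly different syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE album (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE track (
    album_id     INTEGER NOT NULL REFERENCES album(id),
    track_number INTEGER NOT NULL,
    name         TEXT NOT NULL,
    url          TEXT NOT NULL,
    PRIMARY KEY (album_id, track_number)
);
-- Denormalised view: one row per album, track names in a single field.
CREATE VIEW album_tracklist AS
SELECT a.id, a.name, group_concat(t.name, ', ') AS track_names
FROM album a
LEFT JOIN track t ON t.album_id = a.id
GROUP BY a.id, a.name;
""")

conn.execute("INSERT INTO album (id, name) VALUES (1, 'Long Album'), (2, 'Short EP')")
conn.executemany(
    "INSERT INTO track (album_id, track_number, name, url) VALUES (?, ?, ?, ?)",
    [(1, n, f"Track {n}", f"https://example.com/1/{n}") for n in range(1, 13)]
    + [(2, n, f"Song {n}", f"https://example.com/2/{n}") for n in range(1, 3)],
)

# "Which albums have more than 10 tracks?" is a plain GROUP BY on the track table.
big_albums = conn.execute("""
    SELECT a.name
    FROM album a
    JOIN track t ON t.album_id = a.id
    GROUP BY a.id, a.name
    HAVING COUNT(*) > 10
""").fetchall()

# The view still hands back the tracks as one field when that is convenient.
short_ep_tracks = conn.execute(
    "SELECT track_names FROM album_tracklist WHERE id = 2"
).fetchone()[0]
```

The order of names inside the concatenated field is not guaranteed here, so anything that depends on track order should read the track table directly.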
The conventional approach would be to have one table with a row for each track (with any metadata), another table with a row for each album, and a third table that records which tracks are on which album(s) and in which order.
Use two tables, one for albums, and one for tracks.
Album
-----
Id
Name
Artist
etc...
Track
-----
Id
AlbumId(Foreign Key to Album Table)
Name
URL
You could also augment this with a third table that joined the trackId and AlbumId fields (so don't have the AlbumId in the Track table). The advantage of this second approach would be that it would allow you to reuse a recording when it appeared on many albums (such as compilations).
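A rough sketch of that third-table variant, using Python's sqlite3 module. The album_track table, its position column and the sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE album (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE track (id INTEGER PRIMARY KEY, name TEXT NOT NULL, url TEXT NOT NULL);
-- The track table no longer has an AlbumId; the link lives here instead.
CREATE TABLE album_track (
    album_id INTEGER NOT NULL REFERENCES album(id),
    track_id INTEGER NOT NULL REFERENCES track(id),
    position INTEGER NOT NULL,
    PRIMARY KEY (album_id, position)
);
INSERT INTO album VALUES (1, 'Studio Album'), (2, 'Greatest Hits');
INSERT INTO track VALUES (10, 'Hit Single', 'https://example.com/hit');
-- The same recording appears on both albums, stored only once.
INSERT INTO album_track (album_id, track_id, position) VALUES (1, 10, 3), (2, 10, 1);
""")

albums_with_hit = conn.execute("""
    SELECT a.name
    FROM album a
    JOIN album_track at ON at.album_id = a.id
    WHERE at.track_id = 10
    ORDER BY a.id
""").fetchall()
```

The cost is one extra join on every album listing, so it is probably only worth it if tracks really are shared between albums.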
The Wikipedia article on Database Normalization makes a reasonable effort to explain the purpose of normalization ... and the sorts of anomalies that the normalization rules are intended to prevent.
Forewarning - Something could be wrong from the start. More than one thing. Any additional help outside of the question I ask would be much appreciated :)
Also, admin_tbl can be ignored. It is simply usernames/passwords for administrators for the site.
Aim (and question): To create two relationship tables. Using my HTML Form I want to be able to insert information such as:
albumName - Exploration
name - Dead Sea
timeDate - 2015-12-21
comment - A beautiful landscape of the Dead Sea in Israel.
Upload Image - (button to choose image)
I have managed to create code (php and sql files) to upload the information into the database and upload the image into a 'Uploads' folder. However, the aim is for each album to have an ID (albumID), and to be able to have multiple images be uploaded and be a part of one album (along with having multiple albums of course, each with different images in). Can you help me do this?
What I have so far:
'image_tbl' - - Table which stores image path and the information from the form (including Album Name)
'album_tbl' - - Table which stores albumID and then albumName
Problems (I have run into):
image_tbl - As you can see each image I insert into the database is given an id, which is auto-incremented. Apparently, the auto-incremented id (unsure of official term) has to be the Primary Key. However, to have albumName (in this table) have a relationship in album_tbl doesn't it need to be the Primary Key?
album_tbl - In this table I have albumID and albumName. What I want is that when a form is submitted with an albumName (e.g. 'Exploration'), it is also given the albumID. Then, if I queried a specific 'albumID', every image placed into that album would be shown.
Hope everything is understandable.
P.s. If you need any additional information just ask, and if you have recommendations to do this a lot easier/more efficiently, just tell me!
P.p.s If it can quickly be done, can someone explain what the 'Index' option is in the context of my tables (so it can easily be understood), thanks!
TL;DR You don't need to declare a "relationship", i.e. a foreign key (FK), to query. But it's a good idea. When you do, a FK can reference a primary key (PK) or any other UNIQUE column(s).
PKs & FKs are erroneously called "relationships" in some methods and products. Application relationships are represented by tables. (Base tables and query results.) PKs & FKs are constraints: they tell the DBMS that only certain situations can arise, so it can notice when you make certain errors. They are not relationships, they are statements true in & of every database state and application situation. You do not need to know constraints to update and query a database.
Just know what every table means. Base tables have DBA-given meanings that tell you what their rows mean. Queries also have meanings that tell you what their rows mean. Query meanings are combined from base table meanings in parallel with how their result values are combined from base table values and conditions.
image_tbl -- image [Id] is in an album named [albumName], is named [name], is dated [dateTime] & has comment [comment]
album_tbl -- album [albumID] is named [albumName]
You do not have to declare any PKs/UNIQUEs or FKs! But it is a good idea because then the DBMS can disallow impossible/erroneous updates. A PK/UNIQUE says that a subrow value for its columns must appear only once. A FK says that a subrow value for its columns must appear as a PK/UNIQUE subrow value in its referenced table. The fact that these limitations hold on base tables means that certain limitations hold on query results. But the meanings of those query results are per the query's table & condition combinations, independent of those limitations. Eg whether or not album names are unique,
image_tbl JOIN album_tbl USING (albumName) -- image [Id] is in an album named [albumName], is named [name], is dated [dateTime] and has comment [comment] AND album [albumID] is named [albumName]
The only problem here is that if album names are not unique then knowing an image's album's name isn't going to tell you which album it's in; you only know that it's in an album with that name. On the other hand if album names are unique, you don't need album_tbl albumID.
So if album names are unique, declare albumName UNIQUE in album_tbl. Then in image_tbl identify the album by some PK/UNIQUE column of album_tbl. Since albumID is presumably present just for the purpose of identifying albums, normally we would expect it to be chosen. Then in image_tbl declare that column as a FK referencing album_tbl.
PS Indexes generally speed up querying at the cost of some time and space. A primary key declaration in a table declaration automatically declares an index. It's a good idea to index PK, UNIQUE and FK column sets.
Welcome to relational databases.
The idea with relational databases is to have all data relate to each other but to avoid replication of data. This is called Normalization.
In your example you are replicating the album name in both tables, which is not necessary. What you want to do is change the albumName column in image_tbl to albumID and add a foreign key from image_tbl.albumID to album_tbl.albumID.
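As a rough illustration of that layout, here is a sketch using Python's sqlite3 module rather than PHP/MySQL. The add_image helper, the path column and the file paths are hypothetical, and it assumes album names are unique so that the name typed into the form can be mapped to an albumID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked to
conn.executescript("""
CREATE TABLE album_tbl (
    albumID   INTEGER PRIMARY KEY AUTOINCREMENT,
    albumName TEXT NOT NULL UNIQUE
);
CREATE TABLE image_tbl (
    id       INTEGER PRIMARY KEY AUTOINCREMENT,
    albumID  INTEGER NOT NULL REFERENCES album_tbl(albumID),
    name     TEXT NOT NULL,
    timeDate TEXT,
    comment  TEXT,
    path     TEXT NOT NULL
);
""")

def add_image(album_name, name, time_date, comment, path):
    """Create the album if it is new, then store the image under its albumID."""
    conn.execute(
        "INSERT OR IGNORE INTO album_tbl (albumName) VALUES (?)", (album_name,)
    )
    album_id = conn.execute(
        "SELECT albumID FROM album_tbl WHERE albumName = ?", (album_name,)
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO image_tbl (albumID, name, timeDate, comment, path) "
        "VALUES (?, ?, ?, ?, ?)",
        (album_id, name, time_date, comment, path),
    )
    return album_id

exploration_id = add_image(
    "Exploration", "Dead Sea", "2015-12-21",
    "A beautiful landscape of the Dead Sea in Israel.", "Uploads/dead_sea.jpg",
)
add_image("Exploration", "Second Image", "2015-12-22", "", "Uploads/second.jpg")
add_image("Holidays", "Beach", "2016-01-05", "", "Uploads/beach.jpg")

# Every image in one album, looked up by albumID.
exploration_images = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM image_tbl WHERE albumID = ? ORDER BY id", (exploration_id,)
    )
]
```

The same two-step pattern (find or create the album row, then insert the image with its albumID) should carry over to a PHP form handler.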
I am new to mysql, so help would be much appreciated :-)
Let's take the movie db example:
movie_td (mov_id auto_increment pk, title, year, duration)
actor_td (act_id auto_increment pk, name)
director_td (dir_id auto_increment pk, name)
movie_actor_td (movie_id fk, actor_id fk)
movie_director_td (movie_id fk, director_id fk)
I understand how to insert a .csv type of a file into a single td where all the names are stored in one column, but it's a little bit confusing to do this in a normalized format. If I already have all the data stored in one table, does it make sense to create a static mov_id first so that I can reference the rest of columns to it? Or is there a better way of doing this?
Thanks!
If you store all the data in one table, you will run into problems as soon as any movie has multiple actors or more than one director.
The normalized approach is better because it avoids the insert, update and delete anomalies that redundant data causes in database tables.
Also, you would have to write the same name (for an actor/director) in every movie row that the person is involved in. Updating the name in one row but not the others then leaves inconsistent actor/director names in the table.
If you go by definition, a relation is in first normal form if the domain of each attribute contains only atomic values, and the value of each attribute contains only a single value from that domain. (Source: wikipedia.org).
Hence, when you insert multiple values separated by comma in a row, you are violating the first NF itself! This is because there is a many-to-many relationship among data and you are not mapping it correctly.
Moreover, you ask a very basic question- If I already have all the data stored in one table, does it make sense to create a static mov_id first so that I can reference the rest of columns to it? - well, if you just want to have all the data stored in one table, why not go for XML? You will have one single file storing all the relevant data. But the fact is, you can not run a complete application using XML. XML has different purpose, database tables have different purpose. You do need a data structure that can be queried however you want and not worry about how the storage is happening. I would suggest you read Korth's book on database design.
Coming over to designing databases and table structures, it doesn't matter whether you know how to store a .csv file into a column or not. What matters is how long it is going to take to develop the complicated code to fetch values from the CSV column. It is always better to write a few simple queries than complicated search loops to fetch values.
Let's look at the example you have posted. I'd take only three tables from it.
Consider the table movie_td (I don't understand the reason behind the _td part but I'll stick to it because you posted it.) This table stores information about a movie. Now, in the real world, a movie may have multiple attributes (columns) like title, release date (now, that too depends on the region where it is released, it may have multiple release dates as per region, it's a different story altogether), running time, name of the director (I've only watched movies by a single director or a director duo so far. I'm yet to see a multi-director movie ;), etc.
We must consider two facts here:
A movie has multiple actors portraying multiple characters.
An actor may have acted in multiple movies.
This gives us a many-to-many relationship between actors and movies, and this is where the table movie_actor_td comes into the picture. This table stores which movie has which actor cast in it, with movie_id and actor_id each being a foreign key. A movie may have multiple entries in this table, one per actor. An actor may also have multiple entries in this table, one per movie, so a mutual many-to-many relationship is maintained between them.
A major reason to have this sort of structure is querying the tables. If you store the names of the actors comma separated in the movies table, you have no means to drill down data for the actors using actor_id- you cannot get the actor's other details like their date of birth and other biodata.
What if someone asks you how many movies has the actor foo done? Would you go looking for the actor's name in the CSV column in every row? How fast would it be?
But now that you have the given table structure, you can find that out by a simple query like this:
SELECT COUNT(*)
FROM movie_actor_td
WHERE actor_id = (SELECT act_id
                  FROM actor_td
                  WHERE name = 'foo');
Let's consider an even more complex example. For this, I'd take the freedom to add a column character_name to the table movie_actor_td, as an actor usually plays a single character in a movie. So your movie_actor_td table would look like:
movie_actor_td (movie_id, actor_id, character_name)
So now, there is an actor who played James Bond in the movie Goldeneye, which was released in 1995. I don't know his name. I want to know how many movies he did in 2002. I'd simply write a query like:
SELECT COUNT(*)
FROM movie_actor_td ma
JOIN movie_td m ON m.mov_id = ma.movie_id
WHERE m.year = 2002
AND ma.actor_id = (SELECT actor_id
                   FROM movie_actor_td
                   WHERE movie_id = (SELECT mov_id
                                     FROM movie_td
                                     WHERE title = 'Goldeneye'
                                     AND year = 1995)
                   AND character_name = 'James Bond');
Can you fetch that so easily if you have all the data stored in a single CSV column? I doubt that. I'd suggest you continue with the current schema in hand.
EDIT
You ask about creating a static mov_id first and then referencing all the other columns to it. I think you need to read further about primary keys, foreign keys and database constraints first. Then read about auto-incremented column values in MySQL.
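For what it's worth, moving already-flat data into the normalized tables can be sketched roughly like this, using Python's sqlite3 module. The flat_rows sample data is invented, actor names are assumed to be unique, and the ids come from the auto-incremented keys, so no static mov_id has to be prepared in advance.

```python
import sqlite3

# Hypothetical flat input: one row per movie, actors crammed into one column.
flat_rows = [
    ("Movie A", 2001, "Actor X, Actor Y"),
    ("Movie B", 2002, "Actor X"),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movie_td (
    mov_id INTEGER PRIMARY KEY AUTOINCREMENT,
    title  TEXT NOT NULL,
    year   INTEGER NOT NULL,
    UNIQUE (title, year)
);
CREATE TABLE actor_td (
    act_id INTEGER PRIMARY KEY AUTOINCREMENT,
    name   TEXT NOT NULL UNIQUE  -- assumption: a name identifies one actor
);
CREATE TABLE movie_actor_td (
    movie_id INTEGER NOT NULL REFERENCES movie_td(mov_id),
    actor_id INTEGER NOT NULL REFERENCES actor_td(act_id),
    PRIMARY KEY (movie_id, actor_id)
);
""")

for title, year, actors in flat_rows:
    # Insert the movie once, then look up the id the database generated for it.
    conn.execute(
        "INSERT OR IGNORE INTO movie_td (title, year) VALUES (?, ?)", (title, year)
    )
    mov_id = conn.execute(
        "SELECT mov_id FROM movie_td WHERE title = ? AND year = ?", (title, year)
    ).fetchone()[0]
    for actor in (a.strip() for a in actors.split(",")):
        conn.execute("INSERT OR IGNORE INTO actor_td (name) VALUES (?)", (actor,))
        act_id = conn.execute(
            "SELECT act_id FROM actor_td WHERE name = ?", (actor,)
        ).fetchone()[0]
        conn.execute(
            "INSERT OR IGNORE INTO movie_actor_td (movie_id, actor_id) VALUES (?, ?)",
            (mov_id, act_id),
        )

movies_for_actor_x = conn.execute("""
    SELECT COUNT(*)
    FROM movie_actor_td
    WHERE actor_id = (SELECT act_id FROM actor_td WHERE name = 'Actor X')
""").fetchone()[0]
```

In MySQL the same idea would probably be done by loading the CSV into a staging table and filling the three tables from it with INSERT ... SELECT.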
I had one single table that had lots of problems. I was saving comma-separated data in some fields, and afterwards I wasn't able to search them. Then, after searching the web and finding a lot of solutions, I decided to split it into several tables.
That one table I had, became 5 tables.
The first table is called agendamentos_diarios; this is the table where I'm going to store the schedules.
The second table is called tecnicos, and it stores the technicians' names. It has two fields: id (primary key) and name (varchar).
The third table is called agendamento_tecnico. This is the link table where I'm going to store the ids from the first and second tables. That's because some schedules are attended by more than one technician.
The fourth table is called veiculos (vehicles). It has two fields: the id and the name of the vehicle.
The fifth table is the link between the first table and the vehicles table. Same idea: I'm going to store the schedule id and the vehicle id.
I have an image that explains this better than I can in words.
Am I doing it correctly? Is there a better way of storing data to MySQL?
I agree with @Strawberry about the ids, but normally it is the Hibernate mapping type that does this. If you are not using Hibernate to design your tables, you should remove the ID from agendamento_tecnico and agendamento_veiculos. That way you guarantee uniqueness. If you don't want to do that, create a unique key on the FK fields in those tables.
I notice that you keep the vehicles separate from your technicians. In your model the same vehicle can be in two different schedules at the same time (which doesn't make sense). It would be better if the vehicle were linked in the agendamento_tecnico table, which would then become agendamento_tecnico_veiculo.
Looking at your table, I note (I'm Brazilian) that you have a column called "servico", which means service. Your schedule table is designed for only one service. What if the same schedule has more than one service? To solve this you can create a services table and an m-n relationship with the schedule. It will be easier to create reports and keep the services well separated in your database.
There is also a nome_cliente field, which holds the client for that schedule. It would be better to have a cliente (client) table and link the schedule to it with an FK.
As said before, there is no right answer. You have to think about your problem and how it may grow. Modelling a database properly will avoid a lot of headaches later.
Better is subjective, there's no right answer.
My natural instinct would be to break that schedule table up even more.
Looks like data about the technician and the client is duplicated.
Then again, you might have made a decision to de-normalise for perfectly valid reasons.
Doubt you'll find anyone on here who disagrees with you not having comma separated fields though.
Where you call a halt to the changes is dependent on your circumstances now. Comma separated fields caused you an issue, you got rid of them. So what bit of where you are is causing you an issue now?
Looks OK, especially for a first try.
One comment: I would give PKs and FKs (ids) the same name in all tables rather than using 'id' as the name. (Additionally, we use '#' or '_' as the last character of primary/foreign key names; for example technicos.technico_, and agendamento_tecnico has the fields agend_tech_ and technico_.) This is not a common convention, though. It makes queries a bit more complex (because you must fully qualify the fields), but it makes the database schema more readable (you can see at once which PK belongs to which FK).
Other comment: the two associative (I never wrote that word before!) tables, e.g. agendamento_tecnico joining to tecnicos, have their own ID field, but they do not need one: the two (primary/unique) keys of the tables they join are unique themselves, so together they can serve as the PK for these tables, like:
CREATE TABLE agendamento_tecnico (
technico_ int not null,
agend_tech_ int not null,
primary key(technico_,agend_tech_)
)
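To see what the composite key buys you, here is a small sketch using Python's sqlite3 module with the same table definition. The id values are invented, and MySQL should behave the same way with a duplicate-key error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE agendamento_tecnico (
    technico_   INTEGER NOT NULL,
    agend_tech_ INTEGER NOT NULL,
    PRIMARY KEY (technico_, agend_tech_)
);
""")

# Two different technicians on the same schedule: allowed.
conn.execute("INSERT INTO agendamento_tecnico VALUES (1, 100)")
conn.execute("INSERT INTO agendamento_tecnico VALUES (2, 100)")

# The same technician on the same schedule a second time: rejected by the PK.
try:
    conn.execute("INSERT INTO agendamento_tecnico VALUES (1, 100)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

link_rows = conn.execute("SELECT COUNT(*) FROM agendamento_tecnico").fetchone()[0]
```

With a separate auto-increment ID as the only key, that third insert would have gone through and left a duplicate link.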
I would guess this is a semi-common question but I can't find it in the list of past questions. I have a set of tables for products which need to share a primary key index. Assume something like the following:
product1_table:
id,
name,
category,
...other fields
product2_table:
id,
name,
category,
...other fields
product_to_category_table:
product_id,
category_id
Clearly it would be useful to have a shared index between the two product tables. Note, the idea of keeping them separate is because they have largely different sets of fields beyond the basics, however they share a common categorization.
UPDATE:
A lot of people have suggested table inheritance (or gen-spec). This is an option I'm aware of but given in other database systems I could share a sequence between tables I was hoping MySQL had a similar solution. I shall assume it doesn't based on the responses. I guess I'll have to go with table inheritance... Thank you all.
It's not really common, no. There is no native way to share a primary key. What I might do in your situation is this:
product_table
id
name
category
general_fields...
product_type1_table:
id
product_id
product_type1_fields...
product_type2_table:
id
product_id
product_type2_fields...
product_to_category_table:
product_id
category_id
That is, there is one master product table that has entries for all products and has the fields that generalize between the types, and type-specific tables with foreign keys into the master product table, which have the type-specific data.
A better design is to put the common columns in one products table, and the special columns in two separate tables. Use the product_id as the primary key in all three tables, but in the two special tables it is, in addition, a foreign key back to the main products table.
This simplifies the basic product search for ids and names by category.
Note, also that your design allows each product to be in one category at most.
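A minimal sketch of that three-table layout, using Python's sqlite3 module. The product1_details and product2_details table names and their weight_kg and licence_key columns are invented stand-ins for the "special" fields.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked to
conn.executescript("""
CREATE TABLE products (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    category   TEXT NOT NULL
);
-- In the special tables product_id is both the PK and an FK to products.
CREATE TABLE product1_details (
    product_id INTEGER PRIMARY KEY REFERENCES products(product_id),
    weight_kg  REAL
);
CREATE TABLE product2_details (
    product_id  INTEGER PRIMARY KEY REFERENCES products(product_id),
    licence_key TEXT
);
INSERT INTO products VALUES (1, 'Anvil', 'tools'), (2, 'Editor', 'tools');
INSERT INTO product1_details VALUES (1, 25.0);
INSERT INTO product2_details VALUES (2, 'ABC-123');
""")

# The basic search by category never touches the special tables.
tools = conn.execute(
    "SELECT product_id, name FROM products WHERE category = 'tools' ORDER BY product_id"
).fetchall()

# Type-specific data is one join away when it is needed.
anvil = conn.execute("""
    SELECT p.name, d.weight_kg
    FROM products p
    JOIN product1_details d ON d.product_id = p.product_id
""").fetchall()
```

Because the ids all come from the one products table, they are unique across both product types without any shared sequence.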
It seems you are looking for table inheritance.
You could use a common table product with attributes common to both product1 and product2, plus a type attribute which could be either "product2" or "product1"
Then tables product1 and product2 would have all their specific attributes and a reference to the parent table product.
product:
id,
name,
category,
type
product1_table:
id,
#product_id,
product1_specific_fields
product2_table:
id,
#product_id,
product2_specific_fields
First let me state that I agree with everything that Chaos, Larry and Phil have said.
But if you insist on another way...
There are two reasons for your shared PK: one, uniqueness across the two tables, and two, complete referential integrity.
I'm not sure exactly what "sequence" features auto_increment columns support. It seems there is a system setting to define the increment-by value, but nothing per column.
What I would do in Oracle is just share the same sequence between the two tables. Another technique would be to set a STEP value of 2 in the auto_increment and start one at 1 and the other at 2. Either way, you're generating unique values between them.
You could create a third table that has nothing but the PK Column. This column could also provide the Autonumbering if there's no way of creating a skipping autonumber within one server. Then on each of your data tables you'd add CRUD triggers. An insert into either data table would first initiate an insert into the pseudo index table (and return the ID for use in the local table). Likewise a delete from the local table would initiate a delete from the pseudo index table. Any children tables which need to point to a parent point to this pseudo index table.
Note this will need to be a per row trigger and will slow down crud on these tables. But tables like "product" tend NOT to have a very high rate of DML in the first place. Anyone who complains about the "performance impact" is not considering scale.
Please note, this is provided as a functioning alternative and not my recommendation as the best way
You can't "share" a primary key.
Without knowing all the details, my best advice is to combine the tables into a single product table. Having optional fields that are populated for some products and not others is not necessarily a bad design.
Another option is to have a sort of inheritance model, where you have a single product table, and then two product "subtype" tables, which reference the main product table and have their own specialized set of fields. Querying this model is more painful than a single table IMHO, which is why I see it as the less-desirable option.
Your explanation is a little vague but, from my basic understanding I would be tempted to do this
The product table contains common fields
product
-------
product_id
name
...
The product_extra1 table and the product_extra2 table contain different fields.
These tables have a one-to-one relationship enforced between product.product_id and product_extra1.product_id, etc. Enforce the one-to-one relationship by making product_id in the foreign key tables (product_extra1, etc.) unique with a unique constraint.
You will need to decide on the business rules for how this data is populated.
product_extra1
---------------
product_id
extra_field1
extra_field2
....
product_extra2
---------------
product_id
different_extra_field1
different_extra_field2
....
Based on what you have above the product_category table is an intersecting table (1 to many - many to 1) which would imply that each product can be related to many categories
This can now stay the same.
This is yet another case of gen-spec.
See previous discussion
Say we have this scenario:
Artist ==< Album ==< Track
//ie, One Artist can have many albums, and one album can have many tracks
In this case, all 3 entities have basically the same fields:
ID
Name
A foreign key for the one-to-many relationship to the corresponding children (Artist to Album and Album to Track)
A typical solution to this scenario would be three tables with the same fields (ArtistID, AlbumID etc...) and foreign key constraints on the one-to-many relationship field.
But, can we in this case, incorporate a form of inheritance to avoid the repetition of the same field ? I'm talking something of the sort:
Table: EntityType(EntityTypeID, EntityName)
This table would hold 3 entities (1. Artist, 2. Album, 3. Track)
Table: Entities(EntityID, Name, RelField, EntityTypeID)
This table will hold the name of the entity (like the name of
an artist for example), the one-many field (foreign-key
of EntityID) and EntityTypeID holding 1 for Artist, 2 for Album
and so on.
What do you think about the above design? Does it make sense to incorporate "OOP concepts" in this DB scenario?
And finally, would you prefer having the foreign-key constraints of the first scenario or the more generic approach (with the risk of linking an artist with a Track for example, since there is no check to see that the input foreign-key value is really that of an album)?
..btw, come to think of it, I think you can actually check if an inputted value of the RelField of an Artist corresponds to an Album, with triggers maybe?
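For reference, a trigger-based check along those lines could look roughly like the sketch below, using Python's sqlite3 module. It assumes RelField points from a child to its parent entity (an album to its artist, a track to its album) and that a valid parent's type id is exactly one less than the child's; the trigger name is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EntityType (EntityTypeID INTEGER PRIMARY KEY, EntityName TEXT NOT NULL);
CREATE TABLE Entities (
    EntityID     INTEGER PRIMARY KEY,
    Name         TEXT NOT NULL,
    RelField     INTEGER REFERENCES Entities(EntityID),
    EntityTypeID INTEGER NOT NULL REFERENCES EntityType(EntityTypeID)
);
INSERT INTO EntityType VALUES (1, 'Artist'), (2, 'Album'), (3, 'Track');

-- Reject a row whose parent is missing or is not of the type one level up.
CREATE TRIGGER entities_check_parent
BEFORE INSERT ON Entities
WHEN NEW.RelField IS NOT NULL
BEGIN
    SELECT RAISE(ABORT, 'parent has the wrong entity type')
    WHERE (SELECT EntityTypeID FROM Entities WHERE EntityID = NEW.RelField)
          IS NOT (NEW.EntityTypeID - 1);
END;
""")

conn.execute("INSERT INTO Entities VALUES (1, 'Some Artist', NULL, 1)")
conn.execute("INSERT INTO Entities VALUES (2, 'Some Album', 1, 2)")
conn.execute("INSERT INTO Entities VALUES (3, 'Some Track', 2, 3)")

# A track hung directly off an artist is what a plain FK cannot catch here.
try:
    conn.execute("INSERT INTO Entities VALUES (4, 'Stray Track', 1, 3)")
    stray_track_rejected = False
except sqlite3.DatabaseError:
    stray_track_rejected = True

entity_count = conn.execute("SELECT COUNT(*) FROM Entities").fetchone()[0]
```

This only covers inserts; updates would need a matching trigger, which is extra machinery that the three-table design gets for free from ordinary foreign keys.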
I have recently seen this very idea of abstraction implemented consistently, and the application and its database became a monster to maintain and troubleshoot. I will stay away from this technique. The simpler, the better, is my mantra.
There's very little chance that the additional fields that will inevitably accumulate on the various entities will be as obliging. Nothing to be gained by not reflecting reality in a reasonably close fashion.
I don't imagine you'd even likely conflate these entities in your regular OO design.
This reminds me (but only slightly) of an attempt I saw once to implement everything in a single table (named "Entity") with another table (named "Attributes") and a junction table between them.
By sticking all three together, you make your queries less readable (unless you then decompose the three categories as views) and you make searching and indexing more difficult.
Plus, at some point you'll want to add attributes to one category, which aren't attributes for the others. Sticking all three together gives you no room for change without ripping out chunks of your system.
Don't get so clever you trip yourself up.
The only advantage I can see to doing it in your OOP way is if there are other element types added in future (i.e., other than artist, album and track). In that case, you wouldn't need a schema change.
However, I'd tend to opt for the non-OOP way and just change the schema in that case. Some problems you have with the OOP solution are:
what if you want to add the birthdate of artist?
what if you want to store duration of albums and tracks?
what if you want to store track type?
Basically, what if you want to store something that's specific only to one or two of the element types?
If you're in to this sort of thing, then take a look at table inheritance in PostgreSQL.
create table Artist (id integer not null primary key, name varchar(50));
-- Child tables do not inherit the primary key, so Album declares its own for Track to reference.
create table Album (parent integer references Artist (id), primary key (id)) inherits (Artist);
create table Track (parent integer references Album (id)) inherits (Artist);
I agree with le dorfier, you might get some reuse out of the notion of a base entity (ID, Name) but beyond that point the concepts of Artist, Album, and Track will diverge.
And a more realistic model would probably have to deal with the fact that multiple artists may contribute to a single track on an album...