MySQL and general database normalization question

I have a question about normalization.
Suppose I have an application dealing with songs.
At first I thought about doing it like this:
Songs Table:
id | song_title | album_id | publisher_id | artist_id
Albums Table:
id | album_title | etc...
Publishers Table:
id | publisher_name | etc...
Artists Table:
id | artist_name | etc...
Then I started thinking about normalization, and I thought I should remove album_id, publisher_id, and artist_id from the songs table and put them in intermediate tables like this:
Table song_album:
song_id | album_id
Table song_publisher:
song_id | publisher_id
Table song_artist:
song_id | artist_id
Now I can't decide which is the better way. I'm not an expert on database design, so if someone could point me in the right direction, it would be awesome.
Are there any performance differences between the two approaches?
Thanks

Forget about performance issues. The question is: does this model represent the data correctly?
The intermediate tables are called "junction tables", and they are useful when you have a many-to-many relationship. For example, if you store the song "We Are the World" in your database, then you are going to have many artists for that song. Each of those artists is also responsible for creating many other songs. Therefore, to represent the data correctly, you will have to use junction tables, just as you did in the second version.
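A minimal sketch of the song/artist junction table described above. The table and column names follow the question; the column types, lengths, and constraints are assumptions for illustration, not part of the original design.

```sql
-- Sketch only: column types and lengths are assumed.
CREATE TABLE songs (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    song_title VARCHAR(255) NOT NULL
);

CREATE TABLE artists (
    id          INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    artist_name VARCHAR(255) NOT NULL
);

-- Junction table: one row per (song, artist) pairing.
-- The composite primary key prevents crediting the same artist twice on one song.
CREATE TABLE song_artist (
    song_id   INT UNSIGNED NOT NULL,
    artist_id INT UNSIGNED NOT NULL,
    PRIMARY KEY (song_id, artist_id),
    FOREIGN KEY (song_id)   REFERENCES songs (id),
    FOREIGN KEY (artist_id) REFERENCES artists (id)
);

-- All artists credited on one song.
SELECT a.artist_name
FROM songs AS s
INNER JOIN song_artist AS sa ON sa.song_id = s.id
INNER JOIN artists AS a ON a.id = sa.artist_id
WHERE s.song_title = 'We Are the World';
```

The same pattern would apply to song_album or song_publisher if those relationships turn out to be many-to-many as well.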

That depends. If you can guarantee that a particular song always belongs to one single album, go for your first approach. If not, you have an n-to-n relationship and need a join table: that is your second approach. Both are completely OK in terms of normalization.
It is important to design your database so that your data can actually be mapped to it.
Don't worry about performance here. Performance depends more on how well you have chosen your indexes and how your queries are written than on whether you have to do one more join operation (your second approach, the join table, would need one more join in every query).

The first structure mixes up the semantics (e.g. it repeats the publisher for every single song). The second structure will allow you to put invalid data in the database (e.g. one song can belong to two albums). Here is what I understood from the problem domain and my suggestions for the design:
One album is published by only one publisher, so you don't need to specify the publisher in every single song; you just need to put the publisher_id in the Albums table. Also, if you keep the artist_id in the Songs table, each of your songs can have only one artist; but by putting the song_id and artist_id in a linkage table you can have multiple artists for one song (for example, when two singers sing one song together).
Also, for table names it is commonly advised to use the singular form.
Here is my suggested design:
Song Table:
id | song_title | album_id | ...
Album Table:
id | album_title | publisher_id | ...
Publisher Table:
id | publisher_name | ...
Artist Table:
id | artist_name | ...
Song_Artist Table:
song_id | artist_id | artist_role | ...
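The publisher-on-album part of the suggested design could be sketched roughly as follows. The table and column names come from the answer; the column types and the NOT NULL constraints are assumptions, and the Artist and Song_Artist tables are left out to keep the sketch short.

```sql
-- Sketch only: column types and constraints are assumed.
CREATE TABLE Publisher (
    id             INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    publisher_name VARCHAR(255) NOT NULL
);

-- Each album points at exactly one publisher.
CREATE TABLE Album (
    id           INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    album_title  VARCHAR(255) NOT NULL,
    publisher_id INT UNSIGNED NOT NULL,
    FOREIGN KEY (publisher_id) REFERENCES Publisher (id)
);

-- Each song points at exactly one album; the publisher is not repeated here.
CREATE TABLE Song (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    song_title VARCHAR(255) NOT NULL,
    album_id   INT UNSIGNED NOT NULL,
    FOREIGN KEY (album_id) REFERENCES Album (id)
);

-- The publisher of a song is reached through its album.
SELECT s.song_title, al.album_title, p.publisher_name
FROM Song AS s
INNER JOIN Album AS al ON al.id = s.album_id
INNER JOIN Publisher AS p ON p.id = al.publisher_id;
```

Because publisher_id is stored once per album, changing an album's publisher is a single-row update rather than one update per song.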

Songs can appear on multiple albums. Think of a greatest hits release. It's important to zoom out of the technical muck and consider the real-world use of an application (or database).

I'd stick with the first one, for two reasons:
A song is only associated with one album, one publisher and one artist, so you don't need to create separate tables for them (if, for example, a song can have more than one artist, then create the song_artist table).
It's more efficient. With the second approach you'll need to make some joins.

Related

Any way to "shortcut" common joins or code blocks in MySQL?

I have my tables in fairly normalized forms (I've only just started learning this, so I would wildly guess that I'm using 3rd normal form but I also know that I'm probably wrong, just as I know that some of my example code will be horribly written and also potentially wrong), and there are a few very common joins that I use across several queries.
For illustration's sake, let's say I have information on songs, something like this:
Songs:
song_id (PK) | song_title | first_artist | most_famous_artist
Albums:
album_id (PK) | album_title | version | format | year_released
Song versions:
song_version_id (PK) | song_id (FK) | artist | year_released
Album tracks:
album_id (PK, FK) | track_no (PK) | song_version_id (FK)
And suppose that I am frequently pulling up information on a particular version of a song, including the name of the song and the album it appeared on, so I keep writing:
FROM
    song_versions AS sv
    INNER JOIN songs AS s ON sv.song_id = s.song_id
    INNER JOIN album_tracks AS at ON at.song_version_id = sv.song_version_id
    INNER JOIN albums AS a ON a.album_id = at.album_id
or whatever the case may be. I know I could denormalize a bit, and either combine a couple of tables together or have some data duplicated across tables and make sure I keep them consistent. But is there a way to keep the data structure the same, but have a shortcut to refer to the combination of song/version/album without having to write the same join all the time? (Especially if I may often want to only refer to particular subsets of that same join - e.g. the albums containing the most famous version of a particular song, or songs that have appeared on an album released after 2000.)

Mysql table relations

I have read various topics regarding table relations, and while I am building my database I am a bit confused about what I should do.
I have 3 types of registration on my site (artist, fan, company). Each registered user gets a unique key, a username, and the appropriate user type (e.g. fan). I am trying to attach music genres to all types of registration, but genres will also be added to uploaded music files. At the moment I am storing one music genre per track and per user, chosen from an array list that is shown in a form. The system then stores it in the appropriate field. But I want some users to have more than one genre stored.
What I have done so far is below:
Users table (total 14 columns)
ID | username | email | password | type | signup | lastlogin | etc.
Settings table (total 10 columns)
ID | username | description | banner | country | genres | avatar etc.
Music table
ID | username | artist | title | maingenre | othergenre | cover | fileurl
Keeping performance in mind, and assuming that thousands upon thousands of users are registering...
Should I just add all the settings columns to the users table, or is it OK to keep them as I have them now? Settings can be updated by the user, while the users table is updated by the system.
Should I split the users table according to user type? For example, an artist table, a fan table, etc. that would each store the appropriate registration and settings? The type of user needs to be in a column, as it is important for some functions of the site.
Regarding the music table, I was thinking of making a table for each genre that would store the uploads according to the genre specified by the user. Is this a good way of storing tracks in the database or not? That way, when I want to fetch disco tracks, I would just use the disco music table.
Any help will be much appreciated.

I'm not quite sure I completely understand how your tables are related, or what exactly you want or plan to do, but here is one idea about how to store genres in your database and connect them with your Settings table.
First, create a table Genres in which you will store all genres. That table could look like this:
Table: Genres
ID | genres_name | description etc.
ID - will be the primary key, auto increment
genres_name - will hold the name of the genre (blues, jazz, disco...)
description - I added this column in case you want to store something specific for each genre; it's not necessary
The next step is to create the table Settings_genres. This table will store the relation between your Settings table and the Genres table, and will look like this:
Table: Settings_genres
settings_id | genres_id
So the data in this table will look like this (for the setting with ID 1, which has 3 different genres):
settings_id | genres_id
------------------------
1           | 2
1           | 4
1           | 5
settings_id and genres_id together form the primary key, which means that you won't be able to store two identical pairs in this table (you can have only one relation between a given settings row and a given genre row).
This is called a many-to-many relationship, and I'm sure you can easily find more about it if you google it a little.
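As a rough sketch, the two tables described above could be created like this. The column types and lengths are assumptions, and the minimal Settings definition is only a stand-in so that the foreign key has something to reference; if the real Settings table already exists, the first statement does nothing.

```sql
-- Stand-in for the existing Settings table (assumed: integer primary key named ID).
CREATE TABLE IF NOT EXISTS Settings (
    ID       INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    username VARCHAR(64) NOT NULL
);

-- One row per genre.
CREATE TABLE Genres (
    ID          INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    genres_name VARCHAR(64) NOT NULL UNIQUE,
    description TEXT NULL
);

-- Junction table: the composite primary key rejects duplicate pairs.
CREATE TABLE Settings_genres (
    settings_id INT NOT NULL,
    genres_id   INT NOT NULL,
    PRIMARY KEY (settings_id, genres_id),
    FOREIGN KEY (settings_id) REFERENCES Settings (ID),
    FOREIGN KEY (genres_id)   REFERENCES Genres (ID)
);
```

With this definition, inserting a (settings_id, genres_id) pair that is already present fails with a duplicate-key error, which is the "only one relation per pair" behaviour described above.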
When you want to pull data from the database showing all settings together with all of their genres, you can do it with a query like this:
SELECT Settings.*, Genres.genres_name, Genres.description
FROM Settings
INNER JOIN Settings_genres
ON Settings.ID = Settings_genres.settings_id
INNER JOIN Genres
ON Settings_genres.genres_id = Genres.ID
ORDER BY Settings.ID;
Here is an SQL Fiddle to see what it looks like.
When you want to pull rows from the Settings table that are connected with a specific genre, you can do it like this:
SELECT Settings.*, Genres.genres_name, Genres.description
FROM Settings
INNER JOIN Settings_genres
ON Settings.ID = Settings_genres.settings_id
INNER JOIN Genres
ON Settings_genres.genres_id = Genres.ID
WHERE Genres.genres_name = 'Rock'
ORDER BY Settings.ID;
The same result can also be achieved by moving the genre filter into the join condition. For an INNER JOIN the two forms are logically equivalent, so any speed difference is likely to be small, but let's not go into detail...
SELECT Settings.*, Genres.genres_name, Genres.description
FROM Settings
INNER JOIN Settings_genres
ON Settings.ID = Settings_genres.settings_id
INNER JOIN Genres
ON Settings_genres.genres_id = Genres.ID
AND Genres.genres_name = 'Rock'
ORDER BY Settings.ID;
Here is a Fiddle for that...
So basically I suggest you learn a little bit about relations between tables, especially many-to-many relationships. Then you will see how to improve your table design.
Hope I helped a little.
GL!

I think the way your tables are set up is okay. You don't have to split the table based on the types of users you have. What you could use instead is front-end technologies to let users perform only the activities you want them to, thereby controlling the flow of information within the system. I hope that helps.

Should a boolean field or a separate table be used?

In order to learn MySQL I'm building a music CD database, which is pretty complex but so far I'm doing rather well. I have set up, among others, a table with albums, another with artists and an album_artists one which links album_id's with artist_id's. But in an album with various artists, usually one, or some, of them are the main artists, so when making a query I shouldn't order them by alphabetical or id order, but by primary or secondary. Question is:
Should I make a separate table of secondary_artists, identical to the original album_artists one, or make a boolean isPrimary field in the album_artists table? Are both ways acceptable?

Many bibliographic / discographic schemes for recording multiple creators assign an ordinal number to each contributor. So, instead of a flag indicating "primary", your album_artist table could contain
album_id artist_id artist_order
So if
"Daylight Again" had album_id 314,
David Crosby had artist_id 87,
Stephen Stills had artist_id 33,
Graham Nash had artist_id 50,
your album_artist table would have these rows:
album_id artist_id artist_order
314      87        1
314      33        2
314      50        3
This would give you sufficient information to get the artists in the order mentioned in the work (which is the right order for most catalogs).
Don't put "secondary" artists in a different table.
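A minimal sketch of the ordinal approach, using the example ids from this answer. The column types and the extra unique key are assumptions; foreign keys to the albums and artists tables are omitted here only to keep the sketch self-contained.

```sql
-- Sketch only: types are assumed; add foreign keys to your own albums/artists tables.
CREATE TABLE album_artist (
    album_id     INT UNSIGNED NOT NULL,
    artist_id    INT UNSIGNED NOT NULL,
    artist_order TINYINT UNSIGNED NOT NULL,
    PRIMARY KEY (album_id, artist_id),
    -- Assumed extra rule: no two artists share the same position on one album.
    UNIQUE KEY uq_album_position (album_id, artist_order)
);

INSERT INTO album_artist (album_id, artist_id, artist_order) VALUES
    (314, 87, 1),
    (314, 33, 2),
    (314, 50, 3);

-- Artists for one album, in billing order rather than alphabetical or id order.
SELECT artist_id
FROM album_artist
WHERE album_id = 314
ORDER BY artist_order;
```

If you still want a primary/secondary distinction, one option is to treat artist_order = 1 as the primary artist, so no separate flag or table is needed.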

MySQL Database column having multiple values

I had a question about whether or not my implementation idea is easy to work with/write queries for.
I currently have a database with multiple columns. Most of the columns are the same thing (items, but split into item 1, item 2, item 3 etc).
So I have currently in my database ID, Name, Item 1, Item 2 ..... Item 10.
I want to condense this into ID, Name, Item.
But what I want item to have is to store multiple values as different rows. I.e.
ID = One    Name = Hello    Item = This
                                   That
                                   There
Kind of like the format it looks like. Is this a good idea and how exactly would I go about doing this? I will be using no numbers in the database and all of the information will be static and will never change.
Can I do this using 1 database table (and would it be easy to match items of one ID to another ID), or would I need to create 2 tables and link them?
If so how exactly would I create 2 tables and make them relational?
Any ideas on how to implement this? Thanks!

This is a classic example of a denormalized database. Denormalization sometimes makes certain operations more efficient, but more often leads to inefficiencies. (For example, if one of your write queries were to change the name associated with an ID, you would have to change many rows instead of a single one.) Denormalization should only be done for specific reasons after a fully normalized database has been designed. In your example, a normalized database design would be:
table_1: ID (key), Name
table_2: ID (foreign key mapped to table_1.ID), Item
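The two-table layout above might be created and queried roughly like this. The table names follow the answer; the integer ID, the column lengths, and the composite primary key on table_2 are assumptions made for the sketch.

```sql
-- Sketch only: types and the composite key on table_2 are assumed.
CREATE TABLE table_1 (
    ID   INT NOT NULL PRIMARY KEY,
    Name VARCHAR(100) NOT NULL
);

CREATE TABLE table_2 (
    ID   INT NOT NULL,
    Item VARCHAR(100) NOT NULL,
    PRIMARY KEY (ID, Item),
    FOREIGN KEY (ID) REFERENCES table_1 (ID)
);

-- One name, stored once, with any number of items.
INSERT INTO table_1 (ID, Name) VALUES (1, 'Hello');
INSERT INTO table_2 (ID, Item) VALUES (1, 'This'), (1, 'That'), (1, 'There');

-- Returns one row per item, each carrying the name from table_1.
SELECT t1.ID, t1.Name, t2.Item
FROM table_1 AS t1
INNER JOIN table_2 AS t2 ON t2.ID = t1.ID
ORDER BY t1.ID, t2.Item;
```

Renaming 'Hello' is then a single-row update in table_1, regardless of how many items belong to it.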

You're talking about a denormalized table, which SQL databases have a difficult time dealing with. Your Item field has a many-to-one relationship to the other fields. The correct thing to do is to make two tables. The typical example is albums and songs. Songs have a many-to-one relationship to albums, so you could structure your tables like this:
Table Album
album_id [Primary Key]
Title
Artist
Table Song
song_id [Primary Key]
album_id [Foreign Key album.album_id]
Title
Often this example is given with a third table, Artist, in which case you would replace the Artist field with an artist_id field that is a foreign key to the Artist table's artist_id.
Of course, in reality songs, albums, and artists are more complex. One song can be on multiple albums, multiple artists can be on one album, there are multiple versions of the same song, and there are even some songs which have no album release at all.
Example:
Album
album_id | Title | Artist
1        | White | Beatles
2        | Black | Metallica
Song
song_id | album_id | Title
1       | 2        | Enter Sandman
2       | 1        | Back in the USSR
3       | 2        | Sad but True
4       | 2        | Nothing Else Matters
5       | 1        | Helter Skelter
To query this you just do a JOIN:
SELECT * FROM Album INNER JOIN Song ON Album.album_id = Song.album_id;

I don't think one table really makes sense in this case. Instead you can do:
Main Table:
ID
Name
Item Table:
ID
Item #
Item Value
Main_ID (foreign key to Main Table.ID)
Then when you do queries you can do a simple join on Main_ID.

Efficient Classifieds Mysql Structure

I am restructuring a classifieds MySQL db where the different main sections are separated into separate tables. For example, sale items have their own table with unique ID's, jobs have their own table with unique ID's, personals have their own table as well.
These sections all share a few common characteristics:
-id
-title
-body
-listing status
-poster
-reply email
-posting date
But they each have some separate information required as well:
-each have different sets and trees of categories to choose from (which affect the structure needed to store them)
-jobs need to store things like salary, start date, etc.
-sale items need to store things like prices, obo, etc.
Therefore, is it a better practice to refactor the db while I can to a universal table to store ALL the general listing info regardless of section, and then task out customized data storage to small tables, or is it better to leave the current structure alone and leave the sections separated?

Sounds like they are all separate entities that have nothing to do with each other (except for sharing some column definitions), right?
Do you ever want to do a SELECT like
SELECT *
FROM main_entity
WHERE entity_type IN ('SALE_ITEM', 'JOB', 'PERSONAL')?
Otherwise I don't think I would merge them into one table.

Don't use a single table. Go relational.
What I would recommend setting up is a so-called polymorphic relationship between your "main" table (the one with the common characteristics), and three tables containing specific information. The structure would look something like this:
Main table
id
title
...
category_name (VARCHAR or CHAR)
category_id (INTEGER)
Category table
id
(specific columns)
The category_name field should contain the table name of the specific category table, eg. 'job_category', while the category_id should point to ID in the category table. An example would look like this:
# MAIN TABLE
id | title | ... | category_name | category_id
-------------------------------------------------------
123 | Some title | ... | job_category | 345
321 | Another title | ... | sale_category | 543
# SPECIFIC TABLE (job_category)
id | ...
---------
345 | ...
# SPECIFIC TABLE (sale_category)
id | ...
---------
543 | ...
Now, whenever you query the main table, you will immediately know which table to fetch the additional data from, and you will know the ID in that table. The only downside to this approach is that you have to perform two separate queries to fetch information for one single item. It would probably be possible to do this in a transaction, however.
For fetching data the other way around (e.g. you search job_category for something), on the other hand, you can fetch the associated data from the main table with a JOIN. Remember not only to join on main.category_id = job_category.id, but also to use the category_name column as a join condition. Otherwise, you may fetch data that belongs to one of the other categories.
For optimal performance, you may want to index the category_name and category_id columns. This would mostly speed up any queries that join the two tables, as described in the previous paragraph.
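A rough sketch of the reverse lookup and the index mentioned above. The table name main, the salary and start_date columns, and all column types are hypothetical and chosen only for illustration; the category_name and category_id columns follow the answer.

```sql
-- Hypothetical tables; names and column types are assumptions for illustration.
CREATE TABLE job_category (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    salary     DECIMAL(10, 2) NULL,
    start_date DATE NULL
);

CREATE TABLE main (
    id            INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title         VARCHAR(255) NOT NULL,
    category_name VARCHAR(64)  NOT NULL,  -- name of the specific table, e.g. 'job_category'
    category_id   INT UNSIGNED NOT NULL,  -- id of the row in that specific table
    INDEX idx_main_category (category_name, category_id)
);

-- Search the job-specific table first, then pull the shared listing data.
-- The category_name condition keeps rows of other categories with the same id out.
SELECT m.id, m.title, j.salary, j.start_date
FROM job_category AS j
INNER JOIN main AS m
    ON  m.category_id   = j.id
    AND m.category_name = 'job_category'
WHERE j.salary >= 50000;
```

One trade-off worth knowing: because category_id can point into different tables depending on category_name, it cannot be declared as a regular foreign key, so the application has to keep those references consistent itself.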
Hope this helps!