Foreign Key or Composite Key? - mysql

I just started applying everything that I read about table relationships but I'm kind of confused on how to insert data on tables with MANY-TO-MANY relationship considering there's a third table.
Right now I have these tables.
subject
name
code PK
units
description
schoolyear
schoolyearId PK
yearStart
yearEnd
schoolyearsubjects (MANY TO MANY table)
id PK
code FK
schoolyearId FK
But the problem with the above schoolyearsubjects table is that I don't know how to insert the schoolyearId from the GUI. In the GUI screenshot, as soon as the "Save" button is clicked, a TRANSACTION containing 2 INSERT statements (one into subject and one into schoolyearsubjects) will execute. If I stick with the above, I'll have to insert the schoolyearId, which definitely won't come from the GUI.
I'm thinking of changing the columns of schoolyearsubjects and schoolyear to this:
schoolyear
--composite keys (yearStart, yearEnd)
yearStart (PK)
yearEnd (PK)
schoolyearsubjects(MANY TO MANY table)
id PK
code (FK)
yearStart (FK) --is this possible?
yearEnd (FK) --is this possible?
1.) Is the solution to change the columns and make a composite key so I can just insert yearStart and yearEnd values instead of schoolyearId?
2.) Is my junction / linking table schoolyearsubjects correct?
3.) What can you advise?
I'd appreciate any help.
Thanks.

For me schoolyear is a period, and as such, there is no need to use a surrogate key here. This always makes things more confusing, and it is always more difficult to develop a graphical interface for it (I'm talking about how we model periods as developers).
If you stop to think about it, a period is inherently unique. Will one period ever be equal to another? Even if two periods look alike, they occur in different years or at different times. So we already have a primary key for schoolyear. Remove "schoolyearId PK" from schoolyear and use a composite key of yearStart and yearEnd instead. Your schoolyear entity (later, table) will then look like:
yearStart PK
yearEnd PK
In the intermediate table, you will have 3 fields as composite primary key (also foreign key!):
yearStart PK FK (from schoolyear)
yearEnd PK FK (from schoolyear)
code PK FK (from subject)
This allows each subject to be linked to a given period only once. If, on the other hand, you want the same subject to appear more than once in a period, you would have to use a surrogate key here instead.
Now, to draw the graphical interface, you will only have to use a select box (combo box). In this case you will have each item as a text, something like "from year X to Y" (a period). Your users can very well understand and select it.
Note: In any case, an interface does not need to show a record's ID, only the values that identify it. Those values are fine for users to see, and they distinguish a record from the others.
If, however, periods are not unique in your context, then "yearStart" and "yearEnd" are fields of the subject entity, and there is no schoolyear entity. To be honest, the "schoolyear" entity should exist only if you want to reuse its records in relationships with records of other tables. I'm not saying this is or is not the case, but be careful: if you do this, you are saying that every subject has only one period (as fields). I do not know if this is exactly what you want. We must always remember the most important thing in shaping an ER diagram:
CONTEXT
Check your context. What does it ask? If you have any questions, please comment. If you can offer me some more context here, I can help you more.
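The composite-key layout described in this answer can be sketched as follows. This is only an illustration under assumptions: Python's built-in sqlite3 module stands in for MySQL so the snippet runs anywhere, the subject table is trimmed to two columns, and the sample values and the helper try_link are made up for the demo.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL; table names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.executescript("""
CREATE TABLE subject (
    code TEXT PRIMARY KEY,
    name TEXT
);
CREATE TABLE schoolyear (
    yearStart INTEGER NOT NULL,
    yearEnd   INTEGER NOT NULL,
    PRIMARY KEY (yearStart, yearEnd)
);
CREATE TABLE schoolyearsubjects (
    yearStart INTEGER NOT NULL,
    yearEnd   INTEGER NOT NULL,
    code      TEXT    NOT NULL,
    PRIMARY KEY (yearStart, yearEnd, code),
    FOREIGN KEY (yearStart, yearEnd)
        REFERENCES schoolyear (yearStart, yearEnd),
    FOREIGN KEY (code) REFERENCES subject (code)
);
INSERT INTO subject VALUES ('MATH101', 'Algebra');
INSERT INTO schoolyear VALUES (2016, 2017);
""")

LINK = ("INSERT INTO schoolyearsubjects (yearStart, yearEnd, code) "
        "VALUES (?, ?, ?)")

def try_link(year_start, year_end, code):
    """Hypothetical helper: True if the row was accepted, False if a key rule rejected it."""
    try:
        conn.execute(LINK, (year_start, year_end, code))
        return True
    except sqlite3.IntegrityError:
        return False

first_link = try_link(2016, 2017, "MATH101")      # valid period and subject
repeat_link = try_link(2016, 2017, "MATH101")     # same triple a second time
unknown_period = try_link(2018, 2019, "MATH101")  # period missing from schoolyear
```

With this layout the GUI only has to supply the two years and the subject code; the composite primary key rejects the repeated triple and the composite foreign key rejects a period that does not exist.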

Assuming you have parameters @code, @yearStart and @yearEnd with values from the UI:
INSERT INTO schoolyearsubjects ( code, yearStart, yearEnd )
SELECT @code, y.yearStart, y.yearEnd
FROM schoolyear y
WHERE @yearStart <= y.yearStart
AND y.yearEnd <= @yearEnd;
...but I think you have a design flaw with your schoolyearsubjects because it allows duplicates e.g. doing this:
INSERT INTO schoolyearsubjects VALUES ( 'code red', '2016', '2017' );
INSERT INTO schoolyearsubjects VALUES ( 'code red', '2016', '2017' );
INSERT INTO schoolyearsubjects VALUES ( 'code red', '2016', '2017' );
looks like it would result in three de facto duplicate rows.
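One way to see the duplicate problem, and one possible fix, is to compare a junction table that has only a surrogate id with one that also declares a UNIQUE constraint over the three natural columns. The sketch below is an assumption-laden demo: it uses Python's sqlite3 instead of MySQL, and the table names links_loose and links_unique are invented for the comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Surrogate id only: nothing stops repeated (code, yearStart, yearEnd) rows.
CREATE TABLE links_loose (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    code      TEXT    NOT NULL,
    yearStart INTEGER NOT NULL,
    yearEnd   INTEGER NOT NULL
);
-- Same columns, plus a UNIQUE constraint over the natural key.
CREATE TABLE links_unique (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    code      TEXT    NOT NULL,
    yearStart INTEGER NOT NULL,
    yearEnd   INTEGER NOT NULL,
    UNIQUE (code, yearStart, yearEnd)
);
""")

row = ("code red", 2016, 2017)

# Three identical inserts into the unconstrained table all succeed.
for _ in range(3):
    conn.execute(
        "INSERT INTO links_loose (code, yearStart, yearEnd) VALUES (?, ?, ?)",
        row,
    )

# The same three inserts against the constrained table: only the first survives.
rejected = 0
for _ in range(3):
    try:
        conn.execute(
            "INSERT INTO links_unique (code, yearStart, yearEnd) VALUES (?, ?, ?)",
            row,
        )
    except sqlite3.IntegrityError:
        rejected += 1

loose_count = conn.execute("SELECT COUNT(*) FROM links_loose").fetchone()[0]
unique_count = conn.execute("SELECT COUNT(*) FROM links_unique").fetchone()[0]
```

So keeping the surrogate id is possible, as long as the natural key is also declared unique.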

With your current scheme you can insert the schoolyearId with a request as follows:
INSERT INTO schoolyearsubjects (id, code, schoolyearId)
VALUES ( ${id},
${code_from_GUI},
( SELECT schoolyearId
FROM schoolyear
WHERE yearStart=${start_from_GUI} AND yearEnd=${end_from_GUI})
);
For this to work, the unique constraint on (yearStart, yearEnd) in the schoolyear table is required.
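A minimal runnable sketch of that lookup-on-insert, assuming Python's sqlite3 in place of MySQL, an auto-generated id column (so it is not passed in), and a hypothetical helper named link_subject. The placeholders play the role of the values coming from the GUI.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE schoolyear (
    schoolyearId INTEGER PRIMARY KEY AUTOINCREMENT,
    yearStart    INTEGER NOT NULL,
    yearEnd      INTEGER NOT NULL,
    UNIQUE (yearStart, yearEnd)   -- guarantees the subquery finds at most one row
);
CREATE TABLE schoolyearsubjects (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    code         TEXT    NOT NULL,
    schoolyearId INTEGER NOT NULL REFERENCES schoolyear (schoolyearId)
);
INSERT INTO schoolyear (yearStart, yearEnd) VALUES (2015, 2016), (2016, 2017);
""")

def link_subject(conn, code, year_start, year_end):
    """Hypothetical helper: resolve schoolyearId from the two years while inserting."""
    try:
        conn.execute(
            """
            INSERT INTO schoolyearsubjects (code, schoolyearId)
            VALUES (?, (SELECT schoolyearId
                        FROM schoolyear
                        WHERE yearStart = ? AND yearEnd = ?))
            """,
            (code, year_start, year_end),
        )
        return True
    except sqlite3.IntegrityError:
        # Unknown period: the subquery yields NULL and NOT NULL rejects the row.
        return False

known = link_subject(conn, "MATH101", 2016, 2017)
unknown = link_subject(conn, "MATH101", 2030, 2031)
stored = conn.execute(
    "SELECT code, schoolyearId FROM schoolyearsubjects"
).fetchall()
```

The user never sees schoolyearId; the database resolves it from the two years inside the same statement.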
As to the rest of your questions:
1) You can use a composite key in the schoolyear table; it will work either way.
2) The schoolyearsubjects table is correct, as it allows you to write join queries. If you get rid of the schoolyearId column, then you will probably not need the schoolyear table at all, since all the data you may want will be in the schoolyearsubjects table.
3) This article may help you decide what type of key to use.

Related

Database normalization of an entity with varied properties

In my application I have an entity, table, called actions with varied properties. To clarify the case, the following is the table actions structure:
id,
status_id(not null),
section_id(not null),
job_id (not null)
equipment_id (null),
cause_id (null),
solution_id (null),
created_at,
closed_at,
action_type (not null) char(3)
Where all fields suffixed with _id are foreign keys and the action_type is very limited and defined list of actions types, so I defined it in a configuration file i.e there is no database entity for action_type.
My question is more general than this one: Can a foreign key be NULL and/or duplicate? Here I'm asking about the normalization principle.
In my case, some action types have no need for, say, equipment_id, whereas others need equipment_id but need neither cause_id nor solution_id, etc.
In my database design, the actions table looks like a many-to-many junction table.
The above design makes it easy to get many statistics about sections and jobs without having to write complex join queries.
My question is: Are my normalization and design correct?
Yes. A foreign key containing a NULL represents a case where a relationship is optional, and the relationship is not present in this instance.
In your case, there may be entries where there is simply no corresponding equipment, and equipment_id is accordingly left NULL. When a join is done to the reference table, rows with NULL in the foreign key will simply drop out.
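To make the "rows drop out" point concrete, here is a small sketch, assuming Python's sqlite3 in place of MySQL, a trimmed-down actions table, and made-up sample rows. It shows a nullable foreign key, the difference between an inner and a left join, and that a non-NULL value still has to match a parent row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.executescript("""
CREATE TABLE equipment (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE actions (
    id           INTEGER PRIMARY KEY,
    action_type  TEXT NOT NULL,
    equipment_id INTEGER NULL REFERENCES equipment (id)  -- optional relationship
);
INSERT INTO equipment VALUES (1, 'pump');
INSERT INTO actions VALUES (1, 'REP', 1), (2, 'INS', NULL);
""")

# Inner join: the action without equipment disappears from the result.
inner_rows = conn.execute("""
    SELECT a.id, e.name
    FROM actions a
    JOIN equipment e ON e.id = a.equipment_id
    ORDER BY a.id
""").fetchall()

# Left join: every action is kept, with NULL where no equipment exists.
left_rows = conn.execute("""
    SELECT a.id, e.name
    FROM actions a
    LEFT JOIN equipment e ON e.id = a.equipment_id
    ORDER BY a.id
""").fetchall()

# NULL is allowed, but a value that points at no equipment row is not.
try:
    conn.execute("INSERT INTO actions VALUES (3, 'REP', 99)")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True
```

If the statistics queries must count actions regardless of equipment, a LEFT JOIN is the form to reach for.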
Yes. If these _id columns are optional, then as far as normalization goes you can allow them to be NULL, or remove them :)

How to query MySQL by one of the field's subvalue?

Let's assume there is a table with these columns:
-personID,
-personName,
-personInterests
There is also another table, which stores the interests:
-interestID
-interestName
One person can have multiple interests, so I put the serialize()-d or JSON representation of the interest array into the personInterests field. Each entry is not a string like "reading", but rather an ID from the interests table, which stores the possible interests. Something like multiple foreign keys in one field.
The best way would be to use foreign keys, but it is not possible to have multiple references in one field...
How do I run such a query without REGEX or splitting the field's content in software? If putting several IDs into one field is not the way to go, then how can I achieve a structure like this?
Storing multiple IDs or any other references in one field is strongly discouraged.
You have to create what I call a "rendezvous" table.
In your case it has:
- ID
- UserID (foreign key)
- InterestID (foreign key)
Every person can have multiple interests, so when a person adds a new interest, you just add a new row to this table, referencing the person and the desired interest through NOT NULL foreign keys.
On large-scale projects with many possible combinations, it is advisable not to give this table an ID column, but rather to make the two foreign keys the composite primary key. That makes duplicates impossible, keeps the table's index smaller, and makes lookups cheaper.
So the best solution is this:
- UserID (foreign key AND primary key)
- InterestID (foreign key AND primary key)
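A minimal sketch of such a table and the query it enables, assuming Python's sqlite3 in place of MySQL. The table name person_interest, the sample rows, and the helper people_interested_in are made up for illustration; the column names follow the question (personID rather than UserID).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (
    personID   INTEGER PRIMARY KEY,
    personName TEXT NOT NULL
);
CREATE TABLE interest (
    interestID   INTEGER PRIMARY KEY,
    interestName TEXT NOT NULL
);
-- One row per (person, interest) pair; the pair itself is the primary key.
CREATE TABLE person_interest (
    personID   INTEGER NOT NULL REFERENCES person (personID),
    interestID INTEGER NOT NULL REFERENCES interest (interestID),
    PRIMARY KEY (personID, interestID)
);
INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO interest VALUES (10, 'reading'), (20, 'chess');
INSERT INTO person_interest VALUES (1, 10), (1, 20), (2, 20);
""")

def people_interested_in(conn, interest_name):
    """Hypothetical helper: names of everyone linked to the given interest."""
    rows = conn.execute(
        """
        SELECT p.personName
        FROM person p
        JOIN person_interest pi ON pi.personID = p.personID
        JOIN interest i ON i.interestID = pi.interestID
        WHERE i.interestName = ?
        ORDER BY p.personName
        """,
        (interest_name,),
    )
    return [name for (name,) in rows]

chess_fans = people_interested_in(conn, "chess")
readers = people_interested_in(conn, "reading")
```

No REGEX or string splitting is involved; the lookup is two plain joins that can use indexes.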
I believe the only way you can implement this is to create a third table, which will actually get updated by a trigger (Similar to what Gabor Dani advised)
Table1
-personID,
-personName,
-personInterests
Table2
-interestID
-interestName
Table3
-personInterestID (AutoIncrement Field)
-personID
-interestID
Then you need to write a trigger to do this. A stored procedure may be needed as well, because you will have to loop through all the values in the field.

Composite primary key comprising two foreign keys referencing same table: SQL Server vs. MySQL

I've read over a number of posts regarding DB table design for a common one-to-many / users-to-friends scenario. One post included the following:
USERS
* user_id (primary key)
* username
FRIENDS
* user_id (primary key, foreign key to USERS(user_id))
* friend_id (primary key, foreign key to USERS(user_id))
> This will stop duplicates (IE: 1, 2) from happening, but won't stop reversals because (2, 1) is valid. You'd need a trigger to enforce that there's only one instance of the relationship...
The bold portion motivated me to post my question: is there a difference between how SQL Server and MySQL handle these types of composite keys? Do both require this trigger that the poster mentions, in order to ensure uniqueness?
I ask, because up until this point I've been using a similar table structure in SQL Server, without any such triggers. Have I just luckily not run into this data duplication snake that's lurking in the grass?
Yes, all DBMS will treat this the same. The reason is that the DBMS assumes that the column has meaning. I.e., the tuple is not comprised of meaningless numbers. Each attribute has meaning. user_id is assumed to have different meaning than friend_id. Thus, it is incumbent upon the designer to build a rule that claims that 1,2 is equivalent to 2,1.
You could just use a check constraint that friend_id > user_id to prevent "reversals". This would make it impossible to enter a pair such as (2, 1); such a relationship would have to be entered as (1, 2).
If your friendship relationship is symmetrical, you need to add a CHECK(user_id < friend_id) to the table definition and insert the data like this:
INSERT
INTO friends
VALUES (
(CASE WHEN user_id < friend_id THEN user_id ELSE friend_id END),
(CASE WHEN user_id > friend_id THEN user_id ELSE friend_id END)
)
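The CHECK-based approach can be tried out with the sketch below. It assumes Python's sqlite3 in place of either server, and add_friendship is a hypothetical helper that orders the pair in application code instead of with CASE. As far as I know, MySQL only enforces CHECK constraints from 8.0.16 onward; earlier versions parse and ignore them, so there the ordering would rest on the application alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE friends (
    user_id   INTEGER NOT NULL,
    friend_id INTEGER NOT NULL,
    PRIMARY KEY (user_id, friend_id),
    CHECK (user_id < friend_id)   -- only the "smaller id first" form is storable
);
""")

def add_friendship(conn, a, b):
    """Hypothetical helper: store the pair with the smaller id first."""
    low, high = min(a, b), max(a, b)
    conn.execute(
        "INSERT INTO friends (user_id, friend_id) VALUES (?, ?)", (low, high)
    )

add_friendship(conn, 2, 1)  # stored as (1, 2)

# Adding the same friendship the other way round hits the primary key.
try:
    add_friendship(conn, 1, 2)
    reversal_rejected = False
except sqlite3.IntegrityError:
    reversal_rejected = True

# Bypassing the helper with an unordered pair hits the CHECK constraint.
try:
    conn.execute("INSERT INTO friends (user_id, friend_id) VALUES (5, 3)")
    unordered_rejected = False
except sqlite3.IntegrityError:
    unordered_rejected = True

stored = conn.execute("SELECT user_id, friend_id FROM friends").fetchall()
```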
In SQL Server, you can build a UNIQUE index on a pair of computed columns:
CREATE TABLE friends (orestes INT, pylades INT, me AS CASE WHEN orestes < pylades THEN orestes ELSE pylades END, friend AS CASE WHEN orestes > pylades THEN orestes ELSE pylades END)
CREATE UNIQUE INDEX ux_friends_me_friend ON friends (me, friend)
INSERT
INTO friends
VALUES (1, 2)
INSERT
INTO friends
VALUES (2, 1)
-- Fails
To fetch all friends for a given user, you need to run this query:
SELECT friend_id
FROM friends
WHERE user_id = @myuser
UNION ALL
SELECT user_id
FROM friends
WHERE friend_id = @myuser
However, in MySQL, it may be more efficient to always keep both copies of each pair.
You may find these articles interesting:
Selecting friends
Six degrees of separation
If the relationship is symmetrical, then one alternative is to "define" the relationship as asymmetrical in the database, but always add both tuples every time you add either one.
You are basically saying: "The nature of friendship in the DB is asymmetrical, so A can be a friend of B while B is not a friend of A, but the application will always add (or remove) BOTH records (A, B) and (B, A) any time I add (or remove) either." That simplifies the query logic as well, since you don't have to look in both columns anymore. It costs one extra insert/delete each time you modify data, but fewer reads when querying...
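A rough sketch of the "store both directions" variant, assuming Python's sqlite3 in place of MySQL. The helpers add_friendship, remove_friendship and friends_of are hypothetical names; the point is only that both rows are written or deleted inside one transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE friends (
    user_id   INTEGER NOT NULL,
    friend_id INTEGER NOT NULL,
    PRIMARY KEY (user_id, friend_id)
);
""")

def add_friendship(conn, a, b):
    """Insert (a, b) and (b, a) together; both commit or neither does."""
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany(
            "INSERT INTO friends (user_id, friend_id) VALUES (?, ?)",
            [(a, b), (b, a)],
        )

def remove_friendship(conn, a, b):
    """Delete both directions in one transaction."""
    with conn:
        conn.execute(
            """
            DELETE FROM friends
            WHERE (user_id = ? AND friend_id = ?)
               OR (user_id = ? AND friend_id = ?)
            """,
            (a, b, b, a),
        )

def friends_of(conn, user):
    """Single-column lookup: no UNION over both columns is needed."""
    rows = conn.execute(
        "SELECT friend_id FROM friends WHERE user_id = ? ORDER BY friend_id",
        (user,),
    )
    return [friend for (friend,) in rows]

add_friendship(conn, 1, 2)
add_friendship(conn, 3, 1)
friends_of_1 = friends_of(conn, 1)
friends_of_2 = friends_of(conn, 2)
remove_friendship(conn, 1, 2)
friends_of_2_after = friends_of(conn, 2)
```

The trade-off is the one described above: two writes per change, in exchange for a read that touches a single indexed column.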

MySQL practical normalisation on large single table

I'm relatively new to PHP MySQL and have tasked myself on learning with the "hands on" approach. Luckily, I currently have a (very) large database all relating to coin data with one table to work with. It currently has the following columns (each row representing a single item [coin]):
Group
ItemNo
ListNo
TypeCode
DenomCode
PeriodCode
ActualDate
SortDate
CostPrice
SalePrice
Estimate
StockLevel
DateEntered
DateSold
Archived
ArchiveWhenSold
Highlight
KeepSold
OnLists
NotForSale
Proof
StockItem
OnWeb
Cats
Ref1
Ref2
Variety
Picture
Description
TypeName
TypeHeading
DenomName
DenomHeading
DenomValue
PeriodName
PeriodHeading
PeriodStartYear
PeriodEndYear
The groupings for new tables are relatively obvious:
Period:
PeriodCode
PeriodName
PeriodHeading
PeriodStartYear
PeriodEndYear
Denom:
DenomCode
DenomName
DenomHeading
DenomValue
Type:
TypeCode
TypeName
TypeHeading
All the rest, under a Coin table:
Group
ItemNo
ListNo
TypeCode
ActualDate
SortDate
CostPrice
SalePrice
Estimate
StockLevel
DateEntered
DateSold
Archived
ArchiveWhenSold
Highlight
KeepSold
OnLists
NotForSale
Proof
StockItem
OnWeb
Cats
Ref1
Ref2
Variety
Picture
Description
So I'm looking to normalise the table into the tables specified. I know that I'm looking at JOINs, but am wondering about the best way to go about it. Do I create a new table FIRST for each data group (Denom, Period, Type) and THEN insert the data using a JOIN statement? Or is there a way to create new tables "on the fly" with a JOIN statement? I've got a honking great book open here and am following along nicely with the section on MySQL, and am also looking through this site, but haven't been able to figure out the "correct" way to do this.
The reason I ask here for some knowledgeable advice is that I'm a little unsure about how to maintain the "relationships", keys, etc. I.e., if I create a table called "Denom", populate it with all the distinct items from the current table's data, and have it create a unique primary key, how do I then insert the reference to this new primary key from the Denom table into the main Coin table (under a new column DenomID) so that they match up?
I basically need to split this table up into 4 separate tables. I've tried this using Access 2007's table analyzer wizard and it looked promising for a n00b like me, but there was so much data, it actually crashed. Repeatedly. Probably for the best, but now I need to know some best practice, and also HOW to put it into practice. Any advice/help/relevant links would be greatly appreciated.
Create the tables first. Don't forget to add a foreign key field to each child table that holds the primary key of the main table (each new table must also get its own primary key), so that you can join the tables. If you don't have a primary key, you need to create one before doing anything else.
Putting the data into the tables is then a simple insert:
insert tableb (field1, field2)
select field1, field2 from tablea
You will join to get the data out, so remember to create indexes on the new tables, especially on the foreign key fields.
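As a rough illustration of the split for one of the groups (Denom), here is a sketch that assumes Python's sqlite3 in place of MySQL, a heavily trimmed flat table called coins_flat, made-up sample rows, and that each DenomCode always goes with the same DenomName and DenomValue. The new DenomID is matched back to each coin by joining on the old DenomCode.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Trimmed stand-in for the original single table (hypothetical sample data).
CREATE TABLE coins_flat (
    ItemNo      INTEGER PRIMARY KEY,
    Description TEXT,
    DenomCode   TEXT,
    DenomName   TEXT,
    DenomValue  INTEGER
);
INSERT INTO coins_flat VALUES
    (1, 'Worn',  'P', 'Penny',    1),
    (2, 'Fine',  'S', 'Shilling', 12),
    (3, 'Proof', 'P', 'Penny',    1);

-- Step 1: create the lookup table with its own primary key.
CREATE TABLE Denom (
    DenomID    INTEGER PRIMARY KEY AUTOINCREMENT,
    DenomCode  TEXT NOT NULL UNIQUE,
    DenomName  TEXT,
    DenomValue INTEGER
);

-- Step 2: fill it with the distinct combinations found in the flat table.
INSERT INTO Denom (DenomCode, DenomName, DenomValue)
SELECT DISTINCT DenomCode, DenomName, DenomValue
FROM coins_flat;

-- Step 3: build the slimmer Coin table, fetching the new key via a join
-- on the old code so that the rows match up.
CREATE TABLE Coin (
    ItemNo      INTEGER PRIMARY KEY,
    Description TEXT,
    DenomID     INTEGER NOT NULL REFERENCES Denom (DenomID)
);
INSERT INTO Coin (ItemNo, Description, DenomID)
SELECT f.ItemNo, f.Description, d.DenomID
FROM coins_flat f
JOIN Denom d ON d.DenomCode = f.DenomCode;

-- Step 4: index the foreign key used for joining.
CREATE INDEX idx_coin_denom ON Coin (DenomID);
""")

denoms = conn.execute(
    "SELECT DenomCode, DenomName FROM Denom ORDER BY DenomCode"
).fetchall()
rebuilt = conn.execute("""
    SELECT c.ItemNo, d.DenomName
    FROM Coin c
    JOIN Denom d ON d.DenomID = c.DenomID
    ORDER BY c.ItemNo
""").fetchall()
```

The same three steps would be repeated for Period and Type. Comparing row counts between the flat table and the rebuilt join is a cheap sanity check before dropping the original.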

Scoped/composite surrogate keys in MySQL

Here's an excerpt of my current database (changed the table-names for an easier understanding):
Pet(ownerFK, id, name, age)
Owner(id, name)
Where id is always a surrogate key, created with auto_increment.
I want the surrogate key Pet.id to be "scoped" by Pet.ownerFK, or in other words, to have a composite key [ownerFK, id] as my minimal key. I want the table to behave like this:
INSERT Pet(1, ?, "Garfield", 8);
INSERT Pet(1, ?, "Pluto", 12);
INSERT Pet(2, ?, "Mortimer", 1);
SELECT * FROM Pet;
RESULT:
Pet(1, 1, "Garfield", 8)
Pet(1, 2, "Pluto", 12)
Pet(2, 1, "Mortimer", 1)
I am currently using this feature of MyISAM where "you can specify AUTO_INCREMENT on a secondary column in a multiple-column index. In this case, the generated value for the AUTO_INCREMENT column is calculated as MAX(auto_increment_column) + 1 WHERE prefix=given-prefix. This is useful when you want to put data into ordered groups."
However, due to various (and maybe obvious) reasons, I want to switch from MyISAM to InnoDB, as I need transactions at some places.
Is there any way how to achieve this effect with InnoDB?
I found some posts on this issue; many of them proposed write-locking the table before insertion. I am not very familiar with this, but wouldn't a table write lock be a bit of overkill for this? I was thinking instead of using write-safe transactions (which I have never done before), if these are possible, with Owner.current_pet_counter as a helper field.
So another acceptable Solution would be...
Actually I don't need the "scoped" ID to be part of the actual Key. My actual database design uses a separate "permalink" table which uses this 'feature'. I currently use it as a workaround for the missing transactions. I thought of the following alternative:
Pet(id, ownerFK, scopedId, name, age), KEY(id), UNIQUE(ownerFK, scopedId)
Owner(id, name, current_pet_counter)
START TRANSACTION WITH CONSISTENT SNAPSHOT;
SELECT @new := current_pet_counter FROM Owner WHERE id = :owner_id;
INSERT Pet(?, :owner_id, @new, "Pluto", 21);
UPDATE Owner SET current_pet_counter = @new + 1 WHERE id = :owner_id;
COMMIT;
I haven't worked with transactions/transactionvars in MySQL yet, so I don't know whether there would be serious issues with this one.
Note: I do not want to reuse ids that have been given to a pet once. That's why I don't use MAX(). Does this solution have any caveats?
I don't believe so. If you really had to have that schema, you could use a transaction to SELECT the MAX(id) WHERE ownerFK, then INSERT.
I'm very sceptical there's a good reason for that schema, though; the primary key is now also a fact about the key, which might make the database theorists unhappy.
Normally you'd want ‘id’ to really be a proper primary key on its own, with ownerFK used to group and, if you needed it, a separate ‘rank’ column to put pets in a particular order per owner, and a UNIQUE index over (ownerFK, rank).
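The counter-based alternative from the question can be sketched as below. Treat it as an illustration of statement order rather than a tested InnoDB recipe: it assumes Python's sqlite3 in place of MySQL, add_pet is a hypothetical helper, and the counter is bumped before it is read. In InnoDB an UPDATE takes an exclusive lock on the owner row, so doing it first should make concurrent inserts for the same owner wait for each other; SELECT ... FOR UPDATE would be the other usual way to get that lock.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Owner (
    id                  INTEGER PRIMARY KEY,
    name                TEXT NOT NULL,
    current_pet_counter INTEGER NOT NULL DEFAULT 1   -- next scopedId to hand out
);
CREATE TABLE Pet (
    id       INTEGER PRIMARY KEY AUTOINCREMENT,
    ownerFK  INTEGER NOT NULL REFERENCES Owner (id),
    scopedId INTEGER NOT NULL,
    name     TEXT NOT NULL,
    age      INTEGER,
    UNIQUE (ownerFK, scopedId)
);
INSERT INTO Owner (id, name) VALUES (1, 'Jon'), (2, 'Mickey');
""")

def add_pet(conn, owner_id, name, age):
    """Hypothetical helper: give the pet the owner's next number, never reusing one."""
    with conn:  # one transaction: counter bump and insert commit together
        # Claim a number first, so the owner row is written before it is read.
        conn.execute(
            "UPDATE Owner SET current_pet_counter = current_pet_counter + 1 "
            "WHERE id = ?",
            (owner_id,),
        )
        (next_counter,) = conn.execute(
            "SELECT current_pet_counter FROM Owner WHERE id = ?", (owner_id,)
        ).fetchone()
        scoped_id = next_counter - 1  # the value the counter held before the bump
        conn.execute(
            "INSERT INTO Pet (ownerFK, scopedId, name, age) VALUES (?, ?, ?, ?)",
            (owner_id, scoped_id, name, age),
        )
    return scoped_id

garfield = add_pet(conn, 1, "Garfield", 8)
pluto = add_pet(conn, 1, "Pluto", 12)
mortimer = add_pet(conn, 2, "Mortimer", 1)
pets = conn.execute(
    "SELECT ownerFK, scopedId, name FROM Pet ORDER BY ownerFK, scopedId"
).fetchall()
```

Because the counter only ever grows, deleting a pet does not free its number, which matches the "do not reuse ids" requirement. The UNIQUE (ownerFK, scopedId) constraint acts as a safety net if two writers ever did pick the same number.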