MySQL: Proper way to implement a "conditional primary key"

Here is my submissions table. Users make Submissions on Challenges. They can make as many submissions as they want, until there is a correct submission. Once a correct submission is recorded there should be no more submissions given a challenge_id, user_id combo. I was initially enforcing this constraint from within my app but would like to move this constraint to the DB.
+--------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| user_id | int(11) | YES | MUL | NULL | |
| challenge_id | int(11) | YES | MUL | NULL | |
| correct | tinyint(1) | YES | | NULL | |
| timestamp | datetime | YES | | NULL | |
| flag | varchar(512) | YES | | NULL | |
+--------------+--------------+------+-----+---------+----------------+
What I've Tried
I've tried making the primary key of the table be KEY(user_id, challenge_id, correct). The problem with this is that there could be multiple submissions as long as correct is false.
What is one way to solve this issue?

If you don't need to record the incorrect submissions, don't.
If you do need the full history, it cannot be done by a UNIQUE key, as you observed.
Plan A: Add a TRIGGER that checks for inserting a second correct answer. Meanwhile, something else would need to be the PK.
Plan B: Have a table of correct submissions and a table of incorrect (or possibly all) submissions. Neither would necessarily need the correct column. And perhaps some other columns don't have to be in both tables. The PKs would be different.
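A minimal sketch of both plans, assuming the table is named submissions and that correct holds 1 for a correct answer. The trigger name, the error message, and the correct_submissions table are illustrative and not from the question. The trigger rejects any submission after a correct one, which matches the rule stated in the question and is slightly stricter than rejecting only a second correct answer. Two concurrent inserts could still both pass the check, so treat it as a starting point rather than a complete guarantee.

```sql
-- Plan A (sketch): reject new submissions once a correct one exists.
-- DELIMITER is a mysql client command; omit it when using an API.
DELIMITER //
CREATE TRIGGER submissions_block_after_correct
BEFORE INSERT ON submissions
FOR EACH ROW
BEGIN
    IF EXISTS (
        SELECT 1
        FROM submissions
        WHERE user_id = NEW.user_id
          AND challenge_id = NEW.challenge_id
          AND correct = 1
    ) THEN
        -- SIGNAL (MySQL 5.5+) aborts the INSERT with a custom error
        SIGNAL SQLSTATE '45000'
            SET MESSAGE_TEXT = 'Challenge already solved by this user';
    END IF;
END//
DELIMITER ;

-- Plan B (sketch): a separate table holding only correct submissions.
-- The composite primary key allows at most one row per user and challenge.
CREATE TABLE correct_submissions (
    user_id      INT NOT NULL,
    challenge_id INT NOT NULL,
    submitted_at DATETIME NOT NULL,
    flag         VARCHAR(512),
    PRIMARY KEY (user_id, challenge_id)
);
```

With Plan B, a second correct submission for the same pair fails with a duplicate-key error, and that check is enforced by the index itself rather than by trigger logic.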

Related

Should I use FOREIGN KEY or ADD INDEX in sql?

Today I saw a video lecture in which they set up the foreign key by using ADD INDEX on a table.
CASE 1 -
DESCRIPTION OF TABLE 1 : subjects
+-----------+------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| menu_name | int(11) | YES | | NULL | |
| position | int(3) | YES | | NULL | |
| visible | tinyint(1) | YES | | NULL | |
+-----------+------------+------+-----+---------+----------------+
DESCRIPTION OF TABLE 2 : pages
+------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| subject_id | int(11) | YES | | NULL | |
| menu_name | varchar(255) | YES | | NULL | |
| position | int(3) | YES | | NULL | |
| visible | tinyint(1) | YES | | NULL | |
| content | text | YES | | NULL | |
+------------+--------------+------+-----+---------+----------------+
So the column subject_id of table pages should store the id of table subjects.
Which one should I use, and why?
ALTER TABLE pages ADD INDEX fk_subject_id (subject_id);
OR
ALTER TABLE pages
ADD FOREIGN KEY (subject_id) REFERENCES subjects(id);
The video lecture uses ALTER TABLE pages ADD INDEX fk_subject_id (subject_id);.
CASE 2 -
Now please consider one more example.
Building on the details above, suppose I have 5 tables in total, including the pages table defined above.
All 5 tables have a subject_id column, which should store values from the id column of table subjects.
In this case, which one should I use, ADD INDEX or FOREIGN KEY, and why?
Q: Case 1 - Which one should I use?
A: I'd choose the FK, not the index, because the relationship between pages and subjects is many-to-one. An index on a column with many duplicate values would not help much, because in most cases only one index can be used to optimize a query, and subjects already has a primary index, so don't add another.
Note: make pages.subject_id NOT NULL if every page must belong to a subject.
Q: Case 2 - Which one should I use?
A: If the 5 tables are also many-to-one with subjects, I'd choose the FK, not the index, for the same reason as in case 1.
A FOREIGN KEY and an INDEX are different things. FOREIGN KEYs are used for data integrity: you cannot have a reference that points to nothing, and you cannot delete the "base" row without deleting the "linking" rows first (unless you use ON DELETE CASCADE).
Indexes are used to find the correct rows faster in SELECT and UPDATE queries. This has nothing to do with data integrity.
To answer your question: use a FOREIGN KEY if you want to reference the id of rows in the other table (as you do with subject_id). Also, you don't need to add an INDEX on the column subject_id yourself, because InnoDB creates one automatically for a foreign key column when no usable index exists.
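As a rough illustration of the point above, the following sketch adds the foreign key and then lists the indexes on pages. It assumes both tables use InnoDB and that pages.subject_id and subjects.id have the same type; the constraint name fk_pages_subject is made up for this example.

```sql
-- Add the foreign key; InnoDB adds an index on subject_id if none exists.
ALTER TABLE pages
    ADD CONSTRAINT fk_pages_subject
    FOREIGN KEY (subject_id) REFERENCES subjects (id);

-- Inspect the result: subject_id should now appear in the index list.
SHOW INDEX FROM pages;

-- Integrity check in action: this insert should fail if no subject
-- with id 999999 exists (the value is a hypothetical example).
INSERT INTO pages (subject_id, menu_name) VALUES (999999, 'orphan page');
```

The ADD INDEX statement from the lecture only gives you the lookup speed; it would accept the last INSERT without complaint.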

Does the foreign key slow down the join query?

I have two databases test & test2. Both have the same tables(employees & salaries) and both have the same records. test2 database uses a foreign key and test database doesn't.
test structure
test.employees
+--------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------+-------------+------+-----+---------+-------+
| emp_id | int(11) | NO | PRI | NULL | |
| name | varchar(30) | YES | | NULL | |
+--------+-------------+------+-----+---------+-------+
test.salaries
+--------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------+---------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| salary | int(11) | YES | | NULL | |
| emp_id | int(11) | NO | | NULL | |
+--------+---------+------+-----+---------+----------------+
test2 structure
test2.employees
+--------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------+-------------+------+-----+---------+-------+
| emp_id | int(11) | NO | PRI | NULL | |
| name | varchar(30) | YES | | NULL | |
+--------+-------------+------+-----+---------+-------+
test2.salaries
+--------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------+---------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| salary | int(11) | YES | | NULL | |
| emp_id | int(11) | NO | MUL | NULL | |
+--------+---------+------+-----+---------+----------------+
I run the same join query on both databases
select * from employees inner join salaries on employees.emp_id=salaries.emp_id;
This is the output I get from the test database, which doesn't contain a foreign key:
2844047 rows in set (3.25 sec)
This is the output I get from the test2 database, which contains a foreign key:
2844047 rows in set (17.21 sec)
So does the foreign key slow down the join query?
Your empirical evidence suggests that in at least one case it does. So, if we believe your numbers, the answer is clearly "yes" -- and I assume you have ruled out other potential causes such as locks on the table or resource competition (actually the difference is pretty big). I presume that you want to know why.
In most databases, declaring a foreign key is about relational integrity. It would have no effect on the optimization of queries. The join conditions in the query would redundantly cover the same information.
However, MySQL does a bit more when a foreign key is declared. A foreign key declaration automatically creates an index on the columns being used. This is not standard behavior -- I'm not even sure if any other database does this.
Normally, an index would benefit performance. In this case, the optimizer has more choices on how to approach the query. For whatever reason, it is using a substandard execution plan.
You should be able to look at the explain plans and see a difference. The issue is that the optimizer has chosen the wrong plan. I would say that this is uncommon and should not dissuade you from using proper foreign key declarations in your databases.
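One way to compare the plans, using the database and table names from the question (the aliases e and s are only for brevity):

```sql
-- Plan for the database without the foreign key
EXPLAIN
SELECT *
FROM test.employees AS e
INNER JOIN test.salaries AS s ON e.emp_id = s.emp_id;

-- Plan for the database with the foreign key (and its automatic index)
EXPLAIN
SELECT *
FROM test2.employees AS e
INNER JOIN test2.salaries AS s ON e.emp_id = s.emp_id;
```

Differences in the table order and in the type, key, and rows columns between the two outputs should show where the optimizer's choices diverge.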

How to simplify my SQL requests?

I have two tables here. One is Items and other is Parts.
Items have a part_id and Parts have an item_id.
When a user presses the submit button in the ItemDetail view, the data is sent to the server and inserted into those two tables.
Here is how my code works:
Insert into the Items table first and get the id of the new Item row
Insert into the Parts table with this item_id and the other Part data
Update the Items table with the new part_id
But can I write those three SQL requests as just one request?
Here is the structure of my tables:
Items
| Field | Type | Null | Key | Default | Extra |
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| name | varchar(255) | YES | | NULL | |
| price | int(11) | YES | | NULL | |
| part_id | int(10) unsigned | YES | | NULL | |
| type | varchar(255) | YES | | NULL | |
Parts
| Field | Type | Null | Key | Default | Extra |
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| item_id | int(10) unsigned | NO | | NULL | |
| name | varchar(255) | NO | | NULL | |
| number | varchar(255) | YES | | NULL | |
You shouldn't have 2 tables pointing to each other like this; only one of the tables should have a foreign key, not both.
Beyond that, what you are looking for is transactions: http://dev.mysql.com/doc/refman/5.7/en/commit.html
Transactions make sure that either all queries are executed, or if there is an error somewhere all changes will be reverted.
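A minimal sketch of the three statements from the question wrapped in one transaction. It assumes both tables use a transactional engine such as InnoDB; the literal values are placeholders.

```sql
START TRANSACTION;

-- 1. Insert the item and remember its generated id
INSERT INTO Items (name, price, type)
VALUES ('Example item', 100, 'example');
SET @item_id = LAST_INSERT_ID();

-- 2. Insert the part that belongs to that item
INSERT INTO Parts (item_id, name, number)
VALUES (@item_id, 'Example part', 'P-001');

-- 3. Point the item back at the new part
--    (LAST_INSERT_ID() now returns the id generated for Parts)
UPDATE Items SET part_id = LAST_INSERT_ID() WHERE id = @item_id;

COMMIT;
```

If any statement fails, the application can issue ROLLBACK instead of COMMIT and none of the three changes will be kept.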
Looking at the logic you are using, you are doing it correctly.
As they are two separate tables you will need to do two separate insert statements in SQL. Of course you can use a stored procedure so that you only need to call one item in your code and the SP will do two inserts.
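The stored procedure idea could look roughly like this. The procedure name and parameter names are hypothetical, and the sketch assumes InnoDB tables and MySQL 5.5 or later (for RESIGNAL).

```sql
-- DELIMITER is a mysql client command; omit it when using an API.
DELIMITER //
CREATE PROCEDURE add_item_with_part(
    IN p_item_name   VARCHAR(255),
    IN p_price       INT,
    IN p_type        VARCHAR(255),
    IN p_part_name   VARCHAR(255),
    IN p_part_number VARCHAR(255)
)
BEGIN
    DECLARE v_item_id INT UNSIGNED;

    -- On any SQL error, undo everything and pass the error to the caller
    DECLARE EXIT HANDLER FOR SQLEXCEPTION
    BEGIN
        ROLLBACK;
        RESIGNAL;
    END;

    START TRANSACTION;

    INSERT INTO Items (name, price, type)
    VALUES (p_item_name, p_price, p_type);
    SET v_item_id = LAST_INSERT_ID();

    INSERT INTO Parts (item_id, name, number)
    VALUES (v_item_id, p_part_name, p_part_number);

    UPDATE Items SET part_id = LAST_INSERT_ID() WHERE id = v_item_id;

    COMMIT;
END//
DELIMITER ;

-- The application then makes a single call:
CALL add_item_with_part('Example item', 100, 'example', 'Example part', 'P-001');
```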
A question here is what code are you using? If you are using something like Entity Framework and your relationships are defined between your elements such as
Items
-Field 1
-Parts (FK) List<Parts>
That would work, but looking at what you have tagged, I am guessing you're not using an ASP language? If you are, let me know and I may have a better solution for you.

In mysql can I have a composite primary key composed of an auto increment and another field? Also, please critique my "mysql partitioning" logic

I am experimenting with MySQL partitioning (splitting the table up to help it scale better), and I am having a problem with the keys on the table. First, I am using a Python threaded comments module. Here is the schema:
+-----------------+------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+------------------+------+-----+---------+-------+
| content_type_id | int(11) | NO | MUL | NULL | |
| object_id | int(10) unsigned | NO | | NULL | |
| parent_id | int(11) | YES | MUL | NULL | |
| user_id | int(11) | NO | MUL | NULL | |
| date_submitted | datetime | NO | | NULL | |
| date_modified | datetime | NO | | NULL | |
| date_approved | datetime | YES | | NULL | |
| comment | longtext | NO | | NULL | |
| markup | int(11) | YES | | NULL | |
| is_public | tinyint(1) | NO | | NULL | |
| is_approved | tinyint(1) | NO | | NULL | |
| ip_address | char(15) | YES | | NULL | |
| id | int(11) | YES | | NULL | |
+-----------------+------------------+------+-----+---------+-------+
Note, I have modified this table by dropping the id column (the primary key by default) and re-adding it.
Essentially, I want to have id AND content_type_id as my primary key. I also want id to auto increment. Is this possible?
Second question. Since I am just learning about mysql partitioning, I am wondering if my partitioning logic is sound. There are 67 different content_types, and some (maybe all) of those content types allow comments to be made on them. My plan is to partition based on the type of object that is being commented on. For instance, the images will be commented on a lot, so I put any content type pertaining to images into one partition, and another content type that can be commented on is "blog entries", so there is a separate partition for that, and so on and so on. This will allow me to spread these partitions possibly to dedicated machines as the load grows. How is my understanding of this concept so far?
Thanks so much!
Since id will be auto incremented, it can be the primary key all by itself. Adding content_type_id to the primary key would not gain you anything in regards to the uniqueness of the key.
If you want an index on the 2 columns for faster lookups, add a separate index on them instead of trying to put them both in the primary key. Be aware that enforcing uniqueness on the 2 columns would be a waste, since id is already guaranteed to be unique by itself, so a regular (non-unique) index would make more sense if one is needed.
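For reference, the composite key asked about in the question is possible, and partitioning is one case where it is needed: MySQL requires every unique key on a partitioned table, including the primary key, to contain all columns used in the partitioning expression. The sketch below uses a cut-down, hypothetical version of the comments table; the table name, partition names, and content_type_id values are made up.

```sql
CREATE TABLE comments_partitioned (
    id              INT NOT NULL AUTO_INCREMENT,
    content_type_id INT NOT NULL,
    object_id       INT UNSIGNED NOT NULL,
    comment         LONGTEXT NOT NULL,
    -- id comes first so the AUTO_INCREMENT column leads an index;
    -- content_type_id is included because it is the partitioning column
    PRIMARY KEY (id, content_type_id)
)
PARTITION BY LIST (content_type_id) (
    PARTITION p_images VALUES IN (1, 2, 3),
    PARTITION p_blog   VALUES IN (4),
    PARTITION p_other  VALUES IN (5, 6, 7)
);
```

With LIST partitioning, inserting a content_type_id that is not listed in any partition fails, so every value in use must be assigned somewhere. Also note that all partitions of a table live on the same MySQL server; placing data on separate machines is a different technique (sharding) that native partitioning does not provide.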

Audit logging for products data?

When staff change a product name, option name or price, the change should be recorded in a history log along with who made it.
items table:
item_id (PK)
item_name
item_description
Note: item prices are in the item_options table
item_options table:
option_id (PK)
item_id (FK)
option_name
option_price
An item can have 1 or more options.
If I want to change items.item_name, should it copy the current record to the history table, delete the current record from the items table, and then insert a new record with the new information into the items table?
What about item_options, how would that work? If there are multiple options for a specific item_id, does that mean I need to duplicate the options in the history table?
What should the audit logging/history tables look like for items and item_options?
Thanks
Your audit data should be stored per-table, rather than all in one place. What you'd do is create an audit table for each of the tables you want to track, and create triggers to create a record in the audit table for any data-manipulation operation on the audited table.
It's definitely advisable to disallow DELETE operations on the items and item_options tables - add flags like item_active and option_active so that you can soft-delete rows instead. This is normal practice in situations where you're doing things like storing invoices that reference products ordered in the past, and need the data for historical reporting purposes, but not for day-to-day use.
Your audit tables aren't something you should use for referencing old data. Your normal data model should support simply "hiding" old data where it's likely that it's still going to be used, and storing multiple versions of data that will change over time.
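A small sketch of the soft-delete idea, assuming an item_active flag and a modified_by column on the items table. The trigger name, the error message, and the literal values are illustrative, and SIGNAL needs MySQL 5.5 or later.

```sql
-- Soft delete: hide the item instead of removing the row
UPDATE items
SET item_active = 0,
    modified_by = 'example_user'
WHERE item_id = 42;

-- Optionally make hard deletes fail outright
CREATE TRIGGER trigger_items_block_delete
BEFORE DELETE ON items
FOR EACH ROW
    SIGNAL SQLSTATE '45000'
        SET MESSAGE_TEXT = 'Rows in items must be deactivated, not deleted';

-- Day-to-day queries then filter on the flag
SELECT item_id, item_name FROM items WHERE item_active = 1;
```

Revoking the DELETE privilege from the application's database user is another way to get the same effect without a trigger.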
For auditing, it's also useful to store the username of the last user to modify a given record - when used from a web application, you can't use MySQL's USER() function to get any useful information about who's logged on. Adding a column and populating it means you can use that information in your audit triggers.
NB: I'll assume that you won't allow item IDs to be changed under normal conditions - that would make your auditing system more complex.
If you add active flags, and last-modified-by data to your tables, they'll look something like:
Items table:
mysql> desc items;
+------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------------+--------------+------+-----+---------+----------------+
| item_id | int(11) | NO | PRI | NULL | auto_increment |
| item_name | varchar(100) | YES | | NULL | |
| item_description | text | YES | | NULL | |
| item_active | tinyint(4) | YES | | NULL | |
| modified_by | varchar(50) | YES | | NULL | |
+------------------+--------------+------+-----+---------+----------------+
Item options table:
mysql> desc item_options;
+---------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------------+------+-----+---------+----------------+
| option_id | int(11) | NO | PRI | NULL | auto_increment |
| item_id | int(11) | YES | MUL | NULL | |
| option_name | varchar(100) | YES | | NULL | |
| option_price | int(11) | YES | | NULL | |
| option_active | tinyint(4) | YES | | NULL | |
| modified_by | varchar(50) | YES | | NULL | |
+---------------+--------------+------+-----+---------+----------------+
Your audit tables need to store four extra pieces of information:
Audit ID - this ID is only unique for the history of this table, it's not a global value
Change made by - the database user who made the change
Change date/time
Action type - INSERT or UPDATE (or DELETE if you were allowing it)
Your audit tables should look something like:
Items audit table:
mysql> desc items_audit;
+------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------------+--------------+------+-----+---------+----------------+
| audit_id | int(11) | NO | PRI | NULL | auto_increment |
| item_id | int(11) | YES | | NULL | |
| item_name | varchar(100) | YES | | NULL | |
| item_description | text | YES | | NULL | |
| item_active | tinyint(4) | YES | | NULL | |
| modified_by | varchar(50) | YES | | NULL | |
| change_by | varchar(50) | YES | | NULL | |
| change_date | datetime | YES | | NULL | |
| action | varchar(10) | YES | | NULL | |
+------------------+--------------+------+-----+---------+----------------+
Item options audit table:
mysql> desc item_options_audit;
+---------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------------+------+-----+---------+----------------+
| audit_id | int(11) | NO | PRI | NULL | auto_increment |
| option_id | int(11) | YES | | NULL | |
| item_id | int(11) | YES | | NULL | |
| option_name | varchar(100) | YES | | NULL | |
| option_price | int(11) | YES | | NULL | |
| option_active | tinyint(4) | YES | | NULL | |
| modified_by | varchar(50) | YES | | NULL | |
| change_by | varchar(50) | YES | | NULL | |
| change_date | datetime | YES | | NULL | |
| action | varchar(10) | YES | | NULL | |
+---------------+--------------+------+-----+---------+----------------+
Don't use foreign keys on your audit tables; the rows in the audit tables aren't child rows of the records they're auditing, so foreign keys aren't of any use.
Triggers
NB: MySQL doesn't support multi-statement-type triggers, so you need one for each of INSERT, UPDATE and DELETE (if applicable).
Your triggers simply need to INSERT all the NEW values into the audit table. The trigger definitions for the items table might be:
/* DELIMITER is a mysql client command that lets the trigger body contain ";" */
DELIMITER //
/* Trigger for INSERT statements on the items table */
CREATE DEFINER=`root`@`localhost` TRIGGER trigger_items_insert_audit
AFTER INSERT ON items
FOR EACH ROW BEGIN
    INSERT INTO items_audit (
        item_id, item_name, item_description,
        item_active, modified_by, change_by,
        change_date, action
    ) VALUES (
        NEW.item_id, NEW.item_name, NEW.item_description,
        NEW.item_active, NEW.modified_by, USER(),
        NOW(), 'INSERT'
    );
END//
/* Trigger for UPDATE statements on the items table */
CREATE DEFINER=`root`@`localhost` TRIGGER trigger_items_update_audit
AFTER UPDATE ON items
FOR EACH ROW BEGIN
    INSERT INTO items_audit (
        item_id, item_name, item_description,
        item_active, modified_by, change_by,
        change_date, action
    ) VALUES (
        NEW.item_id, NEW.item_name, NEW.item_description,
        NEW.item_active, NEW.modified_by, USER(),
        NOW(), 'UPDATE'
    );
END//
DELIMITER ;
Create similar triggers for the item_options table.
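Following the same pattern, the item_options triggers might look like the sketch below. The trigger names are made up; the columns are the ones listed for item_options and item_options_audit above. Because each body is a single statement, no BEGIN ... END block or DELIMITER change is needed.

```sql
-- Audit trigger for INSERT statements on item_options
CREATE TRIGGER trigger_item_options_insert_audit
AFTER INSERT ON item_options
FOR EACH ROW
    INSERT INTO item_options_audit (
        option_id, item_id, option_name, option_price,
        option_active, modified_by, change_by,
        change_date, action
    ) VALUES (
        NEW.option_id, NEW.item_id, NEW.option_name, NEW.option_price,
        NEW.option_active, NEW.modified_by, USER(),
        NOW(), 'INSERT'
    );

-- Audit trigger for UPDATE statements on item_options
CREATE TRIGGER trigger_item_options_update_audit
AFTER UPDATE ON item_options
FOR EACH ROW
    INSERT INTO item_options_audit (
        option_id, item_id, option_name, option_price,
        option_active, modified_by, change_by,
        change_date, action
    ) VALUES (
        NEW.option_id, NEW.item_id, NEW.option_name, NEW.option_price,
        NEW.option_active, NEW.modified_by, USER(),
        NOW(), 'UPDATE'
    );
```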
Update: Data History In E-commerce
The auditing we did above will allow you to keep a history of any given database table, but creates a data store that isn't suitable for use for data that needs to be accessed regularly.
In an e-commerce system, keeping usable historical data is important, so that you can change attributes while still presenting old values in certain situations.
This should be completely separate from your auditing solution.
The best way to store history is to create a history table for each attribute that needs to be stored historically. This Stack Overflow question has some good information about keeping a history of a given attribute.
In your situation, if you're only concerned about price and title, you'd create a prices table, and an item_titles table. Each one would have a foreign key to either the item_options table or the items table (the master tables would still store the current price, or title), and would have the price or title, with its effective dates. These tables should have fine-grained (possibly column-based) permissions to avoid updating the effective_from dates, and the actual values once the record is inserted.
You should use the auditing solution above on these tables also.
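As a rough sketch of the prices table described above: the column names price_id, effective_from, and effective_to, the constraint name, and the sample values are assumptions for illustration. A NULL effective_to is used here to mean "still in effect".

```sql
CREATE TABLE prices (
    price_id       INT NOT NULL AUTO_INCREMENT,
    option_id      INT NOT NULL,
    price          INT NOT NULL,
    effective_from DATETIME NOT NULL,
    effective_to   DATETIME NULL,   -- NULL means the price is still current
    PRIMARY KEY (price_id),
    CONSTRAINT fk_prices_option
        FOREIGN KEY (option_id) REFERENCES item_options (option_id)
);

-- Look up the price that applied to option 1 at a given moment
SELECT price
FROM prices
WHERE option_id = 1
  AND effective_from <= '2015-06-01 00:00:00'
  AND (effective_to IS NULL OR effective_to > '2015-06-01 00:00:00');
```

An item_titles table would follow the same shape, with a foreign key to items and a title column in place of price.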
If you do not have a bunch of constraints, your data will get messed up in a hurry when you orphan the item entries by removing option entries, and vice versa.
What you are asking for can be done in triggers, but it is probably not what you want.
Imagine you have an item with 2 options.
Now you change the item name and that item gets deleted (and moved to history). You are left with unlinkable options. Is that what you intend?
What about orders or other things that reference the items? Same issue.
Instead, create trigger logic that only allows 'reasonable' edits to the item. If desired, put a copy of the record into a parallel history table, but DO NOT delete the original.
You may also consider adding a status column to the item, or some date ranges, to record whether the item is currently available or whatever other status you may need.