I am looking for the best way to prevent duplicate data in a table that is based on another data point.
Table: (a joining table between two entities, person and school)
CREATE TABLE with_school (
person_id INTEGER NOT NULL,
school_id INTEGER NOT NULL,
type_id INTEGER NOT NULL,
PRIMARY KEY (person_id,school_id)
);
person_id and school_id are also foreign keys, declared in a different statement.
What I want is something that prevents a person from adding the same school more than once.
Examples:
Row 1:
person_id = 1
school_id = 1
Row 2:
person_id = 1
school_id = 2
is okay, but:
Row 1:
person_id = 1
school_id = 1
Row 2:
person_id = 1
school_id = 1
is not.
What would be the easiest way to prevent these kinds of duplicates?
I have tried using a CHECK constraint, but I haven't been able to make it work the way I want it to:
ALTER TABLE with_school
ADD CHECK (
school_id != (
SELECT school_id
FROM with_school
WHERE person_id = person_id
)
);
(I cannot differentiate between the initial person_id and the one it is checking)
You could enforce uniqueness at the database level, like:
ALTER TABLE with_school ADD UNIQUE uniqueindex (person_id, school_id);
In addition, you could run a SELECT check at the application level before inserting, to make sure that pair of keys doesn't already exist.
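Worth noting that the composite PRIMARY KEY (person_id, school_id) in the question's table is itself a uniqueness constraint on that pair. Here is a minimal sketch of what that looks like from the application side. It uses Python's sqlite3 with an in-memory SQLite database instead of MySQL, purely so it runs standalone, and add_school is a hypothetical helper name, not something from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE with_school (
        person_id INTEGER NOT NULL,
        school_id INTEGER NOT NULL,
        type_id   INTEGER NOT NULL,
        PRIMARY KEY (person_id, school_id)
    )
    """
)


def add_school(person_id, school_id, type_id=1):
    """Insert one link row; return False if the pair already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO with_school (person_id, school_id, type_id) "
                "VALUES (?, ?, ?)",
                (person_id, school_id, type_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The database rejected the duplicate (person_id, school_id) pair.
        return False


results = [add_school(1, 1), add_school(1, 2), add_school(1, 1)]
row_count = conn.execute("SELECT COUNT(*) FROM with_school").fetchone()[0]
print(results, row_count)
```

Letting the database reject the row and catching the error avoids the race between a separate SELECT check and the INSERT.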
Context
I have an old database with relationships based on a string (the person's name), not the id. For instance, a person has many comments, joined by the column person_name on the comments table.
I would like to fix this by changing the column person_name into person_id.
Create the new relationship
It consists of creating the new person_id column on comments and updating its value:
UPDATE comments SET
person_id = (SELECT id FROM people WHERE LOWER(person_name) = name);
Drop the old column
I cannot just drop person_name and update the foreign keys. I need to ensure all comments are properly linked to their authors. By simply selecting all the comments that have a person_name but an empty person_id, I can raise the red flag, because this migration will be applied automatically on many tables.
SELECT 1 FROM comments WHERE person_id IS NULL AND person_name IS NOT NULL
Notice that some comments are anonymous, so person_name could be NULL.
Do the migration
To do this atomically I could do:
IF NOT EXISTS(SELECT 1 FROM comments WHERE person_id IS NULL AND person_name IS NOT NULL)
THEN
ALTER TABLE comments
DROP COLUMN person_name;
END IF;
Unfortunately this seems to work only with MSSQL, not MySQL.
What alternative can I use?
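One way to sketch the check-then-drop logic is to do the check from application code and only issue the DDL when it passes. This is a rough illustration under assumptions: it uses Python's sqlite3 with an in-memory SQLite database rather than MySQL, the helper names link_comments and count_unlinked are made up, and the sample rows are invented. The DROP COLUMN itself is left as a comment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        person_name TEXT,      -- NULL for anonymous comments
        person_id INTEGER
    );
    INSERT INTO people (name) VALUES ('alice');
    INSERT INTO comments (person_name) VALUES ('Alice'), ('Bob'), (NULL);
    """
)


def link_comments(conn):
    """Fill person_id from the lower-cased person_name, as in the question."""
    conn.execute(
        "UPDATE comments SET person_id = "
        "(SELECT people.id FROM people "
        " WHERE people.name = LOWER(comments.person_name))"
    )


def count_unlinked(conn):
    """Count named comments that still have no person_id (the red flag)."""
    return conn.execute(
        "SELECT COUNT(*) FROM comments "
        "WHERE person_id IS NULL AND person_name IS NOT NULL"
    ).fetchone()[0]


link_comments(conn)
unlinked_before = count_unlinked(conn)  # 'Bob' has no matching person yet

conn.execute("INSERT INTO people (name) VALUES ('bob')")
link_comments(conn)
unlinked_after = count_unlinked(conn)

# Only when the count is zero would the migration go on to run
# ALTER TABLE comments DROP COLUMN person_name (not executed in this sketch).
safe_to_drop = unlinked_after == 0
print(unlinked_before, unlinked_after, safe_to_drop)
```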
I created two database tables (PHP using XAMPP), one for employee (id, name) and another for administrator (id, name).
The id in each of the two tables is the primary key. I need to build a relation between the two tables so that ids don't repeat. For example, admin (1, a) uses id = 1, which should then not be used in the employee table.
Please help.
The normative approach to this problem is to use a single table. That makes it very easy to keep the id values distinct.
You can include a discriminator column that indicates whether a row represents an "employee" or an "administrator". In your example, there are two possible values.
CREATE TABLE employee
( id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT COMMENT 'pk'
, ename VARCHAR(50) NOT NULL
, admin TINYINT(1) UNSIGNED NOT NULL DEFAULT '0' COMMENT 'boolean'
)
Some example data, to illustrate:
id   ename            admin
---  ---------------  -----
42   Barney Rubble    0
43   Fred Flintstone  0
17   Mr. Slate        1
Sample queries:
-- select "employee" rows
SELECT id, ename FROM employee WHERE admin=0
-- select "administrator" rows
SELECT id, ename FROM employee WHERE admin
If you need two separate tables, as you asked about:
Bottom line is that there is no declarative constraint available in MySQL that will enforce the id values between the two tables to be "distinct" from one another.
To do that, you would have to "roll your own" solution. And that solution is not trivial, it can be rather involved.
There are some solutions to simpler problems, automatically generating unique id values. But to actually enforce uniqueness, there is no simple way to do that.
If your goal is just to enforce a constraint, such that INSERT and UPDATE statements throw an error if they attempt to violate it, you are going to need to write triggers.
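To give a feel for the trigger approach, here is a minimal sketch. It runs on SQLite through Python's sqlite3 rather than MySQL, so the trigger syntax is SQLite's (RAISE(ABORT, ...)); in MySQL the trigger body would typically use SIGNAL instead. The trigger names and the try_insert helper are invented for the example, and only INSERT is covered, so a real solution would need matching UPDATE triggers as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee      (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE administrator (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- Reject an employee id that an administrator already uses.
    CREATE TRIGGER employee_id_free BEFORE INSERT ON employee
    WHEN EXISTS (SELECT 1 FROM administrator WHERE id = NEW.id)
    BEGIN
        SELECT RAISE(ABORT, 'id already used by an administrator');
    END;

    -- Reject an administrator id that an employee already uses.
    CREATE TRIGGER administrator_id_free BEFORE INSERT ON administrator
    WHEN EXISTS (SELECT 1 FROM employee WHERE id = NEW.id)
    BEGIN
        SELECT RAISE(ABORT, 'id already used by an employee');
    END;
    """
)


def try_insert(table, row_id, name):
    """Return True if the insert succeeded, False if the database rejected it.

    The table name is interpolated directly, so only pass trusted constants.
    """
    try:
        with conn:
            conn.execute(
                f"INSERT INTO {table} (id, name) VALUES (?, ?)", (row_id, name)
            )
        return True
    except sqlite3.DatabaseError:
        return False


outcomes = [
    try_insert("administrator", 1, "a"),  # accepted
    try_insert("employee", 1, "b"),       # rejected: id 1 is an administrator
    try_insert("employee", 2, "b"),       # accepted
]
print(outcomes)
```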
Three tables: users, roles and a pivot table (many to many) role_user.
user:
- id
- name
role:
- id
- name
role_user
- id
- user_id: foreign key link to user
- role_id: foreign key link to role
If I wanted to limit the maximum number of roles a user can have to only 1, for example, I could put the role_id foreign key on the user as a role_id_1 field instead of using a many-to-many pivot table.
users:
- id
- name
- role_id_1
The same goes if I wanted only two roles per user.
users:
- id
- name
- role_id_1
- role_id_2
What if I wanted to limit the number to 1, 2 or something else while using a pivot table (not using foreign role links on the users table)? Is there an option for that in SQL?
Something like a composite unique index option including role_id and user_id in the pivot table, but instead of a constraint on the uniqueness, a custom constraint on the limit of the user_id number of appearances.
There is a way you can implement this in SQL without triggers. It is a bit complicated, but you could do it.
It starts by adding another table. Let me call it RoleNumbers. This table would consist of one row for each possible role for a user. So, you set it up with 1, 2, or however many roles you want.
Then for the junction table:
create table UserRoles (
UserRoleId int not null auto_increment primary key,
UserId int not null references users(user_id),
RoleId int not null references roles(role_id),
RoleNumber int not null references RoleNumbers(Number),
unique (UserId, RoleId),
unique (UserId, RoleNumber)
);
This uses my naming conventions. I have no problem with having a synthetic key on a junction table.
When you insert a new record, you would have to assign a value to RoleNumber that is not already being used. Hence, you get the limit. The most efficient way to do this is via triggers, but that is not strictly necessary. You could do an insert as:
insert into UserRoles(UserId, RoleId, RoleNumber)
select $UserId, $RoleId, coalesce(max(RoleNumber), 0) + 1
from UserRoles
where UserId = $UserId;
A delete would require a separate query to maintain the numbering scheme, since removing a lower RoleNumber leaves a gap that max() + 1 will not reuse.
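Here is a small sketch of how the limit plays out, assuming a maximum of two roles per user. It uses SQLite via Python's sqlite3 instead of MySQL so it can run standalone (foreign keys have to be switched on explicitly there), and add_role plus the sample user and role ids are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
conn.executescript(
    """
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE roles (role_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE RoleNumbers (Number INTEGER PRIMARY KEY);
    INSERT INTO RoleNumbers (Number) VALUES (1), (2);  -- at most two roles per user

    CREATE TABLE UserRoles (
        UserRoleId INTEGER PRIMARY KEY,
        UserId     INTEGER NOT NULL REFERENCES users(user_id),
        RoleId     INTEGER NOT NULL REFERENCES roles(role_id),
        RoleNumber INTEGER NOT NULL REFERENCES RoleNumbers(Number),
        UNIQUE (UserId, RoleId),
        UNIQUE (UserId, RoleNumber)
    );
    INSERT INTO users (user_id, name) VALUES (1, 'u1');
    INSERT INTO roles (role_id, name) VALUES (10, 'r10'), (11, 'r11'), (12, 'r12');
    """
)


def add_role(user_id, role_id):
    """Give the user the next free RoleNumber; False if a constraint refuses."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO UserRoles (UserId, RoleId, RoleNumber) "
                "SELECT ?, ?, COALESCE(MAX(RoleNumber), 0) + 1 "
                "FROM UserRoles WHERE UserId = ?",
                (user_id, role_id, user_id),
            )
        return True
    except sqlite3.IntegrityError:
        # Either the role is already assigned, or RoleNumber went past the
        # values present in RoleNumbers and the foreign key failed.
        return False


granted = [add_role(1, 10), add_role(1, 11), add_role(1, 12)]
print(granted)
```

The third insert computes RoleNumber 3, which has no row in RoleNumbers, so the foreign key is what enforces the cap.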
I have a select statement which should return only unique userIds. If it does not, and userIds appear more than once, a user has entered something illegal.
To illustrate, take a simple SELECT userId, name FROM USER. Usually you would make userId unique or the primary key at the table level. Just for the sake of the example, we don't.
The expected result would be:
userId name
---------------
1 Roel
2 Joe
3 John
But the result is something like
userId name
---------------
1 Roel
1 Roel
2 Joe
3 John
3 John
Is there a possibility to write the query in such a way that it gives an error when the result contains any userId more than once?
Just add DISTINCT. And it will make your rows unique.
SELECT DISTINCT userId, name
FROM USER
By definition, the DISTINCT keyword can be used to return only distinct (different) values.
UPDATE 1
The reason why is that you didn't specify a constraint on your table. Make a table definition like this.
CREATE TABLE userList
(
ID INT NOT NULL AUTO_INCREMENT,
NAME VARCHAR(50) NOT NULL,
CONSTRAINT id_PK PRIMARY KEY (ID),
CONSTRAINT name_unique UNIQUE (NAME)
)
If you don't want ID to be auto-incremented, you can remove the AUTO_INCREMENT keyword from the table definition, or create a table definition like this:
CREATE TABLE userList
(
ID INT NOT NULL,
NAME VARCHAR(50) NOT NULL,
CONSTRAINT id_PK PRIMARY KEY (ID),
CONSTRAINT name_unique UNIQUE (ID, NAME)
)
SELECT
UserId, COUNT(*)
FROM
User
GROUP BY
UserId
HAVING
COUNT(*) > 1
Any records returned from this will be those for which there is erroneous data. That would be the simplest way to identify when to raise an error, but it wouldn't simply raise one for you. You could wrap the above query into a procedure, and use some logic to determine whether to raise an error or run the main query.
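If a stored procedure is more than you need, the same "check, then either raise or run the main query" logic can live in application code. A minimal sketch, assuming SQLite through Python's sqlite3 in place of MySQL; the DuplicateUserIds exception and the fetch_users_strict function are hypothetical names introduced here.

```python
import sqlite3


class DuplicateUserIds(Exception):
    """Raised when the User table holds the same UserId more than once."""


def fetch_users_strict(conn):
    """Return all users, or raise if any UserId occurs more than once."""
    dupes = conn.execute(
        "SELECT UserId, COUNT(*) FROM User "
        "GROUP BY UserId HAVING COUNT(*) > 1 ORDER BY UserId"
    ).fetchall()
    if dupes:
        ids = [row[0] for row in dupes]
        raise DuplicateUserIds(f"duplicate UserId values: {ids}")
    return conn.execute("SELECT UserId, name FROM User ORDER BY UserId").fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE User (UserId INTEGER, name TEXT)")  # deliberately no key
conn.executemany(
    "INSERT INTO User VALUES (?, ?)",
    [(1, "Roel"), (1, "Roel"), (2, "Joe"), (3, "John"), (3, "John")],
)

try:
    fetch_users_strict(conn)
    error = None
except DuplicateUserIds as exc:
    error = str(exc)
print(error)
```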
Well, I guess this is important during registration, so just count() the hits for a given username before you allow an INSERT.
Secondly, make the name column UNIQUE, and then you get the right error upon INSERT.
Another solution is
SELECT *
FROM User
UNION
SELECT *
FROM User
The advantage of this is that you don't have to list the columns in the SELECT. Usually it is bad practice not to write the column names explicitly, but I think this is one of the rare cases where it makes sense.
I want to know if I can repopulate the autoincrement value in mysql.
I have records that look like this:
ID Name
1 POP
3 OLO
12 lku
Basically, I want a way to update the IDs to this:
ID Name
1 POP
2 OLO
3 lku
Is there any way to do this in mysql?
Thanks.
It's not best practice to fiddle your primary keys - better to let your DB handle it itself. There can be issues if, in between the UPDATE and ALTER, another record is added. Because of this, you must LOCK the table, which might hang other queries and spike load on a busy production server.
LOCK TABLES your_table WRITE;
UPDATE your_table SET id=2 WHERE id=3;  -- free up id 3 first
UPDATE your_table SET id=3 WHERE id=12;
ALTER TABLE your_table AUTO_INCREMENT=4;
UNLOCK TABLES;
OR - for thousands of rows (with no foreign key dependencies):
CREATE TEMPORARY TABLE nameTemp( name varchar(128) not null );
INSERT INTO nameTemp SELECT name FROM firstTable ORDER BY id;
TRUNCATE firstTable;
INSERT INTO firstTable (name) SELECT name FROM nameTemp;
The latter method will only work where you have no foreign keys. If you do, you'll require a lookup table.
CREATE TEMPORARY TABLE lookup( newId INTEGER AUTO_INCREMENT, oldId INTEGER, PRIMARY KEY (newId) );
INSERT INTO lookup (oldId) SELECT id FROM firstTable ORDER BY id;
[do temp table queries above]
You now have a lookup table with the old to new ids which you can use to update the foreign key dependencies on your other tables (on a test server!)
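For illustration, here is roughly how the lookup table could be used end to end. This is a sketch under assumptions: it runs on SQLite via Python's sqlite3 rather than MySQL, the child table and its first_id column are hypothetical stand-ins for "your other tables", and the rebuilt rows are inserted with their explicit new ids instead of relying on AUTO_INCREMENT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE firstTable (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Hypothetical dependent table holding ids from firstTable.
    CREATE TABLE child (id INTEGER PRIMARY KEY, first_id INTEGER NOT NULL);
    INSERT INTO firstTable (id, name) VALUES (1, 'POP'), (3, 'OLO'), (12, 'lku');
    INSERT INTO child (first_id) VALUES (3), (12), (12);

    -- newId is handed out in insertion order, so ORDER BY id keeps the old order.
    CREATE TEMPORARY TABLE lookup (newId INTEGER PRIMARY KEY, oldId INTEGER NOT NULL);
    INSERT INTO lookup (oldId) SELECT id FROM firstTable ORDER BY id;

    -- Remap the dependent rows in one statement, straight from old id to new id.
    UPDATE child
    SET first_id = (SELECT newId FROM lookup WHERE oldId = child.first_id);

    -- Rebuild firstTable with the new ids via a temporary copy.
    CREATE TEMPORARY TABLE nameTemp AS
        SELECT l.newId AS id, f.name AS name
        FROM firstTable AS f JOIN lookup AS l ON l.oldId = f.id;
    DELETE FROM firstTable;
    INSERT INTO firstTable (id, name) SELECT id, name FROM nameTemp;
    """
)

parents = conn.execute("SELECT id, name FROM firstTable ORDER BY id").fetchall()
children = conn.execute("SELECT first_id FROM child ORDER BY id").fetchall()
print(parents, children)
```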
Changing the primary key is a very bad idea as it endangers your referential integrity (what if another table uses the id without having a foreign key with a proper ON UPDATE rule?).
If you really, really have to do it and don't care about bad side-effects:
Create a second table with identical structure
INSERT INTO new_table (id, [other fields]) SELECT NULL, [other fields] FROM old_table;
DROP TABLE old_table;
RENAME TABLE new_table TO old_table;
Warning:
This will damage every other table that has foreign keys on this table (but if you had such then you wouldn't be doing this anyways).
You may want to try something like...
CREATE TEMPORARY TABLE MyBackup
( ID INT NOT NULL AUTO_INCREMENT PRIMARY KEY,  -- new sequential ID
  OldID INT NOT NULL,                          -- original ID, kept for backlinking/retention
  RestOfFields VARCHAR(128)                    -- placeholder: your remaining columns and their types
);
INSERT INTO MyBackup
  ( OldID,
    RestOfFields )
SELECT
  ID AS OldID,
  RestOfFields
FROM
  YourOriginalTable
ORDER BY
  ID;  -- this is your original ID
Then you'll have a new table with an auto-increment column holding the newly assigned IDs, plus a full copy of the original IDs. You can then do correlated updates against other tables, setting their ID to the new ID wherever it currently equals OldID. Doing the insert ordered by the original ID keeps the renumbering from colliding out of sequence. For example, if your table was ordered as
Old ID = 3, new ID = 1
Old ID = 1, new ID = 3
Old ID = 12, new ID = 2
your old 3's would become 1's, then the 1's (now including the former 3's) would become 3's, and the 12's would become 2's. Ordered by the original ID instead:
Old ID = 1, new ID = 1
Old ID = 3, new ID = 2
Old ID = 12, new ID = 3
your 3's won't overwrite a higher number, and the 12's won't conflict with the 3's, since the 3's were already lowered to 2's.
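The ordering argument is easy to check with a tiny simulation. The sketch below is plain Python, not SQL: renumber is a made-up helper that mimics running one "set ID = new where ID = old" update per pair, in the order given, against a list of foreign-key values.

```python
def renumber(child_ids, mapping_in_order):
    """Apply 'SET id = new WHERE id = old' one pair at a time, in order."""
    ids = list(child_ids)
    for old, new in mapping_in_order:
        ids = [new if value == old else value for value in ids]
    return ids


child_ids = [1, 3, 12]  # foreign-key values pointing at the original IDs

# Mapping from an unordered insert: 3 -> 1, then 1 -> 3, then 12 -> 2.
unordered = renumber(child_ids, [(3, 1), (1, 3), (12, 2)])

# Mapping from an insert ordered by the original ID: 1 -> 1, 3 -> 2, 12 -> 3.
ordered = renumber(child_ids, [(1, 1), (3, 2), (12, 3)])

print(unordered, ordered)
```

In the unordered run the rows that pointed at 1 and at 3 end up with the same value, which is the collision described above; in the ordered run each value is only ever lowered into a slot that has already been vacated.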