Check for duplicates across two columns sql - mysql

I am currently designing a MySQL database that contains emails, and I am trying to find the best way to store them. I have read in articles and some Stack Overflow posts that it is a good idea to store the local part of an email address separately from the domain, as many emails use the same domain (e.g. gmail.com). Currently I have a table named emails that contains an id, local_email, and domain_id, the domain_id being a foreign key to a table containing email domains. According to what I found, this is the best way to set up the database, as it minimises the storage used by repeated email domains.
So far this seems to work very well. The one problem I have is that when I want to add a new email, I am not sure of the best way to ensure that a duplicate is not added. Normally I would use the UNIQUE constraint, but since the local part and domain of the email are split into two different columns, I am unable to do that. So my question is: is there a way to check on the database side whether an exact email already exists, or do I have to do that at the application level? And if I do, will I not have a problem with a race condition? (I know that this is unlikely, but I would still prefer not to introduce bugs like that.)
I am fairly new to database design so any help is welcome. Thank you.

The Impaler gave you a really good idea by using CONSTRAINT uq1 UNIQUE (local, domain). Apart from that, we can also use a trigger to check for duplicate values, which allows customizations such as returning an error message of your choice. Note that, unlike the UNIQUE constraint, the trigger check on its own does not protect against two concurrent transactions inserting the same address, since neither can see the other's uncommitted row. Assume we have four entries in the domains table, created using insert into domains (domain_id,domain_name) values(1,'gmail.com'),(2,'yahoo.com'),(3,'yahoo.co.nz'),(4,'msn.com'); and that the emails table is empty at the moment. We then create a trigger which checks the uniqueness of new rows before they are inserted. If uniqueness is violated, an error with a message is signaled.
drop trigger if exists check_duplicate;
delimiter //
create trigger check_duplicate before insert on emails for each row
begin
-- reject the row if the same local part already exists for this domain
if exists (select 1 from emails where local_email=new.local_email and domain_id=new.domain_id) then
signal SQLSTATE VALUE '99999' SET MESSAGE_TEXT = 'duplicate address detected';
end if;
end//
delimiter ;
After creating the trigger, we move on to test against it using the following INSERT statements:
insert into emails (local_email,domain_id) values ('john',2); -- successful
insert into emails (local_email,domain_id) values ('marry',1); -- successful
insert into emails (local_email,domain_id) values ('jack_frost',2); -- successful
insert into emails (local_email,domain_id) values ('jack_frost',1); -- successful
insert into emails (local_email,domain_id) values ('jack_frost',2); -- error message 'duplicate address detected'
insert into emails (local_email,domain_id) values ('jack_frost',3); -- successful
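For comparison, the composite UNIQUE constraint mentioned at the top of this answer could be sketched as follows. The column names follow the emails table from the question; the constraint name uq_emails_local_domain is made up for illustration. The sketch drops the trigger first so that the error shown comes from the constraint. Because the constraint is enforced by the index itself, it also covers the race condition the question asks about.

```sql
-- Sketch only: assumes the emails table from the question and that it
-- currently holds no duplicate (local_email, domain_id) pairs.
DROP TRIGGER IF EXISTS check_duplicate;

-- uq_emails_local_domain is a hypothetical name.
ALTER TABLE emails
  ADD CONSTRAINT uq_emails_local_domain UNIQUE (local_email, domain_id);

-- The same local part under a different domain is still accepted.
INSERT INTO emails (local_email, domain_id) VALUES ('alice', 1);
INSERT INTO emails (local_email, domain_id) VALUES ('alice', 2);

-- Repeating an existing pair is rejected with a duplicate-entry
-- error (MySQL error 1062, SQLSTATE 23000).
INSERT INTO emails (local_email, domain_id) VALUES ('alice', 2);
```

The application can catch error 1062 and show its own message, which gives much the same customization as the trigger.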

Related

MySQL merge data on duplicate key

I'm building a dashboard now and I created two tables for the frontend to select from:
DATA_SELECTED_HISTORY
DATA_SELECTED_NOW
My frontend page gets data from DATA_SELECTED_NOW and my backend algorithm puts new data into this database.
I want to put my new data into DATA_SELECTED_NOW,
and have the former data pushed to DATA_SELECTED_HISTORY when a duplicate key is encountered.
I think I could use a swap table solution or an insert (select subquery) + insert on duplicate key solution, but I am stuck.
How can I implement this in SQL?
You can use a trigger in this case to check for duplication before inserting into DATA_SELECTED_NOW, and insert into DATA_SELECTED_HISTORY if the row is a duplicate. Check the code below:
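DELIMITER $$
CREATE TRIGGER TRIGGER_Name
BEFORE INSERT ON DATA_SELECTED_NOW
FOR EACH ROW
BEGIN
-- `key` is a reserved word in MySQL, so it must be quoted with backticks;
-- you can replace `key` = NEW.`key` with your own logic to check
IF EXISTS (SELECT 1 FROM DATA_SELECTED_NOW WHERE `key` = NEW.`key`) THEN
-- copy the existing row into DATA_SELECTED_HISTORY
-- (assumes both tables have the same columns in the same order)
INSERT INTO DATA_SELECTED_HISTORY SELECT * FROM DATA_SELECTED_NOW WHERE `key` = NEW.`key`;
END IF;
END$$
DELIMITER ;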
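Instead of the trigger, the "insert (select subquery) + insert on duplicate key" idea from the question could look roughly like the sketch below. The column names `key` and payload and the literal values are hypothetical, since the question does not show the table definitions. It also assumes that `key` is the primary or a unique key of DATA_SELECTED_NOW and that DATA_SELECTED_HISTORY has no unique key on it.

```sql
-- Sketch only: `key` and payload are assumed column names.
START TRANSACTION;

-- 1. Archive the current row, if there is one, before it is overwritten.
INSERT INTO DATA_SELECTED_HISTORY (`key`, payload)
SELECT `key`, payload
FROM DATA_SELECTED_NOW
WHERE `key` = 42;

-- 2. Insert the new row, or overwrite the existing one on a duplicate key.
--    VALUES(payload) refers to the value from the INSERT part; it is
--    deprecated in recent MySQL 8.0 releases in favour of a row alias.
INSERT INTO DATA_SELECTED_NOW (`key`, payload)
VALUES (42, 'new data')
ON DUPLICATE KEY UPDATE payload = VALUES(payload);

COMMIT;
```

Wrapping both statements in one transaction keeps the history row and the new row from being written separately if one of the statements fails.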

mysql/mariadb, is it a bad idea to check if row exists by trying to insert a new row and checking for errors? [duplicate]

I found a not so nice method for checking whether a row already exists and, if it does, not adding it, to avoid duplicates. Am I completely crazy to rely on this method, or should I go the old-fashioned way and check whether it exists BEFORE trying to insert the row into the database?
The table is VERY simple :)
-ID [PK]
-Message
-Hashed_message [UNIQUE] (stored procedure, takes message and hashes it upon insert)
Now when I try to insert a new row I would say
insert into .... message = xxx
Upon insertion MySQL will create a hash of the message automatically, but since it's a unique column, in case the hash already exists in the database, it will simply throw an error, and no duplicates will ever exist... I hope.
The reason for using hashes is simply to avoid checking for duplicates by scanning every large message; instead I thought a short hash would be easier to check for duplicates.
So is this method bad for avoiding duplicates?
I mean, I could manually create the hash of my message before the insert, check whether that hash exists and THEN insert the message, but I would hope to avoid having to replicate the stored procedure's hash function in PHP as well.
Quick note: there is a similar thread about inserting and then ignoring the error on a duplicate, but this one is about how it is handled when a derived column (stored procedure) is used to accomplish this.
If the hashed message has to be unique, create a key on that column with the UNIQUE constraint, so there won't be two rows with the same hash.
Then, when you insert a new row, modify your query as follows:
INSERT INTO table SET message='$message', hashed_message='$hashed_message'
ON DUPLICATE KEY UPDATE id=id;
This will perform an insert if the hashed_message is unique. Otherwise it will not change anything.
If you want to update something in case of duplicate your query will become:
INSERT INTO table SET message='$message', hashed_message='$hashed_message'
ON DUPLICATE KEY UPDATE message='$updated_message'
just to make an example.
Note that this method won't raise any exception in case of duplicate values: you need extra logic if you need to perform actions in your frontend in case of duplicates (i.e. message shown to the user).
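To tie this back to the question's idea of letting the database compute the hash, here is a minimal sketch that uses a stored generated column instead of a stored procedure. This is a swapped-in technique, not what the question describes, and it assumes MySQL 5.7 or later. The table name messages and the choice of SHA2 with 256 bits are assumptions for illustration.

```sql
-- Sketch only: table name and hash function are assumed, not from the question.
CREATE TABLE messages (
  id INT AUTO_INCREMENT PRIMARY KEY,
  message TEXT NOT NULL,
  -- computed by MySQL on every insert, and kept unique by an index
  hashed_message CHAR(64) AS (SHA2(message, 256)) STORED UNIQUE
);

-- First insert creates the row.
INSERT INTO messages (message) VALUES ('hello');

-- Second insert hits the unique hash and is turned into a no-op update.
INSERT INTO messages (message) VALUES ('hello')
ON DUPLICATE KEY UPDATE id = id;

-- ROW_COUNT() reports the rows affected by the previous statement, which
-- lets the caller tell a real insert from a skipped duplicate without
-- relying on an exception.
SELECT ROW_COUNT();
```

With this layout the application never computes the hash itself, so there is no PHP copy of the hash logic to keep in sync.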

SQL Trigger not on rows but on attributes

Hey guys a little question for you.
I'm currently working on SQL triggers and my goal is to achieve logging of changes made to our database. For example, we have some tables like customers with: name, firstname, placeofbirth and so on. We let users update their own data and want to save the OLD data in a new table for logging reasons. To have only one logging table for all updates, the logging table is kind of generic, with:
id, timestamp, table_name, column, old_value, new_value.
table_name is the updated table, column is the updated column in this table, and all the rest should speak for itself. Therefore it would be great to know not only in which tuple but also in which particular column the update has happened.
My question: Is there a construct like:
create trigger logging_trigger on customer.firstname after insert ...
to trigger an action only if an update happened on, let's say, the 'firstname' column?
If not, is there a smooth solution for handling all possible update cases?
Thank you.
I use a format like the one you described in my system. Below is how I would accomplish it with your required logic.
CREATE DEFINER = CURRENT_USER TRIGGER `testing_schema`.`new_table_BEFORE_UPDATE` BEFORE UPDATE ON `new_table` FOR EACH ROW
BEGIN
-- <=> is the NULL-safe comparison, so a change to or from NULL is also detected
IF NOT (NEW.ColumnName <=> OLD.ColumnName) THEN
INSERT INTO HistoryTable (`ColumnName1`, `ColumnName2`, etc...) VALUES (OLD.ColumnName1, OLD.ColumnName2, etc...);
END IF;
END
The main difference in mine is that I do not have an IF condition. I simply copy the entire row to the history table every time an update or delete is made to that row. That way I don't have to maintain any logic for investigating "what changed"; I just save the entire row because I know "something" changed.
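MySQL triggers are defined per table rather than per column, so the per-column logging asked for in the question has to be written as one comparison per column inside a single trigger. A minimal sketch, assuming a log table named change_log (a made-up name) with the columns listed in the question, and a customers table with firstname and name columns:

```sql
-- Sketch only: change_log and the trigger name are hypothetical.
DELIMITER //
CREATE TRIGGER customers_log_update
AFTER UPDATE ON customers
FOR EACH ROW
BEGIN
  -- NOT (a <=> b) is true when the values differ, including NULL changes.
  IF NOT (NEW.firstname <=> OLD.firstname) THEN
    INSERT INTO change_log (`timestamp`, table_name, `column`, old_value, new_value)
    VALUES (NOW(), 'customers', 'firstname', OLD.firstname, NEW.firstname);
  END IF;

  IF NOT (NEW.name <=> OLD.name) THEN
    INSERT INTO change_log (`timestamp`, table_name, `column`, old_value, new_value)
    VALUES (NOW(), 'customers', 'name', OLD.name, NEW.name);
  END IF;
END//
DELIMITER ;
```

The backticks around `column` are needed because COLUMN is a reserved word in MySQL. One IF block per logged column is repetitive, which is the maintenance cost the whole-row approach above avoids.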

What is proper way to set and compare variable inside an sql trigger

I am populating a table using a trigger after an insert event occurs on another table, and that worked fine. However, I then noticed that the trigger would still insert a new row for existing records. To fix this, I want to create the trigger again, but this time it should only fire if a condition is met. Not having used triggers before, I am getting a syntax error and am not able to identify what I am doing wrong. Kindly have a look and help me fix this.
CREATE TRIGGER `students_gen_insert`
AFTER INSERT ON `students` FOR EACH ROW
BEGIN
INSERT INTO records (student_id, subject_id)
SELECT new.student_id, subjects.subject_id
FROM subjects
WHERE category = new.class;
END;
I am currently using MySQL version 5.6.17.
It is generally not a good idea to SELECT from the table the trigger is on, and forbidden to UPDATE or INSERT (not that you are doing those). Assuming you are trying to get the values for the row just inserted, the first SET ... SELECT you have is needless; just use NEW.fieldname to get the fields of the inserted row.
The second SET ... SELECT and following condition are a bit confusing. If referential integrity is being maintained, I would think it would be impossible for the records table to refer to that particular student_id of the students table at the point the trigger is executed. Perhaps this was to avoid the duplicate inserts from the trigger's previous code? If so, it might help for you to post that so we can pinpoint the actual source of redundant inserts.
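If the goal is only to stop the trigger from adding rows that already exist in records, one possible approach is to let a unique key reject them and use INSERT IGNORE inside the trigger. This is a sketch under assumptions: the key name uq_records_student_subject is made up, and the ALTER TABLE will fail if records already contains duplicate pairs, which would have to be cleaned up first.

```sql
-- Sketch only: the unique key is an assumption, not part of the question.
ALTER TABLE records
  ADD UNIQUE KEY uq_records_student_subject (student_id, subject_id);

DROP TRIGGER IF EXISTS students_gen_insert;

DELIMITER //
CREATE TRIGGER students_gen_insert
AFTER INSERT ON students
FOR EACH ROW
BEGIN
  -- Rows whose (student_id, subject_id) pair already exists are skipped
  -- instead of raising a duplicate-key error.
  INSERT IGNORE INTO records (student_id, subject_id)
  SELECT NEW.student_id, subjects.subject_id
  FROM subjects
  WHERE subjects.category = NEW.class;
END//
DELIMITER ;
```

The DELIMITER lines matter in the mysql command-line client: without them the first semicolon inside the body ends the CREATE TRIGGER statement early, which is a common source of the kind of syntax error described in the question.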

MySQL - Split up INSERT in to 2 queries maybe

I have an INSERT query which looks like:
$db->Query("INSERT INTO `surfed` (user, site) VALUES('".$data['id']."', '".$id."')");
Basically I want to insert just like the above query, but if the site has already been submitted by another user I don't want to re-submit the same $id into the site column. Multiple users can view the same site, and all users need to be in the same row as the site that they have viewed, which causes the surfed table to receive tens of thousands of inserts and dramatically slows down the site.
Is there any way to split up the insert so that if a site has already been submitted it won't be submitted again for another user? Maybe there's a way to use UPDATE so that there isn't an overload of inserts?
Thanks,
I guess the easiest way to do it would be setting up a stored procedure which executes a SELECT to check if the user-site-combination is already in the table. If not, you execute the insert statement. If that combination already exist, you're done and don't execute the insert.
Check out the manual on stored procedures
http://dev.mysql.com/doc/refman/5.1/en/create-procedure.html
You need a conditional statement that checks whether the id already exists: if it does, update; otherwise, insert.
If you don't need to know whether you actually inserted a row, you can use INSERT IGNORE ....
$db->Query("INSERT IGNORE INTO `surfed` (user, site) VALUES('".$data['id']."', '".$id."')");
But this assumes that you have a unique key defined for the columns.
IGNORE here will ignore the Integrity constraint violation error triggered by attempting to add the same unique key twice.
The MySQL Reference Manual on the INSERT syntax has some information on that: http://dev.mysql.com/doc/refman/5.5/en/insert.html
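The unique key that INSERT IGNORE relies on could be defined as in the sketch below. The key name uq_surfed_user_site and the literal ids are made up for illustration, and the ALTER TABLE assumes the table holds no duplicate (user, site) pairs yet.

```sql
-- Sketch only: key name and values are hypothetical.
ALTER TABLE surfed
  ADD UNIQUE KEY uq_surfed_user_site (`user`, `site`);

-- First visit of user 7 to site 123 is stored.
INSERT IGNORE INTO surfed (`user`, `site`) VALUES (7, 123);

-- The repeat visit is silently skipped; MySQL records a warning
-- instead of raising an error.
INSERT IGNORE INTO surfed (`user`, `site`) VALUES (7, 123);

-- A different user viewing the same site still gets a row of their own.
INSERT IGNORE INTO surfed (`user`, `site`) VALUES (8, 123);
```

Building the statement by string concatenation, as in the PHP line above, is open to SQL injection if the ids come from user input, so a prepared statement would be the safer way to send these values.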