Delete Duplicates in MYSQL using a text column for comparison - mysql

We've got a database table that needs a multiple-column unique key. However, one of those columns is TEXT and holds values as long as 1000 chars (so varchar won't work either). Because of the TEXT column, I can't actually have a unique key on those columns. What's a good way to remove duplicates? Of course, fast would be nice.

The best way is to use a UNIQUE INDEX to avoid duplicates.
Creating a new unique key over the columns that need to be unique, with ALTER IGNORE, will automatically clean the table of any duplicates.
ALTER IGNORE TABLE `table_name`
ADD UNIQUE KEY `key_name`(`column_1`,`column_2`);
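With IGNORE, the statement does not abort on the first duplicate-key error. Instead, only the first row of each set of duplicates is kept and the others are deleted. Note that ALTER IGNORE TABLE was removed in MySQL 5.7.4, so this only works on older versions. Also, a TEXT column can only be part of a key if it is given a prefix length, e.g. `column_2`(255), and uniqueness is then enforced on that prefix only.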
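On servers where ALTER IGNORE is not available, duplicates can be removed with a multi-table DELETE that joins the table to itself. This is only a sketch: it assumes the table has an auto-increment primary key named `id` (not mentioned in the question) and keeps the row with the lowest `id` in each group of duplicates.

```sql
-- Assumption: `id` is a unique, auto-increment column on `table_name`.
-- For every pair of rows with equal (column_1, column_2),
-- delete the one with the higher id, so the oldest row survives.
DELETE t1
FROM `table_name` AS t1
JOIN `table_name` AS t2
  ON  t1.`column_1` = t2.`column_1`
  AND t1.`column_2` = t2.`column_2`   -- TEXT columns can be compared with =
  AND t1.`id` > t2.`id`;
```

Without a usable index on the compared columns this join can be slow on large tables, so it is worth trying on a copy first.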

Add a unique constraint as below:
ALTER IGNORE TABLE table1
ADD UNIQUE unique_name(column1, column2, column3 ... Text);
Here IGNORE will help in removing the duplicates while creating the constraint.
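Because a TEXT column cannot be indexed without a prefix length, one common workaround is to store a hash of the text and put the unique key on the hash instead. The following is a rough sketch, not a tested migration: `text_col`, `text_hash` and `uq_table1_hash` are made-up names, and it assumes duplicates have already been removed before the key is added.

```sql
-- Hypothetical column holding the MD5 of the TEXT column (32 hex characters).
ALTER TABLE table1 ADD COLUMN text_hash CHAR(32) NULL;

-- Fill the hash for existing rows.
UPDATE table1 SET text_hash = MD5(text_col);

-- Enforce uniqueness on the short columns plus the hash.
-- This fails with a duplicate-entry error if duplicates are still present.
ALTER TABLE table1
ADD UNIQUE KEY uq_table1_hash (column1, text_hash);
```

New rows need the hash filled in as well, either by the application or by a trigger. Two different texts could in theory share an MD5 value, which the key would then reject as a duplicate.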

Related

MySQL unique constraint that checks only newly inserted records, not already existing records

The table already has duplicate entries. I want to create a unique constraint in a MySQL DB without deleting the existing duplicates. If any duplicate entries arrive from then on, it should raise an error. The queries given below do not work in MySQL.
ALTER TABLE presence
ADD CONSTRAINT present uniqueness UNIQUE (employee_id,roll_number) where id >10000;
or
ALTER TABLE presence
ADD CONSTRAINT present uniqueness UNIQUE (employee_id,roll_number) where id <> (343,34534,34534)
Is there something like this in SQL?
Add an additional column to the table that indicates the existing values.
Set it to NULL for the existing values. And give it a constant value, say 1, for the new rows. Then create a unique index or constraint on this column:
alter table t add constraint unique (employee_id, is_old)
Actually, I realize that you probably don't want a new row to duplicate an old value that occurs only once. That is just a matter of setting the value to NULL only for the duplicates in the history, so that one row per value keeps the constant value (say 1) in the historical data.
MySQL allows multiple NULL values in a unique index, which is why this works.
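The steps above could look roughly like this. It is a sketch under assumptions: the constraint name `uq_presence` is made up, and `roll_number` is included in the key because the question wants uniqueness on (employee_id, roll_number), whereas the statement above lists only employee_id.

```sql
-- Add the marker column; existing rows get NULL.
ALTER TABLE presence ADD COLUMN is_old TINYINT NULL DEFAULT NULL;

-- From now on, rows inserted without an explicit value get 1.
ALTER TABLE presence ALTER COLUMN is_old SET DEFAULT 1;

-- Rows with is_old = NULL never conflict with each other or with new rows,
-- so the historical duplicates are tolerated while new rows must be unique.
ALTER TABLE presence
ADD CONSTRAINT uq_presence UNIQUE (employee_id, roll_number, is_old);
```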

Any chance of failure in REPLACE over INSERT?

I have a table that I'm inserting records into. It has a primary key made out of two fields. My syntax up until now has been a simple:
INSERT into table (field,field,field) VALUES ('foo', 'bar', 'foo') type deal, but I've come across a scenario where I may need to overwrite existing values.
I am familiar with, and have in the past used, the INSERT INTO ... ON DUPLICATE KEY UPDATE ... syntax, but I recently came across the much simpler REPLACE INTO ... syntax.
My assumption is that if no data exists for the primary key I'm writing to, REPLACE INTO acts as an INSERT. If the primary key does exist, however, it deletes the record and inserts a new one. Is this correct?
If this is correct, are there any downsides to me just forgoing the INSERT INTO... statement and running a REPLACE INTO... for 100% of the lines users are inserting into the table? Are there any potential risks to using the REPLACE INTO... 100% of the time?
REPLACE does a DELETE first, so if you use foreign key constraints with ON DELETE CASCADE, you could unintentionally delete a lot of dependent data. The re-insert step of REPLACE will not recover that data deleted from dependent tables.
I think I've seen cases where REPLACE causes a new auto-increment value to be generated for the primary key, maybe when the conflict is based on a secondary UNIQUE KEY instead of the primary key. That could throw off other references to the row even if you don't use foreign key constraints.
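To make the difference concrete, here is a small sketch with a made-up table `demo` that has an auto-increment primary key and a secondary unique key. The table and column names are illustrative only.

```sql
CREATE TABLE demo (
  id   INT AUTO_INCREMENT PRIMARY KEY,
  code VARCHAR(20) NOT NULL,
  val  VARCHAR(20),
  UNIQUE KEY uq_demo_code (code)
);

INSERT INTO demo (code, val) VALUES ('a', 'first');

-- Conflicts on uq_demo_code: the existing row is deleted and a new row is
-- inserted, so the row for code 'a' ends up with a new auto-increment id.
REPLACE INTO demo (code, val) VALUES ('a', 'second');

-- Conflicts on uq_demo_code as well, but the existing row is updated in
-- place: its id stays the same and no DELETE (or ON DELETE CASCADE) happens.
INSERT INTO demo (code, val) VALUES ('a', 'third')
ON DUPLICATE KEY UPDATE val = VALUES(val);
```

VALUES() inside ON DUPLICATE KEY UPDATE still works but is deprecated as of MySQL 8.0.20 in favour of row aliases.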

Replace/Insert on duplicate key with where clause [mySQL]

Basically we have the same problem as this question: ON DUPLICATE KEY update (with multiple where clauses)
But we can't have unique keys for the keys of reference because we need duplicates of both.
Is there a way to do this with one query?
We have a unique identifier, and also need to record date, and increment a value, but also be able to update/insert without making multiple queries.
Please excuse me if I'm understanding you incorrectly, but it seems to me that what you want can in fact be done with the UNIQUE constraint mentioned in the question you're referencing.
Are you aware that you can create a UNIQUE constraint on more than one column? That is, the combination of the 2 columns is unique, but the columns themselves don't have to be.
In your case, you would use ALTER TABLE table ADD CONSTRAINT uq_table_id_date UNIQUE (id, date).
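With such a composite key in place, the insert-or-increment can be a single statement. A minimal sketch, assuming a hypothetical table `counters` with a hypothetical counter column `hits` (neither name comes from the question):

```sql
CREATE TABLE counters (
  id     INT  NOT NULL,
  `date` DATE NOT NULL,
  hits   INT  NOT NULL DEFAULT 0,
  -- id and date may each repeat; only the pair must be unique.
  CONSTRAINT uq_counters_id_date UNIQUE (id, `date`)
);

-- First call for a given (id, date) inserts a row with hits = 1;
-- later calls for the same pair increment the existing row instead.
INSERT INTO counters (id, `date`, hits)
VALUES (42, CURDATE(), 1)
ON DUPLICATE KEY UPDATE hits = hits + 1;
```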

Database Design: optional, but must be unique if provided a value

I have a column in one of my tables. It's optional, so it can be left blank. However, if a value is provided for that column, it must be unique. Two questions:
How do I implement this in my database design (I'm using MySQL Workbench, by the way)
Is there a potential problem with my model?
Just use a UNIQUE index on the column. See:
http://dev.mysql.com/doc/refman/5.1/en/create-index.html
A UNIQUE index creates a constraint such that all values in the index must be distinct. An error occurs if you try to add a new row with a key value that matches an existing row. For all engines, a UNIQUE index permits multiple NULL values for columns that can contain NULL. If you specify a prefix value for a column in a UNIQUE index, the column values must be unique within the prefix.
It can be NULL (rather than blank) and unique. The default value can be NULL. I see no problem with that.
You can create a UNIQUE index on a table. In MySQL workbench that is the UQ checkbox when creating/editing the table.
Step 1, ALTER the table and MODIFY the field so NULLs are allowed.
ALTER TABLE my_table MODIFY my_field VARCHAR(100) NULL DEFAULT NULL;
Step 2, add a UNIQUE index on the field.
ALTER TABLE my_table ADD UNIQUE INDEX U_my_field (my_field);
It's fine to use -- my only hesitation with putting a UNIQUE index on a nullable field is that it is slightly counter-intuitive at first glance.
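A quick way to see the behaviour is to try it on a scratch table. The sketch below reuses the names from the two steps above and adds a hypothetical `id` column just so the table has a primary key.

```sql
CREATE TABLE my_table (
  id       INT AUTO_INCREMENT PRIMARY KEY,   -- hypothetical, for illustration
  my_field VARCHAR(100) NULL DEFAULT NULL,
  UNIQUE INDEX U_my_field (my_field)
);

-- Several rows without a value are accepted: NULLs do not collide.
INSERT INTO my_table (my_field) VALUES (NULL), (NULL);

-- A real value is accepted once...
INSERT INTO my_table (my_field) VALUES ('abc');

-- ...and rejected the second time with a duplicate-entry error (1062).
INSERT INTO my_table (my_field) VALUES ('abc');
```

Note that an empty string is a real value, so at most one row can hold '' while any number can hold NULL.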
1) Move the column to a new table, make it unique and non-nullable. You can now have a row in this new table only when you have a value for it.
2) Yes. Keys and dependencies on keys are the basis of data integrity in a relational database design. If an attribute is supposed to be unique, then it should be implemented as a key. A nullable "key" is not a key at all, and it is never necessary anyway, because the attribute can always be moved to a new table and made non-nullable without loss of any information.
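The separate-table design from point 1 might be sketched as follows. All names here (`item`, `item_code`, `code`) are invented for the example, and the foreign key assumes InnoDB.

```sql
-- Main table: no nullable column at all.
CREATE TABLE item (
  id   INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(100) NOT NULL
);

-- Optional attribute: a row exists here only when a value was provided.
CREATE TABLE item_code (
  item_id INT NOT NULL PRIMARY KEY,      -- at most one code per item
  code    VARCHAR(100) NOT NULL UNIQUE,  -- each code used at most once
  FOREIGN KEY (item_id) REFERENCES item (id)
);

-- Items together with their optional code (NULL in the result if absent).
SELECT i.id, i.name, c.code
FROM item AS i
LEFT JOIN item_code AS c ON c.item_id = i.id;
```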

MySQL INSERT/UPDATE without ON DUPLICATE KEY

I may have misunderstood how to use the ON DUPLICATE KEY syntax, or my database structure may need some work, but here goes.
I have a table (bookings-meta) that holds meta data associated with another table (bookings). Different rows in the bookings table may or may not have specific meta data associated with them in the other table.
The bookings-meta table has the following columns, meta_id (primary key), booking_id, key and value.
From my understanding, to use ON DUPLICATE KEY I need to know the meta_id, which often isn't the case. I'm simply trying to push a key/value pair to the table using the booking_id, so that if the particular key exists it's replaced, and otherwise it's inserted.
At the moment I have a separate query that tries to select the row; if it's found I UPDATE, and if not I INSERT.
Is there a way of doing an insert/update in one query without using ON DUPLICATE KEY or have I missed a trick with this one?
If possible, I'd drop the meta_id column entirely and turn booking_id and key into a composite primary key. That'll save space in your table, allow use of ON DUPLICATE KEY, and be cleaner in general.
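As a sketch of that layout, the table could be defined as below. The column types are guesses, since the question does not give them. ON DUPLICATE KEY fires on any PRIMARY KEY or UNIQUE index, so the meta_id never needs to be known.

```sql
-- Column types are assumptions; `key` must be quoted because KEY is reserved.
CREATE TABLE `bookings-meta` (
  booking_id INT         NOT NULL,
  `key`      VARCHAR(64) NOT NULL,
  `value`    TEXT,
  PRIMARY KEY (booking_id, `key`)
);

-- Inserts the pair if (booking_id, key) is new,
-- otherwise overwrites the stored value in the same statement.
INSERT INTO `bookings-meta` (booking_id, `key`, `value`)
VALUES (17, 'colour', 'blue')
ON DUPLICATE KEY UPDATE `value` = VALUES(`value`);
```

If meta_id has to stay, adding a UNIQUE KEY on (booking_id, `key`) instead of changing the primary key gives the same single-statement upsert.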