To check varchar values and avoid duplicate entries in MySQL table - mysql

I need to run an application that collects news feeds and adds new entries to my database. I planned to create two tables, one as the source and the other as the target.
My plan is to first load all the information into the source table and later update the target table with unique data (recently updated news or new records).
The issue is that some feeds are repeated on other websites, so the application breaks immediately after reading a duplicate entry.
My MySQL query is below:
create table table1 (
DateandTime datetime,
Name tinytext,
Title varchar(255),
Content longtext,
unique (Title)
);
I know that this sounds too basic, but I don't have a solution.
I appreciate your feedback and ideas. Thank you.

A few solutions:
A unique column should prevent duplicate data
INSERT ... WHERE NOT EXISTS
Use MERGE engine in MySQL (http://dev.mysql.com/doc/refman/5.1/en/merge-storage-engine.html)

I modified my query following Marcus Adams's suggestion.
INSERT IGNORE INTO table1 (
DateandTime,
Name,
Title,
Content)
values
(.......
);
I think a single table is sufficient to address my issue. Thank you.
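As a rough, runnable illustration of the single-table approach above, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. SQLite spells the statement INSERT OR IGNORE where MySQL uses INSERT IGNORE, the column types are simplified, and the feed items are made up for the example.

```python
import sqlite3

# Stand-in for MySQL: SQLite writes INSERT OR IGNORE, MySQL writes INSERT IGNORE.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE table1 (
        DateandTime TEXT,
        Name        TEXT,
        Title       TEXT UNIQUE,   -- the unique key that detects repeats
        Content     TEXT
    )
    """
)

# Hypothetical feed items; the first and third share a title.
feed_items = [
    ("2024-01-01 10:00:00", "Site A", "Launch announced", "Body from site A"),
    ("2024-01-01 10:05:00", "Site B", "Other story", "Body from site B"),
    ("2024-01-01 10:10:00", "Site C", "Launch announced", "Body from site C"),
]

for item in feed_items:
    # A duplicate Title is skipped silently instead of raising an error.
    conn.execute(
        "INSERT OR IGNORE INTO table1 (DateandTime, Name, Title, Content) "
        "VALUES (?, ?, ?, ?)",
        item,
    )

stored = conn.execute(
    "SELECT Title, Name FROM table1 ORDER BY Title"
).fetchall()
```

The first copy of a title wins here, so which site gets stored depends on the order in which the feeds are read.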

Related

Using IF statements to avoid adding duplicates with an SQL query?

I'm trying to learn about databases and SQL, and this is an issue I'm having trouble with: how do I detect if a new entry is a duplicate, and if it is, not discard the new entry, but merge it with the old one?
An example makes it clearer. Let's say that I'm making a database of my video game collection. My columns are 'Title' (varchar) and then a boolean column for each platform I own the game on, since some games are on multiple platforms.
I buy World of Goo, and go to my database to say
INSERT INTO `collections`.`games` (`Title`,`Windows`) VALUES ('World of Goo','1');
Easy. But six months later, I buy it again on Android, because I really like that game and want to play it in bed. What I want to do now is write a query that says
IF (select * from `games` where title = 'World of Goo') {
UPDATE `games`
SET `Android` = '1'
WHERE `title` = 'World of Goo';
} ELSE {
INSERT INTO `collections`.`games` (`Title`,`Android`) VALUES ('World of Goo','1');
}
(I know the first line is wrong, I'd need to write "if query returns 1 or more results", but ignore that for now.)
Now... I know I could do this with a PHP script. But I don't want to use a more complex solution than is necessary -- is there a way to do this in SQL alone?
(And this is an example of a problem, I know that in reality I'd remember that I owned a game and just write the query to update it.)
MySQL has implemented an UPSERT statement using INSERT ... ON DUPLICATE KEY UPDATE.
INSERT INTO collections.games (Title, Android)
VALUES ('World of Goo', '1')
ON DUPLICATE KEY UPDATE Android = '1'
For the statement above to work, though, you need to make the Title column unique.
ALTER TABLE games ADD CONSTRAINT games_uq UNIQUE (Title)
I would use INSERT INTO ... ON DUPLICATE KEY UPDATE:
INSERT INTO games (Title,Android) VALUES ('World of Goo','1') ON DUPLICATE KEY UPDATE Android=1
Remember that your Title column has to be UNIQUE. If it is not, make it so with:
CREATE UNIQUE INDEX unique_title
ON games (Title)
That said, I think your model is not the best: if you add a new platform in the future, you will have to alter the table and probably update many records.
I would prefer a games table, a platforms table, and a game_rel_platform table where you put an entry for every gameid-platformid pair.
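The three-table layout suggested above could look roughly like the following. This is only a sketch, again using Python's sqlite3 as a stand-in for MySQL; the column names (game_id, platform_id, title, name) and the record_purchase helper are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE games (
        game_id INTEGER PRIMARY KEY,
        title   TEXT NOT NULL UNIQUE
    );
    CREATE TABLE platforms (
        platform_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL UNIQUE
    );
    -- One row per (game, platform) pair; the composite key blocks repeats.
    CREATE TABLE game_rel_platform (
        game_id     INTEGER NOT NULL REFERENCES games (game_id),
        platform_id INTEGER NOT NULL REFERENCES platforms (platform_id),
        PRIMARY KEY (game_id, platform_id)
    );
    """
)


def record_purchase(title, platform):
    """Hypothetical helper: register a game on a platform, ignoring repeats."""
    conn.execute("INSERT OR IGNORE INTO games (title) VALUES (?)", (title,))
    conn.execute("INSERT OR IGNORE INTO platforms (name) VALUES (?)", (platform,))
    conn.execute(
        """
        INSERT OR IGNORE INTO game_rel_platform (game_id, platform_id)
        SELECT g.game_id, p.platform_id
        FROM games g, platforms p
        WHERE g.title = ? AND p.name = ?
        """,
        (title, platform),
    )


record_purchase("World of Goo", "Windows")
record_purchase("World of Goo", "Android")
record_purchase("World of Goo", "Android")  # buying it twice changes nothing

owned = [
    row[0]
    for row in conn.execute(
        """
        SELECT p.name
        FROM game_rel_platform r
        JOIN games g ON g.game_id = r.game_id
        JOIN platforms p ON p.platform_id = r.platform_id
        WHERE g.title = 'World of Goo'
        ORDER BY p.name
        """
    )
]
```

With this layout a new platform is just a new row in platforms, so no ALTER TABLE is needed.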
I noticed from the tags that you are using MySQL. My suggestion is to use INSERT INTO ... ON DUPLICATE KEY UPDATE and have Title as the primary key (or declared as UNIQUE):
INSERT INTO `collections`.`games` (`Title`,`Windows`) VALUES ('World of Goo','1') ON DUPLICATE KEY UPDATE `Windows`=VALUES(`Windows`)
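To try the upsert behaviour from these answers without a MySQL server, here is a small sketch using Python's sqlite3. Note the assumption: SQLite (3.24 or later) writes the clause as ON CONFLICT (Title) DO UPDATE, which plays the role of MySQL's ON DUPLICATE KEY UPDATE here, and the boolean columns are simplified to integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE games (
        Title   TEXT UNIQUE,          -- the unique key the upsert relies on
        Windows INTEGER DEFAULT 0,
        Android INTEGER DEFAULT 0
    )
    """
)

# First purchase: no row exists yet, so this is a plain insert.
conn.execute(
    "INSERT INTO games (Title, Windows) VALUES ('World of Goo', 1) "
    "ON CONFLICT (Title) DO UPDATE SET Windows = 1"
)

# Second purchase: the title already exists, so the row is updated instead.
conn.execute(
    "INSERT INTO games (Title, Android) VALUES ('World of Goo', 1) "
    "ON CONFLICT (Title) DO UPDATE SET Android = 1"
)

rows = conn.execute("SELECT Title, Windows, Android FROM games").fetchall()
```

The same statement works whether or not the game is already in the table, which is what the IF/ELSE pseudocode in the question was trying to express.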

MySQL - insert into with foreign key index

Here is the scenario:
I have 2 tables and 2 temporary tables. Before I insert user data to the official tables, I insert them to a temp table to let them do the checks. There is a company table with company info, and a contact table that has contact info. The contact table has a field called company_id which is a foreign key index for the company table.
Temp tables are set up the same way.
I want to do something like: INSERT INTO company () SELECT * FROM temp_company; and INSERT INTO contact () SELECT * FROM temp_contact
My question is, how do I transfer the foreign key from the temp_company to the newly inserted id on the company table using a statement like this? Is there a way to do it?
Currently I am:
grabbing the temp rows
going one by one and inserting them
grabbing the last insert id
then inserting the contacts afterwards with the new last insert id
I just don't know if that is the most efficient way. Thanks!
If both tables have the same number of columns, you should be able to use the syntax you have there; just take out the (). Make sure there aren't any duplicate primary keys:
INSERT INTO company SELECT * FROM temp_company;
INSERT INTO contact SELECT * FROM temp_contact;
You can also list the inserted columns explicitly, which lets you choose exactly which column is inserted as the new ID.
INSERT INTO company (`ID`,`col_1`,...,`last_col`) SELECT `foreign_key_col`,`col_1`,...,`last_col` FROM temp_company;
INSERT INTO contact (`ID`,`col_1`,...,`last_col`) SELECT `foreign_key_col`,`col_1`,...,`last_col` FROM temp_contact;
Just make sure you are selecting the right number of columns.
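If the official tables generate their own auto-increment ids, the temp ids will not match the new ones. One possible way to handle that, which is not from the answer above, is to keep the temp id in a helper column and join on it when copying the contacts. A minimal sketch with Python's sqlite3 standing in for MySQL; the temp_id column, the name and email columns, and the sample rows are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE company (
        id      INTEGER PRIMARY KEY AUTOINCREMENT,
        name    TEXT,
        temp_id INTEGER              -- hypothetical helper column
    );
    CREATE TABLE contact (
        id         INTEGER PRIMARY KEY AUTOINCREMENT,
        company_id INTEGER REFERENCES company (id),
        email      TEXT
    );
    CREATE TABLE temp_company (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE temp_contact (
        id INTEGER PRIMARY KEY, company_id INTEGER, email TEXT
    );

    -- An existing row, so the new ids differ from the temp ids.
    INSERT INTO company (name) VALUES ('Existing Co');

    INSERT INTO temp_company (id, name) VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO temp_contact (company_id, email) VALUES
        (1, 'a@acme.example'), (2, 'b@globex.example'), (1, 'c@acme.example');
    """
)

# Step 1: copy the companies, remembering which temp row each came from.
conn.execute(
    "INSERT INTO company (name, temp_id) SELECT name, id FROM temp_company"
)

# Step 2: copy the contacts, translating the temp id into the new id.
conn.execute(
    """
    INSERT INTO contact (company_id, email)
    SELECT c.id, tc.email
    FROM temp_contact tc
    JOIN company c ON c.temp_id = tc.company_id
    """
)

pairs = conn.execute(
    """
    SELECT c.name, ct.email
    FROM contact ct JOIN company c ON c.id = ct.company_id
    ORDER BY ct.email
    """
).fetchall()
```

This replaces the row-by-row loop with two set-based statements; the helper column can be cleared or dropped once the copy is done.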

Update a row in mysql and drop the row if it creates a duplicate as defined by the unique key

I've got a MySQL table that has a lot of entries. It's got a unique key defined as (state, source), so there are no duplicates for that combination of columns. However, I am now realizing that much of the state data is not entered consistently. For example, in some rows it is entered as "CA" and in others it might be spelled out as "California."
I'd like to update all the entries that say "California" to be "CA" and if it creates a conflict in the unique key, drop the row. How can I do that?
You may be better off dumping your data and using an external tool like Google Refine to clean it up. Look at using foreign keys in the future to avoid these issues.
I don't think you can do this in one SQL statement. And if you have foreign key relationships from other tables to the one you are trying to clean up, then you definitely do not want to do this in one step (even if you could).
CREATE TABLE state_mappings (
`old` VARCHAR(64) NOT NULL,
`new` VARCHAR(64) NOT NULL
);
INSERT INTO state_mappings VALUES ('California', 'CA'), ...;
INSERT IGNORE INTO MyTable (state, source)
SELECT sm.new, s.source FROM MyTable s JOIN state_mappings sm
ON s.state = sm.old;
-- Update tables with foreign keys here
DELETE FROM MyTable WHERE state IN (SELECT DISTINCT old FROM state_mappings);
DROP TABLE state_mappings;
I'm no SQL pro, so these statements can probably be optimized, but you get the gist.
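For the simple case with no foreign keys pointing at the table, a shorter two-statement variant is possible: update while skipping rows that would collide, then delete whatever is left with the old spelling. This is a sketch of a different technique than the mapping table above, run here with Python's sqlite3; SQLite writes UPDATE OR IGNORE where MySQL writes UPDATE IGNORE, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE MyTable (
        state  TEXT,
        source TEXT,
        UNIQUE (state, source)
    );
    INSERT INTO MyTable VALUES
        ('CA', 'survey'),          -- already normalised
        ('California', 'survey'),  -- would collide with the row above
        ('California', 'census');  -- can be renamed safely
    """
)

# Rename where possible; rows that would break the unique key are skipped.
conn.execute(
    "UPDATE OR IGNORE MyTable SET state = 'CA' WHERE state = 'California'"
)

# Anything still spelled out is a duplicate of an existing 'CA' row: drop it.
conn.execute("DELETE FROM MyTable WHERE state = 'California'")

rows = conn.execute(
    "SELECT state, source FROM MyTable ORDER BY source"
).fetchall()
```

Running both statements inside one transaction would keep other readers from seeing the half-cleaned table.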

mysql inserting with foreign key

I have a form whose field values need to go into different tables in MySQL.
The tables are all connected by foreign keys.
How do I put these values into the different tables?
pseudo tables:
users_table:
userId|userlogin
user_info:
info_id|userId|name|surname
user_contact:
contact_id|userId|phone|email
form includes:
userlogin
name
surname
phone
email
In my research, I found out that I can use mysql_insert_id to link the FKs, but I wonder if that can cause problems when the website is under high load (different requests sent at the same time).
I also found out that I can set triggers to create new FK values:
CREATE TRIGGER ins_kimlik AFTER INSERT ON hastalar
for each row
insert into hasta_kimlik set idhasta = new.idhasta
But I don't know how to add data to them. I can use
UPDATE user_info SET name = 'John', surname = 'Brown' WHERE info_id = LAST_INSERT_ID();
but it doesn't feel like the native way.
What is the best practice?
"i found out that i can use mysql_insert_id to link the FKs, but i wonder if that can cause problems if there is high load in the website (diff. requests sent at the same time)."
mysql_insert_id returns the last auto-increment value generated by the database connection currently in use.
It doesn't matter what other processes do on other connections. It is safe. You'll get the right value.
but it doesn't feel the native way.
Nope. The right way is:
INSERT user
get id
INSERT user_info
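The insert / get id / insert sequence above can be sketched as follows, using Python's sqlite3 as a stand-in for MySQL. Here cursor.lastrowid plays the role of mysql_insert_id / LAST_INSERT_ID(), and like them it reports the id generated on this connection. The register helper and the sample form values are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users_table (
        userId    INTEGER PRIMARY KEY AUTOINCREMENT,
        userlogin TEXT
    );
    CREATE TABLE user_info (
        info_id INTEGER PRIMARY KEY AUTOINCREMENT,
        userId  INTEGER REFERENCES users_table (userId),
        name    TEXT,
        surname TEXT
    );
    CREATE TABLE user_contact (
        contact_id INTEGER PRIMARY KEY AUTOINCREMENT,
        userId     INTEGER REFERENCES users_table (userId),
        phone      TEXT,
        email      TEXT
    );
    """
)


def register(form):
    """Hypothetical helper: insert the parent row, then reuse its id."""
    with conn:  # commits on success, rolls back if any insert raises
        cur = conn.execute(
            "INSERT INTO users_table (userlogin) VALUES (?)",
            (form["userlogin"],),
        )
        user_id = cur.lastrowid  # id generated by the insert just above
        conn.execute(
            "INSERT INTO user_info (userId, name, surname) VALUES (?, ?, ?)",
            (user_id, form["name"], form["surname"]),
        )
        conn.execute(
            "INSERT INTO user_contact (userId, phone, email) VALUES (?, ?, ?)",
            (user_id, form["phone"], form["email"]),
        )
    return user_id


new_id = register(
    {
        "userlogin": "jbrown",
        "name": "John",
        "surname": "Brown",
        "phone": "555-0100",
        "email": "john@example.com",
    }
)

linked = conn.execute(
    """
    SELECT u.userlogin, i.name, c.email
    FROM users_table u
    JOIN user_info i ON i.userId = u.userId
    JOIN user_contact c ON c.userId = u.userId
    WHERE u.userId = ?
    """,
    (new_id,),
).fetchall()
```

Wrapping the three inserts in one transaction means a failure in the second or third insert does not leave a user row without its details.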
If the tables are connected by foreign keys, shouldn't you just start with the base table (users_table here) and then insert into user_info and user_contact, in either order? As long as you have filled in the table that holds the primary key referenced by the FKs in the other tables, you can add the rest easily.
INSERT SQL command:
INSERT INTO table_name (column1, column2, column3,...) VALUES (value1, value2, value3,...)
Is that what you were asking?

Fix DB duplicate entries (MySQL bug)

I'm using MySQL 4.1. Some tables have duplicate entries that go against the constraints.
When I try to group rows, MySQL doesn't recognise the rows as being similar.
Example:
Table A has a column "Name" with the Unique property.
The table contains one row with the name 'Hach?' and one row with the same name but a square at the end instead of the '?' (which I can't reproduce in this textfield)
A "Group by" on these 2 rows returns 2 separate rows
This causes several problems, including the fact that I can't export and reimport the database. On reimporting, an error says that an INSERT has failed because it violates a constraint.
In theory I could try to import, wait for the first error, fix the import script and the original DB, and repeat. In practice, that would take forever.
Is there a way to list all the anomalies or force the database to recheck constraints (and list all the values/rows that go against them)?
I can supply the .MYD file if it can be helpful.
To list all the anomalies:
SELECT name, count(*) FROM TableA GROUP BY name HAVING count(*) > 1;
There are a few ways to tackle deleting the dups and your path will depend heavily on the number of dups you have.
See this SO question for ways of removing those from your table.
Here is the solution I provided there:
-- Setup for example
create table people (fname varchar(10), lname varchar(10));
insert into people values ('Bob', 'Newhart');
insert into people values ('Bob', 'Newhart');
insert into people values ('Bill', 'Cosby');
insert into people values ('Jim', 'Gaffigan');
insert into people values ('Jim', 'Gaffigan');
insert into people values ('Adam', 'Sandler');
-- Show table with duplicates
select * from people;
-- Create table with one version of each duplicate record
create table dups as
select distinct fname, lname, count(*)
from people group by fname, lname
having count(*) > 1;
-- Delete all matching duplicate records
delete people from people inner join dups
on people.fname = dups.fname AND
people.lname = dups.lname;
-- Insert single record of each dup back into table
insert into people select fname, lname from dups;
-- Show Fixed table
select * from people;
Create a new table, select all rows and group by the unique key (in the example column name) and insert in the new table.
To find out what that character is, run the following query:
SELECT HEX(Name) FROM TableName WHERE Name LIKE 'Hach%'
You will see the hexadecimal code of that 'square'.
If that character is 'x', you could update like this (but if that column is Unique you will get some errors):
UPDATE TableName SET Name=TRIM(TRAILING 'x' FROM Name);
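To see what the HEX() check reveals, here is a small sketch with Python's sqlite3, whose HEX() function behaves like MySQL's for this purpose. The hidden character is assumed to be DEL (0x7F) purely for illustration; the real byte in the asker's data is unknown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableName (Name TEXT UNIQUE)")

# Two names that may look alike on screen but differ in the last byte.
# The trailing \x7f is a made-up stand-in for the unprintable 'square'.
conn.executemany(
    "INSERT INTO TableName (Name) VALUES (?)",
    [("Hach?",), ("Hach\x7f",)],
)

hex_rows = [
    row[0]
    for row in conn.execute(
        "SELECT HEX(Name) FROM TableName WHERE Name LIKE 'Hach%' ORDER BY 1"
    )
]

# The last two hex digits of each value identify the trailing byte.
trailing_bytes = [h[-2:] for h in hex_rows]
```

Because the two values differ at the byte level, the unique key treats them as distinct, which matches the GROUP BY behaviour described in the question.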
I'll assume this is a random MySQL 4.1 bug. Some values are just changing on their own for no particular reason, even if they violate some MySQL constraints. MySQL is simply ignoring those violations.
To solve my problem, I will write a program that tries to reinsert every row of data into the same table (to be precise: another table with the same characteristics) and logs every failure.
I will leave the incident open for a while in case someone gets the same problem and someone else finds a more practical solution.