I'm trying to merge multiple tables that all share the same structure into one single table. However, the insert only seems to add enough rows for the target to match the source's row count. For example:
Table 1: 20,000 Data Rows
Index Table: 10,000 Data Rows
So if I were to go and insert Table 1 into the Index table using the following:
INSERT IGNORE INTO `database1`.`Index`
SELECT * FROM `database1`.`Table1`;
Using the above, it only inserts 10,000 of the available 20,000 rows.
My guess is that the other 10,000 are duplicate values and, since you're using IGNORE on the INSERT, the statement completes without error.
Since you're using INSERT IGNORE, I'm guessing you have 10000 duplicate keys which are being silently thrown away. Didn't you think it odd that you need that?
Depending on your table layout, you'll need some way to rejig your tables to get around the key constraint. E.g., create a new autoincrement key for the table you're inserting into, append the other table, and sort out the duplications somehow.
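For instance, a rough sketch of that rejig; the surrogate column name is my invention, and it assumes the key blocking the inserts is the primary key on `Index` and that you are willing to replace it:
-- replace the existing primary key with a surrogate auto-increment key
-- so that appended rows no longer collide (sketch only; adapt to your schema)
ALTER TABLE `database1`.`Index`
  DROP PRIMARY KEY,
  ADD COLUMN surrogate_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY;
-- then append with an explicit column list (col_a, col_b are placeholders for the shared columns)
INSERT INTO `database1`.`Index` (col_a, col_b)
SELECT col_a, col_b FROM `database1`.`Table1`;
You are then left with every row from both tables and can sort out the duplicates at your leisure.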
It depends on the structures of the tables. Can you show us "show create table ..." on your two tables?
Also, list the feedback you get from the mysql client (e.g., "Records: 100 Duplicates: 0 Warnings: 0") - that helps pinpoint what happened and why.
If there are values with UNIQUE KEYS in your table, you won't get all the inserts to succeed.
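Since IGNORE downgrades the duplicate-key errors to warnings, you can inspect what was skipped right after the statement, e.g.:
INSERT IGNORE INTO `database1`.`Index` SELECT * FROM `database1`.`Table1`;
SHOW WARNINGS;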
Did you try REPLACE INTO? It's like INSERT, but it replaces rows that collide on unique fields.
REPLACE INTO user (id,name) VALUES (12,"John");
# if there is already a user with id = 12, it will be replaced
Related
I'm using Java/MyBatis with MySQL in my project. I need to insert multiple rows into a table and I want to ignore those rows which have a duplicate UNIQUE index. I also want to know which rows were ignored. How do I do that? It seems to me that INSERT IGNORE INTO cannot tell me which rows are ignored.
I cannot help you out with a solution that does it while inserting. But depending on when you need to know which rows were ignored, you could do either of the following (sketched below):
Invert your select so that you get all the duplicates before inserting into the new table.
Deduct the rows in the sink table from the rows in the source table(s) after the insert.
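Both options can be sketched in SQL. The names here (source_table, target_table, uniq_col, other_col) are placeholders for your actual schema:
-- option 1: before inserting, list the source rows whose unique key already exists in the target
SELECT s.*
FROM source_table AS s
JOIN target_table AS t ON t.uniq_col = s.uniq_col;
-- option 2: after the insert, list the source rows that have no identical row in the target
-- (these are the ones INSERT IGNORE skipped in favour of a pre-existing row)
SELECT s.*
FROM source_table AS s
LEFT JOIN target_table AS t
  ON t.uniq_col = s.uniq_col AND t.other_col = s.other_col
WHERE t.uniq_col IS NULL;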
I have a database like the following with 10K rows. How do I delete duplicates where all fields are the same? I don't want to search for any specific company. Is there a way to find any entries where all fields match and delete the duplicates? Thanks
This command adds a unique key, and drops all rows that generate errors (due to the unique key). This removes duplicates.
ALTER IGNORE TABLE `table` ADD UNIQUE KEY idx1 (title);
Note: This command may not work for InnoDB tables for some versions of MySQL. See this post for a workaround. (Thanks to "an anonymous user" for this information.)
OR
Simply create a new table without duplicates. Sometimes this is actually faster and easier than trying to delete all the offending rows: create a new table, insert the unique rows (using MIN(id) for the id of the resulting row), rename the two tables, and, once you are satisfied that everything worked correctly, drop the original table.
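A sketch of that approach, assuming the table is called orders (a made-up name) and has an id plus the columns from the question:
CREATE TABLE orders_dedup LIKE orders;
INSERT INTO orders_dedup (id, company_name, city, state, country)
SELECT MIN(id), company_name, city, state, country
FROM orders
GROUP BY company_name, city, state, country;
RENAME TABLE orders TO orders_old, orders_dedup TO orders;
-- once you are satisfied everything is correct:
-- DROP TABLE orders_old;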
The query below can be used to find the duplicate entries using all fields:
SELECT company_name, city, state, country, COUNT(*)
FROM `Table`
GROUP BY company_name, city, state, country
HAVING COUNT(*) > 1;
I have some words like ["happy","bad","terrible","awesome","happy","happy","horrible",.....,"love"].
There are a lot of these words, maybe 100 to 200 or more.
I want to save them all to the DB at the same time.
I think hitting the DB connection once per word is wasteful.
What is the best way to save them?
table structure
wordId userId word
You are right that executing repeated INSERT statements to insert rows one at a time, i.e. processing RBAR (row by agonizing row), can be expensive and excruciatingly slow in MySQL.
Assuming that you are inserting the string values ("words") into a column in a table, and each word will be inserted as a new row in the table... (and that's a whole lot of assumptions there...)
For example, a table like this:
CREATE TABLE mytable (mycol VARCHAR(50) NOT NULL PRIMARY KEY) ENGINE=InnoDB;
MySQL provides an extension to the INSERT statement syntax which allows multiple rows to be inserted in a single statement.
For example, this sequence:
INSERT IGNORE INTO mytable (mycol) VALUES ('happy');
INSERT IGNORE INTO mytable (mycol) VALUES ('bad');
INSERT IGNORE INTO mytable (mycol) VALUES ('terrible');
Can be emulated with a single INSERT statement:
INSERT IGNORE INTO mytable (mycol) VALUES ('happy'),('bad'),('terrible');
Each "row" to be inserted is enclosed in parens, just as it is in the regular INSERT statement. The trick is the comma separator between the rows.
The trouble with this comes in when there are constraint violations: either the whole statement succeeds or it fails, unlike the individual inserts, where one of them can fail while the other two still succeed.
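To illustrate, using the same hypothetical mytable:
-- a plain multi-row INSERT fails as a whole on the first duplicate key (ERROR 1062),
-- and with InnoDB none of its rows are kept:
INSERT INTO mytable (mycol) VALUES ('happy'),('bad'),('happy');
-- with IGNORE, the duplicate row is skipped (with a warning) and the rest are inserted:
INSERT IGNORE INTO mytable (mycol) VALUES ('happy'),('bad'),('happy');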
Also, be careful that the size (in bytes) of the statement does not exceed the max_allowed_packet variable setting.
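For reference, you can check the current limit (in bytes) with:
SHOW VARIABLES LIKE 'max_allowed_packet';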
Alternatively, a LOAD DATA statement is an even faster way to load rows into a table. But for a couple of hundred rows, it's not really going to be much faster. (If you were loading thousands and thousands of rows, the LOAD DATA statement could potentially be much faster.)
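For completeness, a LOAD DATA sketch, assuming the words sit one per line in a file at /tmp/words.txt (a made-up path) and that LOCAL loading is enabled on your server and client:
LOAD DATA LOCAL INFILE '/tmp/words.txt'
IGNORE
INTO TABLE mytable
(mycol);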
It would be helpful to know how you are generating that list of words, but you could do:
insert into `table` (column) values ('word'), ('word2');
Without more info, that is about as much as we can help.
You could add a loop in whatever language is needed to iterate over the list to add them.
I messed up when trying to create a test database and accidentally duplicated everything inside a certain table. Basically, there are now two of every entry there was before. Is there a simple way to fix this? (Using InnoDB tables)
Yet another good reason to use auto incrementing primary keys. That way, the rows wouldn't be total duplicates.
Probably the fastest way is to copy the data into another table, truncate the first table, and re-insert it:
create temporary table tmp as
select distinct *
from test;
truncate table test;
insert into test
select *
from tmp;
As a little note: in almost all cases, I recommend using the complete column list on an insert statement. This is the one case where it is optional. After all, you are putting all the columns in another table and just putting them back a statement later.
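If you do want the explicit list anyway, it would look like this (the column names here are made up for illustration):
insert into test (id, name, created_at)
select id, name, created_at
from tmp;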
Like this:
delete from `table` where id = 3;
insert into `table` (id, value) values (3, 'aaa'), (3, 'bbb'), (3, 'ccc');
There are hundreds of values, and most of them are the same as last time; only a few records need to be added or deleted.
I use this table to store a person's properties, and a person can have many properties, so I insert many records in the table for one person. When a person's properties are updated, some records are added or deleted and most are unchanged, but when I receive the new property set, I don't know which to add and which to delete. So I delete all the old records and then insert the new ones, but that is too slow for me. Is there a faster way?
I think what you did is probably the fastest method when the number of records per person is small relative to the number of records in the whole table; the only obvious way to improve speed is to create a non-unique index on the id column.
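A minimal sketch of that index, using the question's table name:
CREATE INDEX idx_id ON `table` (id);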
Another way to do what you want, if you are willing to denormalize a little bit, is to combine the properties into a comma-separated list. So instead of deleting and then inserting multiple rows, you only have to update a single row:
update table set id=3, values="aaa,bbb,ccc" where id=3;
With this, you lose the ability to search by values, unless you manually maintain a reverse index, and your values cannot contain a comma (or whatever terminating character you use). Another trick that might be useful when using this technique is to surround the values with terminating characters:
update table set id=3, values=",aaa,bbb,ccc," where id=3;
This way, you can still search within values by surrounding the search term with the terminating character: select * from `table` where `values` like '%,aaa,%'
Additionally, you lose the ability to specify a unique constraint, so you'll have to ensure in your application logic that you don't store duplicate entries in values.
Are you intending to update multiple tables using a primary key?
Then you may have a look at this and this.
Once you set up one person with the correct rows, you can copy them to another person like this, for example, to copy from id 1 to id 2:
insert into `table` (id, value)
select 2, value
from `table`
where id = 1;