How to avoid duplicates while updating a MySQL database?

I receive a MySQL dump file (.sql) daily from an external server over which I have no control. I created a local database to store all the data in the .sql file, and I would like to set up a script that updates my local database automatically each day. The file I receive contains old data that is already in the local database. How can I avoid duplicating that old data and insert only the new data into the local MySQL server? Thank you very much!

Load the new dump into a second database alongside your current one (your "master"). You can then run a third-party database compare tool, such as those from Red Gate, between the two versions and apply only the changes between them to your master.

Put unique constraints on the fields that you want to be unique.
Also, as Danny Beckett mentioned, you can use INSERT IGNORE instead of INSERT to avoid errors in the output. (I would redirect that output to a file for later analysis, to check that I haven't missed anything in the process.)

You can use a unique constraint together with INSERT IGNORE.
As a second option, you can first insert the data into a temporary table and then insert only the difference into the real table.
With the second option you can add a restriction so that the duplicate check does not have to search through all the records already stored in the database.
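The temporary-table option might look like the sketch below. The names items, staging_items and id are hypothetical, and it assumes the daily dump has been loaded into the staging table rather than into the live one (for example via a scratch database).

```sql
-- Hypothetical names: items is the local table, staging_items holds today's dump.
CREATE TABLE IF NOT EXISTS staging_items LIKE items;
TRUNCATE TABLE staging_items;

-- ... load the daily dump into staging_items here ...

-- Copy only the rows whose key is not already present locally.
INSERT INTO items
SELECT s.*
FROM staging_items AS s
LEFT JOIN items AS i ON i.id = s.id
WHERE i.id IS NULL;
```

If the rows carry a date, adding a date condition to the join is one way to apply the restriction mentioned above, so that only recent local rows are compared.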

You need to create a primary key (or a unique index) on your table. It should be a unique combination of column values. Using INSERT IGNORE instead of a plain INSERT will then skip rows that would duplicate an existing key.
See http://dev.mysql.com/doc/refman/5.5/en/insert.html
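A minimal sketch of the unique-key approach follows. The table items and its columns are hypothetical; choose whichever columns actually identify a record in your data.

```sql
-- Hypothetical table and columns; the key should cover the columns that
-- together identify one record.
ALTER TABLE items ADD UNIQUE KEY uq_items_natural (source_id, recorded_at);

-- Rows that would violate the unique key are skipped with a warning
-- instead of aborting the import.
INSERT IGNORE INTO items (source_id, recorded_at, payload)
VALUES (42, '2013-05-01 00:00:00', 'example');

-- Shows the warnings raised by the preceding statement, including skipped rows.
SHOW WARNINGS;
```

Note that the ALTER TABLE fails if the table already contains duplicates for the chosen columns, so those have to be cleaned up first.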

If this is a plain vanilla mysqldump file, then it normally includes DROP TABLE IF EXISTS ... and CREATE TABLE statements, so the tables are recreated when the data is imported. Duplicate data should therefore not be a problem, unless I'm missing something.

Related

java and mysql load data infile misunderstanding

Thanks for viewing this. I need a little bit of help for this project that I am working on with MySql.
For part of the project I need to load a few things into a MySql database which I have up and running.
The info that I need, for each column in the table Documentation, is stored into text files on my hard drive.
For example, one column in the documentation table is "ports" so I have a ports.txt file on my computer with a bunch of port numbers and so on.
I tried to run this MySQL statement through phpMyAdmin:
LOAD DATA INFILE 'C:\\ports.txt' INTO TABLE `Documentation` (`ports`);
It ran successfully, so I went on to the other load I needed:
LOAD DATA INFILE 'C:\\vlan.txt' INTO TABLE `Documentation` (`vlans`);
This also completed successfully, but it added all the rows to the vlan column AFTER the last entry to the port column.
Why did this happen? Is there anything I can do to fix this? Thanks

Why did this happen?
LOAD DATA inserts new rows into the specified table; it doesn't update existing rows.
Is there anything I can do to fix this?
It's important to understand that MySQL doesn't guarantee that tables will be kept in any particular order. So, after your first LOAD, the order in which the data were inserted may be lost; one would therefore typically relate such data before importing it (e.g. as columns of the same record within a single CSV file).
You could LOAD your data into temporary tables that each have an AUTO_INCREMENT column and hope that the auto-incremented identifiers remain aligned between the two tables (MySQL makes absolutely no guarantee of this, but in your case you should find that each record is numbered sequentially from 1). Once the data are there, you could perform a query along the following lines:
INSERT INTO Documentation (ports, vlans) SELECT port, vlan FROM t_Ports JOIN t_Vlan USING (id);
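Spelled out, the whole procedure could look like this sketch. The staging table definitions and column types are assumptions; the file paths and the Documentation columns are taken from the question. Ordinary tables are used here so the steps also work if the client opens a new connection between statements.

```sql
-- Staging tables; the column types are assumptions.
CREATE TABLE t_Ports (
  id   INT AUTO_INCREMENT PRIMARY KEY,
  port VARCHAR(255)
);
CREATE TABLE t_Vlan (
  id   INT AUTO_INCREMENT PRIMARY KEY,
  vlan VARCHAR(255)
);

-- Load each file into its own table; id is filled in automatically.
LOAD DATA INFILE 'C:\\ports.txt' INTO TABLE t_Ports (port);
LOAD DATA INFILE 'C:\\vlan.txt'  INTO TABLE t_Vlan (vlan);

-- Pair line N of one file with line N of the other.
INSERT INTO Documentation (ports, vlans)
SELECT port, vlan FROM t_Ports JOIN t_Vlan USING (id);

-- Clean up the staging tables.
DROP TABLE t_Ports, t_Vlan;
```

Because the inner join keeps only ids present in both tables, any surplus lines in the longer file are dropped silently; comparing the two row counts first is a cheap check.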

MySQL backup multi-client DB for single client

I am facing a problem for a task I have to do at work.
I have a MySQL database which holds the information of several clients of my company and I have to create a backup/restore procedure to backup and restore such information for any single client. To clarify, if my client A is losing his data, I have to be able to recover such data being sure I am not modifying the data of client B, C, ...
I am not a DB administrator, so I don't know if I can do this using standard mysql tools (such as mysqldump) or any other backup tools (such as Percona Xtrabackup).
To back up, my research (and my intuition) led me to this possible solution:
create the restore insert statements using the insert-select syntax (http://dev.mysql.com/doc/refman/5.1/en/insert-select.html);
save these inserts into a sql file, either in the proper order or allowing the script to temporarily disable the foreign key checks so that the foreign key constraints are met;
of course, I do this for all my clients on a daily basis, using one file for each client (and day).
Then, in the case I have to restore the data for a specific client:
I delete all his data left;
I restore the correct data using his sql file I created during the backup.
This way I believe I can recover the right data for client A without touching the data of client B. Would my solution work? Is there any better way to achieve the same result? Or do you need more information about my problem?
Please forgive me if this question is not well-formed, but I am new here and this is my first question, so I may be imprecise. Thanks anyway for the help.
Note: we will also backup the entire database with mysqldump.

You can use mysqldump's --where option to provide a condition such as client_id=N. Of course, I am making an assumption here, since you don't provide any information on your schema.
If you have a star schema, then you could probably write a small script that backs up all the lookup tables (assuming they are reasonably small) using the --tables option, and uses the --where condition for your client data table. For additional performance, perhaps you could partition that table by client_id.

How to dump database from mysql with sensitive data removed or corrupted?

I am using mysql. Some of the tables contain sensitive data like user names, email addresses, etc. I want to dump the data but with these columns in the table removed or modified to some fake data. Is there any way to do it easily?

I'm using this approach:
Copy contents of sensitive tables to a temporary table.
Clear/encrypt the sensitive columns.
Provide --ignore-table arguments to mysqldump.exe to leave the original tables out.
It preserves foreign key constraints, and you can keep columns that are not sensitive.
The first two actions are contained in a stored procedure that I call before doing the dump. It looks something like this:
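BEGIN
  -- Rebuild the anonymized copy from scratch.
  TRUNCATE TABLE person_anonymous;
  INSERT INTO person_anonymous SELECT * FROM person;
  -- Replace the sensitive values with hashes, so equal values still match.
  UPDATE person_anonymous
  SET Title = NULL, Initials = MID(MD5(Initials), 1, 10), Midname = MD5(Midname),
      Lastname = MD5(Lastname), Comment = MD5(Comment);
END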
As you can see, I'm not clearing the contents of the fields. Instead, I keep a hash. That way, you can still see which rows have the same value, and between exports you can see if something changed or not, without anyone being able to read the actual values.

There is a tool called Jailer that is typically used to export a subset of a database. We use this at work to create a smaller test database from a production backup, with all sensitive data obfuscated.
The GUI is a bit crude, but Jailer is the best alternative I have found so far.
You can simply deselect the sensitive tables or columns and get a full copy of the rest. Jailer also supports obfuscating data during export: you could, for instance, MD5-hash all user names or change all email addresses to user@example.org.
There is a tutorial to get you started.

ProxySQL is another approach.
Here is an article explaining how to obfuscate data with ProxySQL:
https://proxysql.com/blog/obfuscate-data-from-mysqldump

partial restore from sql dump?

I have a table that has 7000 rows.
I added a new column to this table.
The table also has a MySQL DateTime column.
When I updated the table to fill in the new column, the update also changed the datetime.
I took an SQL dump just before I did the update, so now I need to use that dump to revert the datetime (and only that column).
How do I do that?

There are a couple ways I can think of to do this off the top of my head.
The first is to create another MySQL database and load the dump into that database (make sure a USE command in the dump isn't going to load it into the first database), and then use the data from that database to construct the update queries for the first.
The second, easier, more hackish way, is to open the dump in a text editor, pull out just that table, and find and replace to make update statements for just that column based on primary key instead of inserts. You'd need to be able to find and replace on patterns.
A third way would be to load the dump in an abstract sql tool letting it do the parsing for you, and write new queries from the data in the abstract syntax trees.
A fourth, again hackish, possibility, if this isn't a live system, is to roll back to the dump and re-apply the more recent transformations (only if they are simple).

Restore the dump to a second table. Select the ID and datetime from that table. Use those results to update the rows in the original table corresponding to the IDs you got.
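As a sketch, assuming the dumped rows have been restored under a different table name (or into a scratch database), the update could be a single joined statement. All names here are hypothetical: my_table is the live table, my_table_restored holds the rows from the dump, id is the primary key and updated_at is the DateTime column.

```sql
-- Copy only the datetime column back from the restored rows,
-- matching rows by primary key.
UPDATE my_table AS t
JOIN my_table_restored AS r ON r.id = t.id
SET t.updated_at = r.updated_at;
```

Rows added after the dump was taken have no match in the restored table and are left untouched.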

question about MySQL database migration

I have a MySQL database with several tables on a live server, and I would like to migrate this database to another server. The migration I mean here also involves changes to some database tables, for example adding new columns to several tables, adding some new tables, etc.
Now, the only method I can think of is to use some php/python(two scripts I know) script, connect two databases, dump the data from the old database, and then write into the new database. However, this method is not efficient at all. For example: in old database, table A has 28 columns; in new database, table A has 29 columns, but the extra column will have default value 0 for all the old rows. My script still needs to dump the data row by row and insert each row into the new database.
Using MySQLDump etc.. won't work. Here is the detail. For example: I have FOUR old databases, I can name them as 'DB_a', 'DB_b', 'DB_c', 'DB_d'. Now the old table A has 28 columns, I want to add each row in table A into the new database with a new column ID 'DB_x' (x to indicate which database it comes from). If I can't differentiate the database ID by the row's content, the only way I can identify them is going through some user input parameters.
Are there any tools, or a better method than writing a script myself? I don't need to worry about concurrent writes and the like; the old database will be down (not open to public use, only for the upgrade) for a while.
Thanks!!

I don't entirely understand your situation with the columns (wouldn't it be more sensible to add any new columns after migration?), but arguably one of the fastest methods to copy a database across servers is mysqlhotcopy. It can copy MyISAM tables only and has a number of other requirements, but it's awfully fast because it skips the create dump / import dump step completely.

Generally when you migrate a database to new servers, you don't apply a bunch of schema changes at the same time, for the reasons that you're running into right now.
MySQL has a dump tool called mysqldump that can be used to easily take a snapshot/backup of a database. The snapshot can then be copied to a new server and installed.
You should figure out all the changes that have been done to your "new" database, and write out a script of all the SQL commands needed to "upgrade" the old database to the new version that you're using (e.g. ALTER TABLE a ADD COLUMN x, etc). After you're sure it's working, take a dump of the old one, copy it over, install it, and then apply your change script.

Use mysqldump to dump the data to a file, copy the file to the new server, then load it there with mysql < output.txt. Now the old data is on the new server. Manipulate as necessary.

Sure there are tools that can help you achieving what you're trying to do. Mysqldump is a premier example of such tools. Just take a glance here:
http://dev.mysql.com/doc/refman/5.1/en/mysqldump.html
What you could do is:
1) You make a dump of the current db, using mysqldump (with the --no-data option) to fetch the schema only
2) You alter the schema you have dumped, adding new columns
3) You create your new schema (mysql < dump.sql - just google for mysql backup restore for more help on the syntax)
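4) Dump your data using mysqldump with the --no-create-info and --complete-insert options (see link above), so the dump contains only INSERT statements with explicit column names and does not recreate the old tables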
5) Import your data, using mysql < data.sql
This should do the job for you, good luck!

Adding extra columns can be done on a live database:
ALTER TABLE [table-name] ADD COLUMN [column-name] MEDIUMINT(8) DEFAULT 0;
MySQL will set the new column to the default value in all existing rows.
So here is what I would do:
1) Make a copy of your old database with the mysqldump command.
2) Run the resulting SQL file against your new database; now you have an exact copy.
3) Write a migration.sql file that modifies your database with ALTER TABLE commands and, for complex conversions, some temporary MySQL procedures.
4) Test your script (if it fails, go back to (2)).
5) If all is OK, go back to (1) and go live with your new database.

These are all valid approaches, but I believe you want to write a sql statement that writes other insert statements that support the new columns you have.
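One way to read this suggestion is a SELECT that builds one INSERT statement per old row and appends the source database as an extra value. In the sketch below, col1, col2 and source_db are hypothetical column names (the real table has 28 columns, so the list would be longer), and DB_a is one of the old databases named in the question.

```sql
-- Emits one INSERT statement per row of the old table A in DB_a.
-- QUOTE() escapes each value and renders NULL as the bare word NULL.
SELECT CONCAT(
  'INSERT INTO A (col1, col2, source_db) VALUES (',
  QUOTE(col1), ', ', QUOTE(col2), ', ', QUOTE('DB_a'), ');'
) AS stmt
FROM DB_a.A;
```

The output can be saved to a file and run against the new database, and the query repeated with 'DB_b', 'DB_c' and 'DB_d' for the other old databases.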
