Is it possible to set up different indexing on a read-only slave than on the master? This seems to make sense given the different requirements of the two systems, but I want to make sure it will work and not cause any problems.
I believe so. Once replication is working, you can drop the indexes on the slave and create the indexes you want, and that should do it. Since MySQL replicates statements rather than data (at least by default), as long as the SQL needed to insert, update, or select from the table doesn't have to change, the slave shouldn't notice.
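For example (table and index names here are hypothetical), once replication is running you could do something like this on the slave:

-- Run on the slave only; nothing here is replicated back to the master.
ALTER TABLE orders DROP INDEX idx_created_at;             -- an index you only need on the master
ALTER TABLE orders ADD INDEX idx_customer (customer_id);  -- an index tailored to slave-side queries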
Now there are obviously downsides to this. If you create a unique key on the slave that isn't on the master, data could be inserted on the master that can't be inserted on the slave. If an update uses an index on the master, it may run fast there but cause a table scan on the slave (since the slave doesn't have that handy index).
And if any DDL changes ever happen on the master (such as altering an index), they will be passed on to the slave and the new index will be created there as well, even though you don't want it.
For sure. I do it all the time. Issues I've run into:
Referencing indexes via FORCE/USE/IGNORE INDEX in SELECTs will error out
Referencing indexes in ALTER statements on the master can break replication
Adds another step to promoting a slave to be the master in case of emergency
If you're using statement-based replication (the norm) and you're playing around with UNIQUE indexes, any INSERT ... ON DUPLICATE KEY, INSERT IGNORE, or REPLACE statements will cause extreme data drift / divergence (see the sketch after this list)
Performance differences (both good and bad)
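To illustrate the UNIQUE-index point, here is a minimal sketch (table and column names are made up) of how statement-based replication diverges when the slave has a unique key the master lacks:

-- Slave only: the master has no unique key on email.
ALTER TABLE users ADD UNIQUE KEY uniq_email (email);

-- Statement-based replication replays this statement verbatim on the slave.
-- On the master (no unique key) it simply inserts a second row; on the slave
-- the ON DUPLICATE KEY branch fires and updates the existing row instead,
-- so the two servers now hold different data.
INSERT INTO users (email, visits) VALUES ('a@example.com', 1)
ON DUPLICATE KEY UPDATE visits = visits + 1;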
Sure, I think it's even a common practice to replicate InnoDB tables into MyISAM tables on the slave to be able to add full-text indexes.
I'm currently thinking about the following problem:
A customer has set up a simple master/slave replication between two MariaDB systems. For unknown reasons they set the flag "Replicate_Wild_Ignore_Table" to skip "logdb.%". They have now decided to undo that skipping and want logdb to be included in the replication again.
I'm curious: is it possible to somehow remove that flag and have the database in question be replicated like the rest, or is there no way around the "stop slave, dump master, import dump, recreate replication based on the current log position, start slave" procedure?
You can't assume that the master still has all relevant binlogs that once contained updates to the logdb.% tables. That is, even if you could re-apply those updates, do you have enough history to account for all changes to the tables?
Another risk, if you use statement-based replication: if there were ever statements that referenced both a table in logdb.% and a table in another database, the replication filter skipped that statement. For example:
INSERT INTO mydb.mytable SELECT * FROM logdb.othertable;
Therefore even the tables that are not in logdb.% might be compromised. The point is you don't know for sure.
The bottom line is that you should definitely reinitialize the replica now by taking a current backup of the master, and avoid using replication filters in the future.
If you use InnoDB tables, you might consider using Percona XtraBackup to make the process easier. See https://www.percona.com/doc/percona-xtrabackup/2.3/howtos/setting_up_replication.html
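For reference, a minimal sketch of the reinitialization itself, assuming hypothetical host and binlog coordinates (taken from the header of a dump made with mysqldump --single-transaction --master-data=2), and assuming you have already removed the replicate-wild-ignore-table line from the replica's my.cnf:

-- On the replica, after importing the fresh dump:
STOP SLAVE;
CHANGE MASTER TO
  MASTER_HOST = 'master.example.com',
  MASTER_LOG_FILE = 'mysql-bin.000123',  -- coordinates from the dump header
  MASTER_LOG_POS = 4;
START SLAVE;
SHOW SLAVE STATUS\G  -- verify Seconds_Behind_Master and that no filters remain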
I need a fresh opinion on this case. Any thoughts are appreciated.
Input: we have a huge Percona MySQL (5.5) database that takes up a couple of terabytes. Tables use the InnoDB engine.
More than half (about 2/3) of that data should be deleted as quickly as possible.
We also have a master-slave configuration.
As the quickest way to achieve that, I am considering the following solution (a sketch in SQL follows the list):
Execute for each table on the slave server (to avoid production downtime):
Stop replication
Select the rows NOT to be deleted into an empty new table that has the same structure as the original table
Rename the original table to "table_old" and the new table to the correct name
Drop the original table "table_old"
Start replication
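In SQL, ignoring the FK complication for a moment, the per-table procedure might look roughly like this (table name and keep condition are hypothetical):

-- On the slave, with replication stopped:
CREATE TABLE events_new LIKE events;       -- same structure and indexes
                                           -- (note: LIKE copies indexes but not FK constraints)
INSERT INTO events_new
  SELECT * FROM events
  WHERE created_at >= '2015-01-01';        -- only the rows to keep
RENAME TABLE events TO events_old,
             events_new TO events;         -- atomic swap
DROP TABLE events_old;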
The problem is that we have a lot of FK constraints. I am also afraid of breaking replication during this process.
Questions:
1) What potential problems could the FK constraints cause in this solution?
2) How do I avoid breaking replication?
3) Opinions? Alternative solutions?
Thank you in advance.
If you can take the DB offline (i.e. no one is accessing it except you) for a while, you can go with your solution, but you need to drop the FKs involved beforehand and recreate them afterwards (see the sketch below). You should also check for AUTO_INCREMENT columns, whose counters will change with the copy operation.
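A rough sketch of the FK handling, with made-up constraint and column names:

-- Before the copy/rename/drop:
ALTER TABLE child_table DROP FOREIGN KEY fk_child_events;
-- ... copy the rows to keep, swap the tables, drop the old one ...
-- Afterwards, recreate the constraint:
ALTER TABLE child_table
  ADD CONSTRAINT fk_child_events FOREIGN KEY (event_id) REFERENCES events (id);
-- And, if needed, realign the counter (the value is an example):
ALTER TABLE events AUTO_INCREMENT = 1000000;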
The FKs are needed if you want the DB to stay online. I had a similar problem with some huge log tables; any attempt to delete all the records at once will probably lock the database or corrupt the table.
So I went for a slow approach: I wrote a procedure that deletes batches of rows from the tables using the clustered primary key, and then I scheduled it to run every n seconds.
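The core of such a procedure is just a bounded delete, something like this (names and batch size are made up):

DELETE FROM big_log
WHERE created_at < '2015-01-01'
ORDER BY id        -- clustered primary key: delete in index order
LIMIT 10000;       -- small batches keep lock times short
-- Re-run on a schedule (event or cron job) until ROW_COUNT() returns 0.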
We have a MySQL table with more than 7,000,000 (yes, seven million) rows.
We constantly run many SELECT / INSERT / UPDATE queries, every 5 seconds or so.
Is it a good idea to create a MySQL index for that table? Will there be bad consequences like data corruption or losing the MySQL service, etc.?
Little info:
MySQL version 5.1.56
Server CentOS
Table engines are MyISAM
MySQL CPU load is always between 200% and 400%
In general, indexes will improve the speed of SELECT operations and will slow down INSERT/UPDATE/DELETE operations, as both the base table and the indexes must be modified when a change occurs.
It is very difficult to say. I would expect the indexing itself to take some time, but after that you should see some improvement. As @Joe and @Patrick said, it might hurt your modification time, but the selects will be faster.
Of course, there are other ways of improving insert and update performance. You could, for instance, batch updates if it is not important to have changes visible immediately.
The indexes will help dramatically with selects, especially if they match up well with the commonly filtered fields and you have a good, simple primary key. They will help with both query time and processing cycles.
The drawback is if you very often update/alter/delete these records, especially the indexed fields. Even then, though, it is often worth it.
How much you're going to be reporting (SELECT statements) vs. updating hugely affects (and should!) both your initial design and your later adjustments once your DB is in the wild. Since you already have what you have, testing will give you the answers you need. If you really do run a lot of select queries and a lot of updates, your solution might be to copy data out to a reporting table now and then. Then you can index like crazy with no ill effects.
You have actually asked a large question, and you should study up on this more. The general points above hold for almost all relational DBs, but each database (MySQL in your case) has its own particular behaviors, mainly in how it decides when and where to use indexes.
If you are looking for performance, indexes are the way to go. Indexes speed up your queries. With 7 million records, your queries probably take many seconds, possibly a minute, depending on your memory size.
Generally speaking, I would create indexes that match the most frequent SELECT statements. Everyone talks about the negative impact of indexes on table size and speed, but I would neglect those impacts unless you have a table where 95% of the activity is inserts and updates. Even then, if those inserts happen at night and you query during the day, go ahead and create those indexes; your daytime users will appreciate it.
What is the actual time impact of an additional index on an insert or update statement, maybe 0.001 seconds? If the index saves you many seconds per query, the additional time required to update the index is well worth it.
The only time I ever had an issue with creating an index (it actually broke the program logic) was when we added a primary key to a table that had previously been created (by someone else) without one, and the program expected SELECT statements to return records in the sequence they were created. Adding the primary key changed that: records selected without a WHERE clause came back in a different order.
That is obviously a wrong design in the first place; nevertheless, if you maintain an older program and encounter tables without a primary key, I suggest looking at the code that reads those tables before adding one, just in case.
One last thought about creating indexes: the choice of fields and the order in which the fields appear in the index have an impact on the performance of the index.
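For instance, if the most frequent query filters on a customer and a date range (hypothetical schema), a matching composite index would be:

-- Query: SELECT ... FROM orders WHERE customer_id = ? AND created_at >= ?
CREATE INDEX idx_customer_created ON orders (customer_id, created_at);
-- Column order matters: put the equality column first, the range column second.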
I had the same kind of problem that you describe.
I made a few changes and one query went from 11 seconds to a few milliseconds:
1- Upgraded to MariaDB 10.1
2- Changed all my DBs to the Aria engine
3- Stripped my.cnf down to the strict minimum
4- Upgraded to PHP 7.1 (but this one had little impact)
5- On CentOS: ran "yum update" in the terminal or via SSH (keeping everything up to date)
1- MariaDB is an open-source fork of MySQL
2- The Aria engine is MariaDB's evolution of MyISAM
3- my.cnf files usually carry too many changes that affect performance
Here's an example:
[mysqld]
performance-schema=1
general_log=0
slow_query_log=0
max_allowed_packet=268435456
By removing all extra options from my.cnf, you're telling MySQL to use its default values.
In MySQL 5 (5.1, 5.5, 5.6...), when I did that I only noticed a small difference.
But in MariaDB, a small my.cnf like this made a BIG difference.
Throughout all of those changes, the server hardware remained the same.
Hope it can help you
I'm trying to convert a 10-million-row MySQL MyISAM table into InnoDB.
I tried ALTER TABLE, but it made my server hang, so I killed MySQL manually. What is the recommended way to do this?
Options I've thought about:
1. Making a new table which is InnoDB and inserting parts of the data each time.
2. Dumping the table into a text file and then doing LOAD FILE
3. Trying again and just keeping the server unresponsive until it finishes (I tried for 2 hours; it's a production server, so I'd prefer to keep it running)
4. Duplicating the table, removing its indexes, converting, and then adding the indexes back
Changing the engine of the table requires a rewrite of the table, which is why the table is unavailable for so long. Removing indexes, then converting, then adding indexes may speed up the initial conversion, but adding an index creates a read lock on the table, so the effect in the end will be the same.
Making a new table and transferring the data is the way to go (see the sketch below). Usually this is done in two parts: first copy the records, then replay any changes that were made while copying. If you can afford to disable inserts/updates on the table while leaving reads enabled, this is not a problem.
If not, there are several possible solutions. One of them is to use Facebook's online schema change tool. Another option is to have the application write to both tables while migrating the records, then switch over to the new table only. This depends on the application code, and the crucial part is handling unique keys / duplicates, since in the old table you may update a record while in the new one you need to insert it (transaction isolation level may also play a crucial role here; lower it as much as you can).
The "classic" way is to use replication, which, as far as I know, is also done in two parts: you start replication and record the master position, then import a dump of the database on the second server and start it as a slave to catch up with the changes.
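A rough sketch of the copy approach (hypothetical table posts, copied in 1M-row chunks by primary key):

CREATE TABLE posts_innodb LIKE posts;          -- copies structure (still MyISAM)
ALTER TABLE posts_innodb ENGINE = InnoDB;      -- cheap while the table is empty
INSERT INTO posts_innodb SELECT * FROM posts WHERE id BETWEEN 1 AND 1000000;
INSERT INTO posts_innodb SELECT * FROM posts WHERE id BETWEEN 1000001 AND 2000000;
-- ... continue in chunks, replay rows changed during the copy, then swap:
RENAME TABLE posts TO posts_myisam, posts_innodb TO posts;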
Have you tried ordering your data by the primary key first? E.g.:
ALTER TABLE tablename ORDER BY PK_column;
should speed up the conversion.
We have a production table with a very critical column (a date) that is missing an index. Is there any way to add that index without user impact?
The table currently gets around 5-10 inserts every second, so full table locking is out; redirecting those inserts to an alternative table/database, even temporarily, has also been denied (for corporate-politics reasons). Any other ways?
As far as I know, this is not possible with MyISAM. With 5-10 INSERTs per second you should consider InnoDB anyway, unless you're not reading that much.
Are you using replication, preferably in a master-master setup? (You should!) If so, you could run CREATE INDEX on the standby server, switch roles, do the same on the other node, then switch back. Be sure to keep the statement from replicating (when using master-master) so the CREATE INDEX doesn't reach the active node; one way is sketched below.
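One way to keep the CREATE INDEX out of the binlog on the standby, so the active node never sees it (table and column names are hypothetical):

SET SESSION sql_log_bin = 0;                 -- requires SUPER; affects this session only
CREATE INDEX idx_date ON mytable (date_col);
SET SESSION sql_log_bin = 1;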
Depending on whether you use that table primarily to archive logs or similar, you might as well look into the ARCHIVE storage engine.