I'm experimenting with various indexing settings for my MySQL database.
I wonder though, by removing or adding indexes is there any possibility to damage data rows in any way? Obviously I realise that if I make any application queries fail, that can cause bad rows. I'm more talking just about the structural queries themselves.
Or will I simply affect the efficiency of the database?
I just want to know if I have safety to experiment or if I have to be cautious?
The data isn't in phpMyAdmin, it's in MySQL. Adding or removing an index will not affect your data integrity by default. The exception is adding a unique index with the IGNORE keyword (ALTER IGNORE TABLE): rows that duplicate a unique key value are silently dropped.
That said - you should always have a backup of your data, it's easy to run a test like:
CREATE TABLE t1 LIKE t;
INSERT INTO t1 SELECT * FROM t;
ALTER TABLE t1 ADD INDEX ...;
Then compare the difference in tables (perhaps a COUNT is fine in your case).
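The copy-and-compare test above can be scripted. The following is only a minimal sketch, not a MySQL recipe: it assumes Python's built-in sqlite3 module with an in-memory database as a stand-in so that it runs anywhere, and the table t, its columns, and the index name idx_name are made up for illustration. Against MySQL the same idea applies using CREATE TABLE ... LIKE and a MySQL client library.

```python
import sqlite3

def snapshot(conn, table):
    """Return all rows of `table` in a stable order, for comparison."""
    return conn.execute(f"SELECT * FROM {table} ORDER BY id").fetchall()

# In-memory SQLite database used as a stand-in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO t (id, name) VALUES (?, ?)",
                 [(i, f"row{i}") for i in range(1, 101)])

# Work on a copy, never on the original table.
conn.execute("CREATE TABLE t1 AS SELECT * FROM t")
before = snapshot(conn, "t1")

# Add a (hypothetical) index and take a second snapshot.
conn.execute("CREATE INDEX idx_name ON t1 (name)")
after_add = snapshot(conn, "t1")

# Remove the index again and take a third snapshot.
conn.execute("DROP INDEX idx_name")
after_drop = snapshot(conn, "t1")

rows_unchanged = (before == after_add == after_drop)
print("rows unchanged:", rows_unchanged)
```

Comparing full row snapshots is stricter than a COUNT; on a large table a COUNT or a checksum is the cheaper check.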
Adding/removing indexes is safe in terms of the rows in your table. However as you note, too many indexes or poorly constructed indexes can be (very) detrimental to performance. Likewise, adding indexes on large tables can be a very expensive process, and can bring a MySQL server to its knees, so you're better off not "experimenting" on production tables.
I have 10 large read-only tables. I make a lot of queries that have the same format with different parameters, for example:
SELECT 'count' FROM table1 WHERE x='SOME VARIABLE';
SELECT 'count' FROM table1 WHERE x='SOME OTHER VARIABLE';
...
SELECT 'count' FROM table2 WHERE x='' AND y='';
SELECT 'count' FROM table2 WHERE x='' AND y='';
...
SELECT 'count' FROM table3 WHERE x='' AND y='' AND z='';
The variables used are different each query so I almost never execute the same query twice. Is it correct that query caching and row caching on the MySQL side would be wasteful and they should be disabled? Table caching seems like it would be a good thing.
On the client side, I am using prepared statements which I assume is good. If I enable Prepared statement caching (via Slick) though won't that hurt my performance since the parameters are so variable? Is there anything else I can do to optimize my performance?
Should auto-commit be off since I'm only doing selects and will never need to rollback?
Given that you are using the MyISAM engine and have tables with hundreds of millions of active rows, I would worry less about query caching (given the low complexity of your queries, it is most likely the least of your problems) and focus more on organizing the data properly within the database:
Prepared statements are totally OK. It may help not to prepare the statement over and over again. Instead, reuse the existing prepared statement (some environments even allow storing prepared statements on the client side) with a new set of parameter values. However, this mainly saves only the time spent parsing and preparing the statement. As the complexity of your query is quite low, this is unlikely to be the biggest time consumer.
Key caching (also called key buffering), however, is, as the name already suggests, key for your game! Many MySQL configurations suffer greatly from wrong values in that area, because the buffers are way too small. In a nutshell, key caching makes sure that the references to the data (for instance in your indices) can be accessed in main memory. If they are not in memory, they need to be retrieved from disk, which is slow. To see whether your key cache is efficient, watch the key hit ratio while your system is under load. This is explained well at https://dba.stackexchange.com/questions/58182/tuning-key-reads-in-mysql.
If the caches become large or are frequently displaced by the usage of other tables, it may help to create dedicated key caches for your tables. For details, see https://dev.mysql.com/doc/refman/5.5/en/cache-index.html
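As a rough illustration of the key hit ratio mentioned above: MySQL exposes the status counters Key_reads (key blocks physically read from disk) and Key_read_requests (requests to read a key block from the cache), and the miss rate is Key_reads / Key_read_requests. The sketch below just does that arithmetic; the function name and the sample counter values are hypothetical.

```python
def key_hit_ratio(key_reads, key_read_requests):
    """Fraction of key block requests served from the key cache.

    key_reads: physical reads of key blocks from disk (cache misses).
    key_read_requests: requests to read a key block from the cache.
    """
    if key_read_requests == 0:
        return None  # no index reads yet, so the ratio is undefined
    return 1.0 - key_reads / key_read_requests

# Hypothetical counter values, as if copied from
# SHOW GLOBAL STATUS LIKE 'Key_read%';
ratio = key_hit_ratio(key_reads=50, key_read_requests=10_000)
print(f"key hit ratio: {ratio:.3f}")
```

The counters are cumulative since server start, so sampling them twice under load and using the differences gives a more meaningful number than a single reading.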
If you always access large portions of your table via the same attributes, it may make sense to change the ordering of the data storage on the disk by using ALTER TABLE ... ORDER BY expr1, expr2, .... For details on this approach see also https://dev.mysql.com/doc/refman/5.5/en/optimizing-queries-myisam.html
Avoid using variable-length columns, such as VARCHAR, BLOB or TEXT. They might help to save some space, but comparing their values in particular can become time-consuming. Please note that a single column of such a type is enough to make MySQL switch the table to the dynamic row format.
Run ANALYZE TABLE after huge data changes to keep the statistics up to date. If you have deleted huge areas, it might help to OPTIMIZE TABLE, helping to make sure that there are no large gaps around which need to be skipped when reading.
Use INSERT DELAYED to write changes asynchronously, if you do not need the reply. This will greatly improve your performance, if there are other SELECT statements around at the same point in time.
Alternatively, if you need the reply, you may use INSERT LOW_PRIORITY. Concurrent SELECTs are then given priority over your INSERT. This may ease the pain a little of MyISAM supporting only table-level locking.
You may try to provide index hints in your queries, especially if there are multiple overlapping indices on your table. Try to use the index that has the smallest width but still covers the most attributes.
However, please note that in your case the impact is probably quite small: you are not ordering, grouping or joining, so the query optimizer should already be very good at finding the best one. Simply run EXPLAIN on your SELECT statement to check whether the choice of index is reasonable.
In short, prepared statements are totally OK. Key caching is key, and there are some other things you can do to help MySQL cope with the whole bulk of data.
I have a table in my MySQL database with around 5M rows. Inserting rows into the table is too slow because MySQL updates the indexes while inserting. How can I stop index updating while inserting and do the indexing separately later?
Thanks
Kamrul
Sounds like your table might be over indexed. Maybe post your table definition here so we can have a look.
You have two choices:
Keep the current indexes and remove unused ones. If you have 3 indexes on a table, every single write to the table causes 3 writes to the indexes. An index is only helpful during reads, so you might want to remove unused indexes. During a load the indexes are updated, which slows the load down.
Drop your indexes before the load, then recreate them afterwards. The rebuild might take longer than the slow inserts, and you will have to rebuild all indexes one by one. Also, rebuilding a unique index can fail if duplicates were loaded while the index was absent.
I suggest you take a good look at the indexes on the table and remove those that are not used in any queries. Then try both approaches and see what works for you. There is no way I know of in MySQL to disable indexes, as the inserted values need to be written to their internal structures.
Another thing you might want to try is to split the I/O over multiple drives, i.e. partition your table over several drives to get some hardware performance in place.
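The second choice above, including the unique-index pitfall, can be sketched as follows. This is an illustration under stated assumptions, not a MySQL procedure: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in, and the users table, its columns, and the index name idx_email are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, email TEXT)")
conn.execute("CREATE UNIQUE INDEX idx_email ON users (email)")

# Step 1: drop the index before the bulk load.
conn.execute("DROP INDEX idx_email")

# Step 2: load the data; nothing enforces uniqueness now, so a duplicate slips in.
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(1, "a@example.com"), (2, "b@example.com"), (3, "a@example.com")])

# Step 3: rebuilding the unique index fails because of the duplicate.
try:
    conn.execute("CREATE UNIQUE INDEX idx_email ON users (email)")
    rebuild_failed = False
except sqlite3.DatabaseError as exc:
    rebuild_failed = True
    print("rebuild failed:", exc)

# Step 4: list the offending values so they can be cleaned up before retrying.
duplicates = conn.execute(
    "SELECT email, COUNT(*) FROM users GROUP BY email HAVING COUNT(*) > 1"
).fetchall()
print("duplicates:", duplicates)
```

Running a duplicate check like the one in step 4 before the rebuild avoids finding out about the problem halfway through a long index build.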
I was wondering, if I add one index for each field in every table of my DB, will that make my queries run faster?
Or do I have to analyze my queries and create indexes only when required?
Adding an index on each column will probably make most of your queries faster, but it's not necessarily the best approach. It is better to tune your indexes to your specific queries, using EXPLAIN and performance measurements to guide you in adding the correct indexes.
In particular you need to understand when you shouldn't index a column, and when you need multi-column indexes.
I would advise reading the MySQL manual for optimization of SELECT statements which explains under what conditions indexes can be used.
The more indexes you have, the heavier inserting and updating gets, so it's a tradeoff. The select queries that cannot use an index now will get quicker of course, but if you check which fields you're joining on (or using in a WHERE clause) and index only those, you will not trade off that much
(and, of course, there is the disk space, but most of the time I don't really care about that :) )
Another point is that MySQL will generally use only a single index per table in a query, so if your query is
SELECT * FROM table WHERE status = 1 AND col1='foob' AND col2 = 'bar'
MySQL will use one of the indexes and apply the remaining conditions by reading the data from the table.
If you have queries like this, it's better to create a composite index on (status, col1, col2)
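To check that a composite index is actually picked up for such a query, you can inspect the query plan. The sketch below assumes Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN as a stand-in for MySQL's EXPLAIN; the table name items and the index name idx_status_col1_col2 are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, status INTEGER, col1 TEXT, col2 TEXT)"
)

# One composite index covering all three columns used in the WHERE clause.
conn.execute("CREATE INDEX idx_status_col1_col2 ON items (status, col1, col2)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT * FROM items WHERE status = 1 AND col1 = 'foob' AND col2 = 'bar'"
).fetchall()

# The last column of each plan row is a human-readable description.
plan_text = " ".join(row[-1] for row in plan)
uses_composite = "idx_status_col1_col2" in plan_text
print(plan_text)
```

In MySQL the equivalent check is to prefix the SELECT with EXPLAIN and look at the key column of the output.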
Adding index on every field in every table is not smart.
You should add indexes ONLY on columns that you use in the WHERE clause in select OR on which you sort.
Often, the best results are achieved by using multi-column indexes that are specific to your SQL selects.
There are also prefix indexes, which limit the indexed length of a field and can also be used to optimize performance and reduce the index size.
Every unnecessary index will slow down the database during the insert because on every insert, every index has to be updated.
Also the more indexes you have, the more chances you have of data corruption. And lastly, indexes take extra storage space on disk, sometimes a lot of space.
Also, MySQL tries to keep indexes in memory. If you have unnecessary indexes, there is a good chance MySQL will end up filling the available memory with them, in which case your performance will degrade considerably.
Creating the right kind of indexes is probably the single most important optimization technique. That's why, when someone asks something like this, I think it is a joke.
This question can only be asked by someone who has not read a single book on MySQL. Just get a good book and read it, then you will not have to ask questions like this.
In MySQL I have a lot of slow queries, all related to UPDATE and DELETE statements. The tables have 2 indexed columns each and are not heavily indexed. Each table has 30K records on average.
Please give your suggestions on how to overcome the slow update and delete queries. These kinds of queries:
DELETE FROM <table2>
WHERE ID IN (SELECT ID
FROM <table1> WHERE ID2=100);
...or:
UPDATE <table1>
SET <columnname>=0
WHERE id=1001;
Being that the tables are indexed, my first suggestion is to update the statistics for the tables using ANALYZE TABLE:
ANALYZE TABLE table1, table2
But beware:
During the analysis, the table is locked with a read lock for MyISAM, BDB, and InnoDB.
The fact that your problem only exists on updates and deletes tells me that you are probably indexing too much.
Indexes will vastly reduce the time that certain queries take, but will require extra work when inserting, updating, and deleting entries from your tables. Try eliminating indexes on columns that are often getting updated, especially if they don't often appear in a where clause in your SQL queries.
Managing indexes and foreign key relationships both incur overhead during update and delete operations. I would restore a copy of your prod DB on a dev server, drop all foreign key constraints and all indexes except your primary keys, and see the performance difference. Then you can add back your indexes until you have a better performance balance for your app.
First and foremost: Use stored procedures.
Next: If your db has optimizing capabilities -> use them.
Finally: Consider using non-relational DBs like CouchDB or Cassandra instead ;-)
I'm doing a lot of INSERTs via LOAD DATA INFILE on MySQL 5.0. After many inserts, say a few hundred million rows (InnoDB, PK + a non-unique index, 64-bit Linux, 4GB RAM, RAID 1), the inserts slow down considerably and appear I/O bound. Are partitions in MySQL 5.1 likely to improve performance if the data flows into separate partition tables?
The previous answer is erroneous in his assumptions that this will decrease performance. Quite the contrary.
Here's a lengthy but informative article on the why and how of partitioning in MySQL:
http://dev.mysql.com/tech-resources/articles/partitioning.html
Partitioning is typically used, as was mentioned, to group like data together. That way, when you decide to archive off or flat out destroy a partition, your tables do not become fragmented. This, however, does not hurt performance; it can actually increase it. See, it is not just deletions that fragment; updates and inserts can also do that. By partitioning the data, you are telling the RDBMS the criteria (indices) by which the data should be manipulated and queried.
Edit: SiLent SoNG is correct. DISABLE / ENABLE KEYS only works for MyISAM, not InnoDB. I never knew that, but I went and read the docs. http://dev.mysql.com/doc/refman/5.1/en/alter-table.html#id1101502.
Updating the indexes may be what's slowing it down. You can disable the indexes while you're doing the load and turn them back on afterwards, so they are generated once for the whole table.
ALTER TABLE foo DISABLE KEYS;
LOAD DATA INFILE ... ;
ALTER TABLE foo ENABLE KEYS;
This will cause the indexes to all be updated in one go instead of per-row. This also leads to more balanced BTREE indexes.
No improvement on MySQL 5.6
"MySQL can apply partition pruning to SELECT, DELETE, and UPDATE statements. INSERT statements currently cannot be pruned."
http://dev.mysql.com/doc/refman/5.6/en/partitioning-pruning.html
If the columns INSERT checks (primary keys, for instance) are indexed - then this will only decrease the speed: MySQL will have to additionally decide on partitioning.
All queries are only improved by adding indexes. Partitioning is useful when you have tons of very old data (e.g. year<2000) which is rarely used: then it'll be nice to create a partition for that data.
Cheers!