Which Drupal MySQL tables are safe to clear (and when)?

I'm adding modules to a Drupal site and am getting dreaded fatal memory errors. I went to clear the accesslog, cache, and watchdog tables, but am still getting such errors. I'm only running one database on this site and it's for Drupal use, but I am wondering what other tables I can free up. I'm sure some of the myriad tables listed in phpMyAdmin are more critical to Drupal than others. Can I blindly clear all tables with "Overhead" or would that be a major faux pas?
Any insight would be most excellent.

No, you cannot simply clear tables at random, and doing so won't help with out-of-memory errors anyway.
There are really only two ways to fight out-of-memory errors: a) increase the PHP memory limit (64 MB should be the minimum; with many modules it may need to be even higher), and b) disable some modules.

To add to Berdir's answer, a c) would be to profile memory usage and figure out what's eating memory, and whether there's a way to optimize or cache its output.
Install the Devel module and turn on performance logging; it will tell you which queries and modules are eating up the most memory. You can also use Xdebug and a cachegrind viewer to diagnose this in depth, but that takes a while to set up.
If you truncate database tables, the problem will just come back when more stuff fills up those tables.
Best thing to do is just to increase your PHP memory limit, though.
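For reference, emptying the log and cache tables the asker mentions looks like this. Table names assume a default, unprefixed Drupal 6/7 schema, and as noted above this frees disk space but won't fix memory errors:

    -- Log/cache tables that Drupal refills on its own; emptying them
    -- loses only logs and cached output. Names assume an unprefixed install.
    TRUNCATE TABLE cache;
    TRUNCATE TABLE cache_filter;
    TRUNCATE TABLE cache_menu;
    TRUNCATE TABLE cache_page;
    TRUNCATE TABLE watchdog;
    TRUNCATE TABLE accesslog;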

Related

Is enabling innodb_dedicated_server good for performance?

From the MySQL 8 documentation:
When innodb_dedicated_server is enabled, InnoDB automatically configures the following variables:
innodb_buffer_pool_size
innodb_log_file_size
innodb_log_files_in_group (as of MySQL 8.0.14)
innodb_flush_method
Only consider enabling innodb_dedicated_server if the MySQL instance resides on a dedicated server
where it can use all available system resources. Enabling innodb_dedicated_server is not recommended
if the MySQL instance shares system resources with other applications.
Assuming the server is dedicated for MySQL, does enabling innodb_dedicated_server actually give better performance than tuning those parameters on my own?
Short answer: No, it does not improve performance any more than setting those tuning options yourself.
The variable innodb_dedicated_server was explained in detail in the post announcing the feature (2017-08-24):
https://mysqlserverteam.com/plan-to-improve-the-out-of-the-box-experience-in-mysql-8-0/
It's just a shorthand for a number of tuning options. The new variable doesn't improve performance in any special way; it's exactly the same as setting those other tuning options yourself.
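As a rough illustration (the figures below are examples for a dedicated 16 GB box, not the exact formula MySQL applies), enabling the option amounts to nothing more than hand-setting the same variables, which on 8.0 you could also do yourself from SQL:

    -- Illustrative hand-tuned equivalents for a 16 GB dedicated server; the values
    -- are assumptions for the example, not what innodb_dedicated_server would compute.
    SET PERSIST      innodb_buffer_pool_size = 12884901888;         -- ~75% of RAM, dynamic in 8.0
    SET PERSIST_ONLY innodb_log_file_size    = 2147483648;          -- read-only, takes effect after restart
    SET PERSIST_ONLY innodb_flush_method     = 'O_DIRECT_NO_FSYNC'; -- read-only, takes effect after restart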
I wrote this comment on the blog when they announced the feature:
I’m sorry, but I don’t like this feature at all. I understand the goal
of improving the out-of-the-box experience for naive users, but I
don’t think this solution will be successful at this goal.
Trying to pre-tune a MySQL installation with some formula is a
one-size-fits-all solution, and these kinds of solutions are
unreliable. We can recall examples of other products that have tried
to do this, but eventually removed their auto-tuning features.
It’s not a good assumption that the buffer pool needs as much physical
RAM as you can afford. You already know this, because you need the
innodb_dedicated_server option. Rick mentioned the possibility that
the dataset is already smaller than RAM. In this case, adding more RAM
has little or no benefit.
Many naive users mistakenly believe (after reading some blog) that
increasing RAM allocation always increases performance. It’s difficult
to explain to them why this is not true.
Likewise innodb log file. We assume that bigger is better, because of
benchmarks showing that heavy write traffic benefits from bigger log
files, because of delaying checkpoints. But what if you don’t have
heavy write traffic? What if you use MySQL for a blog or a CMS that is
99% reads? The large log file is unnecessary. Sizing it for an assumed
workload or dataset size has a high chance of being the wrong choice
for tuning.
I understand the difficulty of asking users questions during
installation. I recently did a project automating MySQL provisioning
with apt. It was annoying having to figure out debconf to work around
the installation prompts that do exist (btw, please document MySQL’s
debconf variables!).
There’s also the problem that even if you do prompt the user for
information, they don’t know the answers to the questions. This is
especially true of the naive users that you’re targeting with this
feature.
If the installer asks “Do you use MySQL on a dedicated server?” do
they even know what this means? They might think “dedicated” is simply
the opposite of shared hosting.
If the installer asks “Do you want to use all available memory on this
system?” you will be surprised at how many users think “memory” refers
to disk space, not RAM.
In short: (1) Using formulas to tune MySQL is error-prone. (2) Asking
users to make choices without information is error-prone.
I have an alternative suggestion: Make it easier for users to become
less naive about their choices.
I think users need a kind of friendly cheat-sheet or infographic of
how to make tuning decisions. This could include a list of questions
about their data size and workload, and then a list of performance
indicators to monitor and measure, like buffer pool page create rate,
and log file write rate. Give tips on how to measure these things,
what config options to change, and then how to measure again to verify
that the change had the desired effect.
A simple monitoring tool would also be useful. Nothing so
sophisticated as PMM or VividCortex for long-term trending, but
something more like pt-mext for quick, ephemeral measurements.
The only thing the installation process needs to do is tell the user
that tuning is a thing they need to do (many users don’t realize
this), and refer them to the cheat-sheet documentation.
Just tuning.
It is a challenging task to provide "good" defaults for everything. The biggest impediment is not knowing how much of the machine's RAM and CPU will be consumed by other products (Java, WordPress, etc, etc) running on the same server.
A large number of MySQL servers are used by big players; they separate MySQL servers from webservers, etc. This makes it simple for them to tweak a small number of tunables quickly when deploying the server.
Meanwhile, less-heavy users get decent tuning out-of-the-box by leaving that setting out.

MySQL was shut down during indexing - how to fix the index?

MySQL was shut down in the middle of an indexing operation.
It still works but some of the queries seem much slower than before.
Is there anything particular we can check?
Is it possible that an index gets half way through?
Thanks much
As suggested in my comment, you could try a repair on the relevant table(s).
That said, there's a section of the MySQL manual dedicated to this precise topic, which details how to use the REPAIR TABLE statement and, indeed, how to dump and re-import the data.
If this doesn't make any difference, you may need to check the database settings (if it's an InnoDB-engined table/database, it will love being able to stay resident in memory, for example) and perhaps run EXPLAIN on the queries that are causing pain to see which indexes are actually being used.
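For example, with a hypothetical table called articles, the repair and rebuild options look like this:

    -- 'articles' is a hypothetical table name used for illustration.
    CHECK TABLE articles;                  -- report any corruption
    REPAIR TABLE articles;                 -- MyISAM (and ARCHIVE/CSV) tables only
    ALTER TABLE articles ENGINE=InnoDB;    -- on an InnoDB table, this "null" rebuild recreates the table and its indexes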
There are also commercial tools such as New Relic that'll show what specific queries are being sluggish in quite a lot of detail as well as monitoring other aspects of your system, which may be worth exploring if this is a commercial project/web site.

How to measure MySQL bottlenecks?

What MySQL server variables should we be looking at, and what thresholds are significant, for the following problem scenarios:
CPU bound
Disk read bound
Disk write bound
And for each scenario, what solutions are recommended to improve them, short of getting better hardware or scaling the database to multiple servers?
This is a complicated area. The "thresholds" that will affect each of your three categories overlap quite a bit.
If you are having problems with your operations being CPU bound, then you definitely need to look at:
(a) The structure of your database - is it fully normalized? Bad DB structure leads to complex queries which hit the processor.
(b) Your indexes - is everything needed by your queries sufficiently indexed? A lack of indexes can hit both the processor and the memory VERY hard. To check indexes, run "EXPLAIN ...your query". Any row in the resulting explanation that says it isn't using an index needs a close look; if possible, add an index (see the sketch after this list).
(c) Use prepared statements wherever possible. These can save the CPU from doing quite a bit of crunching.
(d) Use a better compiler with optimizations appropriate for your CPU. This is one for the dedicated types, but it can gain you the odd extra percent here and there.
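A minimal sketch of the index check in (b), using hypothetical table and column names:

    -- 'orders' and 'customer_id' are hypothetical names for illustration.
    EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
    -- If the output shows type = ALL and key = NULL (a full table scan),
    -- an index on the filtered column is usually the fix:
    ALTER TABLE orders ADD INDEX idx_customer (customer_id);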
If you are having problems with your operations being read bound:
(a) Ensure that you are caching where possible. Check the configuration variables query_cache_limit and query_cache_size. This isn't a magic fix, but raising these can help (see the sketch after this list).
(b) As above, check your indexes. Good indexes reduce the amount of data that needs to be read.
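A hedged sketch of (a); the sizes are illustrative starting points to measure against rather than recommendations, and note that the query cache was removed entirely in MySQL 8.0:

    -- Illustrative sizes only; the query cache must already be enabled (query_cache_type)
    -- and does not exist at all in MySQL 8.0+.
    SET GLOBAL query_cache_size  = 67108864;   -- 64 MB
    SET GLOBAL query_cache_limit = 1048576;    -- 1 MB per cached result
    SHOW GLOBAL STATUS LIKE 'Qcache%';         -- hit and prune counters show whether it's helping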
If you are having problems with your operations being write bound:
(a) See whether you need all the indexes you currently have. Indexes are good, but the trade-off for improved query time is that maintaining those indexes can impact the time spent writing the data and keeping them up to date. Normally you want indexes if in doubt, but sometimes you're more interested in rapidly writing to a table than in reading from it.
(b) Make use of INSERT DELAYED where possible to "queue" writes to the database. Note, this is not a magic fix and is often inappropriate, but in the right circumstances it can help (see the sketch after this list).
(c) Check for tables that are heavily read from and written to at the same time, e.g. an access list that updates visitors' session data constantly and is read from just as much. It's easy to optimize a table for reading from, or for writing to, but not really possible to design a table to be good at both. If you have such a case and it's a bottleneck, consider whether it's possible to split its functions, or to move any complex operations using that table to a temporary table that you can update as a block periodically.
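A sketch of (b) with a hypothetical logging table; note that INSERT DELAYED only ever applied to MyISAM, MEMORY and ARCHIVE tables, was deprecated in MySQL 5.6 and is ignored (treated as a plain INSERT) from 5.7 on:

    -- 'access_log' and its columns are hypothetical; only meaningful on older MySQL with MyISAM.
    INSERT DELAYED INTO access_log (uid, path, created)
    VALUES (42, '/node/1', NOW());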
Note, the only stuff in the above that has a major effect is good query design / indexing. Beyond that, you want to start considering better hardware. In particular, you can get a lot of benefit out of a RAID-0 array, which doesn't do a lot for write-bound problems but can do wonders for read-bound problems. And it can be a pretty cheap solution for a big boost.
You also missed two items off your list.
Memory bound. If you are hitting memory problems then you must check that everything that can be usefully indexed is indexed. You can also look at greater connection pooling if for some reason you're using a lot of discrete connections to your DB.
Network bound. If you are hitting network bound problems... well you probably aren't, but if you are, you need another network card or a better network.
Note that a convenient way to analyze your DB performance is to turn on the log_slow_queries option and set long_query_time to either 0 to log everything, or to 0.3 or similar to catch anything that might be holding your database up. You can also turn on log_queries_not_using_indexes to see if anything interesting shows up. Note, this sort of logging can kill a busy live server. Try it on a development box to start.
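For example, on MySQL 5.1+ the logging described above can be switched on at runtime (older versions use the log_slow_queries option in my.cnf instead); the file path here is an assumption:

    SET GLOBAL slow_query_log      = 1;
    SET GLOBAL slow_query_log_file = '/var/log/mysql/slow.log';  -- path is an assumption
    SET GLOBAL long_query_time     = 0.3;                        -- seconds; 0 logs everything
    SET GLOBAL log_queries_not_using_indexes = 1;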
Hope that's of some help. I'd be interested in anyone's comments on the above.

Using a MySQL database is slow

We have a dedicated MySQL server, with about 2000 small databases on it. (It's a Drupal multi-site install - each database is one site).
When you load each site for the first time in a while, it can take up to 30s to return the first page. After that, the pages return at an acceptable speed. I've traced this through the stack to MySQL. Also, when you connect with the command line mysql client, connection is fast, then "use dbname" is slow, and then queries are fast.
My hunch is that this is due to the server not being configured correctly, and the unused dbs falling out of a cache, or something like that, but I'm not sure which cache or setting applies in this case.
One thing I have tried is innodb_buffer_pool_size. This was set to the default 8MB. I tried raising it to 512MB (the machine has ~2GB of RAM, and the additional RAM was available) as the reading I did indicated that more should give better performance, but this made the system run slower, so it's back at 8MB now.
Thanks for reading.
With 2000 databases you should adjust the table cache setting. You certainly have a lot of cache misses in this cache.
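A sketch of the kind of adjustment meant here; the value is an illustrative starting point, and on MySQL 5.0 the variable is still called table_cache:

    SET GLOBAL table_open_cache = 4000;        -- illustrative; named table_cache before MySQL 5.1.3
    SHOW GLOBAL STATUS LIKE 'Opened_tables';   -- a fast-growing counter means the cache is too small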
Try using MySQLTuner and/or tuning-primer.sh to get more information on potential issues with your settings.
Now, Drupal does database-intensive work; check your Drupal installations, as you may be generating a lot (too many) of requests.
About innodb_buffer_pool_size: you certainly have a lot of buffer pool cache misses with such a small buffer (8MB). The ideal size is when all your data and indexes fit in this buffer, and with 2000 databases... well, 8MB is certainly far too small, but it will be hard for you to grow it. Tuning a MySQL server is hard; if MySQL takes too much RAM, your Apache won't get enough RAM.
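One quick way to confirm the buffer pool is undersized is to compare how often InnoDB had to read pages from disk versus serving them from memory:

    SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';
    -- A large Innodb_buffer_pool_reads (disk reads) relative to
    -- Innodb_buffer_pool_read_requests (logical reads) means the 8MB pool is far too small.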
Solutions are:
check that you do not make the connection using DNS names but with an IP address
(just in case)
buy more RAM
set MySQL on a separate server
adjust your settings
For Drupal, try to store sessions not in the database but in memcache (you'll need RAM for that, but it will be better for MySQL); modules for that are available. If you have Drupal 7 you can even try to put some of the cache tables in memcache instead of MySQL (do not do that with big cache tables).
Edit: one last thing, I hope you have not modified Drupal to use persistent database connections; some modules allow that (and old Drupal 5 tries to do it automatically). With 2000 databases you would kill your server. Check the MySQL error log for "too many connections" errors.
Hello Rupertj, as I read it, you are using InnoDB tables, right?
InnoDB tables are a bit slower than MyISAM tables, but I don't think that is the major problem. As you said, you are using Drupal; is that a kind of multi-site setup, like a WordPress network?
If so, sorry to say, but with this kind of system, each time you install a plugin or something else it grows your database in tables and of course in data, and it can become very, very slow. I have experienced this myself, not with Drupal but with a WordPress blog, and it was a nightmare for me and my friends.
Since then, I have abandoned the project... and my only advice to you is: don't install a lot of plugins in your Drupal system.
I hope this advice helps you, because it helped me a lot with WordPress.
This sounds like a caching issue in Drupal, not MySQL. It seems there are a few very heavy queries, or many, many small ones, or both, that hammer the database server. Once that is done, Drupal caches the result in several caching layers, after which only one (or very few) queries are needed to build up a page. Slow in the beginning, fast after that.
You will have to profile it to determine what the cause is, but the table cache seems like a likely suspect.
However, you should also be mindful of persistent connections, which should absolutely, definitely, always be turned off (yes, for everyone, not just you). Apache/PHP persistent connections are a pessimisation that you and everyone else can generally do without.

Best storage engine for constantly changing data

I currently have an application that is using 130 MySQL tables, all with the MyISAM storage engine. Every table has multiple queries every second, including select/insert/update/delete queries, so the data and the indexes are constantly changing.
The problem I am facing is that the hard drive is unable to cope, with waiting times up to 6+ seconds for I/O access with so many read/writes being done by MySQL.
I was thinking of changing to just 1 table and making it memory based. I've never used a memory table for something with so many queries though, so I am wondering if anyone can give me any feedback on whether it would be the right thing to do?
One possibility is that there may be other issues causing performance problems - 6 seconds seems excessive for CRUD operations, even on a complex database. Bear in mind that (back in the day) ArsDigita could handle 30 hits per second on a two-way Sun Ultra 2 (IIRC) with fairly modest disk configuration. A modern low-mid range server with a sensible disk layout and appropriate tuning should be able to cope with quite a substantial workload.
Are you missing an index? - check the query plans of the slow queries for table scans where they shouldn't be.
What is the disk layout on the server? - do you need to upgrade your hardware or fix some disk configuration issues (e.g. not enough disks, logs on the same volume as data).
As the other poster suggests, you might want to use InnoDB on the heavily written tables (see the sketch at the end of this answer).
Check the setup for memory usage on the database server. You may want to configure more cache.
Edit: Database logs should live on quiet disks of their own. They use a sequential access pattern with many small sequential writes. Where they share disks with a random access work load like data files the random disk access creates a big system performance bottleneck on the logs. Note that this is write traffic that needs to be completed (i.e. written to physical disk), so caching does not help with this.
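If you do try InnoDB for the busiest tables, as suggested above, the conversion is a single statement per table (the table name here is hypothetical; the rebuild copies the data and can take a while on large tables):

    -- 'hit_log' is a hypothetical, heavily written table used for illustration.
    ALTER TABLE hit_log ENGINE=InnoDB;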
I've now changed to a MEMORY table and everything is much better. In fact I now have extra spare resources on the server allowing for further expansion of operations.
Is there a specific reason you aren't using InnoDB? It may yield better performance due to caching and a different concurrency model. It will likely require more tuning, but may yield much better results.
should-you-move-from-myisam-to-innodb
I think that your database structure is very wrong and needs to be optimised; it has nothing to do with the storage engine.