There are 10 partitioned InnoDB tables. MySQL is configured with the option innodb-file-per-table=1 (one InnoDB file per table/partition, for various reasons). Each table is about 40GB. They contain statistics data.
During normal operation, the system can handle the load. The accumulated data is processed every N minutes. However, if for some reason no processing has run for more than 30 minutes (e.g. during system maintenance, which is rare, but about once a year changes have to be made), lock timeouts begin to occur.
I will not explain how we arrived at this architecture, but it is the best solution we found, and it was a long road.
Each time, making changes requires more and more time. Today, for example, a simple ALTER TABLE took 2:45 hours. This is unacceptable.
So, as I said, processing the accumulated data requires a lot of resources, and SELECT statements begin to return lock timeout errors. Of course, the big tables themselves are not involved in those queries; the work is done on the results of queries against them. The total size of these 10 tables is about 400GB, and there are a few dozen small tables whose combined size is comparable to (or perhaps still less than) the size of one big table. The problems occur with the small tables.
My question is: how can I solve the problem with the lock timeout errors? The server is not bad: 8-core Xeon, 64GB RAM. And this is only the database server. Of course, the entire system is not located on the same machine.
There is only one situation in which I get these errors: during the data transform process from the big tables to the small ones.
Any ideas?
I'm running MariaDB 10.2.31 on Ubuntu 18.04.4 LTS.
On a regular basis I encounter the following conundrum - especially when starting out in the morning, that is when my DEV environment has been idle for the night - but also during the day from time to time.
I have a table (this applies to other tables as well) with approx. 15,000 rows and (amongst others) an index on a VARCHAR column containing on average 5 to 10 characters.
Notably, most columns including this one are GENERATED ALWAYS AS (JSON_EXTRACT(....)) STORED since 99% of my data comes from a REST API as JSON-encoded strings (and conveniently I simply store those in one column and extract everything else).
When running a query on that column with WHERE colname LIKE 'text%', I see query durations of e.g. 0.006 seconds. Nice. When I have my query EXPLAINed, I can see that the index is being used.
However, as I have mentioned, when I start out in the morning, this takes way longer (14 seconds this morning). I know about the query cache and I tried this with query cache turned off (both via SET GLOBAL query_cache_type=OFF and RESET QUERY CACHE). In this case I get consistent times of approx. 0.3 seconds - as expected.
So, what would you recommend I should look into? Is my DB sleeping? Is there such a thing?
There are three things that could be going on:
1) Cold caches (overnight backup, mysqld restart, or large processing job results in this particular index and table data being evicted from memory).
2) Statistics on the table go stale and the query planner gets confused until you run some queries against the table and the statistics get refreshed. You can force an update using ANALYZE TABLE table_name.
3) Query planner heisenbug. Very common in MySQL 5.7 and later, never seen it before on MariaDB so this is rather unlikely.
You can get to the bottom of this by enabling the following in the config:
log_output='FILE'
slow_query_log=1
log_slow_verbosity='query_plan,explain'
long_query_time=1
Then review what is in the slow log just after you see a slow occurrence. If the logged explain plan looks the same for both slow and fast cases, you have a cold caches issue. If they are different, you have a table stats issue and you need to cron ANALYZE TABLE at the end of the overnight task that reads/writes a lot to that table. If that doesn't help, as a last resort, hard code an index hint into your query with FORCE INDEX (index_name).
Enable your slow query log with log_slow_verbosity=query_plan,explain and a long_query_time low enough to catch the slow runs. See if occasionally it's using a different (or no) index.
Before you start your next day, look at SHOW GLOBAL STATUS LIKE "innodb_buffer_pool%" and after your query look at the values again. See how many buffer pool reads vs read requests are in this status output to see if all are coming off disk.
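To make that before/after comparison concrete, here is a minimal Python sketch. It assumes you have copied the two SHOW GLOBAL STATUS outputs into dictionaries by hand or via whatever client you use; the function name and the numbers are made up for illustration. Innodb_buffer_pool_reads counts reads that had to go to disk, and Innodb_buffer_pool_read_requests counts logical read requests.

```python
def buffer_pool_miss_ratio(before, after):
    """Fraction of logical read requests between two snapshots that hit disk.

    `before` and `after` are dicts built from
    SHOW GLOBAL STATUS LIKE 'innodb_buffer_pool%' output (values may be strings).
    """
    disk_reads = (int(after["Innodb_buffer_pool_reads"])
                  - int(before["Innodb_buffer_pool_reads"]))
    requests = (int(after["Innodb_buffer_pool_read_requests"])
                - int(before["Innodb_buffer_pool_read_requests"]))
    if requests <= 0:
        return 0.0  # nothing was read between the snapshots
    return disk_reads / requests


# Made-up numbers, purely for illustration.
before = {"Innodb_buffer_pool_reads": "1000",
          "Innodb_buffer_pool_read_requests": "500000"}
after = {"Innodb_buffer_pool_reads": "9000",
         "Innodb_buffer_pool_read_requests": "540000"}

ratio = buffer_pool_miss_ratio(before, after)
print(f"{ratio:.1%} of read requests went to disk")
```

A ratio close to zero after the slow morning query would suggest the data was already cached; a high ratio points towards cold caches.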
As #Solarflare mentioned, backups and nightly activity might be purging the innodb buffer pool of cached data and reverting back to disk to make it slow again. As part of your nightly activities you could set innodb_buffer_pool_dump_now=1 to save the list of hot pages before the scripted activity and innodb_buffer_pool_load_now=1 to restore it afterwards.
Shout-out and Thank you to everyone giving valuable insight!
From all the tips you guys gave I think I am starting to understand the problem better and beginning to narrow it down:
First thing I found was my default innodb_buffer_pool_size of 134 MB. With the sort and amount of data I'm processing this is ridiculously low - so I was able to increase it.
Very helpful post: https://dba.stackexchange.com/a/27341
And from the docs: https://dev.mysql.com/doc/refman/8.0/en/innodb-buffer-pool-resize.html
Now that I have increased it to close to 2GB and am able to monitor its usage and RAM usage in general (cli: cat /proc/meminfo) I realize that my 4GB RAM is in fact on the low side of things. I am nowhere near seeing any unused overhead (buffer usage still at 99% and free RAM around 100MB).
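For what it's worth, the /proc/meminfo check can be scripted instead of eyeballed. This is only a rough Python sketch: the function name is my own and the sample text contains made-up values, not my actual readings. On a real Linux box you would pass in the contents of /proc/meminfo.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


# Made-up sample in the /proc/meminfo format, for illustration only.
SAMPLE = """MemTotal:        4039512 kB
MemFree:          102400 kB
MemAvailable:     310000 kB
"""

info = parse_meminfo(SAMPLE)
free_mb = info["MemFree"] / 1024
print(f"free: {free_mb:.0f} MB of {info['MemTotal'] / 1024:.0f} MB")
```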
I will start to optimize RAM usage of my daemon next and see where this leads - but this will not free enough RAM altogether.
#danblack mentioned innodb_buffer_pool_dump_now and innodb_buffer_pool_load_now. This is an interesting approach to maybe use whenever the daemon accesses the DB as I would love to separate my daemon's buffer usage from the front end's (apparently this is not possible!). I will look into this further but as my daemon is running all the time (not only at night) this might not be feasible.
#Gordan Bobic mentioned "refreshing" DB tables by using ANALYZE TABLE tableName. I found this to be quite fast and incorporated it into the daemon after each time it does an extensive read/write. This increases daemon run times by a few seconds but this is no issue at all. And I figure I can't go wrong with it :)
So, in the end I believe my issue to be a combination of things: Too small buffer size, too small RAM, too many read/write operations for that environment (evicting buffered indexes etc.).
Also I will have to learn more about memory allocation etc and optimize this better (large-pages=1 etc).
I have a server that receives data from thousands of locations all over the world. Periodically, that server connects to my DB server and inserts records with multi-insert, some 11,000 rows at a time per multi, and up to 6 insert statements. When this happens, all 6 processes lock the table being inserted into.
What I am trying to figure out is what causes the locking? Am I better off limiting my multi-insert to, say 100 rows at a time and doing them end to end? What do I use for guidelines?
The DB server has 100GB RAM and 12 processors. It is very lightly used, but when these inserts come in, everyone freezes up for a couple of minutes, which disrupts people running reports, etc.
Thanks for any advice. I know I need to stagger the inserts, I am just asking what is a recommended way to do this.
UPDATE: I was incorrect. I spoke to the programmer and he said that there is a perl program running that sends single inserts to the server, as rapidly as it can. NOT a multi-insert. There are (currently) 6 of these perl processes running simultaneously. One of them is doing 91000 inserts, one at a time. Perhaps since we have a lot of RAM, a multi-insert would be better?
Your question lacks a bunch of details about how the system is structured. In addition, if you have a database running on a server with 100 Gbytes of RAM, you should have access to a professional DBA, and not rely on internet forums.
But, as lad2025 suggests in a comment, staging tables can probably solve your problem. Your locking is probably caused by indexes, or possibly by triggers. The suggestion would be to load the data into a staging table. Then, leisurely load the data from the staging table into the final table.
One possibility is doing 11,000 inserts, say one per second (that would require about three hours). Although there is more overhead in doing the inserts, each would be its own transaction and the locking times would be very short.
Of course, only inserting 1 record at a time may not be optimal. Perhaps 10 or 100 or even 1000 would suffice. You can manage the inserts using the event scheduler.
And, this assumes that the locking scales according to the volume of the input data. That is an assumption, but I think a reasonable one in the absence of other information.
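As a rough illustration of the batch-size idea, here is a Python sketch that splits rows into fixed-size chunks and builds one parameterized multi-row INSERT per chunk. The table name staging_readings, the column names and the batch size of 1000 are all hypothetical; the %s placeholder style is the one used by the common Python MySQL drivers. Executing each statement in its own short transaction is left to whatever driver you use.

```python
def chunked(rows, size):
    """Yield consecutive slices of `rows`, each with at most `size` items."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def build_insert(table, columns, n_rows):
    """Build a multi-row INSERT with %s placeholders for n_rows rows."""
    one_row = "(" + ", ".join(["%s"] * len(columns)) + ")"
    return "INSERT INTO {} ({}) VALUES {}".format(
        table, ", ".join(columns), ", ".join([one_row] * n_rows))


# Hypothetical data: 11,000 rows, as in the question.
rows = [(i, "loc-%d" % i) for i in range(11000)]
batches = list(chunked(rows, 1000))

for batch in batches:
    sql = build_insert("staging_readings", ["id", "location"], len(batch))
    params = [value for row in batch for value in row]  # flattened for the driver
    # cursor.execute(sql, params); connection.commit()  # one short transaction per batch

print(len(batches), "batches")
```

Tuning then comes down to changing the size passed to chunked and watching how long other sessions are blocked.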
I am running a mysql server on a mac pro, 64GB of ram, 6 cores. Table1 in my schema has 330 million rows. Table2 has 65,000 rows. (I also have several other tables with a combined total of about 1.5 billion rows, but they are not being used in the operation I am attempting, so I don't think they are relevant).
I am trying to do what I would have thought was a relatively simple update statement (see below) to bring some data from Table2 into Table1. However, I am having a terrible time with mysql blowing through my system ram, forcing me into swaps, and eventually freezing up the whole system so that mysql becomes unresponsive and I need to restart my computer. My update statement is as below:
UPDATE Table1, Table2
SET
Table1.Column1 = Table2.Column1,
Table1.Column2 = Table2.Column2,
Table1.Column3 = Table2.Column3,
Table1.Column4 = Table2.Column4
WHERE
(Table1.Column5 = Table2.Column5) AND
(Table1.Column6 = Table2.Column6) AND
(Table1.Column7 = Table2.Column7) AND
(Table1.id between 0 AND 5000000);
Ultimately, I want to perform this update for all 330 million rows in Table1. I decided to break it up into batches of 5 million lines each though because
(a) I was getting problems with exceeding lock size and
(b) I thought it might help with my problems of blowing through ram.
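For reference, the batching I describe could be driven by a small helper like the Python sketch below. It assumes the ids run contiguously from 1 to 330 million, which is a simplification, and the function name is made up. Because BETWEEN is inclusive on both ends, each range starts one past the end of the previous one so that no row is updated twice.

```python
def id_batches(min_id, max_id, batch_size):
    """Yield inclusive, non-overlapping (lo, hi) id ranges covering min_id..max_id."""
    lo = min_id
    while lo <= max_id:
        hi = min(lo + batch_size - 1, max_id)
        yield lo, hi
        lo = hi + 1


# Assumed id range: 1 .. 330,000,000 in batches of 5,000,000.
ranges = list(id_batches(1, 330_000_000, 5_000_000))

for lo, hi in ranges[:2]:
    # Each pair would be substituted into the last condition of the UPDATE.
    print(f"(Table1.id BETWEEN {lo} AND {hi})")
```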
Here are some more relevant details about the situation:
I have created indexes for both Table1 and Table2 over the combination of Column5, Column6, Column7 (the columns whose values I am matching on).
Table1 has 50 columns and is about 60 GB total.
Table2 has 8 columns and is 3.5 MB total.
I know that some people might recommend foreign keys in this situation, rather than updating table1 with info from table2, but (a) I have plenty of disk space and don't really care about using it to maximum efficiency (b) none of the values in any of these tables will change over time and (c) I am most concerned about speed of queries run on table1, and if it takes this long to get info from table2 to table1, I certainly don't want to need to repeat the process for every query I run on table1.
In response to the problem of exceeding maximum lock table size, I have experimented with increasing innodb_buffer_pool_size. I've tried a number of values. Even at something as low as 8 GB (i.e. 1/8th of my computer's ram, and I'm running almost nothing else on it while doing this), I am still having this problem of the mysqld process using up basically all of the ram available on the system and then starting to pull ram allocation from the operating system (i.e. my kernel_task starts showing up as using 30GB of ram, whereas it usually uses around 2GB).
The problem with the maximum locks seems to have been largely resolved; I no longer get this error, though maybe that's just because now I blow through my memory and crash before I can get there.
I've experimented with smaller batch sizes (1 million rows, 100,000 rows). These seem to work maybe a bit better than the 5 million row batches, but they still generally have the same problems, maybe only a bit slower to develop. And, performance seems terrible - for instance, at the rate I was going on the 100,000 batch sizes, it would have taken about 7 days to perform this update.
The tables both use InnoDB
I generally run SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; although I don't know whether it actually helps (I am the only user accessing this DB in any way, so I don't really care about locking and would do away with it entirely if I could).
I notice a lot of variability in the time it takes batches to run. For instance, on the 1 million row batches, I would observe times anywhere between 45 seconds and 20 minutes.
When I tried running something that just found the matching rows and then put only two column values for those into a new table, I got much more consistent times (about 2.5 minutes per million rows). Thus, it seems that my problems might somehow stem from the fact that I'm updating values in the same table that I am matching on, even though the columns that I'm updating are different from those I am matching on.
The columns that I am matching on and updating just contain INT and CHAR types, none with more than 7 characters max.
I ran a CHECK TABLE diagnostic and it came back ok.
Overall, I am tremendously perplexed why this would be so difficult. I am new to mysql and databases in general. Since Table2 is so small, I could accomplish this same task, much faster I believe, in python using a dictionary lookup. I would have thought though that databases would be able to handle this better, since handling and updating big datasets is what they are designed for.
I ran some diagnostics on the queries using Mysql workbench and confirmed that there are NOT full table scans being performed.
It really seems something must be going wrong here though. If the system has 64 GB of ram, and that is more than the entire size of the two tables combined (though counting index size it is a bit more than 64 GB for the two tables), and if the operation is only being applied on 5 million out of 330 million rows at a time, it just doesn't make sense that it should blow out the ram.
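To show what I mean by the dictionary lookup mentioned above, here is a minimal Python sketch. Rows are represented as plain dicts for illustration; how they would be read from and written back to the database is left out, the function names are made up, and I am assuming each (Column5, Column6, Column7) combination appears at most once in Table2.

```python
def build_lookup(table2_rows):
    """Map (Column5, Column6, Column7) -> (Column1, Column2, Column3, Column4)."""
    return {
        (r["Column5"], r["Column6"], r["Column7"]):
            (r["Column1"], r["Column2"], r["Column3"], r["Column4"])
        for r in table2_rows
    }


def apply_lookup(table1_rows, lookup):
    """Copy Column1..Column4 into every Table1 row with a matching key.

    Modifies the rows in place and returns the number of rows updated.
    """
    updated = 0
    for row in table1_rows:
        values = lookup.get((row["Column5"], row["Column6"], row["Column7"]))
        if values is None:
            continue  # no match in Table2, leave the row untouched
        row["Column1"], row["Column2"], row["Column3"], row["Column4"] = values
        updated += 1
    return updated


# Tiny made-up rows, purely for illustration.
table2 = [{"Column1": 1, "Column2": 2, "Column3": 3, "Column4": 4,
           "Column5": "a", "Column6": "b", "Column7": 7}]
table1 = [
    {"id": 1, "Column1": None, "Column2": None, "Column3": None, "Column4": None,
     "Column5": "a", "Column6": "b", "Column7": 7},
    {"id": 2, "Column1": None, "Column2": None, "Column3": None, "Column4": None,
     "Column5": "x", "Column6": "y", "Column7": 0},
]

updated = apply_lookup(table1, build_lookup(table2))
print(updated, "row(s) updated")
```

With only 65,000 rows in Table2, the dictionary itself is tiny; the cost is dominated by streaming the Table1 rows through it.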
Therefore, I am wondering:
Is the syntax of how I am writing this update statement somehow horribly bad and inefficient such that it would explain the horrible performance and problems?
Are there some kind of parameters beyond the innodb_buffer_pool_size that I should be configuring, either to put a firmer cap on the ram mysql uses or to get it to more effectively use resources?
Are there other sorts of diagnostics I should be running to try to detect problems with my tables, schema, etc.?
What is a "reasonable" amount of time to expect an update like this to take?
So, after consulting with several people knowledgeable about such matters, here are the solutions I came up with:
I brought my innodb_buffer_pool_size down to 4GB, i.e. 1/16th of my total system memory. This finally seemed to be enough to reliably stop MySQL from blowing through my 64GB of RAM.
I simplified my indexes so that they only contained exactly the columns I needed, and made sure that all indexes I was using were small enough to fit into RAM (with plenty of room to spare for other uses of RAM by MySQL as well).
I learned to accept that MySQL just doesn't seem to be built for particularly large data sets (or, at least not on a single machine, even if a relatively big machine like what I have). Thus, I accepted that manually breaking up my jobs into batches would often be necessary, since apparently the machinery of MySQL doesn't have what it takes to make the right decisions about how to break a job up on its own, in order to be conscientious about system resources like RAM.
Sometimes, when doing jobs along the lines of this, or in general, on my moderately large datasets, I'll use MySQL to do my updates and joins. Other times, I'll just break the data up into chunks and then do the joining or other such operations in another program, such as R (generally using a package like data.table that handles largish data relatively efficiently).
I was also advised that alternatively, I could use something like Pig or Hive on a Hadoop cluster, which should be able to better handle data of this size.
The Situation:
I use a (php) cronjob to keep my database up-to-date. the affected table contains about 40,000 records. basically, the cronjob deletes all entries and inserts them afterwards (with different values ofc). I have to do it this way, because they really ALL change, because they are all interrelated.
The Problem:
Actually, everything works fine. The cronjob is doin' his job within 1.5 to 2 seconds (again, for about 40k inserts - i think this is adequate). MOSTLY.. But sometimes, the query takes up to 60, 90 or even 120 seconds!
I indexed my database. And I think the query works well, given that it only needs 2 seconds most of the time. I close the connection via mysql_close();
Do you have any ideas? If you need more information please tell me.
Thanks in advance.
Edit: Well, it seems like there was no problem with the inserts. it was a complex SELECT query, that made some trouble. Tho, thanks to everyone who answered!
[Sorry, apparently I haven't mastered the formatting yet]
From what I read, I conclude that your cronjob is using bulk-insert statements. If you know when the cronjob runs, I suggest you start a Database Engine Tuning Advisor session and see what other processes are running while the cronjob does its thing. A bulk insert has some restrictions on the number of fields and the number of rows at once. You could read the relevant section of this MSDN page: http://technet.microsoft.com/en-us/library/ms188365.aspx
Performance Considerations
If the number of pages to be flushed in a single batch exceeds an internal threshold, a full scan of the buffer pool might occur to identify which pages to flush when the batch commits. This full scan can hurt bulk-import performance. A likely case of exceeding the internal threshold occurs when a large buffer pool is combined with a slow I/O subsystem. To avoid buffer overflows on large machines, either do not use the TABLOCK hint (which will remove the bulk optimizations) or use a smaller batch size (which preserves the bulk optimizations). Because computers vary, we recommend that you test various batch sizes with your data load to find out what works best for you.
I'm trying to understand an issue I am having with a MySQL 5.5 server.
This server hosts a number of databases. Each day at a certain time a process runs a series of inserts into TWO tables within this database. This process lasts from 5 to 15 minutes depending on the amount of rows being inserted.
This process runs perfectly. But it has a very unexpected side effect. All other inserts and updates running on tables unrelated to the two being inserted to just sit and wait until the process has stopped. Reads and writes outside of this database work just fine, and SELECT statements are fine too.
So how is it possible for a single table to block the rest of a database but not the entire server (due to loading)?
A bit of background:-
Tables being inserted to are MyISAM with 10 - 20 million rows.
MySQL is Percona V5.5 and is serving one slave, both running on Debian.
No explicit locking is called for by the process inserting the records.
None of the INSERT statements select data from any other table. They are also INSERT IGNORE statements.
ADDITIONAL INFO:
While this is happening there are no LOCK table entries in PROCESS LIST, and the process inserting the records that causes this problem does NOT issue any table locks.
I've already investigated the usual causes of table locking and I think I've ruled them out. This behaviour is either something to do with how MySQL works, a quirk of having large database files or possibly even something to do with the OS/File System.
After a few weeks of trying things I eventually found this: Yoshinori Matsunobu Blog - MyISAM and Disk IO Scheduler
Yoshinori demonstrates that changing the scheduler queue to 100000 (from the default 128) dramatically improves the throughput of MyISAM on most schedulers.
After making this change to my system there were no longer any dramatic instances of the database hanging on MyISAM tables while this process was running. There was a slight slowdown, as is to be expected with the volume of data, but the system remained stable.
Anyone experiencing performance issues with MyISAM should read Yoshinori's blog entry and consider this fix.