MySQL - InnoDB buffer pool, disable/workaround

I'm working on a system that includes exporting large amounts of data into CSV files. We are using InnoDB for our tables. InnoDB buffers previous queries/results in some manner.
Now, in a production environment that is a really good thing, but while testing the performance of an export in my dev environment it is not.
The buffer pool size seems to be around 128 MB.
I couldn't find much about this on Google, except that you can change some MySQL settings when the server boots up.
Does anyone know a workaround, or maybe an SQL statement that prevents results from being put into the buffer?

It's a non-problem (since 5.1.41)
It is impossible to prevent InnoDB activity from going through the buffer pool; it is too deeply ingrained in the design.
The buffer pool caches data and index blocks, not queries/results. The query cache deals with queries/results, but the QC should normally be disabled on production systems.
innodb_old_blocks_pct (default = 37, meaning a percentage of the buffer pool) prevents certain operations, such as the reads needed for your 'export', from wiping out the buffer pool.
See http://dev.mysql.com/doc/refman/5.6/en/innodb-parameters.html#sysvar_innodb_old_blocks_pct
and the links in that section.
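As a sketch of that tuning (the values here are illustrative, not recommendations), both innodb_old_blocks_pct and the related innodb_old_blocks_time are dynamic variables, so they can be changed without a restart:

-- Shrink the "old" sublist so a one-off scan can pollute less of the pool
SET GLOBAL innodb_old_blocks_pct = 20;
-- Pages touched by a scan must survive 1000 ms in the old sublist
-- before a re-read can promote them to the "new" (hot) end
SET GLOBAL innodb_old_blocks_time = 1000;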

And what about setting the buffer pool to a very small value (e.g. 1 MB)?
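If you only want a cold-cache benchmark in dev, that can be done in my.cnf. Note this is just a sketch: MySQL enforces a minimum pool size (so a 1 MB request gets rounded up), and before 5.7.5 the variable is not dynamic, so a restart is required.

[mysqld]
# Deliberately tiny buffer pool, for dev-only cold-cache testing
innodb_buffer_pool_size = 8M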

Related

What is the difference between the MySQL query cache and the buffer pool?

I am trying to understand the MySQL architecture and I came across two notions.
The first one is the query cache, which I understand stores queries that were run at least once; if the query processor sees a query cached there, it skips the parser and takes the results directly from the cache.
But then there is also the buffer pool, part of the storage engine's buffer manager, which from my understanding kind of does the same thing.
So my question would be: if there is a cache in the logical layer, why do we need one in the physical layer as well? I am thinking that if a query is found in the query cache it will never be looked up in the buffer pool, and if the query is not found in the cache, then it will never be retrieved from the buffer pool either. Am I missing something?
For the query cache, you got it spot on. It's based on the raw text of the query mapping to the exact query results. It has major scaling problems, which is why MySQL 8.0 removed it.
The InnoDB buffer pool is a store of the low-level data and index pages of the database. It ensures that all recently used data is kept off disk and can be queried without resorting to storage that is much slower than RAM.
So the buffer pool serves all queries on the same data, while the query cache only serves one particular query (at a large scalability cost).
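If you want to watch the two mechanisms separately, servers that still have the query cache (pre-8.0) expose distinct status counters for each:

-- Query cache activity: hits, inserts, invalidations (pre-8.0 only)
SHOW GLOBAL STATUS LIKE 'Qcache%';
-- Buffer pool activity: logical read requests vs. physical disk reads
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';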
Adding some context to danblack's answer: the query cache stores the query text and the actual data associated with its results. The buffer pool (sized by innodb_buffer_pool_size) instead stores the physical, low-level data, i.e. pages. Whenever a query executes, InnoDB checks the buffer pool, and if the required data is not present it goes to disk (your secondary storage) and pulls the pages into the buffer pool.
The query cache has a further disadvantage: invalidation. If the query cache size is set quite high without analyzing the workload, invalidation gets expensive. By "invalidating the query cache" I mean marking entries in the query cache as invalid because the underlying table has been changed by DML statements. I have personally seen, many times (for example in "show processlist"), replication stuck for a long while in exactly this state, i.e. "invalidating query cache"; once it has invalidated all the entries, things start catching up.
"Why do we need one in the physical layer?"
It is because having data in the query cache can seriously hurt performance if the underlying table changes often. So if your table is not changing frequently, the query cache is useful. But the concept of a query cache has been removed in MySQL 8 (which is not part of this discussion).
The buffer pool is only used to store pages coming from secondary storage.
The CPU cannot fetch data directly from secondary storage, so the database management system keeps a pool of pages in RAM, and the CPU accesses data through this buffer pool.
The DBMS uses a replacement algorithm to decide which pages to evict from the buffer pool.
A cache of data is something else; there are other data structures and techniques for data caching.
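A rough way to see how well the buffer pool is absorbing reads is the hit ratio derived from two status counters. This is a sketch for 5.7+, where the counters live in performance_schema; older versions expose the same values via SHOW GLOBAL STATUS instead:

-- Approximate hit ratio: 1 - (physical reads / logical read requests)
SELECT 1 - (SELECT VARIABLE_VALUE
            FROM performance_schema.global_status
            WHERE VARIABLE_NAME = 'Innodb_buffer_pool_reads')
         / (SELECT VARIABLE_VALUE
            FROM performance_schema.global_status
            WHERE VARIABLE_NAME = 'Innodb_buffer_pool_read_requests')
       AS buffer_pool_hit_ratio;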

MySQL: Exclude schema from buffering

How can I exclude a complete schema from buffering or caching?
Queries against this schema should never be buffered in the query cache or the InnoDB buffer pool.
Since you tagged your question innodb, I assume you want to exclude buffering pages for a particular schema in the InnoDB Buffer Pool.
There are no options to control the schema or tables that get stored in the buffer pool. In fact, any page read by a query must be stored in the buffer pool, at least while you're querying it.
InnoDB will automatically load pages into the buffer pool when you query them. InnoDB will also automatically evict pages if the space is needed for some other page by a subsequent query. The pages are managed by an LRU (least recently used) algorithm, which makes it more likely for an infrequently-used page to be evicted.
But InnoDB goes one step further. In the old days, there was a risk that a big table-scan would evict all the pages, even if your table-scan was a once-per-day query (like those run by mysqldump). So InnoDB tries to make the buffer pool scan-resistant by tracking pages that are newcomers to the buffer pool, or those which have "seniority" because they have been read many times. The senior pages are less likely to be evicted by newcomers.
All the above should help to explain why you probably don't need to control which schemas can use the buffer pool. InnoDB makes a good effort to make sure the pages you need are in RAM, and those you don't need aren't.
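If you're curious which tables are actually occupying the pool at a given moment, information_schema.INNODB_BUFFER_PAGE (5.6+) can tell you. Be warned that querying it scans the whole buffer pool, so it is expensive and best kept off busy production servers:

-- Top 10 tables by pages currently resident in the buffer pool
SELECT table_name,
       COUNT(*) AS pages,
       ROUND(COUNT(*) * 16 / 1024, 1) AS approx_mb   -- assumes 16 KB pages
FROM information_schema.INNODB_BUFFER_PAGE
WHERE table_name IS NOT NULL
GROUP BY table_name
ORDER BY pages DESC
LIMIT 10;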
As for disabling the query cache for a specific schema: generally it's not possible; however, you can turn off the query cache for your connection using
SET SESSION query_cache_type = OFF;
It will completely turn off query cache for the current session.
Or you can include SQL_NO_CACHE to your select queries.
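For example (the schema, table, and column names here are placeholders):

-- Bypass the query cache for this one statement only
SELECT SQL_NO_CACHE col1, col2 FROM myschema.mytable WHERE id = 42;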
As for the InnoDB buffer pool - I don't think it's possible as there are no schema specific configuration values for it.

MySQL fetch time issue

I have two different MySQL servers with the same database (a copy), both Ubuntu x64 with 4 GB RAM. Both are virtual machines hosted on the same VMware server.
The first is our old server with MySQL 5.6.33-0ubuntu0.14.04.1-log, and the new one has version 5.7.17-0ubuntu0.16.04.1 installed.
I'm comparing the performance of some SQL scripts and I noticed that the new server has longer fetch times with the exact same SQL. Can you help determine possible causes?
Maybe the 5.7 engine analyzes the SQL in a different and less efficient way?
Maybe some MySQL configuration needs to be tuned differently? I only changed innodb_buffer_pool_size = 2G and innodb_buffer_pool_instances = 2 (same as the old server).
Ideas?
Thanks
I suspect your problem is that your buffer pool is allocated, but not yet full of data. As you run queries, it has to fetch data from disk, which is much slower than RAM. As you run those queries again and again, the data required will already be in the buffer pool, and MySQL will take advantage of that. Data that is already in the buffer pool can be read without touching the disk.
You can check how much is in your buffer pool. Here's an example from my test instance (I put "..." because the output is long, and I'm showing an excerpt).
mysql> SHOW ENGINE INNODB STATUS\G
...
----------------------
BUFFER POOL AND MEMORY
----------------------
...
Buffer pool size 65528
Free buffers 64173
Database pages 1339
...
These numbers are in "pages" of 16KB each. You can see I have 64*1024 pages = 1GB allocated, but nearly all of it is free, i.e. unoccupied by data. Only 2% of my buffer pool pages have data in them. It's likely that if I run queries now, it will have to read from the disk to load data. Unless perhaps I have very little data in my database on disk too, and it only fills 2% of my buffer pool even when it's fully loaded.
Anyway, assuming you have more data than the size of your buffer pool, it will gradually fill the buffer pool as you run queries. Then you'll see the ratio of "Database pages" to "Free buffers" change over time (I don't know why they say both pages and buffers, since they refer to the same thing). Subsequent queries should run faster.
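A lighter-weight way to watch the pool warm up is to poll the page counters instead of parsing SHOW ENGINE INNODB STATUS:

-- Re-run periodically: 'free' should shrink as 'data' grows
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages%';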

Understanding DB-level caching in RAM for PostgreSQL and MySQL

Imagine we have a MySQL DB whose data size is 500 MB.
If I set innodb_buffer_pool_size to 500 MB (or more), is it correct to think that all the data will be cached in RAM, and my queries won't touch disk?
Is effective_cache_size in PostgreSQL the same as MySQL's buffer pool, and can it also help avoid reading from disk?
I believe you are on the right track with regard to MySQL InnoDB tables. But you must remember that when measuring the size of a database, there are two components: data length and index length.
See: MySQL database size.
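Both components can be read from information_schema with a standard query, e.g. per schema:

-- Data + index size per schema, in MB
SELECT table_schema,
       ROUND(SUM(data_length + index_length) / 1024 / 1024, 1) AS size_mb
FROM information_schema.tables
GROUP BY table_schema
ORDER BY size_mb DESC;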
You also have no control over which databases are loaded into memory. If you want to guarantee that particular DBs stay loaded, you must make sure the buffer pool is large enough to hold all of them, with some room to spare just in case.
MySQL status variables can then be used to see how the buffer pool is functioning.
I also highly recommend you use the buffer pool load/save variables so that the buffer pool is saved on shutdown and reloaded on startup of the MySQL server. Those variables are available from version 5.6 and up, I believe.
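As a sketch, the relevant settings look like this in my.cnf (both default to ON from 5.7; the load-at-startup one cannot be changed at runtime):

[mysqld]
# Save the list of cached pages at shutdown, reload them at startup
innodb_buffer_pool_dump_at_shutdown = ON
innodb_buffer_pool_load_at_startup  = ON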
Also, check this out in regards to sizing your buffer pool.
Is "effective_cache_size", a parameter to indicate the planner as to what OS is actually doing ?
http://www.cybertec.at/2013/11/effective_cache_size-better-set-it-right/
And for caching the tables, don't we need to configure "shared_buffers"?
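For reference (my addition, not from the answer above): in PostgreSQL, shared_buffers is the memory actually allocated for caching pages, while effective_cache_size is only a planner hint estimating how much caching (PostgreSQL plus the OS page cache) is likely available; it allocates nothing. Both can be inspected with:

-- PostgreSQL: SHOW reads the settings without changing anything
SHOW shared_buffers;
SHOW effective_cache_size;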
And with regard to MySQL: yes, the innodb_buffer_pool size will cache the data for InnoDB tables and prevent disk reads. Make sure it is configured large enough to hold all the data in memory.

MySQL InnoDB Engine Restart

I have a very large table with around 1M records. Due to bad performance, I have optimized the queries and needed to change the index.
I changed it using ALTER; now I am really not sure how this works in InnoDB. Do I need to restart the MySQL server? If I do, how do I keep data integrity between tables (so that I don't lose data that is in memory and has not yet been written to disk)?
I Googled and found that in the case of a MySQL restart I need to use the global variable innodb_fast_shutdown -- what does it do when I set it, and what if I don't? It is not very clear.
I am new to MySQL area with InnoDB. Any help is really appreciated.
I changed it using ALTER; now I am really not sure how this works in InnoDB?
You are saying you added the index with ALTER TABLE ... ADD INDEX ... (or ADD KEY -- they are two ways of asking for exactly the same thing) presumably?
Once the ALTER TABLE finishes executing and your mysql> prompt returns, there is nothing else needed. At that point, the table has its new index and the index is fully populated.
You're done, and there is no need to restart the server.
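For concreteness, the kind of statement being discussed is simply (the table and column names are placeholders):

-- The index is fully built and populated when this statement returns
ALTER TABLE mytable ADD INDEX idx_mycol (mycol);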
Since you mentioned it, I'll also try to help clear up your misconceptions about innodb_fast_shutdown and the memory/disk divide in InnoDB.
InnoDB makes a one-time request to the operating system for a block of memory the size of innodb_buffer_pool_size when the MySQL server starts up, as in this example from the MySQL error log of one of my test servers:
130829 11:27:30 InnoDB: Initializing buffer pool, size = 4.0G
This is where InnoDB stores table and index data in memory, and the best performance is when this pool is large enough for all of your data and indexes. When rows are read, the pages from the tablespace files are read into the buffer pool first, then data extracted from there. If changes are made, the changes are written to the in-memory copies of table data and indexes in the buffer pool, and eventually they are flushed to disk. Pages in the pool are either "clean" -- meaning they are identical to what's on disk, because they've not been changed since they were loaded, or if changed, the changes have already been written to disk -- or "dirty" meaning they do not match what is on disk.
However, InnoDB is ACID-compliant -- and this could not be true if changes existed only in memory without being persisted immediately somewhere ... and that "somewhere" is the redo log -- on disk -- which immediately records the changes to be made, in a format that makes this operation much faster than updating the actual tablespace files themselves in real time would be.
In turn, the innodb_fast_shutdown variable determines whether MySQL finishes up everything written to the redo log before shutdown -- or after it starts back up. It works fine, either way, but if you need to shut the server down faster, it's faster and perfectly safe to let it pick everything up later, no matter what changes you have made.
Importantly, I don't know what you have read, but in routine operations, you never need to mess with the value of innodb_fast_shutdown unless you are shutting down in preparation for doing an upgrade to your version of MySQL server (and then it is primarily a safety precaution). The data on disk is always consistent with what is in memory, either because the tablespace files are already consistent with the memory representation of the data, or because the pending changes to the tablespace files are safely stored in the redo log, where they will be properly processed when the server comes back online.
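As a sketch, the variable is dynamic, so if you ever do want the slow, fully-purged shutdown before an upgrade, it is just:

-- 0 = full purge and change-buffer merge before shutdown (slowest, cleanest)
-- 1 = default fast shutdown; 2 = crash-like, redo applied at next startup
SET GLOBAL innodb_fast_shutdown = 0;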
In the case of ALTER TABLE, anything pending for the table prior to the ALTER would already have been taken care of, since InnoDB typically rebuilds the entire table in response to this command, so the only possible "pending" changes would be DML that occurred after the ALTER.