Kudu's MaintenanceMgr thread is very high on disk IO, read and write almost identical - kudu

This problem has been affected by online use, but when the program reads data in batches, the scanner is prone to timeout or "java.io.IOException: Couldn't get scan data" exception.
Can someone help answer and help optimize this question, thank you.
Because I have 12 storage disks, I have previously set up 12 maintenance manager threads, but found that the disk IO is too high, causing the disk to be fully loaded. It was later changed to 4 threads, which can alleviate it, but it still takes up a lot.The average read and write of each disk is above 70m/s.
# This is my tserver configuration
--fs_wal_dir=/data1/kudu/tserver
--fs_data_dirs=/data2/kudu/tserver,/data3/kudu/tserver,/data4/kudu/tserver,/data5/kudu/tserver,/data6/kudu/tserver,/data7/kudu/tserver,/data8/kudu/tserver,/data9/kudu/tserver,/data10/
kudu/tserver,/data11/kudu/tserver,/data12/kudu/tserver,/data13/kudu/tserver
--rpc_service_queue_length=30000
--scanner_ttl_ms=600000
--scanner_batch_size_rows=10000
--scanner_default_batch_size_bytes=10485760
--scanner_max_batch_size_bytes=83886080
--rpc_num_service_threads=32
--consensus_rpc_timeout_ms=600000
--consensus_max_batch_size_bytes=10485760
--maintenance_manager_num_threads=4
--block_cache_capacity_mb=5120
disk io picture

Related

MySQL queries very slow - occasionally

I'm running MariaDB 10.2.31 on Ubuntu 18.4.4 LTS.
On a regular basis I encounter the following conundrum - especially when starting out in the morning, that is when my DEV environment has been idle for the night - but also during the day from time to time.
I have a table (this applies to other tables as well) with approx. 15.000 rows and (amongst others) an index on a VARCHAR column containing on average 5 to 10 characters.
Notably, most columns including this one are GENERATED ALWAYS AS (JSON_EXTRACT(....)) STORED since 99% of my data comes from a REST API as JSON-encoded strings (and conveniently I simply store those in one column and extract everything else).
When running a query on that column WHERE colname LIKE 'text%' I find query-result durations of i.e. 0.006 seconds. Nice. When I have my query EXPLAINed, I can see that the index is being used.
However, as I have mentioned, when I start out in the morning, this takes way longer (14 seconds this morning). I know about the query cache and I tried this with query cache turned off (both via SET GLOBAL query_cache_type=OFF and RESET QUERY CACHE). In this case I get consistent times of approx. 0.3 seconds - as expected.
So, what would you recommend I should look into? Is my DB sleeping? Is there such a thing?
There are two things that could be going on:
1) Cold caches (overnight backup, mysqld restart, or large processing job results in this particular index and table data being evicted from memory).
2) Statistics on the table go stale and the query planner gets confused until you run some queries against the table and the statistics get refreshed. You can force an update using ANALYZE TABLE table_name.
3) Query planner heisenbug. Very common in MySQL 5.7 and later, never seen it before on MariaDB so this is rather unlikely.
You can get to the bottom of this by enablign the following in the config:
log_output='FILE'
log_slow_queries=1
log_slow_verbosity='query_plan,explain'
long_query_time=1
Then review what is in the slow log just after you see a slow occurrence. If the logged explain plan looks the same for both slow and fast cases, you have a cold caches issue. If they are different, you have a table stats issue and you need to cron ANALYZE TABLE at the end of the over night task that reads/writes a lot to that table. If that doesn't help, as a last resort, hard code an index hint into your query with FORCE INDEX (index_name).
Enable your slow query log with log_slow_verbosity=query_plan,explain and the long_query_time sufficient to catch the results. See if occasionally its using a different (or no) index.
Before you start your next day, look at SHOW GLOBAL STATUS LIKE "innodb_buffer_pool%" and after your query look at the values again. See how many buffer pool reads vs read requests are in this status output to see if all are coming off disk.
As #Solarflare mentioned, backups and nightly activity might be purging the innodb buffer pool of cached data and reverting bad to disk to make it slow again. As part of your nightly activites you could set innodb_buffer_pool_dump_now=1 to save the pages being hot before scripted activity and innodb_buffer_pool_load_now=1 to restore it.
Shout-out and Thank you to everyone giving valuable insight!
From all the tips you guys gave I think I am starting to understand the problem better and beginning to narrow it down:
First thing I found was my default innodb_buffer_pool_size of 134 MB. With the sort and amount of data I'm processing this is ridiculously low - so I was able to increase it.
Very helpful post: https://dba.stackexchange.com/a/27341
And from the docs: https://dev.mysql.com/doc/refman/8.0/en/innodb-buffer-pool-resize.html
Now that I have increased it to close to 2GB and am able to monitor its usage and RAM usage in general (cli: cat /proc/meminfo) I realize that my 4GB RAM is in fact on the low side of things. I am nowhere near seeing any unused overhead (buffer usage still at 99% and free RAM around 100MB).
I will start to optimize RAM usage of my daemon next and see where this leads - but this will not free enough RAM altogether.
#danblack mentioned innodb_buffer_pool_dump_now and innodb_buffer_pool_load_now. This is an interesting approach to maybe use whenever the daemon accesses the DB as I would love to separate my daemon's buffer usage from the front end's (apparently this is not possible!). I will look into this further but as my daemon is running all the time (not only at night) this might not be feasible.
#Gordan Bobic mentioned "refreshing" DBtables by using ANALYZE TABLE tableName. I found this to be quite fast and incorporated it into the daemon after each time it does an extensive read/write. This increases daemon run times by a few seconds but this is no issue at all. And I figure I can't go wrong with it :)
So, in the end I believe my issue to be a combination of things: Too small buffer size, too small RAM, too many read/write operations for that environment (evicting buffered indexes etc.).
Also I will have to learn more about memory allocation etc and optimize this better (large-pages=1 etc).

Improve performance of mysql LOAD DATA / mysqlimport?

I'm batching CSV 15GB (30mio rows) into a mysql-8 database.
Problem: the task takes about 20min, with approxy throughput of 15-20 MB/s. While the harddrive is capable of transfering files with 150 MB/s.
I have a RAM disk of 20GB, which holds my csv. Import as follows:
mysqlimport --user="root" --password="pass" --local --use-threads=8 mytable /tmp/mydata.csv
This uses LOAD DATA under the hood.
My target table does not have any indexes, but approx 100 columns (I cannot change this).
What is strange: I tried tweaking several config parameters as follows in /etc/mysql/my.cnf, but they did not give any significant improvement:
log_bin=OFF
skip-log-bin
innodb_buffer_pool_size=20G
tmp_table_size=20G
max_heap_table_size=20G
innodb_log_buffer_size=4M
innodb_flush_log_at_trx_commit=2
innodb_doublewrite=0
innodb_autoinc_lock_mode=2
Question: does LOAD DATA / mysqlimport respect those config changes? Or does it bypass? Or did I use the correct configuration file at all?
At least a select on the variables shows they are correctly loaded by the mysql server. For example show variables like 'innodb_doublewrite' shows OFF
Anyways, how could I improve import speed further? Or is my database the bottleneck and there is no way to overcome the 15-20 MB/s threshold?
Update:
Interestingly if I import my csv from harddrive into the ramdisk, performance is almost the same (just a little bit better, but never over 25 MB/s). I also tested the same amount of rows, but only with a few (5) columns. And there I'm getting to about 80 MB/s. So clearly the number of columns is the bottleneck? But why do more columns slow down this process?
MySQL/MariaDB engine have little parallelization when making bulk inserts. It can only use one CPU core per LOAD DATA statement. You may probably monitor CPU utilization during load to see one core is fully utilized and it can provide only so much of output data - thus leaving disk throughput underutilized.
The most recent version of MySQL has new parallel load feature: https://dev.mysql.com/doc/mysql-shell/8.0/en/mysql-shell-utilities-parallel-table.html . It looks promising but probably hasn't received much feedback yet. I'm not sure it would help in your case.
I saw various checklists on the internet that recommended having higher values in the following config params: log_buffer_size, log_file_size, write_io_threads, bulk_insert_buffer_size . But the benefits were not very pronounced when I performed comparison tests (maybe 10-20% faster than just innodb_buffer_pool_size being large enough).
This could be normal. Let's walk through what is being done:
The csv file is being read from a RAM disk, so no IOPs are being used.
Are you using InnoDB? If so, the data is going into the buffer_pool. As blocks are being built there, they are being marked 'dirty' for eventual flushing to disk.
Since the buffer_pool is large, but probably not as large as the table will become, some of the blocks will need to be flushed before it finishes reading all the data.
After all the data is read, and the table is finished, the dirty blocks will gradually be flushed to disk.
If you had non-unique indexes, they would similarly be written in a delayed manner to disk (cf 'Change buffering'). The change_buffer, by default occupies 25% of the buffer_pool.
How large is the resulting table? It may be significantly larger, or even smaller, than the 15GB of the csv file.
How much time did it take to bring the csv file into the ram disk? I proffer that that was wasted time and it should have been read from disk while doing the LOAD DATA; that I/O can be overlapped.
Please SHOW GLOBAL VARIABLES LIKE 'innodb%';; there are several others that may be relevant.
More
These are terrible:
tmp_table_size=20G
max_heap_table_size=20G
If you have a complex query, 20GB could be allocated in RAM, possibly multiple times!. Keep those to under 1% of RAM.
If copying the csv from hard disk to ram disk runs slowly, I would suspect the validity of 150 MB/s.
If you are loading the table once every 6 hours, and it takes 1/3 of an hour to perform, I don't see the urgency of making it faster. OTOH, there may be something worth looking into. If that 20 minutes is downtime due to the table being locked, that can be easily eliminated:
CREATE TABLE t LIKE real_table;
LOAD DATA INFILE INTO t ...; -- not blocking anyone
RENAME TABLE real_table TO old, t TO real_table; -- atomic; fast
DROP TABLE old;

Is search speed achieved with fast data access or fast index access?

From MySQL doc:
CREATE [TEMPORARY] TABLE [IF NOT EXISTS] tbl_name
(create_definition,...)
{DATA|INDEX} DIRECTORY [=] 'absolute path to directory'
My table is for search only and takes 8G of disk space (4G data + 4G index) with 80M rows
I can't use ENGINE = Memory to store the whole table into memory but I can store either the data or the index in a RAM drive through the DIRECTORY table options
From a theorical knoledge, is it better to store the data or the index in RAM?
MySQL's default storage engine is InnoDB. As you run queries against an InnoDB table, the portion of that table or indexes that it reads are copied into the InnoDB Buffer Pool in memory. This is done automatically. So if you query the same table later, chances are it's already in memory.
If you run queries against other tables, it load those into memory too. If the buffer pool is full, it will evicting some data that belongs to your first table. This is not a problem, since it was only a copy of what's on disk.
There's no way to specifically "lock" a table on an index in memory. InnoDB will load either data or index if it needs to. InnoDB is smart enough not to evict data you used a thousand times, just for one other table requested one time.
Over time, this tends to balance out, using memory for your most-frequently queried subset of each table and index.
So if you have system memory available, allocate more of it to your InnoDB Buffer Pool. The more memory the Buffer Pool has, the more able it is to store all the frequently-queried tables and indexes.
Up to the size of your data + indexes, of course. The content copied from the data + indexes is stored only once in memory. So if you have only 8G of data + indexes, there's no need to give the buffer pool more and more memory.
Don't allocate more system memory to the buffer pool than your server can afford. Overallocating memory leads to swapping memory for disk, and that will be bad for performance.
Don't bother with the {DATA|INDEX} DIRECTORY options. Those are for when you need to locate a table on another disk volume, because you're running out of space. It's not likely to help performance. Allocating more system memory to the buffer pool will accomplish that much more reliably.
but I can store either the data or the index in a RAM drive through the DIRECTORY table options...
Short answer: let the database and OS do it.
Using a RAM disk might have made sense 10-20 years ago, but these days the software manages caching disk to RAM for you. The disk itself has its own RAM cache, especially if it's a hybrid drive. The OS will cache file system access in RAM. And then MySQL itself will do its own caching.
And if it's an SSD that's already extremely fast, so a RAM cache is unlikely to show much improvement.
So making your own RAM disk isn't likely to do anything that isn't already happening. What you will do is pull resources away from the OS and MySQL that they could have managed smarter themselves likely slowing everything on that machine down.
What you're describing a micro-optimization. This is attempting to make individual operations faster. They tend to add complexity and degrade the system as a whole. And there are limits to how much optimizing you can do with micro-optimizations. For example, if you have to search 1,000,000 rows, and it takes 1ms per row, that's 1,000,000 ms. If you make it 0.9ms per row then it's 900,000 ms.
What you want to focus on is algorithmic optimization, improvements to the algorithm. These tend to make the code simpler and less complex, though often the data structures need to be more thought out, because you're doing less work. Take those same 1,000,000 rows and add an index. Instead of looking at 1,000,000 rows you'll spend, say, 100 ms to look at the index.
The numbers are made up, but I hope you get the point. If "what you want is speed", algorithmic optimizations will take you where no micro-optimization will.
There's also the performance of the code using the database to consider, it is often the real bottleneck using unoptimized queries, poor patterns for fetching related data, and not taking advantage of caching.
Micro-optimizations, with their complexities and special configurations, tend to make algorithmic optimizations more difficult. So you might be slowing yourself down in the long run by worrying about micro-optimizations now. Furthermore, you're doing this at the very start when you only have fuzzy ideas about how this thing will be used or perform or where the bottlenecks will be.
Spend your time optimizing your data structures and indexes, not minute details of your database storage. Once you've done that, if it still isn't fast enough, then look at tweaking settings.
As a side note, there is one possible benefit to playing with DIRECTORY. You can put the data and index on separate physical drives. Then both can be accessed simultaneously with the full I/O throughput of each drive.
Though you've just made it twice as likely to have a disk failure, and complicated backups. You're probably better off with an SSD and/or RAID.
And consider whether a cloud database might actually out-perform any hardware you might be able to afford.

Neo4j batch insertion with .CSV files taking huge amount of time to sort&index

I'm trying to create a database with data collected from google n-grams. It's actually a lot of data, but after the creation of the CSV files the insertion was pretty fast. The problem is that, immediately after the insertion, the neo4j-import tool indexes the data, and this step its taking too much time. It's been more than an hour and it looks like it achieved 10% of progress.
Nodes
[*>:9.85 MB/s---------------|PROPERTIES(2)====|NODE:198.36 MB--|LABE|v:22.63 MB/s-------------] 25M
Done in 4m 54s 828ms
Prepare node index
[*SORT:295.94 MB-------------------------------------------------------------------------------] 26M
This is the console info atm. Does anyone have a suggestion about what to do to speed up this process?
Thank you. (:
Indexing takes a long time depending on number of nodes. I tried indexing with 10 million nodes and it took around 35 minutes, but you can still try these settings :
Increase your page cache size which is stored in '/var/lib/neo4j/conf/neo4j.properties' file (in my ubuntu system). Edit the following line
dbms.pagecache.memory=4g
according to your RAM, allocate size, here, 4g means 4gb space. Also, you can try changing java memory size which is stored in neo4j-wrapper.conf
wrapper.java.initmemory=1024
wrapper.java.maxmemory=1024
You can also read neo4j documentation on this - http://neo4j.com/docs/stable/configuration-io-examples.html

What is MySQL "Key Efficiency"

MySQL Workbench reports a value called "Key Efficiency" in association with server health. What does this mean and what are its implications?
From MySQL.com, "Key Efficiency" is:
...an indication of the number of key_read_requests that resulted in actual key_reads.
Ok, so what does that mean. What does it tell me about how I'm supposed to tune the server?
"Key Efficiency" is an indication of how much value you are getting from the index caches held within MySQL's memory. If your key efficiency is high, then most often MySQL is performing key lookups from within memory space, which is much faster than having to retrieve the relevant index blocks from disk.
The way to improve key efficiency is to dedicate more of your system memory to MySQL's index caches. How you do this depends on the storage engine you use. For MyISAM, increase the value of key-buffer-size. For InnoDB, increase the value of innodb-buffer-pool-size.
However, as Michael Eakins points out, the operating system also holds caches of disk blocks which it has accessed recently. The more memory that your operating system has available, the more disk blocks it can cache. Further, the disk drives themselves (and disk controllers in some cases), also have caches - which again can speed up retrieving data from disk. The hierarchy is a bit like this:
fastest - retrieving index data from within MySQL's index cache. The cost is a few memory operations.
retrieving index data that is held in the OS file system cache. The cost is a system call (for the read), and some memory operations.
retrieving index data that is held in the disk system cache (controller and drives). The cost is a system call (for the read), communication with the disk device, and some memory operations.
slowest - retrieving index data from the disk surface. The cost is a system call, communication with the device, physical movement of the disk (arm movement + rotation).
In practice, the difference between 1 and 2 is almost unnoticeable unless your system is very busy. Also, it is unlikely (unless your system has less spare RAM than your disk controller) that scenario 3 will come into play.
I have used servers with MyISAM tables with relatively small index caches (512MB), but massive system memory (64GB) and have found it difficult to demonstrate the value of increasing the size of the index cache. I guess it depends on what else is happening on your server. If all you are running is a MySQL data base, it is quite likely that the OS cache will be quite effective. However, if you run other jobs on the same server and these use lots of memory / disk accesses, then these might evict valuable cached index blocks leading to MySQL hitting disk more often.
An interesting exercise (if you have time) is to tinker with your system to make it run slower. Running a standard workload on large tables, reduce the MySQL buffers until the impact becomes noticeable. Flush your file system cache by pumping huge amounts (greater than RAM) of irrelevant data through your file system ( cat large-file > /dev/null ). Watch iostat as your queries run.
"Key Efficiency" is NOT a measure of how good your keys are. Well designed keys will have a much larger impact on performance than high "Key Efficiency". MySQL does not have much to help you there, unfortunately.
Key_read_requests is the number of requests to read a key block from the cache. While
key_reads is the number of physical reads of a key block from disk. So these 2 variables
can increase independently.
(http://bugs.mysql.com/bug.php?id=28384)
Which is still as clear as mud.
On to the next bit of explaination:
A partially valid use of Key_reads
There is a partially valid reason to
examine Key_reads, assuming that we
care about the number of physical
reads that occur, because we know that
disks are very slow relative to other
parts of the computer. And here's
where I return to what I called
"mostly factual" above, because
Key_reads actually aren't physical
disk reads at all. If the requested
block of data isn't in the operating
system's cache, then a Key_read is a
disk read -- but if it is cached, then
it's just a system call. However,
let's make our first hard-to-prove
assumption:
Hard-to-prove assumption #1: A
Key_read might correspond to a
physical disk read, maybe. If we take
that assumption as true, then what
other reason might we have for caring
about Key_reads? This assumption leads
to "a cache miss is significantly
slower than a cache hit," which makes
sense. If it were just as fast to do a
Key_read as a Key_read_request, what
use would the key buffer be anyway?
Let's trust MyISAM's creators on this
one, because they designed a cache hit
to be faster than a miss.
(http://planet.mysql.com/entry/?id=23679)