I have some code that writes out to NVMe SSD like this:
import pyarrow as pa

writer = pa.ipc.new_file(pa.OSFile(target_filepath, 'wb'), arrow_table.schema)
for batch in batches:
    writer.write(batch)
writer.close()
The target is an NVMe SSD. I would like some general guidelines for improving write performance. I'd suppose that larger batches are generally more efficient? What batch size (in MB) is recommended to max out SSD write throughput? Can you even do that with one process?
Related
I'm batch-importing a 15 GB CSV file (30 million rows) into a MySQL 8 database.
Problem: the task takes about 20 minutes, with an approximate throughput of 15-20 MB/s, while the hard drive is capable of transferring files at 150 MB/s.
I have a RAM disk of 20GB, which holds my csv. Import as follows:
mysqlimport --user="root" --password="pass" --local --use-threads=8 mytable /tmp/mydata.csv
This uses LOAD DATA under the hood.
My target table does not have any indexes, but approx 100 columns (I cannot change this).
What is strange: I tried tweaking several config parameters as follows in /etc/mysql/my.cnf, but they did not give any significant improvement:
log_bin=OFF
skip-log-bin
innodb_buffer_pool_size=20G
tmp_table_size=20G
max_heap_table_size=20G
innodb_log_buffer_size=4M
innodb_flush_log_at_trx_commit=2
innodb_doublewrite=0
innodb_autoinc_lock_mode=2
Question: does LOAD DATA / mysqlimport respect those config changes? Or does it bypass? Or did I use the correct configuration file at all?
At least a select on the variables shows they are correctly loaded by the mysql server. For example show variables like 'innodb_doublewrite' shows OFF
Anyways, how could I improve import speed further? Or is my database the bottleneck and there is no way to overcome the 15-20 MB/s threshold?
Update:
Interestingly if I import my csv from harddrive into the ramdisk, performance is almost the same (just a little bit better, but never over 25 MB/s). I also tested the same amount of rows, but only with a few (5) columns. And there I'm getting to about 80 MB/s. So clearly the number of columns is the bottleneck? But why do more columns slow down this process?
The MySQL/MariaDB engine has little parallelization for bulk inserts: it can use only one CPU core per LOAD DATA statement. If you monitor CPU utilization during the load, you will probably see one core fully utilized; it can produce only so much output, which leaves disk throughput underutilized.
The most recent version of MySQL has a new parallel load feature: https://dev.mysql.com/doc/mysql-shell/8.0/en/mysql-shell-utilities-parallel-table.html . It looks promising but probably hasn't received much feedback yet. I'm not sure it would help in your case.
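Since each LOAD DATA statement is limited to one core, one way to use more cores is to split the file and run several loads at once. The following is a minimal sketch of the splitting step only, not a tested import pipeline. The function name split_csv_by_lines and the chunk file names are made up for illustration, and it assumes one record per line (no quoted newlines inside fields) and no header row.

```python
import os


def split_csv_by_lines(src_path, out_dir, n_chunks):
    """Split src_path into at most n_chunks files, cutting only at line ends.

    Assumption: one record per line and no header row; neither is checked.
    """
    total = os.path.getsize(src_path)
    target = max(1, -(-total // n_chunks))  # ceiling division: bytes per chunk
    paths = []
    out, written = None, 0
    with open(src_path, "rb") as src:
        for line in src:
            # Open a new chunk file once the current one has reached its target size.
            if out is None or written >= target:
                if out is not None:
                    out.close()
                # Hypothetical naming scheme for the chunk files.
                path = os.path.join(out_dir, f"chunk_{len(paths):03d}.csv")
                paths.append(path)
                out, written = open(path, "wb"), 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return paths
```

Each chunk could then be loaded by its own LOAD DATA statement on a separate connection. Whether that beats a single load depends on the server and on contention inside the engine, so it is worth measuring before relying on it.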
I saw various checklists on the internet that recommended having higher values in the following config params: log_buffer_size, log_file_size, write_io_threads, bulk_insert_buffer_size . But the benefits were not very pronounced when I performed comparison tests (maybe 10-20% faster than just innodb_buffer_pool_size being large enough).
This could be normal. Let's walk through what is being done:
The csv file is being read from a RAM disk, so no IOPs are being used.
Are you using InnoDB? If so, the data is going into the buffer_pool. As blocks are being built there, they are being marked 'dirty' for eventual flushing to disk.
Since the buffer_pool is large, but probably not as large as the table will become, some of the blocks will need to be flushed before it finishes reading all the data.
After all the data is read, and the table is finished, the dirty blocks will gradually be flushed to disk.
If you had non-unique indexes, they would similarly be written to disk in a delayed manner (cf. 'Change buffering'). By default, the change buffer may occupy up to 25% of the buffer_pool.
How large is the resulting table? It may be significantly larger, or even smaller, than the 15GB of the csv file.
How much time did it take to bring the csv file into the ram disk? I proffer that that was wasted time and it should have been read from disk while doing the LOAD DATA; that I/O can be overlapped.
Please run SHOW GLOBAL VARIABLES LIKE 'innodb%'; there are several others that may be relevant.
More
These are terrible:
tmp_table_size=20G
max_heap_table_size=20G
If you have a complex query, 20GB could be allocated in RAM, possibly multiple times! Keep those to under 1% of RAM.
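As a rough illustration of the 1% rule, here is a small sketch that turns a RAM size into a ceiling for both settings. The 64 GiB figure is a made-up example, since the question does not state how much RAM the server has, and the helper name temp_table_limit_bytes is hypothetical.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def temp_table_limit_bytes(total_ram_bytes, fraction=0.01):
    """Ceiling for tmp_table_size / max_heap_table_size under the 1% rule."""
    return int(total_ram_bytes * fraction)


# Hypothetical 64 GiB server; substitute the real amount of RAM.
limit = temp_table_limit_bytes(64 * GIB)
print(f"tmp_table_size={limit // MIB}M")
print(f"max_heap_table_size={limit // MIB}M")
```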
If copying the csv from hard disk to ram disk runs slowly, I would suspect the validity of 150 MB/s.
If you are loading the table once every 6 hours, and it takes 1/3 of an hour to perform, I don't see the urgency of making it faster. OTOH, there may be something worth looking into. If that 20 minutes is downtime due to the table being locked, that can be easily eliminated:
CREATE TABLE t LIKE real_table;
LOAD DATA INFILE ... INTO TABLE t ...; -- not blocking anyone
RENAME TABLE real_table TO old, t TO real_table; -- atomic; fast
DROP TABLE old;
From MySQL doc:
CREATE [TEMPORARY] TABLE [IF NOT EXISTS] tbl_name
(create_definition,...)
{DATA|INDEX} DIRECTORY [=] 'absolute path to directory'
My table is for search only and takes 8G of disk space (4G data + 4G index) with 80M rows
I can't use ENGINE = Memory to store the whole table into memory but I can store either the data or the index in a RAM drive through the DIRECTORY table options
From a theoretical standpoint, is it better to store the data or the index in RAM?
MySQL's default storage engine is InnoDB. As you run queries against an InnoDB table, the portion of that table or indexes that it reads are copied into the InnoDB Buffer Pool in memory. This is done automatically. So if you query the same table later, chances are it's already in memory.
If you run queries against other tables, it loads those into memory too. If the buffer pool is full, it will evict some data that belongs to your first table. This is not a problem, since it was only a copy of what's on disk.
There's no way to specifically "lock" a table or an index in memory. InnoDB will load either data or index if it needs to. InnoDB is smart enough not to evict data you used a thousand times just because another table was requested once.
Over time, this tends to balance out, using memory for your most-frequently queried subset of each table and index.
So if you have system memory available, allocate more of it to your InnoDB Buffer Pool. The more memory the Buffer Pool has, the more able it is to store all the frequently-queried tables and indexes.
Up to the size of your data + indexes, of course. The content copied from the data + indexes is stored only once in memory. So if you have only 8G of data + indexes, there's no need to give the buffer pool more and more memory.
Don't allocate more system memory to the buffer pool than your server can afford. Overallocating memory leads to swapping memory for disk, and that will be bad for performance.
Don't bother with the {DATA|INDEX} DIRECTORY options. Those are for when you need to locate a table on another disk volume, because you're running out of space. It's not likely to help performance. Allocating more system memory to the buffer pool will accomplish that much more reliably.
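The two sizing rules above (no more than data + indexes, and no more than the server can afford) can be sketched as a single calculation. This is only an illustration: reserve_bytes, the memory left for the OS and other processes, is an assumed input that has to be estimated per machine, and the function name is made up.

```python
GIB = 1024 ** 3


def suggest_buffer_pool_bytes(data_bytes, index_bytes, ram_bytes, reserve_bytes):
    """Return a buffer pool size that covers data + indexes when possible,
    but never exceeds RAM minus the memory reserved for everything else."""
    working_set = data_bytes + index_bytes
    affordable = max(0, ram_bytes - reserve_bytes)
    return min(working_set, affordable)


# The 4G data + 4G index table from the question, on a hypothetical 16 GiB host
# with 4 GiB set aside for the OS and other processes.
size = suggest_buffer_pool_bytes(4 * GIB, 4 * GIB, 16 * GIB, 4 * GIB)
print(f"innodb_buffer_pool_size={size // GIB}G")
```

On a smaller host the second limit wins: with 8 GiB of RAM and 2 GiB reserved, the same table would get a 6 GiB pool and InnoDB would keep only the most frequently used pages in memory.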
but I can store either the data or the index in a RAM drive through the DIRECTORY table options...
Short answer: let the database and OS do it.
Using a RAM disk might have made sense 10-20 years ago, but these days the software manages caching disk to RAM for you. The disk itself has its own RAM cache, especially if it's a hybrid drive. The OS will cache file system access in RAM. And then MySQL itself will do its own caching.
And if it's an SSD, that's already extremely fast, so a RAM cache is unlikely to show much improvement.
So making your own RAM disk isn't likely to do anything that isn't already happening. What you will do is pull resources away from the OS and MySQL that they could have managed more intelligently themselves, likely slowing down everything on that machine.
What you're describing is a micro-optimization: an attempt to make individual operations faster. Micro-optimizations tend to add complexity and degrade the system as a whole, and there are limits to how much you can gain from them. For example, if you have to search 1,000,000 rows, and it takes 1ms per row, that's 1,000,000 ms. If you make it 0.9ms per row then it's 900,000 ms.
What you want to focus on is algorithmic optimization, improvements to the algorithm. These tend to make the code simpler and less complex, though often the data structures need to be more thought out, because you're doing less work. Take those same 1,000,000 rows and add an index. Instead of looking at 1,000,000 rows you'll spend, say, 100 ms to look at the index.
The numbers are made up, but I hope you get the point. If "what you want is speed", algorithmic optimizations will take you where no micro-optimization will.
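To make the contrast concrete, here is a toy sketch that counts steps instead of milliseconds. It is not how MySQL works internally: a sorted Python list with binary search merely stands in for an index, and the function names are invented for the example.

```python
def linear_search(rows, key):
    """Check rows one by one; return (position, comparisons made)."""
    for comparisons, value in enumerate(rows, start=1):
        if value == key:
            return comparisons - 1, comparisons
    return -1, len(rows)


def binary_search(sorted_rows, key):
    """Halve the search range each step; return (position, steps taken)."""
    lo, hi, steps = 0, len(sorted_rows), 0
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_rows[mid] < key:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_rows) and sorted_rows[lo] == key
    return (lo if found else -1), steps


rows = list(range(1_000_000))  # sorted keys, so the list doubles as the "index"
print(linear_search(rows, 999_999))  # worst case: every row is compared
print(binary_search(rows, 999_999))  # roughly log2(1,000,000) steps
```

Shaving 10% off each comparison in linear_search still leaves a million comparisons, while the second function needs about twenty.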
There's also the performance of the code using the database to consider. It is often the real bottleneck: unoptimized queries, poor patterns for fetching related data, and not taking advantage of caching.
Micro-optimizations, with their complexities and special configurations, tend to make algorithmic optimizations more difficult. So you might be slowing yourself down in the long run by worrying about micro-optimizations now. Furthermore, you're doing this at the very start when you only have fuzzy ideas about how this thing will be used or perform or where the bottlenecks will be.
Spend your time optimizing your data structures and indexes, not minute details of your database storage. Once you've done that, if it still isn't fast enough, then look at tweaking settings.
As a side note, there is one possible benefit to playing with DIRECTORY. You can put the data and index on separate physical drives. Then both can be accessed simultaneously with the full I/O throughput of each drive.
Though you've just made it twice as likely to have a disk failure, and complicated backups. You're probably better off with an SSD and/or RAID.
And consider whether a cloud database might actually out-perform any hardware you might be able to afford.
InnoDB often assumes the use of spindle hard drives for storing data, so it makes its best effort to reduce random access and opts for sequential access whenever possible. Nowadays, a lot of MySQL instances in production use SSD/flash drives, so the benefit of these design decisions is gone, and they may even cause some overhead.
Does the current development of InnoDB take this into consideration? Is there any configuration option for tuning the performance for SSD drives?
There are a few tunables that directly impact SSD versus spinning drives:
innodb_flush_neighbors = OFF -- since there is no "rotational delay"
innodb_random_read_ahead = OFF
innodb_io_capacity = 2000
innodb_io_capacity_max = 4000
innodb_page_size - Using a smaller size _may_ help if _all_ tables are
accessed randomly _and_ have small rows. (not for the faint of heart)
I don't have "good" numbers for the numeric values, and they depend on the performance characteristics of the SSD/Flash.
There may be more settings. I don't know, for example, about innodb_read_io_threads and innodb_write_io_threads. (Default 4, max 64.)
A Battery-Backed Write Cache on the RAID controller makes writes essentially instantaneous. This is also a factor.
8.0 (the latest version) has some "cost" parameters that apply to disk versus RAM access. They are manually tunable. (Sorry, I don't have the details.) I have implored the developers to make them self-tuning.
Keep in mind that if you are "at the limit" of such tunables, you won't have much room before the system collapses.
Keep in mind that optimizing indexes and queries often gives the biggest bang for your buck.
We just switched over to Google Compute Engine and are having major issues with disk speed. It's been about 5% of Linode or worse. It's never exceeded 20M/s for writing and 10M/s for reading. Most of the time it's 15M/s for writing and 5M/s for reading.
We're currently running an n1-highmem-4 (4 vCPU, 26 GB memory) machine. CPU and memory aren't the bottleneck. We are just running a script that reads rows from a PostgreSQL database, processes them, then writes them back to PostgreSQL. It's a routine job that updates database rows in batches. We tried running 20 processes to take advantage of multiple cores, but the overall progress is still slow.
We're thinking disk may be bottleneck because traffic is abnormally low.
Finally we decided to do benchmarking. We found it's not only slow but seems to have a major bug which is reproducible:
create & connect to instance
run the benchmark at least three times:
dd if=/dev/zero bs=1024 count=5000000 of=~/5Gb.file
We found it becomes extremely slow and we aren't able to finish the benchmark at all.
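Note that the dd command above writes 1 KiB blocks and does not force the data to the device, so part of what it measures is the page cache. As a point of comparison, here is a minimal sketch of a sequential write test that uses larger blocks and includes an fsync in the timing. The function name, the 1 MiB default block size, and the decision to report decimal MB/s are all choices made for this example.

```python
import os
import time


def write_throughput_mb_s(path, total_bytes, block_bytes=1024 * 1024):
    """Write total_bytes of zeros to path, fsync, and return MB/s (10**6 bytes/s)."""
    block = bytes(block_bytes)
    remaining = total_bytes
    start = time.perf_counter()
    with open(path, "wb") as f:
        while remaining > 0:
            chunk = block if remaining >= block_bytes else block[:remaining]
            f.write(chunk)
            remaining -= len(chunk)
        f.flush()
        os.fsync(f.fileno())  # count the time needed to reach the device
    elapsed = time.perf_counter() - start
    return total_bytes / elapsed / 1e6
```

Running it a few times with a file larger than RAM, for example write_throughput_mb_s("bench.file", 5 * 10**9), should give a steadier number than a single cached run.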
Persistent Disk performance is proportional to the size of the disk itself and the VM that it is attached to. The larger the disk (or the VM), the higher the performance, so in essence, the price you are paying for the disk or the VM pays not only for the disk/CPU/RAM but also for the IOPS and throughput.
Quoting the Persistent Disk documentation:
Persistent disk performance depends on the size of the volume and the
type of disk you select. Larger volumes can achieve higher I/O levels
than smaller volumes. There are no separate I/O charges as the cost of
the I/O capability is included in the price of the persistent disk.
Persistent disk performance can be described as follows:
IOPS performance limits grow linearly with the size of the persistent disk volume.
Throughput limits also grow linearly, up to the maximum bandwidth for the virtual machine that the persistent disk is attached to.
Larger virtual machines have higher bandwidth limits than smaller virtual machines.
There's also a more detailed pricing chart on the page which shows what you get per GB of space that you buy (data below is current as of August 2014):
                                        Standard disks   SSD persistent disks
Price (USD/GB per month)                $0.04            $0.025
Maximum Sustained IOPS
  Read IOPS/GB                          0.3              30
  Write IOPS/GB                         1.5              30
  Read IOPS/volume per VM               3,000            10,000
  Write IOPS/volume per VM              15,000           15,000
Maximum Sustained Throughput
  Read throughput/GB (MB/s)             0.12             0.48
  Write throughput/GB (MB/s)            0.09             0.48
  Read throughput/volume per VM (MB/s)  180              240
  Write throughput/volume per VM (MB/s) 120              240
and a concrete example on the page of what a particular size of a disk will give you:
As an example of how you can use the performance chart to determine
the disk volume you want, consider that a 500GB standard persistent
disk will give you:
(0.3 × 500) = 150 small random reads
(1.5 × 500) = 750 small random writes
(0.12 × 500) = 60 MB/s of large sequential reads
(0.09 × 500) = 45 MB/s of large sequential writes
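The same arithmetic can be written as a small calculator. This sketch hard-codes the standard-disk figures from the August 2014 chart quoted above, so the numbers are likely out of date, and it ignores the per-VM bandwidth limit. The names STANDARD_PD and pd_limits are made up for the example.

```python
# (rate per GB, cap per volume) for standard persistent disks,
# copied from the August 2014 chart; current limits may differ.
STANDARD_PD = {
    "read_iops": (0.3, 3_000),
    "write_iops": (1.5, 15_000),
    "read_mb_s": (0.12, 180),
    "write_mb_s": (0.09, 120),
}


def pd_limits(size_gb, rates=STANDARD_PD):
    """Return the sustained limits for one volume of size_gb gigabytes."""
    return {
        name: min(per_gb * size_gb, cap)
        for name, (per_gb, cap) in rates.items()
    }


print(pd_limits(500))  # the 500GB example from the documentation
print(pd_limits(10))   # a hypothetical small 10GB volume
```

By the same chart, a small volume gets only a few MB/s of sequential throughput, which is one plausible reason a freshly created instance with a small disk benchmarks so poorly.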
I need to improve I/O performance for my database. I'm using the "2xlarge" HW described below & considering upgrading to the "4xlarge" HW (http://aws.amazon.com/ec2/instance-types/). Thanks for the help!
Details:
CPU usage is fine (usually under 30%), and uptime load averages are anywhere from 0.5 to 2.0 (but I believe I'm supposed to divide that by the number of CPUs), so that looks okay as well. However, the I/O is bad: iostat shows favorable service times, but the time spent in queue (I suppose this means waiting to access the disk) is far too high. I've configured MySQL to flush to disk every second instead of on every write, which helps, but not enough. Profiling shows that a handful of tables are the culprits for most of the load (both read and write operations). Queries are already indexed and optimized, but the tables are not partitioned. Average MySQL states are: Sending Data # 45%, statistics # 20%, Updating # 15%, Sorting result # 8%.
Questions:
How much performance will I get by upgrading HW?
Same question, but if I partition the high-load tables?
Machines:
m2.2xlarge
64-bit
4 vCPU
13 ECU
34.2 Gb Mem
EBS-Optimized
Network Performance: "Moderate"
m2.4xlarge
64-bit
6 vCPU
26 ECU
68.4 Gb Mem
EBS-Optimized
Network Performance: "High"
In my experience, the biggest boost in MySQL performance comes from IO. You have a lot of RAM. Try setting up a RAM drive and pointing tmpdir to it.
I have several MySQL servers that are very busy. My settings are below - maybe this can help you tweak your settings.
My Setup is:
-Dual 2.66 CPU 8 core box with a 6-drive Raid-1E array - 1.3TB.
-InnoDB logs on separate SSD drives.
-tmpdir is on a 2GB tmpfs partition.
-32GB of ram
InnoDB settings:
innodb_thread_concurrency=16
innodb_buffer_pool_size = 22G
innodb_additional_mem_pool_size = 20M
innodb_log_file_size = 400M
innodb_log_files_in_group=8
innodb_log_buffer_size = 8M
innodb_flush_log_at_trx_commit = 2 (This is a slave machine - 1 is not required for my purposes)
innodb_flush_method=O_DIRECT
Current Queries per second avg: 5185.650
I am using Percona Server, which is quite a bit faster than other MySQLs in my testing.