How do I determine maximum transaction size in MySQL?

Saw the same question posed for PostgreSQL here; wondering if anyone knows (a) the MySQL flavour of the response and (b) which MySQL options I would examine to determine/influence the answer.
I don't need an absolute answer btw, but if I were to propose inserting, say, 200,000 rows of ~2 KB each, would you consider that very straightforward, or pushing the limit a bit?
Assume MySQL is running on a well-specced Linux box with 4 GB of RAM, shedloads of disk space, and an instance tuned by someone who generally knows what they're doing!
Cheers
Brian

For InnoDB, the transaction size will be limited by the size of the redo log (ib_logfile*), so if you plan to commit very large transactions, make sure you set innodb_log_file_size=256M or more. The drawback is that recovery will take longer after a crash.
But for the record, Innobase employees recommend keeping your transactions short.
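As a rough sanity check on the numbers in the question, the raw payload of the proposed insert can be compared with the combined size of the redo log files. This is only a back-of-envelope sketch: it counts row payload alone (no index, undo, or redo record overhead), and it assumes two log files in the group, which is not stated in the thread. The helper names are hypothetical.

```python
# Back-of-envelope estimate: raw row payload only, ignoring index,
# undo, and redo record overhead (all of which add to the real figure).
KIB = 1024
MIB = 1024 * KIB


def payload_mib(rows: int, row_bytes: int) -> float:
    """Raw data volume of a bulk insert, in MiB."""
    return rows * row_bytes / MIB


def redo_capacity_mib(log_file_size_mib: int, files_in_group: int = 2) -> int:
    """Combined size of the ib_logfile* files, in MiB.

    files_in_group=2 is an assumption, not a value given in the thread.
    """
    return log_file_size_mib * files_in_group


batch = payload_mib(200_000, 2 * KIB)  # 200,000 rows of ~2 KB, from the question
capacity = redo_capacity_mib(256)      # innodb_log_file_size=256M, from the answer
print(f"payload ~{batch:.1f} MiB, redo capacity {capacity} MiB")
```

If the payload estimate comes close to or exceeds the redo capacity, splitting the load into several smaller commits is the cautious option, in line with the advice to keep transactions short.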

There are no transaction limits built into SQL servers. The limit is the hardware running it: physical RAM and free space on the hard disk.
We successfully run imports of millions of rows.

How to reduce the startup time for MySQL 5.7 with many databases?

I have MySQL 5.7.24 running on a Windows VM. It has a few thousand databases (7000). I understand this is not the recommended set up for MySQL but some business requirements have necessitated this multi-tenant db structure and I cannot change that unfortunately.
The server works fine when it is running but the startup time can get pretty long, almost 20-30 mins after a clean shutdown of the MySQL service and 1+ hours after a restart of the Windows VM.
Is there any way to reduce the startup time?
In my configuration, I observed that innodb_file_per_table = ON (which is the default for MySQL 5.7 I believe) and so I think that at startup it is scanning every .ibd file.
Would changing innodb_file_per_table = OFF and then altering each table to get rid of the .ibd files be a viable option? One thing to note is that, in general, every database is pretty small, and even with 7000 databases the total size of the data is only about 60 GB. So, to my understanding, innodb_file_per_table = ON is more beneficial when there are single tables that can get pretty large, which is not the case for my server.
Question: Is my logic reasonable and could this innodb_file_per_table be the reason for the slow startup? Or is there some other config variable that I can change so that each .ibd file is not scanned before the server starts accepting connections.
Any help to guide me in the right direction would be much appreciated. Thanks in advance!
You should upgrade to MySQL 8.0.
I was working on a system with the same problem as yours. In our case, we had about 1500 schemas per MySQL instance, and a little over 100 tables per schema. So it was about 160,000+ tables per instance. It caused lots of problems trying to use innodb_file_per_table, because the mysqld process couldn't work with that many open file descriptors efficiently. The only way to make the system work was to abandon file-per-table, and move all the tables into the central tablespace.
But that causes a different problem. Tablespaces never shrink, they only grow. The only way to shrink a tablespace is to move the tables to another tablespace, and drop the big one.
One day one of the developers added some code that used a table like a log, inserting a vast number of rows very rapidly. I got him to stop logging that data, but by then it was too late. MySQL's central tablespace had expanded to 95% of the size of the database storage, leaving too little space for binlogs and other files. And I could never shrink it without incurring downtime for our business.
I asked him, "Why were you writing to that table so much? What are you doing with the data you're storing?" He shrugged and said casually, "I dunno, I thought the data might be interesting sometime, but I had no specific use for them." I felt like strangling him.
The point of this story is that one naïve developer can cause a lot of inconvenience if you disable innodb_file_per_table.
When MySQL 8.0 was being planned, the MySQL Product Manager solicited ideas for scalability criteria. I told him about the need to support instances with a lot of tables, like 160k or more. MySQL 8.0 included an all-new implementation of internal code for handling metadata about tables, and he asked the engineers to test the scalability with up to 1 million tables (with file-per-table enabled).
So the best solution to your problem is not to turn off innodb_file_per_table. That will just lead to another kind of crisis. The best solution is to upgrade to 8.0.
Re your comment:
As far as I know, InnoDB does not open tables at startup time. It opens tables when they are first queried.
Make sure you have table_open_cache and innodb_open_files tuned for your scale. Here is some reading:
https://dev.mysql.com/doc/refman/5.7/en/table-cache.html
https://www.percona.com/blog/2009/11/18/how-innodb_open_files-affects-performance/
https://www.percona.com/blog/2018/11/28/what-happens-if-you-set-innodb_open_files-higher-than-open_files_limit/
https://www.percona.com/blog/2017/10/01/one-million-tables-mysql-8-0/
I hope you are using an SSD for storage, not a spinning disk. This makes a huge difference when doing a lot of small I/O operations. SSD storage devices have been a standard recommendation for database servers for about 10 years.
Also this probably doesn't help you but I gave up on using Windows around 2007. Not as a server nor a desktop.

Why is the import process very slow in Solr 5.3.x?

I'm using Solr 5.3.1's DataImportHandler to import IMDB data which I imported into MySQL.
However, it takes anywhere from a couple of seconds to minutes to process a single document. My table contains 10M+ rows, so this is going to take forever. I have materialized all the data, and it only takes a few minutes for MySQL to process all the rows.
What could have caused this poor performance?
@yangrui
Unfortunately there is no single answer to your question on why indexing is slow. 24G is a lot of heap but depending on the actual size of your index it may or may not be enough.
Modifying the commit policy should also help if you are committing too frequently. Solr does a lot of its magic of making documents available for search when a 'commit' / 'autocommit' happens. However, when a commit does happen, it is a resource-hungry operation.
One other thing that is not obvious is the actual unallocated RAM available on the server. By unallocated I mean additional RAM on the server apart from the RAM that is associated with the JVM as Heap.
I suggest going through this documentation https://wiki.apache.org/solr/SolrPerformanceProblems#RAM
I suspect that you may not have enough RAM on your machine.
Hope this helps.

MySQL changing large table to InnoDB

I have a MySQL server running on CentOS which houses a large (>12GB) DB. I have been advised to move to InnoDB for performance reasons as we are experiencing lockups where the application that relies on the DB becomes unresponsive when the server is busy.
I have been reading around and can see that the ALTER command that changes the table to InnoDB is likely to take a long time and hammer the server in the process. As far as I can see, the only change required is to use the following command:
ALTER TABLE t ENGINE=InnoDB
I have run this on a test server and it seems to complete fine, taking about 26 minutes on the largest of the tables that needs to be converted.
Having never run this on a production system I am interested to know the following:
What changes are recommended to be made to the MySQL config to take advantage of additional performance of InnoDB tables? The server currently has 3GB assigned to InnoDB cache - was thinking of increasing this to 15GB once the additional RAM is installed.
Is there anything else I should do to the server with this change?
I would really recommend using either Percona MySQL or MariaDB. Both have tools that will help you get the most out of InnoDB, as well as some tools to help you diagnose and optimize your database further (for example, Percona's Online Schema Change tool could be used to alter your tables without downtime).
As far as optimization of InnoDB, I think most would agree that innodb_buffer_pool_size is one of the most important parameters to tune (and typically people set it around 70-80% of total available memory, but that's not a magic number). It's not the only important config variable, though, and there's really no magic run_really_fast setting. You should also pay attention to innodb_buffer_pool_instances (and there's a good discussion about this topic on https://dba.stackexchange.com/questions/194/how-do-you-tune-mysql-for-a-heavy-innodb-workload)
Also, you should definitely check out the tips offered in the MySQL documentation itself (http://dev.mysql.com/doc/refman/5.6/en/optimizing-innodb.html). It's also a good idea to pay attention to your InnoDB hit ratio (Rolando over at DBA Stackexchange has a great answer on this topic, e.g., https://dba.stackexchange.com/questions/65341/innodb-buffer-pool-hit-rate) and analyze your slow query logs carefully. Towards that latter end, I would definitely recommend taking a look at Percona again. Their slow query analyzer is top notch and can really give you a leg up when it comes to optimizing SQL performance.
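To make the "hit ratio" idea concrete, here is a minimal sketch of one common way to compute it from two InnoDB status counters, Innodb_buffer_pool_read_requests (logical reads) and Innodb_buffer_pool_reads (reads that had to go to disk). The function name and the sample counter values are hypothetical; in practice the counters would come from SHOW GLOBAL STATUS on your own server, and the linked answer may use a different formula.

```python
def buffer_pool_hit_ratio(read_requests: int, disk_reads: int) -> float:
    """Fraction of logical reads served from the buffer pool.

    read_requests: value of Innodb_buffer_pool_read_requests
    disk_reads:    value of Innodb_buffer_pool_reads
    """
    if read_requests == 0:
        # No reads recorded yet (e.g. right after startup); nothing to measure.
        return 0.0
    return 1.0 - disk_reads / read_requests


# Hypothetical counter values, for illustration only.
ratio = buffer_pool_hit_ratio(read_requests=1_000_000, disk_reads=2_500)
print(f"buffer pool hit ratio: {ratio:.2%}")
```

Since the counters are cumulative since server start, comparing two samples taken some time apart gives a better picture of current behaviour than a single reading.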

At what point does MySQL INNODB fine tuning become a requirement?

I had a look at this:
http://www.mysqlperformanceblog.com/2009/01/12/should-you-move-from-myisam-to-innodb/
and:
http://www.mysqlperformanceblog.com/2007/11/01/innodb-performance-optimization-basics/
These answer a lot of my questions regarding INNODB vs MyISAM. There is no doubt in my mind that INNODB is the way I should go. However, I am working on my own and for development I have created a LAMP (Ubuntu 10.10 x64) VM server. At present the server has 2 GB memory and a single SATA 20GB drive. I can increase both of these amounts without too much trouble to about 3-3.5 GB memory and a 200GB drive.
The reasons I hesitate to switch over to INNODB are:
A) The above articles mention that INNODB will vastly increase the size of the tables, and they recommend much larger amounts of RAM and drive space. While in a production environment I don't mind this increase, in a development environment I fear I cannot accommodate it.
B) I don't really see any point in fine tuning the INNODB engine on my VM. This is likely something I will not even be allowed to do in my production environment. The articles make it sound like INNODB is doomed to fail without fine tuning.
My question is this: at what point is INNODB viable? How much RAM would I need to run INNODB on my server (with just my data for testing; this server is not open to anyone but me)? Also, is it safe for me to assume that a production environment that will not allow me to fine tune the DB has likely already been fine tuned by its operators?
Also, am I overthinking/overworrying about things?
IMHO, it becomes a requirement when you have tens of thousands of rows, or when you can forecast the rate of growth for data.
You need to focus on tuning the innodb buffer pool and the log file size. Also, make sure you have innodb_file_per_table enabled.
To get an idea of how big to make the innodb buffer pool in KB, run this query:
SELECT SUM(data_length+index_length)/power(1024,1) IBPSize_KB
FROM information_schema.tables WHERE engine='InnoDB';
Here it is in MB
SELECT SUM(data_length+index_length)/power(1024,2) IBPSize_MB
FROM information_schema.tables WHERE engine='InnoDB';
Here it is in GB
SELECT SUM(data_length+index_length)/power(1024,3) IBPSize_GB
FROM information_schema.tables WHERE engine='InnoDB';
I wrote articles about this kind of tuning
First Article
Second Article
Third Article
Fourth Article
If you are limited by the amount of RAM on your server, do not go beyond 25% of the installed RAM, for the sake of the OS.
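The sizing advice above can be sketched as a small calculation: take the data-plus-index total returned by the queries and cap it at a fraction of installed RAM. This reads the 25% figure as an upper bound on the buffer pool on a RAM-constrained machine, which is an interpretation rather than something the answer spells out. The function name and the 3 GiB data size are hypothetical; the 2 GiB of RAM matches the VM described in the question.

```python
GIB = 1024 ** 3


def suggested_buffer_pool_bytes(
    data_plus_index_bytes: int,
    installed_ram_bytes: int,
    ram_fraction: float = 0.25,  # assumed reading of the "25%" advice
) -> int:
    """Buffer pool size: enough for data + indexes, capped by a RAM fraction."""
    cap = int(installed_ram_bytes * ram_fraction)
    return min(data_plus_index_bytes, cap)


# Hypothetical 3 GiB of InnoDB data and indexes on the 2 GiB development VM.
size = suggested_buffer_pool_bytes(3 * GIB, 2 * GIB)
print(f"suggested innodb_buffer_pool_size: {size // (1024 ** 2)}M")
```

When the data set is smaller than the cap, the function simply returns the data size, since a buffer pool larger than the data brings no benefit.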
I think you may be overthinking things. It's true that INNODB loves RAM, but if your database is small I don't think you'll have many problems. The only issue I have had with MySQL or any other database is that as the data grows, so do the requirements for accessing it quickly. You can also use compression on the tables to keep them smaller, but INNODB is vastly better than MyISAM at data integrity.
I also wouldn't worry about tuning your application until you run into a bottleneck. Writing efficient queries and good database design seem to be more important than memory unless you're working with very large data sets.

Best storage engine for constantly changing data

I currently have an application that uses 130 MySQL tables, all with the MyISAM storage engine. Every table has multiple queries every second, including select/insert/update/delete queries, so the data and the indexes are constantly changing.
The problem I am facing is that the hard drive is unable to cope, with waiting times up to 6+ seconds for I/O access with so many read/writes being done by MySQL.
I was thinking of changing to just 1 table and making it memory based. I've never used a memory table for something with so many queries though, so I am wondering if anyone can give me any feedback on whether it would be the right thing to do?
One possibility is that there may be other issues causing performance problems - 6 seconds seems excessive for CRUD operations, even on a complex database. Bear in mind that (back in the day) ArsDigita could handle 30 hits per second on a two-way Sun Ultra 2 (IIRC) with fairly modest disk configuration. A modern low-mid range server with a sensible disk layout and appropriate tuning should be able to cope with quite a substantial workload.
Are you missing an index? - check the query plans of the slow queries for table scans where they shouldn't be.
What is the disk layout on the server? - do you need to upgrade your hardware or fix some disk configuration issues (e.g. not enough disks, logs on the same volume as data).
As the other poster suggests, you might want to use InnoDB on the heavily written tables.
Check the setup for memory usage on the database server. You may want to configure more cache.
Edit: Database logs should live on quiet disks of their own. They use a sequential access pattern with many small sequential writes. Where they share disks with a random access work load like data files the random disk access creates a big system performance bottleneck on the logs. Note that this is write traffic that needs to be completed (i.e. written to physical disk), so caching does not help with this.
I've now changed to a MEMORY table and everything is much better. In fact I now have extra spare resources on the server allowing for further expansion of operations.
Is there a specific reason you aren't using InnoDB? It may yield better performance due to caching and a different concurrency model. It will likely require more tuning, but may yield much better results.
http://www.mysqlperformanceblog.com/2009/01/12/should-you-move-from-myisam-to-innodb/
I think that your database structure is very wrong and needs to be optimised; this has nothing to do with the storage engine.