We are trying to load several TB of data into MySQL Cluster, but unfortunately the index does not fit into memory.
Is there a way to overcome this limitation of MySQL?
Is there a way in MySQL to process range operations in parallel?
My data consists of 3D points (id, x, y, z, idkey, someblob) stored in a MyISAM table with 128 partitions. NDB Cluster was unable to load the data because of memory limits.
The index is on idkey (a precalculated Peano-Hilbert key). The total row count is about 10^9.
Thanks Arman.
EDIT
My setup is 2 data nodes, 2 mysqld nodes, and one management node.
Each data node (ndbd) has 8 GB RAM and 4 cores.
The whole system has 30 TB of RAID 6 storage.
The OS is Scientific Linux 6.0; the cluster is MySQL Cluster 7.1, compiled from source.
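As a rough sanity check on the numbers above, here is a minimal back-of-envelope sketch in Python. The 8-byte key width, the per-entry overhead, and the single node group are assumptions made purely for illustration; they are not measured values and not NDB's actual accounting.

```python
# Back-of-envelope check: can an index over 10^9 rows fit in data-node RAM?
ROWS = 10**9            # from the question
KEY_BYTES = 8           # assumption: idkey stored as a 64-bit integer
OVERHEAD_BYTES = 16     # assumption: hypothetical per-entry index overhead
NODE_RAM_GIB = 8        # from the question: 8 GB RAM per data node
NODE_GROUPS = 1         # assumption: 2 data nodes holding 2 replicas of all data


def index_gib(rows, bytes_per_entry, node_groups=1):
    """Approximate index size held by each data node, in GiB."""
    return rows * bytes_per_entry / node_groups / 2**30


key_only = index_gib(ROWS, KEY_BYTES, NODE_GROUPS)
with_overhead = index_gib(ROWS, KEY_BYTES + OVERHEAD_BYTES, NODE_GROUPS)

print(f"keys alone:    {key_only:.1f} GiB per node")
print(f"with overhead: {with_overhead:.1f} GiB per node")
print("fits in RAM:", with_overhead < NODE_RAM_GIB)
```

Under these assumptions the bare keys alone take up nearly all of a node's 8 GB, before any row data or index overhead is counted.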
It sounds like MySQL is ill-suited for the task (sorry). I would check out Tokyo Tyrant, maybe MongoDB or any other distributed key-value storage system. There are also specialized commercial products.
MongoDB is able to swap out some of its indexes to disk. I guess your problem is that MySQL just can't do that (I'm not a MySQL guy, though).
Maybe you can try to modify your config.ini file.
DataMemory=15000M
IndexMemory=2560M
But if these two values are too high, you will run into this bug:
Unable to use due to bitmap pages missaligned!!
So I'm still trying to solve it. Good luck.
I faced the same issue when I was loading only the structure of the DB tables, which means that DataMemory and IndexMemory were of no help here.
The number of tables also didn't reach the limit set by MaxNoOfTables, so that was not the issue either.
The solution for me was to increase the values of MaxNoOfOrderedIndexes and MaxNoOfUniqueHashIndexes, which set the maximum number of indexes you can have in the cluster. So if there are many indexes in your DB, try increasing those parameters accordingly.
Of course, a rolling restart must be done for the change to take effect!
Related
I have largish (InnoDB) tables in a database; apparently the users are capable of making SELECTs with JOINs that result in temporary, large (and thus on-disk) tables. Sometimes, those are so large that they exhaust disk space, leading to all sorts of weird issues.
Is there a way to limit temp table maximum size for an on-disk table, so that the table doesn't overgrow the disk? tmp_table_size only applies to in-memory tables, despite the name. I haven't found anything relevant in the documentation.
There's no option for this in MariaDB and MySQL.
I ran into the same issue some months ago. I searched a lot and finally solved it partially by creating a special storage area on the NAS for temporary datasets.
Create a folder on your NAS or a partition on an internal HDD (a dedicated volume is by definition limited in size), mount it, and in the MySQL ini file point the temporary storage at this drive (the first line is the Linux form, the second the Windows form; choose one):
tmpdir="/mnt/DBtmp/"
tmpdir="T:/"
The MySQL service must be restarted after this change.
With this approach, once the drive is full, you still have "weird issues" with on-disk queries, but the other issues are gone.
There was a discussion about an option disk-tmp-table-size, but it looks like the commit did not make it through review or got lost for some other reason (at least the option does not exist in the current code base anymore).
I guess your next best try (besides increasing storage) is to tune MySQL to not make on-disk temp tables. There are some tips for this on DBA. Another attempt could be to create a ramdisk for the storage of the "on-disk" temp tables, if you have enough RAM and only lack disk storage.
While it does not answer the question for MySQL, MariaDB has tmp_disk_table_size and potentially also useful max_join_size settings. However, tmp_disk_table_size is only for MyISAM or Aria tables, not for InnoDB. Also, max_join_size works only on the estimated row count of the join, not the actual row count. On the bright side, the error is issued almost immediately.
I am fairly new to MySQL. I have a database consisting of a few hundred table files. When I run a report I notice (through ProcMon) that MySQL is opening and closing the tables hundreds of thousands of times! That greatly affects performance. Is there some setting to direct MySQL to keep table files open until MySQL is shut down? Or at least to reduce the file thrashing?
Thanks.
Plan A: Don't worry about it.
Plan B: Increase table_open_cache to a few thousand. (See SHOW VARIABLES LIKE 'table_open_cache';) If that value won't stick, check whether the operating system is constraining things (ulimit).
Plan C: It is rare to see an application that needs over a hundred tables. Ponder what the application is doing. (WP, for example, uses 12 tables per user. This does not scale well.)
I noticed that my database server supports the Memory database engine. I want to make a database I have already made running InnoDB run completely in memory for performance.
How do I do that? I explored PHPMyAdmin, and I can't find a "change engine" functionality.
Assuming you understand the consequences of using the MEMORY engine as mentioned in comments, and here, as well as some others you'll find by searching about (no transaction safety, locking issues, etc) - you can proceed as follows:
MEMORY tables are stored differently than InnoDB, so you'll need to use an export/import strategy. First dump each table separately to a file using SELECT * FROM tablename INTO OUTFILE 'table_filename'. Create the MEMORY database and recreate the tables you'll be using with this syntax: CREATE TABLE tablename (...) ENGINE = MEMORY;. You can then import your data using LOAD DATA INFILE 'table_filename' INTO TABLE tablename for each table.
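The three steps above can be sketched as a small Python helper that only builds the SQL text for one table. The function name `memory_migration_sql`, the target database `memdb`, and the dump path are hypothetical names chosen for illustration; the column list must be supplied by hand, and the MEMORY engine does not support BLOB or TEXT columns.

```python
def memory_migration_sql(table, columns_sql, dump_path, target_db):
    """Build the dump / create / load statements for one table.

    All arguments are trusted identifiers or literals supplied by the
    administrator; this sketch does no quoting or escaping.
    """
    return [
        # 1. Dump the existing (InnoDB) table to a server-side file.
        f"SELECT * FROM {table} INTO OUTFILE '{dump_path}';",
        # 2. Recreate the table in the target database with ENGINE = MEMORY.
        f"CREATE TABLE {target_db}.{table} ({columns_sql}) ENGINE = MEMORY;",
        # 3. Load the dumped rows into the new MEMORY table.
        f"LOAD DATA INFILE '{dump_path}' INTO TABLE {target_db}.{table};",
    ]


statements = memory_migration_sql(
    table="points",
    columns_sql="id INT PRIMARY KEY, x DOUBLE",
    dump_path="/tmp/points.txt",
    target_db="memdb",
)
for statement in statements:
    print(statement)
```

The printed statements would then be run one at a time in a MySQL client, in the order shown.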
It is also possible to place the MySQL data directory on a tmpfs, thus speeding up the database's read and write calls. It might not be the most efficient way to do this, but sometimes you just can't change the storage engine.
Here is my fstab entry for my MySQL data directory
none /opt/mysql/server-5.6/data tmpfs defaults,size=1000M,uid=999,gid=1000,mode=0700 0 0
You may also want to take a look at the innodb_flush_log_at_trx_commit=2 setting. Maybe this will speed up your MySQL sufficiently.
innodb_flush_log_at_trx_commit changes MySQL's disk flush behaviour. When set to 2, the log is written at each commit but flushed to disk only about once per second. With the default of 1, every commit causes a flush and thus more I/O load.
Memory Engine is not the solution you're looking for. You lose everything that you went to a database for in the first place (i.e. ACID).
Here are some better alternatives:
Don't use joins - very few large apps do this (e.g. Google, Flickr, Netflix), because joins perform badly on large data sets.
A LEFT [OUTER] JOIN can be faster than an equivalent subquery because
the server might be able to optimize it better—a fact that is not
specific to MySQL Server alone.
-The MySQL Manual
Make sure the columns you're querying against have indexes. Use EXPLAIN to confirm they are being used.
Use and increase your Query_Cache and memory space for your indexes to get them in memory and store frequent lookups.
Denormalize your schema, especially for simple joins (e.g. get fooId from barMap).
The last point is key. I used to love joins, but then had to run joins on a few tables with 100M+ rows. No good. You are better off inserting the data you're joining against into the target table (if it's not too much) and querying against indexed columns; you'll get your result in a few ms.
I hope those help.
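To make the denormalization point concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, purely so the example is self-contained; the `bar` table, its `payload` column, and the sample rows are hypothetical, and only `fooId` and `barMap` come from the list above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bar    (barId INTEGER PRIMARY KEY, payload TEXT);
    CREATE TABLE barMap (barId INTEGER, fooId INTEGER);
    INSERT INTO bar    VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO barMap VALUES (1, 10), (2, 20), (3, 10);
""")

# Normalized schema: every lookup by fooId has to join through barMap.
joined = conn.execute(
    "SELECT bar.barId FROM bar JOIN barMap ON bar.barId = barMap.barId "
    "WHERE barMap.fooId = ? ORDER BY bar.barId",
    (10,),
).fetchall()

# Denormalized schema: copy fooId into bar once, index it, query bar alone.
conn.executescript("""
    ALTER TABLE bar ADD COLUMN fooId INTEGER;
    UPDATE bar SET fooId =
        (SELECT fooId FROM barMap WHERE barMap.barId = bar.barId);
    CREATE INDEX idx_bar_fooId ON bar (fooId);
""")
flat = conn.execute(
    "SELECT barId FROM bar WHERE fooId = ? ORDER BY barId", (10,)
).fetchall()

print(joined, flat)
```

Both queries return the same rows; the second touches a single table through an indexed column. The price is that `bar.fooId` has to be kept in sync whenever the mapping changes.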
If your database is small enough (or if you add enough memory), it will effectively run in memory, since your data will be cached after the first request.
Changing the database table definitions to use the memory engine is probably more complicated than you need.
If you have enough memory to load the tables into memory with the MEMORY engine, you have enough to tune the innodb settings to cache everything anyway.
"How do I do that? I explored PHPMyAdmin, and I can't find a "change engine" functionality."
In direct response to this part of your question, you can issue an ALTER TABLE tbl ENGINE=MEMORY; (or name any other engine, such as InnoDB, to switch back) and it'll recreate the table in the requested engine.
In place of the MEMORY storage engine, one can consider MySQL Cluster. It is said to give similar performance but to support disk-backed operation for durability. I've not tried it, but it looks promising (and has been in development for a number of years).
You can find the official MySQL Cluster documentation here.
Additional thoughts :
Ramdisk - setting the temp drive MySQL uses as a RAM disk, very easy to set up.
memcache - memcache server is easy to set up, use it to store the results of your queries for X amount of time.
One question about NDBCLUSTER.
I inherited the development of a website based on an NDBCLUSTER 5.1 solution (LAMP platform).
Unfortunately, whoever designed the former solution didn't realize that this database engine has strong limits. For one, the maximum number of fields a table can have is 128. The former programmer conceived tables with 369 fields in a single row, one for each day of the year plus some key fields (he originally worked with the MyISAM engine). OK, it must be refactored anyway, I know.
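One way the refactoring mentioned above could look is to store one row per key and day instead of one column per day. The sketch below is only an illustration: the function name `wide_to_long`, the key value, and the sample data are hypothetical.

```python
from datetime import date, timedelta


def wide_to_long(key, year, daily_values):
    """Turn one wide row (one value per day) into (key, day, value) rows."""
    start = date(year, 1, 1)
    return [
        (key, (start + timedelta(days=offset)).isoformat(), value)
        for offset, value in enumerate(daily_values)
        if value is not None  # days without data produce no row at all
    ]


# A wide row truncated to three days for brevity; the second day is empty.
rows = wide_to_long("sensor-1", 2011, [5, None, 7])
print(rows)
```

The resulting table needs only three columns (key, day, value), which stays far below the 128-field limit no matter how many days are stored.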
What is more, the engine needs a lot of tuning: maximum number of attributes for a table (which defaults to 1000, a bit too few) and many other parameters, the misinterpretation or underestimation of which can lead to serious problems once you're in production with your database and you're forced to change something.
Disk storage for NDBCLUSTER tables is also somewhat unpredictable unless precisely configured: even when it is specified in CREATE TABLE statements, the engine seems to prefer keeping data in memory. That explains the speed, but it can be a pain if your table on node 1 suddenly collapses (as it did during testing): all table data was lost on all nodes, and the table was corrupted after only 1000 records.
We were on a server with 8Gb RAM, and the table had just 27 fields.
Please note that no ndb_mgm operation for nodes shutdown ran to compromise table data. It simply fell down, full stop. Our provider didn't understand why.
So the question is: would you recommend NDBCLUSTER as a stable solution for a large scale web service database?
We're talking about a database which should contain several millions of records, thousands of tables and thousands of catalogues.
If not, which database would you recommend as the best for building a nation-scale web service?
Thanks in advance.
I had a terrible experience with NDBCLUSTER. It's a good replacement for memcached with range invalidation, nothing more. This solution offers neither stability nor configurability. You cannot force all processes to listen on specific ports, and although backup worked, I had to edit the backup files in vim to restore the database, etc.
We have a database that was backed up from a Linux 64-bit version of MySQL and that we have restored onto a Windows 32-bit version of MySQL.
We have a table with about 4.5 gig of data - the main space being consumed by a BLOB field containing file data. The table itself only has about 6400 records in it.
The following query executes on the Linux box in no time at all; but on the windows box, it takes about 5 minutes to run and in the process, the server is unresponsive to anything else:
select id from fileTable where cid=1234
Is there some sort of optimization we need to do? Is there some sort of special considerations that need to be met when going from Linux to Windows or from 64 bit to 32 bit?
If you copied over the binary database files *.frm *.MYD *.MYI
Then you should check this out:
http://dev.mysql.com/doc/refman/5.1/en/mysql-upgrade.html
If you exported the DB's content, then:
Check the version numbers and changelogs.
Also consider if there are performance differences between the two systems! (maybe this is normal)
I'd say the 4.5 GB of data will be more easily accessible via a 64-bit address space (which can directly address up to 2^64 bytes, about 16 EiB).
A 32-bit address space is limited to 4 GB (2^32 bytes), so if your database is bigger than this, MySQL possibly cannot handle the indexing as quickly.
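The address-space arithmetic can be checked with a few lines of Python. This is only a sketch of the raw limits; it reads the "4.5 gig" from the question as GiB (an assumption) and ignores whatever share of the address space the operating system reserves for itself.

```python
GIB = 2**30   # bytes in one gibibyte
EIB = 2**60   # bytes in one exbibyte

addr32 = 2**32   # bytes addressable with a 32-bit pointer
addr64 = 2**64   # bytes addressable with a 64-bit pointer

table_bytes = 4.5 * GIB   # assumption: "4.5 gig" read as 4.5 GiB

print("32-bit limit:", addr32 // GIB, "GiB")
print("64-bit limit:", addr64 // EIB, "EiB")
print("table larger than 32-bit address space:", table_bytes > addr32)
```

So the table cannot be mapped into a 32-bit process all at once, while a 64-bit process has room to spare.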
Is there a way of re-indexing or optimising your database so that it's paged in a more 32-bit-friendly way? Altering the largest table to a different storage engine may do this (use ALTER or phpMyAdmin).
You may also have lost some indexing on the ID field after the move. Try running your select again with EXPLAIN at the front:
EXPLAIN select id from fileTable where cid=1234
If the possible_keys and key columns of the resulting table are mostly NULL, no index is being used and you need to reconfigure the keys in your table.
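The effect of a missing index on the query plan can be reproduced in miniature. The sketch below uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN as a stand-in for MySQL's EXPLAIN, so the output format differs from what MySQL prints; the `data` column, the index name `idx_cid`, and the helper `plan` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fileTable (id INTEGER PRIMARY KEY, cid INTEGER, data BLOB)"
)


def plan(sql):
    """Return the human-readable detail text of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)  # detail is the last column


query = "SELECT id FROM fileTable WHERE cid = 1234"

before = plan(query)   # no index on cid yet: the whole table is scanned
conn.execute("CREATE INDEX idx_cid ON fileTable (cid)")
after = plan(query)    # the plan now names the new index

print("before:", before)
print("after: ", after)
```

In MySQL the equivalent signal is the key column of the EXPLAIN output changing from NULL to the name of the index on cid.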
Alternatively, check how much RAM the 32-bit box has. Is it comparable to the 64-bit box?
Obviously, back it all up and do tests on a non-live server first ;)
I hope this helps you get on the right track.
Just run OPTIMIZE TABLE on all your tables and check whether you (still) have an index on cid.