How to put a 188MB MyISAM table into memory - mysql

For performance reasons I want to put a 188MB table (rebuilt on disk every day) with ~550,000 rows into a MEMORY table. Whenever I try this, I run into a HEAP error ...
My server has 1.3GB of free RAM (32-bit, only 4GB total).

Have you checked the configured MySQL heap table size? Have a look at this:
mysql> show variables like "%heap%";
+---------------------+----------+
| Variable_name       | Value    |
+---------------------+----------+
| max_heap_table_size | 16777216 |
+---------------------+----------+
1 row in set (0.02 sec)
The default value is 16MB.
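A minimal sketch of raising the limit and building the in-memory copy (the table names and the 512M ceiling are illustrative assumptions; MEMORY rows are fixed-width, so the copy usually needs noticeably more RAM than the 188MB MyISAM file, and BLOB/TEXT columns are not supported):
SET SESSION max_heap_table_size = 512 * 1024 * 1024;  -- ceiling for MEMORY tables created in this session

DROP TABLE IF EXISTS my_table_mem;                    -- hypothetical name for the in-memory copy
CREATE TABLE my_table_mem ENGINE=MEMORY
    SELECT * FROM my_table;                           -- my_table = the daily-rebuilt MyISAM table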

Related

MySQL LIMIT X, Y slows down as I increase X

I have a DB with around 600,000 listings. While browsing these on a paginated page, I use this query to limit the records:
SELECT file_id, file_category FROM files ORDER BY file_edit_date DESC LIMIT 290580, 30
On the first pages (LIMIT 0, 30) it loads in a few ms, and the same goes for LIMIT 30,30, LIMIT 60,30, LIMIT 90,30, etc. But as I move toward the end of the pages, the query takes around 1 second to execute.
Indexes are probably not the issue; it also happens if I run this:
SELECT * FROM `files` LIMIT 400000,30
Not sure why.
Is there a way to improve this?
Unless there is a better solution, would it be bad practice to just load all records and loop over them in the PHP page to check whether each record is inside the pagination range and print it?
The server is an i7 with 16GB RAM;
MySQL Community Server 5.7.28;
the files table is around 200 MB.
Here is the my.cnf in case it matters:
query_cache_type = 1
query_cache_size = 1G
sort_buffer_size = 1G
thread_cache_size = 256
table_open_cache = 2500
query_cache_limit = 256M
innodb_buffer_pool_size = 2G
innodb_log_buffer_size = 8M
tmp_table_size=2G
max_heap_table_size=2G
You may find that adding the following index will help performance:
CREATE INDEX idx ON files (file_edit_date DESC, file_id, file_category);
If it is used, MySQL only needs a single index scan to retrieve the requested number of records at a given offset. Note that the index includes the columns from the SELECT clause, so it can cover the entire query.
LIMIT was invented to reduce the size of the result set; the optimizer can take advantage of it if you order the result set using an index.
When using LIMIT x,n the server needs to process x+n rows to deliver a result. The higher the value for x, the more rows have to be processed.
Here is the EXPLAIN output for a simple table that has a unique index on column a:
MariaDB [test]> explain select a,b from t1 order by a limit 0, 2;
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------+
| id   | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------+
|    1 | SIMPLE      | t1    | index | NULL          | PRIMARY | 4       | NULL |    2 |       |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------+
1 row in set (0.00 sec)
MariaDB [test]> explain select a,b from t1 order by a limit 400000, 2;
+------+-------------+-------+-------+---------------+---------+---------+------+--------+-------+
| id   | select_type | table | type  | possible_keys | key     | key_len | ref  | rows   | Extra |
+------+-------------+-------+-------+---------------+---------+---------+------+--------+-------+
|    1 | SIMPLE      | t1    | index | NULL          | PRIMARY | 4       | NULL | 400002 |       |
+------+-------------+-------+-------+---------------+---------+---------+------+--------+-------+
1 row in set (0.00 sec)
When running the statements above (without EXPLAIN), the execution time is 0.01 secs for LIMIT 0 and 0.6 secs for LIMIT 400000.
Since MariaDB doesn't support LIMIT inside an IN (...) subquery, you could split your SQL into two statements:
The first statement retrieves the ids (and only needs to read the index), and the second statement uses the ids returned by the first:
MariaDB [test]> select a from t1 order by a limit 400000, 2;
+--------+
| a      |
+--------+
| 595312 |
| 595313 |
+--------+
2 rows in set (0.08 sec)
MariaDB [test]> select a,b from t1 where a in (595312,595313);
+--------+------+
| a      | b    |
+--------+------+
| 595312 | foo  |
| 595313 | foo  |
+--------+------+
2 rows in set (0.00 sec)
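A related sketch, not part of the answer above: since LIMIT is allowed in a FROM-clause (derived table) subquery even though it is rejected inside IN (...), the two steps can also be combined into a single "deferred join", here written against the files table from the question (assuming file_id is its primary key and the idx index above exists):
SELECT f.file_id, f.file_category
FROM files AS f
JOIN (
    SELECT file_id                     -- index-only scan finds the ids for this page
    FROM files
    ORDER BY file_edit_date DESC
    LIMIT 290580, 30
) AS page USING (file_id)
ORDER BY f.file_edit_date DESC;        -- keep the page in display order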
Caution: I am about to use some strong language. Computers are big and fast, and they can handle bigger stuff than they could even a decade ago. But, as you are finding out, there are limits. I'm going to point out multiple limits that you have threatened; I will try to explain why the limits may be a problem.
Settings
query_cache_size = 1G
is terrible. Whenever a table is written to, the Query Cache (QC) scans the whole 1GB looking for any references to that table in order to purge those entries from the QC. Decrease that to 50M. This, alone, will speed up the entire system.
sort_buffer_size = 1G
tmp_table_size=2G
max_heap_table_size=2G
are bad for a different reason. If you have multiple connections performing complex queries, lots of RAM could be allocated for each, thereby chewing up RAM, leading to swapping, and possibly crashing. Don't set them higher than about 1% of RAM.
In general, do not blindly change values in my.cnf. The most important setting is innodb_buffer_pool_size, which should be bigger than your dataset, but no bigger than 70% of available RAM.
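Put together, a more conservative my.cnf for this 16GB box might look roughly like the sketch below; the exact numbers are illustrative, not a tuned configuration:
query_cache_type = 1
query_cache_size = 50M           # small cache; 1G makes every write scan-and-purge expensive
sort_buffer_size = 160M          # roughly 1% of 16GB, allocated per connection as needed
tmp_table_size = 160M
max_heap_table_size = 160M
innodb_buffer_pool_size = 2G     # comfortably larger than the ~200MB dataset, far below 70% of RAM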
load all records
Ouch! The cost of shoveling all that data from MySQL to PHP is non-trivial. Once it gets to PHP, it will be stored in structures that are not designed for huge amounts of data -- 400030 (or 600000) rows might take 1GB inside PHP; this would probably blow out its "memory_limit", leading to PHP crashing. (OK, just dying with an error message.) It is possible to raise that limit, but then PHP might push MySQL out of memory, leading to swapping, or maybe running out of swap space. What a mess!
OFFSET
As for the large OFFSET, why? Do you have a user paging through the data? And he is almost to page 10,000? Are there cobwebs covering him?
OFFSET must read and step over 290580 rows in your example. That is costly.
For a way to paginate without that overhead, see http://mysql.rjweb.org/doc.php/pagination .
If you have a program 'crawling' through all 600K rows, 30 at a time, then the tip about "remember where you left off" in that link will work very nicely for such use. It does not "slow down".
If you are doing something different, what is it?
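For reference, a minimal sketch of the "remember where you left off" idea applied to the files table (assuming the application keeps the file_edit_date and file_id of the last row it displayed in @last_date / @last_id, with file_id breaking ties):
SELECT file_id, file_category
FROM files
WHERE file_edit_date < @last_date
   OR (file_edit_date = @last_date AND file_id < @last_id)   -- tie-breaker for equal dates
ORDER BY file_edit_date DESC, file_id DESC
LIMIT 30;                                                    -- next page; no OFFSET rows to step over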
Pagination and gaps
Not a problem. See also: http://mysql.rjweb.org/doc.php/deletebig#deleting_in_chunks which is more aimed at walking through an entire table. It focuses on an efficient way to find the 30th row going forward. (This is not necessarily any better than remembering the last id.)
That link is aimed at DELETEing, but it can easily be revised to SELECT.
Some math for scanning a 600K-row table 30 rows at a time:
My links: 600K rows are touched. Or twice that, if you peek forward with LIMIT 30,1 as suggested in the second link.
OFFSET ..., 30 must touch (600K/30)*600K/2 rows -- about 6 billion rows.
(Corollary: changing 30 to 100 would speed up your query, though it would still be painfully slow. It would not speed up my approach, but it is already quite fast.)

How to calculate amount of work performed by the page cleaner thread each second?

I am trying to tune the InnoDB buffer pool flushing parameters.
The MySQL 5.7 manual says:
innodb_lru_scan_depth * innodb_buffer_pool_instances = amount of work performed by the page cleaner thread each second
My question is : How can I calculate the amount of work performed by the page cleaner thread each second?
Run this SQL command once every second:
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_flushed'
The difference between one second's value and the next is the number of dirty pages the page cleaner requested to flush to disk in that second.
Example:
mysql> SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_flushed';
+----------------------------------+-----------+
| Variable_name                    | Value     |
+----------------------------------+-----------+
| Innodb_buffer_pool_pages_flushed | 496786650 |
+----------------------------------+-----------+
...wait a moment...
mysql> SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_flushed';
+----------------------------------+-----------+
| Variable_name                    | Value     |
+----------------------------------+-----------+
| Innodb_buffer_pool_pages_flushed | 496787206 |
+----------------------------------+-----------+
So during the time I waited, the page cleaner flushed 556 pages (496787206 - 496786650).
The upper limit on this work is a complex calculation involving several InnoDB configuration options. Read my answer to "How to solve mysql warning: 'InnoDB: page_cleaner: 1000ms intended loop took XXX ms. The settings might not be optimal'?" for a description of how it works.
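For context, the per-second figure from the formula quoted in the question can be read straight from the server; comparing it with the measured rate (556 pages above) shows how close the page cleaner is to that configured amount of work (a quick sketch):
SELECT @@innodb_lru_scan_depth * @@innodb_buffer_pool_instances
       AS configured_pages_per_second;   -- "amount of work performed by the page cleaner thread each second"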

MySQL query failed, but didn't clean up the disk space

I was trying to run the following statement, hoping to create a join of two existing tables.
create table CRS_PAIR
select concat_ws(',', a.TESTING_ID, b.TRAINING_ID, a.TESTING_C) as k, concat_ws(',', a.YTG, b.YTG) as YTG
from CRS_TESTING a, CRS_TRAINING b
where a.TESTING_C=b.TRAINING_C;
Currently the sizes of these two tables are:
mysql> SELECT table_name, round(((data_length + index_length) / (1024*1024)),2) as "size in megs" FROM information_schema.tables WHERE table_schema = "crs";
+----------------+---------------+
| table_name     | size in megs  |
+----------------+---------------+
| CRS_TESTING    |         36.59 |
| CRS_TRAINING   |        202.92 |
+----------------+---------------+
After a little over a day, the query finished and I got the following result:
140330 2:53:50 [ERROR] /usr/sbin/mysqld: The table 'CRS_PAIR' is full
140330 2:53:54 InnoDB: ERROR: the age of the last checkpoint is 9434006,
InnoDB: which exceeds the log group capacity 9433498.
InnoDB: If you are using big BLOB or TEXT rows, you must set the
InnoDB: combined size of log files at least 10 times bigger than the
InnoDB: largest such row.
It turned out that /var/lib/mysql had grown to 246GB and the disk ran out of space. However, for some reason, the CRS_PAIR table does not show up in the mysql shell, not even when I get the sizes of all databases:
mysql> SELECT table_schema "Data Base Name", sum( data_length + index_length ) / (1024 * 1024) "Data Base Size in MB" FROM information_schema.TABLES GROUP BY table_schema ;
+--------------------+----------------------+
| Data Base Name     | Data Base Size in MB |
+--------------------+----------------------+
| crs                |            1426.4531 |
| information_schema |               0.0088 |
| mysql              |               0.6453 |
| performance_schema |               0.0000 |
+--------------------+----------------------+
4 rows in set (0.74 sec)
Here is the show tables command and its output:
mysql> show tables;
+----------------+
| Tables_in_crs  |
+----------------+
| CRS_TESTING    |
| CRS_TRAINING   |
some other tables
+----------------+
9 rows in set (0.00 sec)
CRS_PAIR is not there.
May I ask if anyone can help me figure out where this mysterious table went so that I can clean up my disk space?
If you don't have innodb_file_per_table set (or it is set to 0), then InnoDB puts all your InnoDB tables into the shared system tablespace file (usually /var/lib/mysql/ibdata1), expanding it as required to fit the written data. However, the engine never does any space reclamation, which means the ibdata1 file only ever grows; it never shrinks.
The only way to reduce the size of this file is to back up your data, shut down MySQL, delete the file, restart MySQL, and then reload your data.
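Before reloading, it may also be worth enabling innodb_file_per_table so future tables live in their own .ibd files, whose space goes back to the filesystem when a table is dropped or rebuilt. A my.cnf sketch (it only affects tables created after the setting is in place):
innodb_file_per_table = 1    # each new InnoDB table gets its own .ibd file;
                             # DROP TABLE or OPTIMIZE TABLE then returns that space to the OS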

MySQL Slow Log issues - long_query_time does not go into effect

I have followed a few tutorials in tracking down slow queries through the slow query log.
I have tried changing long_query_time to a value of 1 for testing purposes, but whatever I do, a query only makes it into the log when the default time of 10 seconds is reached.
I tried:
set @@GLOBAL.long_query_time = 1;
set global long_query_time = 1;
When using either of these commands:
show variables like '%long%';
show global variables like '%long%';
I get the result that the variable was changed.
I have the exact same query running, just adding more LEFT JOIN entries to make it run longer. Whenever the query runs 10 seconds or longer, it is logged, but it does NOT show up in the log when it runs less than that, even though all my variables appear to say they are changed.
I am logged into MySQL as root as I make these changes.
I restarted Apache and MySQL, still no dice.
My version information is:
Server version: 5.1.63-log SUSE MySQL RPM
When I query both the session and the global variables (I tried both), I get this:
mysql> show variables like '%long%';
+--------------------+----------+
| Variable_name      | Value    |
+--------------------+----------+
| long_query_time    | 1.000000 |
| max_long_data_size | 1048576  |
+--------------------+----------+
2 rows in set (0.00 sec)
mysql> show global variables like '%long%';
+--------------------+----------+
| Variable_name      | Value    |
+--------------------+----------+
| long_query_time    | 1.000000 |
| max_long_data_size | 1048576  |
+--------------------+----------+
2 rows in set (0.00 sec)
The general logging feature is obviously on, and it is redirected to TABLE or I wouldn't get an entry in the log at all.
The setting log_queries_not_using_indexes, if turned on, starts logging EVERY query, even those that take less than 1 second to execute.
What am I missing?
Thanks!
The configuration below makes MySQL log queries whose execution time is more than half a second:
slow_query_log = 1
long_query_time = 0.5
log-slow-queries = /var/log/mysql/log-slow-queries.log
log_queries_not_using_indexes = 0
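The same can be applied at runtime without a restart. Note that a change to the GLOBAL long_query_time is only picked up by connections opened after the change, so reconnect the client you test with before re-running the query. A quick sketch:
SET GLOBAL slow_query_log = 1;
SET GLOBAL long_query_time = 0.5;              -- new sessions inherit this value when they connect
SET GLOBAL log_queries_not_using_indexes = 0;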

MySQL table lock instead of row locks

MySQL (5.5) InnoDB is, in this particular case, taking a table lock rather than row locks.
This is causing other INSERT queries against the table to fail. Also, this is part of a larger transaction.
INSERT INTO x (x1, x2)
SELECT y1, y2 FROM y
WHERE 'big sql case based conditions';
Now, the SELECT query selects only part of the table (based on which user) and not the full table.
But MySQL InnoDB is still taking table locks.
Is there any way I can avoid this? Any help will be appreciated.
I think you are running your transaction in REPEATABLE READ mode. Could you check this?
mysql> show session variables like '%isol%';
+---------------+-----------------+
| Variable_name | Value           |
+---------------+-----------------+
| tx_isolation  | REPEATABLE-READ |
+---------------+-----------------+
If so, change it to 'READ COMMITTED' like this:
mysql> set session transaction isolation level read committed;
Query OK, 0 rows affected (0.00 sec)
mysql> show session variables like '%isol%';
+---------------+----------------+
| Variable_name | Value          |
+---------------+----------------+
| tx_isolation  | READ-COMMITTED |
+---------------+----------------+
Then have client A start the INSERT INTO ... SELECT, and insert a row from client B; I think client B's INSERT will now succeed.
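If READ COMMITTED works for your workload, it can also be made the server default in my.cnf (a sketch; on 5.5 with the binary log enabled, READ COMMITTED additionally requires row-based or mixed binary logging for statements such as INSERT ... SELECT):
transaction-isolation = READ-COMMITTED
binlog_format         = ROW     # ROW (or MIXED) is required for READ COMMITTED when binary logging is on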