tmp table not growing when running ALTER to add new column - mysql

I have about a 25 GB table that I have to add a column to.
I run a script to do this, and while it executes I can see the temporary table in the data directory, but it stays stuck at about 480K. I can see in the processlist that the ALTER is running and there are no issues.
If I kill the script after a long period of activity, the query stays in the "killed" state in the processlist and the tmp file starts growing until the query is actually killed (i.e., it goes from the "killed" state in the processlist to disappearing from the processlist altogether).
When I run the following (before killing the query):
select * from global_temporary_tables\G
it doesn't show any rows being added either.
Is there anything else I can do?

Firstly, what your "ps" output may show has nothing to do with anything. Don't rely on what "ps" says: it can include stale data.
If the process has been killed (SIGKILL, not SIGTERM), I guarantee you it is no longer delivering any output to anywhere. If it's been SIGTERMed, it depends what signal handlers you've attached. I'm going to hazard a wild guess that you haven't registered any signal handlers.
Most production DBMSs set up storage in chunks. X amount of space is obtained, which may contain "slack" room that enables rows and/or columns to be added (I do NOT say that the two mechanisms are identical). Just because something didn't grow in a manner you could perceive doesn't mean the changes weren't made. Why not check the data dictionary and interrogate the current structure of the table?
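For example, something along these lines (the schema and table names here are placeholders) will tell you whether the new column is already visible:
SELECT COLUMN_NAME, COLUMN_TYPE
  FROM INFORMATION_SCHEMA.COLUMNS
 WHERE TABLE_SCHEMA = 'mydb' AND TABLE_NAME = 'mytable';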
Did you COMMIT your changes? In some DBMS, DDL operations are regarded as committable/rollbackable (yecch) events.

Related

MySQL "pileup" when importing rows

I have the following cron process running every hour to update global game stats:
Create temporary table
For each statistic, insert rows into the temporary table (stat key, user, score, rank)
Truncate main stats table
Copy data from temporary table to main table
The last step causes massive backlog in queries. Looking at SHOW PROCESSLIST I see a bunch of updating-status queries that are stuck until the copy completes (which may take up to a minute).
However, I did notice that it's not consecutive query IDs piling up; many queries complete just fine. So it almost seems like a "thread" gets stuck or something. Also of note: the stuck updates have nothing in common with the ongoing copy (different tables, etc.).
So:
Can I have cron connect to MySQL on a dedicated "thread" such that its disk activity (or whatever it is) doesn't lock other updates, OR
Am I misinterpreting what's going on, and if so how can I find out what the actual case is?
Let me know if you need any more info.
MySQL threads are not perfectly named. If you're a Java dev, for example, you might make some untrue assumptions about MySQL threads based on your Java knowledge.
For some reason that's hard to diagnose from a distance, your copy step is blocking some queries from completing. If you're curious about which ones, try running
SHOW FULL PROCESSLIST
and try to make sense of the result.
In the meantime, you might consider a slightly different approach to refreshing these hourly stats.
create a new, non-temporary table, calling it something like stats_11 for the 11am update. If a table with that name already exists, drop the old one first.
populate that table as needed.
add the indexes it needs. Sometimes populating the table is faster if the indexes aren't in place while you're doing it.
create or replace view stats as select * from stats_11
Next hour, do the same with stats_12. The idea is to have your stats view pointing to a valid stats table almost always.
This should reduce your exposure time to the stats-table building operation.
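A rough sketch of one hourly cycle, assuming hypothetical column names and a hypothetical source table game_results:
DROP TABLE IF EXISTS stats_11;
CREATE TABLE stats_11 (
    stat_key VARCHAR(64),
    user_id  INT,
    score    INT,
    `rank`   INT
);
INSERT INTO stats_11 (stat_key, user_id, score, `rank`)
  SELECT 'top_score', user_id, score, 0 FROM game_results;      -- one INSERT per statistic
ALTER TABLE stats_11 ADD INDEX idx_key_user (stat_key, user_id); -- indexes after the load
-- (on the very first run the original stats table must be dropped or renamed
--  so the view can take over its name)
CREATE OR REPLACE VIEW stats AS SELECT * FROM stats_11;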
If the task is to completely rebuild the table, this is the best:
CREATE TABLE new_stats LIKE stats;
... fill up new_stats by whatever means ...
RENAME TABLE stats TO old_stats, new_stats TO stats;
DROP TABLE old_stats;
There is zero interference because the stats table is always available and always has a complete set of rows. (OK, RENAME does take a minuscule amount of time.)
No VIEWs, no TEMPORARY table, no copying the data over, no need for 24 tables.
You could consider doing the task "continually", rather than hourly. This becomes especially beneficial if the table gets so big that the hourly cron job takes more than one hour!

MySQL (MariaDB): How to figure out why PROCESSLIST MEMORY_USAGE is continuously growing unexpectedly

I have a database with a few tables in it. One of the tables has ~3000 rows, each with ~20 columns. Every 30 seconds one of the rows in the table is UPDATEd with new information. I'm having a problem where sometimes (infrequently) the memory used by the process that is updating the rows starts increasing "indefinitely" (I stop the process before it stops growing, but I'm sure it would stop at some upper limit). The database is not growing during this time. Only existing rows are being updated.
I'm looking for ideas on what could cause the memory usage to start going up so that I can prevent it from happening. Since most of the time the memory usage stays the same despite running the same update process I'm not sure what condition is triggering the failure state (growing memory usage) so that I can recreate the failure on demand.
The table is using the Memory engine and I've seen the same failure using the InnoDB engine.
The MEMORY_USAGE I'm looking at is in the table returned by the query below. Are there other MySQL variables I can look at to get a better idea of what specifically is using up the memory?
SELECT * FROM INFORMATION_SCHEMA.PROCESSLIST
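For instance, recent MariaDB versions expose per-connection memory in the MEMORY_USED and MAX_MEMORY_USED columns of that table, plus a server-wide Memory_used status variable (exact names depend on the MariaDB version):
SELECT id, user, command, time, memory_used, max_memory_used
  FROM information_schema.processlist
 ORDER BY memory_used DESC;
SHOW GLOBAL STATUS LIKE 'Memory_used';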
I found my bug. To anyone else who ends up here: remember to call mysql_free_result() (I had a case where I wasn't calling it).

Too many unknown writes in MySQL

I have a MySQL database in my production environment which had about 430 million rows, of which 190 million rows were not of any use, so I started deleting these rows range by range at night, as it would have affected my app's performance during the daytime.
Now when I look at my monitoring app, I am seeing 100% IO, most of it writes (12-30 MB/s, 400-500 writes/sec).
But when I check the processlist I don't find any INSERT or UPDATE query, or any rollback.
What could the possible issue be, or how can I find any hidden query that may be writing to MySQL?
(In iotop, I found that write operations are being done by mysqld only.)
One more thing: I can see writes at 80 MB/s in iotop, but when I check directory sizes under /, I don't see any directory growing.
Back away slowly... and wait.
InnoDB doesn't change the actual data in the tablespace files with each DML query.
It writes the changes to memory, of course, and then the redo log, at which point they are effectively "live" and safely persisted to disk... but they are not yet applied to the actual data (tablespace) files. InnoDB then syncs the changes to the data files in the background but in the mean time, other queries use a combination of the tablespace and log contents to determine what the "official" table data currently contains. This is, of course, an oversimplification, but MVCC necessarily means the physical data is a superset, though not necessarily a proper superset, of the logical data.
That's very likely to be the explanation for what you are seeing now.
It makes sense that free/used disk space isn't changing, because finalizing the deletion of those rows will only really be marking the space inside the tablespace files as unused. They shouldn't grow or shrink.
Resist the temptation to try to "fix" it and whatever you do, don't restart the server... because the best possible outcome is that it will pick up where it left off because it still has work to do.
SHOW ENGINE INNODB STATUS takes some practice to interpret but will likely be another key to the puzzle.
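If you want something to watch while the background flushing catches up, these standard counters (plus the "History list length" figure in SHOW ENGINE INNODB STATUS) should trend in the right direction:
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_dirty';
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_flushed';
SHOW ENGINE INNODB STATUS\G   -- look for "History list length" under TRANSACTIONS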
Is the delete operation still ongoing? DELETE can be extremely slow and generate a lot of writes. It is often better to create a new identical table, copy the rows you want to KEEP over to it, and then swap it with the production table, instead of deleting rows in the production table directly.
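A sketch of that copy-and-swap approach (mytable and the keep-condition are placeholders):
CREATE TABLE mytable_new LIKE mytable;
INSERT INTO mytable_new
  SELECT * FROM mytable WHERE created_at >= '2015-01-01';  -- placeholder condition for the rows to KEEP
RENAME TABLE mytable TO mytable_old, mytable_new TO mytable;
DROP TABLE mytable_old;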
If the DELETE has already finished and you suspect that there are other queries running, you can enable query log for a few seconds and see which queries are executed:
TRUNCATE TABLE mysql.general_log;
SET GLOBAL log_output = 'TABLE';
SET GLOBAL general_log = 'ON';
SELECT SLEEP(10);
SET GLOBAL general_log = 'OFF';
Then SELECT from mysql.general_log to see which queries executed during the 10-second sleep.
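For example:
SELECT event_time, command_type, argument
  FROM mysql.general_log
 ORDER BY event_time;
-- and, if you changed it, put log_output back to its previous value, e.g.:
SET GLOBAL log_output = 'FILE';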

MySQL UPDATES get progressively slower

I have an application using a MySQL database hosted on one machine and 6 clients running on other machines that read and write to it over a local network.
I have one main work table which contains about 120,000 items in rows to be worked on. Each client grabs 40 unallocated work items from the table (marking them as allocated), does the work and then writes back the results to the same work table. This sequence continues until there is no more work to do.
The above is a picture that shows the amount of time taken to write back each block of 40 results to the table from one of the clients using UPDATE queries. You can see that the duration is fairly small for most of the time but suddenly the duration goes up to 300 sec and stays there until all work completes. This rapid increase in time to execute the queries towards the end is what I need help with.
The clients are not heavily loaded. The server is a little loaded but it has 16GB of RAM, 8 cores and is doing nothing other than hosting this db.
Here is the relevant SQL code.
Table creation:
CREATE TABLE work (
    item_id MEDIUMINT,
    item VARCHAR(255) CHARACTER SET utf8,
    allocated_node VARCHAR(50),
    allocated_time DATETIME,
    result TEXT);
/* Then insert 120,000 items, which is quite fast. No problem at this point. */
/* item_id and item are supplied; the remaining three columns start out NULL. */
INSERT INTO work VALUES (%s, %s, NULL, NULL, NULL);
Client allocating 40 items to work on:
UPDATE work SET allocated_node = %s, allocated_time=NOW()
WHERE allocated_node IS NULL LIMIT 40;
SELECT item FROM work WHERE allocated_node = %s AND result IS NULL;
Update the row with the completed result (this is the part that gets much slower after a few hours of running):
/* The chart above shows the time to execute 40 of these for each write back of results */
UPDATE work SET result = %s WHERE item = %s;
I'm using MySQL on Ubuntu 14.04, with all the standard settings.
The final table is about 160MB, and there are no indexes.
I don't see anything wrong with my queries and they work fine apart from the whole thing taking twice as long as it should overall.
Can someone with experience in these matters suggest any configuration settings I should change in MySQL to fix this performance issue, or point out any issues with what I'm doing that might explain the timing in the chart?
Thanks.
Without an index the complete table is scanned. As the item id gets larger, a greater portion of the table has to be scanned to find the row to be updated.
I would try an index, perhaps even a primary key, on item_id.
Still, the increase in duration seems too high for such a machine and a relatively small database.
Given that more details would be required for a proper diagnosis (see below), I see two potential causes of the performance decrease here.
One is that you're running into a Schlemiel the Painter's Problem, which you could ameliorate with
CREATE INDEX work_ndx ON work(allocated_node, item);
but it looks unlikely with so low a cardinality. MySQL shouldn't take so long to locate unallocated nodes.
A more likely explanation could be that you're running into a locking conflict of some kind between clients. To be sure, during those 300 seconds in which the system is stalled, run
SHOW FULL PROCESSLIST
from an administrator connection to MySQL. See what it has to say, and possibly use it to update your question. Also, post the result of
SHOW CREATE TABLE
against the tables you're using.
You should be doing something like this:
START TRANSACTION;
allocate up to 40 nodes using SELECT...FOR UPDATE;
COMMIT WORK;
-- The two transactions serve to ensure that the node selection can
-- never lock more than those 40 nodes. I'm not too sure of that LIMIT
-- being used in the UPDATE.
START TRANSACTION;
select those 40 nodes with SELECT...FOR UPDATE;
<long work involving those 40 nodes and nothing else>
COMMIT WORK;
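As a concrete sketch of the allocation step (the table and column names come from the question; 'node-1' is a placeholder, and repeating the LIMIT inside the UPDATE mirrors the original query rather than being an endorsement):
START TRANSACTION;
SELECT item FROM work
 WHERE allocated_node IS NULL
 LIMIT 40
 FOR UPDATE;                        -- row locks only on the rows being claimed
UPDATE work
   SET allocated_node = 'node-1',   -- placeholder for this client's identifier
       allocated_time = NOW()
 WHERE allocated_node IS NULL
 LIMIT 40;
COMMIT;
-- Work on the claimed items outside any long-running transaction, then write
-- each result back in its own short transaction:
UPDATE work SET result = 'done' WHERE item = 'some-item';   -- placeholder values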
If you use a single transaction and table level locking (even implicitly), it might happen that one client locks all others out. In theory this ought to happen only with MyISAM tables (that only have table-level locking), but I've seen threads stalled for ages with InnoDB tables as well.
Your 'external locking' technique sounds fine.
INDEX(allocated_node) will help significantly for the first UPDATE.
INDEX(item) will help significantly for the final UPDATE.
(A compound index with the two columns will help only one of the updates, not both.)
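For example (the index names here are arbitrary):
ALTER TABLE work
  ADD INDEX idx_allocated_node (allocated_node),
  ADD INDEX idx_item (item);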
The reason for the sudden increase: You are continually filling in big TEXT fields, making the table size grow. At some point the table is so big that it cannot be cached in RAM. So, it goes from being cached to being a full table scan.
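A quick way to sanity-check that theory (the table name comes from the question; the rest is standard INFORMATION_SCHEMA):
SELECT @@innodb_buffer_pool_size / 1024 / 1024 AS buffer_pool_mb;
SELECT (data_length + index_length) / 1024 / 1024 AS table_mb
  FROM information_schema.tables
 WHERE table_schema = DATABASE() AND table_name = 'work';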
...; SELECT ... FOR UPDATE; COMMIT; -- The FOR UPDATE is useless since the COMMIT happens immediately.
You could play with the "40", though I can't think why a larger or smaller number would help.

MySQL query slowing down until restart

I have a service that sits on top of a MySQL 5.5 database (INNODB). The service has a background job that is supposed to run every week or so. On a high level the background job does the following:
Do some initial DB read and write in one transaction
Execute UMQ (described below) with a set of parameters in one transaction.
If no records are returned we are done!
Process the result from UMQ (this is a bit heavy, so it is done outside of any DB transaction)
Write the outcome of the previous step to DB in one transaction (this writes to tables queried by UMQ and ensures that the same records are not found again by UMQ).
Go to step 2.
UMQ - Ugly Monster Query: This is a nasty database query that joins a bunch of tables, has conditions on columns in several of these tables, and includes a NOT EXISTS subquery with some more joins and conditions. UMQ includes an ORDER BY and also has a LIMIT 1000. Even though the query is bad I have done what I can here - there are indexes on all columns filtered on and the joins are all over foreign key relations.
I do expect UMQ to be heavy and take some time, which is why it's executed in a background job. However, what I'm seeing is rapidly degrading performance until it eventually causes a timeout in my service (maybe 50 times slower after 10 iterations).
First I thought that it was because the data queried by UMQ changes (see step 4 above), but that wasn't it, because if I took the last query (the one that caused the timeout) from the slow query log and executed it myself directly, I got the same behavior - but only until I restarted the MySQL service. After the restart, the exact same query on the exact same data that took >30 seconds before the restart now took <0.5 seconds. I can reproduce this behavior every time by restoring the database to its initial state and restarting the process.
Also, using the trick described in this question I could see that the query scans around 60K rows after restart as opposed to 18M rows before. EXPLAIN tells me that around 10K rows should be scanned and the result of EXPLAIN is always the same. No other processes are accessing the database at the same time and the lock_time in the slow query log is always 0. SHOW ENGINE INNODB STATUS before and after restart gives me no hints.
So finally the question: Does anybody have any clue of why I'm seeing this behavior? And how can I analyze this further?
I have the feeling that I need to configure MySQL differently in some way but I have searched and tested like crazy without coming up with anything that makes a difference.
Turns out that the behavior I saw was the result of how the MySQL optimizer uses InnoDB statistics to decide on an execution plan. This article put me on the right track (even though it does not exactly discuss my problem). The most important thing I learned from this is that MySQL calculates statistics on startup and then once in a while. These statistics are then used to optimize queries.
The way I had set up the test data, the table T where most writes are done in step 4 started out empty. After each iteration T would contain more and more records, but the InnoDB statistics had not yet been updated to reflect this. Because of this, the MySQL optimizer always chose an execution plan for UMQ (which includes a JOIN with T) that worked well when T was empty but worse and worse the more records T contained.
To verify this I added an ANALYZE TABLE T; before every execution of UMQ and the rapid degradation disappeared. No lightning performance but acceptable. I also saw that leaving the database for half an hour or so (maybe a bit shorter but at least more than a couple of minutes) would allow the InnoDB statistics to refresh automatically.
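In practice that was just one extra statement in front of each UMQ execution (T stands for the table written to in step 4):
ANALYZE TABLE T;   -- refresh InnoDB index statistics so the optimizer re-plans against the current data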
In a real scenario the relative difference in index cardinality for the tables involved in UMQ will look quite different and will not change as rapidly so I have decided that I don't really need to do anything about it.
Thank you very much for the analysis and answer. I had been chasing this issue for several days during CI on MariaDB 10.1 and Bacula server 9.4 (Debian buster).
The situation was that after a fresh server installation during a CI cycle, the first two tests (backup and restore) ran smoothly on the unrestarted MariaDB server, and only the third test showed that one particular UMQ took about 20 minutes (building the directory tree during the restore process from a table with about 30k rows).
Unless the MariaDB server was restarted or the table was analyzed, the problem would not go away. ANALYZE TABLE or a restart changed the cardinality of the fields and the internal query processing exactly as stated in the linked article.