I have a database table that is giving me headaches: errors when inserting lots of data. Let me break down exactly what happens, and I'm hoping someone will have some insight into how I can figure this out.
Basically I have a table with 11+ million records in it, and it's growing every day. We track how many times a user views a video and their progress in that video. You can see the structure below. Our setup is a master DB with two slaves attached to it. Nightly we run a cron script that compiles statistical data out of this table into a couple of other tables we use just for reporting. These cron scripts only do SELECT statements on the slave and do their inserts into our statistical tables on the master (so it'll propagate down). Like clockwork, every time we run this script it locks up our production table. I thought moving the SELECT to a slave would fix the issue, and since we aren't even writing into the main table with the cron, but rather into other tables, I'm now perplexed about what could possibly cause the locking.
It's almost as if every large read on the main table (master or slave) locks up the master. As soon as the cron is complete, the table goes back to normal performance.
My question is really several questions about InnoDB. I've wondered whether indexing could cause this issue, but maybe it's other InnoDB settings that I'm not fully understanding. As you'd imagine, I want to keep the master from locking up. I don't really care if the slave is pegged out during this script run, as long as it won't affect my master DB. Is this something that can happen with slave/master relationships in MySQL?
For reference, the tables the compiled information goes into are stats_daily and stats_grouped.
The biggest issue here, to restate a little, is that I don't understand what can cause locking like this. Taking the reads off the master and only inserting into other tables shouldn't, it seems, do anything to the original table on the master. Yet I can watch the errors start streaming in about 3 minutes after the script starts, and they stop immediately when the script ends.
The table I'm working with is below.
CREATE TABLE IF NOT EXISTS `stats` (
`ID` int(10) unsigned NOT NULL AUTO_INCREMENT,
`VID` int(10) unsigned NOT NULL DEFAULT '0',
`UID` int(10) NOT NULL DEFAULT '0',
`Position` smallint(10) unsigned NOT NULL DEFAULT '0',
`Progress` decimal(3,2) NOT NULL DEFAULT '0.00',
`ViewCount` int(10) unsigned NOT NULL DEFAULT '0',
`DateFirstView` int(10) unsigned NOT NULL DEFAULT '0', -- Unix timestamp
`DateLastView` int(10) unsigned NOT NULL DEFAULT '0', -- Unix timestamp
PRIMARY KEY (`ID`),
KEY `VID` (`VID`,`UID`),
KEY `UID` (`UID`),
KEY `DateLastView` (`DateLastView`),
KEY `ViewCount` (`ViewCount`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=15004624 ;
Does anyone have any thoughts or ideas on this?
UPDATE:
The errors I get from the master DB
MysqlError: Lock wait timeout exceeded; try restarting transaction
Uncaught exception 'Exception' with message 'invalid query UPDATE stats SET VID = '13156', UID = '73859', Position = '0', Progress = '0.8', ViewCount = '1', DateFirstView = '1375789950', DateLastView = '1375790530' WHERE ID = 14752456
The update query fails because of the locking. The query is actually valid. I'll get 100s of these and afterwards I can randomly copy/paste these queries and they will work.
UPDATE 2
Queries and Explains from Cron Script
Query Ran on the Slave (leaving php variables in curly brackets for reference):
SELECT
VID,
COUNT(ID) as ViewCount,
DATE_FORMAT(FROM_UNIXTIME(DateLastView), '%Y-%m-%d') AS YearMonthDay,
{$today} as DateModified
FROM stats
WHERE DateLastView >= {$start_date} AND DateLastView <= {$end_date}
GROUP BY YearMonthDay, VID
EXPLAIN of the SELECT Stat
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE stats range DateLastView DateLastView 4 NULL 25242 Using where; Using temporary; Using filesort
That result set is looped and inserted into the compiled table. Unfortunately I don't have support for batched inserts with this (I tried) so I have to loop through these one at a time instead of sending a batch of 100 or 500 to the server at a time. This is inserted into the master DB.
foreach ($results as $result)
{
$query = "INSERT INTO stats_daily (VID, ViewCount, YearMonthDay, DateModified) VALUES ({$result->VID}, {$result->ViewCount}, '{$result->YearMonthDay}', {$today})";
DoQuery($query);
}
The GROUP BY is the culprit. MySQL uses a temporary table (and a filesort, as the EXPLAIN shows) because the grouping is on YearMonthDay, a derived expression that no index can satisfy, which is very inefficient on a table this size.
I ran into similar problems, but no clear solution. You could consider splitting your stats table into two tables, a 'daily' and a 'history' table. Run your query on the 'daily' table which only contains entries from the latest 24 hours or whatever your interval is, then clean up the table.
To get the info into your permanent 'history' table, either write your stats into both tables from code, or copy them over from daily into history before cleanup.
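A minimal sketch of that split, assuming the daily table mirrors the structure of stats and the interval is one day (the table names here are illustrative, not from the original post):

-- 'stats_recent' holds only the current interval's rows; 'stats_history'
-- is the permanent archive. Both mirror the structure of `stats`.
CREATE TABLE stats_recent  LIKE stats;
CREATE TABLE stats_history LIKE stats;

-- The nightly cron then runs its GROUP BY against the small table only:
--   SELECT ... FROM stats_recent WHERE DateLastView >= ... GROUP BY ...

-- After the stats are compiled, archive and clean up:
INSERT INTO stats_history SELECT * FROM stats_recent;
TRUNCATE TABLE stats_recent;

The point is that the expensive GROUP BY only ever touches one interval's worth of rows, so it never scans the multi-million-row archive.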
Related
I have a large table called "queue". It has 12 million records right now.
CREATE TABLE `queue` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`userid` varchar(64) DEFAULT NULL,
`action` varchar(32) DEFAULT NULL,
`target` varchar(64) DEFAULT NULL,
`name` varchar(64) DEFAULT NULL,
`state` int(11) DEFAULT '0',
`timestamp` int(11) DEFAULT '0',
`errors` int(11) DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `idx_unique` (`userid`,`action`,`target`),
KEY `idx_userid` (`userid`),
KEY `idx_state` (`state`)
) ENGINE=InnoDB;
Multiple PHP workers (150) use this table simultaneously.
They select a record, perform a network request using the selected data and then delete the record.
I get mixed execution times from the select and delete queries. Is the delete command locking the table?
What would be the best approach for this scenario?
SELECT record + NETWORK request + DELETE the record
SELECT record + NETWORK request + MARK record as completed + DELETE completed records using a cron from time to time (I don't want an even bigger table).
Note: The queue gets new records every minute but the INSERT query is not the issue here.
Any help is appreciated.
"Don't queue it, just do it". That is, if the tasks are rather fast, it is better to simply perform the action and not queue it. Databases don't make good queuing mechanisms.
DELETE does not lock an InnoDB table. However, you can write a DELETE that behaves that badly. Let's see your actual SQL so we can work on improving it.
12M records? That's a huge backlog; what's up?
Shrink the datatypes so that the table is not gigabytes:
action is only a small set of possible values? Normalize it down to a 1-byte ENUM or TINYINT UNSIGNED.
Ditto for state -- surely it does not need a 4-byte code?
There is no need for INDEX(userid) since there is already an index (UNIQUE) starting with userid.
If state has only a few values, the index won't be used. Let's see your enqueue and dequeue queries so we can discuss how to either get rid of that index or make it 'composite' (and useful).
What's the current value of MAX(id)? Is it threatening to exceed your current limit of about 4 billion for INT UNSIGNED?
How does PHP use the queue? Does it hang onto an item via an InnoDB transaction? That defeats any parallelism! Or does it change state? Show us the code; perhaps the lock and unlock can be made less invasive. It should be possible to run a single autocommitted UPDATE to grab a row and its id, then, later, do an autocommitted DELETE with very little impact.
I do not see a good index for grabbing a pending item. Again, let's see the code.
150 seems like a lot -- have you experimented with fewer? They may be stumbling over each other.
Is the Slowlog turned on (with a low value for long_query_time)? If so, I wonder what is the 'worst' query. In situations like this, the answer may be surprising.
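To illustrate the "single autocommitted UPDATE to grab a row" idea from the answer above, here is one common shape it can take; the worker_id column and the index on it are assumptions, not part of the original schema:

-- Hypothetical extra column so a worker can mark the row it claimed:
ALTER TABLE queue ADD COLUMN worker_id int unsigned NULL,
                  ADD INDEX idx_state_worker (state, worker_id);

-- 1. Claim one pending row (state = 0) in a single autocommitted statement:
UPDATE queue SET state = 1, worker_id = CONNECTION_ID()
 WHERE state = 0 AND worker_id IS NULL
 LIMIT 1;

-- 2. Fetch the claimed row; CONNECTION_ID() is unique per connection:
SELECT id, userid, action, target, name
  FROM queue
 WHERE state = 1 AND worker_id = CONNECTION_ID();

-- 3. After the network request succeeds, delete by primary key:
DELETE FROM queue WHERE id = 42;  -- the id fetched in step 2

Because each statement commits on its own, no worker holds row locks across the network request, which is where the long lock waits usually come from.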
I count page-view statistics in MySQL and sometimes get a deadlock.
How can I resolve this problem? Maybe I need to remove one of the keys?
But what will happen to read performance then? Or will it not be affected?
Table:
CREATE TABLE `pt_stat` (
`stat_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`post_id` int(11) unsigned NOT NULL,
`stat_name` varchar(50) NOT NULL,
`stat_value` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`stat_id`),
KEY `post_id` (`post_id`),
KEY `stat_name` (`stat_name`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8
Error: "Deadlock found when trying to get lock; try restarting transaction".
UPDATE pt_stat SET stat_value = stat_value + 1 WHERE post_id = "21500" AND stat_name = 'day_20170111';
When dealing with deadlocks, the first thing to do, always, is to see whether you have complex transactions deadlocking against each other. This is the normal case. I assume based on your question that the UPDATE statement is in its own transaction, and therefore there are no complex interdependencies among writes from a logical database perspective.
Certain multi-threaded databases (including MySQL) can have single statements deadlock against themselves due to write dependencies within threads on the same query. MySQL is not alone here btw. MS SQL Server has been known to have similar problems in some cases and workloads. The problem (as you seem to grasp) is that a thread updating an index can deadlock against another thread that updates an index (and remember, InnoDB tables are indexes with leaf-nodes containing the row data).
In these cases there are three things you can look at doing:
If the problem is not severe, then the best option is generally to retry the transaction in case of deadlock.
You could reduce the number of background threads but this will affect both read and write performance, or
You could try removing an index (key). However, keep in mind that unindexed scans on MySQL are slow.
I have a dating website from which I send daily alerts, and I log them in ALERTS_LOG.
CREATE TABLE `ALERTS_LOG` (
`RECEIVERID` mediumint(11) unsigned NOT NULL DEFAULT '0',
`MATCHID` mediumint(11) unsigned NOT NULL DEFAULT '0',
`DATE` smallint(6) NOT NULL DEFAULT '0',
KEY `RECEIVER` (`RECEIVERID`),
KEY `USER` (`USER`)
) ENGINE=MRG_MyISAM DEFAULT CHARSET=latin1 INSERT_METHOD=LAST UNION=(`ALERTS_LOG110`,`ALERTS_LOG111`,`ALERTS_LOG112`)
Insertion logic: I have created a MERGE table, and each sub-table (e.g. ALERTS_LOG110) stores 15 days of records. On every 1st and 16th of the month I create a new table and change the definition of the MERGE table.
Example : INSERT_METHOD=LAST UNION=(ALERTS_LOG111,ALERTS_LOG112,ALERTS_LOG113).
Advantage :
Deletion of old data is super fast.
Issues with this approach:
1. When I change the definition, the site often goes down: the indexes have to be loaded into cache again, and all SELECT queries get stuck.
2. Locking issues because of too many concurrent inserts and selects.
So, should I look at MongoDB to solve this issue?
No, not really. Re-engineering your application to use two different database types because of performance on this log table seems like a poor choice.
It's not really clear why you have so many entries being logged, but on the face of it you might want to look into partitioning in MySQL: partition your table by day or week and then drop old partitions. Deletion is still super fast, and there is no downtime because you aren't changing object names every day.
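A sketch of what that could look like, assuming the log is keyed by an actual DATE column rather than the smallint(6) in the current schema (column types and partition names here are illustrative):

CREATE TABLE ALERTS_LOG (
  RECEIVERID mediumint unsigned NOT NULL DEFAULT 0,
  MATCHID    mediumint unsigned NOT NULL DEFAULT 0,
  LOG_DATE   date NOT NULL,
  KEY (RECEIVERID)
) ENGINE=InnoDB
PARTITION BY RANGE (TO_DAYS(LOG_DATE)) (
  PARTITION p20170101 VALUES LESS THAN (TO_DAYS('2017-01-16')),
  PARTITION p20170116 VALUES LESS THAN (TO_DAYS('2017-02-01')),
  PARTITION pmax      VALUES LESS THAN MAXVALUE
);

-- Dropping an old interval is near-instant and needs no renames:
ALTER TABLE ALERTS_LOG DROP PARTITION p20170101;

New partitions for future intervals can be split out of pmax with ALTER TABLE ... REORGANIZE PARTITION, so the schedule of adding and dropping intervals never blocks readers the way redefining a MERGE table does.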
I have two databases:
Database A
CREATE TABLE `jobs` (
`job_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`in_b` tinyint(1) DEFAULT 0,
PRIMARY KEY (`url_id`),
KEY `idx_inb` (`in_b`)
)
Database B
CREATE TABLE `jobs_copy` (
`job_id` int(11) unsigned NOT NULL,
`created` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`url_id`)
)
Performance Issue
I am performing a query where I get a batch of jobs (100 jobs) from Database A and create a copy in Database B, then mark them as in_b with a:
UPDATE jobs SET in_b=1 WHERE job_id IN (1,2,3.....)
This worked fine. The rows were being transferred fairly quickly until I reached job_id values > 2,000,000. The select query to get a batch of jobs was still quick (4ms), but the update statement was much slower.
Is there a reason for this? I searched the MySQL docs and Stack Overflow to see if converting the IN to an OR query would improve things, but the general consensus was that an IN query will be faster in most cases.
If anyone has any insight as to why this is happening and how I can avoid this slowdown as I reach 10mil + rows, I would be extremely grateful.
Thanks in advance,
Ash
P.S. I am doing these update/select/insert operations through two RESTful services (one attached to each DB), but that has been constant from job_id 1 through 2 million and beyond.
Your UPDATE query is progressively slowing down because it's having to read many rows from your large table to find the rows it needs to process. It's probably doing a so-called full table scan because there is no suitable index.
Pro tip: when a query starts out running fast, but then gets slower and slower over time, it's a sign that optimization (possibly indexing) is required.
To optimize this query:
UPDATE jobs SET in_b=1 WHERE job_id IN (1,2,3.....)
Create an index on the job_id column, as follows.
CREATE INDEX job_id_index ON jobs(job_id)
This should allow your query to locate the records it needs to update very quickly using its IN (...) search filter.
I am using magento and having a lot of slowness on the site. There is very, very light load on the server. I have verified cpu, disk i/o, and memory is light- less than 30% of available at all times. APC caching is enabled- I am using new relic to monitor the server and the issue is very clearly insert/updates.
I have isolated the slowness to all insert and update statements. SELECT is fast. Very simple insert / updates into tables take 2-3 seconds whether run from my application or the command line mysql.
Example:
UPDATE `index_process` SET `status` = 'working', `started_at` = '2012-02-10 19:08:31' WHERE (process_id='8');
This table has 9 rows, a primary key, and 1 index on it.
The slowness occurs with all insert / updates. I have run mysqltuner and everything looks good. Also, changed innodb_flush_log_at_trx_commit to 2.
The activity on this server is very light- it's a dv box with 1 GB RAM. I have magento installs that run 100x better with 5x the load on a similar setup.
I started logging all queries over 2 seconds and it seems to be all inserts and full text searches.
Anyone have suggestions?
Here is table structure:
CREATE TABLE IF NOT EXISTS `index_process` (
`process_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`indexer_code` varchar(32) NOT NULL,
`status` enum('pending','working','require_reindex') NOT NULL DEFAULT 'pending',
`started_at` datetime DEFAULT NULL,
`ended_at` datetime DEFAULT NULL,
`mode` enum('real_time','manual') NOT NULL DEFAULT 'real_time',
PRIMARY KEY (`process_id`),
UNIQUE KEY `IDX_CODE` (`indexer_code`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=10 ;
First: in (process_id='8'), the value '8' is a char/varchar literal, not an int, so MySQL has to convert the value first.
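If that implicit conversion is part of the problem, the fix is simply to pass the id as a numeric literal (a sketch of the same query, untested against this particular setup):

-- process_id is INT UNSIGNED, so compare against a number, not a string:
UPDATE index_process
   SET status = 'working', started_at = '2012-02-10 19:08:31'
 WHERE process_id = 8;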
On my system, I had long times (greater than one second) to update users.last_active_time.
The reason was that I had a few long-running queries that JOINed against the users table. This blocked the table for reads: a deadlock caused by SELECT.
I rewrote the query from a JOIN to sub-queries and the problem was gone.
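A sketch of that kind of rewrite; the sessions table and its columns are hypothetical, just to show the shape of the change:

-- Before: the multi-table UPDATE holds locks on `users` for as long
-- as the joined scan takes.
UPDATE users u
  JOIN sessions s ON s.user_id = u.id
   SET u.last_active_time = s.last_seen;

-- After: compute the value per row in a sub-query, so each row lock
-- on `users` is held only briefly.
UPDATE users u
   SET u.last_active_time =
       (SELECT MAX(s.last_seen) FROM sessions s WHERE s.user_id = u.id)
 WHERE EXISTS (SELECT 1 FROM sessions s WHERE s.user_id = u.id);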