MySQL deletion efficiency

I have a table with a large amount of data. The data needs to be updated frequently: delete old data and add new data. I have two options:
1. Whenever there is a deletion event, I delete the entry immediately.
2. I mark the entries as deleted and use a cron job to delete them at off-peak times.
Is there any efficiency difference between the two options? Or is there a better solution?

Both DELETE and UPDATE can have triggers, which may affect performance (check whether that's your case).
Updating a row is usually faster than deleting it (due to index maintenance, etc.).
However, when deleting a single row per operation, the performance impact shouldn't be that big. If your measurements show that the database spends significant time deleting rows, then your mark-and-sweep approach might help. The key word here is measured: unless single deletes are significantly slower than updates, I wouldn't bother.

You should use low_priority_updates - http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html#sysvar_low_priority_updates. This gives your SELECTs higher priority than INSERT/DELETE/UPDATE operations. I used this in production and got a pretty decent speed improvement. The only problem I see with it is that you will lose more data in case of a server crash.
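A minimal sketch of both ways to apply it (my_table is a hypothetical name; note these priority hints only matter for engines with table-level locking, such as MyISAM):
SET GLOBAL low_priority_updates = 1;
-- or per statement:
DELETE LOW_PRIORITY FROM my_table WHERE id = 42;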

With MySQL, deleted rows are at first only marked as deleted internally, and the storage engine cleans up the affected index entries later, when load is low.
Still, if this is a problem and you are deleting many rows, consider using DELETE QUICK. For MyISAM tables, this tells the storage engine not to merge index leaves on delete, just to leave the entries marked as deleted so the space can be reused.
To recover the unused index space, simply run OPTIMIZE TABLE nightly.
In this case, there's no need to implement in your application functionality that MySQL already performs internally.
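A sketch of that pattern, assuming a hypothetical log_entries table:
DELETE QUICK FROM log_entries WHERE created_at < NOW() - INTERVAL 30 DAY;
-- nightly, during off-peak hours:
OPTIMIZE TABLE log_entries;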

Related

Is it better to use an inactive status column instead of deleting rows, for MySQL performance?

Recently I watched a video about CRUD operations in MySQL, and one thing caught my attention: the commentator claimed that deleting rows is bad for MySQL index performance, and that we should use a status column instead.
So, is there really a difference between the two?
Deleting a row is indeed quite expensive, more expensive than setting a new value to a column. Some people don't ever delete a row from their databases (though it's sometimes due to preserving history, not performance considerations).
I usually do delayed deletions: when my app needs to delete a row, it doesn't actually delete it, but sets a status instead. Then later, during a low-traffic period, I execute those deletions, as sketched below.
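A minimal sketch of that pattern, assuming a hypothetical orders table with a status column:
UPDATE orders SET status = 'deleted' WHERE id = 42;
-- later, during a low-traffic period:
DELETE FROM orders WHERE status = 'deleted';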
Some database engines need their data files to be compacted every once in a while, since they cannot reuse the space from deleted records. I'm not sure if InnoDB is one of those.
I guess the strategy is that deleting a row affects all indexes, whereas modifying a 'status' column might not affect any indexes (since you probably wouldn't index that column due to the low cardinality).
Still, when deleting rows, the impact on indexes is minimal. Inserting affects index performance when it fills up an index page, forcing a page split. This doesn't happen with deletes: the index records are merely marked for deletion.
MySQL later (when load is low) purges deleted rows from the indexes. So the cost of deletes is already deferred for you; why double the effort?
Your deletes do need indexes, just like your selects and updates, in order to quickly find the record to delete. So don't blame slow deletes caused by missing or bad indexes on MySQL index performance. Your DELETE statement's WHERE clause should be able to use an index. With InnoDB, this is also important to ensure that just a single index record is locked instead of having to lock all of the records or a range.
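For illustration, a hedged example with a hypothetical sessions table; the index on the WHERE-clause column lets the delete find and lock only the rows it touches:
ALTER TABLE sessions ADD INDEX idx_expires (expires_at);
DELETE FROM sessions WHERE expires_at < NOW();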

UPDATE in MySQL InnoDB with a million records

MySQL InnoDB update issue:
Once I receive a response (status) for a record, I need to write that response to a very large table (approximately 1 million records, and it will keep increasing), and this may happen around 100 times per second. Will there be any performance issues? Or is there any setting I can change to avoid table locking or slow queries?
Thanks.
It sounds like a design issue.
Instead of storing the flag (the one the status-record update changes) on millions of data records, you should store a reference in each data record pointing to the status record. Then, when you update the status record, no further DB operation is required. Also, when you're scanning through the data records, you should JOIN to the status records (if they need to be displayed). If status changes occur often, this is better than updating millions of data records; see the sketch below.
Maybe I'm wrong; you should describe the DB (structure, table record counts) for more accurate answers.
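A minimal sketch of that design, with hypothetical data_records and status_records tables:
-- one-row update instead of touching millions of data records:
UPDATE status_records SET state = 'done' WHERE id = 7;
-- join the status in when reading:
SELECT d.*, s.state
FROM data_records d
JOIN status_records s ON s.id = d.status_id;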
If you store your table using the MyISAM storage engine, then your table will lock with every update.
However, the InnoDB storage engine is capable of locking individual rows.
If you need to UPDATE multiple records simultaneously, InnoDB may be better.
Any indexes you have on the table (especially the clustered index) will slow your writes down.
Indexes speed up reading, but they slow down writing. Most databases get read more than written to, but it sounds like yours gets written to much more.
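If you're unsure which engine a table uses, a quick check and conversion looks like this (data_records is a hypothetical name):
SHOW TABLE STATUS LIKE 'data_records';  -- the Engine column shows MyISAM or InnoDB
ALTER TABLE data_records ENGINE = InnoDB;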

Problematic performance with continuous UPDATE / INSERT in MySQL

Currently we have a database and a script which performs 2 UPDATEs, 1 SELECT, and 1 INSERT.
The problem is that we have 20,000 people who run this script every hour, which causes MySQL to run at 100% CPU.
The INSERT is for logging; we want to log all the data to our MySQL server, but as the table scales up, the application becomes slower and slower. We are running on InnoDB, but some people say it should be MyISAM. What should we use? We do sometimes pull data out of this log table for statistical purposes, but only 40-50 times a day.
Our solution is to use Gearman [http://gearman.org/] to delay the inserts to the database. But what about the updates?
We need to update 2 tables: one to update the customer's balance (balance = balance - 1), and the other to update a count in another table.
How should we make this faster and more CPU efficient?
Thank you
but as the table scales up, the application becomes slower and slower
This usually means that you're missing an index somewhere.
MyISAM is not good: in addition to being non-ACID-compliant, it locks the whole table to do an insert, which kills concurrency.
Read the MySQL documentation carefully:
http://dev.mysql.com/doc/refman/5.0/en/insert-speed.html
Especially "innodb_flush_log_at_trx_commit" -
http://dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html
I would stay away from MyISAM, as it has concurrency issues when mixing SELECT and INSERT statements. If you can keep your insert tables small enough to stay in memory, they'll go much faster. Batching your updates in a transaction will help them go faster as well; see the sketch below. Setting up a test environment and tuning for your actual job is important.
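A hedged sketch of both suggestions, with hypothetical table names:
-- 2 = write the log at each commit but flush it to disk only about once a second,
-- trading a little crash durability for speed:
SET GLOBAL innodb_flush_log_at_trx_commit = 2;
-- batch the two updates in one transaction so they share a single commit/flush:
START TRANSACTION;
UPDATE customers SET balance = balance - 1 WHERE id = 123;
UPDATE counters SET cnt = cnt + 1 WHERE name = 'script_runs';
COMMIT;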
You may also want to look into partitioning to rotate your logs: you'd drop the old partition and create a new one for the current data. This is much faster than deleting the old rows.
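A sketch of that rotation, assuming MySQL 5.1+ partitioning and a hypothetical logs table:
CREATE TABLE logs (
  id BIGINT NOT NULL AUTO_INCREMENT,
  created_at DATETIME NOT NULL,
  message TEXT,
  PRIMARY KEY (id, created_at)  -- the partitioning column must be part of every unique key
) ENGINE=InnoDB
PARTITION BY RANGE (TO_DAYS(created_at)) (
  PARTITION p202401 VALUES LESS THAN (TO_DAYS('2024-02-01')),
  PARTITION p202402 VALUES LESS THAN (TO_DAYS('2024-03-01')),
  PARTITION pmax VALUES LESS THAN MAXVALUE
);
-- dropping a month is a quick metadata operation, far cheaper than DELETE:
ALTER TABLE logs DROP PARTITION p202401;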

MySQL: Why is DELETE more CPU intensive than INSERT?

I'm currently taking the course "Performance Evaluation" at university, and we're now doing an assignment where we are testing the CPU usage on a PHP and MySQL-database server. We use httperf to create custom traffic, and vmstat to track the server load. We are running 3000 connections to the PHP-server, for both INSERT and DELETE (run separately).
Numbers show that the DELETE operation is a lot more CPU intensive than INSERT — and I'm just wondering why?
I initially thought INSERT required more CPU usage, as indexes would need to be recreated, data needed to be written to disk, etc. But obviously I'm wrong, and I'm wondering if anyone can tell me the technical reason for this.
At least with InnoDB (and I hope they have you on this), you have more operations even with no foreign keys. An insert is roughly this:
- Insert the row
- Mark it in the binary log buffer
- Mark the commit
Deletions do the following:
- Mark the row as removed (taking the same hit as an insertion: the page is rewritten)
- Mark it in the binary log buffer
- Mark the commit
- Later, actually go remove the row (taking the same hit as an insertion: the page is rewritten)
- The purge thread tracks the deletions in the binary log buffer too
So you've got roughly twice the work going on to delete rather than insert. A delete requires those two writes because the row must be marked as removed for all versions going forward, but it can only be physically removed once no transactions remain that can still see it. Because InnoDB only writes full blocks to disk, the modification penalty for a block is constant.
DELETE also requires data to be written to disk, plus recalculation of indexes, and in addition, a set of logical comparisons to find the record(s) you are trying to delete in the first place.
Delete requires more logic than you think; how much so depends on the structure of the schema.
In almost all cases, when deleting a record, the server must check whether anything depends on that record via a foreign key reference. That, in a nutshell, is a query of the system tables looking for table definitions with a foreign key referencing this table, then a select on each of those tables for records referencing the record to be deleted. Right there you've increased the computational time by a couple of orders of magnitude, regardless of whether the server does cascading deletes or just throws back an error.
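To make the dependency check concrete, a hedged sketch with hypothetical parent and child tables:
CREATE TABLE parent (id INT PRIMARY KEY) ENGINE=InnoDB;
CREATE TABLE child (
  id INT PRIMARY KEY,
  parent_id INT,
  FOREIGN KEY (parent_id) REFERENCES parent (id) ON DELETE CASCADE
) ENGINE=InnoDB;
-- this delete must also find and remove any matching child rows via the FK:
DELETE FROM parent WHERE id = 1;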
Self-balancing internal data structures would also have to be reorganized, and indexes would have to be updated to remove any now-empty branches of the index trees, but these would have counterparts in the Insert operations.

How to improve MySQL INSERT and UPDATE performance?

Performance of INSERT and UPDATE statements in our database seems to be degrading and causing poor performance in our web app.
Tables are InnoDB and the application uses transactions. Are there any easy tweaks that I can make to speed things up?
I think we might be seeing some locking issues, how can I find out?
You could change the server settings to speed up InnoDB inserts; there are plenty of articles on tuning InnoDB and on further optimizations.
INSERT and UPDATE get progressively slower as the number of rows increases on a table with an index. InnoDB tables are even slower than MyISAM tables for inserts, and the delayed key write option is not available.
The most effective way to speed things up is to save the data to a flat file first and then use LOAD DATA INFILE; this is about 20x faster.
The second option is to create a temporary in-memory table, load the data into it, and then do an INSERT INTO ... SELECT in batches; that is, once you have about 100 rows in your temp table, load them into the permanent one. Both approaches are sketched below.
Additionally, you can get a small improvement in speed by moving the index file onto a separate physical hard drive from the one where the data file is stored. Also try to move any binary logs to a different device. The same applies to the temporary file location.
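A hedged sketch of both approaches, assuming a hypothetical events table with matching columns (note the MEMORY engine does not support TEXT/BLOB columns):
LOAD DATA INFILE '/tmp/events.csv' INTO TABLE events
  FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';
-- or stage rows through an in-memory buffer:
CREATE TABLE events_buffer (
  id INT,
  payload VARCHAR(255)
) ENGINE=MEMORY;
-- the application inserts into events_buffer; every ~100 rows, flush the batch:
INSERT INTO events SELECT * FROM events_buffer;
TRUNCATE TABLE events_buffer;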
I would try setting your tables to delay index updates (note that this option applies to MyISAM tables only):
ALTER TABLE {name} DELAY_KEY_WRITE = 1;
If the columns used in your UPDATE statements' WHERE clauses are not indexed, adding indexes there can improve the performance of the update queries.
I would not look at locking/blocking unless the number of concurrent users has been increasing over time.
If the performance has gradually degraded over time, I would look at the query plans with the EXPLAIN statement.
It would be helpful to have results from the development or initial production environment for comparison purposes.
Dropping or adding an index may be needed, or some other maintenance action mentioned in the other answers.
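For example, a quick look at a hypothetical slow query with EXPLAIN; a type of ALL (full table scan) or an empty key column suggests a missing index:
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;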