MySQL partitioning for table with huge inserts and deletes - mysql

I have a table into which about 20 million rows are inserted per day (blind inserts without any constraints). The table has two foreign keys, one of which is a reference id to a table with about 10 million rows.
I am planning to delete all data in this table that is older than a month, because it is not needed anymore. The problem is that, with so many insertions happening, once I start deleting, the table will be locked and the insertions will be blocked.
I wanted to know whether we can partition the table by month. That way, I was hoping that when I delete all the data older than 2 months, that data would be in one partition and the insertions would be going into a different partition, so the locks taken by the delete would not block the inserts.
Please tell me if this is possible. I am fairly new to databases, so please let me know if there is something wrong with my thinking.

From the MySQL documentation:
For InnoDB and BDB tables, MySQL uses table locking only if you
explicitly lock the table with LOCK TABLES. For these storage engines,
avoid using LOCK TABLES at all, because InnoDB uses automatic
row-level locking and BDB uses page-level locking to ensure
transaction isolation.
I'm not sure you even have an issue. Have you tested this and seen locking issues, or are you just theorizing about them right now?

MySQL has partitioning as of version 5.1.
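You can run this query to verify whether your version of MySQL supports partitioning (the variable exists in 5.1 and 5.5; it was removed in 5.6, where SHOW PLUGINS lists the partition plugin instead):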
SHOW VARIABLES LIKE 'have_partitioning';
Then you can read the manual to learn how to use it:
http://dev.mysql.com/doc/refman/5.5/en/partitioning.html
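As a rough sketch of what monthly partitioning could look like for the table in the question: the table name, column names and dates below are made up for illustration, so adapt them to the real schema. One thing to check first is that partitioned InnoDB tables cannot have foreign keys, so the two foreign keys mentioned in the question would have to be dropped before this works.

```sql
-- Hypothetical table; names, columns and dates are illustrative only.
CREATE TABLE events (
    id         BIGINT NOT NULL AUTO_INCREMENT,
    ref_id     INT NOT NULL,
    created_at DATETIME NOT NULL,
    payload    VARCHAR(255),
    -- The partitioning column must be part of every unique key,
    -- including the primary key.
    PRIMARY KEY (id, created_at)
) ENGINE=InnoDB
PARTITION BY RANGE (TO_DAYS(created_at)) (
    PARTITION p2012_01 VALUES LESS THAN (TO_DAYS('2012-02-01')),
    PARTITION p2012_02 VALUES LESS THAN (TO_DAYS('2012-03-01')),
    PARTITION p2012_03 VALUES LESS THAN (TO_DAYS('2012-04-01')),
    PARTITION pmax     VALUES LESS THAN MAXVALUE
);

-- Removing a whole month drops the partition instead of deleting
-- rows one by one.
ALTER TABLE events DROP PARTITION p2012_01;

-- Before the next month starts, split the catch-all partition
-- to make room for it.
ALTER TABLE events REORGANIZE PARTITION pmax INTO (
    PARTITION p2012_04 VALUES LESS THAN (TO_DAYS('2012-05-01')),
    PARTITION pmax     VALUES LESS THAN MAXVALUE
);
```

Dropping a partition is a metadata-level operation rather than a row-by-row DELETE, which is why it is usually suggested for this kind of rolling cleanup. It still briefly needs a lock on the table, so test it under your insert load.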

Related

What are the current differences between MyISAM and InnoDB storage engines specifically in MySQL 5.7?

I have seen many questions and answers on MyISAM vs InnoDB on Stack Overflow itself.
But all of them are old and do not relate to the current stable version, MySQL 5.7.x.
By now, a lot of development must have gone into both MyISAM and InnoDB.
So I need the differences as they stand in version 5.7.x.
Please don't mark my question as a duplicate. Could someone explain the differences between these storage engines today, as well as the differences they have had in the past?
Also, please explain in which situations each storage engine should be chosen for a table.
Can different tables in the same schema have different storage engines, i.e. some tables InnoDB and others MyISAM?
If yes, how would JOIN queries between MyISAM and InnoDB tables be executed?
Is it true that MySQL is going to remove the MyISAM storage engine in a future version?
Your assumption that MyISAM has been receiving new development is not correct. MyISAM is not receiving any significant new development. MySQL is clearly moving in the direction of phasing out MyISAM, and using MyISAM is discouraged.
Oracle Corp. has not announced any specific date or version by which they will remove MyISAM. My guess is that MyISAM will never be fully removed, because there are too many sites that wouldn't be able to upgrade, without doing expensive testing to make sure their specific app won't experience any regression issues by converting to InnoDB.
But you might notice that in the MySQL 5.7 manual, the section on MyISAM has been demoted to Alternative Storage Engines, which should be a clue that it's receiving less priority.
In MySQL 5.7, MyISAM is still used for some of the system tables, like mysql.user, mysql.db, etc. But new system tables introduced in 5.6 and 5.7 are InnoDB. All system tables are InnoDB in MySQL 8.0.
MyISAM still does not support any of the properties of ACID. There are no transactions, no consistency features, and no durable writes. See my answer to MyISAM versus InnoDB.
MyISAM still does not support foreign keys, for what it's worth. But I seldom see real production sites using foreign keys even with InnoDB.
MyISAM supports only table-level locking (except for some INSERT appending to the end of a table, as noted in the manual).
MySQL 5.7 supports both fulltext indexes and spatial indexes in both MyISAM and InnoDB. These features are not reasons to continue using MyISAM as they once were.
Neither logical backup tools like mysqldump nor physical backup tools like Percona XtraBackup can back up MyISAM tables without acquiring a global lock.
You asked if you could create a variety of tables with different storage engines in the same schema. Yes, you can, and this is the same as it has been for many versions of MySQL.
You asked if you can join tables of different storage engines (by the way, tables don't need to be in the same schema to be joined). Yes, you can join such tables, MySQL takes care of all the details. This is the same as it has been for many versions of MySQL.
But some weird cases can come up when you do this, like what if you update a MyISAM table and an InnoDB table in a transaction, and then roll back? The changes in the InnoDB table are rolled back, but the changes in the MyISAM table are not rolled back, so your data integrity can be broken if you aren't careful. This is also the same as it has been for many versions of MySQL.
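A minimal sketch of that rollback case, using two throwaway tables (the table names are made up for the demonstration):

```sql
-- Hypothetical demo tables, one per storage engine.
CREATE TABLE t_innodb (id INT PRIMARY KEY) ENGINE=InnoDB;
CREATE TABLE t_myisam (id INT PRIMARY KEY) ENGINE=MyISAM;

START TRANSACTION;
INSERT INTO t_innodb VALUES (1);
INSERT INTO t_myisam VALUES (1);
-- The rollback succeeds but raises a warning that some
-- non-transactional changed tables couldn't be rolled back.
ROLLBACK;

SELECT COUNT(*) FROM t_innodb;  -- 0: the InnoDB insert was undone
SELECT COUNT(*) FROM t_myisam;  -- 1: the MyISAM insert is still there

-- Joining across engines needs no special syntax.
SELECT m.id
FROM t_myisam AS m
LEFT JOIN t_innodb AS i ON i.id = m.id
WHERE i.id IS NULL;
```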
The list of cases where MyISAM has an advantage over InnoDB is short, and it's getting shorter.
Some table-scan queries and bulk inserts are faster in MyISAM. InnoDB is better at indexed searches.
MyISAM may use less storage space than the equivalent data stored in an uncompressed InnoDB table. You can further compact MyISAM tables with myisampack, but this makes the MyISAM table read-only.
There are other options these days for compact storage of data in transactional storage engines, for example InnoDB table compression, or MyRocks.
SELECT COUNT(*) FROM MyTable queries (with no WHERE clause) are very fast in MyISAM, because the accurate count of rows is persisted in the MyISAM metadata. InnoDB (or other MVCC implementations) doesn't keep this count persisted, because every transaction viewing the table might "see" a different row count. Only a storage engine that has table-level locking and no transaction isolation like MyISAM, can optimize this case.
Auto-increment that numbers independently for each distinct value in another key column. Again, this requires table-level locking, so it's not supported in InnoDB.
CREATE TABLE MyTable (
  group_id INT NOT NULL,
  seq_id INT NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (group_id, seq_id)
) ENGINE=MyISAM;
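To illustrate how the numbering behaves with the MyTable definition above, here is a small usage sketch (the inserted values are arbitrary):

```sql
-- seq_id is left out, so MyISAM assigns MAX(seq_id) + 1
-- within each group_id.
INSERT INTO MyTable (group_id) VALUES (1), (1), (2), (1), (2);

SELECT group_id, seq_id
FROM MyTable
ORDER BY group_id, seq_id;
-- group_id | seq_id
--        1 |      1
--        1 |      2
--        1 |      3
--        2 |      1
--        2 |      2
```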
It's still easy to move a MyISAM table from server to server, because the .MYD and .MYI files are self-contained. You can kind of do something similar with InnoDB tables, but you have to use the intricate feature of transportable tablespaces. But this easy-to-move-tables quality of MyISAM no longer works in MySQL 8.0, because of their new data dictionary feature.
Under certain load, MyISAM might be a better choice for internal_tmp_disk_storage_engine, which defaults to InnoDB in MySQL 5.7. If you run lots of queries that create temp tables on disk (in-memory temp tables won't benefit), it can put a strain on the InnoDB engine. But you'd have to have a high query rate for this to matter, and if your queries create so many temp tables on disk, you should try to optimize the queries differently.
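If you want to check whether this applies to you, something along these lines should work on MySQL 5.7 (the variable was removed again in later 8.0 releases, so treat this as 5.7-specific):

```sql
-- How many internal temp tables were created, and how many of
-- them spilled to disk, since the server started.
SHOW GLOBAL STATUS LIKE 'Created_tmp_tables';
SHOW GLOBAL STATUS LIKE 'Created_tmp_disk_tables';

-- Current engine for on-disk internal temp tables (INNODB by default in 5.7).
SHOW VARIABLES LIKE 'internal_tmp_disk_storage_engine';

-- Switch to MyISAM at runtime; requires the SUPER privilege.
SET GLOBAL internal_tmp_disk_storage_engine = MYISAM;
```

A high ratio of Created_tmp_disk_tables to Created_tmp_tables is the hint that queries are spilling to disk often enough for the setting to matter.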
MyISAM allows you to set multiple key caches, and define caches for specific tables. But the MyISAM key caches are only for index structures, not for data.
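A short sketch of a dedicated key cache; the cache name hot_cache and its size are arbitrary choices for illustration, and MyTable stands for any MyISAM table:

```sql
-- Create a second key cache named hot_cache (128 MB here).
SET GLOBAL hot_cache.key_buffer_size = 128 * 1024 * 1024;

-- Assign the indexes of one MyISAM table to that cache.
CACHE INDEX MyTable IN hot_cache;

-- Optionally preload the index blocks into the cache.
LOAD INDEX INTO CACHE MyTable;
```

The assignment does not survive a server restart, so it is usually repeated from an init file.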
References:
https://www.percona.com/blog/2016/10/11/mysql-8-0-end-myisam/
https://www.percona.com/blog/2017/12/04/internal-temporary-tables-mysql-5-7/
http://jfg-mysql.blogspot.com/2017/08/why-we-still-need-myisam.html
I had this question on a job quiz and got it right (referring to the newer versions):
MyISAM and InnoDB are two different storage engines that handle CRUD operations differently.
Locking: When a session changes a row in a MyISAM table, the whole table is locked against other sessions until the statement finishes, unlike InnoDB, which locks only the specific row(s) involved. With InnoDB, the lock is held until the transaction is committed or rolled back. Locking a table or a row makes other sessions that try to work with the same table or row wait, which prevents conflicting changes to the data, for example.
Transactions: InnoDB supports transactions, unlike MyISAM. A transaction groups one or more statements such as SELECT, INSERT, UPDATE and DELETE into a single operation that either completes as a whole or not at all.
Atomic Operations: When a transaction on InnoDB tables cannot complete, all of its changes are undone and the DB is restored to the state it was in before (all or nothing). For example, if a syntax error in the code, a datatype mismatch or anything else interrupts the statements in the middle of a transaction and the transaction is rolled back, none of the changes are applied, thanks to the atomicity of transactions. With the MyISAM storage engine, on the other hand, if a series of statements "breaks" (for any reason), the operation stops immediately and all the tables/rows/data that were already affected remain affected, which might leave inconsistent data in the database (...and a headache).
Operations on MyISAM take effect on the spot, whereas InnoDB lets you use ROLLBACK to discard any change, which comes in most handy when running transactions.
Transaction Logs: Without a transaction log in between, every change is applied directly to the table(s) in the DB, and if the table has a clustered index (for example), the engine first has to find exactly where the row belongs and only then apply the change. With a transaction log between the transaction and the DB, changes are written to the log first, in order, and applied to the table afterwards, which is less time consuming. The DB keeps a log of the transactions that were made, which can be used to restore a previous transaction and recover changes. Under a "simple" recovery model, transactions are removed from the transaction log and cannot be used to recover data (usually used in DEV environments). Under a "full" recovery model, all transactions are kept and listed, ready to be restored; this is usually used in production environments, where the growing log might cause performance issues, so backing the logs up and deleting them from the server could be a solution. Under a "bulk-logged" recovery model, logging is reduced for certain bulk commands (import, export, insert-select, select-into, reorganizing/rebuilding indexes), which might prevent performance issues. (The simple, full and bulk-logged recovery models are SQL Server terms; MySQL has no setting by these names.)
Foreign keys: MyISAM doesn't support foreign keys, unlike InnoDB. When a column has a foreign key pointing at a column of another table, any update/delete on the referenced table is checked against the table pointing at it, and can be propagated to it. This creates a kind of link between the two tables and keeps the data in sync. Setting up tables with FKs might require more effort, which might be considered a disadvantage (?).
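A minimal sketch of such a link, using made-up customers and orders tables and ON DELETE CASCADE as one possible referential action:

```sql
-- Hypothetical parent and child tables; both must be InnoDB.
CREATE TABLE customers (
    id   INT PRIMARY KEY,
    name VARCHAR(100) NOT NULL
) ENGINE=InnoDB;

CREATE TABLE orders (
    id          INT PRIMARY KEY,
    customer_id INT NOT NULL,
    CONSTRAINT fk_orders_customer
        FOREIGN KEY (customer_id) REFERENCES customers (id)
        ON DELETE CASCADE
) ENGINE=InnoDB;

INSERT INTO customers VALUES (1, 'Alice');
INSERT INTO orders VALUES (10, 1);

-- Rejected with a foreign key error: customer 99 does not exist.
INSERT INTO orders VALUES (11, 99);

-- Deleting the customer also deletes order 10 because of the cascade.
DELETE FROM customers WHERE id = 1;
```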
FULLTEXT indexing: InnoDB did not support FULLTEXT indexes in versions before MySQL 5.6; MyISAM does support them. Switching to MyISAM is not the best solution, so just update MySQL to a version in which InnoDB supports FULLTEXT indexing.
FULLTEXT indexing can take texts like titles, comments, etc. and search them (this should be a better option than the LIKE operator in this case).
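For illustration, a FULLTEXT index on an InnoDB table (MySQL 5.6 or later) might look like this; the articles table and its columns are invented for the example:

```sql
-- Hypothetical table with a FULLTEXT index over two text columns.
CREATE TABLE articles (
    id    INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(200),
    body  TEXT,
    FULLTEXT KEY ft_title_body (title, body)
) ENGINE=InnoDB;

-- Uses the FULLTEXT index; the column list must match the index.
SELECT id, title
FROM articles
WHERE MATCH (title, body) AGAINST ('database' IN NATURAL LANGUAGE MODE);

-- The LIKE equivalent cannot use an index because of the leading wildcard.
SELECT id, title
FROM articles
WHERE body LIKE '%database%';
```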
Spatial indexes: Before MySQL 5.7 these were supported only on MyISAM; from 5.7 on, InnoDB supports them as well.
To sum it all up, InnoDB is usually more reliable in terms of data handling, validity and recovery. Newer versions of InnoDB support FULLTEXT indexing, mainly for searches; when you are stuck on an older version with no option to update MySQL, MyISAM remains the choice for FULLTEXT search.

Slow MySQL table

I am currently trying to figure out why the site I am working on (Laravel 4.2 framework) is really slow at times, and I think it has to do with my database setup. I am not a pro at all, so I would assume that is where the problem is.
My sessions table has roughly 2.2 million records in it. When I run show processlist;, all the queries that take the longest relate to that table.
Here is a picture for example:
Table structure
Surely I am doing something wrong, or it's not indexed properly? I'm not sure; I'm not fantastic with databases.
We don't see the complete SQL being executed, so we can't recommend appropriate indexes. But if the only predicate on the DELETE statements is on the last_activity column i.e.
DELETE FROM `sessions` WHERE last_activity <= 'somevalue' ;
then performance of the DELETE statement will likely be improved by adding an index with a leading column of last_activity, e.g.
CREATE INDEX sessions_IX1 ON sessions (last_activity);
Also, if this table is using MyISAM storage engine, then DML statements cannot execute concurrently; DML statements will block while waiting to obtain exclusive lock on the table. The InnoDB storage engine uses row level locking, so some DML operations can be concurrent. (InnoDB doesn't eliminate lock contention, but locks will be on rows and index blocks, rather than on the entire table.)
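To find out which engine the table uses, and to convert it if it turns out to be MyISAM, something like the following should do. The conversion rebuilds the whole table, so try it on a copy first.

```sql
-- The Engine column of the output shows MyISAM or InnoDB.
SHOW TABLE STATUS LIKE 'sessions';

-- Convert the table to InnoDB (rebuilds the table).
ALTER TABLE sessions ENGINE=InnoDB;
```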
Also consider using a different storage mechanism (other than MySQL database) for storing and retrieving info for web server "sessions".
Also, is it necessary (is there some requirement) to persist 2.2 million "sessions" rows? Are we sure that all of those rows are actually needed? If some of that data is historical, and isn't specifically needed to support the current web server sessions, we might consider moving the historical data to another table.
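A sketch of moving old rows out of the way. It assumes last_activity holds a Unix timestamp (an assumption, since we haven't seen the column definition), and the sessions_archive table name and the 30-day cutoff are made up for the example:

```sql
-- Assumption: last_activity is an integer Unix timestamp.
SET @cutoff = UNIX_TIMESTAMP(NOW() - INTERVAL 30 DAY);

-- Empty copy of the table structure, including indexes.
CREATE TABLE sessions_archive LIKE sessions;

-- Copy the old rows, then remove them from the live table.
INSERT INTO sessions_archive
SELECT * FROM sessions WHERE last_activity <= @cutoff;

DELETE FROM sessions WHERE last_activity <= @cutoff;
```

If the historical rows are not needed at all, the DELETE alone is enough. On a busy table it may be gentler to run it in chunks by adding ORDER BY last_activity LIMIT 10000 and repeating until no rows are affected.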

mysql db engine when to use

I am trying to find out which MySQL table engine is best for each of our tables and requirements.
The tables with many reads (SELECT queries) are MyISAM.
The tables with many writes (INSERT/UPDATE queries) are InnoDB. These are the only two engines we have used, but now we have different scenarios and we do not know which DB engine is best.
1) We have a table users that we UPDATE/SELECT very often, about 1 row every second for SELECT and 1 row every second for UPDATE, but INSERTs are rare, about 1 every 300 seconds. For this we chose MyISAM.
2) We have a table users_data where we INSERT data as often as in table users, about every 300 seconds. We do not UPDATE this table often, but we read from it once every second. For this we chose MyISAM.
3) We have a table transactions where we INSERT data very often, about 1 row every 4-5 seconds, and we SELECT large batches from it every 20-30 seconds (we often compute many SUMs from this table based on userid). For this we chose MyISAM.
4) We have a table transactions_logs where we store id (the same as in the transactions table), merchant name and email. We INSERT data very often, about 1 row every 4-5 seconds, but we read it very rarely. For this we chose InnoDB.
We rarely join the transactions and transactions_logs tables for statistics.
5) We have a table pages where we only SELECT data, very often, about 1 row per second. For this we chose MyISAM and we turned on the MySQL cache.
Questions:
a) We have another table with 1 INSERT every 100000 seconds but many SELECT/UPDATE queries per second. What type should this be? We are using MyISAM for this for now.
We read data from it, modify it, then update it, and we do this once every 1-2 seconds. Is MyISAM the best option for this?
b) Do you think we should have used InnoDB for all tables? I've read that since MySQL 5.5 InnoDB is the default table type and that it has probably been optimised a lot.
Fundamentally, I use the following two differences between MyISAM and InnoDB to choose which one to use in a specific scenario:
InnoDB supports transactions, MyISAM does not.
InnoDB has row-level locking, MyISAM has table-level locking.
(Source: MySQL 5.7 Reference Manual)
My rule of thumb is to use MyISAM when there is a high number of select queries and a low number of update/insert queries. Whenever write performance or data integrity is important, I'll use InnoDB.
While the above is useful as a starting point, every database, and every application, are different. The specific details of your hardware and software setup will ultimately dictate which engine choice is best. When in doubt, test!
However, I will say that, based on the numbers provided, and assuming 'modern' server hardware, you're not anywhere near the performance limits of MySQL so either engine would suffice.
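For the read-modify-update pattern described in question (a), here is a sketch of how the two differences above play out on an InnoDB table. The balance column and the id value are hypothetical:

```sql
-- Read, modify and write one row without another session
-- changing it in between. Only this row is locked.
START TRANSACTION;

SELECT balance
FROM users
WHERE id = 42
FOR UPDATE;   -- takes a row lock held until COMMIT or ROLLBACK

UPDATE users
SET balance = balance + 10
WHERE id = 42;

COMMIT;
```

Other sessions can keep reading and updating other rows of users while this runs. On a MyISAM table, START TRANSACTION and FOR UPDATE have no effect, and each UPDATE locks the whole table for its duration.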
MyISAM works great for read-only loads and for write-once, read-forever loads. It handles concurrency by locking the entire table on writes, which can make it very slow on write-heavy loads.
InnoDB is a little more complicated and adds some configuration options that must be set reasonably well. It adds support for row-level locking, which is great for rows that are added or updated, ideally less than once per second each (giving plenty of time to read/write).

Move existing tables to InnoDB from MyISAM and which one is faster?

A database already has 25-30 tables, and all of them are MyISAM. Most of these tables are related to each other, meaning a lot of queries use joins on IDs to retrieve data.
One of the tables contains 7-10 million records, and it becomes slow when I want to search, update or even retrieve all the data. I proposed a solution to my boss: converting the tables to InnoDB might give better performance.
I also explained the benefits of InnoDB:
Since we join multiple tables on keys anyway and they are related, it would be better to use foreign keys and have a relational database, which avoids orphan rows. I found around 10-15k orphan rows in one of the big tables and had to remove them manually.
Support for transactions: we perform big updates from time to time, and if one of them fails along the way we have to replace the entire table with the backed-up one and run the update again to make sure that all queries were executed. With InnoDB we can roll back the changes from query 1 if query 2 fails.
The response I got from my boss is that I need to prove that InnoDB will run faster than MyISAM. My question is: won't the above two things improve the speed of the application itself, by eliminating orphan rows?
In general, is MyISAM faster than InnoDB?
Note: using MySQL 5.5
You should also mention to your boss probably the biggest benefit you get from InnoDB for large tables with both read/write load - You get row-level locking rather than table-level locking. This can be a great performance benefit for the application in cases where you see a lot of waits for table locks to be released.
Of course the best way to convince your boss is to prove it. Make copies of your large table and place them on a test database. Make one version of the data in MyISAM and one in InnoDB. Then run load tests against them with a load mix that approximates your current DB read/write activity. Find out for yourself whether it is better.
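Setting up the two copies could look roughly like this, with big_table standing in for your real table name:

```sql
-- MyISAM copy: same structure and engine as the original.
CREATE TABLE big_table_myisam LIKE big_table;
INSERT INTO big_table_myisam SELECT * FROM big_table;

-- InnoDB copy: convert while the table is still empty, then load.
CREATE TABLE big_table_innodb LIKE big_table;
ALTER TABLE big_table_innodb ENGINE=InnoDB;
INSERT INTO big_table_innodb SELECT * FROM big_table;
```

If the original has FULLTEXT indexes, the conversion will fail on 5.5 and those indexes would have to be dropped from the copy first. Remember to size innodb_buffer_pool_size sensibly on the test server, otherwise the comparison is unfair to InnoDB.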
Just updated for your comment that you are on 5.5. With 5.5 it is a no brainer to use InnoDB. MyISAM engine basically has seen no improvement over the last several years and development effort has been around InnoDB. InnoDB is THE MySQL engine of choice going forward.
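On the orphan rows mentioned in the question: once the tables are InnoDB you can find and remove existing orphans and then add the constraint so new ones cannot appear. The parent/child table and column names below are placeholders:

```sql
-- Find child rows whose parent no longer exists.
SELECT c.id
FROM child AS c
LEFT JOIN parent AS p ON p.id = c.parent_id
WHERE p.id IS NULL;

-- Remove them.
DELETE c
FROM child AS c
LEFT JOIN parent AS p ON p.id = c.parent_id
WHERE p.id IS NULL;

-- Adding the constraint fails while orphans exist, so clean up first.
ALTER TABLE child
    ADD CONSTRAINT fk_child_parent
    FOREIGN KEY (parent_id) REFERENCES parent (id);
```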

Update IN MYSQL InnoDB million records

MySQL InnoDB update issue:
Once I receive a response (status) for a record, I need to write that response to a very large table (approximately 1 million records, and growing), and this will happen maybe 100 times per second. Will there be any performance issues? Or is there any setting I can modify to avoid table locking or slow queries?
Thanks.
It sounds like a design issue.
Instead of storing the flag (which the status-record update changes) in millions of data-records, you should store a reference in the data-records pointing to the status-record. Then, when you update the status-record, no further DB operation is required. When you scan through the data-records, you JOIN to the status-records (if they need to be displayed). If status-record changes occur often, this is better than updating millions of data-records.
Maybe I'm wrong; you should describe the DB (structure, table record counts) to get more accurate answers.
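Since I don't know your real structure, here is only a guess at what that design could look like; all table and column names are invented:

```sql
-- Hypothetical layout: many data records share one status record.
CREATE TABLE status_records (
    id     INT PRIMARY KEY,
    status VARCHAR(20) NOT NULL
) ENGINE=InnoDB;

CREATE TABLE data_records (
    id        BIGINT PRIMARY KEY,
    status_id INT NOT NULL,
    payload   VARCHAR(255),
    KEY idx_status (status_id),
    FOREIGN KEY (status_id) REFERENCES status_records (id)
) ENGINE=InnoDB;

-- A status change touches a single row ...
UPDATE status_records SET status = 'done' WHERE id = 7;

-- ... and readers pick it up through the join.
SELECT d.id, d.payload, s.status
FROM data_records AS d
JOIN status_records AS s ON s.id = d.status_id;
```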
If you store your table using the MyISAM storage engine, then your table will lock with every update.
However, the InnoDB storage engine is capable of locking individual rows.
If you need to UPDATE multiple records simultaneously, InnoDB may be better.
Any indexes you have on the database (especially clustered indexes) will slow your writes down.
Indexes speed up reading, but they slow down writing. Most databases get read more than written to, but it sounds like yours gets written to much more.
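As a rough illustration, assuming a table named records with a primary key id and a status column (all hypothetical names):

```sql
-- An update that finds its row through the primary key locks and
-- changes only that one row under InnoDB.
UPDATE records SET status = 'OK' WHERE id = 12345;

-- Check that the lookup really uses the key (type should be const).
EXPLAIN SELECT * FROM records WHERE id = 12345;

-- Review which secondary indexes every write has to maintain.
SHOW INDEX FROM records;
```

If the WHERE clause of the UPDATE cannot use an index, InnoDB has to scan and lock many more rows, which hurts far more than the cost of maintaining a few indexes.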