MySQL Online DDL changes completing, but not persisting

I have a very large table (over 6 billion rows and more than 3 TB of data) that I'm trying to alter.
Example schema:
CREATE TABLE `huge_table` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`col1` int(11) unsigned DEFAULT NULL,
`col2` int(11) unsigned DEFAULT NULL,
`col3` int(11) unsigned NOT NULL,
`col4` int(11) unsigned DEFAULT NULL,
`col5` varchar(191) COLLATE utf8mb4_unicode_ci NOT NULL,
`col6` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`col7` timestamp(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3),
`col8` datetime GENERATED ALWAYS AS (cast(`col7` as date)) VIRTUAL,
`dt1` timestamp(3) NULL DEFAULT NULL,
`dt2` timestamp(3) NULL DEFAULT NULL,
`dt3` timestamp(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3) ON UPDATE CURRENT_TIMESTAMP(3),
PRIMARY KEY (`id`),
UNIQUE KEY `idx_col5` (`col5`),
KEY `col4_idx` (`col4`),
KEY `col1_idx` (`col1`),
KEY `col3_idx` (`col3`),
KEY `col6_idx` (`col6`,`col3`),
KEY `col2_idx` (`col2`),
KEY `col8_idx` (`col8`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
Most of these indexes are unnecessary, so I'm trying to clean them up, and add a more precise index. Note that all of the ALTER TABLE statements below take well over 24 hours to run.
First, I tried modifying all the indexes using Online DDL:
ALTER TABLE huge_table
DROP INDEX col4_idx,
DROP INDEX col1_idx,
DROP INDEX col3_idx,
DROP INDEX col6_idx,
DROP INDEX col2_idx,
DROP INDEX col8_idx,
ADD INDEX new_idx(col3, col6, col1),
ALGORITHM=INPLACE, LOCK=NONE;
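For an ALTER this long, progress can be watched via Performance Schema stage events; a minimal sketch, assuming performance_schema is enabled and you can update its setup tables:
UPDATE performance_schema.setup_instruments
SET ENABLED = 'YES', TIMED = 'YES'
WHERE NAME LIKE 'stage/innodb/alter%';
UPDATE performance_schema.setup_consumers
SET ENABLED = 'YES'
WHERE NAME LIKE '%stages%';
-- then, while the ALTER runs:
SELECT EVENT_NAME, WORK_COMPLETED, WORK_ESTIMATED
FROM performance_schema.events_stages_current;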
When this finally finished (with no errors), I checked the schema, only to find that nothing had changed. So I decided to try again without forcing it to run INPLACE:
ALTER TABLE huge_table
DROP INDEX col4_idx,
DROP INDEX col1_idx,
DROP INDEX col3_idx,
DROP INDEX col6_idx,
DROP INDEX col2_idx,
DROP INDEX col8_idx,
ADD INDEX new_idx(col3, col6, col1);
This didn't work either (it still appeared to run in online mode). I then decided to break the process into two steps: first drop the unnecessary indexes, then add the new one:
ALTER TABLE huge_table
DROP INDEX col4_idx,
DROP INDEX col1_idx,
DROP INDEX col3_idx,
DROP INDEX col6_idx,
DROP INDEX col2_idx,
DROP INDEX col8_idx;
ALTER TABLE huge_table
ADD INDEX new_idx(col3, col6, col1);
Dropping the indexes took only a few seconds, and the index creation completed without errors (again, in online mode). Unfortunately, the new index didn't take.
I presume the entire DDL statement was ultimately rolled back after failing to create the new index, but I don't see any record of that; it seems to have failed silently. I'm wondering if it's because there's a VIRTUAL column, but I would expect the engine to return some sort of error message. Has anyone else seen this type of issue?

So, right before submitting this question, I found the error:
Error Code: 1799. Creating index 'new_idx' required more than 'innodb_online_alter_log_max_size' bytes of modification log. Please try again.
The issue is that the table is heavily used, and innodb_online_alter_log_max_size is a hard cap on the volume of DML changes that can be buffered while the online DDL runs. As a result, it seems I cannot modify this table in online mode unless I bump up this value.
https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_online_alter_log_max_size
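Since innodb_online_alter_log_max_size is a dynamic variable, it can be raised before retrying; the value below is only an illustration and should be sized to the table's write rate and the expected duration of the ALTER:
SET GLOBAL innodb_online_alter_log_max_size = 4294967296; -- 4 GB; the default is 128 MB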

Related

Simple select query takes more time in very large table in MySQL database in C# application

I am using a MySQL database in my ASP.NET/C# web application. The MySQL Server version is 5.7 and the PC has 8 GB of RAM. Select queries against the table are slow; a simple select query takes around 42 seconds across 1 crore (10 million) records in the table. I have also added indexes to the table. How can I fix this?
The following is my table structure.
CREATE TABLE `smstable_read` (
`MessageID` int(11) NOT NULL AUTO_INCREMENT,
`ApplicationID` int(11) DEFAULT NULL,
`Api_userid` int(11) DEFAULT NULL,
`ReturnMessageID` varchar(255) DEFAULT NULL,
`Sequence_Id` int(11) DEFAULT NULL,
`messagetext` longtext,
`adtextid` int(11) DEFAULT NULL,
`mobileno` varchar(255) DEFAULT NULL,
`deliverystatus` int(11) DEFAULT NULL,
`SMSlength` int(11) DEFAULT NULL,
`DOC` varchar(255) DEFAULT NULL,
`DOM` varchar(255) DEFAULT NULL,
`BatchID` int(11) DEFAULT NULL,
`StudentID` int(11) DEFAULT NULL,
`SMSSentTime` varchar(255) DEFAULT NULL,
`SMSDeliveredTime` varchar(255) DEFAULT NULL,
`SMSDeliveredTimeTicks` decimal(28,0) DEFAULT '0',
`SMSSentTimeTicks` decimal(28,0) DEFAULT '0',
`Sent_SMS_Day` int(11) DEFAULT NULL,
`Sent_SMS_Month` int(11) DEFAULT NULL,
`Sent_SMS_Year` int(11) DEFAULT NULL,
`smssent` int(11) DEFAULT '1',
`Batch_Name` varchar(255) DEFAULT NULL,
`User_ID` varchar(255) DEFAULT NULL,
`Year_ID` int(11) DEFAULT NULL,
`Date_Time` varchar(255) DEFAULT NULL,
`IsGroup` double DEFAULT NULL,
`Date_Time_Ticks` decimal(28,0) DEFAULT NULL,
`IsNotificationSent` int(11) DEFAULT NULL,
`Module_Id` double DEFAULT NULL,
`Doc_Batch` decimal(28,0) DEFAULT NULL,
`SMS_Category_ID` int(11) DEFAULT NULL,
`SID` int(11) DEFAULT NULL,
PRIMARY KEY (`MessageID`),
KEY `index2` (`ReturnMessageID`),
KEY `index3` (`mobileno`),
KEY `BatchID` (`BatchID`),
KEY `smssent` (`smssent`),
KEY `deliverystatus` (`deliverystatus`),
KEY `day` (`Sent_SMS_Day`),
KEY `month` (`Sent_SMS_Month`),
KEY `year` (`Sent_SMS_Year`),
KEY `index4` (`ApplicationID`,`SMSSentTimeTicks`),
KEY `smslength` (`SMSlength`),
KEY `studid` (`StudentID`),
KEY `batchid_studid` (`BatchID`,`StudentID`),
KEY `User_ID` (`User_ID`),
KEY `Year_Id` (`Year_ID`),
KEY `IsNotificationSent` (`IsNotificationSent`),
KEY `isgroup` (`IsGroup`),
KEY `SID` (`SID`),
KEY `SMS_Category_ID` (`SMS_Category_ID`),
KEY `SMSSentTimeTicks` (`SMSSentTimeTicks`)
) ENGINE=MyISAM AUTO_INCREMENT=16513292 DEFAULT CHARSET=utf8;
The following is my select query:
SELECT messagetext, SMSSentTime, StudentID, batchid,
User_ID,MessageID,Sent_SMS_Day, Sent_SMS_Month,
Sent_SMS_Year,Module_Id,Year_ID,Doc_Batch
FROM smstable_read
WHERE StudentID=977 AND SID = 8582 AND MessageID>16013282
You need to learn about compound indexes and covering indexes. Read about those things.
Your query is slow because it's doing a half-scan of the table: it uses the primary key to find the first row with a qualifying MessageID, then examines every subsequent row to find matching rows.
Your filter criteria are StudentID = constant, SID = constant AND MessageID > constant. That means you need those three columns, in that order, in an index. The first two filter criteria will random-access your index to the correct place. The third criterion will scan the index starting right after the constant value in your query. It's called an Index Range Scan operation, and it's quite efficient.
ALTER TABLE smstable_read
ADD INDEX StudentSidMessage (StudentId, SID, MessageId);
This compound index should make your query efficient. Notice that in MyISAM, unlike InnoDB, the primary key column is not implicitly appended to secondary indexes, so it has to appear in the compound index explicitly. That works out here because it's also part of your query criteria.
If this query is used very frequently, you could make a covering index: you could add the other columns of the query (the ones mentioned in your SELECT clause) to the index.
But, unfortunately, you have defined your messageText column with the longtext data type. That allows each message to contain up to four gigabytes. (Why? Is this really SMS data? SMS has a limit of 160 bytes per message. Four gigabytes >> 160 bytes.)
Now, the point of a covering index is to allow the query to be satisfied entirely from the index, without referring back to the table. But when you include a longtext or any other LOB column in an index, the index can only hold a prefix of the data, so the point of the covering index is lost.
If I were you I would change my table so messageText was a VARCHAR(255) data type, and then create this covering index:
ALTER TABLE smstable_read
ADD INDEX StudentSidMessage (StudentId, SID, MessageId,
SMSSentTime, batchid,
User_ID, Sent_SMS_Day, Sent_SMS_Month,
Sent_SMS_Year,Module_Id,Year_ID,Doc_Batch,
messageText);
(Notice that you should put variable-length items last in the index if you can.)
If you can't change your application to handle VARCHAR(255) then go with the first index I mentioned.
Pro tip: putting lots of single-column indexes on MySQL tables rarely helps SELECT performance and always harms INSERT and UPDATE performance. You need an index on your primary key, and you need indexes to support the queries you run. Extra indexes are harmful.
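One way to spot candidates for removal on MySQL 5.7, assuming the sys schema is installed (it only reflects usage since the last server restart):
SELECT index_name
FROM sys.schema_unused_indexes
WHERE object_schema = DATABASE() AND object_name = 'smstable_read';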
It looks like your database is neither properly indexed nor properly normalized. Normalizing your database will go a long way toward speeding up all your queries, particularly since MySQL generally uses only one index per table in a query. Even though you have lots of indexes, most of them cannot be used.
Your current query filters on StudentID, SID, and MessageID. The last is an inequality comparison, so an index is less effective there, but the other two columns are equality comparisons. I suggest an index like this:
KEY `studid` (`StudentID`,`SID`)
Follow that up by dropping your existing index on SID. If you find that you don't want to drop it because it's used in another query, that's further evidence that your table is in desperate need of normalization.
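A sketch of that change, folding the drop and the new compound index into one ALTER (replacing the old single-column studid index is an assumption):
ALTER TABLE smstable_read
DROP INDEX studid,
DROP INDEX SID,
ADD INDEX studid (StudentID, SID);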
Too many indexes slow down inserts and add a little overhead to each SELECT, because the query planner needs more effort to figure out which index to use.

MySQL performance - large database

I've read heaps of posts here on stackoverflow, blog posts, tutorials and more, but I still fail to resolve a rather nasty performance issue with my MySQL db. Keep in mind that I'm a novice when it comes to large MySQL databases.
I have a table with approx. 11,000,000 rows (it will increase to, say, 20,000,000 or more). Here's the layout:
CREATE TABLE `myTable` (
`intcol1` int(11) DEFAULT NULL,
`charcol1` char(25) DEFAULT NULL,
`intcol2` int(11) DEFAULT NULL,
`charcol2` char(50) DEFAULT NULL,
`charcol3` char(50) DEFAULT NULL,
`charcol4` char(50) DEFAULT NULL,
`intcol3` int(11) DEFAULT NULL,
`charcol5` char(50) DEFAULT NULL,
`intcol4` int(20) DEFAULT NULL,
`intcol5` int(20) DEFAULT NULL,
`intcol6` int(20) DEFAULT NULL,
`intcol7` int(11) DEFAULT NULL,
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`),
FULLTEXT KEY `idx` (`charcol2`,`charcol3`)
) ENGINE=MyISAM AUTO_INCREMENT=11665231 DEFAULT CHARSET=latin1;
A select statement like
SELECT * from myTable where charcol2='bogus' AND charcol3='bogus2';
takes 25 seconds or so to execute. That's too slow, and will be even slower as the table grows.
The table will not have any inserts or updates at all (so to speak), and will be primarily used for outputting searches on the char-columns.
I've tried to make indexing work (playing around with FULLTEXT, as you can see), but it seems that I'm missing something. Any takes on how to speed up the performance?
Please note: I'm currently running MySQL on my MacBook Air (1.7 GHz i5, 4 GB RAM). If this is the only answer to my performance issues, I'll move the database to something more appropriate ;-)
EDIT: EXPLAIN output
id  select_type  table    type  possible_keys  key   key_len  ref   rows      Extra
1   SIMPLE       myTable  ALL   NULL           NULL  NULL     NULL  11596725  Using where
You don't need FULLTEXT indexes for requests like this, where the equality operator is used. Just create a regular index on the char fields used in the WHERE condition, and remove the FULLTEXT index:
ALTER TABLE myTable DROP INDEX idx;
ALTER TABLE myTable ADD INDEX charcol_idx (charcol2, charcol3);
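Afterwards, EXPLAIN should confirm the new index is used; roughly (exact numbers will vary):
EXPLAIN SELECT * FROM myTable WHERE charcol2='bogus' AND charcol3='bogus2';
-- expect type=ref and key=charcol_idx instead of a full scan of ~11.6M rows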

Over-Indexed a MySQL table. How can I remediate?

I have a table with 3 million rows and 6 columns. The problem is that my mysqld server wouldn't return results for any query; it would simply time out.
I then read that over-indexing can lead to excessive swapping of data between memory and disk, which can slow the server down.
So I ran a query ALTER TABLE <Tbl_name> DROP INDEX <Index_name>;. This query has been running for 10 hours and has not completed yet.
Is this expected to run for so long?
Is there a better way of dropping/altering my indices?
edit - Added SHOW CREATE TABLE output
| Sample | CREATE TABLE `sample` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`FiMD5` varchar(32) NOT NULL,
`NoMD5` varchar(32) NOT NULL,
`SeMD5` varchar(32) NOT NULL,
`SeesMD5` varchar(32) NOT NULL,
`ImMD5` varchar(32) NOT NULL,
`Ovlay` tinyint(1) NOT NULL DEFAULT '1',
PRIMARY KEY (`ID`),
KEY `FiMD5_3` (`FiMD5`),
KEY `ID` (`ID`),
KEY `ID_2` (`ID`),
KEY `pIndex` (`FiMD5`),
KEY `FiMD5_` (`FiMD5`,`NoMD5`)
) ENGINE=InnoDB AUTO_INCREMENT=3073630 DEFAULT CHARSET=latin1 |
Perhaps the following would be faster:
SELECT ... INTO OUTFILE first
Use TRUNCATE TABLE to delete everything
Modify the table
Use LOAD DATA INFILE to restore the data
If step 2 takes too long, drop the table and recreate it instead.
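A sketch of that sequence against this table (the file path is an assumption, and secure_file_priv must allow writing there; verify the dump before truncating). Note that ID and ID_2 duplicate the primary key and pIndex duplicates FiMD5_3, so those three are safe to drop:
SELECT * INTO OUTFILE '/tmp/sample.tsv' FROM sample;
TRUNCATE TABLE sample;
ALTER TABLE sample
DROP INDEX ID,
DROP INDEX ID_2,
DROP INDEX pIndex;
LOAD DATA INFILE '/tmp/sample.tsv' INTO TABLE sample;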

How can you speed up making changes to large tables (200k+ rows) in mysql databases?

I have a table in my MySQL database which I constantly need to alter and insert rows into, but it runs slowly whenever I make changes, which is difficult because there are over 200k entries. I tested another table with very few rows and it was quick, so it's not the server or the database itself; it's that particular table that struggles. I need all of the table's rows and cannot find a way around the load issues.
DROP TABLE IF EXISTS `articles`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `articles` (
`id` int(11) NOT NULL auto_increment,
`content` text NOT NULL,
`author` varchar(255) NOT NULL,
`alias` varchar(255) NOT NULL,
`topic` varchar(255) NOT NULL,
`subtopics` varchar(255) NOT NULL,
`keywords` text NOT NULL,
`submitdate` timestamp NOT NULL default CURRENT_TIMESTAMP,
`date` varchar(255) NOT NULL,
`day` varchar(255) NOT NULL,
`month` varchar(255) NOT NULL,
`year` varchar(255) NOT NULL,
`time` varchar(255) NOT NULL,
`ampm` varchar(255) NOT NULL,
`ip` varchar(255) NOT NULL,
`score_up` int(11) NOT NULL default '0',
`score_down` int(11) NOT NULL default '0',
`total_score` int(11) NOT NULL default '0',
`approved` varchar(255) NOT NULL,
`visible` varchar(255) NOT NULL,
`searchable` varchar(255) NOT NULL,
`addedby` varchar(255) NOT NULL,
`keyword_added` varchar(255) NOT NULL,
`topic_added` varchar(255) NOT NULL,
PRIMARY KEY (`id`),
KEY `score_up` (`score_up`),
KEY `score_down` (`score_down`),
FULLTEXT KEY `SEARCH` (`content`),
FULLTEXT KEY `asearch` (`author`),
FULLTEXT KEY `topic` (`topic`),
FULLTEXT KEY `keywords` (`content`,`keywords`,`topic`,`author`),
FULLTEXT KEY `content` (`content`,`keywords`),
FULLTEXT KEY `new` (`keywords`),
FULLTEXT KEY `author` (`author`)
) ENGINE=MyISAM AUTO_INCREMENT=290823 DEFAULT CHARSET=latin1;
/*!40101 SET character_set_client = @saved_cs_client */;
With indexes it's a trade-off:
more indexes = faster selects, slower inserts
fewer indexes = slower selects, faster inserts
That's because every index has to be updated on each insert, and the more data there is in the table, the more work MySQL has to do to maintain them.
So remove the indexes you don't need; that should speed up your inserts.
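For example, asearch and author here are FULLTEXT indexes on exactly the same column, so one of them is pure overhead and can go without any loss:
ALTER TABLE articles DROP INDEX asearch;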
Another option is to partition your table into several partitions, which can relieve that bottleneck.
Try batching your changes into a single update script instead of issuing them one at a time. Recreating tables is slow; updating only the rows where changes have actually been made is faster. For example, collect all the changes your program makes and send them as one combined INSERT/UPDATE statement. That should be fast enough for most applications, though, as always, speed depends on how much data is processed.
This may or may not help you directly, but I notice that you have a lot of VARCHAR(255) columns in your table. Some of them seem totally unnecessary (do you really need all those date / day / month / year / time / ampm columns?), and many could be replaced by more compact datatypes:
Dates could be stored as a DATETIME (or TIMESTAMP).
IP addresses could be stored as INTEGERs, or as BINARY(16) for IPv6.
Instead of storing usernames in the article table, you should create a separate user table and reference it using INTEGER keys.
I don't know what the approved, visible and searchable fields are, but I bet they don't need to be VARCHAR(255)s.
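A sketch of what those changes might look like; treating approved/visible/searchable as boolean flags is an assumption based on their names, and the existing string data (especially ip) would need converting first:
ALTER TABLE articles
DROP COLUMN `date`, DROP COLUMN `day`, DROP COLUMN `month`,   -- submitdate already
DROP COLUMN `year`, DROP COLUMN `time`, DROP COLUMN `ampm`,   -- stores all of this
MODIFY `ip` int(10) unsigned NOT NULL,   -- write with INET_ATON(), read with INET_NTOA()
MODIFY `approved` tinyint(1) NOT NULL DEFAULT '0',
MODIFY `visible` tinyint(1) NOT NULL DEFAULT '0',
MODIFY `searchable` tinyint(1) NOT NULL DEFAULT '0';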
I'd also second Adrian Cornish's suggestion to split your table. In particular, you really want to keep frequently changing and frequently accessed metadata, such as up/down vote scores, separate from rarely changing and infrequently accessed bulk data like article content. See for example http://20bits.com/articles/10-tips-for-optimizing-mysql-queries-that-dont-suck/
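A sketch of that split; the article_scores table name is hypothetical:
CREATE TABLE `article_scores` (
`article_id` int(11) NOT NULL,
`score_up` int(11) NOT NULL DEFAULT '0',
`score_down` int(11) NOT NULL DEFAULT '0',
`total_score` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`article_id`)
) ENGINE=InnoDB;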
"I have a table inside of my mysql database which I constantly need to alter and insert rows into but it continues"
Try InnoDB for this table if your application performs a lot of concurrent updates and inserts; its row-level locking pays off there.
I recommend splitting that "big table" (not that big actually, but for MySQL it may be) into several tables to make the most of the query cache. Any time you update a record in a table, the query cache entries for that table are erased. You can also try reducing the isolation level, but that is a little more complicated.

meaning of a KEY in a CREATE TABLE statement in mysql

I am working with MySQL.
I checked a CREATE TABLE statement, and I saw the keyword KEY there:
| pickupspc | CREATE TABLE `pickupspc` (
`McId` int(11) NOT NULL,
`Slot` int(11) NOT NULL,
`FromTime` datetime NOT NULL,
`ToTime` datetime NOT NULL,
`Head` int(11) NOT NULL,
`Nozzle` int(11) DEFAULT NULL,
`FeederID` int(11) DEFAULT NULL,
`CompName` varchar(64) DEFAULT NULL,
`CompID` varchar(32) DEFAULT NULL,
`PickUps` int(11) DEFAULT NULL,
`Errors` int(11) DEFAULT NULL,
`ErrorCode` varchar(32) DEFAULT NULL,
KEY `ndx_PickupSPC` (`McId`,`Slot`,`FromTime`,`ToTime`,`Head`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
But what is the meaning of it?
It's not like a PRIMARY KEY, right?
Thanks.
It is simply a synonym for INDEX. It creates an index with the name ndx_PickupSPC on the columns specified in parenthesis.
See the CREATE TABLE syntax for more information.
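A quick way to see the synonymy for yourself, using throwaway tables:
CREATE TABLE t1 (a INT, KEY idx_a (a));
CREATE TABLE t2 (a INT, INDEX idx_a (a));
-- SHOW CREATE TABLE prints KEY `idx_a` (`a`) for both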
It's just a non-unique index. From the manual
KEY is normally a synonym for INDEX. The key attribute PRIMARY KEY can
also be specified as just KEY when given in a column definition. This
was implemented for compatibility with other database systems.
KEY and INDEX are the same thing here. The word KEY in a table definition creates an index, which speeds up lookups.
In the above code, KEY ndx_PickupSPC means that it is creating an index named ndx_PickupSPC on the columns mentioned in parentheses.
It's an INDEX on the table. Indexes enable fast lookups for specific queries which check the values of the columns the index is built on. The example uses a compound key.
They are a bit similar to the indexes you find at the end of the books. You can quickly find an entry with the index without searching through the whole book. Databases typically use B-Trees for indexes.