I've read heaps of posts here on Stack Overflow, blog posts, tutorials and more, but I still can't resolve a rather nasty performance issue with my MySQL database. Keep in mind that I'm a novice when it comes to large MySQL databases.
I have a table with approx. 11.000.000 rows (will increase to say 20.000.000 or more). Here's the layout:
CREATE TABLE `myTable` (
`intcol1` int(11) DEFAULT NULL,
`charcol1` char(25) DEFAULT NULL,
`intcol2` int(11) DEFAULT NULL,
`charcol2` char(50) DEFAULT NULL,
`charcol3` char(50) DEFAULT NULL,
`charcol4` char(50) DEFAULT NULL,
`intcol3` int(11) DEFAULT NULL,
`charcol5` char(50) DEFAULT NULL,
`intcol4` int(20) DEFAULT NULL,
`intcol5` int(20) DEFAULT NULL,
`intcol6` int(20) DEFAULT NULL,
`intcol7` int(11) DEFAULT NULL,
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`),
FULLTEXT KEY `idx` (`charcol2`,`charcol3`)
) ENGINE=MyISAM AUTO_INCREMENT=11665231 DEFAULT CHARSET=latin1;
A select statement like
SELECT * FROM myTable WHERE charcol2='bogus' AND charcol3='bogus2';
takes 25 seconds or so to execute. That's too slow, and will be even slower as the table grows.
The table will not have any inserts or updates at all (so to speak), and will be primarily used for outputting searches on the char-columns.
I've tried to make indexing work (playing around with FULLTEXT, as you can see), but it seems that I'm missing something. Any takes on how to speed up the performance?
Please note: I'm currently running MySQL on my MacBook Air (1.7 GHz i5, 4 GB RAM). If better hardware is the only answer to my performance issues, I'll move the database to something appropriate ;-)
EDIT: EXPLAIN output
id  select_type  table    type  possible_keys  key   key_len  ref   rows      Extra
1   SIMPLE       myTable  ALL   NULL           NULL  NULL     NULL  11596725  Using where
You don't need a FULLTEXT index for requests like this, where the equality operator is used. Just create a regular index on the char columns used in the WHERE condition, and remove the FULLTEXT index:
DROP INDEX idx ON myTable;
ALTER TABLE myTable ADD INDEX charcol_idx (charcol2, charcol3);
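To confirm that the new index is actually used, the same EXPLAIN can be re-run after the change. This is a minimal sketch, assuming the composite index on the two char columns has been created; the row estimate will depend on the data.

```sql
-- Re-check the plan after adding the composite index.
-- Expectation (not guaranteed): `type` changes from ALL to ref and
-- `key` names the new index instead of NULL.
EXPLAIN
SELECT *
FROM myTable
WHERE charcol2 = 'bogus'
  AND charcol3 = 'bogus2';
```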
Related
So basically I created a table:
CREATE TABLE IF NOT EXISTS `student` (
`id` int(4) unsigned NOT NULL AUTO_INCREMENT,
`campus` enum('CAMPUS1', 'CAMPUS2') NOT NULL,
`fullname` char(32) NOT NULL,
`gender` enum('MALE', 'FEMALE') NOT NULL,
`birthday` char(16) NOT NULL,
`phone` char(32) NOT NULL,
`emergency` char(32) NOT NULL,
`address` char(128) NOT NULL,
PRIMARY KEY (`id`),
KEY `key_student` (`campus`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 ;
I have about 20 rows, only 12 of them in CAMPUS1.
But when I run this query: SELECT * FROM student WHERE campus='CAMPUS1'; the EXPLAIN output is:
id  select_type  table    type  possible_keys  key   key_len  ref   rows  Extra
1   SIMPLE       student  ALL   key_student    NULL  NULL     NULL  20    Using where
I am new to this. How does a KEY really work? I read the documentation but I can't understand much of it.
MySQL is trying to be smart (with varying success) when deciding which index to use for a query.
There are cases where it is faster to query the entire table instead of using the index. E.g: if your table has 500 records for CAMPUS1 and 100 records for CAMPUS2 it is faster to do a full (600 records) scan when looking for campus='CAMPUS1'.
When you have only 20 rows you run into the edge cases of the algorithm. Try adding some more rows, and see what happens.
Also, it seems this index will have a very low cardinality (an even split between only 2 values). It will probably not be very useful.
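A quick way to see the optimizer's choice for yourself is to compare the default plan with one where the index is forced. This is a rough sketch against the student table from the question; FORCE INDEX is used here only for comparison, not as a recommended fix.

```sql
-- Refresh index statistics so the optimizer works from current numbers.
ANALYZE TABLE student;

-- Plan the optimizer picks on its own (a full scan on a tiny table is normal).
EXPLAIN SELECT * FROM student WHERE campus = 'CAMPUS1';

-- Same query, but telling MySQL to prefer key_student, for comparison.
EXPLAIN SELECT * FROM student FORCE INDEX (key_student) WHERE campus = 'CAMPUS1';
```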
I have a MyISAM table (on MariaDB) with 7 million rows in it.
CREATE TABLE `mytable` (
`id` bigint(100) unsigned NOT NULL AUTO_INCREMENT,
`x` int(5) unsigned NOT NULL DEFAULT '0',
`y` int(5) unsigned NOT NULL DEFAULT '0',
`value` int(5) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=10152508 DEFAULT CHARSET=utf8 PAGE_CHECKSUM=1
When I do
SELECT * FROM mytable WHERE id = 167880;
it takes around 0.272 sec
When I do
UPDATE mytable SET value = 1 WHERE id = 167880;
it takes anywhere from 0.200 to 2.5 sec, seemingly at random
I was thinking it's because my table has a lot of rows, but still, it shouldn't take that much time to update a row by its primary key.
Since I did some research before posting, here are the checks I've already done:
No duplicate indexes
No others indexes than the primary key "id"
No triggers
Tried switching to the InnoDB engine; it was worse (around 6 sec for an update)
Tried switching to the Aria engine; it was even worse
Already ran OPTIMIZE TABLE
Config is the default config of the latest MariaDB version (fresh install)
Ran all these checks while the db was not used by anything else, so no heavy reads during the tests
I think that the problem is the data type you are using for the id column.
Using INT rather than BIGINT can significantly reduce disk space.
Read this article for details.
http://ronaldbradford.com/blog/bigint-v-int-is-there-a-big-deal-2008-07-18/
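If the ids will stay below 4,294,967,295 (an assumption about the data, not something the question states), the column could be narrowed roughly like this. Note that the ALTER rebuilds the whole table, so it may take a while on 7 million rows.

```sql
-- Assumes every id fits in 4 bytes (max 4294967295 for INT UNSIGNED).
ALTER TABLE mytable
  MODIFY id INT UNSIGNED NOT NULL AUTO_INCREMENT;
```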
Hope it helps
I am using a MySQL database in my ASP.NET (C#) web application. The MySQL Server version is 5.7 and the PC has 8 GB of RAM. When I run a select query against a table with 1 crore (10 million) records, it is slow: a simple select takes around 42 seconds. I have already indexed the table. How can I fix this?
The following is my table structure.
CREATE TABLE `smstable_read` (
`MessageID` int(11) NOT NULL AUTO_INCREMENT,
`ApplicationID` int(11) DEFAULT NULL,
`Api_userid` int(11) DEFAULT NULL,
`ReturnMessageID` varchar(255) DEFAULT NULL,
`Sequence_Id` int(11) DEFAULT NULL,
`messagetext` longtext,
`adtextid` int(11) DEFAULT NULL,
`mobileno` varchar(255) DEFAULT NULL,
`deliverystatus` int(11) DEFAULT NULL,
`SMSlength` int(11) DEFAULT NULL,
`DOC` varchar(255) DEFAULT NULL,
`DOM` varchar(255) DEFAULT NULL,
`BatchID` int(11) DEFAULT NULL,
`StudentID` int(11) DEFAULT NULL,
`SMSSentTime` varchar(255) DEFAULT NULL,
`SMSDeliveredTime` varchar(255) DEFAULT NULL,
`SMSDeliveredTimeTicks` decimal(28,0) DEFAULT '0',
`SMSSentTimeTicks` decimal(28,0) DEFAULT '0',
`Sent_SMS_Day` int(11) DEFAULT NULL,
`Sent_SMS_Month` int(11) DEFAULT NULL,
`Sent_SMS_Year` int(11) DEFAULT NULL,
`smssent` int(11) DEFAULT '1',
`Batch_Name` varchar(255) DEFAULT NULL,
`User_ID` varchar(255) DEFAULT NULL,
`Year_ID` int(11) DEFAULT NULL,
`Date_Time` varchar(255) DEFAULT NULL,
`IsGroup` double DEFAULT NULL,
`Date_Time_Ticks` decimal(28,0) DEFAULT NULL,
`IsNotificationSent` int(11) DEFAULT NULL,
`Module_Id` double DEFAULT NULL,
`Doc_Batch` decimal(28,0) DEFAULT NULL,
`SMS_Category_ID` int(11) DEFAULT NULL,
`SID` int(11) DEFAULT NULL,
PRIMARY KEY (`MessageID`),
KEY `index2` (`ReturnMessageID`),
KEY `index3` (`mobileno`),
KEY `BatchID` (`BatchID`),
KEY `smssent` (`smssent`),
KEY `deliverystatus` (`deliverystatus`),
KEY `day` (`Sent_SMS_Day`),
KEY `month` (`Sent_SMS_Month`),
KEY `year` (`Sent_SMS_Year`),
KEY `index4` (`ApplicationID`,`SMSSentTimeTicks`),
KEY `smslength` (`SMSlength`),
KEY `studid` (`StudentID`),
KEY `batchid_studid` (`BatchID`,`StudentID`),
KEY `User_ID` (`User_ID`),
KEY `Year_Id` (`Year_ID`),
KEY `IsNotificationSent` (`IsNotificationSent`),
KEY `isgroup` (`IsGroup`),
KEY `SID` (`SID`),
KEY `SMS_Category_ID` (`SMS_Category_ID`),
KEY `SMSSentTimeTicks` (`SMSSentTimeTicks`)
) ENGINE=MyISAM AUTO_INCREMENT=16513292 DEFAULT CHARSET=utf8;
The following is my select query:
SELECT messagetext, SMSSentTime, StudentID, batchid,
User_ID,MessageID,Sent_SMS_Day, Sent_SMS_Month,
Sent_SMS_Year,Module_Id,Year_ID,Doc_Batch
FROM smstable_read
WHERE StudentID=977 AND SID = 8582 AND MessageID>16013282
You need to learn about compound indexes and covering indexes. Read about those things.
Your query is slow because it's doing a half-scan of the table. It uses the primary key to find the first row with a qualifying MessageID, then looks at every row of the table to find matching rows.
Your filter criteria are StudentID = constant, SID = constant AND MessageID > constant. That means you need those three columns, in that order, in an index. The first two filter criteria will random-access your index to the correct place. The third criterion will scan the index starting right after the constant value in your query. It's called an Index Range Scan operation, and it's quite efficient.
ALTER TABLE smstable_read
ADD INDEX StudentSidMessage (StudentId, SID, MessageId);
This compound index should make your query efficient. Notice that in MyISAM, the primary key column of a table should appear in compound indexes. That's cool in this case because it's also part of your query criteria.
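To check that the compound index is picked up, the query can be run through EXPLAIN once the index exists. This is a sketch only; the expected plan in the comment is what an index range scan usually looks like, not a guaranteed result.

```sql
-- After creating the (StudentID, SID, MessageID) index:
-- an index range scan typically shows type = range and
-- key = StudentSidMessage in the EXPLAIN output.
EXPLAIN
SELECT messagetext, SMSSentTime, StudentID, batchid,
       User_ID, MessageID, Sent_SMS_Day, Sent_SMS_Month,
       Sent_SMS_Year, Module_Id, Year_ID, Doc_Batch
FROM smstable_read
WHERE StudentID = 977
  AND SID = 8582
  AND MessageID > 16013282;
```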
If this query is used very frequently, you could make a covering index: you could add the other columns of the query (the ones mentioned in your SELECT clause) to the index.
But, unfortunately you have defined your messageText column with a longtext data type. That allows for each message to contain up to four gigabytes. (Why? Is this really SMS data? A single SMS is limited to 160 characters, i.e. 140 bytes. Four gigabytes >> 140 bytes.)
Now the point of a covering index is to allow the query to be satisfied entirely from the index, without referring back to the table. But when you include a longtext or any other LOB column in an index, it only contains a subset of the data. So the point of the covering index is lost.
If I were you I would change my table so messageText was a VARCHAR(255) data type, and then create this covering index:
ALTER TABLE smstable_read
ADD INDEX StudentSidMessage (StudentId, SID, MessageId,
SMSSentTime, batchid,
User_ID, Sent_SMS_Day, Sent_SMS_Month,
Sent_SMS_Year,Module_Id,Year_ID,Doc_Batch,
messageText);
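(Notice that you should put variable-length items last in the index if you can. Also check the total key length: MyISAM limits an index key to 1000 bytes, and several utf8 VARCHAR(255) columns such as SMSSentTime and User_ID can exceed that unless they are shortened.)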
If you can't change your application to handle VARCHAR(255) then go with the first index I mentioned.
Pro tip: putting lots of single-column indexes on MySQL tables rarely helps SELECT performance and always harms INSERT and UPDATE performance. You need an index on your primary key, and you need indexes to support the queries you run. Extra indexes are harmful.
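As an illustration of a redundant index in the table above: the single-column BatchID index is a leftmost prefix of batchid_studid (BatchID, StudentID), so the compound index can serve the same lookups. A sketch of how one might review and drop it, assuming no query uses an index hint that names BatchID:

```sql
-- List the existing indexes and their column order.
SHOW INDEX FROM smstable_read;

-- BatchID (BatchID) duplicates the leading column of
-- batchid_studid (BatchID, StudentID), so it can likely go.
ALTER TABLE smstable_read DROP INDEX BatchID;
```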
It looks like your database is not properly indexed and not even properly normalized. Normalizing your database will go a long way toward speeding up all your queries, particularly since MySQL generally uses only one index per table in a query. Even though you have lots of indexes, they cannot all be used at once.
Your current query filters on StudentID, SID, and MessageID. The last is an inequality comparison, so an index will not be very effective with that, but the other two columns are equality comparisons. I suggest an index like this:
KEY `studid` (`StudentID`,`SID`)
Follow that up by dropping your existing index on SID. If you find that you don't want to drop it because it's used in another query, that is further evidence that your table is in desperate need of normalization.
Too many indexes slow down inserts and add a little overhead to each SELECT, because the query planner needs more effort to figure out which index to use.
I have this table:
CREATE TABLE `messenger_contacts` (
`number` varchar(15) NOT NULL,
`has_telegram` tinyint(1) NOT NULL DEFAULT '0',
`geo_state` int(11) NOT NULL DEFAULT '0',
`geo_city` int(11) NOT NULL DEFAULT '0',
`geo_postal` int(11) NOT NULL DEFAULT '0',
`operator` tinyint(1) NOT NULL DEFAULT '0',
`type` tinyint(1) NOT NULL DEFAULT '0'
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
ALTER TABLE `messenger_contacts`
ADD PRIMARY KEY (`number`),
ADD KEY `geo_city` (`geo_city`),
ADD KEY `geo_postal` (`geo_postal`),
ADD KEY `type` (`type`),
ADD KEY `type1` (`operator`),
ADD KEY `has_telegram` (`has_telegram`),
ADD KEY `geo_state` (`geo_state`);
with about 11 million records.
A simple count select on this table takes about 30 to 60 seconds to complete, which seems very high.
select count(number) from messenger_contacts where geo_state=1
I am not a database pro, so besides setting indexes I don't know what else I can do to make the query faster.
UPDATE:
OK, I made some changes to the column types and sizes:
CREATE TABLE IF NOT EXISTS `messenger_contacts` (
`number` bigint(13) unsigned NOT NULL,
`has_telegram` tinyint(1) NOT NULL DEFAULT '0' ,
`geo_state` int(2) NOT NULL DEFAULT '0',
`geo_city` int(4) NOT NULL DEFAULT '0',
`geo_postal` int(10) NOT NULL DEFAULT '0',
`operator` tinyint(1) NOT NULL DEFAULT '0' ,
`type` tinyint(1) NOT NULL DEFAULT '0' ,
PRIMARY KEY (`number`),
KEY `has_telegram` (`has_telegram`,`geo_state`),
KEY `geo_city` (`geo_city`),
KEY `geo_postal` (`geo_postal`),
KEY `type` (`type`),
KEY `type1` (`operator`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Now the query takes only 4 to 5 seconds, with both count(*) and count(number).
Thanks everyone for your help, even the guy that gave me -1. This will be good enough for now, considering that my server is low-end hardware and I will be caching the select count results.
Maybe
select count(geo_state) from messenger_contacts where geo_state=1
as it will give the same result but will not use number column from the clustered index?
If this does not help, I would try changing the number column to an INT type, which should reduce the index size, or try increasing the amount of memory MySQL can use for caching indexes.
You did not change the datatypes. INT(11) == INT(2) == INT(100) -- each is a 4-byte signed integer. You probably want 1-byte unsigned TINYINT UNSIGNED or 2-byte SMALLINT UNSIGNED.
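A sketch of what actually shrinking the columns could look like. The value ranges are assumptions (geo_state within 0..255, geo_city within 0..65535); check the real data before narrowing anything.

```sql
-- Assumption: geo_state values fit in 0..255 and geo_city in 0..65535.
-- TINYINT UNSIGNED is 1 byte, SMALLINT UNSIGNED is 2 bytes.
ALTER TABLE messenger_contacts
  MODIFY geo_state TINYINT UNSIGNED NOT NULL DEFAULT 0,
  MODIFY geo_city  SMALLINT UNSIGNED NOT NULL DEFAULT 0;
```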
It is a waste to index "flags", which I assume type and has_telegram are. The optimizer will never use them because it will be less efficient than simply doing a table scan.
The standard coding pattern is:
select count(*)
from messenger_contacts
where geo_state=1
unless you need to not count NULLs, which is what COUNT(geo_state) implies.
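The difference between the two forms only shows up when the column can hold NULL. A small illustration on a throwaway table (count_demo is a made-up name, not part of the question):

```sql
-- Hypothetical scratch table with one nullable column.
CREATE TEMPORARY TABLE count_demo (v INT NULL);
INSERT INTO count_demo VALUES (1), (NULL), (1);

-- COUNT(*) counts rows; COUNT(v) skips rows where v IS NULL.
SELECT COUNT(*) AS all_rows,      -- 3
       COUNT(v) AS non_null_v     -- 2
FROM count_demo;
```

In messenger_contacts every column is NOT NULL, so both forms return the same number there.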
Once you have the index on geo_state (or an index starting with geo_state), the query will scan the index (which is a separate BTree structure) starting with the first occurrence of geo_state=1 until the last, counting as it goes. That is, it will touch about 1.1 million index entries. So, a few seconds is to be expected. Counting a 'rare' geo_state will run much faster.
The reason for 30-60 seconds versus 4-5 seconds is very likely to be caching. The former had to read stuff from disk; the latter did not. Run the query twice.
Using the geo_state index will be faster for that query than using the PRIMARY KEY unless there are caching differences.
INDEX(number,geo_state) is virtually useless for any of the SELECTs mentioned -- geo_state should be first. INDEX(geo_state, number) is an example of a "covering" index for the select count(number)... case.
More on building indexes.
I am working with MySQL.
I checked the CREATE TABLE statement and saw the word KEY in it:
CREATE TABLE `pickupspc` (
`McId` int(11) NOT NULL,
`Slot` int(11) NOT NULL,
`FromTime` datetime NOT NULL,
`ToTime` datetime NOT NULL,
`Head` int(11) NOT NULL,
`Nozzle` int(11) DEFAULT NULL,
`FeederID` int(11) DEFAULT NULL,
`CompName` varchar(64) DEFAULT NULL,
`CompID` varchar(32) DEFAULT NULL,
`PickUps` int(11) DEFAULT NULL,
`Errors` int(11) DEFAULT NULL,
`ErrorCode` varchar(32) DEFAULT NULL,
KEY `ndx_PickupSPC` (`McId`,`Slot`,`FromTime`,`ToTime`,`Head`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
But what does it mean?
It's not like a PRIMARY KEY, right?
Thanks.
It is simply a synonym for INDEX. It creates an index named ndx_PickupSPC on the columns specified in parentheses.
See the CREATE TABLE syntax for more information.
It's just a non-unique index. From the manual:
KEY is normally a synonym for INDEX. The key attribute PRIMARY KEY can
also be specified as just KEY when given in a column definition. This
was implemented for compatibility with other database systems.
Key and index are the same. The word Key in the table creation is used to create an index, which enables faster performance.
In the above code, KEY ndx_PickupSPC means that it creates an index named ndx_PickupSPC on the columns listed in parentheses.
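To see the equivalence directly, one could create two otherwise identical tables, one using KEY and one using INDEX, and compare what MySQL reports. The table and index names below are invented for the illustration:

```sql
-- Two hypothetical demo tables; only the keyword differs.
CREATE TABLE key_demo   (a INT NOT NULL, b INT NOT NULL, KEY   ndx_ab (a, b));
CREATE TABLE index_demo (a INT NOT NULL, b INT NOT NULL, INDEX ndx_ab (a, b));

-- Both report the same non-unique index on (a, b).
SHOW INDEX FROM key_demo;
SHOW INDEX FROM index_demo;
```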
It's an INDEX on the table. Indexes enable fast lookups for specific queries which check the values of the columns the index is built on. The example uses a compound key.
They are a bit similar to the indexes you find at the end of books. You can quickly find an entry with the index without searching through the whole book. Databases typically use B-Trees for indexes.