GROUP BY Query -why so slow - mysql

I am trying to generate a group query on a large table (more than 8 million rows). However I can reduce the need to group all the data by date. I have a view that captures that dates I require and this limits the query bu it's not much better.
Finally I need to join to another table to pick up a field.
I am showing the query, the create on the main table and the query explain below.
Main Query:
SELECT pgi_raw_data.wsp_channel,
'IOM' AS wsp,
pgi_raw_data.dated,
pgi_accounts.`master`,
pgi_raw_data.event_id,
pgi_raw_data.breed,
Sum(pgi_raw_data.handle),
Sum(pgi_raw_data.payout),
Sum(pgi_raw_data.rebate),
Sum(pgi_raw_data.profit)
FROM pgi_raw_data
INNER JOIN summary_max
ON pgi_raw_data.wsp_channel = summary_max.wsp_channel
AND pgi_raw_data.dated > summary_max.race_date
INNER JOIN pgi_accounts
ON pgi_raw_data.account = pgi_accounts.account
GROUP BY pgi_raw_data.event_id
ORDER BY NULL
The create table:
CREATE TABLE `pgi_raw_data` (
`event_id` char(25) NOT NULL DEFAULT '',
`wsp_channel` varchar(5) NOT NULL,
`dated` date NOT NULL,
`time` time DEFAULT NULL,
`program` varchar(5) NOT NULL,
`track` varchar(25) NOT NULL,
`raceno` tinyint(2) NOT NULL,
`detail` varchar(30) DEFAULT NULL,
`ticket` varchar(20) NOT NULL DEFAULT '',
`breed` varchar(12) NOT NULL,
`pool` varchar(10) NOT NULL,
`gross` decimal(11,2) NOT NULL,
`refunds` decimal(11,2) NOT NULL,
`handle` decimal(11,2) NOT NULL,
`payout` decimal(11,4) NOT NULL,
`rebate` decimal(11,4) NOT NULL,
`profit` decimal(11,4) NOT NULL,
`account` mediumint(10) NOT NULL,
PRIMARY KEY (`event_id`,`ticket`),
KEY `idx_account` (`account`),
KEY `idx_wspchannel` (`wsp_channel`,`dated`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=latin1
This is my view for summary_max:
CREATE ALGORITHM=UNDEFINED DEFINER=`root`#`localhost` SQL SECURITY DEFINER VIEW
`summary_max` AS select `pgi_summary_tbl`.`wsp_channel` AS
`wsp_channel`,max(`pgi_summary_tbl`.`race_date`) AS `race_date`
from `pgi_summary_tbl` group by `pgi_summary_tbl`.`wsp
And also the evaluated query:
1 PRIMARY <derived2> ALL 6 Using temporary
1 PRIMARY pgi_raw_data ref idx_account,idx_wspchannel idx_wspchannel
7 summary_max.wsp_channel 470690 Using where
1 PRIMARY pgi_accounts ref PRIMARY PRIMARY 3 gf3data_momutech.pgi_raw_data.account 29 Using index
2 DERIVED pgi_summary_tbl ALL 42282 Using temporary; Using filesort
Any help on indexing would help.

At a minimum you need indexes on these fields:
pgi_raw_data.wsp_channel,
pgi_raw_data.dated,
pgi_raw_data.account
pgi_raw_data.event_id,
summary_max.wsp_channel,
summary_max.race_date,
pgi_accounts.account
The general (not always) rule is anything you are sorting, grouping, filtering or joining on should have an index.
Also: pgi_summary_tbl.wsp
Also, why the order by null?

The first thing is to be sure that you have indexes on pgi_summary_table(wsp_channel, race_date) and pgi_accounts(account). For this query, you don't need indexes on these columns in the raw data.
MySQL has a tendency to use indexes even when they are not the most efficient path. I would start by looking at the performance of the "full" query, without the joins:
SELECT pgi_raw_data.wsp_channel,
'IOM' AS wsp,
pgi_raw_data.dated,
-- pgi_accounts.`master`,
pgi_raw_data.event_id,
pgi_raw_data.breed,
Sum(pgi_raw_data.handle),
Sum(pgi_raw_data.payout),
Sum(pgi_raw_data.rebate),
Sum(pgi_raw_data.profit)
FROM pgi_raw_data
GROUP BY pgi_raw_data.event_id
If this has better performance, you may have a situation where the indexes are working against you. The specific problem is called "thrashing". It occurs when a table is too bit to fit into memory. Often, the fastest way to deal with such a table is to just read the whole thing. Accessing the table through an index can result in an extra I/O operation for most of the rows.
If this works, then do the joins after the aggregate. Also, consider getting more memory, so the whole table will fit into memory.
Second, if you have to deal with this type of data, then partitioning the table by date may prove to be a very useful option. This will allow you to significantly reduce the overhead of reading the large table. You do have to be sure that the summary table can be read the same way.

Related

Simple select query takes more time in very large table in MySQL database in C# application

I am using a MySQL database in my ASP.NET with C# web application. The MySQL Server version is 5.7 and there is 8 GB RAM in the PC. When I am executing the select query in MySQL database table, it takes more time in execution; a simple select query takes around 42 seconds. Across 1 crorerecord (10 million records) in the table. I have also done indexing for the table. How can I fix this?
The following is my table structure.
CREATE TABLE `smstable_read` (
`MessageID` int(11) NOT NULL AUTO_INCREMENT,
`ApplicationID` int(11) DEFAULT NULL,
`Api_userid` int(11) DEFAULT NULL,
`ReturnMessageID` varchar(255) DEFAULT NULL,
`Sequence_Id` int(11) DEFAULT NULL,
`messagetext` longtext,
`adtextid` int(11) DEFAULT NULL,
`mobileno` varchar(255) DEFAULT NULL,
`deliverystatus` int(11) DEFAULT NULL,
`SMSlength` int(11) DEFAULT NULL,
`DOC` varchar(255) DEFAULT NULL,
`DOM` varchar(255) DEFAULT NULL,
`BatchID` int(11) DEFAULT NULL,
`StudentID` int(11) DEFAULT NULL,
`SMSSentTime` varchar(255) DEFAULT NULL,
`SMSDeliveredTime` varchar(255) DEFAULT NULL,
`SMSDeliveredTimeTicks` decimal(28,0) DEFAULT '0',
`SMSSentTimeTicks` decimal(28,0) DEFAULT '0',
`Sent_SMS_Day` int(11) DEFAULT NULL,
`Sent_SMS_Month` int(11) DEFAULT NULL,
`Sent_SMS_Year` int(11) DEFAULT NULL,
`smssent` int(11) DEFAULT '1',
`Batch_Name` varchar(255) DEFAULT NULL,
`User_ID` varchar(255) DEFAULT NULL,
`Year_ID` int(11) DEFAULT NULL,
`Date_Time` varchar(255) DEFAULT NULL,
`IsGroup` double DEFAULT NULL,
`Date_Time_Ticks` decimal(28,0) DEFAULT NULL,
`IsNotificationSent` int(11) DEFAULT NULL,
`Module_Id` double DEFAULT NULL,
`Doc_Batch` decimal(28,0) DEFAULT NULL,
`SMS_Category_ID` int(11) DEFAULT NULL,
`SID` int(11) DEFAULT NULL,
PRIMARY KEY (`MessageID`),
KEY `index2` (`ReturnMessageID`),
KEY `index3` (`mobileno`),
KEY `BatchID` (`BatchID`),
KEY `smssent` (`smssent`),
KEY `deliverystatus` (`deliverystatus`),
KEY `day` (`Sent_SMS_Day`),
KEY `month` (`Sent_SMS_Month`),
KEY `year` (`Sent_SMS_Year`),
KEY `index4` (`ApplicationID`,`SMSSentTimeTicks`),
KEY `smslength` (`SMSlength`),
KEY `studid` (`StudentID`),
KEY `batchid_studid` (`BatchID`,`StudentID`),
KEY `User_ID` (`User_ID`),
KEY `Year_Id` (`Year_ID`),
KEY `IsNotificationSent` (`IsNotificationSent`),
KEY `isgroup` (`IsGroup`),
KEY `SID` (`SID`),
KEY `SMS_Category_ID` (`SMS_Category_ID`),
KEY `SMSSentTimeTicks` (`SMSSentTimeTicks`)
) ENGINE=MyISAM AUTO_INCREMENT=16513292 DEFAULT CHARSET=utf8;
The following is my select query:
SELECT messagetext, SMSSentTime, StudentID, batchid,
User_ID,MessageID,Sent_SMS_Day, Sent_SMS_Month,
Sent_SMS_Year,Module_Id,Year_ID,Doc_Batch
FROM smstable_read
WHERE StudentID=977 AND SID = 8582 AND MessageID>16013282
You need to learn about compound indexes and covering indexes. Read about those things.
Your query is slow because it's doing a half-scan of the table. It uses the primary key to find the first row with a qualifying MessageID, then looks at every row of the table to find matching rows.
Your filter criteria are StudentID = constant, SID = constant AND MessageID > constant. That means you need those three columns, in that order, in an index. The first two filter criteria will random-access your index to the correct place. The third criterion will scan the index starting right after the constant value in your query. It's called an Index Range Scan operation, and it's quite efficient.
ALTER TABLE smstable_read
ADD INDEX StudentSidMessage (StudentId, SID, MessageId);
This compound index should make your query efficient. Notice that in MyISAM, the primary key column of a table should appear in compound indexes. That's cool in this case because it's also part of your query criteria.
If this query is used very frequently, you could make a covering index: you could add the other columns of the query (the ones mentioned in your SELECT clause) to the index.
But, unfortunately you have defined your messageText column with a longtext data type. That allows for each message to contain up to four gigabytes. (Why? Is this really SMS data? There's a limit of 160 bytes per message in SMS. Four gigabytes >> 160 bytes.)
Now the point of a covering index is to allow the query to be satisfied entirely from the index, without referring back to the table. But when you include a longtext or any other LOB column in an index, it only contains a subset of the data. So the point of the covering index is lost.
If I were you I would change my table so messageText was a VARCHAR(255) data type, and then create this covering index:
ALTER TABLE smstable_read
ADD INDEX StudentSidMessage (StudentId, SID, MessageId,
SMSSentTime, batchid,
User_ID, Sent_SMS_Day, Sent_SMS_Month,
Sent_SMS_Year,Module_Id,Year_ID,Doc_Batch,
messageText);
(Notice that you should put variable-length items last in the index if you can.)
If you can't change your application to handle VARCHAR(255) then go with the first index I mentioned.
Pro tip: putting lots of single-column indexes on MySQL tables rarely helps SELECT performance and always harms INSERT and UPDATE performance. You need an index on your primary key, and you need indexes to support the queries you run. Extra indexes are harmful.
It looks like your database is not properly indexed and even not properly normalized. Normalizing your database will go a long way to speed up all your queries. Particularly in view of the fact that mysql used only one index per table in a query. Even though you have lot's of indexes, they cannot be used.
Your current query filters on StudentID,SID, and MessageID. The last is an inequality comparision so an index will not be very effective with that but the other two columns are equality comparisons. I suggest an index like this:
KEY `studid` (`StudentID`,`SID`)
Follow that up by dropping your existing index on SID. If you find that you don't want to drop it because it's used in another query, further evidence that your table is in desperate need of normalization.
Too many indexes slow down inserts and adds a little overhead to each SELECT because the query planner needs more effort to figure out which index to use.

Should I be using multiple single-column indexes or a single multi-column index?

This is a pretty basic question, but I'm confused by what I'm reading in various places. I have a simple table that doesn't contain a huge amount of data (less than 500 rows for any given db is typical) A typical query against this table looks like :
select system_fields.name from system_fields where system_fields.form_id=? and system_fields.field_id=?
My question is, should I have a separate index for form_id and one for field_id, or should I be creating an index on a combination of those two fields? I've never really done anything with multi-column indexes before.
CREATE TABLE IF NOT EXISTS `system_fields` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`field_id` int(11) NOT NULL,
`form_id` int(11) NOT NULL,
`name` varchar(50) NOT NULL,
`reference_field_id` varchar(1000) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `field_id` (`field_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=293 ;
If you are always going to query by these two fields, then add a multi-column index.
I'll also point out that if you're going to have < 500 rows in the table, your index may not even get used. Any performance difference with or without an index on a 500-row table will be negligible.
Here's a bit more (good) reading:
https://www.percona.com/blog/2014/01/03/multiple-column-index-vs-multiple-indexes-with-mysql-56/

Mysql query optimisation, EXPLAIN and slow execution

Having some real issues with a few queries, this one inparticular. Info below.
tgmp_games, about 20k rows
CREATE TABLE IF NOT EXISTS `tgmp_games` (
`g_id` int(8) NOT NULL AUTO_INCREMENT,
`site_id` int(6) NOT NULL,
`g_name` varchar(255) NOT NULL,
`g_link` varchar(255) NOT NULL,
`g_url` varchar(255) NOT NULL,
`g_platforms` varchar(128) NOT NULL,
`g_added` datetime NOT NULL,
`g_cover` varchar(255) NOT NULL,
`g_impressions` int(8) NOT NULL,
PRIMARY KEY (`g_id`),
KEY `g_platforms` (`g_platforms`),
KEY `site_id` (`site_id`),
KEY `g_link` (`g_link`),
KEY `g_release` (`g_release`),
KEY `g_genre` (`g_genre`),
KEY `g_name` (`g_name`),
KEY `g_impressions` (`g_impressions`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
tgmp_reviews - about 200k rows
CREATE TABLE IF NOT EXISTS `tgmp_reviews` (
`r_id` int(8) NOT NULL AUTO_INCREMENT,
`site_id` int(6) NOT NULL,
`r_source` varchar(128) NOT NULL,
`r_date` date NOT NULL,
`r_score` int(3) NOT NULL,
`r_copy` text NOT NULL,
`r_link` text NOT NULL,
`r_int_link` text NOT NULL,
`r_parent` int(8) NOT NULL,
`r_platform` varchar(12) NOT NULL,
`r_impressions` int(8) NOT NULL,
PRIMARY KEY (`r_id`),
KEY `site_id` (`site_id`),
KEY `r_parent` (`r_parent`),
KEY `r_platform` (`r_platform`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ;
Here is the query, takes 3 seconds ish
SELECT * FROM tgmp_games g
RIGHT JOIN tgmp_reviews r ON g_id = r.r_parent
WHERE g.site_id = '34'
GROUP BY g_name
ORDER BY g_impressions DESC LIMIT 15
EXPLAIN
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE r ALL r_parent NULL NULL NULL 201133 Using temporary; Using filesort
1 SIMPLE g eq_ref PRIMARY,site_id PRIMARY 4 engine_comp.r.r_parent 1 Using where
I am just trying to grab the 15 most viewed games, then grab a single review (doesnt really matter which, I guess highest rated would be ideal, r_score) for each game.
Can someone help me figure out why this is so horribly inefficient?
I don't understand what is the purpose of having a GROUP BY g_name in your query, but this makes MySQL performing aggregates on the columns selected, or all columns from both table. So please try to exclude it and check if it helps.
Also, RIGHT JOIN makes database to query tgmp_reviews first, which is not what you want. I suppose LEFT JOIN is a better choice here. Please, try to change the join type.
If none of the first options helps, you need to redesign your query. As you need to obtain 15 most viewed games for the site, the query will be:
SELECT g_id
FROM tgmp_games g
WHERE site_id = 34
ORDER BY g_impressions DESC
LIMIT 15;
This is the very first part that should be executed by the database, as it provides the best selectivity. Then you can get the desired reviews for the games:
SELECT r_parent, max(r_score)
FROM tgmp_reviews r
WHERE r_parent IN (/*1st query*/)
GROUP BY r_parent;
Such construct will force database to execute the first query first (sorry for the tautology) and will give you the maximal score for each of the wanted games. I hope you will be able to use the obtained results for your purpose.
Your MyISAM table is small, you can try converting it to see if that resolves the issue. Do you have a reason for using MyISAM instead of InnoDB for that table?
You can also try running an analyze on each table to update the statistics to see if the optimizer chooses something different.

How to optimize slow query with many joins [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 4 years ago.
Improve this question
My situation:
the query searches around 90,000 vehicles
the query takes long each time
I already have indexes on all the fields being JOINed.
How can I optimise it?
Here is the query:
SELECT vehicles.make_id,
vehicles.fuel_id,
vehicles.body_id,
vehicles.transmission_id,
vehicles.colour_id,
vehicles.mileage,
vehicles.vehicle_year,
vehicles.engine_size,
vehicles.trade_or_private,
vehicles.doors,
vehicles.model_id,
Round(3959 * Acos(Cos(Radians(51.465436)) *
Cos(Radians(vehicles.gps_lat)) *
Cos(
Radians(vehicles.gps_lon) - Radians(
-0.296482)) +
Sin(
Radians(51.465436)) * Sin(
Radians(vehicles.gps_lat)))) AS distance
FROM vehicles
INNER JOIN vehicles_makes
ON vehicles.make_id = vehicles_makes.id
LEFT JOIN vehicles_models
ON vehicles.model_id = vehicles_models.id
LEFT JOIN vehicles_fuel
ON vehicles.fuel_id = vehicles_fuel.id
LEFT JOIN vehicles_transmissions
ON vehicles.transmission_id = vehicles_transmissions.id
LEFT JOIN vehicles_axles
ON vehicles.axle_id = vehicles_axles.id
LEFT JOIN vehicles_sub_years
ON vehicles.sub_year_id = vehicles_sub_years.id
INNER JOIN members
ON vehicles.member_id = members.id
LEFT JOIN vehicles_categories
ON vehicles.category_id = vehicles_categories.id
WHERE vehicles.status = 1
AND vehicles.date_from < 1330349235
AND vehicles.date_to > 1330349235
AND vehicles.type_id = 1
AND ( vehicles.price >= 0
AND vehicles.price <= 1000000 )
Here is the vehicle table schema:
CREATE TABLE IF NOT EXISTS `vehicles` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`number_plate` varchar(100) NOT NULL,
`type_id` int(11) NOT NULL,
`make_id` int(11) NOT NULL,
`model_id` int(11) NOT NULL,
`model_sub_type` varchar(250) NOT NULL,
`engine_size` decimal(12,1) NOT NULL,
`vehicle_year` int(11) NOT NULL,
`sub_year_id` int(11) NOT NULL,
`mileage` int(11) NOT NULL,
`fuel_id` int(11) NOT NULL,
`transmission_id` int(11) NOT NULL,
`price` decimal(12,2) NOT NULL,
`trade_or_private` tinyint(4) NOT NULL,
`postcode` varchar(25) NOT NULL,
`gps_lat` varchar(50) NOT NULL,
`gps_lon` varchar(50) NOT NULL,
`img1` varchar(100) NOT NULL,
`img2` varchar(100) NOT NULL,
`img3` varchar(100) NOT NULL,
`img4` varchar(100) NOT NULL,
`img5` varchar(100) NOT NULL,
`img6` varchar(100) NOT NULL,
`img7` varchar(100) NOT NULL,
`img8` varchar(100) NOT NULL,
`img9` varchar(100) NOT NULL,
`img10` varchar(100) NOT NULL,
`is_featured` tinyint(4) NOT NULL,
`body_id` int(11) NOT NULL,
`colour_id` int(11) NOT NULL,
`doors` tinyint(4) NOT NULL,
`axle_id` int(11) NOT NULL,
`category_id` int(11) NOT NULL,
`contents` text NOT NULL,
`date_created` int(11) NOT NULL,
`date_edited` int(11) NOT NULL,
`date_from` int(11) NOT NULL,
`date_to` int(11) NOT NULL,
`member_id` int(11) NOT NULL,
`inactive_id` int(11) NOT NULL,
`status` tinyint(4) NOT NULL,
PRIMARY KEY (`id`),
KEY `type_id` (`type_id`),
KEY `make_id` (`make_id`),
KEY `model_id` (`model_id`),
KEY `fuel_id` (`fuel_id`),
KEY `transmission_id` (`transmission_id`),
KEY `body_id` (`body_id`),
KEY `colour_id` (`colour_id`),
KEY `axle_id` (`axle_id`),
KEY `category_id` (`category_id`),
KEY `vehicle_year` (`vehicle_year`),
KEY `mileage` (`mileage`),
KEY `status` (`status`),
KEY `date_from` (`date_from`),
KEY `date_to` (`date_to`),
KEY `trade_or_private` (`trade_or_private`),
KEY `doors` (`doors`),
KEY `price` (`price`),
KEY `engine_size` (`engine_size`),
KEY `sub_year_id` (`sub_year_id`),
KEY `member_id` (`member_id`),
KEY `date_created` (`date_created`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=136237 ;
The EXPLAIN:
1 SIMPLE vehicles ref type_id,make_id,status,date_from,date_to,price,mem... type_id 4 const 85695 Using where
1 SIMPLE members index PRIMARY PRIMARY 4 NULL 3 Using where; Using index; Using join buffer
1 SIMPLE vehicles_makes eq_ref PRIMARY PRIMARY 4 tvs.vehicles.make_id 1 Using index
1 SIMPLE vehicles_models eq_ref PRIMARY PRIMARY 4 tvs.vehicles.model_id 1 Using index
1 SIMPLE vehicles_fuel eq_ref PRIMARY PRIMARY 4 tvs.vehicles.fuel_id 1 Using index
1 SIMPLE vehicles_transmissions eq_ref PRIMARY PRIMARY 4 tvs.vehicles.transmission_id 1 Using index
1 SIMPLE vehicles_axles eq_ref PRIMARY PRIMARY 4 tvs.vehicles.axle_id 1 Using index
1 SIMPLE vehicles_sub_years eq_ref PRIMARY PRIMARY 4 tvs.vehicles.sub_year_id 1 Using index
1 SIMPLE vehicles_categories eq_ref PRIMARY PRIMARY 4 tvs.vehicles.category_id 1 Using index
Improving the WHERE clause
Your EXPLAIN shows that MySQL is only utilizing one index (type_id) for selecting the rows that match the WHERE clause, even though you have multiple criteria in the clause.
To be able to utilize an index for all of the criteria in the WHERE clause, and to reduce the size of the result set as quickly as possible, add a multi-column index on the following columns on the vehicles table:
(status, date_from, date_to, type_id, price)
The columns should be in order of highest cardinality to least.
For example, vehicles.date_from is likely to have more distinct values than status, so put the date_from column before status, like this:
(date_from, date_to, price, type_id, status)
This should reduce the rows returned in the first part of the query execution, and should be demonstrated with a lower row count on the first line of the EXPLAIN result.
You will also notice that MySQL will use the multi-column index for the WHERE in the EXPLAIN result. If, by chance, it doesn't, you should hint or force the multi-column index.
Removing the unnecessary JOINs
It doesn't appear that you are using any fields in any of the joined tables, so remove the joins. This will remove all of the additional work of the query, and get you down to one, simple execution plan (one line in the EXPLAIN result).
Each JOINed table causes an additional lookup per row of the result set. So, if the WHERE clause selects 5,000 rows from vehicles, since you have 8 joins to vehicles, you will have 5,000 * 8 = 40,000 lookups. That's a lot to ask from your database server.
Instead of expensive calculation of precise distance for all of the rows use a bounding box and calculate the exact distance only for rows inside the box.
The simplest possible example is to calculate min/max longitude and latitude that interests you and add it to WHERE clause. This way the distance will be calculated only for a subset of rows.
WHERE
vehicles.gps_lat > min_lat ANDd vehicles.gps_lat < max_lat AND
vehicles.gps_lon > min_lon AND vehicles.gps_lon < max_lon
For more complex solutions see:
MySQL spatial extensions
How to use MySQL spatial extensions
https://stackoverflow.com/a/5237509/342473
Is you SQL faster without this?
Round(3959 * Acos(Cos(Radians(51.465436)) *
Cos(Radians(vehicles.gps_lat)) *
Cos(Radians(vehicles.gps_lon) -
Radians(-0.296482)) +
Sin(Radians(51.465436)) *
Sin(Radians(vehicles.gps_lat)))) AS distance
performing math equation is very expensive
Maybe you should consider a materialized view that pre-calculate you distance, and you can select from that view. Depending on how dynamic you data is, you may not have to refresh you data too often.
To be a little more specific than #Randy of indexes, I believe his intention was to have a COMPOUND index to take advantage of your querying critieria... One index that is built on a MINIMUM of ...
( status, type_id, date_from )
but could be extended to include the date_to and price too, but don't know how much the index at that granular level might actually help
( status, type_id, date_from, date_to, price )
EDIT per Comments
You shouldn't need all those individual indexes... Yes, the Primary Key by itself. However, for the others, you should have compound indexes based on what your common querying criteria might be and remove the others... the engine might get confused on which might be best suited for the query. If you know you are always looking for a certain status, type and date (assuming vehicle searches), make that as one index. If the query is looking for such information, but also prices within that criteria it will already be very close on the few indexed records that qualify and fly through the price as just an extra criteria.
If you offer querying like Only Automatic vs Manual transmission regardless of year/make, then yes, that could be an index of its own. However, if you would TYPICALLY have some other "common" criteria, tack that on as a secondary that MAY be utilized in the query. Ex: if you look for Manual Transmissions that are 2-door vs 4-door, have your index on (transmission_id, category_id).
Again, you want whatever will help narrow down the field of criteria based on some "minimum" condition. If you tack on an extra column to the index that might "commonly" be applied, that should only help the performance.
To clarify this as an answer: if you do not already have these indexes, you should consider adding them
do you also have indexes on these:
vehicles.status
vehicles.date_from
vehicles.date_to
vehicles.type_id
vehicles.price

Optimization of a query with GROUP BY clause by using indexes

I need to optimize indexes in a table that stores more than 10 Millions rows. The query that is particularly time consuming takes up to 10 seconds to load (when WHERE clause filters only about 2 Millions rows - 8 Millions must be grouped). I have created a few indexes (some of them are complex, some simpler) and tried to find out how to speed this up. Perhaps I'm doing something wrong. MySQL is using optimized_5 index (based on EXPLAIN).
Here is the table's structure and the query:
CREATE TABLE IF NOT EXISTS `geo_reverse` (
`fid` mediumint(8) unsigned NOT NULL,
`tablename` enum('table1','table2') NOT NULL default 'table1',
`geo_continent` varchar(2) NOT NULL,
`geo_country` varchar(2) NOT NULL,
`geo_region` varchar(8) NOT NULL,
`geo_city` mediumint(8) unsigned NOT NULL,
`type` varchar(30) NOT NULL,
PRIMARY KEY (`fid`,`tablename`,`geo_continent`,`geo_country`,`geo_region`,`geo_city`),
KEY `geo_city` (`geo_city`),
KEY `fid` (`fid`),
KEY `geo_region` (`geo_region`,`geo_city`),
KEY `optimized` (`tablename`,`type`,`geo_continent`,`geo_country`,`geo_region`,`geo_city`,`fid`),
KEY `optimized_2` (`fid`,`tablename`),
KEY `optimized_3` (`type`,`geo_city`),
KEY `optimized_4` (`geo_city`,`tablename`),
KEY `optimized_5` (`tablename`,`type`,`geo_city`),
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
An example query:
SELECT type, COUNT(*) AS objects FROM geo_reverse WHERE tablename = 'table1' AND geo_city IN (5847207,5112771,4916894,...) GROUP BY type
Do you have any idea of how to speed the computation up?
i would use the following index: (geo_city, tablename, type) - geo_city is obviously more selective than tablename, thus it should be on the left. After the condition is applied, the rest should be sorted by type for grouping.