I am finding that a SELECT COUNT(*) takes considerably longer than a SELECT * for queries with the same WHERE clause.
The table in question has about 2.2 million records (call it detailtable). It has a foreign key field linking to another table (maintable).
This query takes about 10-15 seconds:
select count(*) from detailtable where maintableid = 999
But this takes a second or less:
select * from detailtable where maintableid = 999
UPDATE: I was asked to specify the number of records involved. It is 150.
UPDATE 2: Here is the information when the EXPLAIN keyword is used.
For the SELECT COUNT(*), the Extra column reports:
Using where; Using index
KEY and POSSIBLE KEYS both have the foreign key constraint as their value.
For the SELECT * query, everything is the same except the Extra column just says:
Using where
UPDATE 3: I tried OPTIMIZE TABLE and it still makes no difference.
For sure
select count(*)
should be faster than
select *
count(*), count(1), and count(primary key) all count the same rows; count(field) matches them only when the field can never be NULL, since it skips NULL values.
Your EXPLAIN clearly states that the optimizer somehow uses the index for count(*) and not for the other query, making the foreign key index the main suspect for the delay.
Eliminate the foreign key.
Try
select count(PRIKEYFIELD) from detailtable where maintableid = 999
The theory is that count(*) will fetch all the data from the table and then count the rows, meaning it has more work to do.
Using the primary key field means it's using its index, and should run faster.
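If you want to check whether the two forms really get different plans, you can EXPLAIN both (PRIKEYFIELD stands in for the actual primary key column, as above):
EXPLAIN SELECT count(*) FROM detailtable WHERE maintableid = 999;
EXPLAIN SELECT count(PRIKEYFIELD) FROM detailtable WHERE maintableid = 999;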
Thread Necro!
Crazy idea... In some cases, depending on the query planner, the table size, and so on, it is possible for using an index to actually be slower than not using one. So if you get your count without using an index, in some cases it could actually be faster.
Try this:
SELECT count(*)
FROM detailtable
USE INDEX ()
WHERE maintableid = 999
SELECT count(*)
with that syntax alone is no problem; you can run it on any table.
The main issue in your scenario is the proper use of an index and how the WHERE clause is applied to your search.
Try to reconfigure your indexes if you have the chance.
If the table is very big, yes, it may take time. Also check the article on MyISAM locking.
As the table has 2.2 million records, the count can take time: technically, MySQL must first find the matching records and then count them, and this extra operation becomes significant with millions of records. The only way to make it faster is to cache the result in another table and update it behind the scenes.
Or simply try
SELECT count(1) FROM table_name WHERE _condition;
SELECT count('x') FROM table_name WHERE _condition;
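For the caching approach described above, a minimal sketch might look like this (the counter table and trigger names are hypothetical):
CREATE TABLE detail_counts (
    maintableid INT PRIMARY KEY,
    cnt INT NOT NULL DEFAULT 0
);

-- Keep the cached count current behind the scenes:
CREATE TRIGGER detail_counts_ins AFTER INSERT ON detailtable
FOR EACH ROW
    INSERT INTO detail_counts (maintableid, cnt)
    VALUES (NEW.maintableid, 1)
    ON DUPLICATE KEY UPDATE cnt = cnt + 1;

CREATE TRIGGER detail_counts_del AFTER DELETE ON detailtable
FOR EACH ROW
    UPDATE detail_counts SET cnt = cnt - 1
    WHERE maintableid = OLD.maintableid;
The count then becomes a primary-key lookup: SELECT cnt FROM detail_counts WHERE maintableid = 999;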
I have a problem with MySQL ORDER BY: it slows down the query and I really don't know why. My query was a little more complex, so I simplified it to a light query with no joins, but it still runs really slowly.
Query:
SELECT
    W.`oid`
FROM
    `z_web_dok` AS W
WHERE
    W.`sent_eRacun` = 1 AND W.`status` IN (8, 9) AND W.`Drzava` = 'BiH'
ORDER BY W.`oid` ASC
LIMIT 0, 10
The table has 946,566 rows and takes about 500 MB of memory. The fields I am selecting are all indexed as follows:
oid - INT PRIMARY KEY AUTO_INCREMENT
status - INT INDEXED
sent_eRacun - TINYINT INDEXED
Drzava - VARCHAR(3) INDEXED
I am posting screenshots of the EXPLAIN output first:
Next is the query executed against the database:
And this is the speed after I remove the ORDER BY.
I have also tried sorting by a DATETIME field, which is also indexed, but I get the same slow query as when ordering by the primary key. This started today; it was always fast and light before.
What can cause something like this?
The kind of query you use here calls for a composite covering index. This one should handle your query very well.
CREATE INDEX someName ON z_web_dok (Drzava, sent_eRacun, status, oid);
Why does this work? You're looking for equality matches on the first three columns, and sorting on the fourth column. The query planner will use this index to satisfy the entire query. It can random-access the index to find the first row matching your query, then scan through the index in order to get the rows it needs.
Pro tip: Indexes on single columns are generally harmful to performance unless they happen to match the requirements of particular queries in your application, or are used for primary or foreign keys. You generally choose your indexes to match your most active, or your slowest, queries. Edit: You asked whether it's better to create specific indexes for each query in your application. The answer is yes.
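After creating the index, you could re-run EXPLAIN on the original query; you'd expect the key column to show someName and, because the index covers every column the query touches, Extra to include Using index:
EXPLAIN SELECT
    W.`oid`
FROM
    `z_web_dok` AS W
WHERE
    W.`sent_eRacun` = 1 AND W.`status` IN (8, 9) AND W.`Drzava` = 'BiH'
ORDER BY W.`oid` ASC
LIMIT 0, 10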
There may be an even faster way. (Or it may not be any faster.)
The IN(8, 9) gets in the way of easily handling the WHERE..ORDER BY..LIMIT completely efficiently. The possible solution is to treat that as OR, then convert to UNION and do some tricks with the LIMIT, especially if you might also be using OFFSET.
( SELECT ... WHERE .. = 8 AND ... ORDER BY oid LIMIT 10 )
UNION ALL
( SELECT ... WHERE .. = 9 AND ... ORDER BY oid LIMIT 10 )
ORDER BY oid LIMIT 10
This will allow the covering index described by OJones to be fully used in each of the subqueries. Furthermore, each will provide up to 10 rows without any temp table or filesort. Then the outer part will sort up to 20 rows and deliver the 'correct' 10.
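Filled in with the question's actual table and columns, the sketch might look like this (assuming the covering index from the previous answer is in place):
( SELECT W.`oid` FROM `z_web_dok` AS W
  WHERE W.`Drzava` = 'BiH' AND W.`sent_eRacun` = 1 AND W.`status` = 8
  ORDER BY W.`oid` ASC LIMIT 10 )
UNION ALL
( SELECT W.`oid` FROM `z_web_dok` AS W
  WHERE W.`Drzava` = 'BiH' AND W.`sent_eRacun` = 1 AND W.`status` = 9
  ORDER BY W.`oid` ASC LIMIT 10 )
ORDER BY oid ASC LIMIT 10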
For OFFSET, see http://mysql.rjweb.org/doc.php/index_cookbook_mysql#or
For example, I have the following table:
table Product
------------
id
category_id
processed
product_name
This table has indexes on the columns id, category_id, and processed, plus a composite index on (category_id, processed). The statistics on this table are:
select count(*) from Product; -- 50M records
select count(*) from Product where category_id=10; -- 1M records
select count(*) from Product where processed=1; -- 30M records
The simplest query I want to run is (select * is a must):
select * from Product
where category_id=10 and processed=1
order by id ASC LIMIT 100
The above query without the LIMIT matches only about 10,000 records.
I want to run the above query multiple times. Each time I fetch rows, I will update the processed field to 0 (so those rows will not appear in the next query). When I test on the real data, sometimes the optimizer tries to use id as the key, which costs a lot of time.
How can I optimize the above query (in general terms)?
P.S.: To avoid confusion, I know the best index would be (category_id, processed, id), but I cannot change the indexes. My question is only about optimizing the query.
Thanks
For this query:
select *
from Product
where category_id = 10 and processed = 1
order by id asc
limit 100;
The optimal index is on product(category_id, processed, id). This is a single index with a three-part key, with the keys in this order.
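In DDL form, that would be (the index name is made up):
CREATE INDEX idx_cat_proc_id ON Product (category_id, processed, id);
Note that on InnoDB the primary key is implicitly appended to every secondary index, so the existing INDEX(category_id, processed) already behaves like this; listing id just makes the intent explicit.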
Given that you have INDEX(category_id, processed), there is virtually no advantage in also having just INDEX(category_id). So DROP the latter.
That may have the beneficial side effect of pushing the Optimizer toward the composite INDEX(category_id, processed), which is at least "better" for the query.
Without touching the indexes, you could use a FORCE INDEX mentioning the composite index's name. But I don't recommend it. "It may help today, but hurt tomorrow, after the data changes."
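For reference, if you did try the hint despite that caveat, the syntax would be roughly this (idx_cat_proc stands in for the composite index's real name):
SELECT *
FROM Product FORCE INDEX (idx_cat_proc)
WHERE category_id = 10 AND processed = 1
ORDER BY id ASC
LIMIT 100;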
Why do you say "But I cannot change the index"? Newer versions of MySQL/MariaDB make ADD/DROP INDEX much faster than older versions. Also, pt-online-schema-change provides a fast way.
I have a query like this:
( SELECT * FROM mytable WHERE author_id = ? AND seen IS NULL )
UNION
( SELECT * FROM mytable WHERE author_id = ? AND date_time > ? )
Also I have these two indexes:
(author_id, seen)
(author_id, date_time)
I read somewhere:
A query can generally only use one index per table when processing the WHERE clause
As you can see in my query, there are two separate WHERE clauses. So I want to know: does "only one index per table" mean my query can use just one of those two indexes, or can it use one index for each subquery, making both indexes useful?
In other words, is this sentence true?
"Always one of those indexes will be used, and the other one is useless."
That statement about only using one index is no longer true of MySQL. For instance, it implements the index merge optimization, which can take advantage of two indexes for some WHERE clauses that contain OR. Here is a description in the documentation.
You should try this form of your query and see if it uses index merge:
SELECT *
FROM mytable
WHERE author_id = ? AND (seen IS NULL OR date_time > ? );
This should be more efficient than the union version, because it does not incur the overhead of removing duplicates.
Also, depending on the distribution of your data, the above query with an index on mytable(author_id, date_time, seen) might work as well or better than your version.
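That alternative index might be sketched as follows (the name is made up):
CREATE INDEX idx_author_dt_seen ON mytable (author_id, date_time, seen);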
UNION combines the results of subqueries. Each subquery is executed independently of the others, and then the results are merged. So, in this case, the WHERE limits are applied to each subquery, not to the united result.
In answer to your question: yes, each subquery can use its own index.
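You can confirm this by running EXPLAIN on the whole union; each row of the output shows the key chosen for its subquery (the literal values below are placeholders):
EXPLAIN
( SELECT * FROM mytable WHERE author_id = 1 AND seen IS NULL )
UNION
( SELECT * FROM mytable WHERE author_id = 1 AND date_time > '2024-01-01' );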
There are cases when the database engine can use more than one index for a single SELECT statement; however, when filtering one set of rows it is usually not possible. If you want to use indexing on two columns, build one composite index on both columns instead of two separate indexes.
Every subquery or part of a composite query is itself a query, and can be evaluated as a single query for performance and index access. You can also force the use of a different index for each query. In your case you are using UNION, so these are two separate queries, united into one resulting query.
For a brief guide on how MySQL uses indexes, see:
http://dev.mysql.com/doc/refman/5.7/en/mysql-indexes.html
I have a query running in MySQL, like so (names obfuscated):
explain
select this_.id as id1_0_,
this_.column1 as column1,
this_.column2 as column2,
this_.column3 as column3,
this_.column4 as column4,
this_.column5 as column5
from
tablename this_
where
this_.column1 like '/blah%'
and this_.column2 = 'a9b51a14-4338-94f7-f23dbf9d539e'
and this_.column3 <> 'DUH'
and this_.column4=0
and this_.column5 like '%somename%'
order by this_.created desc
limit 20
Edit: column1 has a BTREE index; column2, column3, column4, column5, and created all have HASH indexes.
Table has one foreign key which is selected in the select clause but not the WHERE clause.
I'm told, and have read, that
like '%somename%'
will result in a full table scan. However, when I ran the EXPLAIN, its output was:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE this_ ref somecolumnandinexnames 111 const 30414 Using where
The explain output looks exactly the same if I take away the like clause.
Based on this, we decided to put the query in production, only to discover that in practice the query with the LIKE took way longer to execute (a few seconds, compared to a few milliseconds without it).
Is there an explanation as to why the explain didn't warn me beforehand about this?
Edit: Observations
Taking away the ORDER BY makes the query fast again, even with the LIKE still in there.
Splitting it into a subquery with the LIKE in the outer query, as mentioned in the post below, actually works!
As @Uueerdo says, moving the rest of the conditions into a subquery actually speeds up performance! I'm therefore tempted to conclude that the WHERE condition with the LIKE is evaluated before the other conditions, leading to a large result set. However, I still don't have an explanation for why removing the ORDER BY speeds up performance; the query selects all of 10 rows, so the ORDER BY should be quite fast.
Is there a way I can see the order in which MySQL evaluates the query? I think I remember seeing some sort of graphical representation in MS SQL Server explain plans once; I don't remember if it was quite the same.
Even if it does not require a table scan, it can still be expensive. What is likely happening is that MySQL uses the other conditions in your WHERE for initial candidate row selection, and then reduces those results with the rest of the conditions.
If there are a large number of candidates, and/or the column5 values are long, that condition can take some time to evaluate. Keep in mind the LIMIT is applied after the WHERE, so it does not reduce the amount of work needed.
You might see some improvement if you put most of the query in a subquery and filter its results with the like '%somename%' condition in an outer query.
Something like:
SELECT * FROM (
SELECT t.id as id1_0_
, t.column1, t.column2, t.column3, t.column4, t.column5
, t.created
FROM tablename AS t
WHERE t.column2 = 'a9b51a14-4338-94f7-f23dbf9d539e'
AND t.column3 <> 'DUH'
AND t.column4=0
AND t.column1 like '/blah%'
) AS subQ
WHERE subQ.column5 like '%somename%'
ORDER BY subQ.created DESC
LIMIT 20
If you have an index that includes all of the columns being read (in the SELECT or WHERE clauses), MySQL will be able to read all of those values from the index without scanning the table.
Also note that all primary key columns are included in the index as well.
In that case, even though it isn't scanning the full table, it still has to scan every row in the index to handle the LIKE '%somename%' condition, so it's not going to be especially more efficient.
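For illustration only, a covering index for the obfuscated query might be sketched like this (the column roles are assumed from the query above, the name is made up, and long VARCHAR columns may need prefix lengths, which would break the covering property):
CREATE INDEX idx_cover
ON tablename (column2, column4, column1, column3, column5, created);
EXPLAIN would then report Using index, but the LIKE '%somename%' filter still has to examine every index entry that matches the other conditions.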
I have this query (which I didn't write) that was working fine for a client until the table got more than a few thousand rows; now it's taking 40+ seconds on only 4,200 rows.
Any suggestions on how to optimize it and get the same result?
I've tried a few other approaches, but they didn't return the correct result that this slower query does...
SELECT COUNT(*) AS num
FROM `fl_events`
WHERE id IN(
SELECT DISTINCT (e2.id)
FROM `fl_events` AS e1, fl_events AS e2
WHERE e1.startdate >= now() AND e1.startdate = e2.startdate
)
ORDER BY `startdate`
Any help would be greatly appreciated!
Apart from the obvious indexes needed, I don't really see why you are joining the table with itself to build the IN condition. The ORDER BY is also not needed, since a single COUNT has nothing to order. Are you sure your query can't be written just like this?:
SELECT COUNT(*) AS num
FROM `fl_events` AS e1
WHERE e1.startdate >= now()
I don't think rewriting the query will help. The key to your question is "until the table got more than a few thousand rows." This implies that important columns aren't indexed. Prior to a certain number of records, all the data fits in a single memory block; over that point, it takes more blocks, and an index is the only way to speed up the search.
First, check that id in fl_events is actually marked as a primary key. That physically orders the records, and without it you can see data corruption and occasionally super-slow results. The use of DISTINCT in the query makes it look like id might NOT be a unique value, and that would pose a problem.
Then, make sure to add an index on startdate.
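That could be as simple as (the index name is made up):
CREATE INDEX idx_startdate ON fl_events (startdate);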
The slowness is probably related to the join of the events table with itself, and possibly to startdate not having an index.