I have a very simple MySQL table with 5 columns but over 5 million rows. When there was less data the server load was low, but the load is increasing now that the table has passed 5 million rows, and I expect it to reach 10 million by the end of this year, so the server will get even slower. I have used indexes wisely.
The structure is very simple, with id as an auto-increment primary key, and I filter the data based on id only, which is automatically indexed since it is the primary key (I tried indexing it explicitly as well, but it made no difference).
table A
id pid title app get
my query is
EXPLAIN SELECT * FROM tableA ORDER BY id DESC LIMIT 4061280 , 10
and explain says
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE tableA ALL NULL NULL NULL NULL 4700461 Using filesort
I don't want to go through all the rows, as that slows my server down and creates heavy load from the filesort, which builds temporary files either in the buffer or on disk.
Please advise any good idea to solve this issue.
When my id is indexed, why does MySQL go through all the rows to reach the desired one? Can't it jump directly to that row?
Assuming you don't have "gaps" (read: deleted records) in your id:
SELECT * FROM tableA WHERE id > 4061279 and id <= 4061290 ORDER BY id DESC
OK, next:
SELECT * FROM tableA WHERE id <= 4061290 ORDER BY id DESC LIMIT 10
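If you page through the table sequentially and don't know the boundary id in advance, keyset pagination avoids the offset altogether. A sketch (last_seen_id is a placeholder for the lowest id returned by the previous page, not a column from the question):

```sql
-- Fetch the next page of 10 rows older than the last one already shown.
-- Because id is the PRIMARY KEY, MySQL can seek straight to last_seen_id
-- instead of reading and discarding millions of offset rows.
SELECT *
FROM tableA
WHERE id < last_seen_id
ORDER BY id DESC
LIMIT 10;
```

The trade-off is that you can only step page by page; you cannot jump directly to page N.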
Related
We recently moved our database from MariaDB to AWS Amazon Aurora RDS (MySQL). We observed something strange in a set of queries: two queries that are individually very quick take ages to finish when combined as a nested subquery.
Here id is the primary key of the table
SELECT * FROM users where id in(SELECT max(id) FROM users where id = 1);
execution time is ~350ms
SELECT * FROM users where id in(SELECT id FROM users where id = 1);
execution time is ~130ms
SELECT max(id) FROM users where id = 1;
execution time is ~130ms
SELECT id FROM users where id = 1;
execution time is ~130ms
We believe it has something to do with the type of value returned by MAX, causing the index to be ignored when the outer query runs against the results of the subquery.
All the above queries are simplified for illustration of the problem. The original queries have more clauses as well as 100s of millions of rows. The issue did not exist prior to the migration and worked fine in MariaDB.
--- RESULTS FROM MariaDB ---
MySQL seems to optimize less efficiently than MariaDB (in this case).
When doing this in MySQL (see: DBFIDDLE1), the execution plans look like:
For the query without MAX:
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE integers null const PRIMARY PRIMARY 4 const 1 100.00 Using index
1 SIMPLE integers null const PRIMARY PRIMARY 4 const 1 100.00 Using index
For the query with MAX:
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 PRIMARY integers null index null PRIMARY 4 null 1000 100.00 Using where; Using index
2 DEPENDENT SUBQUERY null null null null null null null null null Select tables optimized away
While MariaDB (see: DBFIDDLE2) does have a better-looking plan when using MAX:
id select_type table type possible_keys key key_len ref rows filtered Extra
1 PRIMARY system null null null null 1 100.00
1 PRIMARY integers const PRIMARY PRIMARY 4 const 1 100.00 Using index
2 MATERIALIZED null null null null null null null null Select tables optimized away
EDIT: Because of time (some lack of it 😉) I am now adding some info.
A suggestion to fix this:
SELECT *
FROM integers
WHERE i IN (select * from (SELECT MAX(i) FROM integers WHERE i=1)x);
Looking at the execution plan from MariaDB, which has one extra step, I tried to do the same in MySQL. The above query has an even bigger execution plan, but tests show that it performs better (for the explain plans, see DBFIDDLE1a).
"the question is Mariadb that much faster? it uses a step more that mysql"
One step more does not mean that things get slower.
MySQL takes about 2-3 seconds on the query using the MAX, and MariaDB does execute the same in under 10 msecs. But this is performance, and time may vary on different systems.
SELECT max(id) FROM users where id = 1
is strange. Since it is looking only at rows where id = 1, the MAX is obviously 1. So is the MIN. And the average.
Perhaps you wanted:
SELECT max(id) FROM users
Is there an index on id? Perhaps the PRIMARY KEY? If not, then that might explain the sluggishness.
This can be done much faster (again assuming an index):
SELECT * FROM users
ORDER BY id DESC
LIMIT 1
Does that give you what you want?
To discuss this further, please provide SHOW CREATE TABLE users
I have a table with ~ 9 million records
Structure
id int PK AI
pa_id int
cha_id smallint
cha_level tinyint
cha_points mediumint
cha_points_till smallint
cha_points_from mediumint
cha_points_date datetime
My query
select max(cha_points) as highest,cha_id,count(id) as entry_count,
sum(cha_points) as total_points
from playeraccounts_cha_masteries
group by cha_id
order by total_points desc
My indexes
playeraccounts_cha_masteries 0 PRIMARY 1 id A 9058483 NULL NULL BTREE
playeraccounts_cha_masteries 1 cha_id 1 cha_id A 9 NULL NULL BTREE
playeraccounts_cha_masteries 1 pa_id 1 pa_id A 156270 NULL NULL BTREE
playeraccounts_cha_masteries 1 cha_points 1 cha_points A 166100 NULL NULL BTREE
The index on pa_id has its use in a different query.
Explain
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 simple m null range PRIMARY,cha_id PRIMARY 4 NULL 9164555 100.00 Using where; Using temporary; Using filesort
Is there any way I can still speed up the query?
You have 3 options:
Speed up the existing query
Create a composite index on cha_id and cha_points fields, change count(id) to count(*) or count(cha_id), and test again. You may have to play with the order of fields in the index. Check with explain if the covering index is used.
By changing count(id) to count(*) or count(cha_id) you eliminate the need to check the id column. Since you use that count to return the number of records within each cha_id group, it is safe to replace the reference to id field with * or cha_id.
Creating a composite index on the cha_id and cha_points fields will result in a covering index, meaning all the fields required by the query are in a single index, so the query does not have to scan the entire table.
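A sketch of the suggested covering index and rewritten query (the index name idx_cha_points is my own choice; verify with EXPLAIN that the index is actually used):

```sql
-- Composite index covering both the GROUP BY column and the aggregated column.
ALTER TABLE playeraccounts_cha_masteries
  ADD INDEX idx_cha_points (cha_id, cha_points);

-- COUNT(*) instead of COUNT(id), so every referenced field lives in the index.
SELECT MAX(cha_points) AS highest,
       cha_id,
       COUNT(*)        AS entry_count,
       SUM(cha_points) AS total_points
FROM playeraccounts_cha_masteries
GROUP BY cha_id
ORDER BY total_points DESC;
```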
Create a separate statistics table and update it with triggers
Create a separate statistics table for playeraccounts_cha_masteries. You can use triggers to update counts, maximums, and totals. The page would query the statistics table instead of the playeraccounts_cha_masteries table. This solution may slow inserts / updates / deletes down, since each data modification transaction has to be serialised, so that the statistics table is properly updated.
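One possible shape for such a trigger-maintained statistics table (all names here are mine, and a complete solution would also need DELETE and UPDATE triggers to keep the counts honest):

```sql
CREATE TABLE cha_stats (
  cha_id       SMALLINT  PRIMARY KEY,
  entry_count  INT       NOT NULL,
  total_points BIGINT    NOT NULL,
  highest      MEDIUMINT NOT NULL
);

DELIMITER //
CREATE TRIGGER cha_masteries_after_insert
AFTER INSERT ON playeraccounts_cha_masteries
FOR EACH ROW
BEGIN
  -- Create or update the per-character row in a single statement.
  INSERT INTO cha_stats (cha_id, entry_count, total_points, highest)
  VALUES (NEW.cha_id, 1, NEW.cha_points, NEW.cha_points)
  ON DUPLICATE KEY UPDATE
    entry_count  = entry_count + 1,
    total_points = total_points + NEW.cha_points,
    highest      = GREATEST(highest, NEW.cha_points);
END//
DELIMITER ;
```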
Create a separate statistics table and update it periodically
Create a separate statistics table, but instead of using triggers to keep it constantly updated, use scheduled job (OS or mysql level) to periodically update the table with the latest statistics. This would mean that the stats will be out of sync for a while, but this may be a reasonable compromise, if an acceptable refresh period can be found.
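At the MySQL level this can be done with the event scheduler. A sketch, assuming a statistics table cha_stats(cha_id, entry_count, total_points, highest) and an arbitrary 15-minute refresh period:

```sql
-- The event scheduler must be enabled for events to run.
SET GLOBAL event_scheduler = ON;

CREATE EVENT refresh_cha_stats
ON SCHEDULE EVERY 15 MINUTE
DO
  -- Rebuild the per-character statistics from scratch on every run.
  REPLACE INTO cha_stats (cha_id, entry_count, total_points, highest)
  SELECT cha_id, COUNT(*), SUM(cha_points), MAX(cha_points)
  FROM playeraccounts_cha_masteries
  GROUP BY cha_id;
```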
You can even take this approach one step further, and instead of generating a separate statistics table, you can generate a static html file with appropriate expiry set in its headers with the statistics. This way the server has only to serve the static file for the statistics.
My table has 1,000,000 rows and 4 columns:
id cont stat message
1 rgrf 0 ttgthyhtg
2 frrgt 0 tthyrt
3 4r44 1 rrttttg
...
I am performing a SELECT query which is very slow even though I have indexed the table:
SELECT * FROM tablea WHERE stat='0' order by id LIMIT 1
This query is making my mysql very slow, I checked with mysql explain and found this
explain SELECT * FROM tablea WHERE stat='0' order by id LIMIT 1
and I was shocked by the output but I don't know how to optimize it.
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE tablea ref stat stat 4 const 216404 Using where
The query examines 216,404 rows; I need to get that down to 1 or 2, but how?
The problem is that MySQL generally uses only one index per table in a query, and here it is the stat index. So the ORDER BY is performed without the help of an index, which is very slow on 1M rows.
Try the following:
Explicitly hint at the correct index:
SELECT * FROM tablea USE INDEX(PRIMARY) WHERE stat='0' order by id LIMIT 1
Create a composite index, just as Ollie Jones said above.
I suggest you try creating a compound index on (stat, id). This may allow your search / order operation to be optimized. There's a downside, of course: you'll incur extra overhead with insertions and updates.
CREATE INDEX stat_id ON tablea (stat, id) USING BTREE;
Give it a try.
I have a relatively large table (5,208,387 rows, 400 MB data / 670 MB index),
and all the columns I search on are indexed.
name and type are VARCHAR(255) with BTREE indexes,
and sdate is an INTEGER column containing timestamps.
I fail to understand some issues.
First, this query is very slow (5 s):
SELECT *
FROM `mytable`
WHERE `name` LIKE 'hello%my%big%text%thing%'
AND `type` LIKE '%'
ORDER BY `sdate` DESC LIMIT 3
EXPLAIN for the above:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE mytable range name name 257 NULL 5191 Using where
while this one is very fast (5 ms):
SELECT *
FROM `mytable`
WHERE `name` LIKE 'hello.my%big%text%thing%'
AND `type` LIKE '%'
ORDER BY `sdate` DESC LIMIT 3
EXPLAIN for the above:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE mytable range name name 257 NULL 204 Using where
The difference in the number of rows scanned makes sense because of the index,
but 5 seconds for 5k indexed rows seems way too much.
Also, ordering by name instead of sdate makes the queries very fast, but I need to order by the timestamp.
The second thing I do not understand is that before adding the last column to the index, the index was 1.4 GB; now, after running an OPTIMIZE/REPAIR, the size is just 670 MB.
The problem is that only the portion before the first % can take advantage of the index; the rest of the LIKE pattern has to be checked against every row matching hello% or hello.my%, without the index's help. Also, ordering by a column other than the indexed one probably requires a second pass, or at least a sort rather than reading an already-sorted index. Options for better performance (they can be implemented independently of each other) are:
Using a full-text index on the name column and a MATCH() AGAINST() search rather than LIKE with %'s.
Adding sdate to a combined index (name, sdate), which could very well speed up the sorting.
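A sketch of the full-text variant (the index name and the search terms are placeholders; FULLTEXT matching has its own semantics, such as minimum word length, so results won't be identical to the LIKE pattern):

```sql
ALTER TABLE mytable ADD FULLTEXT INDEX ft_name (name);

-- Require all terms to be present, then sort the matches by timestamp.
SELECT *
FROM mytable
WHERE MATCH(name) AGAINST('+hello +big +text +thing' IN BOOLEAN MODE)
ORDER BY sdate DESC
LIMIT 3;
```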
I'm using a "users" table with over 2 million records. The query is:
SELECT * FROM users WHERE 1 ORDER BY firstname LIMIT $start,30
The "firstname" column is indexed. Getting the first pages is very fast, while getting the last pages is very slow.
I used EXPLAIN and here are the results:
for
EXPLAIN SELECT * FROM `users` WHERE 1 ORDER BY `firstname` LIMIT 10000 , 30
I'm getting:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE users index NULL firstname 194 NULL 10030
But for
EXPLAIN SELECT * FROM `users` WHERE 1 ORDER BY `firstname` LIMIT 100000 , 30
I'm getting
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE users ALL NULL NULL NULL NULL 2292912 Using filesort
What's the issue?
You shouldn't use limit to page that far into your dataset.
You'll get much better results by using range queries.
SELECT * FROM users
WHERE firstname >= last_used_name
ORDER BY firstname
LIMIT 30
where last_used_name is the last one you have already seen (I'm assuming you do batch processing of some sort). You will get more accurate results if you do range queries on a column with a unique index; this way you won't get the same record twice.
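Since firstname is not unique, a common trick is to tie-break on the primary key so the paging position is unambiguous. A sketch (last_name and last_id are placeholders for whatever ended the previous page; depending on the MySQL version, the row-value comparison may not use the index optimally, and the expanded OR form is an alternative):

```sql
-- Requires a composite index on (firstname, user_id) to seek efficiently.
SELECT *
FROM users
WHERE (firstname, user_id) > (last_name, last_id)
ORDER BY firstname, user_id
LIMIT 30;
```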
When you do
LIMIT 100000 , 30
MySQL essentially does the same as in
LIMIT 100030
Only it doesn't return the first 100,000 rows. But it still sorts and reads them.
For SELECT * queries without a WHERE condition, MySQL often does not use any index.
The following query should be much faster and make use of the index:
SELECT *
FROM (
SELECT user_id
FROM users
WHERE 1
ORDER BY firstname
LIMIT $start, 30) ids
JOIN users
USING (user_id);
user_id is the primary key.