I have a MyISAM table with circa 1 million rows.
SELECT * FROM table WHERE colA='ABC' AND (colB='123' OR colC='123');
The query above takes over 10 seconds to run. All columns in question are indexed.
But when I split it as follows...
SELECT * FROM table WHERE colA='ABC' AND colB='123';
SELECT * FROM table WHERE colA='ABC' AND colC='123';
Each individual query takes 0.002 seconds.
What gives, and how do I optimize the table/query?
( SELECT * FROM table WHERE colA='ABC' AND colB='123' )
UNION DISTINCT
( SELECT * FROM table WHERE colA='ABC' AND colC='123' )
;
And have these indexes:
INDEX(colA, colB),
INDEX(colA, colC)
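The two composite indexes could be added along these lines. This is only a sketch: tbl is a placeholder for the real table name (the literal name table from the question is a reserved word and would need backticks), and the index names are made up for illustration.

```sql
-- Assumption: tbl stands in for the real table name; index names are illustrative.
ALTER TABLE tbl
    ADD INDEX idx_colA_colB (colA, colB),
    ADD INDEX idx_colA_colC (colA, colC);
```

Adding both indexes in one ALTER TABLE avoids rebuilding the table twice.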
You should consider moving to InnoDB, though it may not matter to this particular question.
Here's how the UNION will work:
Perform each SELECT. They will be very efficient due to the indexes suggested.
Collect the results in a tmp table.
De-dup the temp table and deliver the resulting rows.
All the rows are 'touched' in the original query (with OR).
With the UNION:
Only the necessary rows are touched in the SELECTs.
Those rows are written to the tmp table.
Those rows are reread. (The de-dupping may involve touching the rows more than once.)
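One way to check this on your own data is to run EXPLAIN on both forms and compare the estimated rows column. A sketch, with tbl as a placeholder for the real table name:

```sql
-- Assumption: tbl stands in for the real table name.
-- Original form with OR:
EXPLAIN SELECT * FROM tbl WHERE colA='ABC' AND (colB='123' OR colC='123');

-- UNION form: each branch should show its own index and a small row estimate.
EXPLAIN
SELECT * FROM tbl WHERE colA='ABC' AND colB='123'
UNION DISTINCT
SELECT * FROM tbl WHERE colA='ABC' AND colC='123';
```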
I have two queries that do the same job.
Query 1:
SELECT * FROM table1 where id = 1
UNION ALL
SELECT * FROM table2 where id = 5
UNION ALL
SELECT * FROM table1 where id = 70
UNION ALL
SELECT * FROM table2 where id = 3
UNION ALL
SELECT * FROM table1 where id = 90
and query 2:
SELECT * FROM table1 where id IN (1,70,90)
UNION ALL
SELECT * FROM table2 where id IN (5,3)
Which of these two queries is faster?
If your answer is the second query: I've used query 1 in many different places in the project. Is the difference so large that I should replace it everywhere with the second query?
The second version is more concise, and should be faster, because it only requires actually executing two queries, as opposed to the first version, which does a separate query for each id value.
Assuming id is the primary key in both tables, MySQL might also be able to use the clustered index (InnoDB) for faster lookup of matching records.
What are the typical counts? A total of 5 rows? 2 tables? I would predict the performance difference to be a factor of rows/tables in favor of the 2nd (shorter) formulation. In experimenting, I got about 2x.
So, if you have 100 rows from 2 tables, the second formulation will be significantly faster; enough faster to be worth the effort.
Why?
For such simple queries, parsing and optimizing dominate the time.
For newer versions of MySQL, both queries will touch the same number of rows.
For MySQL 5.7.3 and later, no temp table will be needed for either UNION ALL.
Does it matter that the output rows are likely to be in a different order?
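If the order does matter, one option is an explicit ORDER BY on the whole UNION. This is a sketch that assumes both tables share the same column list and that sorting by id is what you want; note that the sort adds work that a plain UNION ALL does not need.

```sql
-- Assumption: table1 and table2 have identical columns, including id.
(SELECT * FROM table1 WHERE id IN (1,70,90))
UNION ALL
(SELECT * FROM table2 WHERE id IN (5,3))
ORDER BY id;
```

The ORDER BY after the last parenthesized SELECT applies to the combined result, not to the second SELECT alone.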
I have a lot of tables with exactly the same structure (TableA, TableB, TableC, TableD, etc.) that I want to create views from.
Doing select * from TableA takes 20ms, doing select * from tableB takes 20ms, but doing
(select * from TableA) union all (select * from TableB) takes over 20 minutes.
Those tables have exactly the same columns. Are there any settings in my.cnf that I need to change, or is there a way to create a view that would run faster? All tables have 1.5m to about 10m rows.
Results of explain:
select_type   table       type  rows      Extra
PRIMARY       TableA      ALL   28808685
UNION         TableB      ALL   15316215
UNION RESULT  <union1,2>  ALL             Using temporary
Table structure:
10 varchar(20)'s, 5 unsigned INTs.
My guess is that select * from TableA does not take 20 ms. It takes 20 ms to start returning results.
Although I am going to answer your question, you should revisit your data structure. Having multiple tables with the same layout is usually a really bad idea. Instead, you should have a single table with all the rows.
But, you don't seem to have that.
Try running the union all without parentheses:
select * from TableA union all
select * from TableB;
MySQL has a habit of materializing subqueries. I'm not sure if it does this with union all subqueries, but given your description of the problem, that seems likely.
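As for the single-table layout recommended earlier in this answer, a rough sketch of the consolidation might look like this. The names combined, source, and idx_source are hypothetical, and copying tables of this size will take a while, so treat it as an outline rather than a migration script.

```sql
-- Assumption: combined, source, and idx_source are made-up names.
-- Build the combined table from TableA, tagging each row with its origin.
CREATE TABLE combined AS
SELECT 'A' AS source, a.* FROM TableA AS a;

-- Append the rows of the next table; repeat for TableC, TableD, and so on.
INSERT INTO combined
SELECT 'B', b.* FROM TableB AS b;

-- CREATE TABLE ... SELECT does not copy indexes, so add the ones you need.
ALTER TABLE combined ADD INDEX idx_source (source);
```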
I recently learned that I can search in a MySQL table across multiple columns by using the following select statement with OR:
SELECT * FROM data WHERE TEMP = "3000" OR X ="3000" OR Y="3000";
Which returns the results needed, but it does take approximately 1.7 s to return the results in the table that has only ~260k rows. I also have already added indexes for each of the columns that are searched.
Is there a way to optimize this query? Or is there another one which is faster but returns the same results?
Another option is to use UNION...
SELECT * FROM data WHERE TEMP = "3000"
UNION
SELECT * FROM data WHERE X ="3000"
UNION
SELECT * FROM data WHERE Y="3000";
...however, the real key to improving performance is firstly the indexes and secondly the query analyser. Often the data determines which is faster, as TEMP may be a hundred times less likely than Y to be "3000", so that should be first in your original OR statement, for example.
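To see what the optimizer actually does with the OR form, EXPLAIN is a reasonable first step. With a separate index on each of the three columns, MySQL may choose an index merge, which shows up as type index_merge with "Using union(...)" in the Extra column; if type is ALL instead, the query is doing a full table scan.

```sql
-- Check whether the per-column indexes are combined (index_merge) or ignored (ALL).
EXPLAIN SELECT * FROM data WHERE TEMP = '3000' OR X = '3000' OR Y = '3000';
```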
I have created a view in MySQL as
create view vtax
as
SELECT * FROM table1
union
SELECT * FROM table2;
table1 has 800000 records and table2 has 500000 records. When I run the queries independently, the results are returned within 0.078 secs, but when I run them through the view, performance collapses and the query takes more than 10-15 secs.
select * from vtax where col1=value; -- takes more than 10-15 secs
select * from table1 where col1=value; -- takes 0.078 secs
select * from table2 where col1=value; -- takes 0.078 secs
I have created indexes on the tables separately.
Any help or ideas on what should be done?
UNION performs a distinct over your results (often a sort). Can you use UNION ALL? (i.e., are the rows distinct?)
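If the rows in the two tables never duplicate each other, the view could be redefined with UNION ALL. This is a sketch using the names from the question; whether the outer condition on col1 is pushed down into each branch depends on the MySQL version, so it is worth checking the plan with EXPLAIN afterwards.

```sql
-- Redefine the view without the de-duplication step.
CREATE OR REPLACE VIEW vtax AS
SELECT * FROM table1
UNION ALL
SELECT * FROM table2;

-- Check whether the filter reaches the base tables' indexes.
EXPLAIN SELECT * FROM vtax WHERE col1 = 'value';
```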
You should compare apples with apples. Unions are often much slower than simple queries. Compare a plain union query with the view. You will notice that the standard union query is slow as well. The optimizer probably has trouble choosing the optimal path. Check some other questions like: Why are UNION queries so slow in MySQL?
As stated in the comments a view isn't indexed in MySQL.
If you use the union in the query:
SELECT * FROM table1 WHERE col1 = 'value'
UNION
SELECT * FROM table2 WHERE col1 = 'value'
Then indexes (if there are any) can be used.
I have a MyISAM table with about 70 million records. When I do select count(distinct id) it takes about 80 seconds to get the query results. This table is a de-normalized table, which is why I need to get the unique count for id, and it has to be done dynamically. If I add a where clause, it takes between 4 and 90 secs, depending on the range I give.
I'm wondering if there is any way I can optimize this to improve the query speed.
Try SELECT SQL_CALC_FOUND_ROWS DISTINCT(id) FROM ..., and after that execute the query SELECT FOUND_ROWS() AS Total to fetch the result.
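Spelled out, the suggestion might look like the sketch below. big_table is a hypothetical name, since the original leaves the FROM clause elided, and the LIMIT 1 is an addition so that the first statement does not ship every id to the client. SQL_CALC_FOUND_ROWS and FOUND_ROWS() are deprecated as of MySQL 8.0.17, and this is not guaranteed to beat COUNT(DISTINCT id), so measure both.

```sql
-- Assumption: big_table is a placeholder; the original elides the table name.
-- LIMIT 1 keeps the result set small; FOUND_ROWS() still reports the full count.
SELECT SQL_CALC_FOUND_ROWS DISTINCT id FROM big_table LIMIT 1;

-- Must run in the same session, immediately after the statement above.
SELECT FOUND_ROWS() AS Total;
```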