Optimizing MySQL database query - mysql

I have a huge table, but I know that in most cases only a small portion of the data is used for a query. Is there a way to make MySQL look up only this small portion? Does a view help in this case?

Start with the optimization chapter of the MySQL manual: http://dev.mysql.com/doc/refman/5.0/en/optimization.html
It covers optimizing indexes, statements and clauses, caching, and the server itself.

Many columns
If you have many columns, be sure to only name the used columns in the SELECT statement. This allows MySQL to skip over the unused columns, not returning values that you won't be using anyway.
So, instead of the following query:
SELECT *
FROM users
Use this type of query:
SELECT id, last_name, first_name
FROM users
Many rows
If you have many rows, add indexes to the columns that you are filtering on using the WHERE clause. For example:
SELECT id, last_name, first_name
FROM users
WHERE last_name = 'Smith'
The above query selects specific columns for all user records where the last name is 'Smith'.
If you have an index on the last_name column, MySQL would be able to locate the records that match the criteria in your WHERE clause very quickly.
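As a rough illustration of the indexing advice above, here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in for MySQL. That substitution is an assumption made to keep the example self-contained; MySQL's optimizer and EXPLAIN output differ. The sample rows and the index name idx_users_last_name are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, last_name TEXT, first_name TEXT)"
)
conn.executemany(
    "INSERT INTO users (last_name, first_name) VALUES (?, ?)",
    [("Smith", "Anna"), ("Jones", "Bob"), ("Smith", "Carl")],  # hypothetical data
)

# Index the column used in the WHERE clause.
conn.execute("CREATE INDEX idx_users_last_name ON users (last_name)")

# Name only the columns that are needed instead of SELECT *.
rows = conn.execute(
    "SELECT id, last_name, first_name FROM users WHERE last_name = ? ORDER BY id",
    ("Smith",),
).fetchall()

# Ask the engine how it plans to run the query; the last field of each
# plan row is a human-readable description.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT id, last_name, first_name FROM users WHERE last_name = 'Smith'"
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)

print(rows)
print(plan_text)
```

In MySQL the comparable check is to prefix the SELECT with EXPLAIN and look at which index, if any, appears in the key column.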

Related

MySQL UNION ALL Providing 0 speed increase over equivalent OR statement [duplicate]

I just read part of an optimization article and segfaulted on the following statement:
When using SQL replace statements using OR with a UNION:
select username from users where company = 'bbc' or company = 'itv';
to:
select username from users where company = 'bbc' union
select username from users where company = 'itv';
From a quick EXPLAIN:
Using OR:
Using UNION:
Doesn't this mean UNION does double the work?
While I appreciate UNION may be more performant for certain RDBMSes and certain table schemas, this is not categorically true as the author suggests.
Question
Am I wrong?
Either the article you read used a bad example, or you misinterpreted their point.
select username from users where company = 'bbc' or company = 'itv';
This is equivalent to:
select username from users where company IN ('bbc', 'itv');
MySQL can use an index on company for this query just fine. There's no need to do any UNION.
The more tricky case is where you have an OR condition that involves two different columns.
select username from users where company = 'bbc' or city = 'London';
Suppose there's an index on company and a separate index on city. Given that MySQL usually uses only one index per table in a given query, which index should it use? If it uses the index on company, it would still have to do a table-scan to find rows where city is London. If it uses the index on city, it would have to do a table-scan for rows where company is bbc.
The UNION solution is for this type of case.
select username from users where company = 'bbc'
union
select username from users where city = 'London';
Now each subquery can use an index for its search, and the results of the subqueries are combined by the UNION.
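The rewrite above can be checked for equivalent results with a small sketch. It uses Python's sqlite3 module as a stand-in for MySQL (an assumption; it says nothing about MySQL's actual plan or timing), and the rows and index names are hypothetical. The two forms return the same set of usernames here only because usernames are unique in the sample data; UNION removes duplicate output rows, while the OR form does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL (assumption)
conn.execute("CREATE TABLE users (username TEXT, company TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        ("alice", "bbc", "London"),  # matches both conditions
        ("bob", "bbc", "Leeds"),     # matches company only
        ("carol", "itv", "London"),  # matches city only
        ("dave", "itv", "Leeds"),    # matches neither
    ],
)
# One separate index per column, as in the scenario described above.
conn.execute("CREATE INDEX idx_users_company ON users (company)")
conn.execute("CREATE INDEX idx_users_city ON users (city)")

or_rows = conn.execute(
    "SELECT username FROM users WHERE company = 'bbc' OR city = 'London'"
).fetchall()

union_rows = conn.execute(
    "SELECT username FROM users WHERE company = 'bbc' "
    "UNION "
    "SELECT username FROM users WHERE city = 'London'"
).fetchall()

# Compare as sorted lists because neither query guarantees an order.
or_names = sorted(name for (name,) in or_rows)
union_names = sorted(name for (name,) in union_rows)
print(or_names, union_names)
```

Whether the UNION form is actually faster on a given MySQL table still has to be measured on real data.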
An anonymous user proposed an edit to my answer above, but a moderator rejected the edit. It should have been a comment, not an edit. The claim of the proposed edit was that UNION has to sort the result set to eliminate duplicate rows. This makes the query run slower, and the index optimization is therefore a wash.
My response is that the indexes help to reduce the result set to a small number of rows before the UNION happens. UNION does in fact eliminate duplicates, but to do that it only has to sort the small result set. There might be cases where the WHERE clauses match a significant portion of the table, and sorting during UNION is as expensive as simply doing the table-scan. But it's more common for the result set to be reduced by the indexed searches, so the sorting is much less costly than the table-scan.
The difference depends on the data in the table, and the terms being searched. The only way to determine the best solution for a given query is to try both methods in the MySQL query profiler and compare their performance.
Those are not the same query.
I don't have much experience with MySQL, so I am not sure what the query optimizer does or does not do, but here are my thoughts from my general background (primarily ms sql server).
Typically, the query analyzer can take the above two queries and make the exact same plan out of them (if they were the same), so it wouldn't matter. I would suspect that there is no performance difference between these queries (which are equivalent)
select distinct username from users where company = 'bbc' or company = 'itv';
and
select username from users where company = 'bbc'
union
select username from users where company = 'itv';
Now, the question is whether there would be a difference between the following queries. I don't actually know, but I suspect the optimizer would turn the second into something like the first:
select username from users where company = 'bbc' or company = 'itv';
and
select username from users where company = 'bbc'
union all
select username from users where company = 'itv';
It depends on what the optimizer ends up doing based on the size of the data, indexes, software version, etc.
I would guess that using OR would give the optimizer a better chance at finding some efficiencies, since everything is in a single logical statement.
Also, UNION has some overhead, since it creates a result set with no duplicates.
Each statement in the UNION should execute pretty quickly if company is indexed... not sure it'd really be doing double the work.
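To make the "not the same query" point concrete, here is a small sketch comparing the four forms discussed above. It runs the SQL through Python's sqlite3 module as a stand-in for MySQL (an assumption), with hypothetical rows in which one username appears under two companies. It only illustrates the duplicate-handling semantics, not performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL (assumption)
conn.execute("CREATE TABLE users (username TEXT, company TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "bbc"), ("alice", "itv"), ("bob", "bbc"), ("carol", "sky")],
)


def names(sql):
    """Run a single-column query and return its values as a sorted list."""
    return sorted(name for (name,) in conn.execute(sql))


where_or = "WHERE company = 'bbc' OR company = 'itv'"
branches = (
    "SELECT username FROM users WHERE company = 'bbc' {op} "
    "SELECT username FROM users WHERE company = 'itv'"
)

plain_or = names(f"SELECT username FROM users {where_or}")
distinct_or = names(f"SELECT DISTINCT username FROM users {where_or}")
union = names(branches.format(op="UNION"))          # removes duplicate rows
union_all = names(branches.format(op="UNION ALL"))  # keeps every row

print(plain_or, distinct_or, union, union_all)
```

UNION ALL matches the plain OR here only because the two branches cannot select the same row (a row has a single company value); with overlapping conditions, UNION ALL would return such rows twice.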
Bottom line
Unless you really have a burning need to squeeze every bit of speed out of your query, it's probably better to just go with the form that best communicates your intention... the OR
Update
I also meant to mention IN. I believe the following query will give better performance than the OR (it's also the form I prefer):
select username from users where company in ('bbc', 'itv');
This is my benchmark result:
With UNION: query took 13.8699 seconds, rows examined (primary select type): 247685
With OR: query took 0.0126 seconds, rows examined (primary select type): 495371
MySQL uses one index per query, so with OR, MySQL uses the index on one column and scans the full table for the other column.
With UNION, on the other hand, the same work is done twice.
That's why OR is faster than UNION.
In almost all cases, the union or union all version is going to do two full table scans of the users table.
The or version is much better in practice, since it will only scan the table once. It will also use an index only once, if available.
The original statement just seems wrong, for just about any database and any situation.
Bill Karwin's answer is pretty much right. When both parts of the OR condition have their own index, it's better to use UNION, because once you have a small subset of results, it's easier to sort them and eliminate duplicates. The total cost is almost always less than using only one index (for one of the columns) and a table scan for the other column (because MySQL only uses one index, covering one of the columns).
It generally depends on the table's structure and your needs, but on large tables UNION has given me better results.

If your table has more selects than inserts, are indexes always beneficial?

I have a mysql innodb table where I'm performing a lot of selects using different columns. I thought that adding an index on each of those fields could help performance, but after reading a bit on indexes I'm not sure if adding an index on a column you select on always helps.
I have far more selects than inserts/updates happening in my case.
My table 'students' looks like:
id | student_name | nickname | team | time_joined_school | honor_roll
and I have the following queries:
# The team column is varchar(32), and only has about 20 different values.
# The honor_roll field is a smallint and is only either 0 or 1.
1. select * from students where team = '?' and honor_roll = ?;
# The student_name field is varchar(32).
2. select * from students where student_name = '?';
# The nickname field is varchar(64).
3. select * from students where nickname like '%?%';
All the results are ordered by time_joined_school, which is a bigint(20).
So I was just going to add an index on each of the columns, does that make sense in this scenario?
Thanks
Indexes help the database more efficiently find the data you're looking for. Which is to say you don't need an index simply because you're selecting a given column, but instead you (generally) need an index for columns you're selecting based on - i.e. using a WHERE clause (even if you don't end up including the searched column in your result).
Broadly, this means you should have indexes on columns that segregate your data in logical ways, and not on extraneous, simply informative columns. Before looking at your specific queries, all of these columns seem like reasonable candidates for indexing, since you could reasonably construct queries around these columns. Examples of columns that would make less sense would be things like phone_number, address, or student_notes - you could index such columns, but generally you don't need or want to.
Specifically based on your queries, you'll want student_name, team, and honor_roll to be indexed, since you're defining WHERE conditions based on the values of these columns. You'll also benefit from indexing time_joined_school if, as you suggest, you're ORDER BYing your queries based on that column. Your LIKE query is not actually easy for most RDBMSes to handle: the pattern starts with a wildcard, so an ordinary B-tree index on nickname won't help. Check out "How to speed up SELECT .. LIKE queries in MySQL on multiple columns?" for more.
Note also that the ratio of SELECT to INSERT is not terribly relevant for deciding whether to use an index or not. Even if you only populate the table once, and it's read-only from that point on, SELECTs will run faster if you index the correct columns.
Yes, indexes help accelerate your queries.
In your case you should have indexes on:
1) team and honor_roll from query 1 (a single index covering both fields)
2) student_name
3) time_joined_school for the ordering
For query 3, an index can't be used because the LIKE pattern starts with a wildcard. Hope this helps.
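The index suggestions above can be sketched as follows. The example uses Python's sqlite3 module as a stand-in for MySQL (an assumption; MySQL's planner and EXPLAIN format differ), and the index names and search values are hypothetical. It creates one composite index for query 1, a single-column index for query 2, and an index on nickname to show that a pattern with a leading wildcard still leads to a scan in this engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL (assumption)
conn.execute(
    """
    CREATE TABLE students (
        id INTEGER PRIMARY KEY,
        student_name TEXT,
        nickname TEXT,
        team TEXT,
        time_joined_school INTEGER,
        honor_roll INTEGER
    )
    """
)
# One composite index for query 1, one single-column index for query 2.
conn.execute("CREATE INDEX idx_students_team_honor ON students (team, honor_roll)")
conn.execute("CREATE INDEX idx_students_name ON students (student_name)")
# Index on nickname, created only to show it is not used for '%...%' patterns.
conn.execute("CREATE INDEX idx_students_nickname ON students (nickname)")


def plan(sql):
    """Return the planner's description of how it would execute sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(str(row[-1]) for row in rows)


plan_team = plan("SELECT * FROM students WHERE team = 'red' AND honor_roll = 1")
plan_name = plan("SELECT * FROM students WHERE student_name = 'Ann'")
plan_like = plan("SELECT * FROM students WHERE nickname LIKE '%an%'")

for label, text in [("query 1", plan_team), ("query 2", plan_name), ("query 3", plan_like)]:
    print(label, "->", text)
```

On a real MySQL table, running EXPLAIN on each of the three queries is the way to confirm which index is chosen.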

Search SQL query from multiple tables MySQL

I am trying to perform search on multiple tables.
I will simplify the problem and say that I have 2 tables, Worker and Customer. Both have Id, Name and Surname, and Worker has an additional Position; all fields are varchar except Id, which is int.
How can I write a query that returns rows of either Customer or Worker where one of their fields contains the entered search string?
I have tried joins, but then I also got the joined rows returned.
select id,name,surname,position,'worker' as tbl from worker where ..
union all
select id,name,surname,'','customer' from customer where ...
This way you can even tell which table each result belongs to.
Just UNION both queries.
If you really can JOIN those two, you can use an IF statement in the SELECT clause to show the right field.
But, from what I understand from your question, go with UNION.
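A minimal sketch of the UNION ALL approach suggested above, with the elided WHERE clauses filled in as one possible "any field contains the search string" condition. That condition, the search helper and the sample rows are assumptions made for illustration, and Python's sqlite3 module stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL (assumption)
conn.executescript(
    """
    CREATE TABLE worker (id INTEGER, name TEXT, surname TEXT, position TEXT);
    CREATE TABLE customer (id INTEGER, name TEXT, surname TEXT);
    INSERT INTO worker VALUES (1, 'Ann', 'Lee', 'manager'), (2, 'Bob', 'Stone', 'clerk');
    INSERT INTO customer VALUES (1, 'Lee', 'Chan'), (2, 'Dora', 'Park');
    """
)


def search(term):
    """Return matching rows from both tables, tagged with their source table."""
    sql = """
        SELECT id, name, surname, position, 'worker' AS tbl
        FROM worker
        WHERE name LIKE :pat OR surname LIKE :pat OR position LIKE :pat
        UNION ALL
        SELECT id, name, surname, '', 'customer'
        FROM customer
        WHERE name LIKE :pat OR surname LIKE :pat
        ORDER BY tbl, id
    """
    # A bound parameter keeps the search string out of the SQL text itself.
    return conn.execute(sql, {"pat": f"%{term}%"}).fetchall()


results = search("lee")
print(results)
```

The empty string in the second SELECT pads the customer rows so both branches have the same number of columns. Whether LIKE matches case-insensitively, as it does for ASCII text in SQLite by default, depends on the column collation in MySQL.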