Related
I am really interested in how MySQL indexes work, more specifically, how can they return the data requested without scanning the entire table?
It's off-topic, I know, but if there is someone who could explain this to me in detail, I would be very, very thankful.
Basically an index on a table works like an index in a book (that's where the name came from):
Let's say you have a book about databases and you want to find some information about, say, storage. Without an index (assuming no other aid, such as a table of contents) you'd have to go through the pages one by one, until you found the topic (that's a full table scan).
On the other hand, an index has a list of keywords, so you'd consult the index and see that storage is mentioned on pages 113-120,231 and 354. Then you could flip to those pages directly, without searching (that's a search with an index, somewhat faster).
Of course, how useful the index will be, depends on many things - a few examples, using the simile above:
if you had a book on databases and indexed the word "database", you'd see that it's mentioned on pages 1-59,61-290, and 292 to 400. In such case, the index is not much help and it might be faster to go through the pages one by one (in a database, this is "poor selectivity").
For a 10-page book, it makes no sense to make an index, as you may end up with a 10-page book prefixed by a 5-page index, which is just silly - just scan the 10 pages and be done with it.
The index also needs to be useful - there's generally no point to index e.g. the frequency of the letter "L" per page.
The first thing you must know is that indexes are a way to avoid scanning the full table to obtain the result that you're looking for.
There are different kinds of indexes and they're implemented in the storage layer, so there's no standard between them and they also depend on the storage engine that you're using.
InnoDB and the B+Tree index
For InnoDB, the most common index type is the B+Tree based index, that stores the elements in a sorted order. Also, you don't have to access the real table to get the indexed values, which makes your query return way faster.
The "problem" about this index type is that you have to query for the leftmost value to use the index. So, if your index has two columns, say last_name and first_name, the order that you query these fields matters a lot.
So, given the following table:
CREATE TABLE person (
last_name VARCHAR(50) NOT NULL,
first_name VARCHAR(50) NOT NULL,
INDEX (last_name, first_name)
);
This query would take advantage of the index:
SELECT last_name, first_name FROM person
WHERE last_name = "John" AND first_name LIKE "J%"
But the following one would not
SELECT last_name, first_name FROM person WHERE first_name = "Constantine"
Because you're querying the first_name column first and it's not the leftmost column in the index.
This last example is even worse:
SELECT last_name, first_name FROM person WHERE first_name LIKE "%Constantine"
Because now, you're comparing the rightmost part of the rightmost field in the index.
The hash index
This is a different index type that unfortunately, only the memory backend supports. It's lightning fast but only useful for full lookups, which means that you can't use it for operations like >, < or LIKE.
Since it only works for the memory backend, you probably won't use it very often. The main case I can think of right now is the one that you create a temporary table in the memory with a set of results from another select and perform a lot of other selects in this temporary table using hash indexes.
If you have a big VARCHAR field, you can "emulate" the use of a hash index when using a B-Tree, by creating another column and saving a hash of the big value on it. Let's say you're storing a url in a field and the values are quite big. You could also create an integer field called url_hash and use a hash function like CRC32 or any other hash function to hash the url when inserting it. And then, when you need to query for this value, you can do something like this:
SELECT url FROM url_table WHERE url_hash=CRC32("http://gnu.org");
The problem with the above example is that since the CRC32 function generates a quite small hash, you'll end up with a lot of collisions in the hashed values. If you need exact values, you can fix this problem by doing the following:
SELECT url FROM url_table
WHERE url_hash=CRC32("http://gnu.org") AND url="http://gnu.org";
It's still worth to hash things even if the collision number is high cause you'll only perform the second comparison (the string one) against the repeated hashes.
Unfortunately, using this technique, you still need to hit the table to compare the url field.
Wrap up
Some facts that you may consider every time you want to talk about optimization:
Integer comparison is way faster than string comparison. It can be illustrated with the example about the emulation of the hash index in InnoDB.
Maybe, adding additional steps in a process makes it faster, not slower. It can be illustrated by the fact that you can optimize a SELECT by splitting it into two steps, making the first one store values in a newly created in-memory table, and then execute the heavier queries on this second table.
MySQL has other indexes too, but I think the B+Tree one is the most used ever and the hash one is a good thing to know, but you can find the other ones in the MySQL documentation.
I highly recommend you to read the "High Performance MySQL" book, the answer above was definitely based on its chapter about indexes.
Basically an index is a map of all your keys that is sorted in order. With a list in order, then instead of checking every key, it can do something like this:
1: Go to middle of list - is higher or lower than what I'm looking for?
2: If higher, go to halfway point between middle and bottom, if lower, middle and top
3: Is higher or lower? Jump to middle point again, etc.
Using that logic, you can find an element in a sorted list in about 7 steps, instead of checking every item.
Obviously there are complexities, but that gives you the basic idea.
Take a look at this link: http://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html
How they work is too broad of a subject to cover in one SO post.
Here is one of the best explanations of indexes I have seen. Unfortunately it is for SQL Server and not MySQL. I'm not sure how similar the two are...
In MySQL InnoDB, there are two types of index.
Primary key which is called clustered index. Index key words are stored with
real record data in the B+Tree leaf node.
Secondary key which is non clustered index. These index only store primary key's key words along with their own index key words in the B+Tree leaf node. So when searching from secondary index, it will first find its primary key index key words and scan the primary key B+Tree to find the real data records. This will make secondary index slower compared to primary index search. However, if the select columns are all in the secondary index, then no need to look up primary index B+Tree again. This is called covering index.
Take at this videos for more details about Indexing
Simple Indexing
You can create a unique index on a table. A unique index means that two rows cannot have the same index value. Here is the syntax to create an Index on a table
CREATE UNIQUE INDEX index_name
ON table_name ( column1, column2,...);
You can use one or more columns to create an index. For example, we can create an index on tutorials_tbl using tutorial_author.
CREATE UNIQUE INDEX AUTHOR_INDEX
ON tutorials_tbl (tutorial_author)
You can create a simple index on a table. Just omit UNIQUE keyword from the query to create simple index. Simple index allows duplicate values in a table.
If you want to index the values in a column in descending order, you can add the reserved word DESC after the column name.
mysql> CREATE UNIQUE INDEX AUTHOR_INDEX
ON tutorials_tbl (tutorial_author DESC)
Adding some visual representation to the list of answers.
MySQL uses an extra layer of indirection: secondary index records point to primary index records, and the primary index itself holds the on-disk row locations. If a row offset changes, only the primary index needs to be updated.
Caveat: Disk data structure looks flat in the diagram but actually is a
B+ tree.
Source: link
I want to add my 2 cents. I am far from being a database expert, but I've recently read up a bit on this topic; enough for me to try and give an ELI5. So, here's may layman's explanation.
I understand it as such that an index is like a mini-mirror of your table, pretty much like an associative array. If you feed it with a matching key then you can just jump to that row in one "command".
But if you didn't have that index / array, the query interpreter must use a for-loop to go through all rows and check for a match (the full-table scan).
Having an index has the "downside" of extra storage (for that mini-mirror), in exchange for the "upside" of looking up content faster.
Note that (in dependence of your db engine) creating primary, foreign or unique keys automatically sets up a respective index as well. That same principle is basically why and how those keys work.
Let's suppose you have a book, probably a novel, a thick one with lots of things to read, hence lots of words.
Now, hypothetically, you brought two dictionaries, consisting of only words that are only used, at least one time in the novel. All words in that two dictionaries are stored in typical alphabetical order. In hypothetical dictionary A, words are printed only once while in hypothetical dictionary B words are printed as many numbers of times it is printed in the novel. Remember, words are sorted alphabetically in both the dictionaries.
Now you got stuck at some point while reading a novel and need to find the meaning of that word from anyone of those hypothetical dictionaries. What you will do? Surely you will jump to that word in a few steps to find its meaning, rather look for the meaning of each of the words in the novel, from starting, until you reach that bugging word.
This is how the index works in SQL. Consider Dictionary A as PRIMARY INDEX, Dictionary B as KEY/SECONDARY INDEX, and your desire to get for the meaning of the word as a QUERY/SELECT STATEMENT.
The index will help to fetch the data at a very fast rate. Without an index, you will have to look for the data from the starting, unnecessarily time-consuming costly task.
For more about indexes and types, look this.
Indexes are used to find rows with specific column values quickly. Without an index, MySQL must begin with the first row and then read through the entire table to find the relevant rows. The larger the table, the more this costs. If the table has an index for the columns in question, MySQL can quickly determine the position to seek to in the middle of the data file without having to look at all the data. This is much faster than reading every row sequentially.
Indexing adds a data structure with columns for the search conditions and a pointer
The pointer is the address on the memory disk of the row with the
rest of the information
The index data structure is sorted to optimize query efficiency
The query looks for the specific row in the index; the index refers to the pointer which will find the rest of the information.
The index reduces the number of rows the query has to search through from 17 to 4.
Greeting,
My question; Whether or no sql query (SELECT) continues or stops reading data (records) from table when find the value that I was looking for?
referance: "In order to return data for this query, mysql must start at the beginning of the disk data file, read in enough of the record to know where the category field data starts (because long_text is variable length), read this value, see if it satisfies the where condition (and so decide whether to add to the return record set), then figure out where the next record set is, then repeat."
link for referance: http://www.verynoisy.com/sql-indexing-dummies/#how_the_database_finds_records_normally
In general you don't know and you don't care, but you have to adapt when queries take too long to execute. When you do something like
select a,b,c from mytable where a=3 and b=5
then the database engine has a couple of options to optimize. When all these options fail, then it will do a "full table scan" - which means, it will have to examine the entire table to see which rows are eligible. When you have indices on e.g. column a then the database engine can optimize the search because it can pre-select rows where a has value 3. So, in general, make sure that you have indices for the columns that are most searched. (Perversely, some database engines get confused when you have too many indices and will fall back to a full table scan because they've lost their way...)
As to whether or not the scanning stops: In general, the database engine has to examine all data in the table (hopefully aided by indices) and won't stop after having found just one hit. If you want just the first hit, use a limit 1 clause to make sure that your result set has only one outcome. But then again, if you have a sort by clause, the database engine cannot stop after the first hit, there might be next ones that should get priority given the sorting.
Summarizing, how the db engine does its scan depends on how smart it is, what indices are available etc.. If your select queries take too long then consider re-organizing your indices, writing your select statements differently, or rebuilding the table.
The RDBMS reading data from disk is something you cannot know, you should not care and you must not rely on.
The issue is too broad to get a precise answer. The engine reads data from storage in blocks, a block can contain records that are not needed by the query at hand. If all the columns needed by the query is available in an index, the RDBMS won't even read the data file, it will only use the index. The data it needs could already be cached in memory (because it was read during the execution of a previous query). The underlying OS and the storage media also keep their own caches.
On a busy system, all these factors could lead to very different storage access patterns while running the same query several times on a couple of minutes apart.
Yes it scans the entire file. Unless you put something like
select * from user where id=100 limit 1
This of course will still search entire rows if id 100 is the last record.
If id is a primary key it will automatically be indexed and searching would be optimized
I'm sorry... I thought the table.
I will change question and I will explain it in the following image;
I understand that in CASE 1 all columns must be read with each iteration.
My question is: If it's the same in the CASE 2 or columns that are not selected in the query are excluded from reading in each iteration.
Also, are the both queries are the some in performance perspective?
Clarify:
CASE: 1 In first CASE select print all data
CASE: 2 In second CASE select print columns first_name and last_name
Whether in CASE 2 mysql server (SQL query) reads only columns first_name, last_name or read the entire table to get that data(rows)=(first_name, last_name)?
An interest of me how the server reads table row in CASE 1 and CASE 2?
For a table(let's say 'food'), there is a column 'type' with potential value [1,2,3,4] that specifies the type of that entry (e.g. fruit). As I expect selection like
SELECT name FROM food WHERE type = 3 ;
would be called most often, I wonder would an index be recommended in this case. Since there is only a few values possible for that field I wonder if the index would be useful.(Similarly for MongoDB?)
An index on such a field is likely not to be useful in MySQL. Actually, such an index could make most queries worse.
There is a case where an index will always be faster. This is a query that only uses columns in the index, such as:
select count(type)
from food
where type = 3;
This is faster because reading the index should be faster than reading the table, because the data is smaller (presumably, you could include all columns in the index).
In other cases, MySQL uses an index for a table when it is available.
The question you are asking is about the "selectivity" of an index. Consider your query:
SELECT name
FROM food
WHERE type = 3 ;
If all rows have type = 3, then you have to read all the matching records anyway (to get the value of name). If there is one record per page, then the index is probably helping you, because it reduces the number of page reads. A more realistic situation would have a page contain 100 records. Then, if 25% of the records have the same type, a typical page would have 25 such records on it. Basically, every page still has to be read. The question is whether the pages are read sequentially (a "full table scan") or through the index.
There is a difference between these two ways of reading a table. In the full table scan, the pages are read sequentially and once a page is read, it is not accessed again. In the index read, the pages are read randomly, one record at a time, and a page could be read multiple times. In the extreme case, the pages don't fit in the page cache and the same page is flushed to disk and read again and again for each record on the page. Highly inefficient.
You can make this query more efficient by having an index on type, name.
So, the answer to your question is to be careful about indexes, especially with large tables. When you do have an index on a categorical column, make it a composite index so your queries can be satisfied only using the index and not having to go back to the data pages.
Having the index is unlikely to help, but you should test it with your queries and your data. If the column has few distinct values queries will return a sizeable portion of the table's rows, and reading the index is equivalent to a full table scan. In fact, a full table scan might even be faster than reading the index.
If the type of the row is used in other queries it may help to have the type as a part of a multi column index.
I've done plenty of searching/reading on this topic, but I'm still at a loss for a solution.
I have tables with tens or hundreds of thousands of rows, totaling somewhere in the area of 300GB of data. The column I need to select contains a lot of HTML, which may be part of the problem. I am using COMPRESS. The query contains two WHERE clauses:
SELECT id, olr_id, COMPRESS(source_html)
FROM buildings
WHERE scrape_status=1
AND parse_status=0 LIMIT 1;
As one might expect, id is a primary key. Additionally, olr_id is a unique index. This query takes something like 160-300sec just to return a result, which is completely unfeasible. Strangely (at least to me), removing the WHERE clause for parse_status results in the query completing in under 2-3 seconds. At first, I thought maybe this was because it was too specific, so I even tried removing scrape_status and only running the parse_status WHERE clause, but apparently it's this specific clause causing the insane increase in execution time.
The thing is, I don't know if indexing these columns would even do any good (scrape_status and parse_status), because their range of values is 0-2. I guess that's my main question -- would it help to index a column with such a small variance in values? I read somewhere once that indexing is really best for columns with great variation, but as I said, I'm at a loss for ideas, and 160-300 seconds is just totally unreasonable for hundreds of thousands of records.
Any input would be appreciated greatly. If you need additional input to help me, I'd be glad to provide. On that note, these are the EXPLAIN results for each of the three queries:
id: 1
select_type: SIMPLE
table: building
type: ALL
possible_keys: null
key: null
key_len: null
ref: null
rows: 58664
Extra: Using where
I appreciate your time in reading and offering any assistance that you may.
Create a composite index of both columns in the WHERE clause:
CREATE INDEX ix_sp ON buildings (scrape_status, parse_status);
Although each of them by itself doesn't partition the table too much, the combination probably does.
Try the following as well:
SELECT b1.id, olr_id, COMPRESS(source_html)
from buildings b1
JOIN (SELECT id
FROM buildings
WHERE scrape_status=1
AND parse_status=0
LIMIT 1) b2
USING (id)
Your original query may be compressing all matching rows, even though most of them are thrown out by the LIMIT clause. This version only compresses the one selected row.
Even if the indexed column doesn't have many distinct values, an index is beneficial in cases when the value you're searching for is uncommon. In other words, the index helps when the search is more selective, and matches a small percentage of rows.
So in this case, the decision about creating an index is based on what percentage of rows match your conditions scrape_status=1 and parse_status=0. For example, suppose scrape_status=1 matches 2% of rows, and parse_status=0 matches 1% of rows, I'd want to use parse_status as the leading column for the index.
If that's the case, but nearly all rows with parse_status=0 also have scrape_status=1, then there may be little additional benefit to making a compound index. Whereas if scrape_status=1 is effective at further restricting the matching rows, then you'd definitely want it to be built into the index.
MySQL also has optimizations for LIMIT. see http://dev.mysql.com/doc/refman/5.6/en/limit-optimization.html It tries to bail out of the query as soon as it finds the desired number of matching rows. MySQL 5.6 has added some new optimizations in this area.
MYSQL does not calculate the select-list expressions for rows that don't match conditions in the WHERE clause.
However, MySQL does have to read rows of data from disk to evaluate them against the conditions in the WHERE clause. This I/O is the source of a lot of performance cost, and it's the reason that narrowing down the search with indexes is so important. And if your source_html column is a TEXT column containing long strings, it's likely that this adds further overhead to the I/O, because MySQL will have to read additional pages of data from disk (see Blob Storage in Innodb for more details).
I am really interested in how MySQL indexes work, more specifically, how can they return the data requested without scanning the entire table?
It's off-topic, I know, but if there is someone who could explain this to me in detail, I would be very, very thankful.
Basically an index on a table works like an index in a book (that's where the name came from):
Let's say you have a book about databases and you want to find some information about, say, storage. Without an index (assuming no other aid, such as a table of contents) you'd have to go through the pages one by one, until you found the topic (that's a full table scan).
On the other hand, an index has a list of keywords, so you'd consult the index and see that storage is mentioned on pages 113-120,231 and 354. Then you could flip to those pages directly, without searching (that's a search with an index, somewhat faster).
Of course, how useful the index will be, depends on many things - a few examples, using the simile above:
if you had a book on databases and indexed the word "database", you'd see that it's mentioned on pages 1-59,61-290, and 292 to 400. In such case, the index is not much help and it might be faster to go through the pages one by one (in a database, this is "poor selectivity").
For a 10-page book, it makes no sense to make an index, as you may end up with a 10-page book prefixed by a 5-page index, which is just silly - just scan the 10 pages and be done with it.
The index also needs to be useful - there's generally no point to index e.g. the frequency of the letter "L" per page.
The first thing you must know is that indexes are a way to avoid scanning the full table to obtain the result that you're looking for.
There are different kinds of indexes and they're implemented in the storage layer, so there's no standard between them and they also depend on the storage engine that you're using.
InnoDB and the B+Tree index
For InnoDB, the most common index type is the B+Tree based index, that stores the elements in a sorted order. Also, you don't have to access the real table to get the indexed values, which makes your query return way faster.
The "problem" about this index type is that you have to query for the leftmost value to use the index. So, if your index has two columns, say last_name and first_name, the order that you query these fields matters a lot.
So, given the following table:
CREATE TABLE person (
last_name VARCHAR(50) NOT NULL,
first_name VARCHAR(50) NOT NULL,
INDEX (last_name, first_name)
);
This query would take advantage of the index:
SELECT last_name, first_name FROM person
WHERE last_name = "John" AND first_name LIKE "J%"
But the following one would not
SELECT last_name, first_name FROM person WHERE first_name = "Constantine"
Because you're querying the first_name column first and it's not the leftmost column in the index.
This last example is even worse:
SELECT last_name, first_name FROM person WHERE first_name LIKE "%Constantine"
Because now, you're comparing the rightmost part of the rightmost field in the index.
The hash index
This is a different index type that unfortunately, only the memory backend supports. It's lightning fast but only useful for full lookups, which means that you can't use it for operations like >, < or LIKE.
Since it only works for the memory backend, you probably won't use it very often. The main case I can think of right now is the one that you create a temporary table in the memory with a set of results from another select and perform a lot of other selects in this temporary table using hash indexes.
If you have a big VARCHAR field, you can "emulate" the use of a hash index when using a B-Tree, by creating another column and saving a hash of the big value on it. Let's say you're storing a url in a field and the values are quite big. You could also create an integer field called url_hash and use a hash function like CRC32 or any other hash function to hash the url when inserting it. And then, when you need to query for this value, you can do something like this:
SELECT url FROM url_table WHERE url_hash=CRC32("http://gnu.org");
The problem with the above example is that since the CRC32 function generates a quite small hash, you'll end up with a lot of collisions in the hashed values. If you need exact values, you can fix this problem by doing the following:
SELECT url FROM url_table
WHERE url_hash=CRC32("http://gnu.org") AND url="http://gnu.org";
It's still worth to hash things even if the collision number is high cause you'll only perform the second comparison (the string one) against the repeated hashes.
Unfortunately, using this technique, you still need to hit the table to compare the url field.
Wrap up
Some facts that you may consider every time you want to talk about optimization:
Integer comparison is way faster than string comparison. It can be illustrated with the example about the emulation of the hash index in InnoDB.
Maybe, adding additional steps in a process makes it faster, not slower. It can be illustrated by the fact that you can optimize a SELECT by splitting it into two steps, making the first one store values in a newly created in-memory table, and then execute the heavier queries on this second table.
MySQL has other indexes too, but I think the B+Tree one is the most used ever and the hash one is a good thing to know, but you can find the other ones in the MySQL documentation.
I highly recommend you to read the "High Performance MySQL" book, the answer above was definitely based on its chapter about indexes.
Basically an index is a map of all your keys that is sorted in order. With a list in order, then instead of checking every key, it can do something like this:
1: Go to middle of list - is higher or lower than what I'm looking for?
2: If higher, go to halfway point between middle and bottom, if lower, middle and top
3: Is higher or lower? Jump to middle point again, etc.
Using that logic, you can find an element in a sorted list in about 7 steps, instead of checking every item.
Obviously there are complexities, but that gives you the basic idea.
Take a look at this link: http://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html
How they work is too broad of a subject to cover in one SO post.
Here is one of the best explanations of indexes I have seen. Unfortunately it is for SQL Server and not MySQL. I'm not sure how similar the two are...
In MySQL InnoDB, there are two types of index.
Primary key which is called clustered index. Index key words are stored with
real record data in the B+Tree leaf node.
Secondary key which is non clustered index. These index only store primary key's key words along with their own index key words in the B+Tree leaf node. So when searching from secondary index, it will first find its primary key index key words and scan the primary key B+Tree to find the real data records. This will make secondary index slower compared to primary index search. However, if the select columns are all in the secondary index, then no need to look up primary index B+Tree again. This is called covering index.
Take at this videos for more details about Indexing
Simple Indexing
You can create a unique index on a table. A unique index means that two rows cannot have the same index value. Here is the syntax to create an Index on a table
CREATE UNIQUE INDEX index_name
ON table_name ( column1, column2,...);
You can use one or more columns to create an index. For example, we can create an index on tutorials_tbl using tutorial_author.
CREATE UNIQUE INDEX AUTHOR_INDEX
ON tutorials_tbl (tutorial_author)
You can create a simple index on a table. Just omit UNIQUE keyword from the query to create simple index. Simple index allows duplicate values in a table.
If you want to index the values in a column in descending order, you can add the reserved word DESC after the column name.
mysql> CREATE UNIQUE INDEX AUTHOR_INDEX
ON tutorials_tbl (tutorial_author DESC)
Adding some visual representation to the list of answers.
MySQL uses an extra layer of indirection: secondary index records point to primary index records, and the primary index itself holds the on-disk row locations. If a row offset changes, only the primary index needs to be updated.
Caveat: Disk data structure looks flat in the diagram but actually is a
B+ tree.
Source: link
I want to add my 2 cents. I am far from being a database expert, but I've recently read up a bit on this topic; enough for me to try and give an ELI5. So, here's may layman's explanation.
I understand it as such that an index is like a mini-mirror of your table, pretty much like an associative array. If you feed it with a matching key then you can just jump to that row in one "command".
But if you didn't have that index / array, the query interpreter must use a for-loop to go through all rows and check for a match (the full-table scan).
Having an index has the "downside" of extra storage (for that mini-mirror), in exchange for the "upside" of looking up content faster.
Note that (in dependence of your db engine) creating primary, foreign or unique keys automatically sets up a respective index as well. That same principle is basically why and how those keys work.
Let's suppose you have a book, probably a novel, a thick one with lots of things to read, hence lots of words.
Now, hypothetically, you brought two dictionaries, consisting of only words that are only used, at least one time in the novel. All words in that two dictionaries are stored in typical alphabetical order. In hypothetical dictionary A, words are printed only once while in hypothetical dictionary B words are printed as many numbers of times it is printed in the novel. Remember, words are sorted alphabetically in both the dictionaries.
Now you got stuck at some point while reading a novel and need to find the meaning of that word from anyone of those hypothetical dictionaries. What you will do? Surely you will jump to that word in a few steps to find its meaning, rather look for the meaning of each of the words in the novel, from starting, until you reach that bugging word.
This is how the index works in SQL. Consider Dictionary A as PRIMARY INDEX, Dictionary B as KEY/SECONDARY INDEX, and your desire to get for the meaning of the word as a QUERY/SELECT STATEMENT.
The index will help to fetch the data at a very fast rate. Without an index, you will have to look for the data from the starting, unnecessarily time-consuming costly task.
For more about indexes and types, look this.
Indexes are used to find rows with specific column values quickly. Without an index, MySQL must begin with the first row and then read through the entire table to find the relevant rows. The larger the table, the more this costs. If the table has an index for the columns in question, MySQL can quickly determine the position to seek to in the middle of the data file without having to look at all the data. This is much faster than reading every row sequentially.
Indexing adds a data structure with columns for the search conditions and a pointer
The pointer is the address on the memory disk of the row with the
rest of the information
The index data structure is sorted to optimize query efficiency
The query looks for the specific row in the index; the index refers to the pointer which will find the rest of the information.
The index reduces the number of rows the query has to search through from 17 to 4.