Which is better in MySQL: searching by text or by number?
For example:
$sql = "SELECT * FROM table WHERE country = 'Australia' ";
and
$sql = "SELECT * FROM table WHERE country = '10899' ";
Which one loads faster from the database?
For faster searching, search on the primary key: it is indexed and unique, so lookups on it are fast. Both of the examples you provided perform a text search.
Both of your queries are searching by text, as you use single quotes to surround the value. To search by number, you don't have to surround the value by quotes.
To answer your question about performance: search by primary key or by indexed columns for better speed if you have a huge amount of data. On a small dataset you won't notice a difference, as a simple SELECT usually finishes in a fraction of a second.
Here you asked which is better, number or text, but there are many other factors to take into consideration when improving WHERE clause performance.
The biggest concern with a WHERE clause is a full table scan; to avoid one, you have to use an index.
"If the optimizer gets confused or cannot find an appropriate index that matches the WHERE clause, the optimizer will read every row in the table."
Whenever you create a clustered index (the primary key) on a table, the data rows are sorted and stored in the table based on the key values.
This is where text versus number matters.
Prefer a number: comparing and sorting numeric keys takes less time than comparing and sorting text.
Things that improve a WHERE clause:
1) Use the primary key column in the WHERE clause (uniqueness and NOT NULL come with it automatically).
2) If that is not possible, at least have an index on the column used in the WHERE clause (prefer a number, as sorting is faster for large amounts of data).
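The effect of the second point can be sketched with a small script. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in (not MySQL, whose EXPLAIN output looks different), and the table `places`, its columns, and the index name `idx_places_country` are made up for the example.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


# Hypothetical table; SQLite is used here only because it ships with Python.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE places (id INTEGER PRIMARY KEY, country TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO places (country, city) VALUES (?, ?)",
    [("Australia", "Sydney"), ("France", "Paris"), ("Japan", "Osaka")],
)

sql = "SELECT * FROM places WHERE country = ?"

# Without an index on country, every row has to be examined.
plan_without_index = query_plan(conn, sql, ("Australia",))

# With an index on the filtered column, the engine can seek directly.
conn.execute("CREATE INDEX idx_places_country ON places (country)")
plan_with_index = query_plan(conn, sql, ("Australia",))

print(plan_without_index)
print(plan_with_index)
```

The first plan reports a scan of the table and the second a search through the index. In MySQL the equivalent check would be running EXPLAIN on the query before and after creating the index.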
Related
I have a table with a billion+ rows. I have the query below, which I execute frequently:
SELECT SUM(price) FROM mytable WHERE domain IN ('com') AND url LIKE '%/shop%' AND date BETWEEN '2001-01-01' AND '2007-01-01';
Here domain is varchar(10), url is varchar(255), and price is float. I understand that a LIKE pattern of the form %..% will not use an index. So, logically, I created an index on price, domain, and date:
create index price_date on mytable(price, domain, date)
The problem persists: this index is not used either, because the query contains url LIKE '%.com/shop%'
On the other hand a FULLTEXT index still will not work since I have other non text filters in the query.
How can I optimise the above query? I have too many rows not to use an index.
UPDATE
Is this an SQL limitation? Could such a query perform better on a NoSQL database?
You have two range conditions, one uses IN() and the other uses BETWEEN. The best you can hope is that the condition on the first column of the index uses the index to examine rows, and the condition on the second column of the index uses index condition pushdown to make the storage engine do some pre-filtering.
Then it's up to you to choose which column should be the first column in the index, based on how well each condition would narrow down the search. If your condition on date is more likely to reduce the set of examined rows, then put that first in the index definition.
The order of terms in the WHERE clause does not have to match the order of columns in the index.
MySQL does not support optimizing with both a fulltext index and a B-tree index on the same table reference in the same query.
You can't use a fulltext index anyway for the pattern you are searching for. Fulltext indexes don't allow searches for punctuation characters, only words.
I vote for this order:
INDEX(domain, -- first because of "="
date, -- then range
url, price) -- "covering"
but, since the constants look like most of the billion rows would be hit, I don't expect good performance.
If this is a common query and/or "shop" is one of only a few possible filters, we can discuss whether a summary table would be useful.
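The suggested column order (equality column first, then the range column, then the remaining columns to make the index covering) can be tried out on a toy table. This is a sketch under assumptions: it runs on Python's built-in sqlite3 rather than MySQL, the four sample rows are invented, and the index name `domain_date_url_price` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (domain TEXT, url TEXT, price REAL, date TEXT)")
# Invented sample rows, only to make the query return something.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        ("com", "example.com/shop/a", 10.0, "2003-05-01"),  # matches
        ("com", "example.com/blog/b", 99.0, "2004-01-01"),  # url does not match
        ("com", "example.com/shop/c", 5.5, "2010-01-01"),   # outside date range
        ("org", "example.org/shop/d", 7.0, "2005-01-01"),   # other domain
    ],
)

# Equality column first, range column second, then the columns that are
# only read (url, price) so the table itself never has to be touched.
conn.execute(
    "CREATE INDEX domain_date_url_price ON mytable (domain, date, url, price)"
)

sql = """SELECT SUM(price) FROM mytable
         WHERE domain IN ('com') AND url LIKE '%/shop%'
           AND date BETWEEN '2001-01-01' AND '2007-01-01'"""

total = conn.execute(sql).fetchone()[0]
plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

print(total)
print(plan)
```

The url filter is still applied row by row; the index only narrows the rows by domain and date and lets the engine answer from the index alone. As noted above, if most rows fall inside the date range, this narrowing will not help much.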
I have a database column with titles of documents. These titles are not unique, and can be anywhere from a few words to a few dozen words. I have over 3 million rows. I am trying to optimize looking for exact matches.
Indexing is not possible since there is no primary key, and the column is not unique. I have thought about a binary search, but I've heard that's done automatically when you index something. How can I implement a binary search on a column that's not indexable because it is not unique?
SELECT * FROM cases where title = "Bondelmonte v Bondelmonte"
Takes a few seconds, I want it to take a fraction of that time.
Assuming MySQL
CREATE INDEX title_index ON cases (title)
This creates a non-unique index on the "title" column of the "cases" table, so duplicate titles are fine.
https://dev.mysql.com/doc/refman/8.0/en/create-index.html
You would only need to specify UNIQUE if you wanted a unique index.
Additionally, you may want to create a full-text index:
CREATE FULLTEXT INDEX title_flt_Index ON cases ( title );
https://dev.mysql.com/doc/refman/8.0/en/fulltext-search.html
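To illustrate that an index does not require unique values, here is a minimal sketch. It assumes Python's built-in sqlite3 as a stand-in for MySQL, and the `note` column and the sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No primary key is declared and titles may repeat.
conn.execute("CREATE TABLE cases (title TEXT, note TEXT)")
conn.executemany(
    "INSERT INTO cases VALUES (?, ?)",
    [
        ("Bondelmonte v Bondelmonte", "first row"),
        ("Bondelmonte v Bondelmonte", "second row"),  # duplicate title
        ("Smith v Jones", "third row"),
    ],
)

# A plain (non-unique) index is created without complaint.
conn.execute("CREATE INDEX title_index ON cases (title)")

sql = "SELECT * FROM cases WHERE title = ?"
matches = conn.execute(sql, ("Bondelmonte v Bondelmonte",)).fetchall()
plan = [
    row[3]
    for row in conn.execute("EXPLAIN QUERY PLAN " + sql, ("Bondelmonte v Bondelmonte",))
]

print(matches)
print(plan)
```

Both duplicate rows come back, and the plan reports a search through `title_index` instead of a scan of the whole table.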
Is it useful for SELECT performance to set an index on a field that contains only distinct values?
eg:
order_id
--------
98317490
10928343
82931376
93438473
...
That depends. An index is useful if you often search on this column:
WHERE column=value
WHERE column BETWEEN a AND b
The usefulness of an index is determined by its selectivity. For example, if your column contains a boolean, which is:
false in 99.9% of rows
true in 0.1% of rows
Then you can easily guess that using an index to find "true" values will be a huge boost relative to reading the entire table to search for them.
On the other hand, searching for "false" using an index will be slower than not using an index, since you're gonna read the whole table anyway, you might as well not bother to also process the index.
If values are all distinct, then selectivity is maximum, and index will be very useful. That is, assuming you actually search on that column!
An index that is never used only slows down updates.
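The selectivity argument can be put into numbers. The sketch below is plain Python, not a database feature: the helper `selectivity` is hypothetical and simply reports, for each distinct value, the fraction of rows a lookup for that value would have to read.

```python
from collections import Counter


def selectivity(values):
    """Map each distinct value to the fraction of rows that hold it."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}


# Boolean column: false in 99.9% of rows, true in 0.1% of rows.
flags = [False] * 999 + [True]
flag_fractions = selectivity(flags)

# Column where every value is distinct, as in the order_id example.
order_ids = [98317490, 10928343, 82931376, 93438473]
order_fractions = selectivity(order_ids)

print(flag_fractions)
print(order_fractions)
```

A small fraction (searching for true, or for any single order_id) is where an index pays off; a fraction close to 1 (searching for false) means almost the whole table is read anyway.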
Of course it is useful, as with all indexes - it is useful if you have select statements where you have this field on the WHERE clause.
Whether this field has distinct values or not doesn't really matter.
Note that if your field is marked as UNIQUE or PRIMARY KEY in the database, the database will technically already have an index for this field, so adding another index for it will not change anything.
Given the following SQL table :
Employee(ssn, name, dept, manager, salary)
You discover that the following query is significantly slower than expected. There is an index on salary, and you have verified that the query plan is using it.
SELECT *
FROM Employee
WHERE salary = 48000
Please give a possible reason why this query is slower than expected, and provide a tuning solution that addresses that reason.
I have two ideas for why this query is slower than expected. One is that we are trying to SELECT * instead of SELECT Employee.salary which would slow down the query as we must search across all columns instead of one. Another idea is that the index on salary is non-clustered, and we want to use a clustered index, as the company could be very large and it would make sense to organize the table by the salary field.
Would either of those two solutions speed up this query? I.e. either change SELECT * to SELECT Employee.salary or explicitly set the index on salary to be clustered?
What indexes do you have now?
Is it really "slow"? What evidence do you have?
Comments on "SELECT * instead of SELECT Employee.salary" --
* is bad form because tomorrow you might add a column, thereby breaking any code that is expecting a certain number of columns in a certain order.
Dealing with * versus salary does not happen until after the row(s) is located.
Locating the row(s) is the costly part.
On the other hand, if you have INDEX(salary) and only look at salary then the index is "covering". That means that the "data" (the other columns) does not need to be fetched. Hence, faster. But this is probably beyond what your teacher has told you about yet.
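The "covering" case can be seen in a query plan. This is a sketch assuming Python's built-in sqlite3 as a stand-in (MySQL reports the same idea as "Using index" in the Extra column of EXPLAIN); the index name `idx_salary` and the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee ("
    "ssn TEXT PRIMARY KEY, name TEXT, dept TEXT, manager TEXT, salary INTEGER)"
)
conn.execute("CREATE INDEX idx_salary ON Employee (salary)")


def plan(sql):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Only salary is needed, so the index alone can answer the query.
narrow_plan = plan("SELECT salary FROM Employee WHERE salary = 48000")

# All columns are needed, so each index hit requires a lookup into the table.
wide_plan = plan("SELECT * FROM Employee WHERE salary = 48000")

print(narrow_plan)
print(wide_plan)
```

Both plans use the index to locate rows; only the first is reported as covering, which is the difference described above.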
Comments on "the index on salary is non-clustered, and we want to use a clustered index" --
In MySQL (not necessarily in other RDBMSs), InnoDB has exactly one PRIMARY KEY and it is always UNIQUE and "clustered".
That is, "clustered" implies "unique", which seems inappropriate for "salary".
In InnoDB a "secondary key" implicitly includes the column(s) of the PK (ssn?), with which it can reach over into the data.
"verified that the query plan" -- Have you learned about EXPLAIN SELECT ...?
More Tips on creating the optimal index for a given SELECT.
I will try to keep this as simple as I can.
You cannot simply make salary a clustered index unless you make it unique or the primary key, which makes no sense because two people can have the same salary.
There can be only one clustered index per table according to the MySQL documentation. By default the database uses the primary key as the clustered index.
"If you do not define a PRIMARY KEY for your table, MySQL locates the first UNIQUE index where all the key columns are NOT NULL and InnoDB uses it as the clustered index."
To speed up your query I have a few suggestions; go for secondary indexes.
If you only search for a salary by exact value, a hash-based index is a better option where the storage engine supports it (in MySQL, MEMORY tables do; InnoDB only lets you create B-tree indexes).
If you search using greater than, less than, or some other range, a B-tree index is the better choice.
The first option is faster than the second, but it is limited to the equality operator.
Hope it helps.
I'm trying to understand if it's possible to use an index on a join if there is no limiting where on the first table.
Note: this is not a line-by-line real-case usage, just something I drafted for understanding purposes. Don't point out the obvious "what are you trying to obtain with this schema?", "you should use UNSIGNED" or the like, because that's not the question.
Note 2: the question "MySQL JOINS without where clause" is somewhat related but not the same.
Schema:
CREATE TABLE posts (
id_post INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
text VARCHAR(100)
);
CREATE TABLE related (
id_relation INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
id_post1 INT NOT NULL,
id_post2 INT NOT NULL
);
CREATE INDEX related_join_index ON related(id_post1) using BTREE;
Query:
EXPLAIN SELECT * FROM posts FORCE INDEX FOR JOIN(PRIMARY) INNER JOIN related ON id_post=id_post1 LIMIT 0,10;
SQL Fiddle: http://sqlfiddle.com/#!2/84597/3
As you can see, the index is being used on the second table, but the engine is doing a full table scan on the first one (the FORCE INDEX is there just to highlight the general question).
I'd like to understand if it's possible to get a "ref" on the left side too.
Thanks!
Update: if the first table has significantly more records than the second, things swap: the engine uses an index for the first one and a full table scan for the second http://sqlfiddle.com/#!2/3a3bb/1 Still, there is no way to get indexes used on both.
The DBMS has an optimizer to figure out the best plan to execute a query. It's up to the optimizer to decide whether to use an index or simply read the table directly.
An index makes sense when the DBMS expects to read only a few records from a table (say 1% of all rows). But once it expects to read many records (say 99% of all rows), it will not use the index. The threshold may be as low as 5% (i.e. <= 5% -> index; > 5% -> table scan).
There are exceptions. One is when an index holds all the columns needed; then the table itself doesn't have to be read at all. Another may be when the optimizer thinks index access will turn out faster in spite of having to read many rows. It's also always possible that the optimizer simply guesses wrong.
There is a page in the MySQL documentation about this subject.
Regarding the possibility to get a ref on the first table from the query, the short answer is NO.
The reason is obvious: because there is no WHERE clause ALL the rows from table posts are analyzed because they could be included in the result set. There is no reason to use an index for that, a full table scan is better because it gets all the rows; and because the order doesn't matter, the access is (more or less) sequential. Using an index requires reading more information from the storage (index and data).
MySQL will use the join type index if all the columns that appear in the SELECT clause are present in an index. In this case MySQL will perform a full index scan (join type index) instead of a full table scan (join type ALL) because it requires reading less information from the storage (an index is usually smaller than the entire table data).
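Both points can be reproduced on the schema from the question. This is only a sketch: it assumes Python's built-in sqlite3 as a stand-in, so the plan wording differs from MySQL's join types (a SQLite "SCAN" roughly corresponds to ALL or index, a "SEARCH" to ref or eq_ref), and AUTO_INCREMENT is replaced by SQLite's INTEGER PRIMARY KEY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE posts (
        id_post INTEGER PRIMARY KEY,
        text VARCHAR(100)
    );
    CREATE TABLE related (
        id_relation INTEGER PRIMARY KEY,
        id_post1 INT NOT NULL,
        id_post2 INT NOT NULL
    );
    CREATE INDEX related_join_index ON related (id_post1);
    """
)


def plan(sql):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Join without a WHERE clause: one table has to be read in full,
# the other is probed through an index for each row.
join_plan = plan("SELECT * FROM posts INNER JOIN related ON id_post = id_post1")

# Every selected column is in the index, so the index is scanned
# instead of the table.
index_only_plan = plan("SELECT id_post1 FROM related")

print(join_plan)
print(index_only_plan)
```

Which table ends up as the scanned side is the optimizer's choice, which matches the swap observed in the update to the question.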