Hi all,
I am using a single database with about 7 tables, all of which already contain data: roughly 10k rows for now, though this will grow and may eventually reach millions.
My question is: why is my query so slow to fetch results? It takes about 10 to 12 seconds per query even with no load. I am worried about what will happen under load, say thousands of queries at a time.
here is my sample query...
$result = $db->sql_query("SELECT * FROM table1, table2, table3, table4, table5
    WHERE table1.url = table2.url
      AND table1.url = table3.url
      AND table1.url = table4.url
      AND table1.url = table5.url
      AND table1.url = '" . $uri . "'") or die(mysql_error());
$row = $db->sql_fetchrow($result);
$daysA = $row['regtime'];
$days = (strtotime(date("Y-m-d")) - strtotime($row['regtime'])) / (60 * 60 * 24);
if ($row && $days < 2) {
    $row['data'];
    $row['data1'];
    // remaining
} else {
    // some code
}
I'm not sure if you have resolved the problem or not, but here's some test data that I have produced. There are a number of factors that can affect the speed of your queries, so my simple test cases may not accurately reflect your tables or data. However, they serve as a useful starting point.
First, create 5 simple tables, each with the same structure. As with your tables, I have used a UNIQUE index on the url column:
CREATE TABLE `table1` (
`id` int(11) NOT NULL auto_increment,
`url` varchar(255) default NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `url` (`url`)
) ENGINE=InnoDB;
CREATE TABLE table2 LIKE table1;
CREATE TABLE table3 LIKE table1;
CREATE TABLE table4 LIKE table1;
CREATE TABLE table5 LIKE table1;
The following script creates a stored procedure which is used to fill each table with 10,000 rows of data:
DELIMITER //
DROP PROCEDURE IF EXISTS test.autofill//
CREATE PROCEDURE test.autofill()
BEGIN
DECLARE i INT DEFAULT 5;
WHILE i < 10000 DO
INSERT INTO table1 (url) VALUES (CONCAT('wwww.stackoverflow.com/', i ));
INSERT INTO table2 (url) VALUES (CONCAT('wwww.stackoverflow.com/', 10000 - i ));
INSERT INTO table3 (url) VALUES (CONCAT('wwww.stackoverflow.com/', i + 6000 ));
INSERT INTO table4 (url) VALUES (CONCAT('wwww.stackoverflow.com/', i + 3000 ));
INSERT INTO table5 (url) VALUES (CONCAT('wwww.stackoverflow.com/', i + 2000 ));
SET i = i + 1;
END WHILE;
END;
//
DELIMITER ;
CALL test.autofill();
Each table now contains just under 10,000 rows. Your SELECT statement can now be used to query the data:
SELECT *
FROM table1,table2,table3,table4,table5
WHERE table1.url = table2.url
AND table1.url = table3.url
AND table1.url = table4.url
AND table1.url = table5.url
AND table1.url = 'wwww.stackoverflow.com/8000';
This gives the following result almost instantly:
+------+-----------------------------+------+-----------------------------+------+-----------------------------+------+-----------------------------+------+-----------------------------+
| id | url | id | url | id | url | id | url | id | url |
+------+-----------------------------+------+-----------------------------+------+-----------------------------+------+-----------------------------+------+-----------------------------+
| 7996 | wwww.stackoverflow.com/8000 | 1996 | wwww.stackoverflow.com/8000 | 1996 | wwww.stackoverflow.com/8000 | 4996 | wwww.stackoverflow.com/8000 | 5996 | wwww.stackoverflow.com/8000 |
+------+-----------------------------+------+-----------------------------+------+-----------------------------+------+-----------------------------+------+-----------------------------+
An EXPLAIN SELECT shows why the query is very fast:
EXPLAIN SELECT *
FROM table1,table2,table3,table4,table5
WHERE table1.url = table2.url
AND table1.url = table3.url
AND table1.url = table4.url
AND table1.url = table5.url
AND table1.url = 'wwww.stackoverflow.com/8000';
+----+-------------+--------+-------+---------------+------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+-------+---------------+------+---------+-------+------+-------------+
| 1 | SIMPLE | table1 | const | url | url | 258 | const | 1 | Using index |
| 1 | SIMPLE | table2 | const | url | url | 258 | const | 1 | Using index |
| 1 | SIMPLE | table3 | const | url | url | 258 | const | 1 | Using index |
| 1 | SIMPLE | table4 | const | url | url | 258 | const | 1 | Using index |
| 1 | SIMPLE | table5 | const | url | url | 258 | const | 1 | Using index |
+----+-------------+--------+-------+---------------+------+---------+-------+------+-------------+
select_type is SIMPLE, which means the query is a simple SELECT with no subqueries or UNIONs to slow things down.
type is const, which means that the table has at most one possible match - this is thanks to the UNIQUE index, which guarantees no two URLs will be the same (see mysql 5.0 indexes - Unique vs Non Unique for a good description of UNIQUE INDEX). A const value in the type column is about as good as you can get.
possible_keys and key use the url key. That means that the correct index is being used for each table.
ref is const, which means that MySQL is comparing a constant value (one that does not change) with the index. Again, this is very fast.
rows equals 1. MySQL only needs to look at one row from each table. Once again, this is very fast.
Extra is Using index. MySQL does not have to do any additional non-indexed searches of the tables.
Provided you have an index on the url column of each table, your query should be extremely fast.
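If any of those indexes is missing, it can be added after the fact. A minimal sketch, using the table and column names from the test schema above (the index name is arbitrary):

```sql
-- UNIQUE matches the test schema; drop the keyword if duplicate URLs are allowed
ALTER TABLE table1 ADD UNIQUE INDEX url (url);
```

Repeat for table2 through table5, then re-run EXPLAIN to confirm that each table now shows type const.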
It definitely looks like an index on the url field in each table is the way to go.
It sounds likely that some of the columns in your WHERE clause are not indexed. Indexes are used to find rows with specific column values quickly. Without an index, MySQL must begin with the first row and then read through the entire table to find the relevant rows.
You might find EXPLAIN helpful in analyzing your queries.
Look up JOINs, and especially the difference between INNER JOINs, LEFT JOINs and OUTER JOINs. Also, add an INDEX on every field on which you are going to do a lookup.
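As a sketch, the implicit comma-join from the question can be written with explicit INNER JOINs; the result is identical, but the join conditions are easier to read (the URL constant here is a placeholder for whatever value the application passes in):

```sql
SELECT *
FROM table1
INNER JOIN table2 ON table2.url = table1.url
INNER JOIN table3 ON table3.url = table1.url
INNER JOIN table4 ON table4.url = table1.url
INNER JOIN table5 ON table5.url = table1.url
WHERE table1.url = 'http://www.example.com/';
```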
Probably something wrong with your indexes!
In any case, long character strings like URLs make for poorly performing primary keys. They take up a lot of room in the index, so the indexes are not as dense as they could be and fewer row pointers are loaded per IO. Also, with URLs the chances are that 99% of your strings start with "http://www.", so the database engine has to compare 13 characters before it decides a row does not match.
One solution to this is to use a hash function like MD5, SHA1 or even CRC32 to derive a raw binary value from your strings and use that value as the primary key for your tables. CRC32 makes a nice integer-sized primary key, but it is almost certain that at some stage you will encounter two URLs that hash to the same CRC32 value, so you will need to store and compare the url string to be sure. The other hash functions return longer values (16 bytes and 20 bytes respectively in "raw" mode), but the chances of a collision are so small that they are not worth worrying about.
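A sketch of the CRC32 variant (the url_crc column and index name are illustrative, not from the question):

```sql
ALTER TABLE table1 ADD COLUMN url_crc INT UNSIGNED NOT NULL DEFAULT 0;
UPDATE table1 SET url_crc = CRC32(url);
ALTER TABLE table1 ADD INDEX url_crc_idx (url_crc);

-- Seek on the short integer hash, then re-check the full string,
-- because distinct URLs can share a CRC32 value:
SELECT *
FROM table1
WHERE url_crc = CRC32('http://www.example.com/page')
  AND url = 'http://www.example.com/page';
```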
Related
I am generating a mySQL query from PHP.
Part of the query re-orders a table based on some variables (which do not include the primary key).
The code doesn't produce errors, however the table is not sorted.
I echoed the SQL and it looks correct. I tried running it directly in phpMyAdmin, and there it also runs without error, but the table is still not sorted as requested.
alter table anavar order by dset_name, var_id;
I am pretty sure this has to do with the fact that I have a primary key column (UID) which is not present in the sort.
Both before and after running the query, the table remains ordered by UID. Deleting UID and re-running the query results in a correctly sorted table, but that seems like an overkill solution.
Any suggestions?
create table t2
( id int auto_increment primary key,
someInt int not null,
thing varchar(100) not null,
theWhen datetime not null,
key(theWhen) -- creates an index on theWhen
);
-- my table now has 2 indexes on it
-- see it by running `show indexes from t2`
-- truncate table t2;
insert t2(someInt,thing,theWhen) values
(17,'chopstick','2016-05-08 13:00:00'),
(14,'alligator','2016-05-01'),
(11,'snail','2016-07-08 19:00:00');
select * from t2; -- returns in physical order (the primary key `id`)
select * from t2 order by thing; -- returns via thing, which has no index anyway
select * from t2 order by theWhen,thing; -- partial index use
Note that the optimizer may not even use indexes until the table holds a significant number of rows anyway.
Edit (new data comes in)
insert t2 (someInt,thing,theWhen) values (777,'apple',now());
select t2.id, t2.thing, t2.theWhen, @rnk := @rnk + 1 as rank
from t2
cross join (select @rnk := 0) xParams
order by thing;
+----+-----------+---------------------+------+
| id | thing | theWhen | rank |
+----+-----------+---------------------+------+
| 2 | alligator | 2016-05-01 00:00:00 | 1 |
| 4 | apple | 2016-09-04 15:04:50 | 2 |
| 1 | chopstick | 2016-05-08 13:00:00 | 3 |
| 3 | snail | 2016-07-08 19:00:00 | 4 |
+----+-----------+---------------------+------+
Focus on the fact that you can maintain your secondary indices and generate a rank on the fly whenever you want.
Because I am using a data structure beyond my control, there is a table in my DB which will potentially have millions of Foreign-key => (key => value) pairs. Now, I know that one of the keys will be a certain value (in this case the key is related_content). Is it possible for MySQL to optimize the query so that it does not have to search the entire table for results?
Example table (called meta):
fk | key | value
====================================
1 | 'related_content' | '[2,3,4]'
1 | 'condiment' | 'mayo'
1 | 'condiment' | 'bananas'
29 | 'condiment' | 'ketchup'
29 | 'related_content' | '[1,7,9]'
95 | 'condiment' | 'mustard'
95 | 'related_content' | '[5,6,8]'
Example query:
SELECT value FROM meta WHERE fk = 29 AND `key` = 'related_content';
What I would like to do is:
ALTER TABLE `meta` ADD INDEX `meta_related` ON (`key`) WHERE `key` = 'related_content';
(Before anyone asks, the key column already has an index on it)
Add a 'composite' index
INDEX(fk, key)
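In full MySQL syntax that would be something like the following (the index name is arbitrary; key must be backtick-quoted because it is a reserved word):

```sql
ALTER TABLE meta ADD INDEX fk_key (fk, `key`);
```

With this composite index, a query filtering on both fk and `key` can locate the matching rows directly instead of scanning every row for a given fk.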
What if you add a key_index (INT) column that represents the string in the key column, e.g. related_content = 1, condiment = 2, and put the index on key_index? An index on an integer is faster than an index on a string, and in the end you would be selecting on two indexed integer fields. It's going to be fast.
I have a sql query
SELECT level, data, string, soln, uid
FROM user_data
WHERE level = 10 AND (timetaken >= 151 AND timetaken <= 217) AND uid != 1
LIMIT 8852, 1;
which fetches from a table with 1.5 million entries.
I have indexed using
alter table user_data add index a_idx (level, timetaken, uid);
The issue I am facing is that the query takes more than 30 seconds in some cases and less than 0.01 seconds in others.
Is there any issue with the indexing here?
Edit:
Added the explain query details
+----+-------------+------------------+-------+---------------+------------+---------+------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------------+-------+---------------+------------+---------+------+-------+--------------------------+
| 1 | SIMPLE | user_data | range | a_idx | a_idx | 30 | NULL | 24091 | Using where; Using index |
+----+-------------+------------------+-------+---------------+------------+---------+------+-------+--------------------------+
The data field in the table is a TEXT field, and its length is greater than 255 characters in most cases. Does this cause an issue?
First of all you should try getting the execution plan of this query with EXPLAIN:
EXPLAIN SELECT level, data, string, soln, uid
FROM user_data
WHERE level = 10 AND (timetaken >= 151 AND timetaken <= 217) AND uid != 1
LIMIT 8852, 1;
This is a great slide to follow through on this topic:
http://www.slideshare.net/phpcodemonkey/mysql-explain-explained
Try adding different indexes:
one on uid and level
a separate one on timetaken
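A sketch of those suggestions (index names are arbitrary):

```sql
ALTER TABLE user_data ADD INDEX uid_level_idx (uid, level);
ALTER TABLE user_data ADD INDEX timetaken_idx (timetaken);
```

Whether the optimizer prefers these over the existing a_idx depends on the data distribution, so compare the EXPLAIN output before and after.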
The problem is the high offset. In order to return the 8853rd result, MySQL has to scan and throw away the 8852 rows before it.
By the way, using LIMIT without ORDER BY may lead to unexpected results, because the row order is not guaranteed.
To speed up queries with a high offset, you should move to a since..until (keyset) pagination strategy.
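A sketch of that strategy, assuming the application remembers the last uid it returned (:last_uid stands for a bind parameter, not literal MySQL syntax):

```sql
SELECT level, data, string, soln, uid
FROM user_data
WHERE level = 10
  AND timetaken BETWEEN 151 AND 217
  AND uid <> 1
  AND uid > :last_uid   -- the "since" boundary from the previous page
ORDER BY uid
LIMIT 1;
```

With an index that leads on the paging column (for example (level, uid)), MySQL can seek straight to the boundary instead of scanning and discarding thousands of rows.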
We have a `users` table that holds information about our users. One of the fields within this table is called query. I am trying to SELECT the user ids of all users that have the same query, so my output should look like this:
user1_id user2_id common_query
43 2 "foo"
117 433 "bar"
1 119 "baz"
1 52 "qux"
Unfortunately, I can't get this query to finish in under an hour (the users table is pretty big). This is my current query:
SELECT u1.id,
u2.id,
u1.query
FROM users u1
INNER JOIN users u2
ON u1.query = u2.query
AND u1.id <> u2.id
My explain:
+----+-------------+-------+-------+----------------------+----------------------+---------+---------------------------------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+----------------------+----------------------+---------+---------------------------------+----------+--------------------------+
| 1 | SIMPLE | u1 | index | index_users_on_query | index_users_on_query | 768 | NULL | 10905267 | Using index |
| 1 | SIMPLE | u2 | ref | index_users_on_query | index_users_on_query | 768 | u1.query | 11 | Using where; Using index |
+----+-------------+-------+-------+----------------------+----------------------+---------+---------------------------------+----------+--------------------------+
As you can see from the EXPLAIN, the users table is indexed on query and the index appears to be used in my SELECT. I'm wondering why the 'rows' column for table u2 has a value of 11 and not 1. Is there anything I can do to speed this query up? Is my '<>' comparison within the join bad practice? Also, the id field is the primary key.
My biggest concern is the key_len, which indicates that MySQL must compare up to 768 bytes in order to look up each index entry.
For this query, a hash index on query could be much more performant, as it would involve substantially shorter comparisons (at the cost of calculating hashes and being unable to sort records using that index):
ALTER TABLE users ADD INDEX (query) USING HASH;
Note that not every storage engine honours USING HASH: InnoDB and MyISAM silently build a B-tree index instead, so this only takes effect on engines such as MEMORY.
You might also consider making this a composite on (query, id) so that MySQL need not scan into the record itself to test the <> criterion.
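A sketch of that composite index (the name is arbitrary):

```sql
ALTER TABLE users ADD INDEX query_id_idx (query, id);
```

Because both query and id are then present in the index, MySQL can evaluate the u1.id <> u2.id condition from the index alone, which shows up as "Using index" in EXPLAIN.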
The main driver of the query is the equality on the query field, provided it is indexed. The <> comparison on id is probably not very selective, and this shows in the EXPLAIN output, where the access type used is 'ref'.
Below only applies if 'query' is not indexed....
If id is the primary key you could just do this:
CREATE INDEX index_1 ON users (query);
The result of adding such an index will be a covering index for the query and will result in the fastest execution for the query.
How many queries do you have? You can add a table UsersInQueries:
id | queryId | userId
 0 |       5 |    453
 1 |      23 |    732
 2 |      15 |    761
Then select from this table and group by queryId.
If you only have up to two users per query, you could do this instead:
select query, min(id) as FirstID, max(id) as SecondId
from users
group by query
having count(*) > 1
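If more than two users can share a query and you want them all on one row rather than as pairs, a GROUP_CONCAT variant is one option (note that its output is truncated at group_concat_max_len, 1024 bytes by default):

```sql
SELECT query, GROUP_CONCAT(id ORDER BY id) AS user_ids
FROM users
GROUP BY query
HAVING COUNT(*) > 1;
```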
If you have more than two users with the same query, can you explain why you would want all pairs of such users?
I am querying a mySQL database to retrieve the data from 1 particular row. I'm using the table primary key as the WHERE constraint parameter.
E.g.
SELECT name FROM users WHERE userid = 4
The userid column is the primary key of the table. Is it good practice to use LIMIT 1 on the end of that mySQL statement? Or are there any speed benefits?
I would call that a bad practice as when it comes to something like a userid it's generally unique and you won't have more than one. Therefore, having LIMIT 1 seems pretty contradictory and someone who comes to maintain your code later may have to second-guess your design.
Also, I don't think it has any speed benefit at all. You can check out mySQL's Explain for a simple tool to analyze a query.
Note that, as mentioned in the comments, LIMIT # does have speed and other benefits in other cases; just not this one.
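One such case: when the filter column is not unique and you only need a sample of the matches, LIMIT lets MySQL stop scanning as soon as it has enough rows (the column name here is illustrative):

```sql
SELECT name FROM users WHERE city = 'London' LIMIT 10;
```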
The userid column is the primary key of the table. Is it good practice to use LIMIT 1 on the end of that mySQL statement? Or are there any speed benefits?
It is not good practice to use LIMIT 1 at the end of the example. It's completely unnecessary, because the userid column is a primary key: there is only one row in the table with that value, so only one row can ever be returned.
But the ultimate indicator is the explain plan:
explain SELECT t.name FROM USERS t WHERE t.userid = 4
...returns:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
-----------------------------------------------------------------------------------------------------
1 | SIMPLE | users | const | PRIMARY | PRIMARY | 4 | const | 1 |
...and:
explain SELECT t.name FROM USERS t WHERE t.userid = 4 LIMIT 1
...returns:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
-----------------------------------------------------------------------------------------------------
1 | SIMPLE | users | const | PRIMARY | PRIMARY | 4 | const | 1 |
Conclusion
No difference, no need. It appears to be optimized out in this case (only searching against the primary key).
The LIMIT clause
Using LIMIT without an ORDER BY will return an arbitrary row/record if more than one is returned. For example, using the "John Smith" scenario where 2+ people can have the name "John Smith":
SELECT t.userid
FROM USERS t
WHERE t.first_name = 'John'
AND t.last_name = 'Smith'
LIMIT 1
...risks returning any of the possible userid values where the first name is "John" and the last name is "Smith". It can't be guaranteed to always return the same value, and the likelihood of getting a different value every time increases with the number of possible records.
Personally I don't care for the use of LIMIT. The syntax isn't supported on Oracle, SQL Server or DB2 - making queries less portable. LIMIT is a tool to be used conservatively, not the first thing you reach for - know when to use aggregate and/or analytic functions.