I have a table posts with a column descr in my application. I need to query posts whose description is not empty, but the table has too many rows, so I need to add an index. What is the best way to add this index?
Table structure (simplified):
CREATE TABLE posts (
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
title VARCHAR(255),
descr VARCHAR(1024),
PRIMARY KEY(id)
) engine=InnoDB DEFAULT CHARSET=utf8;
Example of query:
SELECT * FROM posts WHERE descr <> '';
I don't want to create an index on the whole descr column, because that would be huge overhead.
I also know the variant of adding another column is_empty_descr BOOLEAN and indexing it. I will use that solution only if no other variant can be found.
I tried adding INDEX( descr(1) ), but I couldn't find a way to make queries use it:
descr <> '' - index is not used
LEFT(descr, 1) = '' - index is not used
SUBSTR(descr, 0, 1) = '' - index is not used
descr LIKE 'a%' - index is used! But this is a totally different case
In all my examples I see something like this:
mysql> EXPLAIN SELECT * FROM posts WHERE descr <> '';
+------+---------------+------+---------+------+------+-------------+
| type | possible_keys | key | key_len | ref | rows | Extra |
+------+---------------+------+---------+------+------+-------------+
| ALL | descr_1 | NULL | NULL | NULL | 42 | Using where |
+------+---------------+------+---------+------+------+-------------+
(I omitted some result columns, because the table is too wide for this site)
Even if I add FORCE INDEX (descr_1), the result is the same.
In MySQL, you can add an index on a prefix of a string column using this syntax:
create index idx_posts_descr1 on posts(descr(1));
You should test this to see if the index is used for that particular where clause, though.
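If the prefix index is not picked up for the inequality, one alternative sketch (assuming MySQL 5.7+, which supports generated columns; the column and index names here are made up) is to materialize the "non-empty" flag the asker mentioned and index that, without maintaining it by hand:

```sql
-- Hypothetical names; requires MySQL 5.7+ generated columns.
-- On older versions the flag would have to be maintained by
-- triggers or application code instead.
ALTER TABLE posts
  ADD COLUMN has_descr TINYINT(1)
    GENERATED ALWAYS AS (descr <> '') STORED,
  ADD INDEX idx_posts_has_descr (has_descr);

-- Filter on the indexed flag instead of the raw expression:
SELECT * FROM posts WHERE has_descr = 1;
```

Note that rows where descr is NULL get a NULL flag, so they are excluded by has_descr = 1, matching the behavior of descr <> ''. Whether the optimizer actually uses this index depends on how selective the flag is, so check with EXPLAIN.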
Sorry, my SQL knowledge is at amateur level.
SQL Fiddle: http://sqlfiddle.com/#!2/5640d/1
Please click the link above to refer to the database structure and query.
I have 6 tables; each record takes only one row in each table, and all 6 tables share the same three columns: Custgroup, RandomNumber and user_id.
Custgroup is a group name; within a group, each record has a unique RandomNumber.
The query is pretty slow on first run (it randomly takes anywhere from several seconds to a few minutes); after that it is fast, but only for the first few pages. If I click to page 20 or 30+, it loads endlessly (it just took about 5 minutes). The data is not much, only 5000 rows, so this will become a big problem in the future. And I still haven't added any WHERE clause, because I need filtering on every column of my website (not my idea; my boss requested it).
I tried changing it to LEFT JOIN, JOIN and every other way I could find, but loading is still slow.
I added an INDEX on user_id, Custgroup and RandomNumber in all tables.
Is there any way to solve this problem? I have never been good at using JOINs, and they are really slow on my database.
Or please let me know if my table structure is really bad; I'm willing to redo it.
Thanks.
**Edited
RUN EXPLAIN:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE tE ALL NULL NULL NULL NULL 5685
1 SIMPLE tA ALL NULL NULL NULL NULL 6072 Using join buffer
1 SIMPLE t1 ref user_id,Custgroup,RandomNumber RandomNumber 23 func 1 Using where
1 SIMPLE tB ALL NULL NULL NULL NULL 5868 Using where; Using join buffer
1 SIMPLE tC ALL NULL NULL NULL NULL 6043 Using where; Using join buffer
1 SIMPLE tD ALL NULL NULL NULL NULL 5906 Using where; Using join buffer
Keyname Type Unique Packed Column Cardinality Collation Null Comment
PRIMARY BTREE Yes No ID 6033 A
RandomNumber BTREE No No RandomNumber 6033 A
Custgroup BTREE No No Custgroup 1 A
user_id BTREE No No user_id 1 A
Edited: EXPLAIN EXTENDED .....
id select_type table type possible_keys key key_len ref rows filtered Extra
1 SIMPLE tE ALL NULL NULL NULL NULL 6084 100.00
1 SIMPLE t1 ref user_id,Custgroup,RandomNumber RandomNumber 23 func 1 100.00 Using where
1 SIMPLE tB ALL NULL NULL NULL NULL 5664 100.00 Using where; Using join buffer
1 SIMPLE tC ALL NULL NULL NULL NULL 5976 100.00 Using where; Using join buffer
1 SIMPLE tA ALL NULL NULL NULL NULL 6065 100.00 Using where; Using join buffer
1 SIMPLE tD ALL NULL NULL NULL NULL 6286 100.00 Using where; Using join buffer
The logical indexing for such a structure would be:
CREATE INDEX UserAddedRecord1_ndx ON UserAddedRecord1 (user_id, Custgroup, RandomNumber);
CREATE INDEX UserAddedRecord1_A_ndx ON UserAddedRecord1_A (Custgroup, RandomNumber);
CREATE INDEX UserAddedRecord1_B_ndx ON UserAddedRecord1_B (Custgroup, RandomNumber);
CREATE INDEX UserAddedRecord1_C_ndx ON UserAddedRecord1_C (Custgroup, RandomNumber);
CREATE INDEX UserAddedRecord1_D_ndx ON UserAddedRecord1_D (Custgroup, RandomNumber);
CREATE INDEX UserAddedRecord1_E_ndx ON UserAddedRecord1_E (Custgroup, RandomNumber);
And if you are going to add WHERE clauses, those columns ought to go in the relevant index before the JOIN columns (provided you run an equality or IN search, e.g. City = "New York"). For example, if City is in UserAddedRecord1_B, then UserAddedRecord1_B_ndx ought to be (City, Custgroup, RandomNumber).
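To make that concrete (City here is only the hypothetical filter column from the example above):

```sql
-- Put the equality-filtered column first, then the join columns:
CREATE INDEX UserAddedRecord1_B_ndx
    ON UserAddedRecord1_B (City, Custgroup, RandomNumber);
```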
But at this point, I have to ask: why? Apparently you always have records for the same user. For example:
t1.Cell,t1.Name,t1.Gender,t1.Birthday
tA.Email,tA.State,tA.Address,tA.City,tA.Postcode
...it is obvious that you can't have two different users here (and having Email in the same block as Postcode tells me this was not really intended as a one-to-many relation).
tB.Website,tB.Description,
tC.Model,tC.Capital,tC.Registry,tC.NoEmployees,
tD.SetUpDate,tD.PeopleInCharge,tD.Certification,tD.AddOEM,
tD.NoResearcher,tD.RoomSize,tD.RegisterMessage,
tE.WebsiteName,tE.OriginalWebsite,tE.QQ,tE.MSN,tE.Skype
These are all portions of a single large "user information form", divided in (optional?) sections.
I surmise that this structure arose from some kind of legacy/framework system that mapped a form submission section to a table. So that someone may have an entry in tables B, C and E, and someone else in tables A, C and D.
If this is true, and if user_id is the same for all tables, then one way of having this go faster is to explicitly add a condition on user_id for each table, and suitably modify indexes and JOINs:
CREATE INDEX UserAddedRecord1_ndx ON UserAddedRecord1 (user_id, Custgroup, RandomNumber);
CREATE INDEX UserAddedRecord1_A_ndx ON UserAddedRecord1_A (user_id, Custgroup, RandomNumber);
CREATE INDEX UserAddedRecord1_B_ndx ON UserAddedRecord1_B (user_id, Custgroup, RandomNumber);
CREATE INDEX UserAddedRecord1_C_ndx ON UserAddedRecord1_C (user_id, Custgroup, RandomNumber);
CREATE INDEX UserAddedRecord1_D_ndx ON UserAddedRecord1_D (user_id, Custgroup, RandomNumber);
CREATE INDEX UserAddedRecord1_E_ndx ON UserAddedRecord1_E (user_id, Custgroup, RandomNumber);
... FROM UserAddedRecord1 t1
JOIN UserAddedRecord1_A tA USING (user_id, CustGroup, RandomNumber)
JOIN UserAddedRecord1_B tB USING (user_id, CustGroup, RandomNumber)
JOIN UserAddedRecord1_C tC USING (user_id, CustGroup, RandomNumber)
JOIN UserAddedRecord1_D tD USING (user_id, CustGroup, RandomNumber)
JOIN UserAddedRecord1_E tE USING (user_id, CustGroup, RandomNumber)
WHERE t1.user_id = '1'
Try fiddle
The thing to do would be to incorporate all the tables into one table with all the fields in one row; then, perhaps for legacy purposes, you might create VIEWs that look like tables 1, A, B, C, D and E, each holding a "vertical" partition of the tuple. The big SELECT would then run on the complete table with all the fields (and you would save on the duplicated columns, too).
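A rough sketch of that consolidation (column types are guesses, and the column list is abbreviated; the real one would come from your six tables):

```sql
-- One wide table holding the whole "user information form".
-- Types and lengths below are illustrative assumptions.
CREATE TABLE UserRecord (
  user_id      INT NOT NULL,
  Custgroup    VARCHAR(50)  NOT NULL,
  RandomNumber VARCHAR(20)  NOT NULL,
  Cell VARCHAR(30), Name VARCHAR(100), Gender VARCHAR(10), Birthday DATE,
  Email VARCHAR(100), State VARCHAR(50), Address VARCHAR(200),
  -- ... remaining columns from the old tables B, C, D and E ...
  PRIMARY KEY (user_id, Custgroup, RandomNumber)
);

-- A legacy-compatible "vertical" view, one per old table, e.g.:
CREATE VIEW UserAddedRecord1_A AS
  SELECT user_id, Custgroup, RandomNumber, Email, State, Address
  FROM UserRecord;
```

With this layout the big SELECT is a single-table scan on UserRecord, with no join buffers at all.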
I am trying to get several records by composite index from a table having PRIMARY KEY (a, b)
SELECT * FROM table WHERE (a, b) IN ((1,2), (2,4), (1,3))
The problem is that MySQL is not using the index, even if I FORCE INDEX (PRIMARY).
EXPLAIN SELECT shows NULL possible_keys.
Why are there no possible_keys?
What is the best way to retrieve multiple rows by composite key:
using OR
using UNION ALL
using WHERE () IN ((),())
P.S. The query is equivalent in result to
SELECT * FROM table WHERE (a = 1 AND b = 2) OR (a = 2 AND b = 4) OR (a = 1 AND b = 3)
Thanks
If the query selects only fields from the index (or if the table has no other fields), the index will be used for the composite WHERE ... IN:
SELECT a,b FROM `table` WHERE (a, b) IN ((1,2), (2,4), (1,3))
Otherwise it will not be used.
The workaround is to use a derived query:
SELECT t.* FROM (SELECT a, b FROM `table` WHERE (a, b) IN ((1,2), (2,4), (1,3))) AS o INNER JOIN `table` AS t ON (t.a = o.a AND t.b = o.b)
EXPLAIN SELECT:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 2
1 PRIMARY t eq_ref PRIMARY PRIMARY 2 o.a,o.b 1
2 DERIVED table index NULL PRIMARY 2 NULL 6 Using where; Using index
If you strongly desire an index for this lookup, have you considered adding a new column a_b, which is basically CONCAT(a, '-', b), and just comparing that (WHERE a_b = '{$id1}-{$id2}')?
Note also that a table can have only one PRIMARY KEY. You can't declare a and b as two separate primary keys, though a single composite primary key can cover both.
Try creating a combined index on the columns (a, b).
The index doesn't need to be the primary key, and it can still help a lot.
More info about your issue here: http://dev.mysql.com/doc/refman/5.0/en/multiple-column-indexes.html
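Of the three variants the asker listed, UNION ALL is also worth trying: each branch is a plain two-column equality lookup, so each can use the primary key on (a, b) directly (a sketch, not tested against the original table):

```sql
-- Each branch is an eq_ref lookup on PRIMARY KEY (a, b):
SELECT * FROM `table` WHERE a = 1 AND b = 2
UNION ALL
SELECT * FROM `table` WHERE a = 2 AND b = 4
UNION ALL
SELECT * FROM `table` WHERE a = 1 AND b = 3;
```

UNION ALL (rather than UNION) avoids a de-duplication pass, which is safe here because the three key tuples are distinct.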
Here is my query:
SELECT heading FROM table_a, table_b as m1, table_b as m2 WHERE (m1.join_id = '69' AND MATCH(m1.content) AGAINST('Harry' IN BOOLEAN MODE)) AND (m2.join_id = '71' AND MATCH(m2.content) AGAINST('+Highway +Design' IN BOOLEAN MODE)) AND m1.webid = table_a.id AND m2.webid = table_a.id
Right now it takes about 3 seconds. If I take out one of the conditions, like this:
SELECT heading FROM table_a, table_b as m2 WHERE (m2.join_id = '71' AND MATCH(m2.content) AGAINST('+Highway +Design' IN BOOLEAN MODE)) AND m2.webid = table_a.id
It takes around 0.05 seconds.
I have a fulltext index on the 'content' column.
Also, in the first query, if I search for 'Highway Design' (without operators like + or "") it takes about 30 seconds.
Here is my explain query:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE m1 fulltext id_index,content_ft content_ft 0 1 Using where
1 SIMPLE table_a eq_ref PRIMARY PRIMARY 4 user.m1.id 1
1 SIMPLE m2 fulltext id_index,content_ft content_ft 0 1 Using where
Is there anything else I can do to speed it up?
To explain my tables:
table_a is the main table, which has a heading and a content field, and
table_b is the attribute table for the rows in table_a, so that rows in table_a can have additional attributes.
Let me know if I need to explain this better.
Thanks!
UPDATE here is the explanation of the fast query:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE m2 fulltext id_index,content_ft content_ft 0 1 Using where
1 SIMPLE table_a eq_ref PRIMARY PRIMARY 4 user.m2.webid 1
ANOTHER UPDATE - TABLE definitions:
Table_b
id int(11) No None AUTO_INCREMENT
join_id int(11) No None
webid int(11) No None
content text utf8_general_ci No None
Indexes for content table_b
PRIMARY BTREE Yes No id 2723702 A
content BTREE No No content (333) 226975 A
id_index BTREE No No webid 151316 A
content_ft FULLTEXT No No content 118421
Table_a
id int(11) No None AUTO_INCREMENT
heading text utf8_general_ci Yes NULL
Table a indexes
PRIMARY BTREE Yes No id 179154 A
heading BTREE No No heading (300) 89577 A YES
OK, my first thought would be to change one of the fulltext searches to a LIKE clause. For this purpose it would be good to have an index on webid, or better yet a composite index on (webid, join_id). This should make the table joining more consistent.
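A sketch of that suggestion (the index name is made up, and the LIKE pattern is one plausible substitute for the single-word MATCH; verify timings on your data):

```sql
-- Composite index so each attribute row is found via (webid, join_id):
CREATE INDEX idx_webid_joinid ON table_b (webid, join_id);

-- Keep the more selective boolean-mode MATCH as the driver,
-- and replace the single-word fulltext search with LIKE:
SELECT heading
FROM table_a
JOIN table_b m2 ON m2.webid = table_a.id AND m2.join_id = '71'
JOIN table_b m1 ON m1.webid = table_a.id AND m1.join_id = '69'
WHERE MATCH(m2.content) AGAINST('+Highway +Design' IN BOOLEAN MODE)
  AND m1.content LIKE '%Harry%';
```

The leading-wildcard LIKE cannot use a B-tree index on content itself, but that is fine here: m1 is reached through idx_webid_joinid, so the LIKE only filters the handful of rows already matched per heading. Note the semantics differ slightly from MATCH (substring match vs. word match).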
I have the following query that is being logged as a slow query:
EXPLAIN EXTENDED SELECT *
FROM (
`photo_data`
)
LEFT JOIN `deleted_photos` ON `deleted_photos`.`photo_id` = `photo_data`.`photo_id`
WHERE `deleted_photos`.`photo_id` IS NULL
ORDER BY `upload_date` DESC
LIMIT 50
Here's the output of explain:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE photo_data index NULL upload_date 8 NULL 142523
1 SIMPLE deleted_photos eq_ref photo_id photo_id 767 tbc.photo_data.photo_id 1 Using where; Not exists
I can see that it's having to go through all 142K records to pull the latest 50 out of the database.
I have these two indexes:
UNIQUE KEY `photo_id` (`photo_id`),
KEY `upload_date` (`upload_date`)
I was hoping that the index on upload_date would help limit the number of rows. Any thoughts on what I can do to speed this up?
You could add a field to your photo_data table that shows whether or not the photo is deleted, instead of having to find that out by joining to another table. Then, if you add an index on (deleted, upload_date), your query should be very fast.
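A migration sketch for that idea (column and index names are made up; keeping the flag in sync with deleted_photos going forward is left to the application or a trigger):

```sql
-- Add the flag, defaulting to "not deleted":
ALTER TABLE photo_data
  ADD COLUMN deleted TINYINT(1) NOT NULL DEFAULT 0;

-- Backfill it from the existing deleted_photos table:
UPDATE photo_data p
  JOIN deleted_photos d ON d.photo_id = p.photo_id
  SET p.deleted = 1;

-- Composite index: equality column first, then the sort column:
ALTER TABLE photo_data
  ADD INDEX idx_deleted_date (deleted, upload_date);

-- The query can now read the first 50 index entries and stop,
-- instead of probing deleted_photos for all 142K rows:
SELECT * FROM photo_data
WHERE deleted = 0
ORDER BY upload_date DESC
LIMIT 50;
```

Because the index is ordered by upload_date within deleted = 0, the ORDER BY ... DESC LIMIT 50 is satisfied by scanning the index backwards, with no filesort.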