Slow INNER JOIN of 6 tables - MySQL

Sorry, my SQL knowledge is amateur.
SQL Fiddle: http://sqlfiddle.com/#!2/5640d/1
Please click the link above to refer to the database structure and query.
I have 6 tables; each record takes only one row in each table, and all 6 tables share the same 3 columns: Custgroup, RandomNumber and user_id.
Custgroup is a group name; within a group, each record has a unique RandomNumber.
The query is quite slow on the first run (taking anywhere from several seconds to a few minutes), and fast afterwards, but only for the first few pages. If I click to page 20 or 30+, it loads endlessly (it just took about 5 minutes). The data set is small, only 5,000 rows, so this will be a big problem in the future. I also haven't added any WHERE clause yet, since I need filtering on every column on my website (not my idea; it was requested by my boss).
I tried changing it to LEFT JOIN, JOIN and any other way I could find, but the loading is still slow.
I added an INDEX on user_id, Custgroup and RandomNumber in all tables.
Is there any way to solve this problem? I have never been good at using JOINs, and they are really slow on my database.
Or please let me know if my table structure is really bad; I'm willing to redo it.
Thanks.
Edited:
RUN EXPLAIN:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE tE ALL NULL NULL NULL NULL 5685
1 SIMPLE tA ALL NULL NULL NULL NULL 6072 Using join buffer
1 SIMPLE t1 ref user_id,Custgroup,RandomNumber RandomNumber 23 func 1 Using where
1 SIMPLE tB ALL NULL NULL NULL NULL 5868 Using where; Using join buffer
1 SIMPLE tC ALL NULL NULL NULL NULL 6043 Using where; Using join buffer
1 SIMPLE tD ALL NULL NULL NULL NULL 5906 Using where; Using join buffer
Current indexes (from SHOW INDEX):
Keyname Type Unique Packed Column Cardinality Collation Null Comment
PRIMARY BTREE Yes No ID 6033 A
RandomNumber BTREE No No RandomNumber 6033 A
Custgroup BTREE No No Custgroup 1 A
user_id BTREE No No user_id 1 A
Edited: EXPLAIN EXTENDED .....
id select_type table type possible_keys key key_len ref rows filtered Extra
1 SIMPLE tE ALL NULL NULL NULL NULL 6084 100.00
1 SIMPLE t1 ref user_id,Custgroup,RandomNumber RandomNumber 23 func 1 100.00 Using where
1 SIMPLE tB ALL NULL NULL NULL NULL 5664 100.00 Using where; Using join buffer
1 SIMPLE tC ALL NULL NULL NULL NULL 5976 100.00 Using where; Using join buffer
1 SIMPLE tA ALL NULL NULL NULL NULL 6065 100.00 Using where; Using join buffer
1 SIMPLE tD ALL NULL NULL NULL NULL 6286 100.00 Using where; Using join buffer

The logical indexing for such a structure would be:
CREATE INDEX UserAddedRecord1_ndx ON UserAddedRecord1 (user_id, Custgroup, RandomNumber);
CREATE INDEX UserAddedRecord1_A_ndx ON UserAddedRecord1_A (Custgroup, RandomNumber);
CREATE INDEX UserAddedRecord1_B_ndx ON UserAddedRecord1_B (Custgroup, RandomNumber);
CREATE INDEX UserAddedRecord1_C_ndx ON UserAddedRecord1_C (Custgroup, RandomNumber);
CREATE INDEX UserAddedRecord1_D_ndx ON UserAddedRecord1_D (Custgroup, RandomNumber);
CREATE INDEX UserAddedRecord1_E_ndx ON UserAddedRecord1_E (Custgroup, RandomNumber);
And if you are going to add WHERE clauses, the filtered columns ought to go in the relevant index before the JOIN columns (provided you run an equality or IN search, e.g. City = 'New York'). For example, if City is in UserAddedRecord1_B, then UserAddedRecord1_B_ndx ought to be (City, Custgroup, RandomNumber).
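A hypothetical sketch of that, assuming City really lives in UserAddedRecord1_B and is filtered with an equality match:
CREATE INDEX UserAddedRecord1_B_ndx ON UserAddedRecord1_B (City, Custgroup, RandomNumber);
-- ... JOIN UserAddedRecord1_B tB USING (Custgroup, RandomNumber) ... WHERE tB.City = 'New York'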
But at this point, I have to ask: why? Apparently you always have records for the same user. For example:
t1.Cell,t1.Name,t1.Gender,t1.Birthday
tA.Email,tA.State,tA.Address,tA.City,tA.Postcode
...it is obvious that you can't have two different users here (and having Email in the same block as Postcode tells me this was not really intended as a one-to-many relation).
tB.Website,tB.Description,
tC.Model,tC.Capital,tC.Registry,tC.NoEmployees,
tD.SetUpDate,tD.PeopleInCharge,tD.Certification,tD.AddOEM,
tD.NoResearcher,tD.RoomSize,tD.RegisterMessage,
tE.WebsiteName,tE.OriginalWebsite,tE.QQ,tE.MSN,tE.Skype
These are all portions of a single large "user information form", divided in (optional?) sections.
I surmise that this structure arose from some kind of legacy/framework system that mapped each section of a form submission to a table, so that one person may have an entry in tables B, C and E, and someone else in tables A, C and D.
If this is true, and if user_id is the same across all tables, then one way to make this faster is to explicitly add a condition on user_id for each table, and modify the indexes and JOINs accordingly:
CREATE INDEX UserAddedRecord1_ndx ON UserAddedRecord1 (user_id, Custgroup, RandomNumber);
CREATE INDEX UserAddedRecord1_A_ndx ON UserAddedRecord1_A (user_id, Custgroup, RandomNumber);
CREATE INDEX UserAddedRecord1_B_ndx ON UserAddedRecord1_B (user_id, Custgroup, RandomNumber);
CREATE INDEX UserAddedRecord1_C_ndx ON UserAddedRecord1_C (user_id, Custgroup, RandomNumber);
CREATE INDEX UserAddedRecord1_D_ndx ON UserAddedRecord1_D (user_id, Custgroup, RandomNumber);
CREATE INDEX UserAddedRecord1_E_ndx ON UserAddedRecord1_E (user_id, Custgroup, RandomNumber);
... FROM UserAddedRecord1 t1
JOIN UserAddedRecord1_A tA USING (user_id, CustGroup, RandomNumber)
JOIN UserAddedRecord1_B tB USING (user_id, CustGroup, RandomNumber)
JOIN UserAddedRecord1_C tC USING (user_id, CustGroup, RandomNumber)
JOIN UserAddedRecord1_D tD USING (user_id, CustGroup, RandomNumber)
JOIN UserAddedRecord1_E tE USING (user_id, CustGroup, RandomNumber)
WHERE t1.user_id = '1'
Try fiddle
The thing to do would be to incorporate all the tables into one table with all the fields in one row, and then, perhaps for legacy purposes, create VIEWs that look like tables 1, A, B, C, D and E, each exposing a "vertical" partition of the tuple. The big SELECT would then run on the complete table holding all the fields (and you would save on the duplicated columns, too).
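A minimal sketch of that consolidation, assuming the column names shown above (the types are guesses, and the old tables would have to be dropped or renamed before a view could reuse their names):
CREATE TABLE UserInfo (
  user_id INT NOT NULL,
  Custgroup VARCHAR(50) NOT NULL,
  RandomNumber VARCHAR(20) NOT NULL,
  Cell VARCHAR(30), Name VARCHAR(100), Gender VARCHAR(10), Birthday DATE,
  Email VARCHAR(100), State VARCHAR(50), Address VARCHAR(255), City VARCHAR(50), Postcode VARCHAR(10),
  -- ... remaining columns from tables B, C, D and E ...
  PRIMARY KEY (user_id, Custgroup, RandomNumber)
);
-- Legacy-compatible "vertical" slice:
CREATE VIEW UserAddedRecord1_A AS
SELECT user_id, Custgroup, RandomNumber, Email, State, Address, City, Postcode FROM UserInfo;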

Related

Do indexes work with views?

Assume that I have two tables:
table1(ID, attribute1, attribute2) and
table2(ID, attribute1, attribute2), where ID is the primary key of both tables,
and I have a view:
create view myview as
select ID, attribute1, attribute2 from table1
union
select ID, attribute1, attribute2 from table2
Can I take advantage of the index on the primary key (in SQL in general, and in MySQL in my case) when I execute a query like the following?
select * from myview where ID = 100
It depends on your query. Using a view may limit the indexes that can be used efficiently.
For example, using a table I have handy, I can create a view from 2 UNIONed selects, each with a WHERE clause.
CREATE VIEW fred AS
SELECT *
FROM item
WHERE code LIKE 'a%'
UNION SELECT *
FROM item
WHERE mmg_code LIKE '01%'
Both the code and the mmg_code fields have indexes. The table also has id as a primary key (the highest value is about 59,500).
As a query, I can select from the view, run a query similar to the view's definition, or use an OR (all 3 should give the same results). I get 3 quite different EXPLAINs:
SELECT *
FROM item
WHERE id > 59000
AND code LIKE 'a%'
UNION SELECT *
FROM item
WHERE id > 59000
AND mmg_code LIKE '01%';
gives an EXPLAIN of
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY item range PRIMARY,code,id,id_mmg_code,id_code,code_id PRIMARY 4 NULL 508 Using where
2 UNION item range PRIMARY,id,mmg_code,id_mmg_code,id_code,mmg_code_id PRIMARY 4 NULL 508 Using where
NULL UNION RESULT <union1,2> ALL NULL NULL NULL NULL NULL Using temporary
while the following
SELECT *
FROM item
WHERE id > 59000
AND (code LIKE 'a%'
OR mmg_code LIKE '01%');
gives an EXPLAIN of
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE item range PRIMARY,code,id,mmg_code,id_mmg_code,id_code,code_id,mmg_code_id PRIMARY 4 NULL 508 Using where
and the following
SELECT *
FROM fred
WHERE id > 59000;
gives an EXPLAIN of
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 4684 Using where
2 DERIVED item range code,code_id code 34 NULL 1175 Using index condition
3 UNION item range mmg_code,mmg_code_id mmg_code 27 NULL 3509 Using index condition
NULL UNION RESULT <union2,3> ALL NULL NULL NULL NULL NULL Using temporary
As you can see, because indexes have already been used inside the view, this affects which indexes can be used when selecting from the view.
The best index here is potentially the primary key, but the view doesn't use it.
"Can I use advantage of index of primary key (in sql in general and for mysql in my case), when I execute query like following query?"
MySQL will consider using indexes that have been defined on the underlying tables. However, you cannot create an index on the view itself. See the MySQL manual page "Restrictions on Views" for further explanation.
Running EXPLAIN on a query that uses the view will show the keys being considered in the "possible_keys" column.
EXPLAIN select * from myview where ID = 100;

MySQL: Optimize ORDER BY in a table sort

I am developing an application for my college's website, and I would like to pull all the events from the database in ascending date order. There are four tables in total:
Table Events1
event_id, mediumint(8), Unsigned
date, date,
Index -> Primary Key (event_id)
Index -> (date)
Table events_users
event_id, smallint(5), Unsigned
user_id, mediumint(8), Unsigned
Index -> PRIMARY (event_id, user_id)
Table user_bm
link, varchar(26)
user_id, mediumint(8)
Index -> PRIMARY (link, user_id)
Table user_eoc
link, varchar(8)
user_id, mediumint(8)
Index -> Primary (link, user_id)
Query:
EXPLAIN SELECT * FROM events1 E INNER JOIN event_users EU ON E.event_id = EU.event_id
RIGHT JOIN user_eoc EOC ON EU.user_id = EOC.user_id
INNER JOIN user_bm BM ON EOC.user_id = BM.user_id
WHERE E.date >= '2013-01-01' AND E.date <= '2013-01-31'
AND EOC.link = "E690"
AND BM.link like "1.1%"
ORDER BY E.date
EXPLANATION:
The query above does two things:
1) It searches and filters out all students through the user_bm and user_eoc tables. The "link" columns are denormalized columns used to quickly filter students by major/year/campus, etc.
2) After applying the filter, MySQL grabs the user_ids of all matching students, finds all events they are attending, and outputs them in ascending order.
QUERY OPTIMIZER EXPLAIN:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE EOC ref PRIMARY PRIMARY 26 const 47 Using where; Using index; Using temporary; Using filesort
1 SIMPLE BM ref PRIMARY,user_id-link user_id-link 3 test.EOC.user_id 1 Using where; Using index
1 SIMPLE EU ref PRIMARY,user_id user_id 3 test.EOC.user_id 1 Using index
1 SIMPLE E eq_ref PRIMARY,date-event_id PRIMARY 3 test.EU.event_id 1 Using where
QUESTION:
The query works fine but could be optimized. Specifically, "Using filesort" and "Using temporary" are costly, and I would like to avoid them. I am not sure whether this is possible, because I would like to ORDER BY a date on events that have a 1:n relationship with the matching users; the ORDER BY applies to a joined table.
Any help or guidance would be greatly appreciated. Thank you and Happy Holidays!
Ordering can be done in two ways: by index or by temporary table. You are ordering by date in table Events1, but the query accesses it through the PRIMARY KEY, which doesn't contain date, so in this case the result needs to be ordered in a temporary table.
That is not necessarily expensive, though. If the result is small enough to fit in memory, it will not be a temporary table on disk, just one in memory, and that is not expensive.
Neither is the filesort. "Using filesort" doesn't mean any file will actually be used; it just means the sort is not done by index.
So if your query executes fast, you should be happy. If the result set is small, it will be sorted in memory and no files will be created.
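If you want to confirm that the temporary table and the sort stay in memory, one quick sanity check (using standard MySQL status counters, reset before running the query) is:
FLUSH STATUS;
-- run the query here, then:
SHOW SESSION STATUS LIKE 'Created_tmp%';      -- Created_tmp_disk_tables should stay at 0
SHOW SESSION STATUS LIKE 'Sort_merge_passes'; -- 0 means the filesort fit in sort_buffer_size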

Why is MySQL not using indexes with composite WHERE IN?

I am trying to get several records by composite index from a table that has PRIMARY KEY (a, b):
SELECT * FROM table WHERE (a, b) IN ((1,2), (2,4), (1,3))
The problem is that MySQL is not using the index, even if I FORCE INDEX (PRIMARY).
EXPLAIN SELECT shows null possible_keys.
Why are there no possible_keys?
What is the best way to retrieve multiple rows by composite key:
using OR
using UNION ALL
using WHERE () IN ((),())
P.S. The query is equivalent in result to
SELECT * FROM table WHERE (a = 1 AND b = 2) OR (a = 2 AND b = 4) OR (a = 1 AND b = 3)
Thanks
If the query selects only fields from the index (or if the table has no other fields), the index will be used with a composite WHERE ... IN:
SELECT a,b FROM `table` WHERE (a, b) IN ((1,2), (2,4), (1,3))
Otherwise it will not be used.
The workaround is to use a derived query:
SELECT t.*
FROM (SELECT a, b FROM `table` WHERE (a, b) IN ((1,2), (2,4), (1,3))) AS o
INNER JOIN `table` AS t ON (t.a = o.a AND t.b = o.b)
EXPLAIN SELECT:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 2
1 PRIMARY t eq_ref PRIMARY PRIMARY 2 o.a,o.b 1
2 DERIVED table index NULL PRIMARY 2 NULL 6 Using where; Using index
If you have a strong desire to index a certain column, have you considered adding a new column a_b, which is basically CONCAT(a, '-', b), and just comparing that (WHERE a_b = '{$id1}-{$id2}')?
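A rough sketch of that idea (the column and index names are made up, and the extra column has to be kept in sync by the application or by a trigger):
ALTER TABLE `table` ADD COLUMN a_b VARCHAR(30);
UPDATE `table` SET a_b = CONCAT(a, '-', b);
CREATE INDEX a_b_idx ON `table` (a_b);
SELECT * FROM `table` WHERE a_b IN ('1-2', '2-4', '1-3');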
Also note that a table can have only one PRIMARY KEY; it can span both a and b as a composite key (as yours already does), but you cannot create a separate primary index on each column.
Try creating a combined index on the columns (a, b).
The index doesn't need to be the primary key, and it can still help a lot.
More info about your issue here: http://dev.mysql.com/doc/refman/5.0/en/multiple-column-indexes.html
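For completeness, such a secondary index would look like this (note that the table in the question already has PRIMARY KEY (a, b), so this mainly applies when no such composite key exists):
ALTER TABLE `table` ADD INDEX a_b_idx (a, b);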

How to optimize a MySQL query with multiple fulltext matches

Here is my query:
SELECT heading
FROM table_a, table_b AS m1, table_b AS m2
WHERE (m1.join_id = '69' AND MATCH(m1.content) AGAINST('Harry' IN BOOLEAN MODE))
  AND (m2.join_id = '71' AND MATCH(m2.content) AGAINST('+Highway +Design' IN BOOLEAN MODE))
  AND m1.webid = table_a.id
  AND m2.webid = table_a.id
Right now it takes about 3 seconds. If I take out one of the conditions, like this:
SELECT heading
FROM table_a, table_b AS m2
WHERE (m2.join_id = '71' AND MATCH(m2.content) AGAINST('+Highway +Design' IN BOOLEAN MODE))
  AND m2.webid = table_a.id
It takes around 0.05 seconds.
I have a fulltext index on the 'content' column.
Also, in the first query, if I search for 'Highway Design' (with no operators like + or ""), it takes about 30 seconds.
Here is my EXPLAIN output:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE m1 fulltext id_index,content_ft content_ft 0 1 Using where
1 SIMPLE table_a eq_ref PRIMARY PRIMARY 4 user.m1.id 1
1 SIMPLE m2 fulltext id_index,content_ft content_ft 0 1 Using where
Is there anything else I can do to speed it up?
To explain my tables:
table_a is the main table, which has a heading and a content field, and
table_b is the attribute table for the rows in table_a, so that rows in table_a can have additional attributes.
Let me know if I need to explain this better.
Thanks!
UPDATE: here is the EXPLAIN of the fast query:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE m2 fulltext id_index,content_ft content_ft 0 1 Using where
1 SIMPLE table_a eq_ref PRIMARY PRIMARY 4 user.m2.webid 1
ANOTHER UPDATE - table definitions:
Table_b
id int(11) No None AUTO_INCREMENT
join_id int(11) No None
webid int(11) No None
content text utf8_general_ci No None
Indexes for table_b:
PRIMARY BTREE Yes No id 2723702 A
content BTREE No No content (333) 226975 A
id_index BTREE No No webid 151316 A
content_ft FULLTEXT No No content 118421
Table_a
id int(11) No None AUTO_INCREMENT
heading text utf8_general_ci Yes NULL
Indexes for table_a:
PRIMARY BTREE Yes No id 179154 A
heading BTREE No No heading (300) 89577 A YES
OK, my first suggestion would be to change one of the fulltext searches to a LIKE clause. For this purpose it would be good to have an index on webid, or even better a composite index on (webid, join_id). This should help the tables join more efficiently.
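A rough sketch of that suggestion (the index name is assumed); one MATCH still narrows the rows, and the second fulltext condition becomes a LIKE applied to the already-joined rows:
ALTER TABLE table_b ADD INDEX webid_joinid_idx (webid, join_id);
SELECT heading
FROM table_a
JOIN table_b AS m2 ON m2.webid = table_a.id AND m2.join_id = '71'
JOIN table_b AS m1 ON m1.webid = table_a.id AND m1.join_id = '69'
WHERE MATCH(m2.content) AGAINST('+Highway +Design' IN BOOLEAN MODE)
  AND m1.content LIKE '%Harry%';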

Slow SELECT query with LEFT JOIN ... IS NULL and LIMIT results

I have the following query that is being logged as a slow query:
EXPLAIN EXTENDED SELECT *
FROM (
`photo_data`
)
LEFT JOIN `deleted_photos` ON `deleted_photos`.`photo_id` = `photo_data`.`photo_id`
WHERE `deleted_photos`.`photo_id` IS NULL
ORDER BY `upload_date` DESC
LIMIT 50
Here's the output of explain:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE photo_data index NULL upload_date 8 NULL 142523
1 SIMPLE deleted_photos eq_ref photo_id photo_id 767 tbc.photo_data.photo_id 1 Using where; Not exists
I can see that it's having to go through all 142K records to pull the latest 50 out of the database.
I have these two indexes:
UNIQUE KEY `photo_id` (`photo_id`),
KEY `upload_date` (`upload_date`)
I was hoping that the index on upload_date would help limit the number of rows. Any thoughts on what I can do to speed this up?
You could add a field to your photo_data table that shows whether or not a photo is deleted, instead of having to find that out by joining to another table. Then, if you add an index on (deleted, upload_date), your query should be very fast.
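A minimal sketch of that approach (the column and index names are assumed):
ALTER TABLE photo_data
  ADD COLUMN deleted TINYINT(1) NOT NULL DEFAULT 0,
  ADD INDEX deleted_upload_date (deleted, upload_date);
SELECT *
FROM photo_data
WHERE deleted = 0
ORDER BY upload_date DESC
LIMIT 50;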