MySQL query performance limit? - mysql

We have a homegrown document management system that is running very slowly, particularly on search. It worked fine at first, but it has gotten progressively slower over time. It's now taking anywhere from 30 to 150 seconds to return results, depending on the criteria. This is our search query. We've been staring at this thing left and right and can't see any place to tune it further. All of the joined fields are indexed on their respective tables.
SELECT DISTINCT f.*, ts.*, fo.*, ft.*, p.*, u.*, c.*, co.*, ct.*, fs.*, fd.*, r.*, rt.*, si.*, s.* FROM (
SELECT DISTINCT f.* FROM files f
JOIN folders fo ON(fo.id = f.belongs_to_folder_id)
JOIN projects p ON(p.id = f.belongs_to_project_id)
LEFT OUTER JOIN file_statuses fs ON(fs.id = f.file_status_id)
LEFT OUTER JOIN submittal_items_files sif ON(sif.file_id = f.id)
LEFT OUTER JOIN submittal_items si ON(si.id = sif.submittal_item_id)
LEFT OUTER JOIN submittals s ON(s.id = si.belongs_to_submittal_id)
LEFT OUTER JOIN record_types rt ON(rt.id = f.record_type_id)
LEFT OUTER JOIN companies co ON(co.id = f.company_id)
LEFT JOIN folders_actions_groups ag ON (
f.belongs_to_folder_id = ag.folder_id AND
ag.action_id = 10010
)
LEFT JOIN files_actions_groups fg ON (fg.file_id = f.id)
JOIN users_groups ug ON ((ug.group_id = ag.group_id OR ug.group_id = fg.group_id) AND ug.user_id = 411)
WHERE (
(f.file_generated_name LIKE CONCAT('%', 'the', '%')) OR
(f.record_id LIKE CONCAT('%', 'the', '%')) OR
(f.record_title LIKE CONCAT('%', 'the', '%')) OR
(f.additional_info LIKE CONCAT('%', 'the', '%')) OR
(si.item_number LIKE CONCAT('%', 'the', '%')) OR
(s.element_number LIKE CONCAT('%', 'the', '%'))
) AND f.path LIKE CONCAT('Some Text', '%') AND
f.file_status_id = 3 AND
f.file_revision = 1 AND
f.discipline_id = 1 AND
f.record_type_id = 2 AND
f.triage_status_id = 2 AND
f.deleted = 0
ORDER BY f.created DESC, f.id DESC
LIMIT 100
) AS f
LEFT OUTER JOIN users u ON(f.created_by_user_id = u.id)
LEFT OUTER JOIN contacts c ON(c.user_id = u.id)
LEFT OUTER JOIN companies co ON(co.id = f.company_id)
LEFT OUTER JOIN company_types ct ON(ct.id = co.company_type_id)
JOIN triage_statuses ts ON(f.triage_status_id = ts.id)
JOIN folders fo ON(fo.id = f.belongs_to_folder_id)
JOIN folder_types ft ON(ft.id = fo.folder_type_id)
JOIN projects p ON(p.id = f.belongs_to_project_id)
LEFT OUTER JOIN file_statuses fs ON(fs.id = f.file_status_id)
LEFT OUTER JOIN file_disciplines fd ON(fd.id = f.discipline_id)
LEFT OUTER JOIN revisions r ON(r.id = f.file_revision)
LEFT OUTER JOIN record_types rt ON(rt.id = f.record_type_id)
LEFT OUTER JOIN submittal_items_files sif ON(sif.file_id = f.id)
LEFT OUTER JOIN submittal_items si ON(si.id = sif.submittal_item_id)
LEFT OUTER JOIN submittals s ON(s.id = si.belongs_to_submittal_id)
LEFT OUTER JOIN files_actions_groups ffg ON(ffg.file_id = f.id)
LEFT OUTER JOIN groups g ON(g.id = ffg.group_id)
ORDER BY f.created DESC, f.id DESC

This might be an obvious answer, but have you indexed your database? If you're new to indexing, here's a pretty good rule: put a unique index (normally the primary key) on all the columns named "id", such as folders.id or projects.id, then put a standard index on all the columns that reference a foreign id, such as files.belongs_to_folder_id or files.record_type_id.
Another thing I would change is to try and select only the columns you will actually use rather than your huge list of f.*, ts.*, fo.*, ft.*, p.*, u.*, c.*, co.*, ct.*, fs.*, etc...
You also have TONS of joins, which are very expensive in terms of processing time. Do you really need all those joined tables?
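To see whether an index on a foreign-id column is actually picked up, you can compare the query plan before and after creating it. The following is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL (where you would run EXPLAIN on the real query instead), and the cut-down files table and the index name idx_files_folder are made up for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE folders (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE files (
        id INTEGER PRIMARY KEY,
        belongs_to_folder_id INTEGER,
        record_title TEXT
    );
""")

def plan(sql):
    """Return the human-readable query plan for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each plan row holds the description text.
    return " | ".join(row[-1] for row in rows)

query = "SELECT id FROM files WHERE belongs_to_folder_id = 5"

# Without an index the lookup has to scan the whole table.
plan_before = plan(query)

# Hypothetical index name; the column is the foreign id used in the join.
conn.execute("CREATE INDEX idx_files_folder ON files (belongs_to_folder_id)")
plan_after = plan(query)

print("before:", plan_before)
print("after: ", plan_after)
```

If the second plan does not mention the index, the index is not helping that lookup, and adding more of them will not make the query faster.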

Related

Left Outer Join Subquery

I am trying to do a left outer join to a subquery, is that possible?
Can I do something like this?:
##this is this weeks targets
select * from targets t
inner join streams s on s.id = t.stream_id
where t.week_no =WEEKOFYEAR(NOW())
left outer join
(
###############This is records selected so far this week
select p.brand_id, p.part_product_family, sum(r.best) from records r
inner join products p on p.id = r.product_id
left outer join streams s on s.body = p.brand_id and s.stream = p.part_product_family
where WEEKOFYEAR(r.date_selected) =WEEKOFYEAR(NOW())
group by p.brand_id, p.part_product_family;
) sq_2
on s.stream = sq_2.part_product_family
This is working:
##this is this weeks targets
select * from targets t
inner join streams s on s.id = t.stream_id
left outer join
(
###############This is records selected so far this week
select p.brand_id, p.part_product_family, sum(r.best) from records r
inner join products p on p.id = r.product_id
left outer join streams s on s.body = p.brand_id and s.stream = p.part_product_family
where WEEKOFYEAR(r.date_selected) =WEEKOFYEAR(NOW()) and YEAR(r.date_selected) = YEAR(now())
group by p.brand_id, p.part_product_family
) sq_2
on s.body = sq_2.brand_id and s.stream = sq_2.part_product_family
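For anyone who wants to try the shape of the working query without the real schema, here is a small sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The sample rows are invented, and a plain week_no column with a fixed week 7 replaces WEEKOFYEAR(NOW()), which SQLite does not have. Note that the WHERE clause on targets comes after all of the joins, which is what the first attempt got wrong.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE streams (id INTEGER PRIMARY KEY, body TEXT, stream TEXT);
    CREATE TABLE targets (id INTEGER PRIMARY KEY, stream_id INTEGER,
                          week_no INTEGER, target INTEGER);
    CREATE TABLE products (id INTEGER PRIMARY KEY, brand_id TEXT,
                           part_product_family TEXT);
    CREATE TABLE records (id INTEGER PRIMARY KEY, product_id INTEGER,
                          week_no INTEGER, best INTEGER);

    -- Invented sample data: two streams, only one of which has records.
    INSERT INTO streams VALUES (1, 'acme', 'widgets'), (2, 'acme', 'gadgets');
    INSERT INTO targets VALUES (1, 1, 7, 100), (2, 2, 7, 50);
    INSERT INTO products VALUES (1, 'acme', 'widgets');
    INSERT INTO records VALUES (1, 1, 7, 30), (2, 1, 7, 12), (3, 1, 6, 99);
""")

rows = conn.execute("""
    SELECT s.stream, t.target, sq_2.total_best
    FROM targets t
    INNER JOIN streams s ON s.id = t.stream_id
    LEFT OUTER JOIN (
        -- Records selected so far this week, one row per brand and family.
        SELECT p.brand_id, p.part_product_family, SUM(r.best) AS total_best
        FROM records r
        INNER JOIN products p ON p.id = r.product_id
        WHERE r.week_no = 7
        GROUP BY p.brand_id, p.part_product_family
    ) sq_2
        ON s.body = sq_2.brand_id AND s.stream = sq_2.part_product_family
    WHERE t.week_no = 7          -- filter on the outer query goes last
    ORDER BY s.stream
""").fetchall()

for row in rows:
    print(row)
```

The stream with no records this week still shows up, with NULL (None in Python) for the total, which is the point of using a left outer join to the subquery.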

Count matched words from IN operator

I have this little MySQL query:
select t.title FROM title t
inner join movie_keyword mk on mk.movie_id = t.id
inner join keyword k on k.id = mk.keyword_id
where k.keyword IN (
select k.keyword
FROM title t
inner join movie_keyword mk on mk.movie_id = t.id
inner join keyword k on k.id = mk.keyword_id
where t.id = 166282
)
LIMIT 15
As you can see, it returns all titles from title that share at least one keyword with the movie with id 166282.
Now I have a problem: I also want to count how many keywords were matched by the IN operator (let's say I only want to see titles that share 3 or more keywords). I tried something with aggregate functions, but everything failed, so I came here with my problem. Maybe somebody can give me some advice or a code example.
I'm also not sure whether this "subquery way" is good, so if there are better options for solving my problem, I am open to any suggestions or tips.
Thank you!
Edit:
So after some problems, I have one more. This is my current query:
SELECT s.title,s.vote,s.rating,count(dk.key) as keywordCnt, count(dg.name) as genreCnt
FROM series s
INNER JOIN series_has_genre shg ON shg.series_id = s.id
INNER JOIN dict_genre dg ON dg.id = shg.dict_genre_id
INNER JOIN series_has_keyword shk ON shk.series_id = s.id
INNER JOIN dict_keyword dk ON dk.id = shk.dict_keyword_id
WHERE dk.key IN (
SELECT dki.key FROM series si
INNER JOIN series_has_keyword shki ON shki.series_id = si.id
INNER JOIN dict_keyword dki ON dki.id = shki.dict_keyword_id
WHERE si.title LIKE 'The Wire'
)
and dg.name IN (
SELECT dgo.name FROM series so
INNER JOIN series_has_genre shgo ON shgo.series_id = so.id
INNER JOIN dict_genre dgo ON dgo.id = shgo.dict_genre_id
WHERE so.title LIKE 'The Wire'
)
and s.production_year > 2000
GROUP BY s.title
ORDER BY s.vote DESC, keywordCnt DESC ,s.rating DESC, genreCnt DESC
LIMIT 5
The problem is that it is very, very slow. Any tips on what I should change to make it run faster?
Will this work for you:
select t.title, count(k.keyword) as keywordCount FROM title t
inner join movie_keyword mk on mk.movie_id = t.id
inner join keyword k on k.id = mk.keyword_id
where k.keyword IN (
select ki.keyword
FROM title ti
inner join movie_keyword mki on mki.movie_id = ti.id
inner join keyword ki on ki.id = mki.keyword_id
where ti.id = 166282
) group by t.title
LIMIT 15
Note that I have changed the table aliases inside the nested query to avoid confusion.
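The question also asks for only those titles that share 3 or more keywords. One way to get that from the grouped query is a HAVING clause on the count. The sketch below is an assumption-laden illustration, not a tested MySQL answer: it runs on Python's built-in sqlite3 module, the sample movies are invented, and COUNT(DISTINCT ...) plus grouping by t.id are choices of this sketch to guard against duplicate link rows and duplicate titles.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE title (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE keyword (id INTEGER PRIMARY KEY, keyword TEXT);
    CREATE TABLE movie_keyword (movie_id INTEGER, keyword_id INTEGER);

    -- Invented sample data; 166282 is the reference movie from the question.
    INSERT INTO title VALUES (166282, 'Reference'), (2, 'Close Match'),
                             (3, 'Weak Match'), (4, 'Exact Match');
    INSERT INTO keyword VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
    INSERT INTO movie_keyword VALUES
        (166282, 1), (166282, 2), (166282, 3), (166282, 4),
        (2, 1), (2, 2), (2, 3),
        (3, 1),
        (4, 1), (4, 2), (4, 3), (4, 4);
""")

MIN_SHARED = 3  # keep only titles sharing at least this many keywords

rows = conn.execute("""
    SELECT t.title, COUNT(DISTINCT k.keyword) AS keywordCount
    FROM title t
    INNER JOIN movie_keyword mk ON mk.movie_id = t.id
    INNER JOIN keyword k ON k.id = mk.keyword_id
    WHERE k.keyword IN (
        SELECT ki.keyword
        FROM title ti
        INNER JOIN movie_keyword mki ON mki.movie_id = ti.id
        INNER JOIN keyword ki ON ki.id = mki.keyword_id
        WHERE ti.id = ?
    )
    GROUP BY t.id, t.title
    HAVING COUNT(DISTINCT k.keyword) >= ?
    ORDER BY keywordCount DESC, t.title
    LIMIT 15
""", (166282, MIN_SHARED)).fetchall()

for row in rows:
    print(row)
```

As in the original query, the reference movie itself is part of the result, since it trivially shares all of its own keywords. Add a condition on t.id if you want to leave it out.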

Rows missing from mysql pivot query results

I have a MySQL query, shown below. It returns exactly the results I want for one row, but doesn't return any other rows, where I expect 8 in my test data (there are 8 unique test ids). I was inspired by this answer but obviously messed up my implementation. Does anyone see where I'm going wrong?
SELECT
c.first_name,
c.last_name,
n.test_name,
e.doc_name,
e.email,
e.lab_id,
a.test_id,
a.date_req,
a.date_approved,
a.accepts_terms,
a.res_value,
a.reason,
a.test_type,
a.date_collected,
a.date_received,
k.kind_name,
sum(case when metabolite_name = "Creatinine" then t.res_val end) as Creatinine,
sum(case when metabolite_name = "Glucose" then t.res_val end) as Glucose,
sum(case when metabolite_name = "pH" then t.res_val end) as pH
FROM test_requisitions AS a
INNER JOIN personal_info AS c ON (a.user_id = c.user_id)
INNER JOIN test_types AS d ON (a.test_type = d.test_type)
INNER JOIN kinds AS k ON (k.id = d.kind_id)
INNER JOIN test_names AS n ON (d.name_id = n.id)
INNER JOIN docs AS e ON (a.doc_id = e.id)
INNER JOIN test_metabolites AS t ON (t.test_id = a.test_id)
RIGHT JOIN metabolites AS m ON (m.id = t.metabolite_id)
GROUP BY a.test_id
ORDER BY (a.date_approved IS NOT NULL),(a.res_value IS NOT NULL), a.date_req, c.last_name ASC;
Most of your joins are inner joins. The last is a right outer join. As written, the query keeps all the metabolites, but not necessarily all the tests.
I would suggest that you change them all to left outer joins, because you want to keep all the rows in the first table:
FROM test_requisitions AS a
LEFT JOIN personal_info AS c ON (a.user_id = c.user_id)
LEFT JOIN test_types AS d ON (a.test_type = d.test_type)
LEFT JOIN kinds AS k ON (k.id = d.kind_id)
LEFT JOIN test_names AS n ON (d.name_id = n.id)
LEFT JOIN docs AS e ON (a.doc_id = e.id)
LEFT JOIN test_metabolites AS t ON (t.test_id = a.test_id)
LEFT JOIN metabolites AS m ON (m.id = t.metabolite_id)
I would also suggest that your aliases be related to the table, so tr for test_requisition, pi for personal_info, and so on.
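To illustrate why the left joins matter for the pivot, here is a stripped-down sketch with only three of the tables. It runs on Python's built-in sqlite3 module as a stand-in for MySQL, and the sample rows are invented; test 3 deliberately has no metabolite results at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test_requisitions (test_id INTEGER PRIMARY KEY);
    CREATE TABLE metabolites (id INTEGER PRIMARY KEY, metabolite_name TEXT);
    CREATE TABLE test_metabolites (test_id INTEGER, metabolite_id INTEGER,
                                   res_val REAL);

    -- Invented sample data.
    INSERT INTO test_requisitions VALUES (1), (2), (3);
    INSERT INTO metabolites VALUES (1, 'Creatinine'), (2, 'Glucose'), (3, 'pH');
    INSERT INTO test_metabolites VALUES (1, 1, 1.5), (1, 2, 5.5), (2, 3, 7.0);
""")

rows = conn.execute("""
    SELECT
        a.test_id,
        SUM(CASE WHEN m.metabolite_name = 'Creatinine' THEN t.res_val END) AS Creatinine,
        SUM(CASE WHEN m.metabolite_name = 'Glucose' THEN t.res_val END) AS Glucose,
        SUM(CASE WHEN m.metabolite_name = 'pH' THEN t.res_val END) AS pH
    FROM test_requisitions AS a
    -- LEFT JOINs keep every requisition, even one with no results yet.
    LEFT JOIN test_metabolites AS t ON (t.test_id = a.test_id)
    LEFT JOIN metabolites AS m ON (m.id = t.metabolite_id)
    GROUP BY a.test_id
    ORDER BY a.test_id
""").fetchall()

for row in rows:
    print(row)
```

With inner joins in place of the left joins, test 3 would drop out of the result entirely; here it comes back as a row of NULLs.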

Mysql Group concat and then queries based on results

Hi all, I have a table business which has a lot of many-to-many relationships. It was suggested that I perform a GROUP_CONCAT first to get the ids from the "many" tables and then use these ids to get the values from those tables.
In the instance below, I can see that I get a list of announcement ids back via GROUP_CONCAT(DISTINCT ba.announcement_id) as 'announcement'. How, from here, do I set
SELECT * from announcement
where id IN(_______)
where the IN list represents what was returned from the GROUP_CONCAT?
BEGIN
/* Business Information and Categories */
SELECT
b.alias_title, b.title, b.premisis_name,
a.address_line_1, a.address_line_2, a.postal_code,tvc.town_village_city,spc.state_province_county, c.country,
GROUP_CONCAT(DISTINCT be.event_id) as 'event',
GROUP_CONCAT(DISTINCT ba.announcement_id) as 'announcement',
GROUP_CONCAT(DISTINCT bd.document_id) as 'document',
GROUP_CONCAT(DISTINCT bi.image_id) as 'image',
GROUP_CONCAT(DISTINCT bprod.product_id ) as 'product',
GROUP_CONCAT(DISTINCT bt.tag_title_id) as 'tag'
FROM business AS b
INNER JOIN business_category bc_1 ON b.primary_category = bc_1.id
INNER JOIN business_category bc_2 ON b.secondary_category = bc_2.id
LEFT OUTER JOIN business_category bc_3 ON b.tertiary_category = bc_3.id
INNER JOIN address a ON b.address_id = a.id
LEFT OUTER JOIN town_village_city tvc ON a.town_village_city_id = tvc.id
LEFT OUTER JOIN state_province_county spc ON a.state_province_county_id = spc.id
INNER JOIN country c ON a.country_id = c.id
LEFT OUTER JOIN geolocation g ON b.geolocation_id = g.id
LEFT OUTER JOIN business_event be ON b.id = be.event_id
LEFT OUTER JOIN business_announcement ba ON b.id = ba.announcement_id
LEFT OUTER JOIN business_document bd ON b.id = bd.business_id
LEFT OUTER JOIN business_image bi ON b.id = bi.business_id
LEFT JOIN business_property bp ON b.id= bp.business_id
LEFT JOIN business_product bprod ON b.id= bprod.business_id
LEFT JOIN business_tag bt ON b.id = bt.business_id
WHERE b.id= in_business_id;
SELECT * from announcement
where
END
In your first select statement you can assign the announcement IDs to a user variable and then use it to get all the announcements in the second query:
set @announcementIds = '';
select ...........,
@announcementIds := GROUP_CONCAT(DISTINCT announcement_id) as 'announcement',
...........;
-- Anchor the pattern so that an id such as 12 does not match the list '1,2'
Select * from announcement
where announcement_id REGEXP CONCAT('^(', REPLACE(@announcementIds, ',', '|'), ')$');
Some links:
Replace function
Regexp
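A different technique from the variable approach is to skip the concatenated string for the second query and let the link table drive the lookup through an IN subquery. The sketch below is only an illustration under assumptions: it runs on Python's built-in sqlite3 module rather than MySQL, the sample rows are invented, and it assumes business_announcement has both a business_id and an announcement_id column, which the question does not confirm.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE announcement (id INTEGER PRIMARY KEY, title TEXT);
    -- Assumed link table layout: one row per (business, announcement) pair.
    CREATE TABLE business_announcement (business_id INTEGER,
                                        announcement_id INTEGER);

    -- Invented sample data; announcement 12 belongs to another business.
    INSERT INTO announcement VALUES (1, 'Opening'), (2, 'Sale'), (12, 'Other');
    INSERT INTO business_announcement VALUES (7, 1), (7, 2), (8, 12);
""")

in_business_id = 7  # stands in for the stored procedure parameter

# The id list as GROUP_CONCAT would report it in the first result set.
id_list = conn.execute("""
    SELECT GROUP_CONCAT(DISTINCT ba.announcement_id)
    FROM business_announcement ba
    WHERE ba.business_id = ?
""", (in_business_id,)).fetchone()[0]

# The second query matches ids exactly through a subquery.
announcements = conn.execute("""
    SELECT a.id, a.title
    FROM announcement a
    WHERE a.id IN (
        SELECT ba.announcement_id
        FROM business_announcement ba
        WHERE ba.business_id = ?
    )
    ORDER BY a.id
""", (in_business_id,)).fetchall()

print(sorted(id_list.split(",")))
print(announcements)
```

Because the comparison is on integer ids rather than on a string, announcement 12 cannot be picked up by accident when the list contains 1 and 2.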

need advice on speeding up query

This query takes about 9 seconds and returns 2 records.
SELECT
s.description, s.improvement,
s.first_name, s.last_name, s.finding, s.action, s.share,
s.learned, s.timestamp, d.title as department_title,
group_concat(DISTINCT g.title SEPARATOR " | ") as strategic_goals,
group_concat(DISTINCT m.statement SEPARATOR " | ") as mission_references,
group_concat(DISTINCT meas.statement SEPARATOR " | ") as measure_statement,
group_concat(DISTINCT o.statement SEPARATOR " | ") as outcome_statement,
group_concat(DISTINCT i.title SEPARATOR " | ") as ilo_title,
group_concat(DISTINCT cv.title SEPARATOR " | ") as core_value_title,
y1.year as current_year_title, y2.year as previous_year_title,
u.file_name as file_name
FROM summary s
LEFT JOIN year y1 ON s.current_year_id = y1.id
INNER JOIN year y2 ON s.previous_year_id = y2.id
INNER JOIN strategic_goal_entries sge ON s.id = sge.summary_id
INNER JOIN goal g ON sge.goal_id = g.id
INNER JOIN outcome o ON s.id = o.summary_id
LEFT JOIN measure meas ON o.id = meas.outcome_id
INNER JOIN department d ON s.department_id = d.id
LEFT JOIN uploads u ON s.id = u.summary_id
INNER JOIN mission_entries me ON s.id = me.summary_id
LEFT JOIN mission m ON me.mission_id = m.id
LEFT JOIN ilo_entries ie ON s.id = ie.summary_id
LEFT JOIN ilo i ON ie.ilo_id = i.id
INNER JOIN core_value_entries cve ON s.id = cve.summary_id
INNER JOIN core_value cv ON cve.core_value_id = cv.id
INNER JOIN executive_of_department eod ON s.department_id = eod.department_id
WHERE eod.executive_id = 3
GROUP BY s.id
I added the rest of the primary keys. Query went from 9 seconds to 2 seconds.
Then I indexed the fields that refer to other tables' primary keys. The query went from 2 seconds to 20 seconds. Did I assign too many indexes?
What are some ways I could speed this up?
1. After setting indexes, etc. you can speed up a little by limiting the JOIN conditions. For example, instead of
LEFT JOIN uploads u ON s.id = u.summary_id
do
LEFT JOIN uploads u ON (s.id = u.summary_id AND s.id = 3)
Logically, the WHERE clause is applied after all the tables have been joined, so make your JOIN conditions as specific as you can.
2. Did you try constructing a view from your select and performing a SELECT * FROM myView WHERE myView.id = 3?
UPDATE : read this blog comment.