Recently I wrote a PHP web app to gather a list of data and output it. Originally I thought the PHP code was running slow, but I checked how long this query takes to run and it turned out to be MySQL, not PHP.
My conclusion is that I need to add indexes to these tables, but I wanted to get feedback from others before moving forward and doing that.
Here's my query:
SELECT *
FROM claims c
LEFT JOIN claims_data d ON c.claim_number = d.claim_number
LEFT JOIN merchant_category_code m ON c.procedure_code = m.code
LEFT JOIN claim_log l ON c.claim_number = l.claim_number
WHERE c.social_security_num = :num
ORDER BY c.start_date DESC
For your query, indexes are needed on the columns referenced in each of these clauses:
ON: the column on each side of the =, in each table
WHERE: each table/column used to filter
ORDER BY: each table/column used to sort
Spend some time with MySQL index tutorials.
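As a starting point, something like the following should cover this particular query (a sketch only; the index names are made up, and the composite index assumes claims is always filtered by social_security_num and sorted by start_date):

-- one composite index serves both the WHERE and the ORDER BY
ALTER TABLE claims ADD INDEX idx_ssn_start (social_security_num, start_date);
-- join columns on the other side of each LEFT JOIN
ALTER TABLE claims_data ADD INDEX idx_claim_number (claim_number);
ALTER TABLE merchant_category_code ADD INDEX idx_code (code);
ALTER TABLE claim_log ADD INDEX idx_claim_number (claim_number);

With (social_security_num, start_date), MySQL can locate the matching claims and read them already in sorted order, so the ORDER BY costs nothing extra. Run EXPLAIN on the query afterwards to confirm the indexes are actually being used.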
Related
I have a complex SQL query with a lot of joins (it's generated by PHP for a table with different filters applied to show user data):
SELECT u.*,
n.id AS network_id,
n.name AS network_name,
n.size AS network_size,
count(distinct(un.userid)) AS network_elements_have,
count(distinct(o.id)) AS total_orders,
max(o.date) AS last_order_date,
am.firstname AS am_firstname,
am.surname AS am_surname,
bdr.firstname AS bdr_firstname,
bdr.surname AS bdr_surname,
wc.status AS wc_status,
wc.potential AS wc_potential,
wc.calls AS wc_calls
FROM ei_users u
LEFT JOIN ei_orders o ON o.user_id=u.userid
LEFT JOIN ei_users am ON u.amid = am.userid
LEFT JOIN ei_users bdr ON u.bdrid = bdr.userid
LEFT JOIN ei_networks n ON u.network = n.id
LEFT JOIN ei_users un ON n.id=un.network
AND un.archive != '1'
LEFT JOIN ei_calls wc ON wc.userid = u.userid
AND wc.type = 'welcome'
WHERE u.archive != '1'
GROUP BY u.userid
HAVING last_order_date < NOW() - INTERVAL 15 DAY
ORDER BY u.userid DESC,
o.date DESC
When I run this query on a MySQL table (ei_users) with 1755 users and 75000 orders (in the ei_orders table), it runs very slowly (10+ minutes or so). All IDs and all fields used in the query have indexes in all tables.
Question:
How can I make this work faster? (We use shared SiteGround hosting, but maybe we can enable some database caching or other modules, or you may have ideas on how to optimize some part of the query.)
Which parts of this SQL are the MOST time consuming? (As I see it, the main problem here is the date interval.) How can we run this query (for example, in phpMyAdmin or via PHP) in some kind of debug mode, so we can see how much time each part of the query consumes?
Maybe someone can provide a fixed version of my EXAMPLE query, so I can better understand what is wrong here?
EXPLAIN output: (screenshot not included here)
Try adding indexes on the repeatedly used columns; that would speed up the SELECT query.
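To answer the debug-mode part of the question: MySQL's session profiler gives a per-stage timing breakdown, and prefixing the query with EXPLAIN shows which indexes each join actually uses. A minimal sketch (the profiler also works from phpMyAdmin's SQL tab; it is deprecated in newer MySQL versions in favor of the Performance Schema, but still available on most shared hosts):

SET profiling = 1;
-- ... run the slow SELECT here ...
SHOW PROFILES;             -- recent statements with their total times
SHOW PROFILE FOR QUERY 1;  -- per-stage breakdown (sorting, tmp tables, etc.)

One structural note: HAVING last_order_date < NOW() - INTERVAL 15 DAY is applied after grouping, so no index can help with it; every user's orders have to be aggregated before any row can be discarded. Also, assuming user_id and date are only indexed separately on ei_orders, a composite index helps both the join and MAX(o.date):

ALTER TABLE ei_orders ADD INDEX idx_user_date (user_id, date);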
I have a commenting system I'm developing that I have asked another question about here. The schema is pretty much the same, except I have added a rating column in the comments table and have set up a trigger to update it when there are changes in the comments_ratings table, in order to avoid calculating the rating every time I need to fetch comments.
So when I execute my query to fetch the latest comments:
SELECT
    c.*, COALESCE(COUNT(r.id), 0) AS replies
FROM
    (
        SELECT
            ...
        FROM
            comments c
        LEFT JOIN users u ON u.id = c.author
        LEFT JOIN comments_ratings crv ON crv.COMMENT = c.id AND crv.USER = ?
        WHERE
            c.item = ? AND c.type = ?
        ORDER BY c.id DESC
        LIMIT 0, 10
    ) AS c
LEFT JOIN comments r ON c.id = r.reply
GROUP BY c.id
ORDER BY c.id DESC
I get a result back in a matter of 0.001 to 0.003 seconds, and I can confirm the cache is not helping me, because I have tried limiting by random values and the time is always in this range.
However, if I try to order by rating instead of c.id, the query takes 30+ seconds (I have a lot of test data). When I open up the profiler, I see that more than 90% of the time was spent on "Copying to tmp table". I suppose it is copying the entire table into an in-memory table and sorting it there, but I don't understand why (or if) that's happening, since I have created an index on the rating column, which should help.
The reason I broke database normalization and created the rating column instead of calculating it was to be able to index it in order to ease sorting.
I am pretty confused at this point; do you see what I'm doing wrong?
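One likely explanation (a guess, since the EXPLAIN isn't shown): a single-column index on rating can't be used for the sort once MySQL has picked an index to satisfy the filters on item and type, so it falls back to a temporary table plus filesort over every matching comment. An index that covers both the equality filters and the sort avoids that. A sketch, assuming item, type and rating are all columns of the comments table as described:

-- equality columns first, sort column last
ALTER TABLE comments ADD INDEX idx_item_type_rating (item, type, rating);

With item and type fixed by the WHERE clause, MySQL can read rows straight from this index in rating order and stop as soon as the LIMIT is satisfied, with no tmp table and no filesort.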
I'd like to be able to join a lot of tables together (9-10 or so), but even with only 7, it times out. I tried indexing to the best of my knowledge (see attached EXPLAIN). Are joins just not made to handle 7+ tables, or am I doing something wrong?
(As you can see in the EXPLAIN, there are VERY few records in the database. Even if I can get this to work, what happens when there are 500,000+ records?)
SELECT `Node`.`id`, `Node`.`name`, `Node`.`slug`, `Node`.`node_type_id`, `Node`.`site_id`, `Node`.`created`, `Node`.`modified`
FROM `mysite`.`nodes` AS `Node`
LEFT JOIN `mysite`.`data_date_times` AS `DataDateTime` ON (`DataDateTime`.`node_id` = `Node`.`id`)
LEFT JOIN `mysite`.`data_locations` AS `DataLocation` ON (`DataLocation`.`node_id` = `Node`.`id`)
LEFT JOIN `mysite`.`data_media` AS `DataMedia` ON (`DataMedia`.`node_id` = `Node`.`id`)
LEFT JOIN `mysite`.`data_metas` AS `DataMeta` ON (`DataMeta`.`node_id` = `Node`.`id`)
LEFT JOIN `mysite`.`data_profiles` AS `DataProfile` ON (`DataProfile`.`node_id` = `Node`.`id`)
LEFT JOIN `mysite`.`data_products` AS `DataProduct` ON (`DataProduct`.`node_id` = `Node`.`id`)
LEFT JOIN `mysite`.`data_texts` AS `DataText` ON (`DataText`.`node_id` = `Node`.`id`)
WHERE 1=1
GROUP BY `Node`.`id`
I'm using JOINs because I want to be able to order the main results by columns in the joined tables, and to be able to query against specific fields within each contain (potentially) - and I don't know of a good way to do that with separate queries.
Any thoughts/suggestions VERY welcome.
Update:
When run on a local copy of the database, it says that "copying to temp table" took 117 seconds; it then does complete and shows all 7 records.
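For what it's worth, the number of joins is probably less of a problem than their shape: every LEFT JOIN here is one-to-many on node_id, so the intermediate result is the product of the per-table match counts. A node with 3 rows in each of the 7 data tables expands to 3^7 = 2187 intermediate rows that the GROUP BY then has to collapse, and that is the "copying to temp table" work. Since the SELECT list only uses Node columns, one alternative is to filter with EXISTS and join only the table you actually order by. A sketch (the value column, the filter, and the created column are hypothetical, just to show the shape):

SELECT `Node`.`id`, `Node`.`name`, `Node`.`slug`,
       `Node`.`node_type_id`, `Node`.`site_id`,
       `Node`.`created`, `Node`.`modified`
FROM `mysite`.`nodes` AS `Node`
LEFT JOIN `mysite`.`data_date_times` AS `DataDateTime`
    ON `DataDateTime`.`node_id` = `Node`.`id`  -- keep only the join used for sorting
WHERE EXISTS (SELECT 1
              FROM `mysite`.`data_texts` AS `DataText`
              WHERE `DataText`.`node_id` = `Node`.`id`
                AND `DataText`.`value` = 'example')  -- hypothetical field filter
GROUP BY `Node`.`id`
ORDER BY MAX(`DataDateTime`.`created`) DESC

EXISTS never multiplies rows, so each extra filter stays cheap no matter how many data tables are involved.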
I have this query for an application that I am designing. There is a table of references, an authors table and a reference_authors table. There is a subquery to return all authors for a given reference, which I then display formatted in PHP. The subquery and the main query, run individually, are both nice and speedy. However, as soon as the subquery is put into the main query, the whole thing takes over 120s to run. I would appreciate some fresh eyes on this one.
Thanks.
SELECT
rf.reference_id,
rf.reference_type_id,
rf.article_title,
rf.publication,
rf.annotation,
rf.publication_year,
(SELECT GROUP_CONCAT(a.author_name)
FROM authors_final AS a
INNER JOIN reference_authors AS ra2 ON ra2.author_id = a.author_id
WHERE ra2.reference_id = rf.reference_id
GROUP BY ra2.reference_id) AS authors
FROM
references_final AS rf
INNER JOIN reference_authors AS ra ON rf.reference_id = ra.reference_id
LEFT JOIN reference_institutes AS ri ON rf.reference_id = ri.reference_id;
Here is the fixed query. Thanks guys for the recommendations.
SELECT
rf.reference_id,
rf.reference_type_id,
rf.article_title,
rf.publication,
rf.annotation,
rf.publication_year,
GROUP_CONCAT(a.author_name) AS authors
FROM
references_final AS rf
INNER JOIN (reference_authors AS ra INNER JOIN authors_final AS a ON ra.author_id = a.author_id)
ON rf.reference_id = ra.reference_id
LEFT JOIN reference_institutes AS ri ON rf.reference_id = ri.reference_id
GROUP BY rf.reference_id
Although not every subquery can be rewritten as an inner join, I think yours can.
From 120 seconds to 78 milliseconds is not a bad improvement: about three orders of magnitude. Take the rest of the day off.
When you come back tomorrow, start looking for other subqueries in your source code.
You say the subquery is nice and speedy in isolation, but it's now obviously running for every single row: 100 rows = 100 subqueries.
Assuming you have indexes on all your foreign keys, that's as good as it gets as a subquery.
One option is to LEFT JOIN the authors and create a Cartesian product: you'll have a lot more rows returned and will need some code to get to the same end result, but it will put less strain on the DB and will run quicker.
If you've got paging on and, say, are returning 10 rows, issuing 10 individual calls to get the authors in isolation would also be pretty quick.
OK, I can't work out what index I should have on "TBL_PHOTOS" to get this query to run quickly. Currently taking about 0.8 seconds with 50,000 rows in PH, 50,000 in PL, 300 in R1 and 100 in R2.
If I remove the ORDER BY clause then the query is speedy, taking < 0.05 seconds.
The following is in MySQL by the way:
SELECT PH.tTaken, PH.nPhotoPK, PH.sFilename
FROM TBL_PHOTOS PH
LEFT JOIN TBL_PHOTO_LINKS PL ON PH.nPhotoPK = PL.nPhotoFK
LEFT JOIN TBL_RACES1 R1 ON R1.nRacePK = PH.nRace1FK
LEFT JOIN TBL_RACES2 R2 ON R2.nRacePK = PH.nRace2FK
WHERE PL.nPhotoLinkPK IS NULL
ORDER BY PH.tAdded DESC
LIMIT 0,100
The intention is to pull back the 100 most recently uploaded photos that haven't yet been linked to anything. TBL_RACES1 & TBL_RACES2 are two separate tables for a good reason, so I can't change that. A photo will always belong to one entity from R1 or R2, never both.
Apologies if that's bad SQL for some reason, it's not my strong point. I'm not even sure what information you will need to help me out, so if I've left something vital out just ask.
I have a few indexes set on the table already, but in an EXPLAIN statement I get
possible_keys: (Null)
key: (null)
ref: (null)
Thank you!
If removing the ORDER BY makes the query run a lot faster, then the problem is in the sorting. You are likely extracting thousands of rows and then taking the top 100. Restructure the query so you extract the most recent 100 photos first, then join to the other tables. Something like this (untested):
SELECT PH.tTaken, PH.nPhotoPK, PH.sFilename
FROM
    -- take the newest 100 photos first, then join; note this can return fewer
    -- than 100 rows once linked photos are filtered out, so raise the inner
    -- LIMIT if you always need 100 back
    (SELECT * FROM TBL_PHOTOS ORDER BY tAdded DESC LIMIT 0,100) PH
LEFT JOIN TBL_PHOTO_LINKS PL ON PH.nPhotoPK = PL.nPhotoFK
LEFT JOIN TBL_RACES1 R1 ON R1.nRacePK = PH.nRace1FK
LEFT JOIN TBL_RACES2 R2 ON R2.nRacePK = PH.nRace2FK
WHERE PL.nPhotoLinkPK IS NULL
ORDER BY PH.tAdded DESC
I'm not up enough on MySQL syntax to be sure this is the exact way to express it, but it should give you the general idea.
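Beyond restructuring, the EXPLAIN output in the question (possible_keys: NULL everywhere) suggests there is no usable index on the sort or join columns at all. A sketch of the two indexes that should matter most here (index names are made up):

-- lets MySQL walk photos in tAdded order and stop early at the LIMIT
ALTER TABLE TBL_PHOTOS ADD INDEX idx_tAdded (tAdded);
-- makes the "is this photo linked?" check a cheap index lookup
ALTER TABLE TBL_PHOTO_LINKS ADD INDEX idx_nPhotoFK (nPhotoFK);

With these in place, the original query can read TBL_PHOTOS newest-first via idx_tAdded, probe TBL_PHOTO_LINKS per photo, and stop after 100 unlinked photos instead of sorting all 50,000 rows.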