I have a performance issue with the query below on MySQL. The query involves 5 tables. When I apply the ORDER BY and LIMIT, the results are retrieved in 0.3 secs, but without the ORDER BY and LIMIT I get the results in 0.01 secs. I have tried changing the query, but that did not work. Could someone please help me with this query so I can get the results in the desired time (<0.3 secs)?
Below are the details.
m_todos = 286579 (records)
m_pat = 214858 (records)
users = 119 (records)
m_programs = 26 (records)
role = 4 (records)
SELECT *
FROM (
SELECT t.*,
mp.name as A_name,
u.first_name, u.last_name,
p.first, p.last, p.zone, p.language,p.handling,
r.name,
u2.first_name AS created_first_name,
u2.last_name AS created_last_name
FROM m_todos t
INNER JOIN role r ON t.role_id=r.id
INNER JOIN m_pat p ON t.patient_id = p.id
LEFT JOIN users u2 ON t.created_id=u2.id
LEFT JOIN m_programs mp ON t.prog_id=mp.id
LEFT JOIN users u ON t.user_id=u.id
WHERE t.role_id !='9'
AND t.completed = '0000-00-00 00:00:00'
) C
ORDER BY priority DESC, due ASC
LIMIT 0,10
Get rid of the outer SELECT; move the ORDER BY and LIMIT in.
Indexes:
t: (completed)
t: (priority, due)
I assume priority and due are in t? Please qualify the columns explicitly in the query. It could make a huge difference.
If the following works, it should speed things up a lot: Start by finding the t.id without all the JOINs:
SELECT id
FROM m_todos
WHERE role_id !='9'
AND completed = '0000-00-00 00:00:00'
ORDER BY priority DESC, due ASC
LIMIT 10
That will benefit from this covering composite index:
INDEX(completed, role_id, priority, due, id)
Debug that. Then use it in the rest:
SELECT t.*, the-other-stuff
FROM ( that-query ) AS t1
JOIN m_todos AS t USING(id)
then-the-rest-of-the-JOINs
ORDER BY priority DESC, due ASC -- yes, again
If you don't need all of t.*, it may be beneficial to spell out the actual columns needed.
The reason for this to run much faster is that the 10 rows are found efficiently by looking only at the one table. The original code was shoveling around a lot more rows than 10 and they included all the columns of t, plus columns from the other tables.
My version does only 10 lookups for all the extra stuff.
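Spelled out against the tables in the question, the two-step approach might look like the sketch below. It assumes that priority and due are columns of m_todos (as guessed above) and that the question's ordering (priority DESC, due ASC) is the one wanted. The index name idx_todos_open and the alias role_name are made up for illustration.

```sql
-- Hypothetical index name; the columns are those of the suggested covering index.
ALTER TABLE m_todos
  ADD INDEX idx_todos_open (completed, role_id, priority, due, id);

SELECT t.*,
       mp.name AS A_name,
       u.first_name, u.last_name,
       p.first, p.last, p.zone, p.language, p.handling,
       r.name AS role_name,              -- alias is an assumption
       u2.first_name AS created_first_name,
       u2.last_name  AS created_last_name
FROM (
    -- Step 1: find the 10 ids by looking only at m_todos.
    SELECT id
    FROM m_todos
    WHERE role_id != '9'
      AND completed = '0000-00-00 00:00:00'
    ORDER BY priority DESC, due ASC
    LIMIT 10
) AS t1
JOIN m_todos AS t USING (id)
-- Step 2: do the remaining lookups for those 10 rows only.
INNER JOIN role r       ON t.role_id = r.id
INNER JOIN m_pat p      ON t.patient_id = p.id
LEFT JOIN users u2      ON t.created_id = u2.id
LEFT JOIN m_programs mp ON t.prog_id = mp.id
LEFT JOIN users u       ON t.user_id = u.id
ORDER BY t.priority DESC, t.due ASC;     -- the derived table's order is not guaranteed to survive the joins
```

Because role and m_pat are INNER JOINed after the 10 ids are chosen, fewer than 10 rows can come back if one of the chosen todos has no matching role or patient.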
Related
I have a MySQL query that joins four tables. I thought it was best to just join the tables, but now that the data is getting bigger, the query seems to cause the application to stop executing.
SELECT
`purchase_order`.`id`,
`purchase_order`.`po_date` AS po_date,
`purchase_order`.`po_number`,
`purchase_order`.`customer_id` AS customer_id ,
`customer`.`name` AS customer_name,
`purchase_order`.`status` AS po_status,
`purchase_order_items`.`product_id`,
`purchase_order_items`.`po_item_name`,
`product`.`weight` as product_weight,
`product`.`pending` as product_pending,
`product`.`company_owner` as company_owner,
`purchase_order_items`.`uom`,
`purchase_order_items`.`po_item_type`,
`purchase_order_items`.`order_sequence`,
`purchase_order_items`.`pending_balance`,
`purchase_order_items`.`quantity`,
`purchase_order_items`.`notes`,
`purchase_order_items`.`status` AS po_item_status,
`purchase_order_items`.`id` AS po_item_id
FROM `purchase_order`
INNER JOIN customer ON `customer`.`id` = `purchase_order`.`customer_id`
INNER JOIN purchase_order_items ON `purchase_order_items`.`po_id` = `purchase_order`.`id`
INNER JOIN product ON `purchase_order_items`.`product_id` = `product`.`id`
GROUP BY id ORDER BY `purchase_order`.`po_date` DESC LIMIT 0, 20
My problem really is that the query takes a lot of time to finish. Is there a way to speed this query up, or to change it for faster retrieval of the data?
Here's the EXPLAIN EXTENDED output, as requested in the comments.
Thanks in advance, I really hope this is the right channel for me to ask. If not please let me know.
Will this give you the correct list of ids?
SELECT id
FROM purchase_order
ORDER BY `po_date` DESC
LIMIT 0, 20
If so, then start with that before launching into the JOIN. You can also (I think) get rid of the GROUP BY that is causing an "explode-implode" of rows.
SELECT ...
FROM ( SELECT id ... (as above) ...) AS ids
JOIN purchase_order po ON po.id = ids.id
JOIN ... (the other tables)
GROUP BY ... -- (this may be problematic, especially with the LIMIT)
ORDER BY po.po_date DESC -- yes, this needs repeating
-- no LIMIT
Something like this:
SELECT
`purchase_order`.`id`,
`purchase_order`.`po_date` AS po_date,
`purchase_order`.`po_number`,
`purchase_order`.`customer_id` AS customer_id ,
`customer`.`name` AS customer_name,
`purchase_order`.`status` AS po_status,
`purchase_order_items`.`product_id`,
`purchase_order_items`.`po_item_name`,
`product`.`weight` as product_weight,
`product`.`pending` as product_pending,
`product`.`company_owner` as company_owner,
`purchase_order_items`.`uom`,
`purchase_order_items`.`po_item_type`,
`purchase_order_items`.`order_sequence`,
`purchase_order_items`.`pending_balance`,
`purchase_order_items`.`quantity`,
`purchase_order_items`.`notes`,
`purchase_order_items`.`status` AS po_item_status,
`purchase_order_items`.`id` AS po_item_id
FROM (SELECT id, po_date, po_number, customer_id, status
FROM purchase_order
ORDER BY `po_date` DESC
LIMIT 0, 20) as purchase_order
INNER JOIN customer ON `customer`.`id` = `purchase_order`.`customer_id`
INNER JOIN purchase_order_items
ON `purchase_order_items`.`po_id` = `purchase_order`.`id`
INNER JOIN product ON `purchase_order_items`.`product_id` = `product`.`id`
GROUP BY purchase_order.id
ORDER BY purchase_order.po_date DESC
LIMIT 0, 20
You need to be sure that purchase_order.po_date and all the id columns are indexed. You can check this with the query below.
SHOW INDEX FROM yourtable;
Since you mentioned that the data is getting bigger, I would suggest sharding, so that you can run multiple queries in parallel. Please refer to the following article:
Parallel Query for MySQL with Shard-Query
First, I cleaned up readability a bit. You don't need tick marks around every table.column reference. Also, for short-hand, using aliases works well. Ex: "po" instead of "purchase_order", "poi" instead of "purchase_order_items". The only time I would use tick marks is around reserved words that might cause a problem.
Second, you don't have any aggregations (sum, min, max, count, avg, etc.) in your query so you should be able to strip the GROUP BY clause.
As for indexes, I would have to assume you have an index on your reference tables on their respective "id" key columns.
For your purchase_order table, I would have an index with "po_date" in the first field position (in case you already have an index that uses it in a later position). Since your ORDER BY is on that column, the engine can jump directly to the most recent records and your descending order is resolved.
SELECT
po.id,
po.po_date,
po.po_number,
po.customer_id,
c.`name` AS customer_name,
po.`status` AS po_status,
poi.product_id,
poi.po_item_name,
p.weight as product_weight,
p.pending as product_pending,
p.company_owner,
poi.uom,
poi.po_item_type,
poi.order_sequence,
poi.pending_balance,
poi.quantity,
poi.notes,
poi.`status` AS po_item_status,
poi.id AS po_item_id
FROM
purchase_order po
INNER JOIN customer c
ON po.customer_id = c.id
INNER JOIN purchase_order_items poi
ON po.id = poi.po_id
INNER JOIN product p
ON poi.product_id = p.id
ORDER BY
po.po_date DESC
LIMIT
0, 20
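The index suggestions above could be applied roughly as follows. This is a sketch: the index names are made up, and it assumes that no equivalent indexes exist yet, so compare against the SHOW INDEX output first.

```sql
-- See what is already there before adding anything.
SHOW INDEX FROM purchase_order;
SHOW INDEX FROM purchase_order_items;

-- Hypothetical index names.
ALTER TABLE purchase_order
  ADD INDEX idx_po_date (po_date);        -- serves ORDER BY po_date DESC
ALTER TABLE purchase_order_items
  ADD INDEX idx_poi_po_id (po_id);        -- serves the join from orders to items

-- Check whether the ordering now comes from the index.
EXPLAIN
SELECT po.id, po.po_date
FROM purchase_order po
ORDER BY po.po_date DESC
LIMIT 0, 20;
```

One thing to keep in mind: with the GROUP BY removed, LIMIT 0, 20 counts joined item rows rather than purchase orders, so an order with many items can fill the page by itself.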
I'm struggling to make a query efficient enough. I'm using Doctrine2 ORM (the query is built with QueryBuilder) and part of my query is running very slowly: it takes about 4s on a table of 5000 rows.
This is the relevant part of db schema:
TABLE user
id (primary)
... (plenty of columns, not relevant to the query)
TABLE slot
id (primary)
user_id (foreign for user)
date (datetime)
And this is what my query looks like (it's the basic version; there are a lot of filters to be applied, but these work fine for now):
SELECT
u.id AS uid,
COUNT(DISTINCT s_order.id) AS sclr_1,
COUNT(DISTINCT s_filter.id) AS sclr_2
FROM
user u
LEFT JOIN slot s_order ON (s_order.user_id = u.id)
LEFT JOIN slot s_filter ON (s_filter.user_id = u.id)
WHERE
(
(
(
s_order.date BETWEEN ?
AND ?
)
AND (
s_filter.date BETWEEN ?
AND ?
)
)
AND (u.deleted_at IS NULL)
)
AND u.userType IN ('2')
GROUP BY
u.id
HAVING
sclr_2 > 0
ORDER BY
sclr_1 DESC
LIMIT
12
Let me explain what I'm trying to achieve here:
I need to filter users who have any slots between 1 week ago and 1 week ahead, then order them by the count of slots available between now and 1 week ahead. The part of the query causing issues is the LEFT JOIN of s_filter, and I'm wondering whether there's a way to improve the performance of that query.
Any help appreciated really, even if it's only plain SQL I'll try to convert it to DQL myself!
#UPDATE
One additional detail that I forgot: the LIMIT in the query is for pagination purposes!
#UPDATE 2
After a while of tweaking the query, I figured out that I can use JOIN for filtering instead of LEFT JOIN + COUNT, so my query now looks like this:
SELECT
u.id AS uid, COUNT(DISTINCT s_order.id) AS ordinal
FROM
langu_user u
LEFT JOIN
slot s_order ON (s_order.user_id = u.id) AND s_order.date BETWEEN '2017-02-03 14:03:22' AND '2017-02-10 14:03:22'
JOIN
slot s_filter ON (s_filter.user_id = u.id) AND s_filter.date BETWEEN '2017-01-27 14:03:22' AND '2017-02-10 14:03:22'
WHERE
u.deleted_at IS NULL
AND u.userType IN ('2')
GROUP BY u.id
ORDER BY ordinal DESC
LIMIT 12
And it went down from 4.1-4.3s to about 3.6s.
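One direction that may be worth trying: joining slot twice pairs every s_order row with every s_filter row of the same user before GROUP BY collapses them again. A sketch that avoids that multiplication uses EXISTS for the filter and a correlated subquery for the count. It assumes slot.id is unique (it is the primary key in the schema above), and the index name is made up.

```sql
-- Hypothetical index so both subqueries can seek on user_id and range-scan on date.
ALTER TABLE slot ADD INDEX idx_slot_user_date (user_id, `date`);

SELECT u.id AS uid,
       (SELECT COUNT(*)
          FROM slot s_order
         WHERE s_order.user_id = u.id
           AND s_order.`date` BETWEEN '2017-02-03 14:03:22' AND '2017-02-10 14:03:22'
       ) AS ordinal                       -- slots from now to one week ahead
FROM langu_user u
WHERE u.deleted_at IS NULL
  AND u.userType IN ('2')
  AND EXISTS (                            -- keep users with any slot in the wider window
        SELECT 1
          FROM slot s_filter
         WHERE s_filter.user_id = u.id
           AND s_filter.`date` BETWEEN '2017-01-27 14:03:22' AND '2017-02-10 14:03:22')
ORDER BY ordinal DESC
LIMIT 12;
```

Since each slot row is counted once per user, COUNT(*) here should give the same number as COUNT(DISTINCT s_order.id) in the joined version; whether it is actually faster depends on the data, so compare the EXPLAIN output of both.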
I have the following table
As you can see, it has 1868155 rows. I am attempting to make a real-time graph, but it is impossible since almost any query takes 1 or 2 seconds.
For example, this query
SELECT sensor.nombre, temperatura.temperatura
FROM sensor, temperatura
WHERE sensor.id = temperatura.idsensor
ORDER BY temperatura.fecha DESC, idsensor ASC
LIMIT 4
Is supposed to show this
I've tried everything: using indexes (perhaps not correctly), selecting only the fields I need instead of *, etc., but the results are the same!
These are the indexes of the table.
Explain of the query
EDITED
This is the explain of the query after implementing
ALTER TABLE temperatura
ADD INDEX `sensor_temp` (`idsensor`,`fecha`,`temperatura`)
And using inner join syntax for the query
SELECT s.nombre, t.temperatura
FROM sensor s
INNER JOIN temperatura t
ON s.id = t.idsensor
ORDER BY t.fecha DESC, t.idsensor ASC
LIMIT 4
This is my whole sensor table
Try the following:
ALTER TABLE temperatura
ADD INDEX `sensor_temp` (`idsensor`,`fecha`,`temperatura`)
I also recommend using modern join syntax:
SELECT s.nombre, t.temperatura
FROM sensor s
INNER JOIN temperatura t
ON s.id = t.idsensor
ORDER BY t.fecha DESC, t.idsensor ASC
LIMIT 4
Report the EXPLAIN again after making the above changes, if performance is still not good enough.
Attempt #2
After looking closely at what it appears you are trying to do, I believe this next query may be more effective:
SELECT
s.nombre, t.temperatura
FROM temperatura t
LEFT OUTER JOIN temperatura later_t
ON later_t.idsensor = t.idsensor
AND later_t.fecha > t.fecha
INNER JOIN sensor s
ON s.id = t.idsensor
WHERE later_t.idsensor IS NULL
ORDER BY t.idsensor ASC
You can also try:
SELECT
s.nombre, t.temperatura
FROM temperatura t
INNER JOIN (
SELECT
t.idsensor,
MAX(t.fecha) AS fecha
FROM temperatura t
GROUP BY t.idsensor
) max_fecha
ON max_fecha.idsensor = t.idsensor
AND max_fecha.fecha = t.fecha
INNER JOIN sensor s
ON s.id = t.idsensor
ORDER BY t.idsensor ASC
In my experience, if you are trying to find the most recent record, one of the two queries above will work. Which works best depends on various factors, so try them both.
Let me know how those perform, and if they still get you the data you want. Also, any query you run, run at least 3 times, and report all 3 times. That will help get an accurate measure of how fast a given query is, since various external factors can affect the speed of a query.
Before MySQL 8.0, it is not possible to use an index to optimize a mixture of ASC and DESC, as in
ORDER BY t.fecha DESC, t.idsensor ASC
You tried a covering index:
INDEX `sensor_temp` (`idsensor`,`fecha`,`temperatura`)
However, this covering index may be better:
INDEX `sensor_temp` (`fecha`,`idsensor`,`temperatura`)
Then, if you are willing to get the sensors in a different order, use
ORDER BY t.fecha DESC, t.idsensor DESC
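On MySQL 8.0 and later, index key parts can be declared DESC, so the original mixed-direction ORDER BY can be served by an index whose directions match it. A sketch, assuming MySQL 8.0+ and a made-up index name (older versions parse DESC in an index definition but ignore it):

```sql
-- Key-part directions mirror ORDER BY t.fecha DESC, t.idsensor ASC.
ALTER TABLE temperatura
  ADD INDEX fecha_desc_sensor (fecha DESC, idsensor ASC, temperatura);

SELECT s.nombre, t.temperatura
FROM temperatura t
INNER JOIN sensor s ON s.id = t.idsensor
ORDER BY t.fecha DESC, t.idsensor ASC
LIMIT 4;
```

Whether the optimizer actually reads temperatura first and in index order is worth confirming with EXPLAIN.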
This will give you up to 4 sensors for the last fecha:
sensor: PRIMARY KEY(id)
temperatura: INDEX(fecha, idsensor, temperatura)
SELECT
( SELECT nombre FROM sensor WHERE id = t.idsensor ) AS nombre,
t.temperatura
FROM
( SELECT MAX(fecha) AS max_fecha FROM temperatura ) AS z
JOIN temperatura AS t ON t.fecha = z.max_fecha
ORDER BY t.idsensor ASC
LIMIT 4;
There are 3 tables: persontbl1 and persontbl2 (7500 rows each) and schedule (~3000 active schedules, i.e. schedule.status = 0). The person tables contain data for the same persons in a one-to-one relationship, and an INNER JOIN between the two takes less than a second. The schedule table contains data about persons to be interviewed, and not all persons have schedules in it. With the LEFT JOIN, the query takes around 45 seconds, which is causing all sorts of issues.
SELECT persontbl1._CREATION_DATE, persontbl2._TOP_LEVEL_AURI,
persontbl2.RESP_CNIC, persontbl2.RESP_CNIC_NAME,
persontbl1.MOB_NUMBER1, persontbl1.MOB_NUMBER2,
schedule.id, schedule.call_datetime, schedule.enum_id,
schedule.enum_change, schedule.status
FROM persontbl1
INNER JOIN persontbl2 ON (persontbl2._TOP_LEVEL_AURI = persontbl1._URI)
AND (AGR_CONTACT=1)
LEFT JOIN SCHEDULE ON (schedule.survey_id = persontbl1._URI)
AND (SCHEDULE.status=0)
AND (DATE(SCHEDULE.call_datetime) <= CURDATE())
ORDER BY schedule.call_datetime IS NULL DESC, persontbl1._CREATION_DATE ASC
Here is the explain for query:
Schedule Table structure:
Schedule Table indexes:
Please let me know if any further information is required.
Thanks.
Edit: Added fully qualified table names and their columns.
You should just replace this line:
AND (DATE(SCHEDULE.call_datetime) <= CURDATE())
with this one:
AND SCHEDULE.call_datetime <= '2015-04-18 00:00:00'
That way MySQL does not have to apply DATE() to every record and can compare the column directly against the constant '2015-04-18 00:00:00'.
So you can check whether performance improves with this query:
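If hard-coding the date is inconvenient, the arithmetic can be moved to the constant side of the comparison instead. A sketch: DATE(call_datetime) <= CURDATE() is true exactly when call_datetime is earlier than the start of tomorrow, and the right-hand side below is evaluated once per statement rather than once per row.

```sql
-- Before: DATE() wraps the column, so it is computed for every row.
--   AND (DATE(SCHEDULE.call_datetime) <= CURDATE())
-- After: the bare column is compared with a per-statement constant.
SELECT s.id, s.call_datetime
FROM schedule s
WHERE s.status = 0
  AND s.call_datetime < CURDATE() + INTERVAL 1 DAY;
```

The same condition can be dropped into the ON clause of the LEFT JOIN in place of the DATE() comparison.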
SELECT persontbl1._CREATION_DATE, persontbl2._TOP_LEVEL_AURI,
persontbl2.RESP_CNIC, persontbl2.RESP_CNIC_NAME,
persontbl1.MOB_NUMBER1, persontbl1.MOB_NUMBER2,
schedule.id, schedule.call_datetime, schedule.enum_id,
schedule.enum_change, schedule.status
FROM persontbl1
INNER JOIN persontbl2 ON (persontbl2._TOP_LEVEL_AURI = persontbl1._URI)
AND (AGR_CONTACT=1)
LEFT JOIN SCHEDULE ON (schedule.survey_id = persontbl1._URI)
AND (SCHEDULE.status=0)
AND (SCHEDULE.call_datetime <= '2015-04-18 00:00:00')
ORDER BY schedule.call_datetime IS NULL DESC, persontbl1._CREATION_DATE ASC
EDIT 1: You said that it was fast enough without the LEFT JOIN part, so you can then try:
SELECT persontbl1._CREATION_DATE, persontbl2._TOP_LEVEL_AURI,
persontbl2.RESP_CNIC, persontbl2.RESP_CNIC_NAME,
persontbl1.MOB_NUMBER1, persontbl1.MOB_NUMBER2,
s.id, s.call_datetime, s.enum_id,
s.enum_change, s.status
FROM persontbl1
INNER JOIN persontbl2 ON (persontbl2._TOP_LEVEL_AURI = persontbl1._URI)
AND (AGR_CONTACT=1)
LEFT JOIN
(SELECT *
FROM SCHEDULE
WHERE status=0
AND call_datetime <= '2015-04-18 00:00:00'
) s
ON s.survey_id = persontbl1._URI
ORDER BY s.call_datetime IS NULL DESC, persontbl1._CREATION_DATE ASC
I'm guessing that AGR_CONTACT comes from p1. This is the query you want to optimize:
SELECT p1._CREATION_DATE, _TOP_LEVEL_AURI, RESP_CNIC, RESP_CNIC_NAME,
MOB_NUMBER1, MOB_NUMBER2,
s.id, s.call_datetime, s.enum_id, s.enum_change, s.status
FROM persontbl1 p1 INNER JOIN
persontbl2 p2
ON (p2._TOP_LEVEL_AURI = p1._URI) AND (p1.AGR_CONTACT = 1) LEFT JOIN
SCHEDULE s
ON (s.survey_id = p1._URI) AND
(s.status = 0) AND
(DATE(s.call_datetime) <= CURDATE())
ORDER BY s.call_datetime IS NULL DESC, p1._CREATION_DATE ASC;
The best indexes for this query are: persontbl1(AGR_CONTACT, _URI), persontbl2(_TOP_LEVEL_AURI), and schedule(survey_id, status, call_datetime).
The use of date() around the date time is not recommended. In general, that precludes the use of indexes. However, in this case, you have a left join, so it doesn't make a difference. That column is not being used for filtering anyway. The index on schedule is only for covering the on clause.
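As DDL, those indexes might be created as in the sketch below. It follows the column qualification used in the query (AGR_CONTACT and _URI in persontbl1, _TOP_LEVEL_AURI in persontbl2), which is itself a guess, and the index names are made up. If _URI is already the primary key of persontbl1, the first index may add little.

```sql
-- Hypothetical index names; adjust the tables if AGR_CONTACT lives elsewhere.
ALTER TABLE persontbl1 ADD INDEX idx_p1_contact_uri (AGR_CONTACT, _URI);
ALTER TABLE persontbl2 ADD INDEX idx_p2_top_level   (_TOP_LEVEL_AURI);
ALTER TABLE schedule   ADD INDEX idx_sched_lookup   (survey_id, status, call_datetime);

-- Confirm what the tables end up with.
SHOW INDEX FROM schedule;
```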
The query below is very slow (it takes around 1 second), but it is only searching approx. 2500 records (plus the inner joined tables).
If I remove the ORDER BY, the query runs in much less time (0.05 secs or less).
Or, if I remove the nested select below the comment "# used to select where no ProfilePhoto specified", it also runs fast, but I need both of these included.
I have indexes (or a primary key) on: tPhoto_PhotoID, PhotoID, p.Enabled, CustomerID, tCustomer_CustomerID, ProfilePhoto (bool), u.UserName, e.PrivateEmail, m.tUser_UserID, Enabled, Active, m.tMemberStatuses_MemberStatusID, e.tCustomerMembership_MembershipID, e.DateCreated
(Do I have too many indexes? My understanding is that I should add them anywhere I use WHERE or ON.)
The Query :
SELECT e.CustomerID,
e.CustomerName,
e.Location,
SUBSTRING_INDEX(e.CustomerProfile,' ', 25) AS Description,
IFNULL(p.PhotoURL, PhotoTable.PhotoURL) AS PhotoURL
FROM tCustomer e
LEFT JOIN (tCustomerPhoto ep INNER JOIN tPhoto p ON (ep.tPhoto_PhotoID = p.PhotoID AND p.Enabled=1))
ON e.CustomerID = ep.tCustomer_CustomerID AND ep.ProfilePhoto = 1
# used to select where no ProfilePhoto specified
LEFT JOIN ((SELECT pp.PhotoURL, epp.tCustomer_CustomerID
FROM tPhoto pp
LEFT JOIN tCustomerPhoto epp ON epp.tPhoto_PhotoID = pp.PhotoID
GROUP BY epp.tCustomer_CustomerID) AS PhotoTable) ON e.CustomerID = PhotoTable.tCustomer_CustomerID
INNER JOIN tUser u ON u.UserName = e.PrivateEmail
INNER JOIN tmembers m ON m.tUser_UserID = u.UserID
WHERE e.Enabled=1
AND e.Active=1
AND m.tMemberStatuses_MemberStatusID = 2
AND e.tCustomerMembership_MembershipID != 6
ORDER BY e.DateCreated DESC
LIMIT 12
I have similar queries, but they run much faster.
Any opinions would be gratefully received.
Until we get more clarity on how this query differs from the similar ones that run fast, try running EXPLAIN {YourSelectQuery} in the MySQL client and use the execution plan to see where performance can be improved.
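For example, the derived table that the question singles out can be examined on its own. A sketch, using only the tables and columns from the query above:

```sql
-- Plan for the "no ProfilePhoto specified" subquery in isolation.
EXPLAIN
SELECT pp.PhotoURL, epp.tCustomer_CustomerID
FROM tPhoto pp
LEFT JOIN tCustomerPhoto epp ON epp.tPhoto_PhotoID = pp.PhotoID
GROUP BY epp.tCustomer_CustomerID;
```

In the output, the type, key, rows and Extra columns are the ones to look at first; entries such as "Using temporary" or "Using filesort" in Extra, or a large rows estimate with no key, point at the step that is doing the most work.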