MySQL query optimization - getting the last post of all threads

My MySQL query runs very slowly (over 30 secs). What tweaks can I make to optimize it?
The query should return, for every thread, the last post that contains the string "?".
SELECT FeedbackId, ParentFeedbackId, PageId, FeedbackTitle, FeedbackText, FeedbackDate
FROM ReaderFeedback AS c
LEFT JOIN (
SELECT max(FeedbackId) AS MaxFeedbackId
FROM ReaderFeedback
WHERE ParentFeedbackId IS NOT NULL
GROUP BY ParentFeedbackId
) AS d ON d.MaxFeedbackId = c.FeedbackId
WHERE ParentFeedbackId IS NOT NULL
AND FeedbackText LIKE '%?%'
GROUP BY ParentFeedbackId
ORDER BY d.MaxFeedbackId DESC LIMIT 50

Before discussing the problem, I have formatted your SQL:
SELECT feedbackid,
parentfeedbackid,
pageid,
feedbacktitle,
feedbacktext,
feedbackdate
FROM readerfeedback AS c
LEFT JOIN (SELECT Max(feedbackid) AS MaxFeedbackId
FROM readerfeedback
WHERE parentfeedbackid IS NOT NULL
GROUP BY parentfeedbackid) AS d
ON d.maxfeedbackid = c.feedbackid
WHERE parentfeedbackid IS NOT NULL
AND feedbacktext LIKE '%?%'
GROUP BY parentfeedbackid
ORDER BY d.maxfeedbackid DESC
LIMIT 50
Your SQL contains an inefficient search condition:
feedbacktext LIKE '%?%'
Because the pattern starts with a wildcard, it cannot benefit from an index and needs a full scan. I suggest adding a new field
isQuestion BOOLEAN
to your table, and then adding logic in your program to set this field whenever a feedbacktext is inserted or updated.
Your query can then filter on this field and benefit from an index.
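A minimal sketch of that idea, using Python's built-in sqlite3 module as a stand-in for MySQL so it runs anywhere. The column name is_question, the index name idx_question, the helper add_feedback and the sample posts are all assumptions for illustration; in MySQL the column would be added with ALTER TABLE and the flag set by whatever code writes the posts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE ReaderFeedback (
        FeedbackId       INTEGER PRIMARY KEY,
        ParentFeedbackId INTEGER,
        FeedbackText     TEXT,
        is_question      INTEGER NOT NULL DEFAULT 0  -- hypothetical flag column
    )
""")
# Index on the flag, so the filter no longer depends on a LIKE scan.
conn.execute(
    "CREATE INDEX idx_question "
    "ON ReaderFeedback (is_question, ParentFeedbackId, FeedbackId)"
)

def add_feedback(conn, parent_id, text):
    """Insert a post and compute the flag once, at write time."""
    conn.execute(
        "INSERT INTO ReaderFeedback (ParentFeedbackId, FeedbackText, is_question) "
        "VALUES (?, ?, ?)",
        (parent_id, text, 1 if "?" in text else 0),
    )

for parent_id, text in [(1, "Why?"), (1, "Because."), (2, "Thanks"), (2, "Really?")]:
    add_feedback(conn, parent_id, text)

# Last question post per thread, filtered on the flag instead of LIKE.
rows = conn.execute("""
    SELECT ParentFeedbackId, MAX(FeedbackId)
    FROM ReaderFeedback
    WHERE is_question = 1 AND ParentFeedbackId IS NOT NULL
    GROUP BY ParentFeedbackId
    ORDER BY 2 DESC
""").fetchall()
print(rows)
```

Existing rows would need a one-off UPDATE to backfill the flag before the query can rely on it.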

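First, your SQL is not valid standard SQL: the outer GROUP BY selects columns that are neither grouped nor aggregated, which MySQL only accepts when ONLY_FULL_GROUP_BY is disabled.
Judging from the query, the second GROUP BY is not needed. I moved both WHERE conditions, as well as the LIMIT, into the inner query. I wonder whether the following is quicker: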
SELECT FeedbackId, ParentFeedbackId, PageId, FeedbackTitle, FeedbackText, FeedbackDate
FROM ReaderFeedback AS c
JOIN (
SELECT max(FeedbackId) AS MaxFeedbackId
FROM ReaderFeedback
WHERE ParentFeedbackId IS NOT NULL
AND FeedbackText LIKE '%?%'
GROUP BY ParentFeedbackId
ORDER BY 1 DESC LIMIT 50
) AS d ON d.MaxFeedbackId = c.FeedbackId
Please also have a look at your table structure and see whether any denormalisation could be done for speed.

Related

Speeding up mysql query

I have a MySQL query that joins four tables. I thought a plain join was the best approach, but now that the data is getting bigger the query seems to make the application stop executing.
SELECT
`purchase_order`.`id`,
`purchase_order`.`po_date` AS po_date,
`purchase_order`.`po_number`,
`purchase_order`.`customer_id` AS customer_id ,
`customer`.`name` AS customer_name,
`purchase_order`.`status` AS po_status,
`purchase_order_items`.`product_id`,
`purchase_order_items`.`po_item_name`,
`product`.`weight` as product_weight,
`product`.`pending` as product_pending,
`product`.`company_owner` as company_owner,
`purchase_order_items`.`uom`,
`purchase_order_items`.`po_item_type`,
`purchase_order_items`.`order_sequence`,
`purchase_order_items`.`pending_balance`,
`purchase_order_items`.`quantity`,
`purchase_order_items`.`notes`,
`purchase_order_items`.`status` AS po_item_status,
`purchase_order_items`.`id` AS po_item_id
FROM `purchase_order`
INNER JOIN customer ON `customer`.`id` = `purchase_order`.`customer_id`
INNER JOIN purchase_order_items ON `purchase_order_items`.`po_id` = `purchase_order`.`id`
INNER JOIN product ON `purchase_order_items`.`product_id` = `product`.`id`
GROUP BY id ORDER BY `purchase_order`.`po_date` DESC LIMIT 0, 20
My problem is that the query takes a long time to finish. Is there a way to speed it up, or to rewrite it for faster retrieval of the data?
Here is the EXPLAIN EXTENDED output, as requested in the comments.
Thanks in advance, I really hope this is the right channel for me to ask. If not please let me know.
Will this give you the correct list of ids?
SELECT id
FROM purchase_order
ORDER BY `po_date` DESC
LIMIT 0, 20
If so, then start with that before launching into the JOIN. You can also (I think) get rid of the GROUP BY that is causing an "explode-implode" of rows.
SELECT ...
FROM ( SELECT id ... (as above) ...) AS ids
JOIN purchase_order po ON po.id = ids.id
JOIN ... (the other tables)
GROUP BY ... -- (this may be problematic, especially with the LIMIT)
ORDER BY po.po_date DESC -- yes, this needs repeating
-- no LIMIT
Something like this
SELECT
`purchase_order`.`id`,
`purchase_order`.`po_date` AS po_date,
`purchase_order`.`po_number`,
`purchase_order`.`customer_id` AS customer_id ,
`customer`.`name` AS customer_name,
`purchase_order`.`status` AS po_status,
`purchase_order_items`.`product_id`,
`purchase_order_items`.`po_item_name`,
`product`.`weight` as product_weight,
`product`.`pending` as product_pending,
`product`.`company_owner` as company_owner,
`purchase_order_items`.`uom`,
`purchase_order_items`.`po_item_type`,
`purchase_order_items`.`order_sequence`,
`purchase_order_items`.`pending_balance`,
`purchase_order_items`.`quantity`,
`purchase_order_items`.`notes`,
`purchase_order_items`.`status` AS po_item_status,
`purchase_order_items`.`id` AS po_item_id
FROM (SELECT id, po_date, po_number, customer_id, status
FROM purchase_order
ORDER BY `po_date` DESC
LIMIT 0, 5) as purchase_order
INNER JOIN customer ON `customer`.`id` = `purchase_order`.`customer_id`
INNER JOIN purchase_order_items
ON `purchase_order_items`.`po_id` = `purchase_order`.`id`
INNER JOIN product ON `purchase_order_items`.`product_id` = `product`.`id`
GROUP BY purchase_order.id DESC
LIMIT 0, 5
Make sure that purchase_order.po_date and all id columns are indexed. You can check this with the query below.
SHOW INDEX FROM yourtable;
Since you mentioned that the data is getting bigger, I would suggest sharding so that you can run multiple queries in parallel. Please refer to the following article:
Parallel Query for MySQL with Shard-Query
First, I cleaned up readability a bit. You don't need tick marks around every table.column reference. Also, for short-hand, using aliases works well. Ex: "po" instead of "purchase_order", "poi" instead of "purchase_order_items". The only time I would use tick marks is around reserved words that might cause a problem.
Second, you don't have any aggregations (sum, min, max, count, avg, etc.) in your query so you should be able to strip the GROUP BY clause.
As for indexes, I would have to assume you have an index on your reference tables on their respective "id" key columns.
For your purchase_order table, I would have an index with "po_date" in the first field position (check whether an index you already have starts with it). Since your ORDER BY is on that column, the engine can jump directly to the most recent records and the descending order is already resolved.
SELECT
po.id,
po.po_date,
po.po_number,
po.customer_id,
c.`name` AS customer_name,
po.`status` AS po_status,
poi.product_id,
poi.po_item_name,
p.weight as product_weight,
p.pending as product_pending,
p.company_owner,
poi.uom,
poi.po_item_type,
poi.order_sequence,
poi.pending_balance,
poi.quantity,
poi.notes,
poi.`status` AS po_item_status,
poi.id AS po_item_id
FROM
purchase_order po
INNER JOIN customer c
ON po.customer_id = c.id
INNER JOIN purchase_order_items poi
ON po.id = poi.po_id
INNER JOIN product p
ON poi.product_id = p.id
ORDER BY
po.po_date DESC
LIMIT
0, 20

MySQL running very slowly due to `Order by` & `Limit`

I have a performance issue with the query below on MySQL. The query involves 5 tables. When I apply the ORDER BY and LIMIT, the results are retrieved in 0.3 secs, but without them I get the results in 0.01 secs. I have tried changing the query, but that did not work. Could someone please help me with this query so I can get the results in the desired time (<0.3 secs)?
Below are the details.
m_todos = 286579 (records)
m_pat = 214858 (records)
users = 119 (records)
m_programs = 26 (records)
role = 4 (records)
SELECT *
FROM (
SELECT t.*,
mp.name as A_name,
u.first_name, u.last_name,
p.first, p.last, p.zone, p.language,p.handling,
r.name,
u2.first_name AS created_first_name,
u2.last_name AS created_last_name
FROM m_todos t
INNER JOIN role r ON t.role_id=r.id
INNER JOIN m_pat p ON t.patient_id = p.id
LEFT JOIN users u2 ON t.created_id=u2.id
LEFT JOIN m_programs mp ON t.prog_id=mp.id
LEFT JOIN users u ON t.user_id=u.id
WHERE t.role_id !='9'
AND t.completed = '0000-00-00 00:00:00'
) C
ORDER BY priority DESC, due ASC
LIMIT 0,10
Get rid of the outer SELECT; move the ORDER BY and LIMIT in.
Indexes:
t: (completed)
t: (priority, due)
I assume priority and due are in t?? Please be explicit in the query. It could make a huge difference.
If the following works, it should speed things up a lot: Start by finding the t.id without all the JOINs:
SELECT id
FROM m_todos
WHERE role_id !='9'
AND completed = '0000-00-00 00:00:00'
ORDER BY priority DESC, due ASC
LIMIT 10
That will benefit from this covering composite index:
INDEX(completed, role_id, priority, due, id)
Debug that. Then use it in the rest:
SELECT t.*, the-other-stuff
FROM ( that-query ) AS t1
JOIN m_todos AS t USING(id)
then-the-rest-of-the-JOINs
ORDER BY priority DESC, due ASC -- yes, again
If you don't need all of t.*, it may be beneficial to spell out the actual columns needed.
The reason for this to run much faster is that the 10 rows are found efficiently by looking only at the one table. The original code was shoveling around a lot more rows than 10 and they included all the columns of t, plus columns from the other tables.
My version does only 10 lookups for all the extra stuff.
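The pattern can be sketched end to end as follows. This is only an illustration: it uses Python's built-in sqlite3 module rather than MySQL, a cut-down schema with made-up rows, an assumed index name (idx_todo_pick), and an extra id tie-breaker in the ORDER BY so that both queries are deterministic and can be compared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE m_pat (id INTEGER PRIMARY KEY, first TEXT);
    CREATE TABLE m_todos (
        id INTEGER PRIMARY KEY, patient_id INTEGER, role_id INTEGER,
        completed TEXT, priority INTEGER, due TEXT
    );
    CREATE INDEX idx_todo_pick ON m_todos (completed, role_id, priority, due, id);
""")
conn.executemany("INSERT INTO m_pat VALUES (?, ?)",
                 [(i, f"pat{i}") for i in range(1, 6)])
conn.executemany(
    "INSERT INTO m_todos VALUES (?, ?, ?, ?, ?, ?)",
    [(i, 1 + i % 5, 1 + i % 3, "0000-00-00 00:00:00", i % 4,
      f"2020-01-{1 + i % 28:02d}") for i in range(1, 201)],
)

OPEN = "0000-00-00 00:00:00"

# Step 1: find the 10 ids by looking at m_todos only.
pick_ids = """
    SELECT id FROM m_todos
    WHERE role_id != 9 AND completed = ?
    ORDER BY priority DESC, due ASC, id ASC
    LIMIT 10
"""

# Step 2: join the other tables against just those 10 rows.
deferred = conn.execute(f"""
    SELECT t.id, t.priority, t.due, p.first
    FROM ({pick_ids}) AS t1
    JOIN m_todos AS t ON t.id = t1.id
    JOIN m_pat AS p ON p.id = t.patient_id
    ORDER BY t.priority DESC, t.due ASC, t.id ASC
""", (OPEN,)).fetchall()

# Reference: join everything first, then sort and limit.
direct = conn.execute("""
    SELECT t.id, t.priority, t.due, p.first
    FROM m_todos AS t
    JOIN m_pat AS p ON p.id = t.patient_id
    WHERE t.role_id != 9 AND t.completed = ?
    ORDER BY t.priority DESC, t.due ASC, t.id ASC
    LIMIT 10
""", (OPEN,)).fetchall()

print(deferred == direct)
```

The two results only match when the later joins neither drop nor multiply rows; if an INNER JOIN can filter out a todo, the two-step version may return fewer than 10 rows.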

Optimization of relatively basic JOIN and GROUP BY query

I have a relatively basic query that fetches the most recent messages per conversation:
SELECT `message`.`conversation_id`, MAX(`message`.`add_time`) AS `max_add_time`
FROM `message`
LEFT JOIN `conversation` ON `message`.`conversation_id` = `conversation`.`id`
WHERE ((`conversation`.`receiver_user_id` = 1 AND `conversation`.`status` != -2)
OR (`conversation`.`sender_user_id` = 1 AND `conversation`.`status` != -1))
GROUP BY `conversation_id`
ORDER BY `max_add_time` DESC
LIMIT 12
The message table contains more than 911000 records; the conversation table contains around 680000. The execution time for this query varies between 4 and 10 seconds, depending on the load on the server, which is far too long.
Below is a screenshot of the EXPLAIN result:
The cause is apparently the MAX and/or the GROUP BY, because the following similar query only takes 10ms:
SELECT COUNT(*)
FROM `message`
LEFT JOIN `conversation` ON `message`.`conversation_id` = `conversation`.`id`
WHERE (`message`.`status`=0)
AND (`message`.`user_id` <> 1)
AND ((`conversation`.`sender_user_id` = 1 OR `conversation`.`receiver_user_id` = 1))
The corresponding EXPLAIN result:
I have tried adding different indices to both tables without any improvement, for example: conv_msg_idx(add_time, conversation_id) on message which seems to be used according to the first EXPLAIN result, however the query still takes around 10 seconds to execute.
Any help improving the indices or query to get the execution time down would be greatly appreciated.
EDIT:
I have changed the query to use an INNER JOIN:
SELECT `message`.`conversation_id`, MAX(`message`.`add_time`) AS `max_add_time`
FROM `message`
INNER JOIN `conversation` ON `message`.`conversation_id` = `conversation`.`id`
WHERE ((`conversation`.`receiver_user_id` = 1 AND `conversation`.`status` != -2)
OR (`conversation`.`sender_user_id` = 1 AND `conversation`.`status` != -1))
GROUP BY `conversation_id`
ORDER BY `max_add_time` DESC
LIMIT 12
But the execution time is still ~ 6 seconds.
You should create a multiple-column index on the columns that appear in your WHERE clause and that you SELECT (except conversation_id). (reference)
conversation_id should be indexed in both tables.
Try to avoid OR in the SQL query, as it can make fetching slow. Use UNION or another method instead.
SELECT message.conversation_id, MAX(message.add_time) AS max_add_time
FROM message
INNER JOIN conversation ON message.conversation_id = conversation.id
WHERE (conversation.sender_user_id = 1 AND conversation.status != -1)
GROUP BY conversation_id
UNION
SELECT message.conversation_id, MAX(message.add_time) AS max_add_time
FROM message
INNER JOIN conversation ON message.conversation_id = conversation.id
WHERE (conversation.receiver_user_id = 1 AND conversation.status != -2)
GROUP BY conversation_id
ORDER BY max_add_time DESC
LIMIT 12
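Whether the UNION form is actually faster has to be measured on the real data, but it is easy to check that it returns the same rows as the OR form. The sketch below does that on a handful of made-up rows, using Python's built-in sqlite3 module as a stand-in for MySQL; the index names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE conversation (id INTEGER PRIMARY KEY, sender_user_id INTEGER,
                               receiver_user_id INTEGER, status INTEGER);
    CREATE TABLE message (id INTEGER PRIMARY KEY, conversation_id INTEGER,
                          add_time INTEGER);
    CREATE INDEX idx_conv_sender ON conversation (sender_user_id);
    CREATE INDEX idx_conv_receiver ON conversation (receiver_user_id);
    CREATE INDEX idx_msg_conv ON message (conversation_id, add_time);
""")
conn.executemany("INSERT INTO conversation VALUES (?, ?, ?, ?)",
                 [(1, 1, 2, 0), (2, 3, 1, 0), (3, 1, 4, -1), (4, 5, 1, -2), (5, 6, 7, 0)])
conn.executemany("INSERT INTO message (conversation_id, add_time) VALUES (?, ?)",
                 [(1, 10), (1, 40), (2, 30), (3, 50), (4, 60), (5, 70), (2, 20)])

# Original shape: one query with OR across two different columns.
with_or = conn.execute("""
    SELECT m.conversation_id, MAX(m.add_time) AS max_add_time
    FROM message m
    JOIN conversation c ON m.conversation_id = c.id
    WHERE (c.receiver_user_id = 1 AND c.status != -2)
       OR (c.sender_user_id = 1 AND c.status != -1)
    GROUP BY m.conversation_id
    ORDER BY max_add_time DESC
    LIMIT 12
""").fetchall()

# UNION shape: each branch filters on a single user column.
with_union = conn.execute("""
    SELECT m.conversation_id, MAX(m.add_time) AS max_add_time
    FROM message m
    JOIN conversation c ON m.conversation_id = c.id
    WHERE c.sender_user_id = 1 AND c.status != -1
    GROUP BY m.conversation_id
    UNION
    SELECT m.conversation_id, MAX(m.add_time) AS max_add_time
    FROM message m
    JOIN conversation c ON m.conversation_id = c.id
    WHERE c.receiver_user_id = 1 AND c.status != -2
    GROUP BY m.conversation_id
    ORDER BY max_add_time DESC
    LIMIT 12
""").fetchall()

print(with_or, with_union)
```

UNION (without ALL) removes duplicate rows, so a conversation that matches both branches still appears once.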
Instead of depending on the single table message, have two tables: message, as you have now, plus another table, thread, that keeps the status of each thread of messages.
Yes, that requires a little more work when adding a new message -- update a column or two in thread.
But it eliminates the GROUP BY and MAX that are causing grief in this query.
While doing this split, see if some other columns would be better off in the new table.
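A rough sketch of such a summary table, again with Python's built-in sqlite3 module standing in for MySQL. The thread layout, the last_add_time column and the add_message helper are assumptions for illustration, and the status conditions from the question are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE thread (              -- hypothetical: one row per conversation
        conversation_id  INTEGER PRIMARY KEY,
        receiver_user_id INTEGER,
        sender_user_id   INTEGER,
        last_add_time    INTEGER       -- refreshed on every new message
    );
    CREATE TABLE message (
        id INTEGER PRIMARY KEY, conversation_id INTEGER, add_time INTEGER
    );
""")

def add_message(conn, conversation_id, add_time):
    """Insert the message and refresh the summary row in one transaction."""
    with conn:
        conn.execute(
            "INSERT INTO message (conversation_id, add_time) VALUES (?, ?)",
            (conversation_id, add_time),
        )
        # Two-argument MAX() is SQLite's scalar maximum; MySQL would use GREATEST().
        conn.execute(
            "UPDATE thread SET last_add_time = MAX(COALESCE(last_add_time, 0), ?) "
            "WHERE conversation_id = ?",
            (add_time, conversation_id),
        )

conn.executemany("INSERT INTO thread VALUES (?, ?, ?, NULL)",
                 [(1, 1, 2), (2, 3, 1), (3, 4, 5)])
for conv, ts in [(1, 100), (2, 150), (1, 200), (3, 300), (2, 120)]:
    add_message(conn, conv, ts)

# No GROUP BY and no MAX at read time: the answer is already stored per thread.
latest = conn.execute("""
    SELECT conversation_id, last_add_time
    FROM thread
    WHERE receiver_user_id = 1 OR sender_user_id = 1
    ORDER BY last_add_time DESC
    LIMIT 12
""").fetchall()
print(latest)
```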
SELECT `message`.`conversation_id`, MAX(`message`.`add_time`) AS `max_add_time`
FROM `message`
INNER JOIN `conversation` ON `message`.`conversation_id` = `conversation`.`id`
WHERE ((`conversation`.`receiver_user_id` = 1 AND `conversation`.`status` != -2)
OR (`conversation`.`sender_user_id` = 1 AND `conversation`.`status` != -1))
GROUP BY `conversation_id`
ORDER BY `max_add_time` DESC
LIMIT 12
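You can try an INNER JOIN, if it does not affect your logic.
You can also avoid MAX() by using ROW_NUMBER() (window functions require MySQL 8.0 or later):
SELECT t1.conversation_id, t1.add_time
FROM (
SELECT conversation_id, add_time,
row_number() OVER (PARTITION BY conversation_id ORDER BY add_time DESC) AS p1
FROM message
) t1 WHERE t1.p1 = 1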

How to make faster queries on my mysql table?

I have the following table
As you can see, it has 1868155 rows. I am attempting to make a real-time graph, but it is impossible since almost any query takes 1 or 2 seconds.
For example, this query
SELECT sensor.nombre, temperatura.temperatura
FROM sensor, temperatura
WHERE sensor.id = temperatura.idsensor
ORDER BY temperatura.fecha DESC, idsensor ASC
LIMIT 4
Is supposed to show this
I've tried everything: using indexes (perhaps not correctly), selecting only the fields I need instead of *, etc., but the results are the same!
These are the indexes of the table.
Explain of the query
EDITED
This is the explain of the query after implementing
ALTER TABLE temperatura
ADD INDEX `sensor_temp` (`idsensor`,`fecha`,`temperatura`)
And using inner join syntax for the query
SELECT s.nombre, t.temperatura
FROM sensor s
INNER JOIN temperatura t
ON s.id = t.idsensor
ORDER BY t.fecha DESC, t.idsensor ASC
LIMIT 4
This is my whole sensor table
Try the following:
ALTER TABLE temperatura
ADD INDEX `sensor_temp` (`idsensor`,`fecha`,`temperatura`)
I also recommend using modern join syntax:
SELECT s.nombre, t.temperatura
FROM sensor s
INNER JOIN temperatura t
ON s.id = t.idsensor
ORDER BY t.fecha DESC, t.idsensor ASC
LIMIT 4
Report the EXPLAIN again after making the above changes, if performance is still not good enough.
Attempt #2
After looking closely at what it appears you are trying to do, I believe this next query may be more effective:
SELECT
s.nombre, t.temperatura
FROM temperatura t
LEFT OUTER JOIN temperatura later_t
ON later_t.idsensor = t.idsensor
AND later_t.fecha > t.fecha
INNER JOIN sensor s
ON s.id = t.idsensor
WHERE later_t.idsensor IS NULL
ORDER BY t.idsensor ASC
You can also try:
SELECT
s.nombre, t.temperatura
FROM temperatura t
INNER JOIN (
SELECT
t.idsensor,
MAX(t.fecha) AS fecha
FROM temperatura t
GROUP BY t.idsensor
) max_fecha
ON max_fecha.idsensor = t.idsensor
AND max_fecha.fecha = t.fecha
INNER JOIN sensor s
ON s.id = t.idsensor
ORDER BY t.idsensor ASC
In my experience, if you are trying to find the most recent record, one of the two queries above will work. Which works best depends on various factors, so try them both.
Let me know how those perform, and if they still get you the data you want. Also, any query you run, run at least 3 times, and report all 3 times. That will help get an accurate measure of how fast a given query is, since various external factors can affect the speed of a query.
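One way to run that comparison is sketched below: an anti-join and a grouped-maximum join for "latest row per sensor" are each executed three times, and their results are compared. It uses Python's built-in sqlite3 module with made-up readings instead of the real MySQL table, so the absolute timings mean nothing for the real server. The time_query helper is hypothetical, and the grouped-maximum query matches the latest fecha with an equality test.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sensor (id INTEGER PRIMARY KEY, nombre TEXT);
    CREATE TABLE temperatura (idsensor INTEGER, fecha INTEGER, temperatura REAL);
    CREATE INDEX sensor_temp ON temperatura (idsensor, fecha, temperatura);
""")
conn.executemany("INSERT INTO sensor VALUES (?, ?)",
                 [(i, f"s{i}") for i in range(1, 5)])
# Made-up readings: 4 sensors, fecha is just an increasing integer here.
conn.executemany("INSERT INTO temperatura VALUES (?, ?, ?)",
                 [(1 + i % 4, i, 20.0 + i % 7) for i in range(2000)])

ANTI_JOIN = """
    SELECT s.nombre, t.temperatura
    FROM temperatura t
    LEFT JOIN temperatura later_t
      ON later_t.idsensor = t.idsensor AND later_t.fecha > t.fecha
    JOIN sensor s ON s.id = t.idsensor
    WHERE later_t.idsensor IS NULL
    ORDER BY t.idsensor
"""
GROUP_MAX = """
    SELECT s.nombre, t.temperatura
    FROM temperatura t
    JOIN (SELECT idsensor, MAX(fecha) AS fecha
          FROM temperatura GROUP BY idsensor) max_fecha
      ON max_fecha.idsensor = t.idsensor AND max_fecha.fecha = t.fecha
    JOIN sensor s ON s.id = t.idsensor
    ORDER BY t.idsensor
"""

def time_query(sql, runs=3):
    """Run sql `runs` times; return the rows and the duration of each run in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        rows = conn.execute(sql).fetchall()  # fetch all rows so the work is really done
        timings.append(time.perf_counter() - start)
    return rows, timings

anti_rows, anti_times = time_query(ANTI_JOIN)
group_rows, group_times = time_query(GROUP_MAX)
print(anti_rows == group_rows, anti_times, group_times)
```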
An ordinary ascending index cannot optimize a mixture of ASC and DESC (MySQL only gained descending indexes in 8.0), as in
ORDER BY t.fecha DESC, t.idsensor ASC
You tried a covering index:
INDEX `sensor_temp` (`idsensor`,`fecha`,`temperatura`)
However, this covering index may be better:
INDEX `sensor_temp` (`fecha`,`idsensor`,`temperatura`)
Then, if you are willing to get the sensors in a different order, use
ORDER BY t.fecha DESC, t.idsensor DESC
This will give you up to 4 sensors for the last fecha:
sensor: PRIMARY KEY(id)
temperatura: INDEX(fecha, idsensor, temperatura)
SELECT
( SELECT nombre FROM sensor WHERE id = t.idsensor ) AS nombre,
t.temperatura
FROM
( SELECT MAX(fecha) AS max_fecha FROM temperatura ) AS z
JOIN temperatura AS t ON t.fecha = z.max_fecha
ORDER BY t.idsensor ASC
LIMIT 4;

How to limit subquery requests to one?

I was looking for a way to use one query with a subquery instead of two separate queries.
But it turns out that the subquery is executed once for each row in the result set. Is there a way to make the count subquery run only once within a combined query?
SELECT `ad_general`.`id`,
( SELECT count(`ad_general`.`id`) AS count
FROM (`ad_general`)
WHERE `city` = 708 ) AS count
FROM (`ad_general`)
WHERE `ad_general`.`city` = '708'
ORDER BY `ad_general`.`id` DESC
LIMIT 15
Maybe using a join can solve the problem, but I don't know how.
SELECT ad_general.id, stats.cnt
FROM ad_general
JOIN (
SELECT count(*) as cnt
FROM ad_general
WHERE city = 708
) AS stats
WHERE ad_general.city = 708
ORDER BY ad_general.id DESC
LIMIT 15;
The explicit table names aren't required, but are used both for clarity and maintainability (the explicit table names will prevent any ambiguities should the schema for ad_general or the generated table ever change).
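The derived-table join can be tried on a toy table as follows. This is a sketch using Python's built-in sqlite3 module instead of MySQL, with made-up rows and an assumed index name; the join is written as CROSS JOIN to make explicit that the one-row count is attached to every result row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ad_general (id INTEGER PRIMARY KEY, city INTEGER);
    CREATE INDEX idx_city ON ad_general (city, id);
""")
# Made-up data: even ids belong to city 708, odd ids to city 100.
conn.executemany("INSERT INTO ad_general VALUES (?, ?)",
                 [(i, 708 if i % 2 == 0 else 100) for i in range(1, 41)])

rows = conn.execute("""
    SELECT ad_general.id, stats.cnt
    FROM ad_general
    CROSS JOIN (
        SELECT COUNT(*) AS cnt      -- evaluated as a single one-row derived table
        FROM ad_general
        WHERE city = 708
    ) AS stats
    WHERE ad_general.city = 708
    ORDER BY ad_general.id DESC
    LIMIT 15
""").fetchall()
print(rows[:3])
```

Every returned row carries the same total, so the application can read the count from the first row.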
You can self-join (join the table to itself) and apply the aggregate function to the second copy.
SELECT `adgen`.`id`, COUNT(`adgen_count`.`id`) AS `count`
FROM `ad_general` AS `adgen`
JOIN `ad_general` AS `adgen_count` ON `adgen_count`.city = 708
WHERE `adgen`.`city` = 708
GROUP BY `adgen`.`id`
ORDER BY `adgen`.`id` DESC
LIMIT 15
However, it's impossible to say what the appropriate grouping is without knowing the structure of the table.