Inefficient SQL - mysql

I'm no MySQL expert, but I've managed until now to hack together something that works. Unfortunately, my latest bodged attempt results in the server dying, so obviously I'm doing something that is massively inefficient. Can anyone give me a hint as to where the problem is and how I might get the same results without bringing the whole site down every time?
$sqlbest = "SELECT
wp_postmeta.meta_value
, wp_posts.post_title
, wp_posts.ID
, (TO_DAYS(CURDATE())- TO_DAYS(wp_posts.post_date))+1 AS days
FROM `wp_postmeta` , `wp_posts`
WHERE `wp_postmeta`.`post_id` = `wp_posts`.`ID`
AND `wp_posts`.`post_date` >= DATE_SUB( CURDATE( ) , INTERVAL 1 WEEK)
AND `wp_postmeta`.`meta_key` = 'views'
AND `wp_posts`.`post_status` = 'publish'
AND wp_posts.ID != '".$currentPostID."'
GROUP BY `wp_postmeta`.`post_id`
ORDER BY (CAST( `wp_postmeta`.`meta_value` AS UNSIGNED ) / days) DESC
LIMIT 0 , 4";
$results = $wpdb->get_results($sqlbest);
It uses a post view count to calculate views/day for posts published in the last week, then orders them by that number and grabs the top 4.
I think it's inefficient because it has to calculate views/day every time for a few thousand posts, but I don't know how to do it any better.
Thanks in advance.

You could eliminate the need to call those date functions every time by either passing them statically into the query from your PHP server (which may not be synced with your database) or you can instead write a stored procedure and save the results of those date functions to variables that will then be used in the query.

SELECT
wp_postmeta.meta_value
, wp_posts.post_title
, wp_posts.ID
, DATEDIFF(CURDATE(),wp_posts.post_date)+1 AS days -- 1: DATEDIFF
FROM wp_postmeta
INNER JOIN wp_posts ON (wp_postmeta.post_id = wp_posts.ID) -- 2: explicit join
WHERE wp_posts.post_date >= DATE_SUB( CURDATE( ) , INTERVAL 1 WEEK)
AND wp_postmeta.meta_key = 'views'
AND wp_posts.post_status = 'publish'
AND wp_posts.ID != '".$currentPostID."'
AND wp_postmeta.meta_value > 1 -- 3: extra filter
/* GROUP BY wp_postmeta.post_id */ -- 4: group by not needed
ORDER BY (CAST( wp_postmeta.meta_value AS UNSIGNED ) / days) DESC
LIMIT 0 , 4;
I've made a few changes:
1. Replaced the two calls to TO_DAYS with one call to DATEDIFF.
2. Replaced the ugly implicit WHERE join with an explicit INNER JOIN. This does not change the result; it just makes things clearer. One thing it shows: if wp_postmeta.post_id is unique, you do not need the GROUP BY, because the inner join will only give one row per wp_postmeta.post_id.
3. Added an extra filter to exclude posts with a low view count, which limits the number of rows MySQL has to sort.
4. Eliminated the GROUP BY. This is only correct if wp_postmeta.post_id is unique!
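For anyone who wants to try the rewritten query on toy data, here is a minimal sketch in Python. It makes several assumptions: SQLite (from the standard library) stands in for MySQL, so julianday() replaces DATEDIFF; the table contents are invented; and top_posts is a hypothetical helper name. The date cutoff is computed once in the application and bound as a parameter, as the first answer suggests, and the current post ID is bound rather than concatenated into the SQL string.

```python
import sqlite3
from datetime import date, timedelta

# In-memory stand-in for the two WordPress tables (invented sample rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, post_title TEXT,
                           post_date TEXT, post_status TEXT);
    CREATE TABLE wp_postmeta (post_id INTEGER, meta_key TEXT, meta_value TEXT);
""")
conn.executemany("INSERT INTO wp_posts VALUES (?, ?, ?, ?)", [
    (1, "A", "2024-01-09", "publish"),
    (2, "B", "2024-01-05", "publish"),
    (3, "C", "2024-01-10", "publish"),   # the post currently being viewed
    (4, "D", "2023-12-01", "publish"),   # older than one week
    (5, "E", "2024-01-08", "draft"),     # not published
])
conn.executemany("INSERT INTO wp_postmeta VALUES (?, 'views', ?)", [
    (1, "100"), (2, "240"), (3, "70"), (4, "9000"), (5, "500"),
])


def top_posts(conn, today, current_post_id, limit=4):
    """Rank last week's published posts by views per day (hypothetical helper)."""
    cutoff = (today - timedelta(weeks=1)).isoformat()  # computed once, in the app
    sql = """
        SELECT p.ID, p.post_title,
               CAST(m.meta_value AS INTEGER) * 1.0
                 / (julianday(?) - julianday(p.post_date) + 1) AS views_per_day
        FROM wp_posts p
        INNER JOIN wp_postmeta m ON m.post_id = p.ID
        WHERE p.post_date >= ?
          AND m.meta_key = 'views'
          AND p.post_status = 'publish'
          AND p.ID != ?
        ORDER BY views_per_day DESC
        LIMIT ?
    """
    params = (today.isoformat(), cutoff, current_post_id, limit)
    return conn.execute(sql, params).fetchall()


ranking = top_posts(conn, date(2024, 1, 10), current_post_id=3)
print([row[0] for row in ranking])  # IDs of the best-performing posts
```

With this sample data only posts 1 and 2 survive the filters, and post 1 ranks first because it has more views per day.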

Related

Preventing random ordering on MYSQL when not using ORDER BY RAND()

I am perplexed, because every now and then I get a randomly ordered result just by executing this MySQL query directly in Navicat. My understanding is that unless you explicitly use ORDER BY RAND(), the order should be the same every time, but that is not what happens here.
SELECT SQL_CALC_FOUND_ROWS wp_posts.ID FROM wp_posts
INNER JOIN wp_postmeta ON ( wp_posts.ID = wp_postmeta.post_id )
INNER JOIN wp_product dg ON dg.post_id = wp_posts.ID
AND ( dg.location IN ('XV', 'QV', 'DH') )
AND (0 OR (dg.srp = 1) OR (dg.SoldDate > (now() - interval 30 DAY)
AND dg.isPremium = 0 AND dg.SoldDate != '0000-00-00 00:00:00'
AND dg.SoldDate IS NOT NULL AND price > 0 AND NoImage = 0))
AND ( dg.deleted IS NULL OR dg.deleted <> 1 )
WHERE 1=1 AND ( wp_postmeta.meta_key = '_product_info_new' )
AND wp_posts.post_type = 'product'
AND (wp_posts.post_status = 'publish')
GROUP BY wp_posts.ID
ORDER BY dg.SoldDate IS NULL, dg.SoldDate ASC, dg.isPremium DESC LIMIT 0, 30
wp_product has the same number of rows after executing COUNT(*) for 10 minutes straight. But executing the same query above gives me a different result every 5 seconds or so, which is really strange. Is there a way to prevent the same query from returning a different result set?
If you don't specify an ORDER BY, the order in which rows are returned is undefined according to the SQL standard.
In practice, if you use InnoDB tables, the rows are returned in the order of the index used to access them.
The query optimizer might choose a different index from time to time, depending on the values you search for. It chooses an index using a cost-based algorithm, trying to estimate which index will result in examining the fewest rows, or otherwise eliminating overhead.
The estimate is based on statistics about frequency of values in each index, and these are rough estimates, not always precise.
In some cases, the optimizer is choosing between alternatives that it estimates have very close cost according to its algorithm, and very slight changes in the statistics can cause the choice to flip-flop between these alternatives.
The bottom line is that if you need the rows returned in a specific order, then use ORDER BY. This will override the default order-by-index-reads behavior, and cause the result to be sorted if necessary.
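A related point, offered as an assumption rather than as part of the answer above: an ORDER BY only pins down the order as far as its sort keys distinguish rows. When several rows tie on every key, their relative order is still up to the engine. A minimal sketch in Python, with SQLite standing in for MySQL and an invented product table, shows a unique column appended as the last sort key:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (post_id INTEGER PRIMARY KEY, sold_date TEXT, is_premium INTEGER)"
)
conn.executemany("INSERT INTO product VALUES (?, ?, ?)", [
    (1, "2024-01-02", 0),
    (2, "2024-01-01", 0),   # ties with row 3 on every original sort key
    (3, "2024-01-01", 0),
    (4, None, 1),
    (5, "2024-01-01", 1),
])

# Same sort keys as the query in the question, plus post_id as a unique
# tie-breaker (the tie-breaker is an addition, not part of the original query).
ORDERED = """
    SELECT post_id FROM product
    ORDER BY sold_date IS NULL, sold_date ASC, is_premium DESC, post_id ASC
"""
ids = [row[0] for row in conn.execute(ORDERED)]
print(ids)
```

Rows 2 and 3 are identical on the original three keys, so only the final post_id key fixes which one comes first.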

Where to start to optimize slow WP SQL query?

I have the following query, which currently takes about 0.3s to load, causing a heavy load on my Wordpress site.
SELECT SQL_CALC_FOUND_ROWS wp11_posts.ID
FROM wp11_posts
WHERE 1=1
AND ( wp11_posts.ID NOT IN (
SELECT object_id
FROM wp11_term_relationships
WHERE term_taxonomy_id IN (137,141) )
AND (
SELECT COUNT(1)
FROM wp11_term_relationships
WHERE term_taxonomy_id IN (53)
AND object_id = wp11_posts.ID ) = 1 )
AND wp11_posts.post_type = 'post'
AND ((wp11_posts.post_status = 'publish'))
GROUP BY wp11_posts.ID
ORDER BY wp11_posts.post_date DESC
LIMIT 0, 5
Where should I start to make it execute faster? Is there an apparent mistake standing out that should definitely have been done differently?
You have a so-called dependent subquery (a/k/a correlated subquery) in your example. It's a performance killer.
WHERE (
SELECT COUNT(1)
FROM wp_term_relationships
WHERE term_taxonomy_id IN (53)
AND object_id = wp_posts.ID
) = 1
Refactoring it to an independent subquery looks like this:
SELECT SQL_CALC_FOUND_ROWS wp_posts.ID
FROM wp_posts
JOIN (
SELECT object_id
FROM wp_term_relationships
WHERE term_taxonomy_id IN (53)
GROUP BY object_id
HAVING COUNT(*) = 1
) justone ON wp_posts.ID = justone.object_id
... WHERE ...
See how this works? It needs to scan term_relationships just one time looking for object_ids meeting your criterion (just one). Then the ordinary inner JOIN excludes posts rows that don't meet that criterion. (The dependent subquery loops to scan the table multiple times, while we wait.)
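To check that the two forms return the same rows, here is a minimal sketch in Python. It assumes SQLite as a stand-in for MySQL and uses invented rows with simplified table names; it says nothing about speed, only about equivalence of results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (ID INTEGER PRIMARY KEY);
    CREATE TABLE term_relationships (object_id INTEGER, term_taxonomy_id INTEGER);
    INSERT INTO posts VALUES (1), (2), (3);
    INSERT INTO term_relationships VALUES (1, 53), (2, 99), (3, 53), (3, 137);
""")

# Dependent (correlated) subquery: evaluated once per posts row.
CORRELATED = """
    SELECT p.ID FROM posts p
    WHERE (SELECT COUNT(1) FROM term_relationships tr
           WHERE tr.term_taxonomy_id IN (53)
             AND tr.object_id = p.ID) = 1
    ORDER BY p.ID
"""

# Independent subquery: the derived table is built once, then joined.
DERIVED = """
    SELECT p.ID FROM posts p
    JOIN (SELECT object_id FROM term_relationships
          WHERE term_taxonomy_id IN (53)
          GROUP BY object_id
          HAVING COUNT(*) = 1) justone ON p.ID = justone.object_id
    ORDER BY p.ID
"""

correlated_ids = [row[0] for row in conn.execute(CORRELATED)]
derived_ids = [row[0] for row in conn.execute(DERIVED)]
print(correlated_ids, derived_ids)
```

Both lists contain posts 1 and 3, the only ones attached to term 53 exactly once.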
The SQL_CALC_FOUND_ROWS thing: WordPress puts it there to help with "pagination" -- it lets WordPress figure out how many pages (in your case of five items) there are to display. It provides data to the familiar
987 Items << < 2 of [ 20 ] > >>
page-selection interface you see in many parts of WordPress: it counts all the items matched by your query (987 in this example), not just one pageload of them.
If you don't need that pagination you can turn it off by giving a 'no_found_rows' => true element to WP_Query(). But if your query only yields a small number of items without the LIMIT clause, this probably doesn't matter much. If you wrote the query yourself, just leave it out along with the ORDER BY and LIMIT clauses.
So, leaving in the pagination stuff, a better query is
ANALYZE FORMAT=JSON SELECT wp_posts.ID
FROM wp_posts
JOIN (
SELECT object_id
FROM wp_term_relationships
WHERE term_taxonomy_id IN (53)
GROUP BY object_id
HAVING COUNT(*) = 1
) justone ON wp_posts.ID = justone.object_id
WHERE 1 = 1
AND (
wp_posts.ID NOT IN (
SELECT object_id
FROM wp_term_relationships
WHERE term_taxonomy_id IN (137,141)
)
AND wp_posts.post_type = 'post'
AND wp_posts.post_status = 'publish')
GROUP BY wp_posts.ID
ORDER BY wp_posts.post_date DESC LIMIT 0, 5
You also have an unnecessary GROUP BY near the end of your query. It doesn't hurt performance: MySQL can tell it's not needed in this case and doesn't do anything with it. But it is extra stuff. If you wrote the query yourself leave it out.
SQL_CALC_FOUND_ROWS requires doing nearly as much work as the same query without the LIMIT. [However, removing it without doing most of the following things probably won't help much.]
Do you already have this plugin installed? https://wordpress.org/plugins/index-wp-mysql-for-speed/ If not, that may be a good starting point.
WP is not designed to handle millions of posts/attributes/terms; you may have to move beyond WP.
Using JOIN or LEFT JOIN or [NOT] EXISTS ( SELECT 1 ... ) may be more efficient than IN ( SELECT ... ), especially in older versions of MySQL.
Is your SELECT COUNT(1) attempting to demand exactly 1? That is, 2 would be disallowed? If you really wanted to know if any exist, then use
AND EXISTS( SELECT 1 FROM wp11_term_relationships
WHERE term_taxonomy_id IN (53)
AND object_id = wp11_posts.ID )
A better index for wp11_posts [I don't know whether your WP or the Plugin has this already]:
INDEX(post_status, post_type, -- first, either order is OK
post_date, ID) -- last, in this order
Having the GROUP BY and ORDER BY the 'same' may eliminate a sort. The following change will probably give you the same results, but faster.
GROUP BY wp11_posts.ID
ORDER BY wp11_posts.post_date DESC
-->
GROUP BY wp11_posts.post_date, wp11_posts.ID
ORDER BY wp11_posts.post_date DESC, wp11_posts.ID DESC

Why is limit 0,1 slower than limit 0, 17

I'm trying to analyze why the following query is slower with LIMIT 0,1 than LIMIT 0,100
I've added SQL_NO_CACHE for testing purposes.
Query:
SELECT
SQL_NO_CACHE SQL_CALC_FOUND_ROWS wp_posts.*,
low_stock_amount_meta.meta_value AS low_stock_amount
FROM
wp_posts
LEFT JOIN wp_wc_product_meta_lookup wc_product_meta_lookup ON wp_posts.ID = wc_product_meta_lookup.product_id
LEFT JOIN wp_postmeta AS low_stock_amount_meta ON wp_posts.ID = low_stock_amount_meta.post_id
AND low_stock_amount_meta.meta_key = '_low_stock_amount'
WHERE
1 = 1
AND wp_posts.post_type IN ('product', 'product_variation')
AND (
(wp_posts.post_status = 'publish')
)
AND wc_product_meta_lookup.stock_quantity IS NOT NULL
AND wc_product_meta_lookup.stock_status IN('instock', 'outofstock')
AND (
(
low_stock_amount_meta.meta_value > ''
AND wc_product_meta_lookup.stock_quantity <= CAST(
low_stock_amount_meta.meta_value AS SIGNED
)
)
OR (
(
low_stock_amount_meta.meta_value IS NULL
OR low_stock_amount_meta.meta_value <= ''
)
AND wc_product_meta_lookup.stock_quantity <= 2
)
)
ORDER BY
wp_posts.ID DESC
LIMIT
0, 1
EXPLAIN shows the exact same output for both:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
1 | SIMPLE | wp_posts | index | PRIMARY,type_status_date | PRIMARY | 8 | NULL | 27071 | Using where
1 | SIMPLE | low_stock_amount_meta | ref | post_id,meta_key | meta_key | 767 | const | 1 | Using where
1 | SIMPLE | wc_product_meta_lookup | eq_ref | PRIMARY,stock_status,stock_quantity,product_id | PRIMARY | 8 | woocommerce-admin.wp_posts.ID | 1 | Using where
The average query time is 350ms with LIMIT 0,1
The average query time is 7ms with LIMIT 0,100
The query performance gets faster starting with LIMIT 0,17
I've added another column to the order by clause as suggested in this question, but that triggers Using filesort in the explain output
Order by wp_posts.post_date, wp_posts.ID desc
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
1 | SIMPLE | wp_posts | ALL | PRIMARY,type_status_date | NULL | NULL | NULL | 27071 | Using where; Using filesort
1 | SIMPLE | low_stock_amount_meta | ref | post_id,meta_key | meta_key | 767 | const | 1 | Using where
1 | SIMPLE | wc_product_meta_lookup | eq_ref | PRIMARY,stock_status,stock_quantity,product_id | PRIMARY | 8 | woocommerce-admin.wp_posts.ID | 1 | Using where
Is there a way to work around it without altering indices and why is this happening?
It's also interesting that the query time improves starting with LIMIT 0,17. I'm not sure why 17 is a magic number here.
Update 1: I just tried adding FORCE INDEX(PRIMARY) and now LIMIT 0,100 has the same performance as LIMIT 0,1 smh
wp_postmeta has sloppy indexes; this slows down most queries involving it.
O. Jones and I have made a WordPress plugin to improve the indexing of postmeta. We detect all sorts of stuff like the presence of the Barracuda version of the InnoDB storage engine, and other MySQL arcana, and do the right thing.
That may speed up all three averages. It is likely to change the EXPLAINs.
Analyzing this query. I confess I don't understand the performance change from LIMIT 1 to LIMIT 17. Still, the problem for your store's customers (or managers) is the slowness on LIMIT 1. So let's address that.
The question you linked was for PostgreSQL, not MySQL. PostgreSQL has a more sophisticated way of handling ORDER BY ... LIMIT 1 than MySQL does. And the resolution to that problem was adding an appropriate compound index for the required lookup.
It looks to me like the purpose of your query is to find the low-stock or out-of-stock WooCommerce product with the largest wp_posts.ID
The LEFT JOIN to the wp_wc_product_meta_lookup table should be, and is, straightforward: the ON-condition column mentioned is its primary key. This table is, basically, WooCommerce's materialized view of numeric values like stock_quantity stored in wp_postmeta. Numeric values in wp_postmeta can't be indexed because that table stores them as text strings. Yeah. I know.
The LEFT JOIN between wp_posts and wp_postmeta follows the very common ON-condition pattern ON posts.ID = meta.post_id AND meta.meta_key = 'constant'. That ON condition is notorious for poor support by WordPress's standard indexes. More or less the entire purpose of Rick and my Index WP MySQL For Speed plugin is to provide good compound indexes in wp_postmeta to work around that problem.
How so? This is the DDL it runs to add the indexes. The most important lines for this purpose are these. (There's more to it; read the linked article.)
ALTER TABLE wp_postmeta ADD PRIMARY KEY (post_id, meta_key, meta_id);
ALTER TABLE wp_postmeta ADD KEY meta_key (meta_key, post_id);
These two indexes support the ON-condition pattern in the query. I am pretty sure that adding these keys to postmeta will make your query faster and more predictable in performance.
If the ORDER BY post.ID DESC is a very common use case, an index could be added for that.
You could try refactoring the query (if you have control over its source) to defer the retrieval of details from the wp_posts table. Like this.
SELECT wp_posts.*, postid.low_stock_amount
FROM (
SELECT wp_posts.ID, low_stock_amount_meta.meta_value AS low_stock_amount
FROM
wp_posts
LEFT JOIN wp_wc_product_meta_lookup wc_product_meta_lookup ON wp_posts.ID = wc_product_meta_lookup.product_id
LEFT JOIN wp_postmeta AS low_stock_amount_meta ON wp_posts.ID = low_stock_amount_meta.post_id
AND low_stock_amount_meta.meta_key = '_low_stock_amount'
WHERE
1 = 1
AND wp_posts.post_type IN ('product', 'product_variation')
AND (
(wp_posts.post_status = 'publish')
)
AND wc_product_meta_lookup.stock_quantity IS NOT NULL
AND wc_product_meta_lookup.stock_status IN('instock', 'outofstock')
AND (
(
low_stock_amount_meta.meta_value > ''
AND wc_product_meta_lookup.stock_quantity <= CAST(
low_stock_amount_meta.meta_value AS SIGNED
)
)
OR (
(
low_stock_amount_meta.meta_value IS NULL
OR low_stock_amount_meta.meta_value <= ''
)
AND wc_product_meta_lookup.stock_quantity <= 2
)
)
ORDER BY
wp_posts.ID DESC
LIMIT
0, 1
) postid
LEFT JOIN wp_posts ON wp_posts.ID = postid.ID
This refactoring makes your complex query sort only the wp_posts.ID value and then retrieves the posts data once it has the appropriate value in hand. Lots of WordPress core code does something similar: retrieves a list of post ID values in one query, then retrieves the post data in a second query.
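The shape of that deferred lookup can be seen on a much smaller example. The following is a minimal sketch in Python, assuming SQLite as a stand-in for MySQL, two invented tables, and a fixed low-stock threshold of 2 in place of the postmeta lookup. The inner query sorts and limits bare ID values; the outer join then fetches the full row for the single ID that was picked.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (ID INTEGER PRIMARY KEY, post_title TEXT);
    CREATE TABLE product_lookup (product_id INTEGER PRIMARY KEY, stock_quantity INTEGER);
    INSERT INTO posts VALUES (1, 'Product 1'), (2, 'Product 2'), (3, 'Product 3'),
                             (4, 'Product 4'), (5, 'Product 5');
    INSERT INTO product_lookup VALUES (1, 10), (2, 1), (3, 0), (4, 50), (5, 30);
""")

# Inner query: filter, sort and limit on the narrow ID column only.
# Outer query: fetch the wide posts row for the one ID that survived.
DEFERRED = """
    SELECT p.*
    FROM (
        SELECT p2.ID
        FROM posts p2
        JOIN product_lookup s ON s.product_id = p2.ID
        WHERE s.stock_quantity <= 2      -- assumed fixed threshold
        ORDER BY p2.ID DESC
        LIMIT 1
    ) picked
    JOIN posts p ON p.ID = picked.ID
"""
row = conn.execute(DEFERRED).fetchone()
print(row)
```

Products 2 and 3 are low on stock here, and the query returns the row with the larger ID of the two.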
And, by the way, MySQL 8 ignores SQL_NO_CACHE.

Wordpress MySQL - custom meta key order by key and date

I have a meta key which is set by a select drop-down, so a user can select an option between 1 and 14 and then save their post. I want the posts to display on the page from 1 to 14, ordered by date. If the user creates a new set of posts the next day, I want the same thing to happen, so that posts 1 to 14 display in that order for each day. The SQL I have so far is as follows:
SELECT SQL_CALC_FOUND_ROWS
wp_postmeta.meta_key,
wp_postmeta.meta_value,
wp_posts.*
FROM wp_posts
INNER JOIN wp_postmeta ON (wp_posts.ID = wp_postmeta.post_id)
WHERE 1=1
AND wp_posts.post_type = 'projectgallery'
AND ( wp_posts.post_status = 'publish'
OR wp_posts.post_status = 'private')
AND (wp_postmeta.meta_key = 'gallery_area' )
GROUP BY wp_posts.post_date asc
ORDER BY CAST(wp_postmeta.meta_value AS UNSIGNED) DESC,
DATE(wp_posts.post_date) desc;
This gives me the following output. Notice that the posts entered on different dates with either 1 or 3 show up in sequence; ideally I want the latest ones to display directly after 14, so it starts over again. The number 14 should not be static either: if someone adds another option to the select it will increase, and it will decrease if an option is removed.
GROUP BY is confusingly named. It only makes sense when there's a SUM() or COUNT() or some such function in the SELECT clause. It's not useful here.
The canonical way of getting a post_meta.value into a result set of post items is this. You're close but this makes it easier to read.
SELECT SQL_CALC_FOUND_ROWS
ga.meta_value gallery_area,
p.*
FROM wp_posts p
LEFT JOIN wp_postmeta ga ON p.ID = ga.post_id AND ga.meta_key = 'gallery_area'
WHERE 1=1
AND p.post_status IN ('publish', 'private')
AND p.post_type = 'projectgallery'
Notice the two parts of the ON clause in the JOIN. That way of doing the SQL gets you just the meta_key value you want cleanly.
So, that's your result set. You'll get a row for every post. If the metadata is missing, you'll get a NULL value for gallery_area.
Then you have to order the result set the way you want. First order by date, then order by gallery_area, like so:
ORDER BY DATE(p.post_date) DESC,
0+gallery_area ASC
The 0+value trick is MySQL shorthand for converting the string value to a number, so that 10 sorts after 2 rather than before it.
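The difference between sorting the stored strings and sorting their numeric values is easy to see on a few rows. This is a minimal sketch in Python, assuming SQLite as a stand-in for MySQL (the 0+value conversion behaves the same way for clean digit strings) and an invented gallery table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gallery (post_date TEXT, gallery_area TEXT)")
conn.executemany("INSERT INTO gallery VALUES (?, ?)", [
    ("2024-01-02 09:00:00", "10"),
    ("2024-01-02 10:00:00", "2"),
    ("2024-01-01 08:00:00", "14"),
    ("2024-01-01 12:00:00", "1"),
    ("2024-01-02 11:00:00", "1"),
])

# Sorting the text column compares character by character: '10' < '2'.
AS_TEXT = """
    SELECT DATE(post_date), gallery_area FROM gallery
    ORDER BY DATE(post_date) DESC, gallery_area ASC
"""
# Adding 0 forces a numeric comparison: 2 < 10.
AS_NUMBER = """
    SELECT DATE(post_date), gallery_area FROM gallery
    ORDER BY DATE(post_date) DESC, 0 + gallery_area ASC
"""
text_order = conn.execute(AS_TEXT).fetchall()
numeric_order = conn.execute(AS_NUMBER).fetchall()
print(text_order)
print(numeric_order)
```

In the text ordering the newest day reads 1, 10, 2; in the numeric ordering it reads 1, 2, 10, and the numbering starts over for the previous day.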
Edit. Things can get fouled up if the meta_value items contain extraneous characters like leading spaces. Try diagnosing with these changes. Put
DATE(p.post_date) pdate,
0+ga.meta_value numga,
ga.meta_value gallery_area
in your SELECT clause. If some of the numga items come up zero, this is your problem.
Also try
ORDER BY DATE(p.post_date) DESC,
0+TRIM(gallery_area) ASC
in an attempt to get rid of the spaces. But they might not be spaces.

Optimising a MySQL query with a SUM in the sub-query

I'm trying to do a very specific thing in WordPress: expire posts over 30 days old that have no "likes" (or negative "likes") based on someone else's plugin. That plugin stores individual likes/dislikes for each user/post in a separate table (+1/-1), which means that my selection criteria are complex, based on a SUM.
Doing the SELECT is easy, as it is a simple JOIN on post ID with a HAVING clause to detect a total likes value of less than one. It looks like this (with all the table names simplified for readability):
SELECT posts.id, SUM( wti_like_post.value )
FROM posts
JOIN wti_like_post
ON posts.ID = wti_like_post.post_id
WHERE posts.post_date < DATE_SUB(NOW(), INTERVAL 30 DAY)
GROUP BY posts.ID
HAVING SUM( wti_like_post.value ) < 1
But I'm stuck on optimising the UPDATE query. The unoptimised version takes 2 minutes to run, which is unacceptable.
UPDATE posts
SET posts.post_status = 'trash'
WHERE posts.post_status = 'publish'
AND posts.post_type = 'post'
AND posts.post_date < DATE_SUB(NOW(), INTERVAL 30 DAY)
AND ID IN
(SELECT post_id FROM wti_like_post
GROUP BY post_id
HAVING SUM( wti_like_post.value ) < 1 )
This is obviously because of my inability to create an UPDATE query with a join based on a SUM result - I simply don't know how to do that (believe me, I've tried!).
If anyone could optimise that UPDATE for me, I'd be terribly grateful. It'd also teach me how to do it properly, which would be neat!
Thanks in advance.
It also depends on the number of posts. Your subquery also sums the likes for posts that have already been trashed, so the filters belong in the subquery rather than in the outer UPDATE. Try this one:
UPDATE posts
SET posts.post_status = 'trash'
WHERE ID IN
(
-- Wrapped in a derived table: MySQL rejects a subquery that reads
-- the table being updated directly (error 1093).
SELECT id FROM (
SELECT posts.ID AS id
FROM posts
INNER JOIN wti_like_post
ON (posts.ID = wti_like_post.post_id AND posts.post_status = 'publish'
AND posts.post_type = 'post')
WHERE posts.post_date < DATE_SUB(NOW(), INTERVAL 30 DAY)
GROUP BY posts.ID
HAVING SUM( wti_like_post.value ) < 1
) AS expired
)
This may sound stupid, but you could create a table out of the SELECT, place an index on it, and then simply use a standard JOIN in the UPDATE against that new table.
I guess even if you always do that on the fly, it should be faster than the non-indexed version.
EDIT:
Here is the code. Sorry, it's off the top of my head and I haven't checked that it runs, but it should at least give you an idea of what I mean.
CREATE TABLE joinHelper(
id INT NOT NULL,
PRIMARY KEY ( id )
);
INSERT INTO joinHelper(id)
SELECT post_id FROM wti_like_post
GROUP BY post_id
HAVING SUM( wti_like_post.value ) < 1;
UPDATE posts JOIN joinHelper ON (posts.ID = joinHelper.id)
SET posts.post_status = 'trash'
WHERE posts.post_status = 'publish'
AND posts.post_type = 'post'
AND posts.post_date < DATE_SUB(NOW(), INTERVAL 30 DAY);
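The helper-table idea can be tried end to end on toy data. What follows is a minimal sketch in Python under several assumptions: SQLite stands in for MySQL, so the final UPDATE uses IN against the indexed helper table instead of MySQL's UPDATE ... JOIN syntax; the rows are invented; the 30-day cutoff is passed in as a fixed date string; and expire_posts is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (ID INTEGER PRIMARY KEY, post_status TEXT,
                        post_type TEXT, post_date TEXT);
    CREATE TABLE wti_like_post (post_id INTEGER, value INTEGER);
    INSERT INTO posts VALUES
        (1, 'publish', 'post', '2023-11-01'),   -- old, likes sum to 0
        (2, 'publish', 'post', '2023-11-02'),   -- old, likes sum to 2
        (3, 'publish', 'post', '2024-01-20'),   -- disliked but too recent
        (4, 'publish', 'page', '2023-10-01');   -- disliked but not a post
    INSERT INTO wti_like_post VALUES (1, 1), (1, -1), (2, 1), (2, 1), (3, -1), (4, -1);
""")


def expire_posts(conn, cutoff):
    """Trash old, unliked posts via an indexed helper table (hypothetical helper)."""
    # Step 1: materialise the aggregated IDs once; the primary key is the index.
    conn.execute("CREATE TEMP TABLE join_helper (id INTEGER PRIMARY KEY)")
    conn.execute("""
        INSERT INTO join_helper (id)
        SELECT post_id FROM wti_like_post
        GROUP BY post_id
        HAVING SUM(value) < 1
    """)
    # Step 2: update against the helper table instead of the aggregate.
    cur = conn.execute("""
        UPDATE posts
        SET post_status = 'trash'
        WHERE post_status = 'publish'
          AND post_type = 'post'
          AND post_date < ?
          AND ID IN (SELECT id FROM join_helper)
    """, (cutoff,))
    conn.execute("DROP TABLE join_helper")
    return cur.rowcount


changed = expire_posts(conn, "2024-01-01")
statuses = dict(conn.execute("SELECT ID, post_status FROM posts"))
print(changed, statuses)
```

Only post 1 is trashed: post 2 has a positive like total, post 3 is newer than the cutoff, and post 4 is a page.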