Query Optimization with Indexes - mysql

I need some help optimizing this WordPress/WooCommerce query:
SELECT
p.ID AS order_id
,DATE(p.post_date) AS order_date
,SUBSTR(comment_content,17) AS csr
,SUBSTR(p.post_status,4) AS order_status
,UCASE(CONCAT((SELECT wp_postmeta.meta_value FROM wp_postmeta WHERE meta_key = '_billing_first_name' and wp_postmeta.post_id = p.ID),' ',(SELECT wp_postmeta.meta_value FROM wp_postmeta WHERE meta_key = '_billing_last_name' and wp_postmeta.post_id = p.ID))) AS customer
,(SELECT GROUP_CONCAT(DISTINCT order_item_name ORDER BY order_item_name ASC SEPARATOR ', ') FROM wp_woocommerce_order_items WHERE order_id = p.ID AND order_item_type = 'line_item' GROUP BY order_id) AS products
,(SELECT GROUP_CONCAT(CONCAT(serial_number,'',serial_feature_code)) FROM wp_custom_serial WHERE wp_custom_serial.order_id = p.ID GROUP BY wp_custom_serial.order_id) AS serials
FROM
wp_posts AS p
INNER JOIN wp_comments AS c ON p.ID = c.comment_post_ID
INNER JOIN wp_postmeta AS pm ON p.ID = pm.post_id
WHERE
p.post_type = 'shop_order'
AND comment_content LIKE 'Order placed by%'
GROUP BY p.ID
ORDER BY SUBSTR(comment_content,17) ASC, p.post_date DESC;
I do not understand what EXPLAIN is telling me and need some guidance on how to speed it up. Can someone describe what, in the EXPLAIN response, indicates where my issue is and where to look for answers?
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra
==========================================================================================================
1 | PRIMARY | c | NULL | ALL | comment_post_ID | NULL | NULL | NULL | 20452 | 11.11 | Using where; Using temporary; Using filesort
1 | PRIMARY | p | NULL | eq_ref | PRIMARY,post_name,type_status_date,post_parent,post_author | PRIMARY | 8 | db.c.comment_post_ID | 1 | 50.00 | Using where
1 | PRIMARY | pm | NULL | ref | post_id | post_id | 8 | db.c.comment_post_ID | 33 | 100.00 | Using index
2 | DEPENDENT SUBQUERY | wp_postmeta | NULL | ref | post_id,meta_key | post_id | 8 | func | 33 | 2.26 | Using where
3 | DEPENDENT SUBQUERY | wp_postmeta | NULL | ref | post_id,meta_key | post_id | 8 | func | 33 | 2.30 | Using where
4 | DEPENDENT SUBQUERY | wp_woocommerce_order_items | NULL | ref | order_id | order_id | 8 | func | 2 | 10.00 | Using where
5 | DEPENDENT SUBQUERY | wp_custom_serial | NULL | ALL | NULL | NULL | NULL | NULL | 5160 | 10.00 | Using where; Using filesort

Queries are processed in distinct logical stages: the FROM clause first, then WHERE, and then the SELECT clause. A dependent subquery in the SELECT clause is run again for every row that is left after the FROM and WHERE clauses have been processed. You have four of them (ids 2 through 5 in your EXPLAIN output), so every result row triggers four additional queries.
You can usually rework this by moving these subqueries out of the SELECT clause and into the FROM clause.
Taking one of your columns, serials, I think you would move it into the FROM clause like this:
SELECT p.ID AS order_id
, DATE(p.post_date) AS order_date
, SUBSTR(comment_content, 17) AS csr
, SUBSTR(p.post_status, 4) AS order_status
, UCASE(CONCAT((SELECT wp_postmeta.meta_value
FROM wp_postmeta
WHERE meta_key = '_billing_first_name' and wp_postmeta.post_id = p.ID), ' ',
(SELECT wp_postmeta.meta_value
FROM wp_postmeta
WHERE meta_key = '_billing_last_name' and wp_postmeta.post_id = p.ID))) AS customer
, (SELECT GROUP_CONCAT(DISTINCT order_item_name ORDER BY order_item_name ASC SEPARATOR ', ')
FROM wp_woocommerce_order_items
WHERE order_id = p.ID
AND order_item_type = 'line_item'
GROUP BY order_id) AS products
, serials_sub.serials
FROM wp_posts AS p
INNER JOIN wp_comments AS c ON p.ID = c.comment_post_ID
INNER JOIN wp_postmeta AS pm ON p.ID = pm.post_id
LEFT JOIN (
-- wp_comments is not joined inside this derived table, so it cannot filter on comment_content
SELECT cs.order_id AS post_id, GROUP_CONCAT(CONCAT(cs.serial_number, '', cs.serial_feature_code)) AS serials
FROM wp_custom_serial cs
JOIN wp_posts AS p2 ON cs.order_id = p2.ID
WHERE p2.post_type = 'shop_order'
GROUP BY cs.order_id
) as serials_sub ON serials_sub.post_id = p.ID
WHERE p.post_type = 'shop_order'
AND comment_content LIKE 'Order placed by%'
GROUP BY p.ID
ORDER BY SUBSTR(comment_content, 17) ASC, p.post_date DESC;
The difference here is that instead of a separate query being performed for each row, a single derived table is built once in the FROM clause. So while it may look more unwieldy, it should generally perform better.
I think following this pattern for the other subqueries will resolve your issues.
If you are interested, here is the documentation on EXPLAIN:
https://dev.mysql.com/doc/refman/8.0/en/execution-plan-information.html
And I recommend the book High Performance MySQL.
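To tie this back to the EXPLAIN output in the question: the rows to look at first are the ones whose type is ALL, which means a full table scan. Row 1 scans roughly 20,452 rows of wp_comments, and row 5 scans roughly 5,160 rows of wp_custom_serial once per outer row because possible_keys is NULL. A minimal sketch of indexes that might help, assuming order_id is the lookup column on wp_custom_serial and that comment_content is a TEXT column as in a stock WordPress schema (the index names and the prefix length of 20 are illustrative choices, not from the original post):

```sql
-- Hypothetical index name; lets the serials lookup use ref access instead of ALL.
CREATE INDEX idx_custom_serial_order_id ON wp_custom_serial (order_id);

-- Hypothetical index name; TEXT columns need a prefix length in MySQL.
-- 'Order placed by' is 15 characters, so a 20-character prefix covers the LIKE pattern.
CREATE INDEX idx_comment_content_prefix ON wp_comments (comment_content(20));
```

Whether the optimizer actually picks these indexes depends on the data, so compare the type and key columns of EXPLAIN before and after creating them.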

The outer wp_postmeta does not seem to be used. Remove the JOIN if possible.
Hiding information in the middle of strings leads to SUBSTR() usage, which is inefficient.
The GROUP BY p.ID seems to be unnecessary.
The plugin WP Index Improvements would help with some parts.
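The first and third suggestions above can be sketched as follows. This is only an outline under assumptions: each order has at most one '_billing_first_name' row, one '_billing_last_name' row, and one matching comment (otherwise rows would be duplicated without the GROUP BY), and the products and serials columns are left out for brevity. The aliases fn and ln are made up for this example.

```sql
SELECT p.ID                                             AS order_id,
       DATE(p.post_date)                                AS order_date,
       SUBSTR(c.comment_content, 17)                    AS csr,
       SUBSTR(p.post_status, 4)                         AS order_status,
       UCASE(CONCAT(fn.meta_value, ' ', ln.meta_value)) AS customer
FROM wp_posts AS p
INNER JOIN wp_comments AS c
        ON c.comment_post_ID = p.ID
-- One LEFT JOIN per meta key replaces the two dependent subqueries
-- and the unused, unfiltered join to wp_postmeta.
LEFT JOIN wp_postmeta AS fn
       ON fn.post_id = p.ID AND fn.meta_key = '_billing_first_name'
LEFT JOIN wp_postmeta AS ln
       ON ln.post_id = p.ID AND ln.meta_key = '_billing_last_name'
WHERE p.post_type = 'shop_order'
  AND c.comment_content LIKE 'Order placed by%'
ORDER BY csr ASC, p.post_date DESC;
```

As in the original query, customer is NULL when either name is missing, because CONCAT returns NULL if any argument is NULL.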

Related

SQL - Slow SQL Query

I have a SQL query (see below) for WordPress that takes around 4-5 seconds to return results. It returns all order IDs that contain one of the given product/variation IDs.
How can I make it faster?
SELECT p.ID order_id
FROM wp_posts p
INNER JOIN wp_woocommerce_order_items i ON p.ID=i.order_id
INNER JOIN wp_woocommerce_order_itemmeta im ON i.order_item_id=im.order_item_id
WHERE im.meta_key IN ('_product_id','_variation_id')
AND im.meta_value IN ('703899','981273','981274','981275')
AND p.post_status IN ('wc-completed')
GROUP BY p.ID HAVING COUNT(p.ID)>1
ORDER BY p.post_date desc
LIMIT 0, 20
Above query EXPLAIN:
Why do you join when you only want to select IDs from wp_posts anyway?
SELECT p.ID order_id
FROM wp_posts p
WHERE p.post_status = 'wc-completed'
AND p.ID IN
(
SELECT i.order_id
FROM wp_woocommerce_order_items i
JOIN wp_woocommerce_order_itemmeta im ON im.order_item_id = i.order_item_id
WHERE im.meta_key IN ('_product_id','_variation_id')
AND im.meta_value IN ('703899','981273','981274','981275')
GROUP BY i.order_id
HAVING COUNT(*) > 1
)
ORDER BY p.post_date DESC
LIMIT 0, 20;
Now let's think about how the DBMS can address this. If there are only a few posts with status 'wc-completed', it can look those up first and then check whether each one represents an order with more than one of the desired items. This would ask for these indexes:
create index idx1 on wp_posts(post_status, id, post_date);
create index idx2 on wp_woocommerce_order_items(order_id, order_item_id);
-- meta_value is LONGTEXT in a stock WooCommerce schema, so MySQL needs a prefix length; 20 is an assumed value
create index idx3 on wp_woocommerce_order_itemmeta(order_item_id, meta_key, meta_value(20));
Or it could look for the desired products, see whether an order contains more than one of them, and then check whether this relates to a post with status = 'wc-completed'. That would ask for these indexes:
-- meta_value is LONGTEXT in a stock WooCommerce schema, so MySQL needs a prefix length; 20 is an assumed value
create index idx4 on wp_woocommerce_order_itemmeta(meta_key, meta_value(20), order_item_id);
create index idx5 on wp_woocommerce_order_items(order_item_id, order_id);
create index idx6 on wp_posts(id, post_status, post_date);
We don't know which way the DBMS will prefer, so we create all six indexes. Then we look at the explain plan to see which are being used and remove the others. Maybe the DBMS even sees no advantage in using indexes here at all, but I find this unlikely.
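The "create, inspect, remove" step described above could look roughly like this. Which indexes end up unused is unknown in advance; the DROP statements below are purely an example of the case where the plan only uses idx1 to idx3.

```sql
-- Inspect the plan: the "key" column shows which index (if any) is used per table.
EXPLAIN
SELECT p.ID order_id
FROM wp_posts p
WHERE p.post_status = 'wc-completed'
  AND p.ID IN
  (
    SELECT i.order_id
    FROM wp_woocommerce_order_items i
    JOIN wp_woocommerce_order_itemmeta im ON im.order_item_id = i.order_item_id
    WHERE im.meta_key IN ('_product_id','_variation_id')
      AND im.meta_value IN ('703899','981273','981274','981275')
    GROUP BY i.order_id
    HAVING COUNT(*) > 1
  )
ORDER BY p.post_date DESC
LIMIT 0, 20;

-- Example only: if idx4, idx5 and idx6 never show up in the "key" column, remove them.
DROP INDEX idx4 ON wp_woocommerce_order_itemmeta;
DROP INDEX idx5 ON wp_woocommerce_order_items;
DROP INDEX idx6 ON wp_posts;
```

Unused indexes still cost time on every INSERT and UPDATE, which is why removing them is worth the extra step.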
The first thing you can try is trimming the data you fetch.
That means:
Not fetching fields that you don't need or check
Applying your constraints before joining
SELECT
p.ID order_id
FROM
(SELECT id, post_status, post_date FROM wp_posts WHERE post_status = 'wc-completed') p,
(SELECT order_id, order_item_id FROM wp_woocommerce_order_items) i,
(
SELECT
order_item_id,
meta_key,
meta_value
FROM
wp_woocommerce_order_itemmeta
WHERE
meta_key IN ('_product_id','_variation_id')
AND meta_value IN ('703899','981273','981274','981275')
) im
WHERE
p.ID = i.order_id
AND i.order_item_id = im.order_item_id
GROUP BY
p.ID
HAVING
COUNT(p.ID)>1
ORDER BY
p.post_date desc
LIMIT
0, 20
Edit:
If Inner joins are necessary, you can try:
SELECT
p.ID order_id
FROM
(SELECT id, post_status, post_date FROM wp_posts WHERE post_status = 'wc-completed') p
INNER JOIN
(SELECT order_id, order_item_id FROM wp_woocommerce_order_items) i
ON
p.ID = i.order_id
INNER JOIN
(
SELECT
order_item_id,
meta_key,
meta_value
FROM
wp_woocommerce_order_itemmeta
WHERE
meta_key IN ('_product_id','_variation_id')
AND meta_value IN ('703899','981273','981274','981275')
) im
ON
i.order_item_id = im.order_item_id
GROUP BY
p.ID
HAVING
COUNT(p.ID)>1
ORDER BY
p.post_date desc
LIMIT
0, 20
P.S. I hope my syntax is correct; my SQL is quite rusty.

Select random row per distinct field value while using joins

I have a WordPress instance showing some posts. Each post is written in a specific language and has a property _post_year set, so several posts can share the same language and refer to the same year.
MySQL tables:
wp_posts
Contains all posts.
ID | post_author | post_date | ...
==================================
1 | ...
2 | ...
...
wp_term_relationships
Contains information about a language of a post (amongst other things).
object_id | term_taxonomy_id | term_order |
===========================================
1 | ...
1 | ...
2 | ...
...
wp_postmeta
Contains post meta information (like an additional property "_post_year").
meta_id | post_id | meta_key | meta_value |
===========================================
1 | 1 | ...
2 | 1 | ...
...
I once was able to load one random post per year (for all years available) like this:
SELECT DISTINCT
wp_posts.*,
postmeta.meta_value as post_meta_year
FROM (
SELECT * FROM wp_posts
JOIN wp_term_relationships as term_relationships
ON term_relationships.object_id = wp_posts.ID
AND term_relationships.term_taxonomy_id IN ({LANGUAGE_ID})
ORDER BY RAND()
) as wp_posts
JOIN wp_postmeta as postmeta
ON postmeta.post_id = wp_posts.ID
AND postmeta.meta_key = '_post_year'
AND post_status = 'publish'
GROUP BY post_meta_year DESC
ORDER BY post_meta_year DESC
Since I upgraded MySQL to version 5.7 this no longer works:
Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'wp_posts.ID' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
How can I get one random post per year, sorted by year in descending order?
One method you can try: From a derived table with the distinct years select the year and, in a correlated subquery, a random post ID with that year using ORDER BY rand() and LIMIT 1. Join the result of that second derived table with the posts.
SELECT po1.*,
ppmo1.meta_value
FROM (SELECT pmo1.meta_value,
(SELECT pi1.id
FROM wp_posts pi1
INNER JOIN wp_postmeta pmi2
ON pmi2.post_id = pi1.id
INNER JOIN wp_term_relationships tri1
ON tri1.object_id = pi1.id
WHERE tri1.term_taxonomy_id = {LANGUAGE_ID}
AND pmi2.meta_key = '_post_year'
AND pmi2.meta_value = pmo1.meta_value
ORDER BY rand()
LIMIT 1) id
FROM (SELECT DISTINCT
pmi1.meta_value
FROM wp_postmeta pmi1
WHERE pmi1.meta_key = '_post_year') pmo1) ppmo1
INNER JOIN wp_posts po1
ON po1.id = ppmo1.id
ORDER BY ppmo1.meta_value DESC;
(Untested because schema and sample data weren't given by consumable DDL and DML.)
In MySQL 5.7, where mode ONLY_FULL_GROUP_BY is (happily) enabled by default, I would recommend a correlated subquery for filtering:
select * -- better enumerate the actual column names here
from wp_posts p
inner join wp_postmeta pm on pm.post_id = p.id
where pm.meta_key = '_post_year' and p.id = (
select pm1.post_id
from wp_posts p1
inner join wp_postmeta pm1 on pm1.post_id = p1.id
where p1.post_status = 'publish' and pm1.meta_key = '_post_year' and pm1.meta_value = pm.meta_value
order by rand() limit 1
)
Basically the subquery selects one random post id per group of records having the same '_post_year', which is used to filter the query.
Note that with this technique there is no need to filter again in the outer query on the post status, since the subquery does it already and returns a primary key column.
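If moving to MySQL 8.0 or later is an option (the question is about 5.7, where window functions are not available), the same "one random row per group" idea can be sketched with ROW_NUMBER(). This is an untested outline under that assumption; the alias ranked and the language ID 123 are placeholders standing in for {LANGUAGE_ID}.

```sql
SELECT ranked.*
FROM (
    SELECT p.*,
           pm.meta_value AS post_meta_year,
           -- Number the posts of each year in random order; rn = 1 is the random pick.
           ROW_NUMBER() OVER (PARTITION BY pm.meta_value ORDER BY RAND()) AS rn
    FROM wp_posts AS p
    INNER JOIN wp_postmeta AS pm
            ON pm.post_id = p.ID
           AND pm.meta_key = '_post_year'
    INNER JOIN wp_term_relationships AS tr
            ON tr.object_id = p.ID
    WHERE p.post_status = 'publish'
      AND tr.term_taxonomy_id = 123   -- placeholder for {LANGUAGE_ID}
) AS ranked
WHERE ranked.rn = 1
ORDER BY ranked.post_meta_year DESC;
```

Unlike the correlated subquery, this shuffles each year's posts in a single pass and is compatible with ONLY_FULL_GROUP_BY because it uses no GROUP BY at all.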

MySQL query stalling - is there a more efficient solution for this MySQL Query?

SELECT post_title,
count(*) AS c
FROM wp_posts
WHERE post_type = "product"
GROUP BY post_title
HAVING c > 1
ORDER BY c DESC
runs no problem, returns result in < 1 sec. Yet
select * from wp_posts where post_title in (
select post_title from wp_posts WHERE post_type = "product"
group by post_title having count(*) > 1
)
hangs up.
Yet they are fundamentally the same query except for the fact that in the second query I'm trying to pull out the entire record rather than just the post_title.
Have I erred? Is there a more efficient way to achieve the equivalent?
Edit: EXPLAIN query and SHOW CREATE TABLE wp_posts has been appended for your information.
You could avoid the IN clause on the subquery and use an inner join instead:
select a.*
from wp_posts a
INNER JOIN (
select post_title
from wp_posts
WHERE post_type = "product"
group by post_title
having count(*) > 1
) t ON t.post_title = a.post_title
This should perform better.
The most efficient way to write this query is probably using exists . . . assuming wp_posts has a primary key:
select p.*
from wp_posts p
where p.post_type = 'product' and
exists (select 1
from wp_posts p2
where p2.post_title = p.post_title and
p2.post_type = p.post_type and
p2.id <> p.id
);
For performance, you want an index on wp_posts(post_type, post_title, id).
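A sketch of what that index might look like in practice. In a stock WordPress schema post_title is a TEXT column, so MySQL requires a prefix length; the value 100 and the index name are assumptions made for this example. On InnoDB the primary key (ID) is stored in every secondary index entry anyway, so it is not listed explicitly here.

```sql
-- Hypothetical index name; the 100-character prefix is an assumed value.
CREATE INDEX idx_type_title ON wp_posts (post_type, post_title(100));

-- Check that the EXISTS query now uses the index for the inner lookup on p2.
EXPLAIN
SELECT p.*
FROM wp_posts p
WHERE p.post_type = 'product'
  AND EXISTS (SELECT 1
              FROM wp_posts p2
              WHERE p2.post_title = p.post_title
                AND p2.post_type = p.post_type
                AND p2.id <> p.id);
```

With a prefix index MySQL still has to read the row to compare titles longer than the prefix, but it narrows the candidates to a handful of rows per lookup.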

SQL - ordering results from joined table

Note: This question is related to a WordPress specific question but I wanted to have an "outside look" at this from a pure SQL point of view: https://wordpress.stackexchange.com/questions/55263/order-posts-by-custom-field-and-if-custom-field-is-empty-return-remaining-posts
Let's say we have two tables with the following structure:
Table posts: ID (key), Title
Table post_metadata: post_ID(FKEY), meta_key, meta_value
And I want to retrieve ID and Title of posts that have:
an entry in post_metadata with key = 'meta_1' and meta_value = 'value_1'
AND an entry in post_metadata with key = 'meta_2' and meta_value = 'value_2'
I want to order the results by the value of a third metadata with meta_key = "meta_3".
Now here is the tricky part:
Not all posts have an entry in the post_metadata table with 'meta_3' as meta_key. Since I'm not filtering posts by meta_3, only ordering by it, I want to keep these posts in my results, as if they had an empty value for this meta.
How can we achieve that?
Thanks
Edit:
There is SQL fiddle now: https://www.db-fiddle.com/f/kBNaaRFB5xfna5MniuTpaG/1
Perhaps:
Use a LEFT JOIN once to get the meta_3 value if it exists, so that you keep every post that has meta_1 and meta_2 with the desired values.
Then use EXISTS with a HAVING clause to make sure you only get posts that have both meta_1 and meta_2 with the desired values.
UNTESTED...
SELECT P.ID, P.Title, PM.meta_value
FROM posts P
LEFT JOIN post_metadata PM
on P.ID = PM.post_ID
and PM.meta_key = 'meta_3'
WHERE exists (SELECT 1
FROM post_metadata M
WHERE ((M.meta_key = 'meta_1' and M.meta_value = 'value_1') OR
(M.meta_key = 'meta_2' and M.meta_value = 'value_2'))
and M.post_ID = P.ID -- correlates the subquery with the outer post
GROUP BY M.post_ID
HAVING count(*) = 2 )
ORDER BY -PM.meta_value desc, P.ID
This assumes that post_metadata has a unique constraint on (post_ID, meta_key). Otherwise a post with two meta_1 rows that both have value_1 would give count(*) = 2 and be returned incorrectly.
To make sure NULLs sort last, follow the approach described in "MySQL Orderby a number, Nulls last".
The same thing written with IN, although I would expect it to be slower:
SELECT P.ID, P.Title, PM.meta_value
FROM posts P
LEFT JOIN post_metadata PM
on P.ID = PM.post_ID
and PM.meta_key = 'meta_3'
WHERE P.ID in (SELECT M.post_ID
FROM post_metadata M
WHERE ((M.meta_key = 'meta_1' and M.meta_value = 'value_1') OR
(M.meta_key = 'meta_2' and M.meta_value = 'value_2'))
GROUP BY M.post_ID
HAVING count(*) = 2 )
ORDER BY -PM.meta_value desc, P.ID
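Another way to express the same requirement, sketched here as an untested alternative, is to join post_metadata once per required key and LEFT JOIN it a third time for the sort key. The aliases M1, M2 and M3 are made up for this example. It does not depend on count(*) = 2, so duplicate metadata rows would duplicate output rows rather than produce false matches.

```sql
SELECT P.ID, P.Title, M3.meta_value
FROM posts AS P
-- Both INNER JOINs must match, so only posts with meta_1 AND meta_2 survive.
INNER JOIN post_metadata AS M1
        ON M1.post_ID = P.ID
       AND M1.meta_key = 'meta_1' AND M1.meta_value = 'value_1'
INNER JOIN post_metadata AS M2
        ON M2.post_ID = P.ID
       AND M2.meta_key = 'meta_2' AND M2.meta_value = 'value_2'
-- The LEFT JOIN keeps posts that have no meta_3 row; meta_value is NULL for them.
LEFT JOIN post_metadata AS M3
       ON M3.post_ID = P.ID
      AND M3.meta_key = 'meta_3'
-- "IS NULL" evaluates to 0 or 1, so posts with a meta_3 value sort before those without.
ORDER BY M3.meta_value IS NULL, M3.meta_value, P.ID;
```

Sorting on the IS NULL expression first also works for non-numeric values, where the negation trick in the ORDER BY above would not.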

MySQL: order by value from second table, use default if value not set

This takes place inside WordPress, but it's a general MySQL question.
There are two tables, one of which contains posts, the other metadata, linked by ID.
Posts:
post_title | ID
-----------+---
title      | 1
title      | 2

Metadata:
post_id | meta_key | meta_value
--------+----------+-----------
1       | key_1    | aaa
1       | key_2    | bbb
1       | mykey    | 1
2       | key_n    | ccc ddd
I'm trying to order results on some column value, which might not be set for all rows. Basically, I want to see rows with this column/value pair set first, followed by all the others. Each post might have some metadata associated with it, based on meta_key and meta_value pairs. There may be more keys for a single post and they need not include the one I want to sort by.
The problem is that using a MySQL query with a WHERE meta_key = mykey will exclude all the posts where this key doesn't exist. So what I need is a way to display a default value for all those posts, where this meta key doesn't exist.
First step: It's easy to select all rows with a certain meta_key:
SELECT
p.ID, p.post_title, p.post_type, p.post_date, m.meta_value
FROM wp_posts AS p
LEFT JOIN wp_postmeta AS m ON p.ID = m.post_id
WHERE
m.meta_key = 'mykey'
Second step: how do I select all the rows where this meta_key doesn't exist?
Here's what I mean, but this is probably a bad solution:
SELECT
p.ID, p.post_title, p.post_type, p.post_date, "some_default"
FROM wp_posts AS p
WHERE
p.ID NOT IN (
SELECT
p.ID
FROM wp_posts AS p
LEFT JOIN wp_postmeta AS m ON p.ID = m.post_id
WHERE
m.meta_key = 'mykey'
)
Third step: show combined results. This could be a UNION of both queries above.
I'm sure there must be a better solution. More importantly, I don't know how to specify additional parameters, e.g. first find all posts with some given meta key, title, category etc., and then order by mykey as laid out above.
FINAL EDIT
If anyone's interested, here's the final solution in context. RedFilter's answer made it possible, thanks again.
SELECT p1.ID, p1.post_title, p1.post_type, p1.post_date, m1.meta_value AS meta1, meta2
FROM wp_posts AS p1
LEFT JOIN wp_postmeta AS m1 ON m1.post_id = p1.ID
LEFT JOIN wp_term_relationships AS tr0 ON tr0.object_id = p1.ID
LEFT JOIN wp_term_taxonomy AS tt0 ON tr0.term_taxonomy_id = tt0.term_taxonomy_id
LEFT JOIN wp_terms AS t0 ON tt0.term_id = t0.term_id
LEFT JOIN
(
SELECT
p.ID, IF (m.meta_value = 'on', 1, 0) AS meta2
FROM wp_posts AS p
LEFT JOIN wp_postmeta AS m
ON p.ID = m.post_id
and m.meta_key = 'mykey'
) as extra
ON extra.ID = p1.ID
WHERE 1 = 1
AND m1.meta_key = 'some-other-meta-key'
AND p1.post_type IN ('post', 'some-custom-post-type')
AND tt0.taxonomy = 'some-taxonomy'
AND t0.term_id = 'some-id'
ORDER BY meta2 DESC, meta1 ASC, p1.post_date DESC
SELECT p.ID, p.post_title, p.post_type, p.post_date,
ifnull(m.meta_value, 'default val') as meta_value
FROM wp_posts AS p
LEFT JOIN wp_postmeta AS m ON p.ID = m.post_id
and m.meta_key = 'mykey'
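The LEFT JOIN with the key condition in the ON clause is what keeps posts that lack the key. To also list posts that have the key first, followed by all the others, an ORDER BY along these lines could be added. This is a sketch; the secondary sort on post_date is an assumption borrowed from the question's final query.

```sql
SELECT p.ID, p.post_title, p.post_type, p.post_date,
       IFNULL(m.meta_value, 'default val') AS meta_value
FROM wp_posts AS p
LEFT JOIN wp_postmeta AS m
       ON p.ID = m.post_id
      AND m.meta_key = 'mykey'
ORDER BY m.meta_value IS NULL,   -- 0 for posts that have the key, 1 for those that do not
         m.meta_value,           -- the qualified name refers to the raw column, not the alias
         p.post_date DESC;
```

Moving the m.meta_key condition into a WHERE clause would discard the NULL rows again and turn the LEFT JOIN back into an inner join.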