Hopefully this is the right forum; my question seems to overlap the Stack Exchange communities, so this one seemed best.
I have some custom reports for the WooCommerce orders on my WordPress site. One query just freezes locally: on my localhost the CPU goes to 100% and the query never finishes, and I don't understand why. To get to the point, here is the query:
SELECT SUM(postmeta.meta_value)
FROM pca_postmeta AS postmeta
LEFT JOIN pca_woocommerce_order_items AS orders ON orders.order_id = postmeta.post_id
WHERE postmeta.meta_key = '_order_total'
AND orders.order_item_id IN (
SELECT item_meta.order_item_id
FROM pca_woocommerce_order_itemmeta AS item_meta
LEFT JOIN pca_woocommerce_order_items AS orders ON item_meta.order_item_id = orders.order_item_id
LEFT JOIN pca_posts AS posts ON posts.ID = orders.order_id
WHERE item_meta.meta_value = '23563'
AND posts.post_status IN ('wc-processing','wc-completed')
GROUP BY orders.order_id
)
As you can hopefully see the goal here is to get the summation of all orders from this specific campaign (23563). The nested query works exactly as expected on its own, returning just a list of IDs like so:
NOTE: I'm a little curious whether 2.6289 secs is long, given that it returned only 65 results out of 148220 total.
The problem is this query doesn't seem to like the nested part. Any suggestions? Completely different approach in mind?
P.S. I use that nested query at other times as well to represent all orders by campaign id in my php reporting class. But for my question PHP has nothing to do with it.
UPDATE/FOLLOW UP:
Is it possible to convert this into a join, as described in "Using a SELECT statement within a WHERE clause"? I'm a little light on my SQL, so I'm not sure how I would do that, but it seems promising.
GROUP BY orders.order_id
does not make sense because you are selecting only order_item_id.
pca_woocommerce_order_itemmeta would benefit from
INDEX(meta_value, order_item_id)
And this might be an equivalent query that avoids the IN (SELECT ...):
SELECT SUM(pm.meta_value)
FROM
( SELECT im.order_item_id
FROM pca_woocommerce_order_itemmeta AS im
LEFT JOIN pca_woocommerce_order_items AS o
ON im.order_item_id = o.order_item_id
LEFT JOIN pca_posts AS posts ON posts.ID = o.order_id
WHERE im.meta_value = '23563'
AND posts.post_status IN ('wc-processing','wc-completed')
GROUP BY o.order_id
) AS w
JOIN pca_woocommerce_order_items AS o ON w.order_item_id = o.order_item_id
JOIN pca_postmeta AS pm ON o.order_id = pm.post_id
WHERE pm.meta_key = '_order_total'
Edit
Some principles behind what I did. Here I am guessing at what the optimizer will do with various possible formulations of the query.
I got rid of LEFT -- This may have changed the output. But I needed to avoid LEFT JOIN ( SELECT ... ) which would not be optimizable.
By having one subquery in the list of "tables" being JOINed, the optimizer will (almost certainly) start with the subquery and do "Nested Loop Joins" to the other tables. NLJ is the common way to perform a query.
A subselect like that has no index, so it needs to be first in the order, else it will be very inefficient.
Without subqueries, the optimizer generally likes to start with whichever table has something in the WHERE clause.
The requirement to start with the subquery "table" is stronger than the desire to pick the table based on WHERE pm.meta_key = '_order_total'.
Inside the subquery, the only "=" test (WHERE im.meta_value = '23563') provides the likely starting point for that set of JOINs. This is further enhanced by it not being to the 'right' of a LEFT JOIN. Hence, I suggested that index.
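As a rough illustration of the rewrite above, here is a minimal sketch that runs both formulations against toy data. It uses Python's sqlite3 module rather than MySQL, so it only checks that the two forms agree, not how MySQL's optimizer treats them. The table names, columns, and rows are simplified stand-ins, and the derived table selects DISTINCT order_id (an assumption, chosen so that an order with several matching items is counted once).

```python
import sqlite3

# Toy stand-ins for the WooCommerce tables; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_items (order_item_id INTEGER PRIMARY KEY, order_id INTEGER);
    CREATE TABLE order_itemmeta (order_item_id INTEGER, meta_value TEXT);
    CREATE TABLE postmeta (post_id INTEGER, meta_key TEXT, meta_value TEXT);
    INSERT INTO order_items VALUES (1, 100), (2, 100), (3, 200), (4, 300);
    INSERT INTO order_itemmeta VALUES
        (1, '23563'), (2, '23563'), (3, '23563'), (4, '999');
    INSERT INTO postmeta VALUES
        (100, '_order_total', '10.5'),
        (200, '_order_total', '4.5'),
        (300, '_order_total', '99.0');
""")

# Form 1: filter with IN (SELECT ...).
in_form = """
    SELECT SUM(CAST(pm.meta_value AS REAL))
    FROM postmeta AS pm
    WHERE pm.meta_key = '_order_total'
      AND pm.post_id IN (
          SELECT o.order_id
          FROM order_itemmeta AS im
          JOIN order_items AS o ON o.order_item_id = im.order_item_id
          WHERE im.meta_value = '23563'
      )
"""

# Form 2: the subquery becomes a derived table that is JOINed.
# DISTINCT keeps order 100 (two matching items) from being summed twice.
join_form = """
    SELECT SUM(CAST(pm.meta_value AS REAL))
    FROM (
        SELECT DISTINCT o.order_id
        FROM order_itemmeta AS im
        JOIN order_items AS o ON o.order_item_id = im.order_item_id
        WHERE im.meta_value = '23563'
    ) AS w
    JOIN postmeta AS pm ON pm.post_id = w.order_id
    WHERE pm.meta_key = '_order_total'
"""

in_total = conn.execute(in_form).fetchone()[0]
join_total = conn.execute(join_form).fetchone()[0]
print(in_total, join_total)
```

Running EXPLAIN on both forms in the real MySQL database is still the way to see which one the optimizer handles better.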
Related
Here is my query: https://www.db-fiddle.com/f/32Kc3QisUEwmSM8EmULpgd/1
SELECT p.prank, d.dare
FROM dares d
INNER JOIN pranks p ON p.id = d.prank_id
WHERE d.condo_id = 1;
I have one condo with id 1. It is connected to dares, which are connected to pranks, and it is also connected to pranks directly through condos_pranks.
I want all unique pranks from both tables. The query above gives me the dares-to-pranks relation; the expected result was L, M, N with Yes, No, Maybe, and that is correct. But I also want the pranks in condos_pranks, whose ids are 1, 4, 5, 6 (L, O, P, Q).
So I tried a LEFT JOIN to that table, because a condos_pranks row might not exist:
SELECT p.prank, d.dare
FROM dares d
INNER JOIN pranks p ON p.id = d.prank_id
LEFT JOIN condos_pranks pd ON pd.condo_id = d.condo_id AND pd.prank_id = p.id
WHERE d.condo_id = 1;
But the result is the same as the first, and what I want is:
prank | dare
L | Yes
M | No
N | Maybe
O | No
P | No
Q | No
with the default being No (= 2) if the prank_id from condos_pranks is not in dares.
How do I connect it?
This seems like an exercise in identifying extraneous information more than anything. You cannot join to a table that has no matching key; however, if you know your default, you can use something like COALESCE to find the records where there was nothing to join (the joined columns are NULL) and replace them with your default.
I mentioned in a comment above that this table schema makes little sense. You have keys all over the place that have all sorts of circular references. If this is your own schema, consider stopping here and revisiting the relationships. If it is not, and it is something educational (which I suspect it is), disregard that and simply recognize the logical flaws in what you are working with. Perhaps consider taking the data provided and creating a new, more normalized table schema that uses other tables to handle the many-to-many and one-to-many relationships.
dbfiddle
SELECT
pranks.prank,
COALESCE(dares.dare, 'No')
FROM pranks LEFT OUTER JOIN
dares ON pranks.id = dares.prank_id
ORDER BY pranks.prank ASC;
clearlyclueless gave correct explanations
To achieve the result, the following SELECT can also be used:
SELECT
pranks.prank,
case
when dare is null then 'No'
else dare
end
FROM pranks LEFT OUTER JOIN
dares ON pranks.id = dares.prank_id
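To see the COALESCE and CASE variants side by side, here is a small sketch using Python's sqlite3 module. The rows are reconstructed from the question's description (L, M, N have dares Yes, No, Maybe; O, P, Q have none); the ids for M and N are assumed, and the fiddle's actual schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed toy data, reconstructed from the question's description.
    CREATE TABLE pranks (id INTEGER PRIMARY KEY, prank TEXT);
    CREATE TABLE dares (id INTEGER PRIMARY KEY, prank_id INTEGER, dare TEXT);
    INSERT INTO pranks VALUES (1,'L'), (2,'M'), (3,'N'), (4,'O'), (5,'P'), (6,'Q');
    INSERT INTO dares (prank_id, dare) VALUES (1,'Yes'), (2,'No'), (3,'Maybe');
""")

# Variant 1: COALESCE swaps the NULL from an unmatched LEFT JOIN for the default.
coalesce_rows = conn.execute("""
    SELECT pranks.prank, COALESCE(dares.dare, 'No') AS dare
    FROM pranks
    LEFT OUTER JOIN dares ON pranks.id = dares.prank_id
    ORDER BY pranks.prank ASC
""").fetchall()

# Variant 2: the same default written as a CASE expression.
case_rows = conn.execute("""
    SELECT pranks.prank,
           CASE WHEN dares.dare IS NULL THEN 'No' ELSE dares.dare END AS dare
    FROM pranks
    LEFT OUTER JOIN dares ON pranks.id = dares.prank_id
    ORDER BY pranks.prank ASC
""").fetchall()

for row in coalesce_rows:
    print(row)
```

Both variants return the same six rows; COALESCE is simply the shorter spelling of this particular CASE.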
I can't for the life of me get this statement to work.
SELECT max(pm.timestamp), pm.id, pm.p_media_user_id, pm.p_media_type,
pm.p_media_file, pm.wall_post, pm.p_media_location,pm.p_media_location_name,
pm.p_media_category, pa.p_source_alert_id, pa.post_id, pa.p_target_alert_id,
pu.fb_id, pu.username, pu.city, pu.sex, pu.main_image
FROM p_media as pm
INNER JOIN p_users as pu ON pm.p_media_user_id = pu.fb_id
LEFT JOIN p_alerts as pa ON pm.id = pa.post_id AND pa.p_source_alert_id ='3849084'
group by pm.p_media_user_id;
The only thing that I am having issues with is the max(pm.timestamp), after the grouping I would expect it to show the NEWEST rows in the p_media table, but to the contrary it's doing the exact opposite and showing the oldest rows. So, I need the newest rows from the p_media table grouped by the user id which Join the p_users table.
Thanks in advance, if anyone helps.
As others have already pointed out, you are aggregating by the p_media_user_id column but then selecting other non-aggregate columns. This either won't run at all, or it will run but give nondeterministic results. However, it looks like you just want the most recent record from the p_media table for each p_media_user_id.
If so, then this would seem to be the query you intended to run:
SELECT
pm1.timestamp, pm1.id, pm1.p_media_user_id, pm1.p_media_type, pm1.p_media_file,
pm1.wall_post, pm1.p_media_location, pm1.p_media_location_name,
pm1.p_media_category, pa.p_source_alert_id, pa.post_id, pa.p_target_alert_id,
pu.fb_id, pu.username, pu.city, pu.sex, pu.main_image
FROM p_media as pm1
INNER JOIN
(
SELECT p_media_user_id, MAX(timestamp) AS max_timestamp
FROM p_media
GROUP BY p_media_user_id
) pm2
ON pm1.p_media_user_id = pm2.p_media_user_id AND
pm1.timestamp = pm2.max_timestamp
INNER JOIN p_users AS pu
ON pm1.p_media_user_id = pu.fb_id
LEFT JOIN p_alerts AS pa
ON pm1.id = pa.post_id AND
pa.p_source_alert_id = '3849084';
Your query is not doing what you think it is doing. When you use GROUP BY, only the columns that appear in the GROUP BY clause can be used in the SELECT without an aggregate function. All columns that are not in the GROUP BY clause MUST be used in an aggregate function when you add them to the SELECT.
This is the standard, and for all databases that follow the standards, you will get an error from your query. For some reason, MySQL decided not to follow the standards on this and no error is returned. This is really bad, because your query will run, but the results cannot be predicted. So you will think that the query is fine and will wonder why you get the wrong results, while in fact your query is invalid.
MySQL has finally addressed the problem: starting with MySQL 5.7.5, the ONLY_FULL_GROUP_BY SQL mode is enabled by default. The reason they gave is rather silly ("because GROUP BY processing has become more sophisticated to include detection of functional dependencies"), but at least they've changed the default, and from MySQL 5.7.5 on it behaves like most other databases. For earlier versions, if you have access to change the settings, I recommend enabling ONLY_FULL_GROUP_BY so you get a clear error for such invalid queries.
In some cases, you really don't care about the value returned for the non-aggregate columns, for example if all the values are exactly the same. To let the query pass while ONLY_FULL_GROUP_BY is enabled, use the ANY_VALUE() function on those columns. This is a better approach, as it clearly indicates your intention.
To learn how you can fix your query, you can read How do we select non-aggregate columns in a query with a GROUP BY clause. You need to self-join the p_media table with only the p_media_user_id and MAX(timestamp) selected on the grouping:
SELECT pm.timestamp, pm.id, pm.p_media_user_id, pm.p_media_type, pm.p_media_file,
pm.wall_post, pm.p_media_location, pm.p_media_location_name, pm.p_media_category,
pa.p_source_alert_id, pa.post_id, pa.p_target_alert_id,
pu.fb_id, pu.username, pu.city, pu.sex, pu.main_image
FROM p_media as pm
INNER JOIN (SELECT p_media_user_id, MAX(timestamp) AS max_time
FROM p_media
GROUP BY p_media_user_id
) pmm ON pm.p_media_user_id = pmm.p_media_user_id
AND pm.timestamp = pmm.max_time
INNER JOIN p_users AS pu ON pm.p_media_user_id = pu.fb_id
LEFT JOIN p_alerts AS pa ON pm.id = pa.post_id
AND pa.p_source_alert_id = '3849084';
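The self-join in the queries above can be tried on a cut-down table. This is a minimal sketch using Python's sqlite3 module; the table media and its columns (user_id, ts, file) are hypothetical stand-ins for p_media, and the joins to p_users and p_alerts are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, cut-down stand-in for p_media.
    CREATE TABLE media (id INTEGER PRIMARY KEY, user_id INTEGER, ts INTEGER, file TEXT);
    INSERT INTO media VALUES
        (1, 10, 100, 'a.jpg'),
        (2, 10, 200, 'b.jpg'),
        (3, 20,  50, 'c.jpg'),
        (4, 20, 150, 'd.jpg');
""")

# The derived table finds the newest timestamp per user; joining back to
# media on (user_id, ts) recovers the whole row that owns that timestamp.
newest_rows = conn.execute("""
    SELECT m.user_id, m.id, m.ts, m.file
    FROM media AS m
    INNER JOIN (
        SELECT user_id, MAX(ts) AS max_ts
        FROM media
        GROUP BY user_id
    ) AS latest
        ON m.user_id = latest.user_id
       AND m.ts = latest.max_ts
    ORDER BY m.user_id
""").fetchall()

for row in newest_rows:
    print(row)
```

One caveat: if a user has two rows sharing the same maximum timestamp, this pattern returns both of them.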
You should be able to add an ORDER BY after the grouping and tell SQL what column you want to sort by [ASC or DESC].
SELECT max(pm.timestamp), pm.id, pm.p_media_user_id, pm.p_media_type,
pm.p_media_file, pm.wall_post, pm.p_media_location,pm.p_media_location_name,
pm.p_media_category, pa.p_source_alert_id, pa.post_id, pa.p_target_alert_id,
pu.fb_id, pu.username, pu.city, pu.sex, pu.main_image
FROM p_media as pm
INNER JOIN p_users as pu ON pm.p_media_user_id = pu.fb_id
LEFT JOIN p_alerts as pa ON pm.id = pa.post_id AND pa.p_source_alert_id ='3849084'
group by pm.p_media_user_id
ORDER BY pm.p_media_user_id DESC;
I have read through many tutorials online and here on Stack Overflow, but I still can't figure out how to solve the problem I'm facing right now.
I should say that I'm a MySQL newbie, so please forgive my noobness.
Alright, this is the query; it grabs the information that I need from the WordPress database:
SELECT
product.ID productId,
product.guid productLink,
product.post_title productTitle,
post.ID postId,
post.post_title postTitle,
post.post_content postContent,
post.post_date postDate,
tm.slug typeSlug, tm.name typeName,
tm2.slug langSlug, tm2.name langName,
tm3.slug pubSlug, tm3.name pubName,
IFNULL(wl.id,0) wishlist
FROM wp_posts product
JOIN wp_postmeta meta ON meta.meta_key = 'p2m' AND meta.meta_value=product.ID
JOIN wp_posts post ON post.ID = meta.post_id
JOIN wp_term_relationships tr ON tr.object_id = product.ID
JOIN wp_term_taxonomy tt ON tt.term_taxonomy_id = tr.term_taxonomy_id AND tt.taxonomy = 'mtype'
JOIN wp_terms tm ON tm.term_id = tt.term_id
JOIN wp_term_relationships tr2 ON tr2.object_id = product.ID
JOIN wp_term_taxonomy tt2 ON tt2.term_taxonomy_id = tr2.term_taxonomy_id AND tt2.taxonomy = 'language'
JOIN wp_terms tm2 ON tm2.term_id = tt2.term_id
JOIN wp_term_relationships tr3 ON tr3.object_id = product.ID
JOIN wp_term_taxonomy tt3 ON tt3.term_taxonomy_id = tr3.term_taxonomy_id AND tt3.taxonomy = 'publisher'
JOIN wp_terms tm3 ON tm3.term_id = tt3.term_id
LEFT JOIN wp_yith_wcwl wl ON wl.user_id = 1 AND wl.prod_id = product.ID AND wl.post_id = post.ID
WHERE product.post_type = 'product'
ORDER BY post.post_date DESC LIMIT 0,35
When I remove "ORDER BY post.post_date DESC", the query time drops to .03 seconds, which is freaking amazing. But with "ORDER BY post.post_date DESC" added, the query takes 10+ seconds, which is way too long.
I've used EXPLAIN and it seems that there is usage of filesort when the ORDER BY by date gets into the query.
I need to have my query reply back the results according to the post_date so I can't figure out what I could do at this point...
Additionally, I would like to point out that the WordPress Database Description lists an INDEX called "type_status_date", which could be useful in my case. However, I'm totally clueless about where and how to use it. If anyone can point out the flaw in the logic of my query or help me optimize the query (or the index), please do so. Thanks for your kind attention!
P.S: I don't know how to create an index too :)
Initial Result of EXPLAIN with ORDER BY
JOIN wp_postmeta meta
ON meta.meta_key = 'p2m' -- filters
AND meta.meta_value=product.ID -- shows relation
is confusing. JOIN...ON is used to say how two tables are related. Filters belong in WHERE:
WHERE ...
AND meta.meta_key = 'p2m'
...
wp_postmeta is not well indexed. More discussion here.
Adding INDEX(post_date) may or may not help performance -- It depends on how quickly 35 good rows are found.
From the EXPLAIN, we see that the worst part is getting into meta -- something like 30K rows to look through. It estimates that there are 30 rows with meta_key = 'p2m'. How many rows are there?
Unfortunately wp_postmeta is not designed to efficiently start with the meta_key+meta_value. This is a general problem with key-value stores (such as Posts in WP), especially when the 'value' is LONGTEXT.
The type_status_date index on wp_posts has the date field as the third field of the index:
type_status_date INDEX
post_type
post_status
post_date
ID
You have a predicate for post_type, but post_status does not appear in any predicate of your query. At best the optimizer can therefore do a partial scan of the index (sometimes called a skip scan or range scan, depending on the database; MySQL is not going to play ball easily on that), and it will be slower.
That is the best-case scenario, though. With all those joins and the additional fields not covered by the index, the cost of the index scan plus row lookups could be far too high to consider even touching the index versus a straight scan.
It would help if you posted the EXPLAIN plan, as it would confirm what the optimizer is doing. The above is more generic DB engine commentary.
type_status_date is a combined index, so it's used only if you order by all its components. It cannot be used by MySQL to order by only post_date. So the best solution is to add an index on post_date.
I am trying to optimize MySQL to decrease my server load. I have a complex query that will run about 1k times/minute on a quad-core server with 8 GB of RAM, and my server is going down.
I have tried many ways to rewrite this query :
SELECT *
FROM (
SELECT a.id,
a.url
FROM surf a
LEFT JOIN users b
ON b.id = a.user
LEFT JOIN surfed c
ON c.user = 'asdf' AND c.site = a.id
WHERE a.active = '0'
AND (b.coins >= a.cpc AND a.cpc >= '2')
AND (c.site IS NULL AND a.user !='asdf')
ORDER BY a.cpc DESC, b.premium DESC
LIMIT 100) AS records
ORDER BY RAND()
LIMIT 1
But it didn't work. So can you guys help me to rewrite the above query so that it would not waste any resources ?
Also this query doesn't have any indexes :( . It would be very helpful to guide me creating indexes for this.
The problem is most likely the inner sort.
You should have indexes on surfed(site, user, cpc), surf(active, user, site), and users(id, coins).
You can also perhaps make a minor improvement by switching the join to users from an outer join to an inner join. The WHERE clause (b.coins >= a.cpc) undoes that left outer join anyway, so this won't affect the results. The join to surfed has to stay a LEFT JOIN, because the c.site IS NULL test depends on the unmatched rows.
But I don't think these changes will really help. The problem is the sort of the result set in the inner query. The outer sort by rand() is a minor issue, because you are only doing that on 100 rows.
If you are running this 1,000/minute, you will need to rethink your data structures and develop an approach that has the data you need more readily available.
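The c.site IS NULL test in the question is the usual anti-join pattern: keep only the rows for which the LEFT JOIN found no match. Here is a small sketch of that pattern using Python's sqlite3 module; the columns owner and visitor are renamed stand-ins for the question's user columns, and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, cut-down versions of surf and surfed.
    CREATE TABLE surf (id INTEGER PRIMARY KEY, owner TEXT);
    CREATE TABLE surfed (visitor TEXT, site INTEGER);
    INSERT INTO surf VALUES (1, 'bob'), (2, 'carol'), (3, 'asdf');
    INSERT INTO surfed VALUES ('asdf', 1), ('zed', 2);
""")

# Anti-join: sites that 'asdf' has not surfed yet and does not own.
left_join_ids = [r[0] for r in conn.execute("""
    SELECT a.id
    FROM surf AS a
    LEFT JOIN surfed AS c ON c.visitor = 'asdf' AND c.site = a.id
    WHERE c.site IS NULL AND a.owner != 'asdf'
    ORDER BY a.id
""")]

# The same question phrased with NOT EXISTS.
not_exists_ids = [r[0] for r in conn.execute("""
    SELECT a.id
    FROM surf AS a
    WHERE a.owner != 'asdf'
      AND NOT EXISTS (
          SELECT 1 FROM surfed AS c
          WHERE c.visitor = 'asdf' AND c.site = a.id
      )
    ORDER BY a.id
""")]

# With an INNER JOIN, c.site can never be NULL, so nothing comes back.
inner_join_ids = [r[0] for r in conn.execute("""
    SELECT a.id
    FROM surf AS a
    INNER JOIN surfed AS c ON c.visitor = 'asdf' AND c.site = a.id
    WHERE c.site IS NULL AND a.owner != 'asdf'
""")]

print(left_join_ids, not_exists_ids, inner_join_ids)
```

This only demonstrates the semantics; it says nothing about which form MySQL executes faster at 1,000 runs per minute.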
I am having trouble figuring out how I can get results only when products.published, product_types.published, and product_cats.published are all 1, but my query isn't working. Please help:
SELECT
`products`.`title`,
`products`.`menu_id`,
`products`.`short_description`,
`products`.`datasheet_icon`,
`products`.`datasheet`,
`products`.`ordering`,
`products`.`product_type_id`,
CASE WHEN CHAR_LENGTH(`products`.`alias`)
THEN CONCAT_WS(':', `products`.`id`, `products`.`alias`)
ELSE `products`.`id`
END AS slug
FROM
`products`,
`product_cats`,
`product_types`
WHERE
`products`.published=1 AND
`product_cats`.published=1 AND
`product_types`.published=1 AND
`products`.`product_cat_id`='42' AND
`product_types`.`id` IN (1,40,48,49,50)
GROUP BY `products`.`id`
ORDER BY `product_types`.`ordering`, `products`.`ordering`
I want to assume tables product_cats and product_types have product ids in them as well. And I call them pid in this:
SELECT
p.title,
p.menu_id,
p.short_description,
p.datasheet_icon,
p.datasheet,
p.ordering,
p.product_type_id,
CASE
WHEN CHAR_LENGTH(p.alias)
THEN CONCAT_WS(':', p.id, p.alias)
ELSE p.id
END AS slug
FROM products p
JOIN product_cats pc ON pc.pid = p.id
JOIN product_types pt ON pt.pid = p.id
WHERE
p.published=1 AND
pc.published=1 AND
pt.published=1
GROUP BY p.id
ORDER BY pt.ordering,p.ordering
You need to join the tables!
FROM
`products`,
`product_cats`,
`product_types`
Use relational fields to do it and your problem will be gone!
I'm afraid your query is a bit of a mess. Without the table structures we can only guess at what you're trying to do. The critical information is how the three tables are related to each other.
Note the following:
You are using three tables in your SELECT, but are not JOINing them. You will need to explicitly JOIN the tables you use. The lack of explicit JOINs is the reason you're getting too many rows back and are having to use GROUP BY to eliminate duplicates. Your final solution should not use GROUP BY.
If you're only searching for product.cat_id of 42, I presume you know whether that cat_id is published and you don't need to involve the product_cats table. Is that correct?
Presumably there's a column product.type_id or something similar. Since you are searching for a limited number of these, do you know in advance that the ids in that list are published?
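To make the effect of the missing join conditions concrete, here is a small sketch using Python's sqlite3 module. It assumes that products.product_cat_id and products.product_type_id (both of which appear in the question) reference product_cats.id and product_types.id; the real table structures were not posted, so this relationship and all of the rows are guesses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed structure: products points at one category and one type.
    CREATE TABLE product_cats (id INTEGER PRIMARY KEY, published INTEGER);
    CREATE TABLE product_types (id INTEGER PRIMARY KEY, published INTEGER);
    CREATE TABLE products (
        id INTEGER PRIMARY KEY, title TEXT,
        product_cat_id INTEGER, product_type_id INTEGER, published INTEGER
    );
    INSERT INTO product_cats VALUES (42, 1), (43, 1);
    INSERT INTO product_types VALUES (1, 1), (2, 0), (3, 1);
    INSERT INTO products VALUES (1, 'A', 42, 1, 1), (2, 'B', 42, 2, 1);
""")

where = """
    WHERE products.published = 1
      AND product_cats.published = 1
      AND product_types.published = 1
      AND products.product_cat_id = 42
"""

# Comma join with no join conditions: every product pairs with every
# published category and every published type.
comma_rows = conn.execute(
    "SELECT products.title FROM products, product_cats, product_types" + where
).fetchall()

# GROUP BY hides the duplicates, but product B still slips through even
# though its own type (id 2) is unpublished.
grouped_titles = [r[0] for r in conn.execute(
    "SELECT products.title FROM products, product_cats, product_types"
    + where + " GROUP BY products.id ORDER BY products.title"
)]

# Explicit joins tie each product to its own category and type.
joined_titles = [r[0] for r in conn.execute(
    """
    SELECT products.title
    FROM products
    JOIN product_cats ON product_cats.id = products.product_cat_id
    JOIN product_types ON product_types.id = products.product_type_id
    """ + where + " ORDER BY products.title"
)]

print(len(comma_rows), grouped_titles, joined_titles)
```

With the joins in place, no GROUP BY is needed, and only products whose own category and type are published are returned.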