Performance: Double nested subquery unknown column - mysql

If I do COUNT()/AVG() in the first subquery, MySQL processes all rows in the table, so it seems necessary to filter the rows first with another subquery.
For example, if I have 3 rows but only 1 row has the id that should be counted, MySQL processes all 3 rows (according to EXPLAIN) and applies the WHERE clause afterwards.
If I could select just this single row in a doubly nested subquery and call COUNT outside it, performance would be much better.
The problem is that MySQL does not allow referencing outer values in a second-level subquery.
A simplified example of my code:
SELECT
pr.id, pr.catid, ...
(
SELECT COUNT(pra.id)
FROM (
SELECT id
FROM productsrating
WHERE pr.id = productid
) pra
) as ratingcount,
...
FROM
(
SELECT id, ...
FROM products
WHERE active = 1
) pr
-> Unknown column pr.id
I also tried to use COUNT in the main select, but a subquery there is not allowed to return multiple values.
Edit: I have an index on productid.
EDIT 2, SOLUTION:
Sorry, everyone: it works fine with the first single subquery. Server problems caused the bad behavior.

It seems you want the count of ratings for each active product. Is this correct?
If so, why is a simple left join not working? The count of PRA is based only on those products which are active, so index usage should work here.
I'd need to see sample data and expected results to figure out the overall goal here.
SELECT PR.*, count(PRA.ID)
FROM products PR
LEFT JOIN productsRating PRA
on PR.ID = PRA.ProductID
WHERE PR.Active = 1
GROUP BY PR.*
Substitute the fields you need for PR.* in both the SELECT and the GROUP BY; GROUP BY PR.* is not valid SQL as written.
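To make the left-join suggestion concrete, here is a minimal sketch. It runs the query through Python's built-in sqlite3 module, with SQLite standing in for MySQL; the table and column names follow the question, while the sample rows are invented for illustration.

```python
import sqlite3

# SQLite stands in for MySQL here; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, catid INTEGER, active INTEGER);
    CREATE TABLE productsrating (id INTEGER PRIMARY KEY, productid INTEGER);
    INSERT INTO products VALUES (1, 10, 1), (2, 10, 1), (3, 20, 0);
    INSERT INTO productsrating VALUES (1, 1), (2, 1), (3, 3);
""")

# Every selected products column is listed explicitly in GROUP BY.
rating_counts = conn.execute("""
    SELECT pr.id, pr.catid, COUNT(pra.id) AS ratingcount
    FROM products pr
    LEFT JOIN productsrating pra ON pra.productid = pr.id
    WHERE pr.active = 1
    GROUP BY pr.id, pr.catid
    ORDER BY pr.id
""").fetchall()

print(rating_counts)
```

Because COUNT(pra.id) ignores the NULLs produced by the left join, an active product without ratings still appears, with a count of 0, and the inactive product is filtered out.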
Maybe this, although it seems like an odd thing to have to do just to get the product ratings filtered before the average/count is done:
SELECT PR.*, count(PRA.ID)
FROM products PR
LEFT JOIN (SELECT * FROM productsRating PRI
WHERE EXISTS (SELECT 1
FROM Products P
WHERE active = 1 and PRI.ProductID = P.ID)) PRA
on PR.ID = PRA.ProductID
WHERE PR.Active = 1
GROUP BY PR.*

Try COUNT with DISTINCT:
SELECT
pr.id, pr.catid, ...
(
SELECT COUNT(Distinct productsrating.id)
FROM productsrating
WHERE pr.id = productid
) as ratingcount,
...
FROM
(
SELECT id, ...
FROM products
WHERE active = 1
) pr

Related

Mysql getting only 1 result, rather than multiple

Short setup
consider the following.
SELECT forum_category.groupid,
forum_category.categoryid,
forum_category.categoryname,
forum_category.categorydescription,
forum_category.category_url,
forum_category.accesslevel ,
COUNT(DISTINCT forum_topic.topicid) AS topics ,
COUNT(DISTINCT forum_post.postid) AS posts
FROM forum_category
INNER JOIN forum_topic ON forum_topic.categoryid=forum_category.categoryid
INNER JOIN forum_post ON forum_post.topicid=forum_topic.topicid
WHERE groupid = 1
This actually gives me only one row, while I expect multiple rows (in this case 2) to come back. What am I missing here?
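A likely explanation, offered as a sketch rather than a confirmed diagnosis: aggregates such as COUNT() without a GROUP BY collapse the whole result into a single row. The example below uses Python's built-in sqlite3 module as a stand-in for MySQL, with a reduced version of the question's tables and invented sample rows, and compares the ungrouped and grouped forms.

```python
import sqlite3

# SQLite stands in for MySQL; only the columns needed for the demo are kept.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE forum_category (categoryid INTEGER, groupid INTEGER, categoryname TEXT);
    CREATE TABLE forum_topic (topicid INTEGER, categoryid INTEGER);
    CREATE TABLE forum_post (postid INTEGER, topicid INTEGER);
    INSERT INTO forum_category VALUES (1, 1, 'News'), (2, 1, 'Help');
    INSERT INTO forum_topic VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO forum_post VALUES (1, 1), (2, 1), (3, 2), (4, 3);
""")

BASE = """
    SELECT c.categoryid,
           COUNT(DISTINCT t.topicid) AS topics,
           COUNT(DISTINCT p.postid) AS posts
    FROM forum_category c
    INNER JOIN forum_topic t ON t.categoryid = c.categoryid
    INNER JOIN forum_post p ON p.topicid = t.topicid
    WHERE c.groupid = 1
"""

# No GROUP BY: all joined rows are aggregated into one row.
ungrouped = conn.execute(BASE).fetchall()

# GROUP BY: one row per category.
grouped = conn.execute(BASE + " GROUP BY c.categoryid ORDER BY c.categoryid").fetchall()

print(len(ungrouped), grouped)
```

In the real query, every non-aggregated column in the SELECT list would go into the GROUP BY as well. Note that MySQL with ONLY_FULL_GROUP_BY enabled rejects the ungrouped form with an error instead of returning one row.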

select count with another select and inner join

Is it possible to use two "select" in the same query?
I tried it but got a syntax error several times.
My query example:
SELECT
comp.id,
comp.document,
comp.dateStart,
comp.companyName,
comp.fantasyName,
comp.legalNature,
comp.mainActivity,
comp.situation,
comp.shareCapital,
comp.idCompanyStatus,
pp.userCredentialId,
uc.name,
cs.name AS 'nameStatus',
cs.color AS 'colorStatus',
cs.description,
comp.idPurchasedProduct,
comp.actived,
comp.createAt,
comp.updateAt,
comp.phone
FROM `PurchasedProduct` pp
INNER JOIN
`Company` comp on comp.idPurchasedProduct = pp.id
INNER JOIN
`UserCustomer` uc on pp.userCredentialId = uc.credentialId
INNER JOIN
`CompanyStatus` cs on cs.id = comp.idCompanyStatus
WHERE
comp.actived = 1
LIMIT 0,5;
SELECT COUNT(id) AS totalItems, CEILING(COUNT(id) / 10) AS totalPages FROM Company;
I would like the results of both queries to be shown together.
Basically, I want executing the query to return the first and second "select" combined.
I really don't understand how to do this.
Example: the first result shown together with the second result.
I want to show both results at once.
The documents are fake, not real; they are only for demo purposes.
You should be able to do this by adding the second query as its own JOIN subquery. Since it has no GROUP BY, it returns only a single row, and with no join condition that row's values are available on every row of the result. So you SHOULD be able to get what you want with:
select
[ all your other columns ],
JustCounts.TotalItems,
JustCounts.TotalPages
from
[PurchasedProduct and all your other joins]
JOIN ( SELECT
COUNT(id) AS totalItems,
CEILING(COUNT(id) / 10) AS totalPages
FROM Company ) as JustCounts
where
[rest of your original query]
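A reduced version of this idea, as a sketch only: it uses Python's built-in sqlite3 module in place of MySQL, keeps just the Company table with invented rows, and replaces CEILING(COUNT(id) / 10) with integer arithmetic because CEILING is not guaranteed to be available in SQLite.

```python
import sqlite3

# SQLite stands in for MySQL; 12 made-up companies, 2 of them active.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Company (id INTEGER PRIMARY KEY, companyName TEXT, actived INTEGER)"
)
conn.executemany(
    "INSERT INTO Company VALUES (?, ?, ?)",
    [(i, f"Company {i}", 1 if i <= 2 else 0) for i in range(1, 13)],
)

# The derived table returns exactly one row, so joining it without a
# condition attaches the totals to every row of the main query.
rows_with_totals = conn.execute("""
    SELECT comp.id, comp.companyName, JustCounts.totalItems, JustCounts.totalPages
    FROM Company comp
    CROSS JOIN (
        SELECT COUNT(id) AS totalItems,
               (COUNT(id) + 9) / 10 AS totalPages  -- integer ceiling of COUNT / 10
        FROM Company
    ) AS JustCounts
    WHERE comp.actived = 1
    ORDER BY comp.id
    LIMIT 5
""").fetchall()

print(rows_with_totals)
```

The totals are computed over the whole Company table, as in the original second query, so they do not change with the WHERE clause or the LIMIT of the outer query.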

Understanding the difference between two queries from a performance point of view

I have these two versions of the same query. Both produce the same results (164 rows), but the second one takes .5 sec while the first one takes 17 sec. Can someone explain what's going on here?
TABLE organizations : 11988 ROWS
TABLE transaction_metas : 58232 ROWS
TABLE contracts_history : 219469 ROWS
# TAKES 17 SEC
SELECT contracts_history.buyer_id as id, org.name, SUM(transactions_count) as transactions_count, GROUP_CONCAT(DISTINCT(tm.value)) as balancing_authorities
From `contracts_history`
INNER JOIN `organizations` as `org`
ON `org`.`id` = `contracts_history`.`buyer_id`
LEFT JOIN `transaction_metas` as `tm`
ON `tm`.`contract_token` = `contracts_history`.`token` and `tm`.`field` = '1'
WHERE `contracts_history`.`seller_id` = '850'
GROUP BY `contracts_history`.`buyer_id` ORDER BY `balancing_authorities` DESC
# TAKES .6 SEC
SELECT contracts_history.buyer_id as id, org.name, SUM(transactions_count) as transactions_count, GROUP_CONCAT(DISTINCT(tm.value)) as balancing_authorities
From `contracts_history`
INNER JOIN `organizations` as `org`
ON `org`.`id` = `contracts_history`.`buyer_id`
left join (select * from `transaction_metas` where contract_token in (select token from `contracts_history` where seller_id = 850)) as `tm`
ON `tm`.`contract_token` = `contracts_history`.`token` and `tm`.`field` = '1'
WHERE `contracts_history`.`seller_id` = '850'
GROUP BY `contracts_history`.`buyer_id` ORDER BY `balancing_authorities` DESC
Explain Results:
First Query: https://prnt.sc/hjtiw6
Second Query: https://prnt.sc/hjtjjg
Based on my debugging of the first query, it was clear that the left join to the transaction_metas table was making it slow, so I tried to limit its rows instead of joining to the full table. It seems to work, but I don't understand why.
A join produces combinations of rows from your tables. With that in mind, in the first query the engine builds all the combinations and filters afterwards. In the second query it applies the filter before it tries to build the combinations.
The best case would be to put the filter in the JOIN clause, without a subquery.
Something like this:
SELECT contracts_history.buyer_id as id, org.name, SUM(transactions_count) as transactions_count, GROUP_CONCAT(DISTINCT(tm.value)) as balancing_authorities
From `contracts_history`
INNER JOIN `organizations` as `org`
ON `org`.`id` = `contracts_history`.`buyer_id`
AND `contracts_history`.`seller_id` = '850'
LEFT JOIN `transaction_metas` as `tm`
ON `tm`.`contract_token` = `contracts_history`.`token`
AND `tm`.`field` = 1
GROUP BY `contracts_history`.`buyer_id` ORDER BY `balancing_authorities` DESC
Note: When you reduce the size of the joined tables by filtering with subqueries, the rows may fit into the buffer. It is a nice trick when the buffer limit is small.
A better explanation:
https://dev.mysql.com/doc/refman/5.5/en/explain-output.html
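As a small check of the claim that the two forms are interchangeable, the sketch below runs both against the same data and compares the results. It uses Python's built-in sqlite3 module rather than MySQL, drops the organizations join for brevity, and uses invented rows, so it says nothing about timing; on the real server, EXPLAIN on each query is the place to compare plans.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contracts_history (
        token TEXT, buyer_id INTEGER, seller_id INTEGER, transactions_count INTEGER);
    CREATE TABLE transaction_metas (contract_token TEXT, field TEXT, value TEXT);
    INSERT INTO contracts_history VALUES
        ('t1', 1, 850, 5), ('t2', 1, 850, 3), ('t3', 2, 850, 4), ('t4', 3, 999, 7);
    INSERT INTO transaction_metas VALUES
        ('t1', '1', 'PJM'), ('t2', '1', 'PJM'), ('t3', '2', 'ignored'), ('t4', '1', 'MISO');
""")

QUERY = """
    SELECT ch.buyer_id, SUM(ch.transactions_count), GROUP_CONCAT(DISTINCT tm.value)
    FROM contracts_history ch
    LEFT JOIN {metas} tm
           ON tm.contract_token = ch.token AND tm.field = '1'
    WHERE ch.seller_id = 850
    GROUP BY ch.buyer_id
    ORDER BY ch.buyer_id
"""

# Variant 1: join the full transaction_metas table.
full_join = conn.execute(QUERY.format(metas="transaction_metas")).fetchall()

# Variant 2: join only the metas that belong to seller 850's contracts.
prefiltered = conn.execute(QUERY.format(metas="""(
    SELECT * FROM transaction_metas
    WHERE contract_token IN (SELECT token FROM contracts_history WHERE seller_id = 850)
)""")).fetchall()

print(full_join == prefiltered, full_join)
```

Both variants return the same rows because the pre-filter only removes metas that the ON and WHERE conditions would have discarded anyway; the difference is in how many rows the engine has to consider while joining.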

MySQL Inner Join with where clause sorting and limit, subquery?

Everything in the following query results in one line for each invBlueprintTypes row with the correct information. But I'm trying to add something to it. See below the codeblock.
Select
blueprintType.typeID,
blueprintType.typeName Blueprint,
productType.typeID,
productType.typeName Item,
productType.portionSize,
blueprintType.basePrice * 0.9 As bpoPrice,
productGroup.groupName ItemGroup,
productCategory.categoryName ItemCategory,
blueprints.productionTime,
blueprints.techLevel,
blueprints.researchProductivityTime,
blueprints.researchMaterialTime,
blueprints.researchCopyTime,
blueprints.researchTechTime,
blueprints.productivityModifier,
blueprints.materialModifier,
blueprints.wasteFactor,
blueprints.maxProductionLimit,
blueprints.blueprintTypeID
From
invBlueprintTypes As blueprints
Inner Join invTypes As blueprintType On blueprints.blueprintTypeID = blueprintType.typeID
Inner Join invTypes As productType On blueprints.productTypeID = productType.typeID
Inner Join invGroups As productGroup On productType.groupID = productGroup.groupID
Inner Join invCategories As productCategory On productGroup.categoryID = productCategory.categoryID
Where
blueprints.techLevel = 1 And
blueprintType.published = 1 And
productType.marketGroupID Is Not Null And
blueprintType.basePrice > 0
So what I need to bring in is the following table, with the columns listed below it, so that I can use the timestamp values and sort the entire result by profitHour:
tablename: invBlueprintTypesPrices
columns: blueprintTypeID, timestamp, profitHour
I need this information with the following select in mind. I'm using a select here only to show the intent of the JOIN, in-query select, or whatever else can do this.
SELECT * FROM invBlueprintTypesPrices
WHERE blueprintTypeID = blueprintType.typeID
ORDER BY timestamp DESC LIMIT 1
And I need the main row from invBlueprintTypes to still show even if there is no result from invBlueprintTypesPrices. The LIMIT 1 is because I want the newest row possible; deleting the older data is not an option since the history is needed.
If I've understood correctly, I think I need a subquery select, but how do I do that? I've tried adding the exact query above, with AS blueprintPrices after the query's closing ), but it failed with an error pointing at the
WHERE blueprintTypeID = blueprintType.typeID
part. I have no idea why. Can anyone solve this?
You'll need to use a LEFT JOIN so that rows without a match in invBlueprintTypesPrices still appear (with NULL values). To mimic the LIMIT 1 per type id, you can use MAX(), or, to truly make sure you only return a single record, use a row number. This depends on whether you can have multiple identical max timestamps for a type id. Assuming not, then this should be close:
Select
...
From
invBlueprintTypes As blueprints
Inner Join invTypes As blueprintType On blueprints.blueprintTypeID = blueprintType.typeID
Inner Join invTypes As productType On blueprints.productTypeID = productType.typeID
Inner Join invGroups As productGroup On productType.groupID = productGroup.groupID
Inner Join invCategories As productCategory On productGroup.categoryID = productCategory.categoryID
Left Join (
SELECT MAX(TimeStamp) MaxTime, blueprintTypeID
FROM invBlueprintTypesPrices
GROUP BY blueprintTypeID
) blueprintTypePrice On blueprints.blueprintTypeID = blueprintTypePrice.blueprintTypeID
Left Join invBlueprintTypesPrices blueprintTypePrices On
blueprintTypePrice.blueprintTypeID = blueprintTypePrices.blueprintTypeID AND
blueprintTypePrice.MaxTime = blueprintTypePrices.TimeStamp
Where
blueprints.techLevel = 1 And
blueprintType.published = 1 And
productType.marketGroupID Is Not Null And
blueprintType.basePrice > 0
Order By
blueprintTypePrices.profitHour
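The MAX() approach can be tried on a cut-down schema. This is only a sketch: Python's built-in sqlite3 module stands in for MySQL, the main query is reduced to the invBlueprintTypes table, the price table uses the column names given in the question, and the rows are invented.

```python
import sqlite3

# SQLite stands in for MySQL; blueprint 3 deliberately has no price rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invBlueprintTypes (blueprintTypeID INTEGER PRIMARY KEY);
    CREATE TABLE invBlueprintTypesPrices (
        blueprintTypeID INTEGER, timestamp INTEGER, profitHour REAL);
    INSERT INTO invBlueprintTypes VALUES (1), (2), (3);
    INSERT INTO invBlueprintTypesPrices VALUES
        (1, 100, 5.0), (1, 200, 7.5), (2, 150, 3.0);
""")

latest_prices = conn.execute("""
    SELECT b.blueprintTypeID, p.timestamp, p.profitHour
    FROM invBlueprintTypes b
    -- newest timestamp per blueprint
    LEFT JOIN (
        SELECT blueprintTypeID, MAX(timestamp) AS maxTime
        FROM invBlueprintTypesPrices
        GROUP BY blueprintTypeID
    ) latest ON latest.blueprintTypeID = b.blueprintTypeID
    -- the full price row that carries that timestamp
    LEFT JOIN invBlueprintTypesPrices p
           ON p.blueprintTypeID = latest.blueprintTypeID
          AND p.timestamp = latest.maxTime
    ORDER BY p.profitHour DESC
""").fetchall()

print(latest_prices)
```

Blueprint 3 has no price history and still appears, with NULLs for the price columns, which is the behavior the question asks for.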
Assuming you might have the same max timestamp on 2 different records, replace the 2 left joins above with something similar to this, which uses a row number:
Left Join (
SELECT @rn:=IF(@prevTypeId=blueprintTypeID,@rn+1,1) rn,
TimeStamp,
blueprintTypeID,
profitHour,
@prevTypeId:=blueprintTypeID
FROM (SELECT *
FROM invBlueprintTypesPrices
ORDER BY blueprintTypeID, TimeStamp DESC) t
JOIN (SELECT @rn:=0) t2
) blueprintTypePrices On blueprints.blueprintTypeID = blueprintTypePrices.blueprintTypeID AND blueprintTypePrices.rn=1
You don't say where you are putting the subquery. If it is in the select clause, then you have a problem because you are returning more than one value.
You can't put this into the from clause directly, because it is a correlated subquery (not allowed there).
Instead, you can put it in like this:
from . . .
(select *
from invBLueprintTypesPrices ibptp
where ibptp.timestamp = (select ibptp2.timestamp
from invBLueprintTypesPrices ibptp2
where ibptp.blueprintTypeId = ibptp2.blueprintTypeId
order by timestamp desc
limit 1
)
) ibptp
on ibptp.blueprintTypeId = blueprintType.TypeID
This identifies the most recent records for all the blueprintTypeids in the subquery. It then joins in the one that matches.
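Here is the same pattern on a cut-down schema, as a sketch under assumptions: Python's built-in sqlite3 module stands in for MySQL, the elided part of the from clause is replaced by the invBlueprintTypes table alone, a LEFT JOIN is assumed so that blueprints without prices survive, and the rows are invented.

```python
import sqlite3

# SQLite stands in for MySQL; blueprint 3 has no price rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invBlueprintTypes (blueprintTypeID INTEGER PRIMARY KEY);
    CREATE TABLE invBlueprintTypesPrices (
        blueprintTypeID INTEGER, timestamp INTEGER, profitHour REAL);
    INSERT INTO invBlueprintTypes VALUES (1), (2), (3);
    INSERT INTO invBlueprintTypesPrices VALUES
        (1, 100, 5.0), (1, 200, 7.5), (2, 150, 3.0);
""")

# The correlation lives inside the derived table (p2 refers to p), not
# between the derived table and the outer query, so it is allowed.
newest_rows = conn.execute("""
    SELECT b.blueprintTypeID, ibptp.timestamp, ibptp.profitHour
    FROM invBlueprintTypes b
    LEFT JOIN (
        SELECT *
        FROM invBlueprintTypesPrices p
        WHERE p.timestamp = (SELECT p2.timestamp
                             FROM invBlueprintTypesPrices p2
                             WHERE p2.blueprintTypeID = p.blueprintTypeID
                             ORDER BY p2.timestamp DESC
                             LIMIT 1)
    ) ibptp ON ibptp.blueprintTypeID = b.blueprintTypeID
    ORDER BY b.blueprintTypeID
""").fetchall()

print(newest_rows)
```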

Comparing two values from the same select query

I have a select query which selects all products from my inventory table and joins them with two other tables (tables l_products and a_products)
SELECT
i.*,
b.title,
ROUND((i.price/100*80) - l.price,2) AS margin,
l.price AS l_price,
a.price AS a_price,
ROUND((a.price/100*80) - l.price, 2) AS l_margin
FROM inventory i
LEFT JOIN products b ON i.id = b.id
LEFT JOIN a_products a ON i.id = a.id
LEFT JOIN l_products l ON i.id = l.id
WHERE
a.condition LIKE IF(i.condition = 'New', 'New%', 'Used%')
AND l.condition LIKE IF(i.condition = 'New', 'New%', 'Used%')
This select query will normally give me a table such as...
id, title, condition, margin, l_price, a_price ...
001-new ... new 10 20 10
001-used ... used 10 25 20
002....
Now I need a condition in the query that ignores all used products that are more expensive (have a higher a_price) than their 'new' counterparts. In the example above, 001-used has a higher a_price than 001-new.
How can I achieve this without having to resort to PHP?
Join this query with itself on a column that has the same value for every id sharing a prefix. (MySQL has no FULL JOIN, so use an INNER or LEFT self-join.)
You can achieve this by adding another field to your SELECT that produces the same value for 001-new and 001-used, for 002-new and 002-used, and so on.
Such a value can be generated by extracting the first 3 characters of the id column, either with your own SQL routine or with the built-in LEFT(id, 3), assuming the prefix is always 3 characters long.
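A sketch of the self-join, with several assumptions: Python's built-in sqlite3 module stands in for MySQL, a hypothetical table named priced stands in for the result of the question's query, the 002 rows are invented, and SQLite's substr(id, 1, 3) plays the role of MySQL's LEFT(id, 3).

```python
import sqlite3

# SQLite stands in for MySQL; "priced" is a hypothetical stand-in for the
# joined inventory result, reduced to the columns the filter needs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE priced (id TEXT, condition TEXT, a_price INTEGER);
    INSERT INTO priced VALUES
        ('001-new', 'new', 10), ('001-used', 'used', 20),
        ('002-new', 'new', 30), ('002-used', 'used', 25);
""")

# Pair every row with the 'new' row that shares its 3-character prefix,
# then keep new rows, used rows without a new counterpart, and used rows
# that are not more expensive than their new counterpart.
kept = conn.execute("""
    SELECT u.id, u.condition, u.a_price
    FROM priced u
    LEFT JOIN priced n
           ON substr(n.id, 1, 3) = substr(u.id, 1, 3)
          AND n.condition = 'new'
    WHERE u.condition = 'new'
       OR n.id IS NULL
       OR u.a_price <= n.a_price
    ORDER BY u.id
""").fetchall()

print(kept)
```

Here 001-used is dropped because its a_price of 20 exceeds the 10 of 001-new, while 002-used stays because 25 is below 30.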