Wrong MySQL subquery count comparison result - mysql

I found a bug in my application: it wasn't showing a result it should have. I traced it back to the following SQL query (irrelevant parts removed).
The query selects rows from wc_hncat_products whose ids have a count of >= 1 in the wc_hncat_product_category_has_product table.
If I run the query with the subquery as a column, the result is 1. But when I use the same subquery in the WHERE clause, the >= 1 comparison fails.
Proof that the subquery DOES return 1:
SELECT `wc_hncat_products`.`id`,
(SELECT Count(*)
FROM `wc_hncat_product_categories`
INNER JOIN `wc_hncat_product_category_has_product`
ON `wc_hncat_product_categories`.`id` =
`wc_hncat_product_category_has_product`.`category_id`
WHERE `wc_hncat_product_category_has_product`.`product_id` =
`wc_hncat_products`.`id`
AND `category_id` IN ( '1' )) count
FROM `wc_hncat_products`
WHERE `id` IN ( '785' )
This query returns one row, and the value of the count column is 1.
No results when the subquery count is compared in the WHERE clause:
SELECT `wc_hncat_products`.`id`
FROM `wc_hncat_products`
WHERE (SELECT Count(*)
FROM `wc_hncat_product_categories`
INNER JOIN `wc_hncat_product_category_has_product`
ON `wc_hncat_product_categories`.`id` =
`wc_hncat_product_category_has_product`.`category_id`
WHERE `wc_hncat_product_category_has_product`.`product_id` =
`wc_hncat_products`.`id`
AND `category_id` IN ( '1' )) >= 1
AND `id` IN ( '785' )
This query selects 0 rows.
How is this possible? The count is clearly 1, yet the comparison fails and no rows are returned, even though the subquery is identical in both queries.

The standard way to implement that type of check is to use EXISTS.
Something like this should work for you:
SELECT `wc_hncat_products`.`id`
FROM `wc_hncat_products`
WHERE EXISTS (SELECT NULL
FROM `wc_hncat_product_categories`
INNER JOIN `wc_hncat_product_category_has_product`
ON `wc_hncat_product_categories`.`id` =
`wc_hncat_product_category_has_product`.`category_id`
WHERE `wc_hncat_product_category_has_product`.`product_id` =
`wc_hncat_products`.`id`
AND `category_id` IN ( '1' ))
AND `id` IN ( '785' )
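As a possible simplification of the EXISTS check above, here is a minimal sketch that skips the join to wc_hncat_product_categories. It assumes every category_id in wc_hncat_product_category_has_product refers to an existing category (for example through a foreign key), which the question does not confirm. The aliases p and chp are introduced only for this example.

```sql
-- Sketch only: assumes link-table rows always point at an existing category,
-- so the join to wc_hncat_product_categories adds no extra filtering.
SELECT p.`id`
FROM `wc_hncat_products` AS p
WHERE p.`id` IN (785)
  AND EXISTS (SELECT 1
              FROM `wc_hncat_product_category_has_product` AS chp
              WHERE chp.`product_id` = p.`id`
                AND chp.`category_id` IN (1));
```

If orphaned link rows are possible, keep the join from the original answer.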

Related

SQL Execution order (HAVING before SELECT)

Everywhere I looked, I saw that HAVING is executed before SELECT, so why can I refer to the count column (which is created in the SELECT part) in the HAVING part?
SELECT `title`, (SELECT COUNT(*) FROM `comments` WHERE `post_id` = `posts`.`id`) as `count` FROM `posts` HAVING `count` != 0
MySQL extends HAVING so that it can be used much like a WHERE clause, with the added ability to refer to aliases defined at the same level of the query. On most other databases, your query would fail with an error. There, to get the same logic you would have to wrap the query in a subquery and filter on the alias:
SELECT title, count
FROM
(
SELECT title, (SELECT COUNT(*)
FROM comments
WHERE post_id = posts.id) AS count
FROM posts
) t
WHERE count != 0;

Mysql long execution query

I have a table with 38k rows, and I use this query to compare the item id from the items table with the item id in the posted_domains table.
select * from `items`
where `items`.`source_id` = 2 and `items`.`source_id` is not null
and not exists (select *
from `posted_domains`
where `posted_domains`.`item_id` = `items`.`id` and `domain_id` = 1)
order by `item_created_at` asc limit 1
This query took 8s. I don't know whether the problem is my query or a badly configured MySQL. The query is generated by a Laravel relation like this:
$items->doesntHave('posted', 'and', function ($q) use ($domain) {
$q->where('domain_id', $domain->id);
});
Correlated subqueries can be rather slow, as they are often executed repeatedly, once for each row in the outer query. This might be faster:
select *
from `items`
where `items`.`source_id` = 2
and `items`.`source_id` is not null
and `items`.`id` not in (
select DISTINCT item_id
from `posted_domains`
where `domain_id` = 1)
order by `item_created_at` asc
limit 1
I say "might" because subqueries in WHERE are also rather slow in MySQL.
This LEFT JOIN will probably be the fastest:
select *
from `items`
LEFT JOIN (
select DISTINCT item_id
from `posted_domains`
where `domain_id` = 1) AS subQ
ON items.id = subQ.item_id
where `items`.`source_id` = 2
and `items`.`source_id` is not null
and subQ.item_id is null
order by `item_created_at` asc
limit 1;
Since this is a no-matches scenario, it technically doesn't even need a subquery and might be faster as a direct LEFT JOIN, but that will depend on indexes and possibly on the actual data values.
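The direct LEFT JOIN mentioned above could look roughly like the sketch below. It follows the correlation used in the question (posted_domains.item_id = items.id). The alias pd and the index name are made up for illustration, and whether the index helps depends on the real schema and data.

```sql
-- Anti-join: keep only items with no posted_domains row for domain 1.
SELECT `items`.*
FROM `items`
LEFT JOIN `posted_domains` AS pd
       ON pd.`item_id` = `items`.`id`
      AND pd.`domain_id` = 1
WHERE `items`.`source_id` = 2
  AND pd.`item_id` IS NULL          -- no match was found in posted_domains
ORDER BY `items`.`item_created_at` ASC
LIMIT 1;

-- Hypothetical supporting index (name and column order are assumptions):
CREATE INDEX idx_posted_domains_item_domain
    ON `posted_domains` (`item_id`, `domain_id`);
```

The domain_id filter lives in the ON clause rather than in WHERE, so that items without any posted_domains row are still kept by the LEFT JOIN.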

Combine subqueries used to retrieve values for comparison into a single query?

I have an SQL query like this:
SELECT COUNT(*) AS count_value FROM submissions WHERE username = (
SELECT username FROM submissions WHERE id = '1'
) AND number = (
SELECT number FROM submissions WHERE id = '1'
) AND tstmp < (
SELECT tstmp FROM submissions WHERE id = '1'
);
Notice how I am using this query to find all rows with the same username and number as the row with id 1, but with a timestamp earlier than that row's.
This works for me, but I was wondering, is there a way I could combine the three subqueries into one? They all select information from the same table, so I thought it might be possible, but I have no clue how to do it.
I think you could merge the subqueries into one and use it as a derived table in a join. Please try this:
SELECT COUNT(*) AS count_value
FROM submissions s
JOIN (
SELECT username, number, tstmp
FROM submissions WHERE id = 1
) o ON s.number = o.number AND s.username = o.username AND s.tstmp < o.tstmp

Alter and Optimize sql query

I need to change this SQL query so that it does NOT use a subquery with IN; I need the query to run faster.
Here is the query I am working on. The table has about 7 million rows.
SELECT `MovieID`, COUNT(*) AS `Count`
FROM `download`
WHERE `UserID` IN (
SELECT `UserID` FROM `download`
WHERE `MovieID` = 995
)
GROUP BY `MovieID`
ORDER BY `Count` DESC
Thanks
Something like this - but (in the event that you switch to an OUTER JOIN) make sure you're counting the right thing...
SELECT x.MovieID
, COUNT(*) ttl
FROM download x
JOIN download y
ON y.userid = x.userid
AND y.movieid = 995
GROUP
BY x.MovieID
ORDER
BY ttl DESC;
Use EXISTS instead; see Optimizing Subqueries with EXISTS Strategy:
Consider the following subquery comparison:
outer_expr IN (SELECT inner_expr FROM ... WHERE subquery_where)
MySQL evaluates queries “from outside to inside.” That is, it first obtains the value of the outer expression outer_expr, and then runs the subquery and captures the rows that it produces.
A very useful optimization is to “inform” the subquery that the only rows of interest are those where the inner expression inner_expr is equal to outer_expr. This is done by pushing down an appropriate equality into the subquery's WHERE clause. That is, the comparison is converted to this:
EXISTS (SELECT 1 FROM ... WHERE subquery_where AND outer_expr=inner_expr)
After the conversion, MySQL can use the pushed-down equality to limit the number of rows that it must examine when evaluating the subquery.
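Applied to the query from the question, the EXISTS form might look like the sketch below. The aliases d and seed and the index name are assumptions made for this example, and whether it actually beats the IN version depends on the MySQL version and the available indexes.

```sql
-- Count downloads per movie, restricted to users who also downloaded movie 995.
SELECT d.`MovieID`, COUNT(*) AS `Count`
FROM `download` AS d
WHERE EXISTS (SELECT 1
              FROM `download` AS seed
              WHERE seed.`UserID` = d.`UserID`
                AND seed.`MovieID` = 995)
GROUP BY d.`MovieID`
ORDER BY `Count` DESC;

-- Hypothetical index that lets the EXISTS probe be answered from the index alone:
CREATE INDEX idx_download_user_movie ON `download` (`UserID`, `MovieID`);
```

Unlike a plain self-join, EXISTS does not multiply rows if a user downloaded movie 995 more than once, so the counts match the original IN query.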
You can filter directly on MovieID in the WHERE clause without a subquery, although this only counts downloads of movie 995 itself and so answers a narrower question than the original query:
SELECT `MovieID`, COUNT(*) AS `Count`
FROM `download`
WHERE `MovieID` = 995
GROUP BY `MovieID`
ORDER BY `Count` DESC

How to add two SUMs

Why won't the following work?
SELECT SUM(startUserThreads.newForStartUser)+SUM(endUserThreads.newForEndUser) AS numNew ...
It returns an empty string.
The following is returning 1 for my data set however:
SELECT SUM(startUserThreads.newForStartUser) AS numNew ...
How do I add the two sums correctly?
The whole thing:
SELECT t.*,
COUNT(startUserThreads.id) + COUNT(endUserThreads.id) AS numThreads,
SUM(startUserThreads.newForStartUser) + SUM(endUserThreads.newForEndUser) AS numNew
FROM `folder` `t`
LEFT OUTER JOIN `thread` `startUserThreads`
ON ( `startUserThreads`.`startUserFolder_id` = `t`.`id` )
LEFT OUTER JOIN `thread` `endUserThreads`
ON ( `endUserThreads`.`endUserFolder_id` = `t`.`id` )
WHERE user_id = :user
FYI, only two users can share a thread in my model. That should explain my column names.
SELECT COALESCE(SUM(startUserThreads.newForStartUser),0)+COALESCE(SUM(endUserThreads.newForEndUser),0) AS numNew ...
From the MySQL docs:
SUM([DISTINCT] expr)
Returns the sum of expr. If the return set has no rows, SUM() returns
NULL. The DISTINCT keyword can be used in MySQL 5.0 to sum only the
distinct values of expr.
SUM() returns NULL if there were no matching rows.
Aggregate (summary) functions such as COUNT(), MIN(), and SUM() ignore
NULL values. The exception to this is COUNT(*), which counts rows and
not individual column values.
Maybe try COALESCE( SUM(x), 0 ) + COALESCE( SUM(y), 0 )?
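To see why the COALESCE wrapper matters, here is a small self-contained sketch that needs no tables from the question. The derived table d and the column n are made up for illustration. It relies on two documented behaviours: SUM() over zero rows returns NULL, and adding anything to NULL yields NULL.

```sql
-- The derived table d has one row, but WHERE n > 1 filters it out,
-- so SUM(n) runs over zero rows and returns NULL.
SELECT
    (SELECT SUM(n) FROM (SELECT 1 AS n) AS d WHERE n > 1) + 1
        AS plain_sum,
    COALESCE((SELECT SUM(n) FROM (SELECT 1 AS n) AS d WHERE n > 1), 0) + 1
        AS coalesced_sum;
-- plain_sum is NULL, coalesced_sum is 1
```

In the question's query, whichever of the two LEFT JOINs finds no threads plays the role of the empty SUM here, which would explain why the combined total comes back empty.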