MySQL long-running query

I have a table with 38k rows, and I use this query to compare the item id from the items table with the item id in the posted_domains table.
select * from `items`
where `items`.`source_id` = 2 and `items`.`source_id` is not null
and not exists (select *
from `posted_domains`
where `posted_domains`.`item_id` = `items`.`id` and `domain_id` = 1)
order by `item_created_at` asc limit 1
This query took 8s. I don't know whether the problem is my query or a badly configured MySQL. The query is generated by Laravel relations like
$items->doesntHave('posted', 'and', function ($q) use ($domain) {
$q->where('domain_id', $domain->id);
});

Correlated subqueries can be rather slow (they are often executed repeatedly, once for each row in the outer query); this might be faster.
select *
from `items`
where `items`.`source_id` = 2
and `items`.`source_id` is not null
and `items`.`id` not in (
select DISTINCT item_id
from `posted_domains`
where `domain_id` = 1)
order by `item_created_at` asc
limit 1
I say "might" because subqueries in a WHERE clause can also be rather slow in MySQL.
This LEFT JOIN will probably be the fastest.
select *
from `items`
LEFT JOIN (
select DISTINCT item_id
from `posted_domains`
where `domain_id` = 1) AS subQ
ON items.id = subQ.item_id
where `items`.`source_id` = 2
and `items`.`source_id` is not null
and subQ.item_id is null
order by `item_created_at` asc
limit 1;
Since this is a no-match scenario, it technically doesn't even need to be a subquery; it might be faster as a direct left join, but that will depend on indexes and possibly on the actual data values.
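To make the comparison concrete, here is a minimal sketch that runs the NOT EXISTS, NOT IN, and direct LEFT JOIN forms against the same rows and checks that they agree. It uses Python's sqlite3 with an in-memory database as a stand-in for MySQL, and the sample rows are made up, so it only illustrates that the forms are equivalent in result, not how fast any of them is on a real MySQL server.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL tables in the question (sample rows are made up).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (id INTEGER PRIMARY KEY, source_id INTEGER, item_created_at TEXT);
CREATE TABLE posted_domains (item_id INTEGER, domain_id INTEGER);
INSERT INTO items VALUES (1, 2, '2020-01-01'), (2, 2, '2020-01-02'),
                         (3, 2, '2020-01-03'), (4, 1, '2020-01-04');
INSERT INTO posted_domains VALUES (1, 1), (2, 5), (3, 1);
""")

NOT_EXISTS = """
SELECT i.id FROM items i
WHERE i.source_id = 2
  AND NOT EXISTS (SELECT 1 FROM posted_domains p
                  WHERE p.item_id = i.id AND p.domain_id = 1)
ORDER BY i.item_created_at
"""

NOT_IN = """
SELECT i.id FROM items i
WHERE i.source_id = 2
  AND i.id NOT IN (SELECT item_id FROM posted_domains WHERE domain_id = 1)
ORDER BY i.item_created_at
"""

# Direct LEFT JOIN without a derived table: the domain filter lives in the ON clause.
LEFT_JOIN = """
SELECT i.id FROM items i
LEFT JOIN posted_domains p ON p.item_id = i.id AND p.domain_id = 1
WHERE i.source_id = 2 AND p.item_id IS NULL
ORDER BY i.item_created_at
"""

def ids(sql):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql)]

results = {
    "not_exists": ids(NOT_EXISTS),
    "not_in": ids(NOT_IN),
    "left_join": ids(LEFT_JOIN),
}
print(results)
```

On real data, running EXPLAIN on each form in MySQL is the way to see which one the optimizer handles best.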

Related

Wrong MySQL subquery count comparison result

So I found a bug in my application: it wasn't showing me a result it should have. I traced it back to the following SQL query (I removed the irrelevant parts).
As you can see, the query selects rows from wc_hncat_products whose ids have a count of >= 1 in the wc_hncat_product_category_has_product table.
If I execute the query with the subquery result as a column, you can see the result is 1. But when I use it in the WHERE clause, the >= 1 comparison fails.
Proof that the subquery DOES return 1:
SELECT `wc_hncat_products`.`id`,
(SELECT Count(*)
FROM `wc_hncat_product_categories`
INNER JOIN `wc_hncat_product_category_has_product`
ON `wc_hncat_product_categories`.`id` =
`wc_hncat_product_category_has_product`.`category_id`
WHERE `wc_hncat_product_category_has_product`.`product_id` =
`wc_hncat_products`.`id`
AND `category_id` IN ( '1' )) count
FROM `wc_hncat_products`
WHERE `id` IN ( '785' )
This query returns one row, with the count column value being 1.
No results with the subquery count comparison in the WHERE clause:
SELECT `wc_hncat_products`.`id`
FROM `wc_hncat_products`
WHERE (SELECT Count(*)
FROM `wc_hncat_product_categories`
INNER JOIN `wc_hncat_product_category_has_product`
ON `wc_hncat_product_categories`.`id` =
`wc_hncat_product_category_has_product`.`category_id`
WHERE `wc_hncat_product_category_has_product`.`product_id` =
`wc_hncat_products`.`id`
AND `category_id` IN ( '1' )) >= 1
AND `id` IN ( '785' )
This query selects 0 rows.
How is this possible? The count is actually 1, but the comparison still fails: no rows are returned even though the subqueries are identical in both scenarios.
The standard way to implement that type of check is to use EXISTS.
Something like this should work for you:
SELECT `wc_hncat_products`.`id`
FROM `wc_hncat_products`
WHERE EXISTS (SELECT NULL
FROM `wc_hncat_product_categories`
INNER JOIN `wc_hncat_product_category_has_product`
ON `wc_hncat_product_categories`.`id` =
`wc_hncat_product_category_has_product`.`category_id`
WHERE `wc_hncat_product_category_has_product`.`product_id` =
`wc_hncat_products`.`id`
AND `category_id` IN ( '1' ))
AND `id` IN ( '785' )
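As a sanity check, here is a small sketch showing that the COUNT(*) >= 1 form and the EXISTS form are expected to select the same rows. It runs on Python's sqlite3 (not MySQL), with made-up rows and with product 786 added as a hypothetical non-matching row, so it shows the intended semantics rather than reproducing whatever caused the empty result in the question.

```python
import sqlite3

# In-memory SQLite stand-in; table names follow the question, rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wc_hncat_products (id INTEGER PRIMARY KEY);
CREATE TABLE wc_hncat_product_categories (id INTEGER PRIMARY KEY);
CREATE TABLE wc_hncat_product_category_has_product (product_id INTEGER, category_id INTEGER);
INSERT INTO wc_hncat_products VALUES (785), (786);
INSERT INTO wc_hncat_product_categories VALUES (1), (2);
INSERT INTO wc_hncat_product_category_has_product VALUES (785, 1), (786, 2);
""")

# Shared correlated subquery body (everything after the SELECT list).
SUBQUERY_BODY = """
    FROM wc_hncat_product_categories c
    JOIN wc_hncat_product_category_has_product cp ON c.id = cp.category_id
    WHERE cp.product_id = p.id AND cp.category_id IN (1)
"""

count_form = f"""
SELECT p.id FROM wc_hncat_products p
WHERE (SELECT COUNT(*) {SUBQUERY_BODY}) >= 1
  AND p.id IN (785, 786)
"""

exists_form = f"""
SELECT p.id FROM wc_hncat_products p
WHERE EXISTS (SELECT NULL {SUBQUERY_BODY})
  AND p.id IN (785, 786)
"""

count_ids = [row[0] for row in conn.execute(count_form)]
exists_ids = [row[0] for row in conn.execute(exists_form)]
print(count_ids, exists_ids)
```

Product 786 is only in category 2, so both forms should skip it and keep 785.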

Mysql very slow subquery optimizing

I am building a SQL query against a large data set, but the query is too slow.
I've got 3 tables: movies, movie_categories, skipped_movies.
The movies table is normalized, and I am trying to query a movie based on a category while excluding ids from the skipped_movies table.
I am trying to use WHERE IN and WHERE NOT IN in my query.
movies table has approx. 2 million rows (id, name, score)
movie_categories approx. 5 million (id, movie_id, category_id)
skipped_movies has approx. 1k rows (id, movie_id, user_id)
When the skipped_movies table is very small (10 - 20 rows) the query is quite fast (about 40 - 50 ms), but when the table grows to around 1k rows the query takes 7 to 8 seconds.
This is the query I'm using.
SELECT SQL_NO_CACHE * FROM `movies` WHERE `id` IN (SELECT `movie_id` FROM `movie_categories` WHERE `category_id` = 1) AND `id` NOT IN (SELECT `movie_id` FROM `skipped_movies` WHERE `user_id` = 1) AND `score` <= 9 ORDER BY `score` DESC LIMIT 1;
I've tried many approaches that came to mind, but this was the fastest one. I even tried the EXISTS method, to no avail.
I'm using SQL_NO_CACHE just for testing.
My guess is that the ORDER BY is what runs slowly.
Assuming that (movie_id, category_id) is unique in the movie_categories table, I'd get the specified result using join operations, rather than subqueries.
To exclude "skipped" movies, an anti-join pattern would suffice... that's a left outer join to find matching rows in skipped_movies, and then a predicate in the WHERE clause to exclude any matches found, leaving only rows that didn't have a match.
SELECT SQL_NO_CACHE m.*
FROM movies m
JOIN movie_categories c
ON c.movie_id = m.id
AND c.category_id = 1
LEFT
JOIN skipped_movies s
ON s.movie_id = m.id
AND s.user_id = 1
WHERE s.movie_id IS NULL
AND m.score <= 9
ORDER
BY m.score DESC
LIMIT 1
And appropriate indexes will likely improve performance...
... ON movie_categories (category_id, movie_id)
... ON skipped_movies (user_id, movie_id)
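A rough sketch of the whole idea, assuming Python's sqlite3 as a stand-in for MySQL: it creates the two composite indexes (the index names idx_mc_category_movie and idx_sm_user_movie are hypothetical), then checks that the original IN / NOT IN query and the join / anti-join rewrite pick the same movie. The rows are made up and the (movie_id, category_id) uniqueness assumed above is enforced explicitly.

```python
import sqlite3

# In-memory SQLite stand-in for the three MySQL tables (rows are made up).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (id INTEGER PRIMARY KEY, name TEXT, score INTEGER);
CREATE TABLE movie_categories (id INTEGER PRIMARY KEY, movie_id INTEGER, category_id INTEGER,
                               UNIQUE (movie_id, category_id));
CREATE TABLE skipped_movies (id INTEGER PRIMARY KEY, movie_id INTEGER, user_id INTEGER);

-- Hypothetical index names; the column order follows the answer above.
CREATE INDEX idx_mc_category_movie ON movie_categories (category_id, movie_id);
CREATE INDEX idx_sm_user_movie ON skipped_movies (user_id, movie_id);

INSERT INTO movies VALUES (1, 'A', 8), (2, 'B', 9), (3, 'C', 7), (4, 'D', 10);
INSERT INTO movie_categories (movie_id, category_id) VALUES (1, 1), (2, 1), (3, 2), (4, 1);
INSERT INTO skipped_movies (movie_id, user_id) VALUES (2, 1);
""")

SUBQUERY_FORM = """
SELECT id, name, score FROM movies
WHERE id IN (SELECT movie_id FROM movie_categories WHERE category_id = 1)
  AND id NOT IN (SELECT movie_id FROM skipped_movies WHERE user_id = 1)
  AND score <= 9
ORDER BY score DESC LIMIT 1
"""

JOIN_FORM = """
SELECT m.id, m.name, m.score
FROM movies m
JOIN movie_categories c ON c.movie_id = m.id AND c.category_id = 1
LEFT JOIN skipped_movies s ON s.movie_id = m.id AND s.user_id = 1
WHERE s.movie_id IS NULL
  AND m.score <= 9
ORDER BY m.score DESC LIMIT 1
"""

subquery_rows = conn.execute(SUBQUERY_FORM).fetchall()
join_rows = conn.execute(JOIN_FORM).fetchall()
print(subquery_rows, join_rows)
```

Movie 2 is skipped by user 1 and movie 4 scores above 9, so both queries should fall through to movie 1.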
Most IN/NOT IN queries can be expressed using JOIN/LEFT JOIN, which usually gives the best performance.
Convert your query to use joins:
SELECT m.*
FROM movies m
JOIN movie_categories mc ON m.id = mc.movie_id AND mc.category_id = 1
LEFT JOIN skipped_movies sm ON m.id = sm.movie_id AND sm.user_id = 1
WHERE sm.movie_id IS NULL
AND score <= 9
ORDER BY score DESC
LIMIT 1
Your query seems to be all right; it just needs a small tweak. You can replace * with the names of the columns you actually need, which may make the query faster, since selecting every column with * can be slow.

Alter and Optimize sql query

I need to change this SQL query so that it does NOT use a subquery with IN; I need the query to run faster.
Here is the query I am working on. The table has about 7 million rows.
SELECT `MovieID`, COUNT(*) AS `Count`
FROM `download`
WHERE `UserID` IN (
SELECT `UserID` FROM `download`
WHERE `MovieID` = 995
)
GROUP BY `MovieID`
ORDER BY `Count` DESC
Thanks
Something like this - but (in the event that you switch to an OUTER JOIN) make sure you're counting the right thing...
SELECT x.MovieID
, COUNT(*) ttl
FROM download x
JOIN download y
ON y.userid = x.userid
AND y.movieid = 995
GROUP
BY x.MovieID
ORDER
BY ttl DESC;
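On "counting the right thing": the sketch below (Python's sqlite3 standing in for MySQL, made-up rows) compares the IN form with the self-join. The two agree while each (UserID, MovieID) pair appears once, but once a user has downloaded movie 995 twice, the join multiplies that user's rows and the counts drift apart. The secondary sort on MovieID is only there to make ties deterministic.

```python
import sqlite3

# In-memory SQLite stand-in for the download table (rows are made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE download (UserID INTEGER, MovieID INTEGER)")
conn.executemany(
    "INSERT INTO download VALUES (?, ?)",
    [(1, 995), (1, 10), (1, 20), (2, 995), (2, 10), (3, 10), (3, 30)],
)

IN_FORM = """
SELECT MovieID, COUNT(*) AS ttl FROM download
WHERE UserID IN (SELECT UserID FROM download WHERE MovieID = 995)
GROUP BY MovieID
ORDER BY ttl DESC, MovieID
"""

JOIN_FORM = """
SELECT x.MovieID, COUNT(*) AS ttl
FROM download x
JOIN download y ON y.UserID = x.UserID AND y.MovieID = 995
GROUP BY x.MovieID
ORDER BY ttl DESC, x.MovieID
"""

def run(sql):
    """Return all rows of a query as a list of tuples."""
    return conn.execute(sql).fetchall()

# Each (UserID, MovieID) pair is unique here, so both forms agree.
unique_in, unique_join = run(IN_FORM), run(JOIN_FORM)

# User 1 downloads movie 995 a second time: the join now matches two y rows per x row.
conn.execute("INSERT INTO download VALUES (1, 995)")
dup_in, dup_join = run(IN_FORM), run(JOIN_FORM)

print(unique_in, unique_join)
print(dup_in, dup_join)
```

If repeat downloads are possible in the real table, COUNT(DISTINCT ...) or a de-duplicated derived table for y would be options worth testing.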
Use Exists instead, see Optimizing Subqueries with EXISTS Strategy:
Consider the following subquery comparison:
outer_expr IN (SELECT inner_expr FROM ... WHERE subquery_where)
MySQL evaluates queries “from outside to inside.” That is, it first obtains the value of the outer expression outer_expr, and then runs the subquery and captures the rows that it produces.
A very useful optimization is to “inform” the subquery that the only rows of interest are those where the inner expression inner_expr is equal to outer_expr. This is done by pushing down an appropriate equality into the subquery's WHERE clause. That is, the comparison is converted to this:
EXISTS (SELECT 1 FROM ... WHERE subquery_where AND outer_expr=inner_expr)
After the conversion, MySQL can use the pushed-down equality to limit the number of rows that it must examine when evaluating the subquery.
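Applied by hand to the download query, the rewrite might look like the sketch below. It runs on Python's sqlite3 rather than MySQL, the rows are made up (including one repeated download), and the index name idx_download_movie_user is hypothetical; the point is only that the EXISTS form with the pushed-down equality returns the same counts as the IN form.

```python
import sqlite3

# In-memory SQLite stand-in for the download table (rows are made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE download (UserID INTEGER, MovieID INTEGER)")
# Hypothetical index to support the correlated lookup on (MovieID, UserID).
conn.execute("CREATE INDEX idx_download_movie_user ON download (MovieID, UserID)")
conn.executemany(
    "INSERT INTO download VALUES (?, ?)",
    [(1, 995), (1, 995), (1, 10), (1, 20), (2, 995), (2, 10), (3, 10), (3, 30)],
)

IN_FORM = """
SELECT MovieID, COUNT(*) AS ttl FROM download
WHERE UserID IN (SELECT UserID FROM download WHERE MovieID = 995)
GROUP BY MovieID
ORDER BY ttl DESC, MovieID
"""

# Same filter with the equality pushed into the subquery's WHERE clause.
EXISTS_FORM = """
SELECT d.MovieID, COUNT(*) AS ttl FROM download d
WHERE EXISTS (SELECT 1 FROM download u
              WHERE u.MovieID = 995 AND u.UserID = d.UserID)
GROUP BY d.MovieID
ORDER BY ttl DESC, d.MovieID
"""

in_rows = conn.execute(IN_FORM).fetchall()
exists_rows = conn.execute(EXISTS_FORM).fetchall()
print(in_rows, exists_rows)
```

Unlike a plain self-join, EXISTS only tests for a match, so the repeated download by user 1 does not inflate the counts.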
Filter directly on MovieID; you do not need a subquery. It can be done by using MovieID = 995 in the WHERE clause.
SELECT `MovieID`, COUNT(*) AS `Count`
FROM `download`
WHERE `MovieID` = 995
GROUP BY `MovieID`
ORDER BY `Count` DESC

What is faster in MySQL? WHERE sub request = 0 or IN list

I was wondering which is better in MySQL. I have a SELECT query that excludes every entry associated with a banned userID.
Currently I have a subquery in the WHERE clause that goes like
AND (SELECT COUNT(*)
FROM TheBlackListTable
WHERE userID = userList.ID
AND blackListedID = :userID2 ) = 0
which accepts every userID not present in TheBlackListTable.
Would it be faster to first retrieve all banned IDs in a previous request and replace the previous clause with
AND creatorID NOT IN listOfBannedID
LEFT JOIN / IS NULL and NOT IN are fastest:
SELECT *
FROM mytable
WHERE id NOT IN
(
SELECT userId
FROM blacklist
WHERE blackListedID = :userID2
)
or
SELECT m.*
FROM mytable m
LEFT JOIN
blacklist b
ON b.userId = m.id
AND b.blackListedID = :userID2
WHERE b.userId IS NULL
NOT EXISTS yields the same plan but due to implementation flaws is marginally less efficient:
SELECT *
FROM mytable m
WHERE NOT EXISTS
(
SELECT NULL
FROM blacklist b
WHERE b.userId = m.id
AND b.blacklistedId = :userID2
)
All these queries stop on the first match in blacklist (hence performing a semi-join)
The COUNT(*) solution is the least efficient, since MySQL will calculate the actual COUNT(*) rather than stopping on the first match.
However, if you have a UNIQUE index on (userId, blacklistedId), this is not much of a problem, as there cannot be more than one match anyway.
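For reference, a minimal sketch that runs all four variants (COUNT(*) = 0, NOT IN, LEFT JOIN / IS NULL, NOT EXISTS) on the same made-up rows and checks that they return the same ids. It uses Python's sqlite3 as a stand-in for MySQL, so it says nothing about relative speed; the unique index name uq_blacklist_pair is hypothetical.

```python
import sqlite3

# In-memory SQLite stand-in for mytable / blacklist (rows are made up).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (id INTEGER PRIMARY KEY);
CREATE TABLE blacklist (userId INTEGER NOT NULL, blackListedID INTEGER NOT NULL);
-- Hypothetical name for the UNIQUE index mentioned above.
CREATE UNIQUE INDEX uq_blacklist_pair ON blacklist (userId, blackListedID);
INSERT INTO mytable VALUES (1), (2), (3), (4);
INSERT INTO blacklist VALUES (2, 9), (3, 9), (4, 7);
""")

QUERIES = {
    "count_zero": """
        SELECT m.id FROM mytable m
        WHERE (SELECT COUNT(*) FROM blacklist b
               WHERE b.userId = m.id AND b.blackListedID = :userID2) = 0
        ORDER BY m.id""",
    "not_in": """
        SELECT m.id FROM mytable m
        WHERE m.id NOT IN (SELECT userId FROM blacklist
                           WHERE blackListedID = :userID2)
        ORDER BY m.id""",
    "left_join": """
        SELECT m.id FROM mytable m
        LEFT JOIN blacklist b ON b.userId = m.id AND b.blackListedID = :userID2
        WHERE b.userId IS NULL
        ORDER BY m.id""",
    "not_exists": """
        SELECT m.id FROM mytable m
        WHERE NOT EXISTS (SELECT NULL FROM blacklist b
                          WHERE b.userId = m.id AND b.blackListedID = :userID2)
        ORDER BY m.id""",
}

# Bind the named parameter :userID2 to 9 for every variant.
results = {
    name: [row[0] for row in conn.execute(sql, {"userID2": 9})]
    for name, sql in QUERIES.items()
}
print(results)
```

The userId column is declared NOT NULL here on purpose: NOT IN behaves differently from the other forms when the subquery can return NULL.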
Use a NOT EXISTS clause to check that the user is not in the blacklist.
Sample query:
Select * from userList
where not exists( Select 1 from TheBlackListTable where userID = userList.ID and blackListedID = :userID2)
The IN clause is better suited to a fixed set of values or a small number of values.

How can I make a WHERE clause only apply to the right table in a left join?

I have two tables.
TableA: field_definitions
field_id, field_type, field_length, field_name, field_desc, display_order, field_section, active
TableB: user_data
response_id, user_id, field_id, user_response
I need a query that will return all rows from table A and, if they exist, matching rows from table B based on a particular user_id.
Here is what I have so far...
SELECT field_definitions. * , user_data.user_response
FROM field_definitions
LEFT JOIN user_data
USING ( field_id )
WHERE (
user_data.user_id =8
OR user_data.user_id IS NULL
)
AND field_definitions.field_section =1
AND field_definitions.active =1
ORDER BY display_order ASC
This only works if table B has zero rows or matching rows for the user_id in the WHERE clause. If table B has rows with matching field_id but not user_id, I get zero returned rows.
Essentially, once rows in table B exist for user X, the query no longer returns rows from table A when searching for user Z responses and none are found.
I need the result to always contain rows from table A even if there are no matching rows in B with the correct user_id.
You can move those constraints from the WHERE clause to the ON clause (which first requires that you change the USING clause into an ON clause: ON clauses are much more flexible than USING clauses). So:
SELECT field_definitions.*,
user_data.user_response
FROM field_definitions
LEFT
JOIN user_data
ON user_data.field_id = field_definitions.field_id
AND user_data.user_id = 8
WHERE field_definitions.field_section = 1
AND field_definitions.active = 1
ORDER
BY field_definitions.display_order ASC
;
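The difference is easy to see on a tiny data set. The sketch below uses Python's sqlite3 as a stand-in for MySQL, a trimmed-down column list, and made-up rows: user 5 has answered both fields, user 8 only the first. Both queries are written with ON (instead of USING) so that the only difference is where the user_id filter sits.

```python
import sqlite3

# In-memory SQLite stand-in; columns are a subset of the question's tables, rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE field_definitions (field_id INTEGER PRIMARY KEY, field_name TEXT,
                                display_order INTEGER, field_section INTEGER, active INTEGER);
CREATE TABLE user_data (response_id INTEGER PRIMARY KEY, user_id INTEGER,
                        field_id INTEGER, user_response TEXT);
INSERT INTO field_definitions VALUES (1, 'name', 1, 1, 1), (2, 'city', 2, 1, 1);
INSERT INTO user_data VALUES (1, 5, 1, 'Ann'), (2, 5, 2, 'Oslo'), (3, 8, 1, 'Bob');
""")

# Filter in WHERE: field 2 joins only to user 5's row, which the WHERE then discards.
WHERE_FILTER = """
SELECT f.field_id, u.user_response
FROM field_definitions f
LEFT JOIN user_data u ON u.field_id = f.field_id
WHERE (u.user_id = 8 OR u.user_id IS NULL)
  AND f.field_section = 1 AND f.active = 1
ORDER BY f.display_order
"""

# Filter in ON: rows of other users never match, so field 2 survives with a NULL response.
ON_FILTER = """
SELECT f.field_id, u.user_response
FROM field_definitions f
LEFT JOIN user_data u ON u.field_id = f.field_id AND u.user_id = 8
WHERE f.field_section = 1 AND f.active = 1
ORDER BY f.display_order
"""

where_rows = conn.execute(WHERE_FILTER).fetchall()
on_rows = conn.execute(ON_FILTER).fetchall()
print(where_rows)
print(on_rows)
```

With the filter in WHERE, field 2 disappears because another user's row matched the join first; with the filter in ON, every field definition is kept.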
Conceptually, the join is performed first and then the where clause is applied to the virtual resultset. If you want to filter one table first, you have to code that as a sub-select inside the join. Something along these lines:
SELECT
field_definitions. * ,
user8.user_response
FROM
field_definitions
LEFT JOIN (select * from user_data where user_id=8 or user_id is null) as user8
USING ( field_id )
WHERE
field_definitions.field_section =1
AND field_definitions.active =1
ORDER BY display_order ASC
You can move the WHERE clause inside as follows
SELECT field_definitions. * , user_data.user_response
FROM (
select * from
field_definitions
WHERE field_definitions.field_section =1
AND field_definitions.active =1 ) as field_definitions
LEFT JOIN (
select * from
user_data
where user_data.user_id =8
OR user_data.user_id IS NULL ) as user_data
USING ( field_id )
ORDER BY display_order ASC
A literal translation of the spec:
SELECT field_definitions. * , '{{MISSING}}' AS user_response
FROM field_definitions
UNION
SELECT field_definitions. * , user_data.user_response
FROM field_definitions
NATURAL JOIN user_data
WHERE user_data.user_id = 8;
However, I suspect that you don't really want "all rows from table A".