Everywhere where I looked I saw that HAVING is executed before SELECT, then why can I refer to the row count (that is created in the SELECT part) in the HAVING part?
SELECT `title`, (SELECT COUNT(*) FROM `comments` WHERE `post_id` = `posts`.`id`) as `count` FROM `posts` HAVING `count` != 0
MySQL has uniquely overloaded its HAVING operator such that it may be used similar to a WHERE clause, with the added functionality that it may refer to aliases defined at the same level of the query. On other databases, your query would fail with a syntax error. In that case, to get the same logic you would have to subquery to filter on the alias:
SELECT title, count
FROM
(
SELECT title, (SELECT COUNT(*)
FROM comments
WHERE post_id = posts.id) AS count
FROM posts
) t
WHERE count != 0;
Related
Im trying to select * from all duplicate rows in users, where a duplicate is defined as two users sharing the same first_name and last_name. (I need to process the other columns that might differ)
Im using MySQL 8.0.28.
My first try was to literally translate my requirement:
select * from `users` AS u1 where exists (select 1 from `users` AS u2 WHERE `u2`.`first_name` = `u1`.`first_name` AND `u2`.`last_name` = `u1`.`last_name` AND `u2`.`id` != `u1`.`id`)
Which, obviously, has a horrendous execution time.
My current query is
SELECT * from users where Concat(first_name," ",last_name) IN (select Concat(first_name," ",last_name) from `users` GROUP BY first_name, last_name HAVING COUNT(*)>1)
which is vastly more efficient, but still takes more than 100ms for 8000 records. I suppose a solution that doesn't use concat could benefit from indicies and would not need to calculate the result for each row.
Also, I couldn't get group by to work because I need so select all columns of all rows that are duplicates, not just the distinct first_name's and last_name's. Also because I don't want to disable ONLY_FULL_GROUP_BY (not sure if disabling that would help anyway).
Is there a more efficient, proper way to select these duplicate rows?
I would just use an aggregation approach here:
SELECT *
FROM users
WHERE (first_name, last_name) IN (
SELECT first_name, last_name
FROM users
GROUP BY 1, 2
HAVING COUNT(*) > 1
);
On MySQL 8+, we can also use COUNT() as an analytic function here:
WITH cte AS (
SELECT *, COUNT(*) OVER (PARTITION BY first_name, last_name) AS cnt
FROM users
)
SELECT *
FROM cte
WHERE cnt > 1;
I want to select all the matching results in a database table with also random results but with the matching results being at the top. With the way, I am doing now I am using two queries first one being the matching query, and if the count is zero I now select random results. I would like to do this with just one query.
You could attempt using a UNION ALL query as follows.
select product_name,price
from marketing_table
where price >=5000 /*user supplied filter*/
and price <=10000 /*user supplied filter*/
union all
select m.product_name,m.price
from marketing_table m
where not exists (select *
from marketing_table m1
where m1.price >=5000 /*user supplied filter*/
and m1.price <=10000 /*user supplied filter*/
)
What I understand from you comment, you may try something simple like this first:
SET #product := 'purse'; -- search term
SELECT * FROM product
ORDER BY product_name LIKE CONCAT('%',#product,'%') DESC, price ASC;
This is the simplest I can think of and it could be a starting point for you.
Here's a demo : https://www.db-fiddle.com/f/31jrR27dFJqYQQigzBqLcs/2
If this is not what you want, you have to edit your question and insert some example data with expected output. Your current question tend to be flagged as too broad and need focus/clarity.
Did you try using a UNION subquery with a LIMIT?
SELECT *
FROM (
SELECT 0 priority, t.*
FROM first_table t
UNION ALL
SELECT 1 priority, t.*
FROM second_table t
)
ORDER BY priority
LIMIT 20
If you do not want to include any second_table records if first_table returns, you would need to do a subquery on the second query to confirm that no rows exist.
SELECT *
FROM (
SELECT 0 priority, t.*
FROM first_table t
UNION ALL
SELECT 1 priority, t.*
FROM second_table t
LEFT JOIN (SELECT ... FROM first_table) a
WHERE a.id IS NULL
)
ORDER BY priority
LIMIT 20
I think it would be possible to use the Common Table Expressions (CTE) feature in MySQL 8, if you are using that version.
https://dev.mysql.com/doc/refman/8.0/en/with.html
So I found a bug in my application, it wasn't showing me a result it should have. I traced it back to the following SQL query (I removed the irrelevant parts).
As you can see the query selects rows from wc_hncat_products of which the corresponding id's have a count of >= 1 inside the wc_hncat_product_category_has_product table
If I execute the query with the subquery result as a column, you can see the result is 1. But when I use it in the WHERE clause the >= 1 comparison fails.
Proof that the subquery DOES return 1:
SELECT `wc_hncat_products`.`id`,
(SELECT Count(*)
FROM `wc_hncat_product_categories`
INNER JOIN `wc_hncat_product_category_has_product`
ON `wc_hncat_product_categories`.`id` =
`wc_hncat_product_category_has_product`.`category_id`
WHERE `wc_hncat_product_category_has_product`.`product_id` =
`wc_hncat_products`.`id`
AND `category_id` IN ( '1' )) count
FROM `wc_hncat_products`
WHERE `id` IN ( '785' )
This query returns one row, with the column count value being 1
No results with subquery count comparison in WHERE clause
SELECT `wc_hncat_products`.`id`
FROM `wc_hncat_products`
WHERE (SELECT Count(*)
FROM `wc_hncat_product_categories`
INNER JOIN `wc_hncat_product_category_has_product`
ON `wc_hncat_product_categories`.`id` =
`wc_hncat_product_category_has_product`.`category_id`
WHERE `wc_hncat_product_category_has_product`.`product_id` =
`wc_hncat_products`.`id`
AND `category_id` IN ( '1' )) >= 1
AND `id` IN ( '785' )
This query selects 0 rows..
How is this possible? You can see the count actually is 1, but the comparison still fails as no results are being returned while the subqueries are identical in both scenarios.
The standard way to implement that type of check is to use EXISTS.
Something like this should work for you:
SELECT `wc_hncat_products`.`id`
FROM `wc_hncat_products`
WHERE EXISTS (SELECT NULL
FROM `wc_hncat_product_categories`
INNER JOIN `wc_hncat_product_category_has_product`
ON `wc_hncat_product_categories`.`id` =
`wc_hncat_product_category_has_product`.`category_id`
WHERE `wc_hncat_product_category_has_product`.`product_id` =
`wc_hncat_products`.`id`
AND `category_id` IN ( '1' ))
AND `id` IN ( '785' )
I need to please change this SQL query to NOT use sub-query with IN, I need for this query to work faster.
here is the query i am working on. About 7 million rows.
SELECT `MovieID`, COUNT(*) AS `Count`
FROM `download`
WHERE `UserID` IN (
SELECT `UserID` FROM `download`
WHERE `MovieID` = 995
)
GROUP BY `MovieID`
ORDER BY `Count` DESC
Thanks
Something like this - but (in the event that you switch to an OUTER JOIN) make sure you're counting the right thing...
SELECT MovieID
, COUNT(*) ttl
FROM download x
JOIN download y
ON y.userid = x.userid
AND y.movieid = 995
GROUP
BY x.MovieID
ORDER
BY ttl DESC;
Use Exists instead, see Optimizing Subqueries with EXISTS Strategy:
Consider the following subquery comparison:
outer_expr IN (SELECT inner_expr FROM ... WHERE subquery_where) MySQL
evaluates queries “from outside to inside.” That is, it first obtains
the value of the outer expression outer_expr, and then runs the
subquery and captures the rows that it produces.
A very useful optimization is to “inform” the subquery that the only
rows of interest are those where the inner expression inner_expr is
equal to outer_expr. This is done by pushing down an appropriate
equality into the subquery's WHERE clause. That is, the comparison is
converted to this:
EXISTS (SELECT 1 FROM ... WHERE subquery_where AND
outer_expr=inner_expr) After the conversion, MySQL can use the
pushed-down equality to limit the number of rows that it must examine
when evaluating the subquery.
filter direct on movieId..you does not need to add sub query. it can be done by using movieID =995 in where clause.
SELECT `MovieID`, COUNT(*) AS `Count`
FROM `download`
WHERE `MovieID` = 995
GROUP BY `MovieID`
ORDER BY `Count` DESC
I have a table which counts occurrences of one specific action by different users on different objects:
CREATE TABLE `Actions` (
`object_id` int(10) unsigned NOT NULL,
`user_id` int(10) unsigned NOT NULL,
`actionTime` datetime
);
Every time a user performs this action, a row is inserted. I can count how many actions were performed on each object, and order objects by 'activity':
SELECT object_id, count(object_id) AS action_count
FROM `Actions`
GROUP BY object_id
ORDER BY action_count;
How can I limit the results to the top n objects? The LIMIT clause is applied before the aggregation, so it leads to wrong results. The table is potentially huge (millions of rows) and I probably need to count tens of times per minute, so I'd like to do this as efficient as possible.
edit: Actually, Machine is right, and I was wrong with the time at which LIMIT is applied. My query returned the correct results, but the GUI presenting them to me threw me off...this kind of makes this question pointless. Sorry!
Actually... LIMIT is applied last, after a eventual HAVING clause. So it should not give you incorrect results. However, since LIMIT is applied last, it will not provide any faster execution of your query, since a temporary table will have to be created and sorted in order of action count before chopping off the result. Also, remember to sort in descending order:
SELECT object_id, count(object_id) AS action_count
FROM `Actions`
GROUP BY object_id
ORDER BY action_count DESC
LIMIT 10;
You could try adding an index to object_id for optimization. In that way only the index will need to be scanned instead of the Actions table.
How about:
SELECT * FROM
(
SELECT object_id, count(object_id) AS action_count
FROM `Actions`
GROUP BY object_id
ORDER BY action_count
)
LIMIT 15
Also, if you have some measure of what must be the minimum number of actions to be included (e.g. the top n ones are surely more than 1000), you can increase the efficiency by adding a HAVING clause:
SELECT * FROM
(
SELECT object_id, count(object_id) AS action_count
FROM `Actions`
GROUP BY object_id
HAVING action_count > 1000
ORDER BY action_count
)
LIMIT 15
I know this thread is 2 years old but stackflow still finds it relevant so here goes my $0.02. ORDER BY clauses are computationally very expensive so they should be avoided in large tables. A trick I used (in part from Joe Celko's SQL for Smarties) is something like:
SELECT COUNT(*) AS counter, t0.object_id FROM (SELECT COUNT(*), actions.object_id FROM actions GROUP BY id) AS t0, (SELECT COUNT(*), actions.object_id FROM actions GROUP BY id) AS t1 WHERE t0.object_id < t1.object_id GROUP BY object_id HAVING counter < 15
Will give you the top 15 edited objects without sorting. Note that as of v5, mysql will only cache result sets for exactly duplicate (whitespace incl) queries so the nested query will not get cached. Using a view would resolve that problem.
Yes, it's three queries instead of two and and the only gain is the not having to sort the grouped query but if you have a lot of groups, it will be faster.
Side note: the query is really handy for median functions w/o sorts
SELECT * FROM (SELECT object_id, count(object_id) AS action_count
FROM `Actions`
GROUP BY object_id
ORDER BY action_count) LIMIT 10;