How can I optimize the following SQL query - mysql

Right now it takes a very long time to run.
The query is:
select count(id), variety_id, name
from tblItem
where order_id IN (
select order_id
from tblItem
where variety_id=4005
order by order_id DESC)
AND variety_id != 4005
GROUP BY variety_id
order by count(id) DESC
LIMIT 5;
I have indexes on variety_id and order_id. I'm basically trying to build a recommendation engine. The query looks for the top 5 items people buy when they also bought variety_id 4005. But like I said, it takes way too long to run.
Does anyone have a way to optimize this query?

Try this:
select count(t1.id), t1.variety_id, t1.name
from tblItem t1
inner join tblItem t2 ON t2.order_id = t1.order_id and t2.variety_id = 4005
where t1.variety_id != 4005
GROUP BY t1.variety_id, t1.name
ORDER BY count(t1.id) DESC
LIMIT 5;

I've often found that MySQL optimizes WHERE ... IN (SELECT ...) poorly, and JOIN works better; I've read that recent MySQL versions are better, so it may be version-dependent. Also, you should use COUNT(*) unless the column can be NULL and you need to ignore the null values in the count.
SELECT COUNT(*) count, variety_id, name
FROM tblItem AS t1
JOIN (SELECT DISTINCT order_id
FROM tblItem
WHERE variety_id = 4005) AS t2
ON t1.order_id = t2.order_id
WHERE t1.variety_id != 4005
GROUP BY variety_id
ORDER BY count DESC
LIMIT 5
The subquery with DISTINCT is needed to prevent multiplying the counts by the number of matching rows in the cross-product.
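To see the multiplication effect concretely, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. The rows are made up for illustration; order 1 deliberately contains variety 4005 twice, so a plain self-join counts its other items twice.

```python
import sqlite3

# SQLite stands in for MySQL here; the rows below are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblItem (id INTEGER PRIMARY KEY, order_id INT, variety_id INT, name TEXT);
    INSERT INTO tblItem (order_id, variety_id, name) VALUES
        (1, 4005, 'seed'),
        (1, 4005, 'seed'),
        (1, 7, 'pot'),
        (2, 4005, 'seed'),
        (2, 7, 'pot'),
        (2, 9, 'soil');
""")

# Plain self-join: each item is counted once per matching 4005 row in its order.
plain_join = conn.execute("""
    SELECT t1.variety_id, COUNT(*) AS cnt
    FROM tblItem t1
    JOIN tblItem t2 ON t2.order_id = t1.order_id AND t2.variety_id = 4005
    WHERE t1.variety_id != 4005
    GROUP BY t1.variety_id
    ORDER BY cnt DESC, t1.variety_id
""").fetchall()

# Join against DISTINCT order ids: each item is counted once per order.
distinct_join = conn.execute("""
    SELECT t1.variety_id, COUNT(*) AS cnt
    FROM tblItem t1
    JOIN (SELECT DISTINCT order_id FROM tblItem WHERE variety_id = 4005) t2
      ON t1.order_id = t2.order_id
    WHERE t1.variety_id != 4005
    GROUP BY t1.variety_id
    ORDER BY cnt DESC, t1.variety_id
""").fetchall()

print(plain_join)     # variety 7 is over-counted here
print(distinct_join)  # variety 7 is counted once per order
```

If an order can never contain the same variety twice, both forms give the same counts and the DISTINCT is only a safety net.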

Related

how to group by with order by on somewhat complex mysql query

(SELECT dtable.*, new_apps.top as t1 FROM app_detailsvvv as dtable INNER JOIN new_apps ON
new_apps.trackId=dtable.trackId WHERE (SELECT COUNT(*) AS c FROM compositions as dtablev
WHERE parent='169469' AND trackId=dtable.trackId AND new_apps.top > 0)>0) UNION
(SELECT *, 301 FROM app_detailsvvv as dtable WHERE (SELECT COUNT(*) AS c FROM compositions
as dtablev WHERE parent='169469' AND trackId=dtable.trackId)>0) ORDER BY t1 ASC, trackName
ASC LIMIT 0,12
This query returns duplicates. How can I group the results by trackId?

Place trackid as part of your SELECT statement ...
SELECT new_apps.trackId,
..etc...
..
..
GROUP BY 1
Make sure the rest of your SELECT items (including the subquery) are aggregated with a function such as MAX, MIN or SUM, so that each one collapses into a single value per group.

Count subselect elements according to parent id

I'm having this MySQL Query
SELECT
t1.article_id,
t1.user_id,
t1.like_date,
(
SELECT
COUNT(*)
FROM liketbl t2
WHERE
t1.article_id=t2.article_id
) as totallike
FROM liketbl t1
WHERE
user_id = 1;
I need to get the article id, user id and like date in one run, along with the total number of entries.
A subselect is, in my opinion, the easiest way to achieve this.
(I don't want to run several queries in the client environment.)
But it is not working.
I don't know why; help is appreciated.

try this :
SELECT t1.article_id,
t1.user_id,
t1.like_date,
COUNT(*) as totallike
FROM liketbl t1 inner join liketbl t2 on t1.article_id=t2.article_id
WHERE t1.user_id = 1
group by t1.article_id,t1.user_id,t1.like_date;

My guess is that you need to filter on user_id = 1 in the subquery to get what you expect.
The where only operates on the outer select.

This should work (COUNT cannot wrap a SELECT, so the count goes inside the scalar subquery):
SELECT
t1.article_id,
t1.user_id,
t1.like_date,
(SELECT COUNT(*) FROM liketbl t2 WHERE
t1.article_id=t2.article_id ) as totallike
FROM liketbl t1
WHERE
t1.user_id = 1;

Scalar subqueries tend to be the worst case; it's usually more efficient to rewrite them.
Depending on the number of rows in both tables, this is another approach using a derived table:
SELECT
t1.article_id
,t1.user_id
,t1.like_date
,t2.totallike
FROM liketbl t1
JOIN
(
SELECT
article_id
,COUNT(*) AS totallike
FROM liketbl
GROUP BY article_id
) AS t2
ON t1.article_id=t2.article_id
WHERE
user_id = 1;
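As a quick check of the derived-table approach, the sketch below runs it against a few made-up rows, using Python's sqlite3 module in place of MySQL. The sample data is hypothetical: article 10 has three likes, article 20 has one, and user 1 liked both.

```python
import sqlite3

# SQLite stands in for MySQL; the liketbl rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE liketbl (article_id INT, user_id INT, like_date TEXT);
    INSERT INTO liketbl VALUES
        (10, 1, '2013-01-01'),
        (10, 2, '2013-01-02'),
        (10, 3, '2013-01-03'),
        (20, 1, '2013-01-04'),
        (30, 2, '2013-01-05');
""")

# Count likes per article once in a derived table, then join user 1's rows to it.
rows = conn.execute("""
    SELECT t1.article_id, t1.user_id, t1.like_date, t2.totallike
    FROM liketbl t1
    JOIN (SELECT article_id, COUNT(*) AS totallike
          FROM liketbl
          GROUP BY article_id) AS t2
      ON t1.article_id = t2.article_id
    WHERE t1.user_id = 1
    ORDER BY t1.article_id
""").fetchall()

for row in rows:
    print(row)
```

Each of user 1's rows carries the total for its article, counted across all users rather than only user 1.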

Slow MySQL query using LEFT JOIN

I'm using a simple LEFT JOIN query to fetch data from two separate tables. Both have a column named domain, and I join them on this column to calculate a value based on one table's visits and the other table's earnings.
SELECT t1.`domain` AS `domain`,
(SUM(earnings)/SUM(visits)) AS `rpv`
FROM hat_adsense_stats t1
LEFT JOIN hat_analytics_stats t4 ON t4.`domain`=t1.`domain`
WHERE(t1.`hat_analytics_id`='91' OR t1.`hat_analytics_id`='92')
AND t1.`date`>='2013-02-18'
AND t4.`date`>='2013-02-18'
GROUP BY t1.`domain`
ORDER BY rpv DESC
LIMIT 10;
This is the query I run, and it takes 9.060 sec to execute.
The hat_adsense_stats table contains 60887 records.
The hat_analytics_stats table contains 190780 records.
Grouping by domain returns only 186 rows of data that need comparing.
Any suggestions on inefficient code or a better way to solve this would be appreciated!

Thanks raheel for opening the door. This is what worked in the end, with an execution time of 0.051 sec. :)
SELECT
t1.`domain` AS `domain`,
SUM(earnings)/visits AS `rpv`
FROM hat_adsense_stats t1
INNER JOIN (SELECT
domain,
SUM(visits) AS visits
FROM hat_analytics_stats
WHERE `date` >= "2013-02-18"
GROUP BY domain) AS t4
ON t4.domain = t1.domain
WHERE t1.`hat_analytics_id` IN('91','92')
AND t1.`date`>='2013-02-18'
GROUP BY t1.`domain`
ORDER BY rpv DESC
LIMIT 10

Change your query like this
SELECT
t1.`domain` AS `domain`,
t2.earnings/t2.visits AS `rpv`
FROM hat_adsense_stats t1
INNER JOIN (SELECT
domain,
sum(earnings) AS earnings,
SUM(visits) AS visits
FROM hat_adsense_stats
GROUP BY domain) AS t2
on t2.domain = t1.domain
LEFT JOIN hat_analytics_stats t4
ON t4.`domain` = t1.`domain`
WHERE t1.`hat_analytics_id` IN('91','92')
AND t1.`date` >= '2013-02-18'
AND t4.`date` >= '2013-02-18'
GROUP BY t1.`domain`
ORDER BY rpv DESC
LIMIT 10;

The LEFT JOIN is unnecessary, because the WHERE clause checks a column from the right side of the join (t4.`date`), which discards every row where t4 is NULL. An INNER JOIN would work just as well here and might well be quicker.
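The point about the LEFT JOIN can be checked with a small sketch, again using Python's sqlite3 module as a stand-in for MySQL. The tables `adsense` and `analytics` are simplified, hypothetical versions of the ones in the question, and b.com deliberately has no analytics row.

```python
import sqlite3

# SQLite stands in for MySQL; table names and rows are simplified assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE adsense (domain TEXT, date TEXT, earnings REAL);
    CREATE TABLE analytics (domain TEXT, date TEXT, visits INT);
    INSERT INTO adsense VALUES ('a.com', '2013-02-19', 5.0), ('b.com', '2013-02-19', 2.0);
    INSERT INTO analytics VALUES ('a.com', '2013-02-20', 100);
""")

query = """
    SELECT t1.domain, t4.visits
    FROM adsense t1
    {join} analytics t4 ON t4.domain = t1.domain
    {where}
    ORDER BY t1.domain
"""
date_filter = "WHERE t4.date >= '2013-02-18'"

# Without the filter, the LEFT JOIN keeps b.com with NULL on the right side.
left_unfiltered = conn.execute(query.format(join="LEFT JOIN", where="")).fetchall()
# With a WHERE condition on t4.date, the NULL row fails the comparison and is dropped.
left_filtered = conn.execute(query.format(join="LEFT JOIN", where=date_filter)).fetchall()
inner_filtered = conn.execute(query.format(join="INNER JOIN", where=date_filter)).fetchall()

print(left_unfiltered)
print(left_filtered == inner_filtered)
```

If the unmatched rows are actually wanted, the date condition would have to move from the WHERE clause into the ON clause.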

Selecting rows with unique field values in mysql

I have these columns for table comments:
id
content
add_date
uid
school_id
Rows can have the same school_id.
I want to select the latest rows according to add_date, but only 1 row per school_id (no duplicate school_id), with a limit of 10.
I've tried many queries already and none of them work for me.
Any help would be appreciated.

This is what we call Greatest N per Group. You can achieve this by computing the latest add_date per school_id in a subquery and joining it against the non-grouped table (comments).
Try this:
SELECT c.*
FROM
(
SELECT school_id, MAX(add_date) maxDate
FROM comments
GROUP BY school_id
) x INNER JOIN comments c
ON x.school_id = c.school_ID AND
x.maxDate = c.add_date
ORDER BY x.maxDate desc
LIMIT 10
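A minimal sketch of this greatest-n-per-group join, run through Python's sqlite3 module rather than MySQL, with made-up comments for three schools:

```python
import sqlite3

# SQLite stands in for MySQL; the comments rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comments (id INTEGER PRIMARY KEY, content TEXT, add_date TEXT,
                           uid INT, school_id INT);
    INSERT INTO comments (content, add_date, uid, school_id) VALUES
        ('old A',  '2013-01-01', 1, 100),
        ('new A',  '2013-01-05', 2, 100),
        ('only B', '2013-01-03', 3, 200),
        ('old C',  '2013-01-02', 1, 300),
        ('new C',  '2013-01-04', 2, 300);
""")

# x holds the latest add_date per school; the join picks the matching full rows.
latest = conn.execute("""
    SELECT c.school_id, c.content, c.add_date
    FROM (SELECT school_id, MAX(add_date) AS maxDate
          FROM comments
          GROUP BY school_id) x
    JOIN comments c
      ON c.school_id = x.school_id AND c.add_date = x.maxDate
    ORDER BY x.maxDate DESC
    LIMIT 10
""").fetchall()

for row in latest:
    print(row)
```

Note that if two comments for the same school share the exact same add_date, this join returns both of them, so a tie-breaker (for example the highest id) would be needed to guarantee one row per school.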

select C.ID, C.Content, T1.MaxDate as add_date, C.uid, T1.school_id
from (select school_id, max(add_Date) as MaxDate
from comments
group by school_id) T1
inner join comments C on T1.school_id = C.school_id and C.add_Date= T1.MaxDate
LIMIT 10
If you want to choose which 10 rows are returned, add an ORDER BY or a WHERE clause.

select c1.*
from comments c1
where add_date = (select max(add_date) from comments c2 where c2.school_id =c1.school_id)
order by add_date desc
limit 10
Create indexes on comments(add_date) and comments(school_id, add_date).

sql pulling from sub query

Is it possible to pull 2 results from a sub query in a sql statement?
I have:
"SELECT
(SELECT bid FROM auction_bids WHERE itemID=a.id ORDER BY bid DESC LIMIT 1) as topbid,
a.* FROM auction_items a ORDER BY a.date DESC LIMIT 15"
Besides the top bid (as topbid), I'd also like that subquery to return the date of the bid (as topdate). How can I do that? Do I need another subquery, or can one subquery return both?

A dependent subquery (one that depends on values from the outer query, like a.id in your case) is not a very efficient way to find maximum values in subsets.
Instead, use a subquery with GROUP BY:
SELECT b.topbid, b.topdate, a.*
FROM auction_items a
LEFT JOIN
( SELECT itemID, MAX(bid) as topbid, MAX(date) as topdate
FROM auction_bids
GROUP BY itemID ) b
ON a.id = b.itemID
ORDER BY a.date DESC
LIMIT 15
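One caveat worth checking: MAX(bid) and MAX(date) are computed independently, so topdate is the date of the latest bid, which is the top bid's date only if bids always increase over time. The sketch below, using Python's sqlite3 module in place of MySQL and made-up rows where a lower bid is recorded later, compares that query with a hypothetical variant that joins back to the row holding the maximum bid.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are invented, and item 2 has no bids.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE auction_items (id INTEGER PRIMARY KEY, title TEXT, date TEXT);
    CREATE TABLE auction_bids (itemID INT, bid INT, date TEXT);
    INSERT INTO auction_items VALUES (1, 'lamp', '2013-02-01'), (2, 'desk', '2013-02-02');
    INSERT INTO auction_bids VALUES
        (1, 50, '2013-03-01'),
        (1, 40, '2013-03-02');
""")

# Independent aggregates: the highest bid paired with the latest date.
independent = conn.execute("""
    SELECT a.id, b.topbid, b.topdate
    FROM auction_items a
    LEFT JOIN (SELECT itemID, MAX(bid) AS topbid, MAX(date) AS topdate
               FROM auction_bids
               GROUP BY itemID) b
      ON a.id = b.itemID
    ORDER BY a.id
""").fetchall()

# Variant: find the max bid per item, then join back to fetch that bid's own date.
same_row = conn.execute("""
    SELECT a.id, b.bid AS topbid, b.date AS topdate
    FROM auction_items a
    LEFT JOIN (SELECT itemID, MAX(bid) AS topbid
               FROM auction_bids
               GROUP BY itemID) m
      ON m.itemID = a.id
    LEFT JOIN auction_bids b
      ON b.itemID = m.itemID AND b.bid = m.topbid
    ORDER BY a.id
""").fetchall()

print(independent)
print(same_row)
```

If two bids on the same item can tie for the maximum, the variant returns one row per tied bid, so it may need a tie-breaker as well.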
