Couchbase different result of nested loop WHERE clause - couchbase

I am facing a wierd issue in couchbase: i was executing the following two queries:
SELECT *
FROM (
SELECT *
FROM (
SELECT *
FROM ssb_lineorder
LIMIT 10000) AS cte0
INNER JOIN ssb_ddate ON cte0.ssb_lineorder.lo_orderdate = ssb_ddate.d_datekey) AS cte1
JOIN ssb_part USE NL ON cte1.cte0.ssb_lineorder.lo_partkey = ssb_part.p_partkey
WHERE ssb_part.p_size > 10
and
SELECT *
FROM (
SELECT *
FROM (
SELECT *
FROM (
SELECT *
FROM ssb_lineorder
LIMIT 10000) AS cte0
INNER JOIN ssb_ddate ON cte0.ssb_lineorder.lo_orderdate = ssb_ddate.d_datekey) AS cte1
JOIN ssb_part USE NL ON cte1.cte0.ssb_lineorder.lo_partkey = ssb_part.p_partkey ) AS cte2
WHERE cte2.ssb_part.p_size > 10
These two are exactly the same except the final WHERE clause. According to my knowledge of relational DBMS, the results should be exactly the same. but I am getting different result: 1 for the first query, 7972 for the second query.
I am wondering if I misunderstood the n1ql mechenism ?

There should not be any different.
LIMIT inside without order by can cause inconsistent results. 1 vs 7972 that is way off.
As this data dependent you need to debug that.
Execute UI and go to Plan Text tab and take look ItemsIn#, ItemsOut# of each operator and take look where things gone wrong.
Also add predicate and reduce data and see what is wrong.
As no OUTER JOIN try the following.
CREATE INDEX ix1 ON ssb_part(p_size, p_partkey);
CREATE INDEX ix2 ON ssb_lineorder(lo_partkey, lo_orderdate);
CREATE INDEX ix3 ON ssb_ddate(d_datekey);
SELECT *
FROM ssb_part AS sp
JOIN ssb_lineorder AS sl ON sp.p_partkey = sl.lo_partkey
JOIN ssb_ddate AS sd ON sl.lo_orderdate = sd.d_datekey
WHERE sp.p_size > 10
SELECT *
FROM ssb_part AS sp
JOIN ssb_lineorder AS sl USE HASH (PROBE) ON sp.p_partkey = sl.lo_partkey
JOIN ssb_ddate AS sd USE HASH (PROBE) ON sl.lo_orderdate = sd.d_datekey
WHERE sp.p_size > 10 ;

Related

How to optimize this query as this strted hanging due to large "IN" clause

I have followinf query:
SELECT DISTINCT (
device_id
)
FROM ABC
WHERE app_id ='$appid'
AND device_type='Android'
AND device_id
IN (
SELECT device_id
FROM XYZ
WHERE application_id = '$appid'
AND device_type='Android'
AND device_mode='$device_mode'
)
In the IN clause I am having array of 6000+ items. So it start hanging. Please let me know how can I optimize this.
PS: I have read other solution, try to use join but still its hanging.
I didn't test the query, but it shouldn't be difficult to adjust this or rewrite it as necessary. Instead of large IN() list, simply perform an INNER JOIN and supply correct WHERE
SELECT first.device_id
FROM ABC first
INNER JOIN XYZ second ON first.device_id = second.device_id
WHERE first.app_id = '$appid'
AND first.device_type = 'Android'
AND second.application_id = '$appid'
AND second.device_type = 'Android'
AND second.device_mode = '$device_mode'
GROUP BY first.device_id

MySQL Query Optimization for the derived & Sub Query Combination queries

The following sort of the queries are running on the server which uses the derived table and subquery. The constraint is that the subqueries are generated from the multiple modules based on the current situation so cannot really convert it into the join combination.
Please suggest the possible solution to optimize the query
SELECT COUNT(1)
AS total
FROM member tlb_m
where tlb_m.active = 1
and tlb_m.rank > 0
and tlb_m.member_id not in (5735,134,241,1055,348,272,476,43,7,804,7548,90,229,346,40895)
and tlb_m.type = 'M'
and (tlb_m.hometown_list_id in
(SELECT l2.list_id
FROM ((
SELECT t12.list_id
from list_tree_idx t12
INNER JOIN list_tree_idx t11
ON t12.list_parent_id=t11.list_id
where t11.list_parent_id='205546'
) UNION ALL (
SELECT list_id
from list_tree_idx
where list_parent_id='205546'
) ) as l2
) or tlb_m.hometown_list_id = 205546
)
I would suggest to use a closure table for optimal hierarchical queries.
For example, having a closure table with columns ANCESTOR_ID, CHILD_ID and DEPTH your query will look like this
SELECT COUNT(1) AS total
FROM member AS tlb_m
LEFT JOIN hometown_closure AS c ON c.child_id = tlb_m.hometown_list_id
where tlb_m.active = 1
and tlb_m.rank > 0
and tlb_m.member_id not in (5735,134,241,1055,348,272,476,43,7,804,7548,90,229,346,40895)
and tlb_m.type = 'M'
and c.ancestor_id = 205546

Query for self join in MySQL

I have a query which gets the correct result but it is taking 5.5 sec to get the output.. Is there any other way to write a query for this -
SELECT metricName, metricValue
FROM Table sm
WHERE createdtime = (
SELECT MAX(createdtime)
FROM Table b
WHERE sm.metricName = b.metricName
AND b.sinkName='xx'
)
AND sm.sinkName='xx'
In your code, the subselect has to be run for every result row of the outer query, which should be quite expensive. Instead, you could select your filter data in a separate query and join both accordingly:
SELECT `metricName`, `metricValue` FROM Table sm
INNER JOIN (SELECT max(`createdtime`) AS `maxTime, `metricName` from Table b WHERE b.sinkName='xx' GROUP BY `metricName` ) filter
ON (sm.`createdtime` = filter.`maxTime`) AND ( sm.`metricName` = filter.`metricName`)
WHERE sm.sinkName='xx'

How to optimize this complected query?

While working with following query on mysql, Its getting locked,
SELECT event_list.*
FROM event_list
INNER JOIN members
ON members.profilenam=event_list.even_loc
WHERE (even_own IN (SELECT frd_id
FROM network
WHERE mem_id='911'
GROUP BY frd_id)
OR even_own = '911' )
AND event_list.even_active = 'y'
GROUP BY event_list.even_id
ORDER BY event_list.even_stat ASC
The Inner query inside IN constraint has many frd_id, So because of that above query is slooow..., So please help.
Thanks.
Try this:
SELECT el.*
FROM event_list el
INNER JOIN members m ON m.profilenam = el.even_loc
WHERE el.even_active = 'y' AND
(el.even_own = 911 OR EXISTS (SELECT 1 FROM network n WHERE n.mem_id=911 AND n.frd_id = el.even_own))
GROUP BY el.even_id
ORDER BY el.even_stat ASC
You don't need the GROUP BY on the inner query, that will be making the database engine do a lot of unneeded work.
If you put even_own = '911' before the select from network, then if even_own IS 911 then it will not have to do the subquery.
Also why do you have a group by on the subquery?
Also run explain plan top find out what is taking the time.
This might work better:
( SELECT e.*
FROM event_list AS e
INNER JOIN members AS m ON m.profilenam = e.even_loc
JOIN network AS n ON e.even_own = n.frd_id
WHERE n.mem_id = '911'
AND e.even_active = 'y'
ORDER BY e.even_stat ASC )
UNION DISTINCT
( SELECT e.*
FROM event_list AS e
INNER JOIN members AS m ON m.profilenam = e.even_loc
WHERE e.even_own = '911'
AND e.even_active = 'y' )
ORDER BY e.even_stat ASC
Since I don't know whether the JOINs one-to-many (or what), I threw in DISTINCT to avoid dups. There may be a better way, or it may be unnecessary (that is, UNION ALL).
Notice how I avoid two things that are performance killers:
OR -- turned into UNION
IN (SELECT...) -- turned into JOIN.
I made aliases to cut down on the clutter. I moved the ORDER BY outside the UNION (and added parens to make it work right).

Mysql: same query, different results

EDIT:
Sorry about unreadable query, I was under deadline. I managed to solve problem by breaking this query into two smaller ones, and doing some business logic in Java. Still want to know why this query can random times return two different results.
So, it randomly returns once all expected results, other time just half. I noticed that when I write it join per join, and execute after each join, in the end it returns all expected results. So am wandering if there's some kind of MySql memory or other limitation that it doesn't take whole tables in joins. Also read on undeterministic queries but not sure what to tell.
Please help, ask if needs clarification, and thank you in advance.
RESET QUERY CACHE;
SET SQL_BIG_SELECTS=1;
set #displayvideoaction_id = 2302;
set #ticSessionId = 3851;
select richtext.id,richtextcross.name,richtextcross.updates_demo_field,richtext.content from
(
select listitemcross.id,name,updates_demo_field,listitem.text_id from
(
select id,name, updates_demo_field, items_id from
(
SELECT id, name, answertype_id, updates_demo_field,
#student:=CASE WHEN #class <> updates_demo_field THEN 0 ELSE #student+1 END AS rn,
#class:=updates_demo_field AS clset FROM
(SELECT #student:= -1) s,
(SELECT #class:= '-1') c,
(
select id, name, answertype_id, updates_demo_field from
(
select manytomany.questions_id from
(
select questiongroup_id from
(
select questiongroup_id from `ticnotes`.`scriptaction` where ticsession_id=#ticSessionId and questiongroup_id is not null
) scriptaction
inner join
(
select * from `ticnotes`.`questiongroup`
) questiongroup on scriptaction.questiongroup_id=questiongroup.id
) scriptgroup
inner join
(
select * from `ticnotes`.`questiongroup_question`
) manytomany on scriptgroup.questiongroup_id=manytomany.questiongroup_id
) questionrelation
inner join
(
select * from `ticnotes`.`question`
) questiontable on questionrelation.questions_id=questiontable.id
where updates_demo_field = 'DEMO1' or updates_demo_field = 'DEMO2'
order by updates_demo_field, id desc
) t
having rn=0
) firstrowofgroup
inner join
(
select * from `ticnotes`.`multipleoptionstype_listitem`
) selectlistanswers on firstrowofgroup.answertype_id=selectlistanswers.multipleoptionstype_id
) listitemcross
inner join
(
select * from `ticnotes`.`listitem`
) listitem on listitemcross.items_id=listitem.id
) richtextcross
inner join
(
select * from `ticnotes`.`richtext`
) richtext on richtextcross.text_id=richtext.id;
My first impression is - don't use short cuts to describe your tables. I am lost at which td3 is where ,then td6, tdx3... I guess you might be lost as well.
If you name your aliases more sensibly there will be less chance to get something wrong and mix 6 with 8 or whatever.
Just a sugestion :)
There is no limitation on mySQL so my bet would be on human error - somewhere there join logic fails.