How to avoid full table scan in mysql join query

Consider the following query:
SELECT
    `banner`.`id`,
    `region`.*
FROM
    `nms_section_region_banner` AS `section`
    JOIN `aw_rbslider_slide_region` AS `region`
        ON FIND_IN_SET(region.region_id, section.region_id) <> 0
    JOIN `aw_rbslider_banner` AS `banner`
        ON `section`.`banner_id` = `banner`.`id`
    JOIN `aw_rbslider_slide_banner` AS `slide_banner`
        ON `slide_banner`.`banner_id` = `banner`.`id`
    JOIN `aw_rbslider_slide` AS `slide`
        ON `slide_banner`.`slide_id` = `slide`.`id` AND `slide`.`status` = 1
    JOIN `aw_rbslider_slide_store` AS `store`
        ON `slide`.`id` = `store`.`slide_id`
WHERE
    `section`.`section_id` = '414'
    AND (
        `region`.`region_type` = '1'
        OR FIND_IN_SET('400020', region.region_code) <> 0
        OR FIND_IN_SET('PANINDIABEAUTY', region.region_code) <> 0
        OR FIND_IN_SET('PANINDIADIGITAL', region.region_code) <> 0
        OR FIND_IN_SET('6210', region.region_code) <> 0
        OR FIND_IN_SET('PANINDIAJEWEL', region.region_code) <> 0
        OR FIND_IN_SET('MH', region.region_code) <> 0
        OR FIND_IN_SET('Mumbai', region.region_code) <> 0
    )
    AND (
        `slide`.`display_from` <= '2021-07-23 02:05:16'
        OR `slide`.`display_from` IS NULL
        OR `slide`.`display_from` = '0000-00-00 00:00:00'
    )
    AND (
        `slide`.`display_to` >= '2021-07-23 02:05:16'
        OR `slide`.`display_to` IS NULL
        OR `slide`.`display_to` = '0000-00-00 00:00:00'
    )
    AND (
        `store`.`store_id` = '0' OR `store`.`store_id` = '2'
    )
GROUP BY
    `banner`.`id`
ORDER BY
    FIELD(region.region_type, 3, 2, 5, 4, 1)
I need to avoid the full table scan.
The EXPLAIN output for my query (Picture 1 and Picture 2) shows the type, key and possible_keys information for each table.
Can someone guide me on how to avoid the full table scan on those 6 tables?

First, a little cleanup so I can see and follow the hierarchy of your query and tables. Next, you are using a bunch of FIND_IN_SET() tests against the region code. This implies that your region code holds a long string of multiple values, so that a region might be "MH, 400020, PANINDIAJEWEL, ETC", and you are looking for some "keyword" value within it. Is this accurate? -- OR -- does region_code only ever hold a single value? Please confirm.
With your join from section to region, if both columns are plain "ID" keys, don't use FIND_IN_SET(); use direct equality instead. MySQL cannot use an index for a join condition wrapped in a function (hence my change), and this MAY be a big issue in your query.
For your GROUP BY, you originally had banner.id, but since that is already equal to section.banner_id via the join, and section is the primary table, an index on the section table can help optimize that grouping better than one on a secondary table.
SELECT
    section.banner_id AS id,
    region.*
FROM
    nms_section_region_banner section
    JOIN aw_rbslider_slide_region region
        ON section.region_id = region.region_id
    JOIN aw_rbslider_banner banner
        ON section.banner_id = banner.id
    JOIN aw_rbslider_slide_banner slide_banner
        ON section.banner_id = slide_banner.banner_id
    JOIN aw_rbslider_slide slide
        ON slide_banner.slide_id = slide.id
        AND slide.status = 1
    JOIN aw_rbslider_slide_store store
        ON slide_banner.slide_id = store.slide_id
        -- if IDs are integers, don't wrap them in quotes
        AND store.store_id IN ( 0, 2 )
WHERE
    -- don't use quotes if IDs are actually numbers
    section.section_id = 414
    AND ( -- unsure if region_type is integer vs string...
        region.region_type = '1'
        OR FIND_IN_SET( '400020', region.region_code ) <> 0
        OR FIND_IN_SET( 'PANINDIABEAUTY', region.region_code ) <> 0
        OR FIND_IN_SET( 'PANINDIADIGITAL', region.region_code ) <> 0
        OR FIND_IN_SET( '6210', region.region_code ) <> 0
        OR FIND_IN_SET( 'PANINDIAJEWEL', region.region_code ) <> 0
        OR FIND_IN_SET( 'MH', region.region_code ) <> 0
        OR FIND_IN_SET( 'Mumbai', region.region_code ) <> 0 )
    AND ( slide.display_from IS NULL
        OR slide.display_from = '0000-00-00 00:00:00'
        OR slide.display_from <= '2021-07-23 02:05:16' )
    AND ( slide.display_to IS NULL
        OR slide.display_to = '0000-00-00 00:00:00'
        OR slide.display_to >= '2021-07-23 02:05:16' )
GROUP BY
    section.banner_id
ORDER BY
    FIELD( region.region_type, 3, 2, 5, 4, 1 )
To also help: I am sure indexes already exist on the primary keys. But if you have composite indexes on each table's join key plus the key to the next table, that can help. In addition, a covering index that includes the other fields used in the WHERE / GROUP BY clauses can help.
I would try to have the following indexes.
Table                        Index
nms_section_region_banner    ( banner_id, region_id )  -- and in this specific order
aw_rbslider_slide_region     ( region_id, region_type, region_code )
aw_rbslider_slide_banner     ( banner_id, slide_id )
aw_rbslider_slide            ( id, status, display_from, display_to )
aw_rbslider_slide_store      ( slide_id, store_id )
Finally, your ORDER BY clause uses the FIELD() function with numbers rather than individually named columns.
Having explicit field names from the region table is more explicit and readable.
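As a rough illustration of the "direct equality instead of FIND_IN_SET()" point: if section.region_id really holds a comma-separated list (which the original FIND_IN_SET() join suggests, but is an assumption here), one option is to split that list once into a junction table so the join becomes a plain, indexable equality. The sketch below uses Python's built-in sqlite3 as a stand-in for MySQL; the tables `section`, `region`, `section_region` and the index name are hypothetical miniatures, not the real schema.

```python
import sqlite3

# Hypothetical miniature tables; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE section (section_id INTEGER, banner_id INTEGER, region_ids TEXT);
    CREATE TABLE region  (region_id INTEGER PRIMARY KEY, region_type INTEGER);
    INSERT INTO section VALUES (414, 10, '1,3'), (414, 11, '2');
    INSERT INTO region  VALUES (1, 1), (2, 3), (3, 2);

    -- One row per (section, banner, region) instead of a comma-separated string.
    CREATE TABLE section_region (section_id INTEGER, banner_id INTEGER, region_id INTEGER);
    CREATE INDEX idx_section_region ON section_region (section_id, banner_id, region_id);
""")

# Split each comma-separated list once, at write time, into junction rows.
for section_id, banner_id, region_ids in conn.execute("SELECT * FROM section").fetchall():
    conn.executemany(
        "INSERT INTO section_region VALUES (?, ?, ?)",
        [(section_id, banner_id, int(r)) for r in region_ids.split(",")],
    )

# The join condition is now a plain equality that an index can serve.
rows = conn.execute("""
    SELECT sr.banner_id, r.region_id, r.region_type
    FROM section_region AS sr
    JOIN region AS r ON r.region_id = sr.region_id
    WHERE sr.section_id = 414
    ORDER BY sr.banner_id, r.region_id
""").fetchall()
print(rows)
```

Whether this is worth doing depends on how the list column is written and read elsewhere in the application; it is only meant to show the shape of the change.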

Related

Optimize MySQL query - Data fetching from views table and Main table

I am fetching data from MySQL views and a main table. I have created indexes and primary keys on the main table, but I cannot create indexes or primary keys on the views.
The query below takes around 10 seconds to execute. I want to optimize it so that it takes less time.
SELECT DISTINCT
`Emp_No`, `Name`
FROM
`ResLookup`
WHERE
`IsActive` = 1
AND `Department` IN ('SDG' , 'HDD', 'ENG', 'PDN')
AND (`Emp_No` IN (SELECT DISTINCT
ProjList.PM_No
FROM
ProjList
WHERE
ProjList.PM_No != 1749 UNION SELECT DISTINCT
ProjList.PL_No
FROM
ProjList
WHERE
ProjList.PL_No != 1749)
OR Emp_No IN (SELECT
MEMBER_ID
FROM
s_group_details
WHERE
GROUP_ID = 'GRP109'
AND MEMBERSHIP_LEVEL = 30));
Only the s_group_details table has indexes and a primary key. All the remaining tables are read through views.
Running EXPLAIN on the query gives the output below.

I don't know your query requirements, but check whether the query below helps:
SELECT DISTINCT
`Emp_No`, `Name`
FROM
`ResLookup` inner join (SELECT DISTINCT
ProjList.PM_No ,ProjList.PL_No
FROM
ProjList
WHERE
ProjList.PM_No != 1749
or
ProjList.PL_No != 1749) a
on ResLookup.Emp_No = a.PM_No
and ResLookup.Emp_No = a.PL_No
OR Emp_No IN (SELECT
MEMBER_ID
FROM
s_group_details
WHERE
GROUP_ID = 'GRP109'
AND MEMBERSHIP_LEVEL = 30)
WHERE
`IsActive` = 1
AND `Department` IN ('SDG' , 'HDD', 'ENG', 'PDN');

It may be better to turn things somewhat inside-out:
SELECT r.`Emp_No`, r.`Name`
FROM (
        SELECT PM_No AS Emp_No FROM ProjList WHERE PM_No != 1749
        UNION DISTINCT
        SELECT PL_No FROM ProjList WHERE PL_No != 1749
        UNION DISTINCT
        SELECT MEMBER_ID
            FROM s_group_details AS d
            WHERE d.GROUP_ID = 'GRP109'
              AND d.MEMBERSHIP_LEVEL = 30
     ) AS u
JOIN `ResLookup` AS r ON u.Emp_No = r.Emp_No
WHERE r.`IsActive` = 1
  AND r.`Department` IN ('SDG' , 'HDD', 'ENG', 'PDN');
Indexes needed:
ResLookup: (Emp_No, IsActive, Department)
s_group_details: (GROUP_ID, MEMBERSHIP_LEVEL, MEMBER_ID)

What would be the best indices for this kind of slow MySQL-Query? (InnoDB)

This kind of MySQL query is very slow at the moment.
What would be the best indices for this to speed it up? (InnoDB)
SELECT item_id,
Group_concat(storage_nr SEPARATOR ',') AS storage_nr,
Group_concat(condition SEPARATOR ',') AS condition,
Group_concat(number SEPARATOR ',') AS number,
Group_concat(price SEPARATOR ',') AS price,
last_calc
FROM items
WHERE number > 0
AND bottomlimit IS NOT NULL
AND condition IN (1, 2, 3)
AND ( price_date IS NULL
OR price_date < Date_sub(Now(), INTERVAL 1 hour) )
AND ( NOT ( price = bottomlimit
AND pricebefore = bottomlimit
AND pricebefore2 = bottomlimit )
OR price IS NULL
OR pricebefore IS NULL
OR pricebefore2 IS NULL
OR Date(price_date) <> Curdate() )
GROUP BY item_id
ORDER BY last_calc
LIMIT 20
Thanks a lot in advance!
Best regards!

I'm inclined to agree with Gordon's comment for the most part, but one thing you could try is conditional aggregation. (This could be one of those unique scenarios where processing more of the data and discarding what you don't need is faster than filtering to what you do need, especially since it seems some OR conditions tend to wreck index use in MySQL).
Something similar to this might help.
SELECT item_id
     , Group_concat(
           IF( condition IN (1, 2, 3)
               AND ( price_date IS NULL OR price_date < Date_sub(Now(), INTERVAL 1 hour) )
               AND ( NOT ( price = bottomlimit AND pricebefore = bottomlimit AND pricebefore2 = bottomlimit )
                     OR price IS NULL
                     OR pricebefore IS NULL
                     OR pricebefore2 IS NULL
                     OR Date(price_date) <> Curdate()
                   )
             , storage_nr, NULL)
           SEPARATOR ','
       ) AS storage_nr
     , [etc...]
FROM items
WHERE number > 0 AND bottomlimit IS NOT NULL
GROUP BY item_id
HAVING storage_nr IS NOT NULL
ORDER BY last_calc
LIMIT 20
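To make the conditional aggregation idea concrete, here is a minimal sketch on a tiny made-up `items` table, run through Python's built-in sqlite3 rather than MySQL. It relies on GROUP_CONCAT skipping NULLs, so rows that fail the condition simply vanish from the list, and groups with no qualifying row come back as NULL and are filtered afterwards. The column is named `cond` here because CONDITION is a reserved word in MySQL; the sample data and the alias `storage_list` are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data; "cond" stands in for the "condition" column.
    CREATE TABLE items (item_id INTEGER, storage_nr TEXT, cond INTEGER, number INTEGER);
    INSERT INTO items VALUES
        (1, 'A1', 1, 5),   -- qualifies
        (1, 'A2', 9, 5),   -- same item, fails the condition
        (2, 'B1', 9, 3),   -- only row of item 2 fails the condition
        (3, 'C1', 2, 0);   -- removed by the cheap WHERE filter
""")

rows = conn.execute("""
    SELECT item_id, storage_list
    FROM (
        SELECT item_id,
               -- CASE yields NULL for non-matching rows; GROUP_CONCAT ignores NULLs.
               GROUP_CONCAT(CASE WHEN cond IN (1, 2, 3) THEN storage_nr END) AS storage_list
        FROM items
        WHERE number > 0
        GROUP BY item_id
    ) AS grouped
    WHERE storage_list IS NOT NULL   -- plays the role of the HAVING clause
    ORDER BY item_id
""").fetchall()
print(rows)
```

The outer WHERE is used instead of HAVING only to avoid any ambiguity between a column alias and a real column name; in MySQL the HAVING form shown above works as written by the answer.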

Your only hope of using indexes is to break your complex query down into a UNION of simpler, indexable queries.
First of all, you can use the FROM [subquery] syntax:
SELECT item_id,
Group_concat(storage_nr SEPARATOR ',') AS storage_nr,
Group_concat(condition SEPARATOR ',') AS condition,
Group_concat(number SEPARATOR ',') AS number,
Group_concat(price SEPARATOR ',') AS price,
last_calc
FROM
SUBQUERY
AS items
GROUP BY item_id
ORDER BY last_calc
LIMIT 20
This could be your subquery:
SELECT *
FROM items
WHERE number > 0
AND bottomlimit IS NOT NULL
AND condition IN (1, 2, 3)
AND price_date IS NULL
AND ( NOT ( price = bottomlimit
AND pricebefore = bottomlimit
AND pricebefore2 = bottomlimit )
OR price IS NULL
OR pricebefore IS NULL
OR pricebefore2 IS NULL
OR Date(price_date) <> Curdate() )
UNION ALL
SELECT *
FROM items
WHERE number > 0
AND bottomlimit IS NOT NULL
AND condition IN (1, 2, 3)
AND price_date < Date_sub(Now(), INTERVAL 1 hour)
AND ( NOT ( price = bottomlimit
AND pricebefore = bottomlimit
AND pricebefore2 = bottomlimit )
OR price IS NULL
OR pricebefore IS NULL
OR pricebefore2 IS NULL
OR Date(price_date) <> Curdate() )
Define indexes on the following columns:
- condition
- price_date
- bottomlimit
Please give me feedback on the results. Thanks.

Thanks for helping me!
After testing everything, the best solution for me was to convert the table to MyISAM.
With this change the query runs 3 to 5 times faster: from roughly 12 seconds down to less than 3 seconds.

Alternate to this sql query that will return rows having 0 documents

Is there any alternative to the following query that will return the records having zero documents? Please help me.
SELECT *
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY ParentID) AS RowNumber_ps,
           UPPER(HostApplicationLocalData.ParentID) AS ParentID,
           COUNT(Document.ID) AS Documents
    FROM HostApplicationLocalData
    LEFT OUTER JOIN Document
        ON HostApplicationLocalData.ID = Document.HostApplicationLocalData_ID
    WHERE HostApplicationLocalData.TransactionType_ID = 1
      AND (Document.Removed = 0 OR Document.HostApplicationLocalData_ID IS NULL)
      AND HostApplicationLocalData.Company_ID = 9000
      AND (SharePointURI IS NOT NULL
           OR (SharePointURI IS NULL AND Content_ID IS NOT NULL)
           OR (HostApplicationLocalData_ID IS NOT NULL
               AND HostApplicationLocalData_ID != 0
               AND Content_ID IS NULL))
    GROUP BY HostApplicationLocalData.ParentID
) q
WHERE Documents = 0

Join on max(T.<column>) including further information of T

I have two tables
create table item( id int )
insert into item ( id ) values ( 1 ), ( 2 ), ( 3 )
create table itemstatus
(
itemid int
, ts datetime
, "status" int
)
insert into itemstatus ( itemid, ts, status ) values
( 1, '2013-12-01T12:00:00.000', 1 ),
( 1, '2013-12-01T11:00:00.000', 2 ),
( 1, '2014-01-01T12:00:00.000', 1 ),
( 2, '2011-01-01T12:00:00.000', 1 )
I'd like to get all items with the last status set, in this case
1, '2014-01-01T12:00:00.000', 1
2, '2011-01-01T12:00:00.000', 1
3, NULL, NULL
What's the most efficient way to solve this?
I tried a subselect and got the latest timestamp, but I'm not able to add the status, since this field is not included in an aggregate function or the GROUP BY. If I add it, the results get grouped by status (logically), but that means I get too many result rows and would have to add a further condition / subselect.
You may use the Fiddle-link for created tables and testdata. The second query includes the status-field.
Edit:
adding a further join does the trick, but I doubt that's the way to do it.
select
i.*
, d.*
, s.status
from
item i
left join ( select ts = max(ts), itemid from itemstatus group by itemid ) d
on 1 = 1
and i.id = d.itemid
left join itemstatus s
on 1 = 1
and s.itemid = d.itemid
and s.ts = d.ts
See SQL-fiddle for testing.

You can use row_number partitioned by itemid and ordered by ts desc to get the latest registration in itemstatus per itemid.
select I.id,
S.ts,
S.status
from item as I
left outer join (
select S.status,
S.ts,
S.itemid,
row_number() over(partition by S.itemid
order by S.ts desc) as rn
from itemstatus as S
) as S
on I.id = S.itemid and
S.rn = 1
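For anyone who wants to try the row_number() approach without the fiddle, here is a small sketch that loads the sample data from the question into an in-memory SQLite database (via Python's built-in sqlite3, which needs SQLite 3.25 or later for window functions) and runs the same query shape. SQLite is only a stand-in for the original database; the timestamps are stored as ISO-8601 text, which sorts correctly as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (id INTEGER);
    INSERT INTO item (id) VALUES (1), (2), (3);

    CREATE TABLE itemstatus (itemid INTEGER, ts TEXT, status INTEGER);
    INSERT INTO itemstatus (itemid, ts, status) VALUES
        (1, '2013-12-01T12:00:00.000', 1),
        (1, '2013-12-01T11:00:00.000', 2),
        (1, '2014-01-01T12:00:00.000', 1),
        (2, '2011-01-01T12:00:00.000', 1);
""")

latest = conn.execute("""
    SELECT i.id, s.ts, s.status
    FROM item AS i
    LEFT OUTER JOIN (
        -- rn = 1 marks the newest status row of each item
        SELECT itemid, ts, status,
               ROW_NUMBER() OVER (PARTITION BY itemid ORDER BY ts DESC) AS rn
        FROM itemstatus
    ) AS s
        ON i.id = s.itemid AND s.rn = 1
    ORDER BY i.id
""").fetchall()

for row in latest:
    print(row)
```

Item 3 has no status rows, so the left join returns it with NULLs (None in Python), matching the result asked for in the question.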

Filter out orphan table entries

Suppose there is a table with only two columns (an example is shown below). Every '1' entry should be followed (in the sorted order given below) by a '0'. However, as you can see, in the table, there are some 'orphans' where there are two consecutive '1's.
How can I create a query that returns all the rows, except for the first of any consecutive '1's? (This would reduce the example below from 16 rows to 14)
1 E
0 A
1 T
0 S
1 R
0 E
1 F
0 T
1 G
1 T
0 R
1 X
1 R
0 R
1 E
0 T
I'm going to try and clarify my problem, I think that above I simplified it too much. Imagine one table called logs, with four columns:
user (a string containing a username)
machine (a string uniquely identifying various PCs)
type (event's type: a 1 for login and a 0 for logout)
time (the time of the event being logged)
[The machine/time pair provides a unique key, as no machine can be logged in or out of twice at the same instant. Presumably an 'ID' column could be artificially created based on machine/time sort if needed.]
The idea is that every login event should be accompanied by a logout event. In an ideal world it would be fairly easy to match logins to logouts, and hence analyse the time spent logged in.
However, in the case of a power cut, the logout will not be recorded. Therefore (considering only one machine's data, sorted by time) if there are two login events in a row, we want to ignore the first login, because we don't have any reliable data from it. This is the problem I am trying to solve.

Provided that:
- only 1's are dupes, never 0's
- you want to get rid of all the earlier 1's when there are several in a row
Your text says "except for the first of any consecutive", but I think this is what you want. If there can only ever be 2 in a row, it is the same.
SELECT x.*
FROM x
LEFT JOIN x y on y.id = (x.id + 1)
WHERE (x.nr = y.nr) IS NOT TRUE -- OR x.nr = 0
ORDER BY x.id
If you want to preserve double 0's, use the commented clause additionally, but probably not needed.
Edit after question edit:
You may want to add an auto-increment column to your data to make this simpler:
Generate (i.e. write) a row number index column in MySQL
Other RDBMS (PostgreSQL, Oracle, SQL Server, ..) have window functions like row_number() or lag() and lead() that make such an operation much easier.
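As a sketch of how a window function handles this, the snippet below loads the 16 sample rows from the question into SQLite (through Python's built-in sqlite3, SQLite 3.25+) and uses lead() to look at the next row's value. A row is dropped only when it is a 1 that is immediately followed by another 1. The table name `bits` and its `id` column are assumptions added for the example; MySQL 8.0 and later also provide window functions, so the same query shape should carry over there.

```python
import sqlite3

# The 16 example rows from the question, in their given order.
bit_values = [1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0]
codes = "EATSREFTGTRXRRET"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bits (id INTEGER PRIMARY KEY, bit INTEGER, code TEXT)")
conn.executemany(
    "INSERT INTO bits (bit, code) VALUES (?, ?)",
    list(zip(bit_values, codes)),
)

kept = conn.execute("""
    SELECT id, bit, code
    FROM (
        SELECT id, bit, code,
               LEAD(bit) OVER (ORDER BY id) AS next_bit  -- value of the following row
        FROM bits
    ) AS t
    -- keep every 0, and every 1 whose successor is a 0 or does not exist
    WHERE bit = 0 OR COALESCE(next_bit, 0) = 0
    ORDER BY id
""").fetchall()

print(len(kept))
print([row[0] for row in kept])
```

For the logs table from the edited question, the same idea would presumably use PARTITION BY machine ORDER BY time inside the OVER clause, so that each machine's events are compared only with each other.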

Assuming you have a sequential id (add a column and set it to the record number in the database), this finds the orphaned '1' rows, i.e. those not directly followed by a '0':
select a.*
from the_table a
left join the_table b on b.id = a.id + 1
and b.col1 = 0
where a.col1 = 1
and b.id is null

Try:
select l.*
from logs l
where l.type = 0 or
      -- type of the next event for the same machine and user must be a logout
      (select n.type
       from logs n
       where n.machine = l.machine and
             n.user = l.user and
             n.time > l.time
       order by n.time
       limit 1) = 0

USING a CTE to separate the lag-logic from the selection criteria.
DROP TABLE tmp.bits;
CREATE TABLE tmp.bits
( id SERIAL NOT NULL
, bit INTEGER NOT NULL
, code CHAR(1)
);
INSERT INTO tmp.bits(bit, code) VALUES
(1, 'T' )
, (0, 'S' )
, (1, 'R' )
, (0, 'E' )
, (1, 'F' )
, (0, 'T' )
, (1, 'G' )
, (1, 'T' )
, (0, 'R' )
, (1, 'X' )
, (1, 'R' )
, (0, 'R' )
, (1, 'E' )
, (0, 'T' )
;
SET search_path='tmp';
SELECT * FROM bits;
-- EXPLAIN ANALYZE
WITH prevnext AS (
SELECT
bt.id AS thisid
, bt.bit AS thisbit
, bt.code AS thiscode
, bp.bit AS prevbit
, bp.code AS prevcode
FROM bits bt
LEFT JOIN bits bp ON (bt.id > bp.id)
AND NOT EXISTS ( SELECT * FROM bits nx
WHERE nx.id > bp.id
AND nx.id < bt.id
)
)
SELECT thisid, thisbit, thiscode
FROM prevnext
WHERE thisbit=0
OR prevbit IS NULL OR thisbit <> prevbit
;
EDIT:
For those poor souls who cannot use CTEs, it is easy to create a view instead:
CREATE VIEW prevnext AS (
SELECT
bt.id AS thisid
, bt.bit AS thisbit
,bt.code AS thiscode
, bp.bit AS prevbit
, bp.code AS prevcode
FROM bits bt
LEFT JOIN bits bp ON (bt.id > bp.id)
AND NOT EXISTS ( SELECT * FROM bits nx
WHERE nx.id > bp.id
AND nx.id < bt.id
)
)
;
SELECT thisid, thisbit, thiscode
FROM prevnext
WHERE thisbit=0
OR prevbit IS NULL OR thisbit <> prevbit
;
