MySQL - inner join with OR condition taking a long time

Need help with a MySQL query.
I have indexed the relevant columns but am still getting results in 160 seconds.
I know the problem is in the user-join condition (the OR with CONCAT); without it, results come back in 15 seconds.
Any kind of help is appreciated.
My query is:
SELECT `order`.invoicenumber, `order`.lastupdated_by AS processed_by, `order`.lastupdated_date AS LastUpdated_date,
`trans`.transaction_id AS trans_id,
GROUP_CONCAT(`trans`.subscription_id) AS subscription_id,
GROUP_CONCAT(`trans`.price) AS trans_price,
GROUP_CONCAT(`trans`.quantity) AS prod_quantity,
`user`.id AS id, `user`.businessname AS businessname,
`user`.given_name AS given_name, `user`.surname AS surname
FROM cdp_order_transaction_master AS `order`
INNER JOIN `cdp_order_transaction_detail` AS trans ON `order`.transaction_id=trans.transaction_id
INNER JOIN cdp_user AS user ON (`order`.user_id=user.id OR CONCAT( user.id , '_CDP' ) = `order`.lastupdated_by)
WHERE `order`.xero_invoice_status='Completed' AND `order`.order_date > '2021-01-01'
GROUP BY `order`.transaction_id
ORDER BY `order`.lastupdated_date DESC
LIMIT 100

1. Index the columns used in the JOIN and WHERE clauses so that MySQL does not have to scan the entire table. A full table scan performs extremely badly.
Create indexes for the cdp_order_transaction_master table:
CREATE INDEX idx_cdp_order_transaction_master_transaction_id ON cdp_order_transaction_master(transaction_id);
CREATE INDEX idx_cdp_order_transaction_master_user_id ON cdp_order_transaction_master(user_id);
CREATE INDEX idx_cdp_order_transaction_master_lastupdated_by ON cdp_order_transaction_master(lastupdated_by);
CREATE INDEX idx_cdp_order_transaction_master_xero_invoice_status ON cdp_order_transaction_master(xero_invoice_status);
CREATE INDEX idx_cdp_order_transaction_master_order_date ON cdp_order_transaction_master(order_date);
Create an index for the cdp_order_transaction_detail table:
CREATE INDEX idx_cdp_order_transaction_detail_transaction_id ON cdp_order_transaction_detail(transaction_id);
Create an index for the cdp_user table:
CREATE INDEX idx_cdp_user_id ON cdp_user(id);
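One further suggestion, not in the list above: the WHERE clause filters on xero_invoice_status and order_date together, so a composite index may serve it better than the two single-column ones (a sketch; the index name is illustrative):
-- Equality column first, range column second, so one index pass can
-- satisfy status = 'Completed' AND order_date > '2021-01-01'.
CREATE INDEX idx_cdp_order_transaction_master_status_date
ON cdp_order_transaction_master (xero_invoice_status, order_date);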
2. Use the owner/schema name.
If the schema name is not specified, the database engine has to search all schemas to resolve the object.
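Indexes alone may not cure the 160-second runtime, though: the OR inside the ON clause prevents the optimizer from using a single index for the user join, which is the same issue discussed in the related question below. A hedged sketch of a UNION ALL rewrite, with the column list abbreviated and the aggregation left out:
-- Sketch: split the OR join into two branches so each can use an index.
-- The GROUP_CONCAT/GROUP BY aggregation would be layered on top, e.g.
-- by wrapping this in a derived table.
SELECT o.invoicenumber, o.transaction_id, u.id, u.businessname
FROM cdp_order_transaction_master AS o
INNER JOIN cdp_user AS u ON o.user_id = u.id
WHERE o.xero_invoice_status = 'Completed' AND o.order_date > '2021-01-01'
UNION ALL
SELECT o.invoicenumber, o.transaction_id, u.id, u.businessname
FROM cdp_order_transaction_master AS o
INNER JOIN cdp_user AS u ON o.lastupdated_by = CONCAT(u.id, '_CDP')
WHERE o.xero_invoice_status = 'Completed' AND o.order_date > '2021-01-01'
  -- NULL-safe guard so rows matched by the first branch are not repeated
  AND (o.user_id IS NULL OR o.user_id <> u.id);
Writing the second condition as o.lastupdated_by = CONCAT(u.id, '_CDP') lets the index on lastupdated_by be used when the optimizer drives the join from cdp_user.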

Related

MySQL OR query problem (scans full table even when using indexes)

I am using EXPLAIN to get performance analysis of my below query:
SELECT `wf_cart_items`.`id`
FROM `wf_cart_items`
WHERE (`wf_cart_items`.`docket_number` = '405-2844'
   OR MATCH(`wf_cart_items`.`multi_docket_number`) AGAINST ('405-2844'))
The problem is that it shows 597,151 rows to be examined, while each OR condition run individually examines only 1 row. How is it possible that when I use OR it does a full table scan?
P.S.: I have a FULLTEXT index on multi_docket_number and a BTREE index on docket_number.
OR is quite tricky for SQL optimizers -- both in the WHERE clause and in ON clauses.
The recommendation is to switch this to UNION ALL:
SELECT ci.id
FROM wf_cart_items ci
WHERE ci.docket_number = '405-2844'
UNION ALL
SELECT ci.id
FROM wf_cart_items ci
WHERE MATCH(ci.multi_docket_number) AGAINST ( '405-2844' ) AND
ci.docket_number <> '405-2844';
Based on the naming of your columns, I fear that multi_docket_number actually contains multiple docket numbers. If that is the case, you probably want to fix the data model, but that is another conversation.
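For completeness, a hedged sketch of what that data-model fix might look like, assuming multi_docket_number currently holds a delimited list; the table and column names here are hypothetical, not from the original schema:
-- Hypothetical junction table: one row per (cart item, docket number),
-- replacing the delimited multi_docket_number column. The item's primary
-- docket_number would be inserted here as well.
CREATE TABLE wf_cart_item_dockets (
    cart_item_id  INT NOT NULL,
    docket_number VARCHAR(20) NOT NULL,
    PRIMARY KEY (cart_item_id, docket_number),
    KEY idx_docket (docket_number)
);
-- The OR/FULLTEXT lookup then collapses to one indexed equality search:
SELECT DISTINCT d.cart_item_id AS id
FROM wf_cart_item_dockets AS d
WHERE d.docket_number = '405-2844';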

PostgreSQL select count(*) time-consuming

I am using spring-data-jpa and postgresql-9.4.
There is a table, tbl_oplog, with about seven million rows of data that needs to be displayed on the front end (paged).
I use Spring's PagingAndSortingRepository, and I found that the data query was very slow. From the logs, I found that two SQL queries were issued:
select
oplog0_.id as id1_8_,
oplog0_.deleted as deleted2_8_,
oplog0_.result_desc as result_d3_8_,
oplog0_.extra as extra4_8_,
oplog0_.info as info5_8_,
oplog0_.login_ipaddr as login_ip6_8_,
oplog0_.level as level7_8_,
oplog0_.op_type as op_type8_8_,
oplog0_.user_name as user_nam9_8_,
oplog0_.op_obj as op_obj10_8_,
oplog0_.op as op11_8_,
oplog0_.result as result12_8_,
oplog0_.op_time as op_time13_8_,
oplog0_.login_name as login_n14_8_
from
tbl_oplog oplog0_
where
oplog0_.deleted=false
order by
oplog0_.op_time desc limit 10
And:
select
count(oplog0_.id) as col_0_0_
from
tbl_oplog oplog0_
where
oplog0_.deleted=?
(The second SQL statement is used to populate the page object, which is necessary.)
I found the second statement to be very time-consuming. Why does it take so long?
How can it be optimized? Does this happen with MySQL too?
Or is there any other way I can optimize this requirement? (It seems that the select count is inevitable.)
EDIT:
I'll use another table for the demonstration (same issue):
Table:
select count(*) from tbl_gather_log; -- count is 6300931, cost 5.408 s
EXPLAIN select count(*) from tbl_gather_log:
Aggregate (cost=246566.58..246566.59 rows=1 width=0)
-> Index Only Scan using tbl_gather_log_pkey on tbl_gather_log (cost=0.43..230814.70 rows=6300751 width=0)
EXPLAIN ANALYSE select count(*) from tbl_gather_log:
Aggregate (cost=246566.58..246566.59 rows=1 width=0) (actual time=6697.102..6697.102 rows=1 loops=1)
-> Index Only Scan using tbl_gather_log_pkey on tbl_gather_log (cost=0.43..230814.70 rows=6300751 width=0) (actual time=0.173..4622.674 rows=6300936 loops=1)
Heap Fetches: 298
Planning time: 0.312 ms
Execution time: 6697.267 ms
EDIT2:
TABLE:
create table tbl_gather_log (
id bigserial not null primary key,
event_level int,
event_time timestamp,
event_type int,
event_dis_type int,
event_childtype int,
event_name varchar(64),
dev_name varchar(32),
dev_ip varchar(32),
sys_type varchar(16),
event_content jsonb,
extra jsonb
);
And:
There are probably many filtering criteria supported, so I can't simply do special operations on deleted. For example, a query might be issued as select * from tbl_oplog where name like xxx and type = xxx limit 10, so there will be a corresponding query: select count(*) from tbl_oplog where name like xxx and type = xxx. Furthermore, I have to know the exact counts, because I need to show how many pages there are on the front end.
The second statement takes a long time because it has to scan the whole table in order to count the rows.
One thing you can do is use an index (note: INCLUDE requires PostgreSQL 11 or later; on 9.4, a plain index on (deleted) serves the same purpose):
CREATE INDEX ON tbl_oplog (deleted) INCLUDE (id);
VACUUM tbl_oplog; -- so you get an index-only scan
Assuming that id is the primary key, it would be much better to use count(*) and omit the INCLUDE clause from the index.
But the best is probably to use an estimate:
SELECT t.reltuples * freq.f AS estimated_rows
FROM pg_stats AS s
JOIN pg_namespace AS n
ON s.schemaname = n.nspname
JOIN pg_class AS t
ON s.tablename = t.relname
AND n.oid = t.relnamespace
CROSS JOIN LATERAL
unnest(s.most_common_vals::text::boolean[]) WITH ORDINALITY AS val(v,id)
JOIN LATERAL
unnest(s.most_common_freqs) WITH ORDINALITY AS freq(f,id)
USING (id)
WHERE s.tablename = 'tbl_oplog'
AND s.attname = 'deleted'
AND val.v = ?;
This uses the distribution statistics to estimate the desired count.
If it is just about pagination, you don't need exact counts.
Read my blog for more on the topic of counting in PostgreSQL.
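As a quick illustration of the estimation idea: for an unfiltered total, the planner's row estimate can be read straight from the catalog (a sketch; the figure is only as fresh as the last VACUUM or ANALYZE):
-- Near-instant approximate row count from planner statistics.
SELECT reltuples::bigint AS estimated_rows
FROM pg_class
WHERE relname = 'tbl_gather_log';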

Need Help Speeding up an Aggregate SQLite Query

I have a table defined like the following...
CREATE TABLE actions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    "end" BOOLEAN,
    type VARCHAR(15) NOT NULL,
    subtype_a VARCHAR(15),
    subtype_b VARCHAR(15)
);
I'm trying to query for the last end action of some type to happen on each unique (subtype_a, subtype_b) pair, similar to a GROUP BY (except SQLite doesn't specify which row a GROUP BY returns).
On an SQLite database of about 1MB, the query I have now can take upwards of two seconds, but I need to speed it up to take under a second (since this will be called frequently).
example query:
SELECT * FROM actions a_out
WHERE id =
(SELECT MAX(a_in.id) FROM actions a_in
WHERE a_out.subtype_a = a_in.subtype_a
AND a_out.subtype_b = a_in.subtype_b
AND a_in.status IS NOT NULL
AND a_in.type = 'some_type');
If it helps, I know all the unique possibilities for a (subtype_a,subtype_b)
eg:
(a,1)
(a,2)
(b,3)
(b,4)
(b,5)
(b,6)
Beginning with version 3.7.11, SQLite guarantees which record is returned in a group:
Queries of the form: "SELECT max(x), y FROM table" returns the value of y on the same row that contains the maximum x value.
So greatest-n-per-group can be implemented in a much simpler way:
SELECT *, max(id)
FROM actions
WHERE type = 'some_type'
GROUP BY subtype_a, subtype_b
Is this any faster?
SELECT * FROM actions
WHERE id IN (SELECT max(id)
             FROM actions
             WHERE type = 'some_type'
             GROUP BY subtype_a, subtype_b);
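One way to answer that empirically: SQLite's EXPLAIN QUERY PLAN shows whether a query walks an index or scans the table, so both candidates can be compared directly (a sketch):
-- Look for "SEARCH ... USING INDEX" rather than "SCAN TABLE actions".
EXPLAIN QUERY PLAN
SELECT * FROM actions
WHERE id IN (SELECT max(id)
             FROM actions
             WHERE type = 'some_type'
             GROUP BY subtype_a, subtype_b);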
This is the greatest-n-per-group problem that comes up frequently on Stack Overflow.
Here's how I solve it:
SELECT a_out.* FROM actions a_out
LEFT OUTER JOIN actions a_in ON a_out.subtype_a = a_in.subtype_a
AND a_out.subtype_b = a_in.subtype_b
AND a_out.id < a_in.id
WHERE a_out.type = 'some_type' AND a_in.id IS NULL
If you have an index on (type, subtype_a, subtype_b, id) this should run very fast.
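For reference, a sketch of that index; since id aliases the rowid in SQLite, it is implicitly stored in every index entry and can be omitted from the column list:
-- Equality on type first, then the two grouping columns; the rowid (id)
-- rides along automatically for the MAX/anti-join comparison.
CREATE INDEX idx_actions_type_subtypes
ON actions (type, subtype_a, subtype_b);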
See also my answers to similar SQL questions:
Fetch the row which has the Max value for a column
Retrieving the last record in each group
SQL join: selecting the last records in a one-to-many relationship
Or this brilliant article by Jan Kneschke: Groupwise Max.

MySQL Query Optimization for complex query

I am working with an existing site and I came across the following MySQL query that needs optimization:
select
mo.mmrrc_order_oid,
mo.completed_by_email,
mo.completed_by_name,
mo.completed_by_title,
mo.order_submission_oid,
mo.order_dt,
mo.center_id,
mo.po_num_tx,
mo.mod_dt,
ste_s.state_cd,
group_concat(distinct osr.status_cd order by osr.status_cd) as test,
case group_concat(distinct osr.status_cd order by osr.status_cd)
when 'Fulfilled' then 'Fulfilled'
when 'Fulfilled,N/A' then 'Fulfilled'
when 'N/A' then 'N/A'
when 'Pending' then 'Pending'
else 'In Process'
end as restriction_status,
max(osr.closed_dt) as restriction_update_dt,
ot.milestone,
ot.completed_dt as tracking_update_dt,
dc.first_name,
dc.last_name,
inst.institution_name,
order_search.products as products_ordered,
mo.other_emails,
mo.customer_label,
mo.grant_numbers
from
t_mmrrc_order mo
join ste_state ste_s using(state_id)
left join t_order_contact oc
on oc.mmrrc_order_oid=mo.mmrrc_order_oid and oc.role_cd='Recipient'
left join t_distrib_cont_instn dci using(distrib_cont_instn_oid)
left join t_institution inst using(institution_oid)
left join t_distribution_contact dc using(distribution_contact_oid)
left join t_order_tracking ot
on ot.mmrrc_order_oid=mo.mmrrc_order_oid
and ifnull(ot.order_tracking_oid, '0000-00-00')= ifnull(
(
select max(order_tracking_oid)
from t_order_tracking ot3
where
ot3.mmrrc_order_oid=mo.mmrrc_order_oid
and ot3.completed_dt= (
select max(completed_dt)
from t_order_tracking ot2
where ot2.mmrrc_order_oid=mo.mmrrc_order_oid
)
), '0000-00-00')
left join t_order_strain_restriction osr
on osr.mmrrc_order_oid = mo.mmrrc_order_oid
left join order_search on order_search.mmrrc_order_oid=mo.mmrrc_order_oid
group by
mo.mmrrc_order_oid
LIMIT 0, 5
This query takes 10+ seconds to run regardless of the limit. When run without a limit, there are 5,727 results in total and the runtime is 10.624 seconds.
With "LIMIT 0, 5" it took 18.47 seconds.
I understand that there are a bunch of joins and nested selects, which is why it is so slow. Any ideas on how to optimize this without having to change the database structure?
MySQL version: 5.0.95
Most tables have over 10,000 records.
This simpler query takes about 9 seconds:
select
mo.mmrrc_order_oid,
mo.completed_by_email,
mo.completed_by_name,
mo.completed_by_title,
mo.order_submission_oid,
mo.order_dt,
mo.center_id,
mo.po_num_tx,
mo.mod_dt,
dc.first_name,
dc.last_name,
inst.institution_name,
order_search.products as products_ordered,
mo.other_emails,
mo.customer_label,
mo.grant_numbers
from
t_mmrrc_order mo
join ste_state ste_s using(state_id)
left join t_order_contact oc
on oc.mmrrc_order_oid=mo.mmrrc_order_oid and oc.role_cd='Recipient'
left join t_distrib_cont_instn dci using(distrib_cont_instn_oid)
left join t_institution inst using(institution_oid)
left join t_distribution_contact dc using(distribution_contact_oid)
left join t_order_strain_restriction osr
on osr.mmrrc_order_oid = mo.mmrrc_order_oid
left join order_search on order_search.mmrrc_order_oid=mo.mmrrc_order_oid
group by mo.mmrrc_order_oid
limit 0,5
I suppose the grouping slows it down the most; without the grouping it takes only 0.17 seconds. Any help would be appreciated. Thanks.
Additional details: the EXPLAIN output for the first query was attached as a screenshot in the original post (not reproduced here).
I found that order_search is a view that is causing most of the slowdown. The query for the view is:
SELECT
t_oi.mmrrc_order_oid AS mmrrc_order_oid,
group_concat(t_im.icc_item_code separator ',') AS products
FROM
t_order_item t_oi
JOIN t_item_master t_im on t_oi.item_master_oid = t_im.item_master_oid
JOIN t_strain_archive on t_im.strain_archive_oid = t_strain_archive.strain_archive_oid
WHERE t_oi.item_status_cd IN (_utf8'Active',_utf8'Modified')
GROUP BY t_oi.mmrrc_order_oid
ORDER BY t_im.icc_item_code
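If the view itself is the bottleneck, indexes on its base tables covering the filter, join, and grouping columns are the usual first step. A hedged sketch (index names illustrative, assuming no such indexes exist yet):
-- Filter + grouping columns on t_order_item, and the join column on
-- t_item_master, so the view's WHERE and GROUP BY can use indexes.
CREATE INDEX idx_order_item_status_order
ON t_order_item (item_status_cd, mmrrc_order_oid);
CREATE INDEX idx_item_master_strain
ON t_item_master (strain_archive_oid);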
Assuming you haven't indexed the columns yet, here are some indexes to create; they should help. There are still more columns worth indexing, such as the ones in your join conditions; index those as well for better execution.
ALTER TABLE `t_mmrrc_order` ADD INDEX `Indexnamemmrrc_order_oid` (`mmrrc_order_oid`);
ALTER TABLE `t_mmrrc_order` ADD INDEX `Indexnamecompleted_by_email` (`completed_by_email`);
ALTER TABLE `t_mmrrc_order` ADD INDEX `Indexnamecompleted_by_name` (`completed_by_name`);
ALTER TABLE `t_mmrrc_order` ADD INDEX `Indexnamecompleted_by_title` (`completed_by_title`);
ALTER TABLE `t_mmrrc_order` ADD INDEX `Indexnameorder_submission_oid` (`order_submission_oid`);
ALTER TABLE `t_mmrrc_order` ADD INDEX `Indexnameorder_dt` (`order_dt`);
ALTER TABLE `t_mmrrc_order` ADD INDEX `Indexnamecenter_id` (`center_id`);
ALTER TABLE `t_mmrrc_order` ADD INDEX `Indexnamepo_num_tx` (`po_num_tx`);
ALTER TABLE `t_mmrrc_order` ADD INDEX `Indexnamemod_dt` (`mod_dt`);
ALTER TABLE `t_mmrrc_order` ADD INDEX `Indexnameother_emails` (`other_emails`);
ALTER TABLE `t_mmrrc_order` ADD INDEX `Indexnamecustomer_label` (`customer_label`);
ALTER TABLE `t_mmrrc_order` ADD INDEX `Indexnamegrant_numbers` (`grant_numbers`);
ALTER TABLE `t_distribution_contact` ADD INDEX `Indexnamefirst_name` (`first_name`);
ALTER TABLE `t_distribution_contact` ADD INDEX `Indexnamelast_name` (`last_name`);
ALTER TABLE `order_search` ADD INDEX `Indexnameproducts` (`products`);
I managed to solve this problem by doing two separate queries from my PHP script.
First, I query the order_search view by itself and save all the data in a PHP array indexed by mmrrc_order_oid, which then serves as a quick lookup table for products. The resulting lookup table is an array of about 6,000 strings.
Next, I perform the big complex query with order_search omitted. This only takes about a second now. For each resulting record, I simply use the lookup table keyed by mmrrc_order_oid to get the products for that order.
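A hedged sketch of that split in SQL terms (the second query is the original with the view join removed):
-- Query 1: materialize the view once; the application builds a products
-- lookup table keyed by mmrrc_order_oid from this result.
SELECT mmrrc_order_oid, products
FROM order_search;
-- Query 2: the original query minus the LEFT JOIN on order_search and
-- the order_search.products column; products are filled in afterwards
-- from the in-memory lookup table.
SELECT mo.mmrrc_order_oid,
       mo.completed_by_email
       -- ... remaining columns and joins exactly as in the original
FROM t_mmrrc_order mo
JOIN ste_state ste_s USING (state_id)
GROUP BY mo.mmrrc_order_oid
LIMIT 0, 5;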

Optimizing a MySQL query

I would like to know if there is a way to optimize this query:
SELECT
jdc_organizations_activities.*,
jdc_organizations.orgName,
CONCAT(jos_hpj_users.firstName, ' ', jos_hpj_users.lastName) AS nameContact
FROM jdc_organizations_activities
LEFT JOIN jdc_organizations ON jdc_organizations_activities.organizationId =jdc_organizations.id
LEFT JOIN jos_hpj_users ON jdc_organizations_activities.contact = jos_hpj_users.userId
WHERE jdc_organizations_activities.status LIKE 'proposed'
ORDER BY jdc_organizations_activities.creationDate DESC LIMIT 0, 100;
Now when I look at the query log:
Query_time: 2
Lock_time: 0
Rows_sent: 100
Rows_examined: 1028330
Query profile:
2. Should I put indexes on the tables, bearing in mind that there will be a lot of inserts and updates on those tables?
From the Tizag tutorials:
Indexes are something extra that you can enable on your MySQL tables to increase performance, but they do have some downsides. When you create a new index MySQL builds a separate block of information that needs to be updated every time there are changes made to the table. This means that if you are constantly updating, inserting and removing entries in your table this could have a negative impact on performance.
Update after adding the indexes and removing the LOWER(), the GROUP BY, and the wildcard:
Time: 0.855 ms
Add indexes (if you haven't) at:
Table: jdc_organizations_activities
simple index on creationDate
simple index on status
simple index on organizationId
simple index on contact
And rewrite the query by removing the call to the function LOWER() and using = or LIKE. It depends on the collation you have defined for this table, but if it's a case-insensitive one (like latin1), it will still return the same results. Details can be found in the MySQL docs: case-sensitivity.
SELECT a.*
, o.orgName
, CONCAT(u.firstName,' ',u.lastName) AS nameContact
FROM jdc_organizations_activities AS a
LEFT JOIN jdc_organizations AS o
ON a.organizationId = o.id
LEFT JOIN jos_hpj_users AS u
ON a.contact = u.userId
WHERE a.status LIKE 'proposed' -- or (a.status = 'proposed')
ORDER BY a.creationDate DESC
LIMIT 0 , 100 ;
It would be nice if you posted the execution plan (as it is now) and after these changes.
UPDATE
A compound index on (status, creationDate) may be more appropriate (as Darhazer suggested) for this query, instead of the simple one on (status). But this is guesswork; posting the plans (after running EXPLAIN on the query) would provide more info.
I also assumed that you already have (primary key) indexes on:
jdc_organizations.id
jos_hpj_users.userId
Post the result from EXPLAIN
Generally you need indexes on jdc_organizations_activities.organizationId, jdc_organizations_activities.contact, composite index on jdc_organizations_activities.status and jdc_organizations_activities.creationDate
Why are you using a LIKE query for a constant lookup? You have no wildcard symbols (or maybe you've edited the query).
The index on status can be used for LIKE 'proposed%' but not for LIKE '%proposed%'; in the latter case it is better to leave only the index on creationDate.
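For concreteness, a sketch of the indexes suggested above (names illustrative):
-- Join columns, plus a composite index matching the WHERE (status)
-- and ORDER BY (creationDate) of the query.
CREATE INDEX idx_activities_organization ON jdc_organizations_activities (organizationId);
CREATE INDEX idx_activities_contact ON jdc_organizations_activities (contact);
CREATE INDEX idx_activities_status_date ON jdc_organizations_activities (status, creationDate);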
What indexes do you have on these tables? Specifically, have you indexed jdc_organizations_activities.creationDate?
Also, why do you need to group by jdc_organizations_activities.id? Isn't that unique per row, or can an organization have multiple contacts?
The slowness is because MySQL has to apply LOWER() to every row. The solution is to create a new column to store the lowercased value, then put an index on that column. Let's also use triggers to make the solution more luxurious. OK, here we go:
a) Add a new column to hold the lower version of status (make this varchar as wide as status):
ALTER TABLE jdc_organizations_activities ADD COLUMN status_lower varchar(20);
b) Populate the new column:
UPDATE jdc_organizations_activities SET status_lower = lower(status);
c) Create an index on the new column
CREATE INDEX jdc_organizations_activities_status_lower_index
ON jdc_organizations_activities(status_lower);
d) Define triggers to keep the new column value correct:
DELIMITER ~
CREATE TRIGGER jdc_organizations_activities_status_insert_trig
BEFORE INSERT ON jdc_organizations_activities
FOR EACH ROW
BEGIN
    SET NEW.status_lower = LOWER(NEW.status);
END~
CREATE TRIGGER jdc_organizations_activities_status_update_trig
BEFORE UPDATE ON jdc_organizations_activities
FOR EACH ROW
BEGIN
    SET NEW.status_lower = LOWER(NEW.status);
END~
DELIMITER ;
Your query should now fly.
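With the triggers in place, the filter targets the indexed column; a sketch of the rewritten query under that assumption:
-- LOWER(status) = 'proposed' becomes an indexable equality.
SELECT a.*
FROM jdc_organizations_activities AS a
WHERE a.status_lower = 'proposed'
ORDER BY a.creationDate DESC
LIMIT 0, 100;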