MySQL query takes 3 hours to run and process

I have a query that is run by a cron job late at night. The result is then processed through a generator, because it has to populate another database and I run some additional processing and checks before the data is sent to the other DB.
Is there any way for me to speed up this query, ideally keeping it as a single query? Or will I be forced to create several queries and join the data within PHP? This queries the main Mautic database.
SELECT c.id as "campaign_id",
c.created_by_user,
c.name,
c.date_added,
c.date_modified,
(SELECT DISTINCT COUNT(cl.lead_id)) as number_of_leads,
GROUP_CONCAT(lt.tag) as tags,
cat.title as category_name,
GROUP_CONCAT(ll.name) as segment_name,
GROUP_CONCAT(emails.name) as email_name,
CASE WHEN c.is_published = 1 THEN "Yes" ELSE "No" END AS "published",
CASE WHEN c.publish_down > now() THEN "Yes"
WHEN c.publish_down > now() AND c.is_published = 0 THEN "Yes"
ELSE "No" END AS "expired"
FROM campaigns c
LEFT JOIN campaign_leads cl ON cl.campaign_id = c.id
LEFT JOIN lead_tags_xref ltx on cl.lead_id = ltx.lead_id
LEFT JOIN lead_tags lt on ltx.tag_id = lt.id
LEFT JOIN categories cat on c.category_id = cat.id
LEFT JOIN lead_lists_leads llist on cl.lead_id = llist.lead_id
LEFT JOIN lead_lists ll on llist.leadlist_id = ll.id
LEFT JOIN email_list_xref el on ll.id = el.leadlist_id
LEFT JOIN emails on el.email_id = emails.id
GROUP BY c.id;
Here is an image of the EXPLAIN output:
https://prnt.sc/qQtUaLK3FIpQ
Definitions
Campaign Table:
https://prnt.sc/6JXRGyMsWpcd
Campaign_leads table
https://prnt.sc/pOq0_SxW2spe
lead_tags_xref table
https://prnt.sc/oKYn92O82gHL
lead_tags table
https://prnt.sc/ImH81ECF6Ly1
categories table
https://prnt.sc/azQj_Xwq3dw9
lead_lists_leads table
https://prnt.sc/x5C5fiBFP2N7
lead_lists table
https://prnt.sc/bltkM0f3XeaH
email_list_xref table
https://prnt.sc/kXABVJSYWEUI
emails table
https://prnt.sc/7fZcBir1a6QT
I am only expecting 871 rows to be returned. I have identified that the joins can be very large, in the tens of thousands.

It seems you have a useless SELECT DISTINCT; perhaps you are looking for COUNT(DISTINCT ...) instead.
That way you avoid a nested select for each row in the main select:
SELECT c.id as "campaign_id",
c.created_by_user,
c.name,
c.date_added,
c.date_modified,
COUNT(DISTINCT cl.lead_id) as number_of_leads,
GROUP_CONCAT(lt.tag) as tags,
cat.title as category_name,
GROUP_CONCAT(ll.name) as segment_name,
GROUP_CONCAT(emails.name) as email_name,
CASE WHEN c.is_published = 1 THEN "Yes" ELSE "No" END AS "published",
CASE WHEN c.publish_down > now() THEN "Yes"
WHEN c.publish_down > now() AND c.is_published = 0 THEN "Yes"
ELSE "No" END AS "expired"
FROM campaigns c
LEFT JOIN campaign_leads cl ON cl.campaign_id = c.id
LEFT JOIN lead_tags_xref ltx on cl.lead_id = ltx.lead_id
LEFT JOIN lead_tags lt on ltx.tag_id = lt.id
LEFT JOIN categories cat on c.category_id = cat.id
LEFT JOIN lead_lists_leads llist on cl.lead_id = llist.lead_id
LEFT JOIN lead_lists ll on llist.leadlist_id = ll.id
LEFT JOIN email_list_xref el on ll.id = el.leadlist_id
LEFT JOIN emails on el.email_id = emails.id
GROUP BY c.id;
In any case, be sure you have a proper composite index on each of these:
table campaign_leads columns campaign_id, lead_id
table lead_tags_xref columns lead_id, tag_id
table lead_lists_leads columns lead_id, leadlist_id
table email_list_xref columns leadlist_id, email_id
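One likely reason the joins get so large is that tags and segments both hang off the lead, so every lead contributes (number of tags) x (number of segments) joined rows. The following is a minimal sketch of that effect, not the real Mautic schema: it uses Python's built-in sqlite3 instead of MySQL, the tables are cut down to two columns each, and all data is made up for illustration.

```python
import sqlite3

# Hypothetical, reduced tables; only the join keys and one label column are kept.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE campaign_leads (campaign_id INTEGER, lead_id INTEGER);
CREATE TABLE lead_tags_xref (lead_id INTEGER, tag TEXT);
CREATE TABLE lead_lists_leads (lead_id INTEGER, segment TEXT);
INSERT INTO campaign_leads VALUES (1, 10), (1, 11);
INSERT INTO lead_tags_xref VALUES (10, 'a'), (10, 'b'), (11, 'a');
INSERT INTO lead_lists_leads VALUES (10, 's1'), (10, 's2'), (11, 's1');
""")

joined_rows, plain_count, distinct_count, tags = conn.execute("""
SELECT COUNT(*),
       COUNT(cl.lead_id),
       COUNT(DISTINCT cl.lead_id),
       GROUP_CONCAT(DISTINCT ltx.tag)
FROM campaign_leads cl
LEFT JOIN lead_tags_xref ltx ON ltx.lead_id = cl.lead_id
LEFT JOIN lead_lists_leads ll ON ll.lead_id = cl.lead_id
GROUP BY cl.campaign_id
""").fetchone()

# Lead 10 has 2 tags and 2 segments (4 joined rows), lead 11 has 1 of each (1 row).
print(joined_rows, plain_count, distinct_count, tags)
```

Only COUNT(DISTINCT ...) reports the two real leads here; the plain COUNT sees every multiplied row. The same multiplication affects GROUP_CONCAT, so adding DISTINCT there may also be worth considering if repeated tag or segment names are not wanted.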

Related

SQL Optimization - Query take 15 seconds

I have 400 rows in every table. I am relating the tables to each other with LEFT JOIN on their IDs, but my query takes 15 seconds. This is my query:
SELECT
sender.id AS id,
sender.letter AS letter,
sender.date AS date,
mediaseller.contract_number AS contract,
sender.company AS company,
brand.value AS brand,
sender.message_category AS message_category,
sender.message_format AS message_format,
sender.senderid AS senderid,
cpname.value AS cpname,
sid.value AS sid,
status.status AS status,
sender.remarks AS remarks,
user.name AS name,
sender.id AS download,
mediaseller.value AS mediaseller,
lob.value AS lob,
lob.subvalue AS sublob,
sms_type.value AS type_sms,
status.approval_date,
status.batch_date,
status.done_date,
status.decline_date
FROM status
LEFT JOIN sender ON status.trxid = sender.trxid
LEFT JOIN user ON status.userid = user.id
LEFT JOIN mediaseller ON sender.mediaseller = mediaseller.id
LEFT JOIN lob ON sender.industry_category = lob.id
LEFT JOIN sms_type ON sender.type_sms = sms_type.id
LEFT JOIN cpname ON sender.cpname = cpname.id
LEFT JOIN sid ON sender.trxid = sid.trxid
LEFT JOIN brand ON sender.brand = brand.id
WHERE status.hidden = 0
ORDER BY status.id DESC LIMIT 10
I am hoping the query can run in about one second :D
Please give me some advice. Thank you!
You are not filtering by anything other than the status. So keep the same SELECT list and try this FROM clause, which limits the status rows before joining:
FROM (SELECT s.*
FROM status s
WHERE s.hidden = 0
ORDER BY s.id DESC
LIMIT 10
) status
LEFT JOIN sender ON status.trxid = sender.trxid
LEFT JOIN user ON status.userid = user.id
LEFT JOIN mediaseller ON sender.mediaseller = mediaseller.id
LEFT JOIN lob ON sender.industry_category = lob.id
LEFT JOIN sms_type ON sender.type_sms = sms_type.id
LEFT JOIN cpname ON sender.cpname = cpname.id
LEFT JOIN sid ON sender.trxid = sid.trxid
LEFT JOIN brand ON sender.brand = brand.id
WHERE status.hidden = 0
ORDER BY status.id DESC LIMIT 10
You still need the outer ORDER BY and LIMIT, but they should be on much less data resulting in a performance improvement.
Note: I assume that you have declared all the ids as primary keys, so they have indexes.
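To check that limiting first and joining afterwards returns the same rows, here is a small sketch using Python's built-in sqlite3 rather than MySQL. The two tables are reduced, hypothetical versions of status and sender with made-up data, and each status row matches at most one sender row, which is the assumption under which the two forms are equivalent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE status (id INTEGER PRIMARY KEY, trxid INTEGER, hidden INTEGER);
CREATE TABLE sender (trxid INTEGER PRIMARY KEY, company TEXT);
""")
# Made-up data: 50 status rows, every odd id is hidden.
conn.executemany("INSERT INTO status VALUES (?, ?, ?)",
                 [(i, i, i % 2) for i in range(1, 51)])
conn.executemany("INSERT INTO sender VALUES (?, ?)",
                 [(i, f"company-{i}") for i in range(1, 51)])

# Original shape: join everything, then sort and cut to 10 rows.
join_then_limit = conn.execute("""
SELECT status.id, sender.company
FROM status
LEFT JOIN sender ON status.trxid = sender.trxid
WHERE status.hidden = 0
ORDER BY status.id DESC LIMIT 10
""").fetchall()

# Suggested shape: pick the 10 status rows first, then join only those.
limit_then_join = conn.execute("""
SELECT status.id, sender.company
FROM (SELECT s.*
      FROM status s
      WHERE s.hidden = 0
      ORDER BY s.id DESC
      LIMIT 10) status
LEFT JOIN sender ON status.trxid = sender.trxid
ORDER BY status.id DESC LIMIT 10
""").fetchall()

print(join_then_limit == limit_then_join)
```

If one of the joined tables can match several rows per status row, the outer LIMIT 10 would cut the result differently in the two forms, so that case needs a closer look.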

mysql query not working with where in condition

I am running the following query in MySQL.
SELECT jobtype_has_trade.jobtype_id,users_jobtype.user_id
FROM users
LEFT JOIN subcontractor ON users.subcontractor_id = subcontractor.id
AND subcontractor.quotations = "YES"
LEFT JOIN users_jobtype ON users.id = users_jobtype.user_id
AND users_jobtype.status = "A"
LEFT JOIN jobtype_has_trade ON users_jobtype.jobtype_trade_id = jobtype_has_trade.id
WHERE users.is_subcontractor = "YES"
AND users.has_android = "YES"
AND jobtype_has_trade.jobtype_id IN (1,3,4)
ORDER by users_jobtype.user_id ASC
and I am getting this output
From the above records I need only the user_ids that have all of jobtype_id 1, 3 and 4. So user_id 3 and 7 are acceptable, and 9 is not, because it has only 3 and 4 as jobtype_id. In short, how can I get only 3 and 7 as user_id from the above query?
One option is to use a HAVING clause with GROUP BY:
SELECT users_jobtype.user_id
FROM users
LEFT JOIN subcontractor ON users.subcontractor_id = subcontractor.id
AND subcontractor.quotations = "YES"
LEFT JOIN users_jobtype ON users.id = users_jobtype.user_id
AND users_jobtype.status = "A"
LEFT JOIN jobtype_has_trade ON users_jobtype.jobtype_trade_id = jobtype_has_trade.id
WHERE users.is_subcontractor = "YES"
AND users.has_android = "YES"
AND jobtype_has_trade.jobtype_id IN (1,3,4)
group by users_jobtype.user_id
having count(distinct jobtype_has_trade.jobtype_id)=3
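The GROUP BY / HAVING idea can be tried in isolation. This is a minimal sketch using Python's built-in sqlite3, with a single hypothetical table user_jobtypes standing in for the joined result and made-up rows modelled on the user_ids mentioned in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_jobtypes (user_id INTEGER, jobtype_id INTEGER);
INSERT INTO user_jobtypes VALUES
  (3, 1), (3, 3), (3, 4),
  (7, 1), (7, 3), (7, 4), (7, 4),   -- duplicate row on purpose
  (9, 3), (9, 4),                   -- missing jobtype 1
  (5, 1), (5, 2), (5, 3);           -- missing jobtype 4
""")

rows = conn.execute("""
SELECT user_id
FROM user_jobtypes
WHERE jobtype_id IN (1, 3, 4)
GROUP BY user_id
HAVING COUNT(DISTINCT jobtype_id) = 3
ORDER BY user_id
""").fetchall()

matching_users = [user_id for (user_id,) in rows]
print(matching_users)
```

The DISTINCT inside COUNT matters: without it, a user with a duplicated row could reach 3 while still missing one of the required job types.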
Alternatively, just check for the existence of each jobtype_id (1, 3 and 4) for each user. I don't know your schema:
if you have jobtype_id in users_jobtype, then you don't need the join to jobtype_has_trade in the subqueries in the WHERE clause (in my solution).
SELECT jobtype_has_trade.jobtype_id,
users_jobtype.user_id
FROM users
LEFT JOIN subcontractor ON users.subcontractor_id = subcontractor.id
AND subcontractor.quotations = "YES"
LEFT JOIN users_jobtype ON users.id = users_jobtype.user_id
AND users_jobtype.status = "A"
LEFT JOIN jobtype_has_trade ON users_jobtype.jobtype_trade_id = jobtype_has_trade.id
WHERE users.is_subcontractor = "YES"
AND users.has_android = "YES"
AND jobtype_has_trade.jobtype_id IN (1,
3,
4)
and exists( select jobtype_id from jobtype_has_trade inner join users_jobtype on users_jobtype.jobtype_trade_id=jobtype_has_trade.id where users_jobtype.user_id=users.id and
jobtype_has_trade.jobtype_id=1
)
and exists( select jobtype_id from jobtype_has_trade inner join users_jobtype on users_jobtype.jobtype_trade_id=jobtype_has_trade.id where users_jobtype.user_id=users.id and
jobtype_has_trade.jobtype_id=3
)
and exists( select jobtype_id from jobtype_has_trade inner join users_jobtype on users_jobtype.jobtype_trade_id=jobtype_has_trade.id where users_jobtype.user_id=users.id and
jobtype_has_trade.jobtype_id=4
)
ORDER BY users_jobtype.user_id ASC

MySQL - I want my table JOIN to return a single row based on a condition

In the code below there are multiple entries in the 'leads' table with the same 'account_id'. I want the join to return a single row: the one with the minimum value of another field, 'date_entered'. I cannot use 'group by' on account_id, as I intend to use 'group by' on BU and get the sums accordingly. Please help.
select uc.business_unit_dp_c,
FORMAT(SUM(CASE
WHEN lc.source_leads_c not in ('Discovery','Discovery SuperEmail','Self Generated','Partner','Channel_Partner') and k.id<>'' THEN k.order_value
WHEN lc.source_leads_c not in ('Discovery','Discovery SuperEmail','Self Generated','Partner','Channel_Partner') and s.id<>'' THEN s.sivr_aiv_inr
ELSE 0
END),0)
as Online,
FORMAT(SUM(CASE
WHEN lc.source_leads_c in ('Discovery', 'Discovery SuperEmail') and k.id<>'' THEN k.order_value
WHEN lc.source_leads_c in ('Discovery', 'Discovery SuperEmail') and s.id<>'' THEN s.sivr_aiv_inr
ELSE 0
END),0)
as Discovery,
FORMAT(SUM(CASE
WHEN lc.source_leads_c in ('Partner','Channel_Partner') and k.id<>'' THEN k.order_value
WHEN lc.source_leads_c in ('Partner','Channel_Partner') and s.id<>'' THEN s.sivr_aiv_inr
ELSE 0
END),0)
as Self_Generated_CP
from opportunities as o
left join opportunities_cstm as oc on o.id=oc.id_c
left join opportunities_knw_caf_1_c as ok on o.id=ok.opportunities_knw_caf_1opportunities_ida
left join knw_caf as k on ok.opportunities_knw_caf_1knw_caf_idb=k.id
left join opportunities_knw_sivr_caf_1_c as os on os.opportunities_knw_sivr_caf_1opportunities_ida=o.id
left join knw_sivr_caf as s on s.id=os.opportunities_knw_sivr_caf_1knw_sivr_caf_idb
left join accounts_opportunities as ao on ao.opportunity_id=o.id
left join leads as l on l.account_id=ao.account_id and l.account_id <> ''
left join leads_cstm as lc on lc.id_c=l.id
left join users_cstm as uc on uc.id_c=o.assigned_user_id
where o.sales_stage='clw' and
(k.id<>'' or s.id<>'') and o.jira_raise_date <> '' and
(o.tranjection_type in ('Fresh Plan / New Customer','Number Activation','Revival','Balance Amount') or o.transaction_sivr in ('Paid Project','Number Allocation','New Feature')) and
o.jira_raise_date between '2016-06-01' and curdate()
group by uc.business_unit_dp_c
Write SQL just as you described
Select *
from opportunities o
left join opportunities_cstm oc
on o.id = oc.id_c
left join opportunities_knw_caf_1_c ok
on o.id = ok.opportunities_knw_caf_1opportunities_ida
left join knw_caf k
on ok.opportunities_knw_caf_1knw_caf_idb = k.id
left join opportunities_knw_sivr_caf_1_c os
on os.opportunities_knw_sivr_caf_1opportunities_ida=o.id
left join knw_sivr_caf s
on s.id = os.opportunities_knw_sivr_caf_1knw_sivr_caf_idb
left join accounts_opportunities ao
on ao.opportunity_id=o.id
left join leads l
on l.account_id=ao.account_id
and l.account_id <> ''
left join leads_cstm lc
on lc.id_c = l.id
left join users_cstm uc
on uc.id_c = o.assigned_user_id
where o.sales_stage = 'clw'
and (k.id <> '' or s.id <> '')
and o.jira_raise_date <> ''
and (o.tranjection_type in
('Fresh Plan / New Customer',
'Number Activation','Revival','Balance Amount') or
o.transaction_sivr in
('Paid Project','Number Allocation','New Feature'))
and o.jira_raise_date between '2016-06-01' and curdate()
-- next, add this additional predicate to the WHERE clause,
-- using the table that has the date_entered column:
and date_entered =
(Select Min(date_entered)
From accounts_opportunities os
join tableWithDateEntered dr -- Table w/DateEntered
on ????? -- proper join criteria here
Where os.account_id = l.account_id)
-- or, as constructed by the OP (and simplified by me: since both account_id and date_entered
-- are in table leads, that is the only table the subquery needs to reference):
and l.date_entered =
(select min(date_entered)
from leads
where account_id = l.account_id)
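The correlated-subquery predicate can be tested on its own. The sketch below uses Python's built-in sqlite3 with a reduced, hypothetical leads table (three columns, made-up rows) and keeps only the earliest lead per account.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE leads (id INTEGER PRIMARY KEY, account_id TEXT, date_entered TEXT);
INSERT INTO leads VALUES
  (1, 'A', '2016-07-01'),
  (2, 'A', '2016-06-15'),
  (3, 'B', '2016-08-01'),
  (4, 'B', '2016-08-20');
""")

# For each outer row l, the subquery finds the minimum date within the same account.
earliest_leads = conn.execute("""
SELECT l.id, l.account_id, l.date_entered
FROM leads l
WHERE l.date_entered =
      (SELECT MIN(date_entered)
       FROM leads
       WHERE account_id = l.account_id)
ORDER BY l.account_id
""").fetchall()

print(earliest_leads)
```

If two leads of one account share the same minimum date_entered, both rows are returned, so a tie-breaker (for example on id) may still be needed before summing.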
select min(C.date),C.Customer_Code from (
select InvoiceNo,month(InvoiceDate) as date,Customer_Code,Createddate from tbl_Invoices A Inner Join tbl_customer B on A.customer_Code=B.CustomerCode
where YEAR(InvoiceDate)='2017'
and CustomerCode not in (select CustomerCode from tbl_customer where year(createddate) in (year(getdate())))
and CustomerCode not in (select customer_Code from tbl_Invoices where year(InvoiceDate) in (year(getdate())-1))
and CustomerCode in (select customer_Code from tbl_Invoices where year(InvoiceDate) not in (year(getdate())))
--group by Customer_Code,Createddate,InvoiceNo
)C group by C.Customer_Code

Selecting multiple tables with matching columns but only displaying unique results

Here is my existing SQL query
SELECT DISTINCT
par.WorkOrder,
l.Address,
l.Subdivision,
eml.MailDate,
lp.WorkDate
FROM
parsed AS par
LEFT JOIN emails AS eml ON eml.EmailID = par.OriginID
LEFT JOIN list AS l ON par.WorkOrder = l.WorkOrder
LEFT JOIN locateparsed AS lp ON par.WorkOrder = lp.WorkOrder
WHERE
par.Status != 0
AND l.Completed = 0
AND (lp.WorkDate IS NOT NULL OR eml.MailDate IS NOT NULL)
GROUP BY
par.WorkOrder
Right now it will only select WorkOrder matches from the parsed table. How can I also have it select WorkOrder from the locateparsed table and combine them? The best way to describe this would be something like
SELECT DISTINCT par.WorkOrder OR lp.WorkOrder FROM....
UPDATE:
Here is the completed query I am using. I now just need to sort out a date issue with results that show up as uncompleted because they were entered by mistake.
SELECT
temp.WorkOrder,
l.Address,
l.Subdivision,
eml.MailDate,
temp.WorkDate
FROM (
(SELECT par.WorkOrder, lp.WorkDate, par.OriginID, par.Status
FROM parsed AS par
LEFT JOIN locateparsed AS lp ON par.WorkOrder = lp.WorkOrder)
UNION ALL
(SELECT lp.WorkOrder, lp.WorkDate, par.OriginID, '1' AS Status
FROM parsed AS par
RIGHT JOIN locateparsed AS lp ON par.WorkOrder = lp.WorkOrder
WHERE par.WorkOrder IS NULL)
) AS temp
LEFT JOIN emails AS eml ON eml.EmailID = temp.OriginID
LEFT JOIN list AS l ON temp.WorkOrder = l.WorkOrder
WHERE
(temp.Status != 0 OR eml.Parsed IS NULL)
AND l.Completed = 0
AND (temp.WorkDate IS NOT NULL OR eml.MailDate IS NOT NULL)
GROUP BY
temp.WorkOrder
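SELECT DISTINCT par.WorkOrder and GROUP BY par.WorkOrder accomplish the same thing, so you don't need to use both.
As Yuri suggested, use a full outer join. If your database has no FULL OUTER JOIN syntax (MySQL, for example, does not), you can emulate one by combining a LEFT JOIN and a RIGHT JOIN with UNION ALL: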
SELECT
temp.WorkOrder,
l.Address,
l.Subdivision,
eml.MailDate,
temp.WorkDate
FROM (
(SELECT par.WorkOrder, lp.WorkDate, par.OriginID, par.Status
FROM parsed AS par
LEFT JOIN locateparsed AS lp ON par.WorkOrder = lp.WorkOrder)
UNION ALL
(SELECT lp.WorkOrder, lp.WorkDate, par.OriginID, '1' AS Status
FROM parsed AS par
RIGHT JOIN locateparsed AS lp ON par.WorkOrder = lp.WorkOrder
WHERE par.WorkOrder IS NULL)
) AS temp
LEFT JOIN emails AS eml ON eml.EmailID = temp.OriginID
LEFT JOIN list AS l ON temp.WorkOrder = l.WorkOrder
WHERE
temp.Status != 0
AND l.Completed = 0
AND (temp.WorkDate IS NOT NULL OR eml.MailDate IS NOT NULL)
GROUP BY
temp.WorkOrder
Note: In order to include the results from locateparsed, I added '1' AS Status, otherwise the WHERE condition would exclude all rows from locateparsed without a matching WorkOrder in parsed.
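The UNION ALL emulation of a full outer join can be seen on a tiny example. This sketch uses Python's built-in sqlite3 with reduced, hypothetical versions of parsed and locateparsed and made-up rows. The second branch is written as a LEFT JOIN with the tables swapped, which is equivalent to the RIGHT JOIN above and also runs on older SQLite builds that lack RIGHT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parsed (WorkOrder INTEGER, Status INTEGER);
CREATE TABLE locateparsed (WorkOrder INTEGER, WorkDate TEXT);
INSERT INTO parsed VALUES (100, 1), (101, 1);
INSERT INTO locateparsed VALUES (101, '2020-01-02'), (102, '2020-01-03');
""")

combined = conn.execute("""
-- every row of parsed, with its locateparsed match if there is one
SELECT par.WorkOrder, lp.WorkDate
FROM parsed par
LEFT JOIN locateparsed lp ON par.WorkOrder = lp.WorkOrder
UNION ALL
-- plus the locateparsed rows that have no match in parsed
SELECT lp.WorkOrder, lp.WorkDate
FROM locateparsed lp
LEFT JOIN parsed par ON par.WorkOrder = lp.WorkOrder
WHERE par.WorkOrder IS NULL
ORDER BY 1
""").fetchall()

print(combined)
```

The IS NULL filter in the second branch is what keeps matched work orders from appearing twice, which is why UNION ALL is safe here.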

SQL returns incorrect data using 2 left joins

I have written a MySQL script that returns incorrect data. I am quite fluent in SQL, but this query is not returning correct results. Can someone have a look and see what's going on? The problem is with noOfBids and noOfRatedTimes: the values are the same for both columns, and they are large values too.
select
a.user_name as userName,
coalesce(count(b.sp_user_name),0) as noOfBids,
coalesce(ROUND(AVG(b.a_amount),2),0) as avgAmount,
coalesce(count(d.sp_user_name),0) as noOfRatedTimes,
coalesce(ROUND(AVG(d.user_rate),2),0)
from users a
left join project_imds b
on b.sp_user_name = a.user_name
left join projects c
on b.project_code = c.project_code
left join sp_user_rating d
on d.sp_user_name = b.sp_user_name
where a.user_type = 'SP'
and a.active = 'Y'
group by a.user_name
order by coalesce(ROUND(AVG(d.user_rate),2),0) desc;
I have created a workaround for this by creating a temp table to get the avg values and joining it to the main query.
Since I don't know the specifics of the data behind your query, this is only a guess. But perhaps you'd rather join "sp_user_rating" directly to "users", changing
left join sp_user_rating d
on d.sp_user_name = b.sp_user_name
to
left join sp_user_rating d
on d.sp_user_name = a.user_name
select
a.user_name as userName,
coalesce(count(b.sp_user_name),0) as noOfBids,
coalesce(ROUND(AVG(b.a_amount),2),0) as avgAmount,
coalesce(count(d.sp_user_name),0) as noOfRatedTimes,
coalesce(ROUND(AVG(d.user_rate),2),0)
from users as a
left join project_imds as b
on b.sp_user_name = a.user_name
left join projects as c
on b.project_code = c.project_code
left join sp_user_rating as d
on d.sp_user_name = b.sp_user_name
where a.user_type = 'SP'
and a.active = 'Y'
group by a.user_name
order by coalesce(ROUND(AVG(d.user_rate),2),0) desc;
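A possible explanation for the identical, inflated counts is that bids and ratings are both joined per user, so each user produces (number of bids) x (number of ratings) rows and both COUNTs see all of them. The sketch below illustrates this with Python's built-in sqlite3; the tables bids and ratings are hypothetical stand-ins for project_imds and sp_user_rating, and the data is made up. It also shows one way around it, similar in spirit to the temp-table workaround: aggregate each child table on its own before joining.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_name TEXT);
CREATE TABLE bids (sp_user_name TEXT, a_amount REAL);
CREATE TABLE ratings (sp_user_name TEXT, user_rate REAL);
INSERT INTO users VALUES ('ann'), ('bob');
INSERT INTO bids VALUES ('ann', 100), ('ann', 200), ('ann', 300);
INSERT INTO ratings VALUES ('ann', 4), ('ann', 5);
""")

# Joining both child tables directly: 3 bids x 2 ratings = 6 rows for ann.
naive = conn.execute("""
SELECT a.user_name, COUNT(b.sp_user_name), COUNT(d.sp_user_name)
FROM users a
LEFT JOIN bids b ON b.sp_user_name = a.user_name
LEFT JOIN ratings d ON d.sp_user_name = a.user_name
GROUP BY a.user_name
ORDER BY a.user_name
""").fetchall()

# Aggregating each child table first: at most one row per user on each side.
pre_aggregated = conn.execute("""
SELECT a.user_name, COALESCE(b.n, 0), COALESCE(d.n, 0)
FROM users a
LEFT JOIN (SELECT sp_user_name, COUNT(*) AS n
           FROM bids GROUP BY sp_user_name) b
       ON b.sp_user_name = a.user_name
LEFT JOIN (SELECT sp_user_name, COUNT(*) AS n
           FROM ratings GROUP BY sp_user_name) d
       ON d.sp_user_name = a.user_name
ORDER BY a.user_name
""").fetchall()

print(naive)
print(pre_aggregated)
```

The same derived tables could also carry the AVG columns, which keeps the whole report in a single query without a temp table.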