Selecting multiple tables with matching columns but only displaying unique results - mysql

Here is my existing SQL query
SELECT DISTINCT
par.WorkOrder,
l.Address,
l.Subdivision,
eml.MailDate,
lp.WorkDate
FROM
parsed AS par
LEFT JOIN emails AS eml ON eml.EmailID = par.OriginID
LEFT JOIN list AS l ON par.WorkOrder = l.WorkOrder
LEFT JOIN locateparsed AS lp ON par.WorkOrder = lp.WorkOrder
WHERE
par.Status != 0
AND l.Completed = 0
AND (lp.WorkDate IS NOT NULL OR eml.MailDate IS NOT NULL)
GROUP BY
par.WorkOrder
Right now it will only select WorkOrder matches from the parsed table. How can I also have it select WorkOrder from the locateparsed table and combine them? The best way to describe this would be something like
SELECT DISTINCT par.WorkOrder OR lp.WorkOrder FROM....
UPDATE:
Here is the completed query I am using. I still need to sort out a date issue: some results that were entered by mistake come up as uncompleted.
SELECT
temp.WorkOrder,
l.Address,
l.Subdivision,
eml.MailDate,
temp.WorkDate
FROM (
(SELECT par.WorkOrder, lp.WorkDate, par.OriginID, par.Status
FROM parsed AS par
LEFT JOIN locateparsed AS lp ON par.WorkOrder = lp.WorkOrder)
UNION ALL
(SELECT lp.WorkOrder, lp.WorkDate, par.OriginID, '1' AS Status
FROM parsed AS par
RIGHT JOIN locateparsed AS lp ON par.WorkOrder = lp.WorkOrder
WHERE par.WorkOrder IS NULL)
) AS temp
LEFT JOIN emails AS eml ON eml.EmailID = temp.OriginID
LEFT JOIN list AS l ON temp.WorkOrder = l.WorkOrder
WHERE
(temp.Status != 0 OR eml.Parsed IS NULL)
AND l.Completed = 0
AND (temp.WorkDate IS NOT NULL OR eml.MailDate IS NOT NULL)
GROUP BY
temp.WorkOrder

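GROUP BY par.WorkOrder already collapses the result to one row per WorkOrder, so SELECT DISTINCT adds nothing on top of it; you don't need to use both.
As Yuri suggested, what you want is a full outer join. MySQL has no FULL OUTER JOIN, so emulate it with a UNION ALL of a LEFT JOIN and the unmatched rows of a RIGHT JOIN: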
SELECT
temp.WorkOrder,
l.Address,
l.Subdivision,
eml.MailDate,
temp.WorkDate
FROM (
(SELECT par.WorkOrder, lp.WorkDate, par.OriginID, par.Status
FROM parsed AS par
LEFT JOIN locateparsed AS lp ON par.WorkOrder = lp.WorkOrder)
UNION ALL
(SELECT lp.WorkOrder, lp.WorkDate, par.OriginID, '1' AS Status
FROM parsed AS par
RIGHT JOIN locateparsed AS lp ON par.WorkOrder = lp.WorkOrder
WHERE par.WorkOrder IS NULL)
) AS temp
LEFT JOIN emails AS eml ON eml.EmailID = temp.OriginID
LEFT JOIN list AS l ON temp.WorkOrder = l.WorkOrder
WHERE
temp.Status != 0
AND l.Completed = 0
AND (temp.WorkDate IS NOT NULL OR eml.MailDate IS NOT NULL)
GROUP BY
temp.WorkOrder
Note: In order to include the results from locateparsed, I added '1' AS Status, otherwise the WHERE condition would exclude all rows from locateparsed without a matching WorkOrder in parsed.
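The full-outer-join emulation in the query above can be tried on toy data. This is only a sketch: it uses Python's built-in sqlite3 module so it runs without a MySQL server, the tables are cut down to the columns that matter, and the sample rows are invented. The second branch is written as a LEFT JOIN starting from locateparsed, which is equivalent to the RIGHT JOIN in the answer and also works on SQLite versions that lack RIGHT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parsed (WorkOrder INTEGER, Status INTEGER);
    CREATE TABLE locateparsed (WorkOrder INTEGER, WorkDate TEXT);
    -- Invented sample data: WorkOrder 1 is only in parsed,
    -- 2 is in both tables, 3 is only in locateparsed.
    INSERT INTO parsed VALUES (1, 1), (2, 1);
    INSERT INTO locateparsed VALUES (2, '2015-01-02'), (3, '2015-01-03');
""")

# Branch 1: every parsed row, with locateparsed data where it matches.
# Branch 2: locateparsed rows that have no match in parsed (anti-join),
#           with a constant Status so a later Status filter keeps them.
rows = conn.execute("""
    SELECT par.WorkOrder, lp.WorkDate, par.Status
    FROM parsed AS par
    LEFT JOIN locateparsed AS lp ON par.WorkOrder = lp.WorkOrder
    UNION ALL
    SELECT lp.WorkOrder, lp.WorkDate, 1 AS Status
    FROM locateparsed AS lp
    LEFT JOIN parsed AS par ON par.WorkOrder = lp.WorkOrder
    WHERE par.WorkOrder IS NULL
    ORDER BY 1
""").fetchall()

for row in rows:
    print(row)
```

Because the second branch only returns unmatched rows, UNION ALL cannot produce duplicates here, and it avoids the de-duplication pass that a plain UNION would perform.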

Related

mysql query takes 3 hours to run and process

I have a query that is run by a cron job late at night. The result is then processed through a generator, because it has to populate another database and I run some additional processing and checks before it is sent to the other DB.
Is there any way to speed up this query, ideally keeping it as a single query? Or will I be forced to create separate queries and join the data in PHP? It queries the main Mautic database.
SELECT c.id as "campaign_id",
c.created_by_user,
c.name,
c.date_added,
c.date_modified,
(SELECT DISTINCT COUNT(cl.lead_id)) as number_of_leads,
GROUP_CONCAT(lt.tag) as tags,
cat.title as category_name,
GROUP_CONCAT(ll.name) as segment_name,
GROUP_CONCAT(emails.name) as email_name,
CASE WHEN c.is_published = 1 THEN "Yes" ELSE "No" END AS "published",
CASE WHEN c.publish_down > now() THEN "Yes"
WHEN c.publish_down > now() AND c.is_published = 0 THEN "Yes"
ELSE "No" END AS "expired"
FROM campaigns c
LEFT JOIN campaign_leads cl ON cl.campaign_id = c.id
LEFT JOIN lead_tags_xref ltx on cl.lead_id = ltx.lead_id
LEFT JOIN lead_tags lt on ltx.tag_id = lt.id
LEFT JOIN categories cat on c.category_id = cat.id
LEFT JOIN lead_lists_leads llist on cl.lead_id = llist.lead_id
LEFT JOIN lead_lists ll on llist.leadlist_id = ll.id
LEFT JOIN email_list_xref el on ll.id = el.leadlist_id
LEFT JOIN emails on el.email_id = emails.id
GROUP BY c.id;
Here is an image of the EXPLAIN output:
https://prnt.sc/qQtUaLK3FIpQ
Definitions
Campaign Table:
https://prnt.sc/6JXRGyMsWpcd
Campaign_leads table
https://prnt.sc/pOq0_SxW2spe
lead_tags_xref table
https://prnt.sc/oKYn92O82gHL
lead_tags table
https://prnt.sc/ImH81ECF6Ly1
categories table
https://prnt.sc/azQj_Xwq3dw9
lead_lists_lead table
https://prnt.sc/x5C5fiBFP2N7
lead_lists table
https://prnt.sc/bltkM0f3XeaH
email_list_xref table
https://prnt.sc/kXABVJSYWEUI
emails table
https://prnt.sc/7fZcBir1a6QT
I am only expecting 871 rows to be returned. I have identified that the joins can be very large, in the tens of thousands.
The (SELECT DISTINCT COUNT(cl.lead_id)) subselect looks useless; you are probably looking for COUNT(DISTINCT ...).
That way you avoid a nested select for each row of the main select:
SELECT c.id as "campaign_id",
c.created_by_user,
c.name,
c.date_added,
c.date_modified,
COUNT(DISTINCT cl.lead_id) as number_of_leads,
GROUP_CONCAT(lt.tag) as tags,
cat.title as category_name,
GROUP_CONCAT(ll.name) as segment_name,
GROUP_CONCAT(emails.name) as email_name,
CASE WHEN c.is_published = 1 THEN "Yes" ELSE "No" END AS "published",
CASE WHEN c.publish_down > now() THEN "Yes"
WHEN c.publish_down > now() AND c.is_published = 0 THEN "Yes"
ELSE "No" END AS "expired"
FROM campaigns c
LEFT JOIN campaign_leads cl ON cl.campaign_id = c.id
LEFT JOIN lead_tags_xref ltx on cl.lead_id = ltx.lead_id
LEFT JOIN lead_tags lt on ltx.tag_id = lt.id
LEFT JOIN categories cat on c.category_id = cat.id
LEFT JOIN lead_lists_leads llist on cl.lead_id = llist.lead_id
LEFT JOIN lead_lists ll on llist.leadlist_id = ll.id
LEFT JOIN email_list_xref el on ll.id = el.leadlist_id
LEFT JOIN emails on el.email_id = emails.id
GROUP BY c.id;
In any case, make sure you have a proper composite index on each of these:
table campaign_leads columns campaign_id, lead_id
table lead_tags_xref columns lead_id, tag_id
table lead_lists_leads columns lead_id, leadlist_id
table email_list_xref columns leadlist_id, email_id
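To see why COUNT(DISTINCT ...) matters once several one-to-many joins are stacked, here is a small sketch. It uses Python's sqlite3 module and a simplified, hypothetical schema (lead_tags is flattened to lead_id plus tag, and the rows are invented); the index name is also made up. The behaviour of COUNT, COUNT(DISTINCT) and GROUP_CONCAT(DISTINCT) shown here is the same in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE campaigns (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE campaign_leads (campaign_id INTEGER, lead_id INTEGER);
    CREATE TABLE lead_tags (lead_id INTEGER, tag TEXT);  -- simplified
    -- Composite index covering the join column and the counted column.
    CREATE INDEX idx_campaign_leads ON campaign_leads (campaign_id, lead_id);

    INSERT INTO campaigns VALUES (1, 'Spring');
    INSERT INTO campaign_leads VALUES (1, 10), (1, 11);
    INSERT INTO lead_tags VALUES (10, 'a'), (10, 'b'), (11, 'a');
""")

# Lead 10 has two tags, so the join yields three rows for two leads.
row = conn.execute("""
    SELECT c.id,
           COUNT(cl.lead_id)             AS joined_rows,
           COUNT(DISTINCT cl.lead_id)    AS number_of_leads,
           GROUP_CONCAT(DISTINCT lt.tag) AS tags
    FROM campaigns c
    LEFT JOIN campaign_leads cl ON cl.campaign_id = c.id
    LEFT JOIN lead_tags lt ON lt.lead_id = cl.lead_id
    GROUP BY c.id
""").fetchone()

campaign_id, joined_rows, number_of_leads, tags = row
print(joined_rows, number_of_leads, sorted(tags.split(",")))
```

Each extra one-to-many join multiplies the intermediate rows per campaign, which is probably where most of the three hours goes. If the indexes are not enough, aggregating each branch (tags, segments, emails) in its own subquery and joining the pre-aggregated results to campaigns is a common way to keep the intermediate result small.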

MySQL - I want my table JOIN to return a single row based on a condition

In the code below there are multiple entries in the 'leads' table with the same 'account_id'. I want the join to return a single row: the one with the minimum value of another field, 'date_entered'. I cannot use 'group by' on account_id because I intend to use 'group by' on BU and sum accordingly. Please help.
select uc.business_unit_dp_c,
FORMAT(SUM(CASE
WHEN lc.source_leads_c not in ('Discovery','Discovery SuperEmail','Self Generated','Partner','Channel_Partner') and k.id<>'' THEN k.order_value
WHEN lc.source_leads_c not in ('Discovery','Discovery SuperEmail','Self Generated','Partner','Channel_Partner') and s.id<>'' THEN s.sivr_aiv_inr
ELSE 0
END),0)
as Online,
FORMAT(SUM(CASE
WHEN lc.source_leads_c in ('Discovery', 'Discovery SuperEmail') and k.id<>'' THEN k.order_value
WHEN lc.source_leads_c in ('Discovery', 'Discovery SuperEmail') and s.id<>'' THEN s.sivr_aiv_inr
ELSE 0
END),0)
as Discovery,
FORMAT(SUM(CASE
WHEN lc.source_leads_c in ('Partner','Channel_Partner') and k.id<>'' THEN k.order_value
WHEN lc.source_leads_c in ('Partner','Channel_Partner') and s.id<>'' THEN s.sivr_aiv_inr
ELSE 0
END),0)
as Self_Generated_CP
from opportunities as o
left join opportunities_cstm as oc on o.id=oc.id_c
left join opportunities_knw_caf_1_c as ok on o.id=ok.opportunities_knw_caf_1opportunities_ida
left join knw_caf as k on ok.opportunities_knw_caf_1knw_caf_idb=k.id
left join opportunities_knw_sivr_caf_1_c as os on os.opportunities_knw_sivr_caf_1opportunities_ida=o.id
left join knw_sivr_caf as s on s.id=os.opportunities_knw_sivr_caf_1knw_sivr_caf_idb
left join accounts_opportunities as ao on ao.opportunity_id=o.id
left join leads as l on l.account_id=ao.account_id and l.account_id <> ''
left join leads_cstm as lc on lc.id_c=l.id
left join users_cstm as uc on uc.id_c=o.assigned_user_id
where o.sales_stage='clw' and
(k.id<>'' or s.id<>'') and o.jira_raise_date <> '' and
(o.tranjection_type in ('Fresh Plan / New Customer','Number Activation','Revival','Balance Amount') or o.transaction_sivr in ('Paid Project','Number Allocation','New Feature')) and
o.jira_raise_date between '2016-06-01' and curdate()
group by uc.business_unit_dp_c
Write SQL just as you described
Select *
from opportunities o
left join opportunities_cstm oc
on o.id = oc.id_c
left join opportunities_knw_caf_1_c ok
on o.id = ok.opportunities_knw_caf_1opportunities_ida
left join knw_caf k
on ok.opportunities_knw_caf_1knw_caf_idb = k.id
left join opportunities_knw_sivr_caf_1_c os
on os.opportunities_knw_sivr_caf_1opportunities_ida=o.id
left join knw_sivr_caf s
on s.id = os.opportunities_knw_sivr_caf_1knw_sivr_caf_idb
left join accounts_opportunities ao
on ao.opportunity_id=o.id
left join leads l
on l.account_id=ao.account_id
and l.account_id <> ''
left join leads_cstm lc
on lc.id_c = l.id
left join users_cstm uc
on uc.id_c = o.assigned_user_id
where o.sales_stage = 'clw'
and (k.id <> '' or s.id <> '')
and o.jira_raise_date <> ''
and (o.tranjection_type in
('Fresh Plan / New Customer',
'Number Activation','Revival','Balance Amount') or
o.transaction_sivr in
('Paid Project','Number Allocation','New Feature'))
and o.jira_raise_date between '2016-06-01' and curdate()
-- next, add this additional predicate to the Where clause,
-- using the table with the DateEntered column:
and date_entered =
(Select Min(date_entered)
From accounts_opportunities os
join tableWithDateEntered dr -- Table w/DateEntered
on ????? -- proper join criteria here
Where os.account_id = l.account_id)
--- or as constructed by op ( and simplified by me, since both account_id and date_entered are in table leads, that's the only table that needs to be referenced in the subquery).....
and l.date_entered =
(select min(date_entered)
from leads
where account_id = l.account_id)
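The effect of the correlated MIN(date_entered) predicate can be checked on a cut-down example. The sketch below uses Python's sqlite3 module; the deals table, its columns and all rows are hypothetical stand-ins for the opportunity side of the original query, and only the leads column names account_id and date_entered come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE leads (id INTEGER, account_id TEXT,
                        date_entered TEXT, source TEXT);
    CREATE TABLE deals (account_id TEXT, business_unit TEXT,
                        order_value INTEGER);      -- hypothetical table
    INSERT INTO leads VALUES
        (1, 'A', '2016-06-10', 'Partner'),
        (2, 'A', '2016-07-01', 'Discovery'),       -- second lead, same account
        (3, 'B', '2016-06-20', 'Discovery');
    INSERT INTO deals VALUES ('A', 'North', 100), ('B', 'North', 50);
""")

SUMS = """
    SELECT d.business_unit,
           SUM(CASE WHEN l.source = 'Partner'
                    THEN d.order_value ELSE 0 END) AS partner,
           SUM(CASE WHEN l.source = 'Discovery'
                    THEN d.order_value ELSE 0 END) AS discovery
    FROM deals d
    LEFT JOIN leads l ON l.account_id = d.account_id
"""

# Without the predicate, account A's deal is counted once per lead.
naive = conn.execute(SUMS + " GROUP BY d.business_unit").fetchall()

# With it, only the earliest lead of each account survives the join.
earliest_only = conn.execute(SUMS + """
    WHERE l.date_entered = (SELECT MIN(date_entered)
                            FROM leads
                            WHERE account_id = l.account_id)
    GROUP BY d.business_unit
""").fetchall()

print(naive)
print(earliest_only)
```

Note that putting the predicate in WHERE drops deals that have no lead at all; moving it into the ON clause of the join would keep them, if that is what the report needs.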
select min(C.date),C.Customer_Code from (
select InvoiceNo,month(InvoiceDate) as date,Customer_Code,Createddate from tbl_Invoices A Inner Join tbl_customer B on A.customer_Code=B.CustomerCode
where YEAR(InvoiceDate)='2017'
and CustomerCode not in (select CustomerCode from tbl_customer where year(createddate) in (year(getdate())))
and CustomerCode not in (select customer_Code from tbl_Invoices where year(InvoiceDate) in (year(getdate())-1))
and CustomerCode in (select customer_Code from tbl_Invoices where year(InvoiceDate) not in (year(getdate())))
--group by Customer_Code,Createddate,InvoiceNo
)C group by C.Customer_Code

How to count the results of a where clause in order to calculate a proportion in a MySQL select clause?

I'm starting out with this query, which gives me back 8 records with a "claimed" status. I'm looking to see if any of the addresses in the invites-from-address column are different from that in the moves-from-address column :
SELECT i.id, i.company_id, i.status,
ia_f.base_street as "invites-from-address", a_f.base_street as "moves-from-address",
ia_t.base_street as "invites-to-address", a_t.base_street as "moves-to-address", i.`mover_first_name`,
i.mover_last_name, i.`to_address_id`
FROM invites i
JOIN moves m ON i.id = m.`claimed_invite_id`
JOIN `invite_addresses` ia_f ON ia_f.id = i.`from_address_id`
JOIN addresses a_f ON a_f.id = m.from_address_id
JOIN `invite_addresses` ia_t ON ia_t.id = i.to_address_id
JOIN addresses a_t ON a_t.id = m.to_address_id
WHERE i.`company_id` = 1040345
GROUP BY id
In the query below I'm trying to create an average_discrepancy column on the fly that shows the proportion of addresses that differ between invites-from-address and moves-from-address. I was able to check for address discrepancies with a WHERE clause that tests that ia_f.base_street is not equal to a_f.base_street (aliased as invites-from-address and moves-from-address respectively), but when I put this WHERE clause inside the count function in my SELECT clause it doesn't work. Is it because I can't place a WHERE clause inside a SELECT, or inside a count function, or both? And is there also a problem with dividing the results of two calls to the count function in my SELECT clause?
SELECT i.id, i.company_id, i.status,
count(WHERE ia_f.base_street != a_f.base_street)/count(i.status="claimed") as "average_discrepancy",
ia_f.base_street as "invites-from-address", a_f.base_street as "moves-from-address",
ia_t.base_street as "invites-to-address", a_t.base_street as "moves-to-address",
i.`mover_first_name`,
i.mover_last_name, i.`to_address_id`
FROM invites i
JOIN moves m ON i.id = m.`claimed_invite_id`
JOIN `invite_addresses` ia_f ON ia_f.id = i.`from_address_id`
JOIN addresses a_f ON a_f.id = m.from_address_id
JOIN `invite_addresses` ia_t ON ia_t.id = i.to_address_id
JOIN addresses a_t ON a_t.id = m.to_address_id
WHERE i.`company_id` = 1040345
AND i.status = "claimed"
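You can't put a WHERE clause inside COUNT, and COUNT(expr) counts every non-NULL value, so count(i.status="claimed") counts rows whether the comparison is true or false. Put the condition into a SUM over a CASE expression instead. Something like this would do the trick: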
SELECT i.id, i.company_id, i.status,
SUM(CASE WHEN ia_f.base_street != a_f.base_street THEN 1 ELSE 0 END)/ SUM(CASE WHEN i.status='claimed' THEN 1 ELSE 0 END) as 'average_discrepancy',
ia_f.base_street as 'invites-from-address',
a_f.base_street as 'moves-from-address',
ia_t.base_street as 'invites-to-address',
a_t.base_street as 'moves-to-address',
i.mover_first_name,
i.mover_last_name,
i.to_address_id
FROM invites i
JOIN moves m ON i.id = m.claimed_invite_id
JOIN invite_addresses ia_f ON ia_f.id = i.from_address_id
JOIN addresses a_f ON a_f.id = m.from_address_id
JOIN invite_addresses ia_t ON ia_t.id = i.to_address_id
JOIN addresses a_t ON a_t.id = m.to_address_id
WHERE i.company_id = 1040345
AND i.status = 'claimed'
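The SUM(CASE ...) idea is easier to see on a single flattened table. The following is a sketch only: it uses Python's sqlite3 module, and the claimed table with its two street columns and four invented rows stands in for the joined invites/moves/addresses result. One dialect difference is flagged in the comments: SQLite divides integers as integers, whereas MySQL's / operator already returns a decimal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE claimed (id INTEGER, invite_street TEXT, move_street TEXT);
    INSERT INTO claimed VALUES
        (1, '1 Main St', '1 Main St'),
        (2, '2 Oak Ave', '9 Elm Rd'),   -- differs
        (3, '3 Pine Ln', '3 Pine Ln'),
        (4, '4 Bay Ct',  '5 Bay Ct');   -- differs
""")

mismatches, total, proportion = conn.execute("""
    SELECT SUM(CASE WHEN invite_street != move_street THEN 1 ELSE 0 END),
           COUNT(*),
           -- 1.0 forces floating-point division in SQLite;
           -- MySQL's / would return a decimal even with integer operands.
           SUM(CASE WHEN invite_street != move_street THEN 1.0 ELSE 0 END)
               / COUNT(*)
    FROM claimed
""").fetchone()

print(mismatches, total, proportion)
```

Since the aggregate collapses all rows into one, the proportion is best selected on its own (or per company with GROUP BY) rather than next to per-row columns such as the individual street names.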

How to get last item in mysql using max()

I want to get the last item as the result using max(), but I'm just getting the first item even though I'm using max().
Here's the SQL code:
SELECT r.correct, r.items, r.percentage,MAX(r.date_taken) as date_taken,
u.username,u.FN, u.user_course_type,
IFNULL(u.user_major_type,'N/A') as user_major_type,u.level_name,
u.section_name
FROM bcc_fs_exam_result r
INNER JOIN
(SELECT u.id_user, u.username, CONCAT(u.lastname,', ',u.firstname) as FN,
c.user_course_type, m.user_major_type, l.level_name, s.section_name
FROM bcc_fs_user u
LEFT JOIN bcc_fs_user_course c on c.id_user_course = u.id_user_course
LEFT JOIN bcc_fs_user_major m on m.id_user_major = u.id_user_major
LEFT JOIN bcc_fs_group_level l ON l.id_level = u.id_level
LEFT JOIN bcc_fs_group_section s ON s.id_section = u.id_section
) u ON r.id_user = u.id_user WHERE r.id_exam = 5 GROUP BY r.id_user
TIA
What I take from your question is that you want to get the "correct", "items", "percentage" etc. columns of the row in bcc_fs_exam_result which has the last or first date.
If that's correct, then you can filter bcc_fs_exam_result by first finding the min or max date for each id_user and then joining that back to the exam results table.
SELECT
r.correct,
r.items,
r.percentage,
r.date_taken,
u.username,
u.FN,
u.user_course_type,
IFNULL(u.user_major_type,'N/A') as user_major_type,
u.level_name,
u.section_name
FROM bcc_fs_exam_result r
INNER JOIN (
SELECT u.id_user,
u.username,
CONCAT(u.lastname,', ',u.firstname) as FN,
c.user_course_type,
m.user_major_type,
l.level_name,
s.section_name
FROM bcc_fs_user u
LEFT JOIN bcc_fs_user_course c on c.id_user_course = u.id_user_course
LEFT JOIN bcc_fs_user_major m on m.id_user_major = u.id_user_major
LEFT JOIN bcc_fs_group_level l ON l.id_level = u.id_level
LEFT JOIN bcc_fs_group_section s ON s.id_section = u.id_section
) u ON r.id_user = u.id_user
INNER JOIN (
SELECT
id_user, max(date_taken) as last_date_taken
FROM bcc_fs_exam_result
GROUP BY id_user
) as lastdate ON lastdate.id_user = r.id_user and r.date_taken = lastdate.last_date_taken
This could be written more simply as:
SELECT
r.correct,
r.items,
r.percentage,
r.date_taken,
u.username,
CONCAT(u.lastname,', ',u.firstname) as FN,
c.user_course_type,
IFNULL(m.user_major_type,'N/A') as user_major_type,
l.level_name,
s.section_name
FROM bcc_fs_exam_result r
INNER JOIN (
SELECT
id_user, max(date_taken) as last_date_taken
FROM bcc_fs_exam_result
GROUP BY id_user
) as lastdate ON lastdate.id_user = r.id_user and r.date_taken = lastdate.last_date_taken
INNER JOIN bcc_fs_user u on r.id_user = u.id_user
LEFT JOIN bcc_fs_user_course c on c.id_user_course = u.id_user_course
LEFT JOIN bcc_fs_user_major m on m.id_user_major = u.id_user_major
LEFT JOIN bcc_fs_group_level l ON l.id_level = u.id_level
LEFT JOIN bcc_fs_group_section s ON s.id_section = u.id_section
You have assumed that id_user + date_taken is a unique key of bcc_fs_exam_result, which you should probably enforce with a constraint if at all possible. Otherwise, it looks from your sample data as if the unique id column id_result increases with date_taken, so you might be better off using Max(id_result) rather than Max(date_taken). That would avoid returning duplicate rows for one id_user when two rows of bcc_fs_exam_result have the same date_taken and id_user.
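The difference between joining on the maximum date and joining on the maximum id shows up as soon as one user has two results with the same date_taken. Here is a minimal sketch with Python's sqlite3 module; exam_result is a hypothetical, trimmed version of bcc_fs_exam_result and the rows are invented so that user 7 has such a tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE exam_result (id_result INTEGER PRIMARY KEY,
                              id_user INTEGER,
                              date_taken TEXT,
                              percentage INTEGER);
    INSERT INTO exam_result VALUES
        (1, 7, '2014-01-01', 60),
        (2, 7, '2014-02-01', 70),
        (3, 7, '2014-02-01', 80),   -- same date as id_result 2
        (4, 8, '2014-01-15', 90);
""")

# Join on the latest date per user: both tied rows come back for user 7.
by_date = conn.execute("""
    SELECT r.id_user, r.percentage
    FROM exam_result r
    JOIN (SELECT id_user, MAX(date_taken) AS last_date_taken
          FROM exam_result
          GROUP BY id_user) latest
      ON latest.id_user = r.id_user
     AND latest.last_date_taken = r.date_taken
    ORDER BY r.id_result
""").fetchall()

# Join on the highest id per user: exactly one row per user.
by_id = conn.execute("""
    SELECT r.id_user, r.percentage
    FROM exam_result r
    JOIN (SELECT id_user, MAX(id_result) AS last_id
          FROM exam_result
          GROUP BY id_user) latest
      ON latest.last_id = r.id_result
    ORDER BY r.id_result
""").fetchall()

print(by_date)
print(by_id)
```

The id-based variant is only correct if id_result really does increase with date_taken, which is an assumption about how the rows are inserted.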

selecting row with highest value and a join

How can I fetch only the rows with the highest cvID value?
current code
SELECT
CollectionVersionBlocks.cID,
CollectionVersionBlocks.cbDisplayOrder,
CollectionVersionBlocks.cvID,
btContentLocal.bID,
btContentLocal.content
FROM
CollectionVersionBlocks
INNER JOIN btContentLocal
ON CollectionVersionBlocks.bID = btContentLocal.bID
WHERE (CollectionVersionBlocks.cID = 259)
AND CollectionVersionBlocks.isOriginal = 1
AND CollectionVersionBlocks.arHandle = 'main'
AND btContentLocal.content != ''
I want to get the row at the bottom (where the cvID value is 10).
This is a test statement for a bigger result set -
I will eventually need a set of results for preset cIDs (CollectionVersionBlocks.cID = 259 OR CollectionVersionBlocks.cID = 260... up to 800)
updated screenshots
1) too few results
2) un grouped results
To get the highest row per group (from your question I assume each cID is a group), join the table to a subquery that computes the maximum of the desired column per group, and match on both columns in that join, i.e. ON(c.cID=cc.cID AND c.cvID=cc.cvID):
SELECT
c.cID,
c.cbDisplayOrder,
c.cvID,
b.bID,
b.content
FROM
CollectionVersionBlocks c
INNER JOIN btContentLocal b
ON (c.bID = b.bID)
INNER JOIN
(SELECT cID, MAX(cvID) cvID FROM CollectionVersionBlocks GROUP BY cID) cc
ON(c.cID=cc.cID AND c.cvID=cc.cvID)
WHERE (c.cID = 259)
AND c.isOriginal = 1
AND c.arHandle = 'main'
AND b.content != ''
and for multiple groups you can just use WHERE c.cID IN(259,....800)
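One thing worth checking with this pattern is where the other filters are applied. If the row with the highest cvID for a cID fails a filter such as isOriginal = 1, the join above returns nothing for that cID, which may or may not be what is wanted. The sketch below (Python's sqlite3, a hypothetical single blocks table, invented rows) contrasts computing the maximum over all rows with computing it over the filtered rows only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blocks (cID INTEGER, cvID INTEGER,
                         isOriginal INTEGER, content TEXT);
    INSERT INTO blocks VALUES
        (259,  9, 1, 'old'),
        (259, 10, 1, 'new'),
        (260,  4, 1, 'kept'),
        (260,  5, 0, 'copy');   -- highest cvID for 260, but filtered out
""")

# Maximum taken over all rows: cID 260 disappears from the result.
max_over_all = conn.execute("""
    SELECT b.cID, b.cvID, b.content
    FROM blocks b
    JOIN (SELECT cID, MAX(cvID) AS cvID
          FROM blocks
          GROUP BY cID) m
      ON m.cID = b.cID AND m.cvID = b.cvID
    WHERE b.cID IN (259, 260) AND b.isOriginal = 1
    ORDER BY b.cID
""").fetchall()

# Maximum taken over the filtered rows: one row per cID.
max_over_filtered = conn.execute("""
    SELECT b.cID, b.cvID, b.content
    FROM blocks b
    JOIN (SELECT cID, MAX(cvID) AS cvID
          FROM blocks
          WHERE isOriginal = 1
          GROUP BY cID) m
      ON m.cID = b.cID AND m.cvID = b.cvID
    WHERE b.cID IN (259, 260) AND b.isOriginal = 1
    ORDER BY b.cID
""").fetchall()

print(max_over_all)
print(max_over_filtered)
```

Which variant is right depends on whether "highest cvID" means the latest version overall or the latest version that passes the filters.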
Try the query below:
SELECT CollectionVersionBlocks.cID,CollectionVersionBlocks.cbDisplayOrder, CollectionVersionBlocks.cvID , btContentLocal.bID , btContentLocal.content
FROM CollectionVersionBlocks
INNER JOIN btContentLocal
ON CollectionVersionBlocks.bID=btContentLocal.bID
WHERE (CollectionVersionBlocks.cID = 259)
AND CollectionVersionBlocks.isOriginal=1 AND CollectionVersionBlocks.arHandle ='main' AND btContentLocal.content !='' and CollectionVersionBlocks.cvID in
(
SELECT Max(CollectionVersionBlocks.cvID)
FROM CollectionVersionBlocks
INNER JOIN btContentLocal
ON CollectionVersionBlocks.bID=btContentLocal.bID
WHERE (CollectionVersionBlocks.cID = 259)
AND CollectionVersionBlocks.isOriginal=1 AND CollectionVersionBlocks.arHandle ='main' AND btContentLocal.content !='' )