Join nested relationship as columns - mysql

I need some help with my SQL. I am attempting to get the following rows...
ID
Sender
Code 1
Code 1 Score
Code 2
Code 2 Score
1
John Doe
AB
80
BA
87
2
Jane Doe
CD
45
DC
99
The code columns are the hard part. This is the relationships
Conversation -> Messages -> Highlights -> Codes.
Messages have a conversation_id
Messages Highlights have a message_id
Codes have a highlight_id
One conversation has many messages, Each Message has many Highlights, and any highlights have many codes.
SELECT
conversations.id,
CONCAT(senders.first_name, ' ', senders.last_name) as 'Sender Name'
conversations.created_at AS 'Start Date',
FROM
conversations
LEFT JOIN users AS senders ON sender.id = conversations.participant_1
LEFT JOIN users AS recipient ON recipient.id = conversations.participant_2
GROUP BY
conversations.name
I cannot figure out how to include the codes into the results as columns that a paired (Code 1 Name, Code Score).
The above query works to get the first few columns, but I'd like to join in the codes. Any help would be appreciated.

I have an idea to make use of row_number to create multiple columns
SELECT m.highlight_id,
MAX( CASE WHEN RN = 1 THEN m.code_name ELSE '' END ),
MAX( CASE WHEN RN = 1 THEN m.code_score ELSE '' END ),
MAX( CASE WHEN RN = 2 THEN m.code_name ELSE '' END ),
MAX( CASE WHEN RN = 2 THEN m.code_score ELSE '' END ),
MAX( CASE WHEN RN = 3 THEN m.code_name ELSE '' END ),
MAX( CASE WHEN RN = 3 THEN m.code_score ELSE '' END ),
MAX( CASE WHEN RN = 4 THEN m.code_name ELSE '' END ),
MAX( CASE WHEN RN = 4 THEN m.code_score ELSE '' END ),
MAX( CASE WHEN RN = 5 THEN m.code_name ELSE '' END ),
MAX( CASE WHEN RN = 5 THEN m.code_score ELSE '' END )
FROM
(
SELECT
b.highlight_id, b.code_name, b.code_score, (CASE WHEN #T = b.highlight_id THEN #RN := #RN + 1 ELSE #RN := 1 END ) AS RN, #T := b.highlight_id
FROM highlights a
JOIN codes b ON a.highlight_id = b.highlight_id
JOIN (SELECT #T := '', #RN := 0) u
ORDER BY a.highlight_id, b.code_name
) m
GROUP BY m.highlight_id
;
If you are using MySQL 8.0, it's simpler
SELECT m.highlight_id,
MAX( CASE WHEN RN = 1 THEN m.code_name ELSE '' END ),
MAX( CASE WHEN RN = 1 THEN m.code_score ELSE '' END ),
MAX( CASE WHEN RN = 2 THEN m.code_name ELSE '' END ),
MAX( CASE WHEN RN = 2 THEN m.code_score ELSE '' END ),
MAX( CASE WHEN RN = 3 THEN m.code_name ELSE '' END ),
MAX( CASE WHEN RN = 3 THEN m.code_score ELSE '' END ),
MAX( CASE WHEN RN = 4 THEN m.code_name ELSE '' END ),
MAX( CASE WHEN RN = 4 THEN m.code_score ELSE '' END ),
MAX( CASE WHEN RN = 5 THEN m.code_name ELSE '' END ),
MAX( CASE WHEN RN = 5 THEN m.code_score ELSE '' END )
FROM (
SELECT b.highlight_id, b.code_name, b.code_score, ROW_NUMBER() OVER(PARTITION BY b.highlight_id ORDER BY b.code_name) RN
FROM highlights a
JOIN codes b ON a.highlight_id = b.highlight_id
) m
GROUP BY m.highlight_id
;

Related

What kind of query optimization can be done on this query?

I'm listing out orders data based on this query. This query basically pulls the recurring orders data from the table. I'm also using some dropdown and a input field to search / filter query results.
SELECT
orders.id,
parent_id,
(
SELECT
COUNT(*)
FROM
orders o
WHERE
o.parent_id = orders.id
) AS recurring_order_count,
shopify_order_type,
shopify_order_id,
shopify_order_customer_ID,
coupon_code AS coupon,
FORMAT(shopify_order_total_price, 2) AS shopify_order_total_price,
FORMAT(
shopify_order_subtotal_price,
2
) AS shopify_order_subtotal_price,
FORMAT(
shopify_order_total_line_items_price,
2
) AS shopify_order_total_line_items_price,
FORMAT(commission_amount, 2) AS commission_amount,
(
CASE WHEN is_paid = 0 THEN 'No' WHEN is_paid = 1 THEN 'Yes' ELSE 'Rejected'
END
) AS is_paid,
(
CASE WHEN is_invoice_generated = 1 THEN 'Pending' ELSE 'Invoice Generated'
END
) AS is_invoice_generated,
DATE_FORMAT(
shopify_order_created_at,
'%m-%d-%Y'
) AS shopify_order_created_at,
(
CASE WHEN is_paused = 0 THEN 'Running' ELSE 'Paused'
END
) AS is_paused,
DATE_FORMAT(
shopify_recurring_date,
'%m-%d-%Y'
) AS shopify_recurring_date
FROM
`orders`
WHERE
coupon_code LIKE '%GERALD8314%' OR shopify_order_id LIKE '%GERALD8314%' OR(
CASE WHEN is_paid = 0 THEN 'No' WHEN is_paid = 1 THEN 'Yes' ELSE 'Rejected'
END
) LIKE '%GERALD8314%' OR(
CASE WHEN is_invoice_generated = 1 THEN 'Pending' ELSE 'Invoice Generated'
END
) LIKE '%GERALD8314%' OR DATE_FORMAT(
shopify_order_created_at,
'%m-%d-%Y'
) LIKE '%GERALD8314%' OR(
CASE WHEN is_paused = 0 THEN 'Running' ELSE 'Paused'
END
) LIKE '%GERALD8314%' OR DATE_FORMAT(
shopify_recurring_date,
'%m-%d-%Y'
) LIKE '%GERALD8314%' AND DATE_FORMAT(
shopify_order_created_at,
'%Y-%m-%d'
) BETWEEN ? AND ?
GROUP BY
`id`
HAVING
parent_id = 0 AND shopify_order_type = 1
ORDER BY
`id`
DESC
LIMIT 10 OFFSET 0
Is this query optimized? Is this SELECT COUNT(*) FROM orders WHERE o.parent_id = orders.id AS recurring_order_count line most expensive in terms of query execution speed? Is there anything I should take care to improve the query speed here? Please advise.
OR and LIKE with leading wildcard are terrible for performance. Consider having a FULLTEXT index across the relevant columns. If it is practical, it will be immensely faster.
WHERE coupon_code LIKE '%GERALD8314%'
OR shopify_order_id LIKE '%GERALD8314%' OR( CASE WHEN is_paid = 0 THEN 'No' WHEN is_paid = 1 THEN 'Yes' ELSE 'Rejected' END ) LIKE '%GERALD8314%' OR( CASE WHEN is_invoice_generated = 1 THEN 'Pending' ELSE 'Invoice Generated' END ) LIKE '%GERALD8314%'
OR DATE_FORMAT( shopify_order_created_at, '%m-%d-%Y' ) LIKE '%GERALD8314%' OR( CASE WHEN is_paused = 0 THEN 'Running' ELSE 'Paused' END ) LIKE '%GERALD8314%'
OR DATE_FORMAT( shopify_recurring_date, '%m-%d-%Y' ) LIKE '%GERALD8314%'
AND DATE_FORMAT( shopify_order_created_at, '%Y-%m-%d' ) BETWEEN ? AND ?
Also, there may be surprises in the results. Note that you have
a OR b OR c AND d
which is the same as
a OR b OR (c AND d)
I suspect you wanted
(a OR b OR c) AND d
I think the GROUP BY is unnecessary. And the HAVING clauses can be merged into the WHERE.
GROUP BY `id`
HAVING parent_id = 0
AND shopify_order_type = 1
ORDER BY `id` DESC
This has multiple issues:
DATE_FORMAT( shopify_order_created_at, '%Y-%m-%d' ) BETWEEN ? AND ?
BETWEEN is "inclusive". The way you have written the query, it will include the entire ending day. This may not be what you wanted.
Assuming the variable is a DATE or DATETIME, it can be simplified to
shopify_order_created_at BETWEEN ? AND ?
There may be more tips; fix these then come back for more.

Explain me ways to understand this complex query

Can anyone help me to understand this big sql query. How do I break it down in small chunks to understand it ?
select t.Instrument as Instrument ,ClearingId as ClearingId,
ISNULL( PrevQty ,0) AS PrevQty,SettlePrice,
ISNULL(TodayBuyQty,0) as TodayBuyQty,
ISNULL( TodaySellQty ,0) AS TodaySellQty,
ISNULL(PrevQty +TodayBuyQty-TodaySellQty,0) as NetQty,
TodayBuyPrice, TodaySellPrice,LTP,PnL,Token
from
(
select Instrument,w.ClearingId as ClearingId,
ISNULL( PrevQty ,0) AS PrevQty,ISNULL(TodayBuyQty,0) as TodayBuyQty,
ISNULL( TodaySellQty ,0) AS TodaySellQty,
TodayAvgBuyPrice as TodayBuyPrice,TodayAvgSellPrice as TodaySellPrice,
LTP,PnL,w.Token
from
(
select Symbol as Instrument, ClearingId,
ISNULL(TodayBuyQty,0) as TodayBuyQty,
TodayAvgBuyPrice,
ISNULL( -TodaySellQty ,0) AS TodaySellQty,
TodayAvgSellPrice, NULL as LTP ,NULL as PnL ,
w.Token as Token
from
(
select Token, sum(Qty) as NetPositionQty, ClearingId,
sum(CASE WHEN Qty < 0 THEN Qty ELSE 0 END) as TodaySellQty,
sum(CASE WHEN Qty > 0 THEN Qty ELSE 0 END) as TodayBuyQty,
sum(CASE WHEN Qty < 0 THEN Qty * Price ELSE 0 END)
/
NULLIF(sum(CASE WHEN Qty < 0 THEN Qty ELSE 0 END), 0) as TodayAvgSellPrice,
sum(CASE WHEN Qty > 0 THEN Qty * Price ELSE 0 END)
/
NULLIF(sum(CASE WHEN Qty > 0 THEN Qty ELSE 0 END), 0) as TodayAvgBuyPrice
from
(
select m.Token,
(CASE WHEN ClearingId = 'SATP' THEN 'STRAITS' ELSE CASE WHEN ClearingId = 'BATP' THEN 'BPI' ELSE 'UOB' END END ) as ClearingId ,
Price/CAST(Multiplier AS float ) as Price,Qty
from
(
select Token , Exchange as ClearingId ,
LastTradePrice as Price ,
CASE WHEN Side = 'S' THEN -LastTradeQuantity ELSE LastTradeQuantity END as Qty
from Strategy_Orders
where ExchangeStatus in (9,10) )m
JOIN TokenMap t ON ( m.Token = t.Token)
UNION ALL
select m.Token, (CASE WHEN ClearingId = 'SATP' THEN 'STRAITS' ELSE CASE WHEN ClearingId = 'BATP' THEN 'BPI' ELSE 'UOB' END END ) as ClearingId ,
Price/CAST(Multiplier AS float ) as Price,
Qty
from
(
select Token , Exchange as ClearingId ,
LastTradePrice as Price ,
CASE WHEN Side = 'S' THEN -LastTradeQuantity ELSE LastTradeQuantity END as Qty
from Manual_Orders
where ExchangeStatus in (9,10) )m
JOIN TokenMap t ON ( m.Token = t.Token)
UNION ALL
select Token , ClearingId , TodayBuyPrice ,
TodayBuyQty as Qty
from EOD_Holdings
where CurrentDate =
( select top 1 CurrentDAte from EOD_Holdings
order by CurrentDAte desc
)
UNION ALL
select Token , ClearingId , TodaySellPrice ,
TodaySellQty as Qty
from EOD_Holdings
where CurrentDate = (
select top 1 CurrentDAte from EOD_Holdings
order by CurrentDAte desc
)
) x
group by Token,ClearingId) w
JOIN(select Token, Symbol from TokenMAp ) h on w.Token = h.Token
) w
FULL OUTER JOIN(
select Token, PrevQty , ClearingId
from EOD_Holdings
where CurrentDate = ( select top 1 CurrentDAte from EOD_Holdings order by CurrentDAte desc
)
) h
on w.Token = h.Token and w.ClearingId = h.ClearingId
)t
JOIN (
select * from LatestSettlePrices
) sp
on t.Instrument = sp.Instrument
You can break the query into chunks by looking at each subquery ("select ..." ) separately and running them to see the results. You need to start with the innermost queries that do not have other select statements in the "from" or "where" clause.
Also note that this query does not seem to be a clear, neither an optimal solution.
You would want to avoid using full outer joins and union all statements for the best performance.

EMBEDDED DATEDIFF EXCLUD WEEKENDS + CASE STATEMENT

I have the below query that utilizes a case statement. I would like to datediff two dates but exclude weekend days.
I have the below that excutes but now I would like to exclude Sat and Sunday from this... AND DATEDIFF(DD,A.ALERTS_CREATE_DT,S.CreatedDate) <= 2
CASE WHEN
S.Name IN ('Assessment','Survey')
AND A.ALERT_DESC = 'ER'
AND CAST(A.ALERTS_CREATE_DT AS DATE) <= CAST(S.CreatedDate AS DATE)
AND DATEDIFF(DD,A.ALERTS_CREATE_DT,S.CreatedDate) <= 2 /*EXCLUDE Sat and Sunday from the calculation*/
Full Query
SELECT
CASE WHEN
S.Name IN ('Assessment','Survey')
AND A.ALERT_DESC = 'ER'
AND CAST(A.ALERTS_CREATE_DT AS DATE) <= CAST(S.CreatedDate AS DATE)
AND
( DATEDIFF(DD,A.ALERTS_CREATE_DT,S.CreatedDate) <= 2 /*Business Days*/
--DATEDIFF(DD,A.ALERTS_CREATE_DT,S.CreatedDate) + 1
---(DATEDIFF(WK,A.ALERTS_CREATE_DT,S.CreatedDate) * 2)
---(CASE WHEN DATENAME(DW,A.ALERTS_CREATE_DT) = 'SUNDAY' THEN 1 ELSE 0 END)
---(CASE WHEN DATENAME(DW,S.CreatedDate) = 'SATURADAY' THEN 1 ELSE 0 END)
)
THEN 'Y'
WHEN A.ALERT_DESC = 'model' OR S.CreatedDate IS NULL OR S.Name = 'ER'
THEN ''
ELSE 'N'
END 'Count towards Alerts'
FROM A
FULL S ON A.id= S.id
WHERE 1=1
This should give you the required result by excluding the Saturdays and Sundays.
SELECT A.ALERT_DESC,A.ALERTS_CREATE_DT,S.Name,S.CreatedDate,
CASE WHEN
S.Name IN ('Assessment','Survey') AND A.ALERT_DESC = 'ER'
AND CAST(A.ALERTS_CREATE_DT AS DATE) <= CAST(S.CreatedDate AS DATE)
AND
(( DATEDIFF(DD,A.ALERTS_CREATE_DT,S.CreatedDate)+1
- (datediff(wk,A.ALERTS_CREATE_DT,S.CreatedDate)*2)
- case when datepart(dw,A.ALERTS_CREATE_DT)=1 then 1 else 0 end
- case when datepart(dw,S.CreatedDate)=7 then 1 else 0 end
)) <=2 THEN 'Y'
WHEN A.ALERT_DESC = 'model' OR S.CreatedDate IS NULL OR S.Name = 'ER' THEN ''
ELSE 'N' END 'Count towards Alerts'
FROM A,S

Mysql query optimization takes more than 30 sec

Hi friends can any one please help me to optimize this query ,it takes more than 30 sec.
any suggestion is warmly welcome.
Query :
SELECT
CONCAT(CCD.CONTACT_FIRST_NAME, ' ', CCD.CONTACT_LAST_NAME) AS NAME,
A.EMAIL_IDS,
CCD.UNSUBSCRIBE,
IFNULL(CSL.LOG_DATE, '') AS LOG_DATE,
CSL.IP_ADDRESS,
IF(CSL.BROWSER IS NULL, '', CSL.BROWSER) AS BROWSER,
(CASE
WHEN (UNSUBSCRIBE = 0 OR UNSUBSCRIBE IS NULL) THEN 'Opted In'
ELSE (CASE
WHEN (UNSUBSCRIBE = 1) THEN 'Opted Out'
ELSE (CASE
WHEN
((UNSUBSCRIBE = 2 OR UNSUBSCRIBE = 3)
AND CCD.CONTACT_ID NOT IN (SELECT
CONTACT_ID
FROM
CM_SUBSCRIPTION_MAIL_DATA
WHERE
IS_MAIL_SENT = 'Y'))
THEN
'Opt-In Request not Sent'
ELSE 'Opt-In Pending'
END)
END)
END) AS CURR_SUB,
(CASE
WHEN (SUBSCRIBE_FROM = 0) THEN 'Opted In'
ELSE (CASE
WHEN (SUBSCRIBE_FROM = 1) THEN 'Opted Out'
ELSE (CASE
WHEN (SUBSCRIBE_FROM = 2 OR SUBSCRIBE_FROM = 3) THEN 'Opt-In Pending'
ELSE ''
END)
END)
END) AS SUB_FROM,
SUBSCRIBE_FROM,
(CASE
WHEN (SUBSCRIBE_TO = 0) THEN 'Opted In'
ELSE (CASE
WHEN (SUBSCRIBE_TO = 1) THEN 'Opted Out'
ELSE (CASE
WHEN (SUBSCRIBE_TO = 2 OR SUBSCRIBE_TO = 3) THEN 'Opt-In Pending'
ELSE ''
END)
END)
END) AS SUB_TO,
SUBSCRIBE_TO
FROM
CM_CONTACT_DETAILS CCD
LEFT JOIN
ADDRESS A ON CCD.CONTACT_ID = A.FOREIGN_ID
LEFT JOIN
CM_SUBSCRIPTION_LOGS CSL ON CSL.CONTACT_ID = CCD.CONTACT_ID
WHERE
1 = 1
GROUP BY CSL.LOG_DATE , CCD.CONTACT_ID
ORDER BY NAME DESC , LOG_DATE DESC
LIMIT 0 , 20
Your case/when can be simplified, but I believe the killer of your query was the correlated NOT IN SELECT clause for your Opt-in request not sent.
I've change your query to do a left-join to your mailing table based on the contact AND the mailed status of "Y". So this simplifies your case to only have to check for IS NULL of the mailing table contact ID.
You can't do too much on performance since your Group by is by columns from different tables that won't take advantage of any index optimization, then all that ordered by the name makes it more of an issue... but the correlated subquery per the field being executed every time to a left-join SHOULD help.
SELECT
CONCAT(CCD.CONTACT_FIRST_NAME, ' ', CCD.CONTACT_LAST_NAME) AS NAME,
A.EMAIL_IDS,
CCD.UNSUBSCRIBE,
IFNULL(CSL.LOG_DATE, '') AS LOG_DATE,
CSL.IP_ADDRESS,
IF(CSL.BROWSER IS NULL, '', CSL.BROWSER) AS BROWSER,
CASE WHEN UNSUBSCRIBE = 0 OR UNSUBSCRIBE IS NULL THEN 'Opted In'
WHEN UNSUBSCRIBE = 1 THEN 'Opted Out'
WHEN UNSUBSCRIBE IN (2,3) AND SMD.CONTACT_ID IS NULL THEN 'Opt-In Request not Sent'
ELSE 'Opt-In Pending' END AS CURR_SUB,
CASE WHEN SUBSCRIBE_FROM = 0 THEN 'Opted In'
WHEN SUBSCRIBE_FROM = 1 THEN 'Opted Out'
WHEN SUBSCRIBE_FROM = 2 OR SUBSCRIBE_FROM = 3 THEN 'Opt-In Pending'
ELSE '' END AS SUB_FROM,
SUBSCRIBE_FROM,
CASE WHEN SUBSCRIBE_TO = 0 THEN 'Opted In'
WHEN SUBSCRIBE_TO = 1 THEN 'Opted Out'
WHEN SUBSCRIBE_TO = 2 OR SUBSCRIBE_TO = 3 THEN 'Opt-In Pending'
ELSE '' END AS SUB_TO,
SUBSCRIBE_TO
FROM
CM_CONTACT_DETAILS CCD
LEFT JOIN ADDRESS A
ON CCD.CONTACT_ID = A.FOREIGN_ID
LEFT JOIN CM_SUBSCRIPTION_LOGS CSL
ON CSL.CONTACT_ID = CCD.CONTACT_ID
LEFT JOIN CM_SUBSCRIPTION_MAIL_DATA SMD
ON CCD.CONTACT_ID = SMD.CONTACT_ID
AND IS_MAIL_SENT = 'Y'
WHERE
1 = 1
GROUP BY
CSL.LOG_DATE,
CCD.CONTACT_ID
ORDER BY
NAME DESC,
LOG_DATE DESC
LIMIT
0, 20

selecting total comments (pos & neg) + latest comment for each

I am writing a query to get the top 10 rated businesses, the number of positive comments for each business, the number of negative comments for each business and the latest comment for each of these businesses.
SELECT comment.bis_id, Sum( Case When comment.rating <= 2 Then 1 Else 0 End ) As NegVotes
, Sum( Case When comment.rating >= 4 Then 1 Else 0 End ) As PosVotes, bis.bis_name
FROM bis, comment
WHERE comment.bis_id = bis.bis_id
GROUP BY bis_id
ORDER BY PosVotes DESC
LIMIT 0, 10";
The above gets positive comments and negative comments, but I can't seem to work out how to get the latest comment as well.
SELECT
c.bis_id
, Sum( Case When c.rating <= 2 Then 1 Else 0 End ) As NegVotes
, Sum( Case When c.rating >= 4 Then 1 Else 0 End ) As PosVotes
, b.bis_name
, cc.last_comment
FROM bis b
INNER JOIN comment c on (c.bis_id = b.bis_id)
INNER JOIN (SELECT c2.bis_id, c2.comment_text as last_comment
FROM comment c2
GROUP BY c2.bis_id
HAVING c2.comment_date = MAX(c2.comment_date) ) cc
ON (cc.bis_id = b.bis_id)
GROUP BY b.bis_id
ORDER BY PosVotes DESC
LIMIT 10 OFFSET 0