Improving the performance of sql joined count query

In my application, users can create campaigns for sending messages. When a campaign tries to send a message, one of three things can happen:
The message is suppressed and not let through
The message can't reach the recipient and is considered failed
The message is successfully delivered
To keep track of this, I have the following table:
My problem is that when the application has processed a lot of messages (more than 10 million), the query I use for showing campaign statistics for the user slows down by a considerable margin (~ 15 seconds), even when there are only a few (~ 10) campaigns being displayed for the user.
Here is the query I'm using:
select `campaigns`.*,
    (select count(*) from `processed_messages`
     where `campaigns`.`id` = `processed_messages`.`campaign_id` and `status` = 'sent') as `messages_sent`,
    (select count(*) from `processed_messages`
     where `campaigns`.`id` = `processed_messages`.`campaign_id` and `status` = 'failed') as `messages_failed`,
    (select count(*) from `processed_messages`
     where `campaigns`.`id` = `processed_messages`.`campaign_id` and `status` = 'supressed') as `messages_supressed`
from `campaigns`
where `user_id` = 1 and `campaigns`.`deleted_at` is null
order by `updated_at` desc;
So my question is: how can I make this query run faster? I believe there should be some way of not having to use sub-queries multiple times but I am not very experienced with MySQL syntax yet.

You should write this as a single join, using conditional aggregation:
SELECT
c.*,
COUNT(CASE WHEN pm.status = 'sent' THEN 1 END) AS messages_sent,
COUNT(CASE WHEN pm.status = 'failed' THEN 1 END) AS messages_failed,
COUNT(CASE WHEN pm.status = 'suppressed' THEN 1 END) AS messages_suppressed
FROM campaigns c
LEFT JOIN processed_messages pm
ON c.id = pm.campaign_id
WHERE
c.user_id = 1 AND
c.deleted_at IS NULL
GROUP BY
c.id
ORDER BY
c.updated_at DESC;
It should be noted that at first glance, doing SELECT c.* appears to be a violation of the GROUP BY rules which say that only columns which appear in the GROUP BY clause can be selected. However, assuming that campaigns.id is the primary key column, then there is nothing wrong with selecting all columns from this table, provided that we aggregate by the primary key.
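As a rough illustration of conditional aggregation, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. The table contents and the reduced column set are assumptions made up for the demo, and SQLite stands in for MySQL, so this shows only the counting logic, not MySQL performance.

```python
import sqlite3

# Hypothetical miniature schema; only the columns the query needs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE campaigns (id INTEGER PRIMARY KEY, user_id INTEGER, name TEXT);
CREATE TABLE processed_messages (
    id INTEGER PRIMARY KEY, campaign_id INTEGER, status TEXT);
INSERT INTO campaigns VALUES (1, 1, 'spring'), (2, 1, 'autumn');
INSERT INTO processed_messages (campaign_id, status) VALUES
    (1, 'sent'), (1, 'sent'), (1, 'failed'), (1, 'suppressed');
""")

# One pass over the joined rows; each COUNT only counts rows where its
# CASE expression is non-NULL, i.e. rows with the matching status.
rows = conn.execute("""
SELECT c.id,
       COUNT(CASE WHEN pm.status = 'sent' THEN 1 END) AS messages_sent,
       COUNT(CASE WHEN pm.status = 'failed' THEN 1 END) AS messages_failed,
       COUNT(CASE WHEN pm.status = 'suppressed' THEN 1 END) AS messages_suppressed
FROM campaigns c
LEFT JOIN processed_messages pm ON c.id = pm.campaign_id
WHERE c.user_id = 1
GROUP BY c.id
ORDER BY c.id
""").fetchall()

print(rows)
```

Campaign 2 has no messages, yet the LEFT JOIN keeps it in the result with all three counts at zero, which matches what the correlated subqueries in the question would return.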
Edit:
If the query above does not run on your MySQL server version and fails with an error about ONLY_FULL_GROUP_BY, then use this version:
SELECT c1.*, c2.messages_sent, c2.messages_failed, c2.messages_suppressed
FROM campaigns c1
INNER JOIN
(
    SELECT
        c.id,
        COUNT(CASE WHEN pm.status = 'sent' THEN 1 END) AS messages_sent,
        COUNT(CASE WHEN pm.status = 'failed' THEN 1 END) AS messages_failed,
        COUNT(CASE WHEN pm.status = 'suppressed' THEN 1 END) AS messages_suppressed
    FROM campaigns c
    LEFT JOIN processed_messages pm
        ON c.id = pm.campaign_id
    WHERE
        c.user_id = 1 AND
        c.deleted_at IS NULL
    GROUP BY
        c.id
) c2
    ON c1.id = c2.id
ORDER BY
    c1.updated_at DESC;

Related

MySQL upgrade from left join to something similar as FULL join

Sorry for asking this here, but I need help and Google is not being nice.
I have the following table Products
SELECT
COUNT(CASE when core.kits.Location = core.suppliers.id THEN 1 END) as total,
COUNT(CASE when core.kits.cp = 1 THEN 1 END) as used,
core.suppliers.id, core.suppliers.name, core.suppliers.email,
core.suppliers.cperson, core.suppliers.adress, core.suppliers.phone
FROM core.kits
LEFT join core.suppliers on core.kits.Location = core.suppliers.id
WHERE core.suppliers.id is not null
AND banned=0
GROUP BY core.suppliers.id
ORDER BY name ASC
LIMIT 1000 OFFSET 0
but it does not give me all the suppliers, with zeros for the ones that do not appear in kits.
Then, if I do
SELECT
COUNT(CASE when core.kits.Location = core.suppliers.id THEN 1 END) as total,
COUNT(CASE when core.kits.cp = 1 THEN 1 END) as used,
core.suppliers.id, core.suppliers.name, core.suppliers.email,
core.suppliers.cperson, core.suppliers.adress, core.suppliers.phone
FROM core.suppliers
LEFT join core.kits on core.suppliers.id = core.kits.Location
WHERE core.suppliers.id is not null
AND banned=0
GROUP BY core.suppliers.id
ORDER BY name ASC
LIMIT 1000 OFFSET 0
I get all suppliers and the correct numbers, but the query takes 8 seconds instead of 1 second. Any ideas how I can get all the suppliers with their kit counts in 1 second?
cheers.

If you want all the suppliers, even those that do not appear in kits, you should do a LEFT join of suppliers to kits:
SELECT COUNT(k.Location) AS total,
COUNT(CASE WHEN k.cp = 1 THEN 1 END) AS used,
s.id, s.name, s.email, s.cperson, s.adress, s.phone
FROM core.suppliers s LEFT JOIN core.kits k
ON k.Location = s.id
WHERE banned=0
GROUP BY s.id
ORDER BY s.name ASC
LIMIT 1000 OFFSET 0;
I assume that core.suppliers.id is the primary key of suppliers, so that the condition:
core.suppliers.id is not null
is not needed.
Also, if the column banned is contained in the table kits, then the condition should be moved into the ON clause:
ON k.Location = s.id AND k.banned=0
and the WHERE clause should be removed.
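To see why the placement of the banned condition matters, here is a small sketch using an in-memory SQLite database from Python. The rows are invented for the demo, and it assumes, as the answer does hypothetically, that banned is a column of kits.

```python
import sqlite3

# Hypothetical data: Acme has two usable kits, Bolt has only a banned kit,
# Core has no kits at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE suppliers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE kits (
    id INTEGER PRIMARY KEY, Location INTEGER, cp INTEGER, banned INTEGER);
INSERT INTO suppliers VALUES (1, 'Acme'), (2, 'Bolt'), (3, 'Core');
INSERT INTO kits (Location, cp, banned) VALUES (1, 1, 0), (1, 0, 0), (2, 1, 1);
""")

query = """
SELECT s.name,
       COUNT(k.Location) AS total,
       COUNT(CASE WHEN k.cp = 1 THEN 1 END) AS used
FROM suppliers s
LEFT JOIN kits k ON k.Location = s.id {on_filter}
{where_filter}
GROUP BY s.id
ORDER BY s.name
"""

# Filter in ON: unmatched suppliers survive with NULL kit columns.
in_on = conn.execute(
    query.format(on_filter="AND k.banned = 0", where_filter="")).fetchall()

# Filter in WHERE: rows with NULL kit columns fail the test and are dropped,
# which turns the LEFT JOIN into an inner join in effect.
in_where = conn.execute(
    query.format(on_filter="", where_filter="WHERE k.banned = 0")).fetchall()

print(in_on)
print(in_where)
```

With the filter in the ON clause all three suppliers are listed, two of them with zero counts; with the filter in the WHERE clause only Acme remains.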

Select statement can trigger dead lock on table in mysql?

The SQL below is inside a MySQL stored procedure.
The procedure is run by a cron job once a day at midnight to populate the report table with the result.
The procedure takes around 2 minutes to run.
Please note that table1 has millions of records.
I scheduled it at midnight because there are INSERT/UPDATE transactions during the day, but unfortunately there are a few transactions at night as well.
When this procedure runs while other transactions are running, a deadlock error occurs on table1.
My questions are:
Why does a SELECT statement cause a deadlock on table1?
How can I avoid deadlocks in this kind of situation?
DROP TABLE IF EXISTS report;
CREATE TABLE IF NOT EXISTS report AS (
SELECT
DISTINCT
companies.id company_id,
(
SELECT
SUM(`message_count`) single_phone
FROM
`table1`
WHERE
`table1`.`company_id` = companies.id
AND
`status` != 'error'
) AS single_phone,
(
SELECT
SUM(`message_count`)
FROM
`table1`
WHERE
`table1`.`company_id` = companies.id
AND
`status` != 'not error'
) AS log,
(
SELECT
SUM(`message_count`)
FROM
`table1`
WHERE
`table1`.`company_id` = companies.id
AND
`status` != 'error'
) AS log_monthly,
(
SELECT
SUM(`number_of_sms`) AS aggregate
FROM
`messages`
WHERE
`messages`.`company_id` = companies.id
) AS p_monthly
FROM
companies
INNER JOIN company_users ON companies.id = company_users.company_id
WHERE
company_users.confirmed = 1
AND
company_users.deleted_at IS NULL
);

Thank you very much for the help, but I have found the problem. Yes, this procedure causes the deadlock on the table, but the actual cause of the issue is that I had put ->everyMinute() in my Laravel Kernel for the schedule run, and there is also a cron job, configured by another developer for the same purpose, that runs every minute. Together these ran the schedule every minute, and that is the real cause of the deadlock problem. I have changed my Kernel schedule to ->dailyAt('02:00'); and now the problem is solved.

Your field-level queries should be done ONCE in the FROM clause, so that the pre-aggregates are computed ONCE per company ID, and left-joined in case a given company does NOT have qualifying records in a given category. Additionally, your query for Single_Phone is the same as the one for log_monthly, and neither has any criteria that filter on dates of activity to separate a single month from the overall total of everything. So I added a WHERE clause for filtering, but I am only GUESSING that some such date column exists.
This query might substantially improve your performance. By moving the column-level queries for every company ID into their own subqueries via LEFT JOIN, the values are summed and grouped by company ONCE, and then joined for the final result. COALESCE() is used so that if no such counts exist, the value returned is 0 instead of NULL.
DROP TABLE IF EXISTS report;
CREATE TABLE IF NOT EXISTS report AS (
SELECT DISTINCT   -- DISTINCT kept from the original query: one row per company
    c.id company_id,
    coalesce( PhoneSum.Msgs, 0 ) as Single_Phone,
    coalesce( PhoneLog.Msgs, 0 ) as Log,
    coalesce( MonthLog.Msgs, 0 ) as Log_Monthly,
    coalesce( SMSSummary.Aggregate, 0 ) as p_monthly
from
    -- this will declare in-line variables if you do need to filter by a month, as a couple of your
    -- column result names imply, but have no other indicator of filtering by a given month.
    ( select @yesterday := date_sub( curdate(), interval 1 day ),
             @beginOfThatMonth := date_sub( @yesterday, interval dayOfMonth( @yesterday ) - 1 day ) ) sqlvars,
    companies c
        INNER JOIN company_users cu
            ON c.id = cu.company_id
            AND cu.confirmed = 1
            AND cu.deleted_at IS NULL
        LEFT JOIN
        ( SELECT
              t.company_id,
              SUM( t.message_count ) Msgs
          FROM
              table1 t
                  -- note: if a company can have several confirmed users, this join
                  -- repeats each table1 row once per user and inflates the SUM
                  INNER JOIN company_users cu
                      ON t.company_id = cu.company_id
                      AND cu.confirmed = 1
                      AND cu.deleted_at IS NULL
          where
              t.status != 'error'
          GROUP BY
              t.company_id ) AS PhoneSum
            on c.id = PhoneSum.company_id
        LEFT JOIN
        ( SELECT
              t.company_id,
              SUM( t.message_count ) Msgs
          FROM
              table1 t
                  INNER JOIN company_users cu
                      ON t.company_id = cu.company_id
                      AND cu.confirmed = 1
                      AND cu.deleted_at IS NULL
          where
              t.status != 'not error'
          GROUP BY
              t.company_id ) AS PhoneLog
            on c.id = PhoneLog.company_id
        LEFT JOIN
        ( SELECT
              t.company_id,
              SUM( t.message_count ) Msgs
          FROM
              table1 t
                  INNER JOIN company_users cu
                      ON t.company_id = cu.company_id
                      AND cu.confirmed = 1
                      AND cu.deleted_at IS NULL
          where
              t.status != 'error'
              -- this would only get counts of activity for the month currently active,
              -- but since you are running at night, you need the day before the current one
              AND t.SomeDateFieldOnTable1 >= @beginOfThatMonth
          GROUP BY
              t.company_id ) AS MonthLog
            on c.id = MonthLog.company_id
        LEFT JOIN
        ( SELECT
              m.company_id,
              SUM( m.number_of_sms ) aggregate
          FROM
              messages m
                  INNER JOIN company_users cu
                      ON m.company_id = cu.company_id
                      AND cu.confirmed = 1
                      AND cu.deleted_at IS NULL
          where
              m.SomeDateFieldOnMessagesTable >= @beginOfThatMonth
          GROUP BY
              m.company_id ) AS SMSSummary
            on c.id = SMSSummary.company_id
);

shows mysql records twice because of inner joining

In the query below, there are 13 mentors but the result shows 26, and there are 5 school supervisors but the result shows 10, which is wrong. This happens because the training has 2 evidence files; because of the 2 evidence rows, the Mentor and SchoolSupervisor values are doubled.
Please help me out.
Query:
select t.c_id,t.province,t.district,t.cohort,t.duration,t.venue,t.v_date,t.review_level, t.activity,
SUM(CASE WHEN pr.p_association = "Mentor" THEN 1 ELSE 0 END) as Mentor,
SUM(CASE WHEN pr.p_association = "School Supervisor" THEN 1 ELSE 0 END) as SchoolSupervisor,
(CASE WHEN count(file_id) > 0 THEN "Yes" ELSE "No" END) as evidence
FROM review_m t , review_attndnce ra
LEFT JOIN participant_registration AS pr ON pr.p_id = ra.p_id
LEFT JOIN review_files AS rf ON rf.training_id = ra.c_id
WHERE 1=1 AND t.c_id = ra.c_id
group by t.c_id, ra.c_id order by t.c_id desc

You may perform the aggregations in a separate subquery, and then join to it:
SELECT
t.c_id,
t.province,
t.district,
t.cohort,
t.duration,
t.venue,
t.v_date,
t.review_level,
t.activity,
pr.Mentor,
pr.SchoolSupervisor,
rf.evidence
FROM review_m t
INNER JOIN review_attndnce ra
ON t.c_id = ra.c_id
LEFT JOIN
(
SELECT
p_id,
COUNT(CASE WHEN p_association = 'Mentor' THEN 1 END) AS Mentor,
COUNT(CASE WHEN p_association = 'School Supervisor' THEN 1 END) AS SchoolSupervisor
FROM participant_registration
GROUP BY p_id
) pr
ON pr.p_id = ra.p_id
LEFT JOIN
(
SELECT
training_id,
CASE WHEN COUNT(file_id) > 0 THEN 'Yes' ELSE 'No' END AS evidence
FROM review_files
GROUP BY training_id
) rf
ON rf.training_id = ra.c_id
ORDER BY
t.c_id DESC;
Note that this also fixes another problem your query had, which was that you were selecting many columns which did not appear in the GROUP BY clause. Under this refactor, there is nothing wrong with your current select, because the aggregation takes place in a separate subquery.
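The doubling described in the question, and the effect of aggregating before joining, can be reproduced with a small sketch in Python and in-memory SQLite. The tables trainings, attendance and files are simplified, hypothetical stand-ins for the real schema, and the sketch assumes the desired output is one row per training, so the role counts are grouped by training id.

```python
import sqlite3

# One training with 3 attendees (2 mentors) and 2 evidence files.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trainings (c_id INTEGER PRIMARY KEY);
CREATE TABLE attendance (c_id INTEGER, role TEXT);
CREATE TABLE files (file_id INTEGER PRIMARY KEY, training_id INTEGER);
INSERT INTO trainings VALUES (1);
INSERT INTO attendance VALUES
    (1, 'Mentor'), (1, 'Mentor'), (1, 'School Supervisor');
INSERT INTO files (training_id) VALUES (1), (1);
""")

# Joining both child tables first pairs every attendee with every file,
# so each mentor is counted once per file.
naive = conn.execute("""
SELECT t.c_id, COUNT(CASE WHEN a.role = 'Mentor' THEN 1 END) AS mentors
FROM trainings t
LEFT JOIN attendance a ON a.c_id = t.c_id
LEFT JOIN files f ON f.training_id = t.c_id
GROUP BY t.c_id
""").fetchall()

# Aggregating each child table on its own yields at most one row per
# training, so the joins can no longer multiply rows.
pre_aggregated = conn.execute("""
SELECT t.c_id, a.mentors,
       CASE WHEN f.n > 0 THEN 'Yes' ELSE 'No' END AS evidence
FROM trainings t
LEFT JOIN (
    SELECT c_id, COUNT(CASE WHEN role = 'Mentor' THEN 1 END) AS mentors
    FROM attendance GROUP BY c_id
) a ON a.c_id = t.c_id
LEFT JOIN (
    SELECT training_id, COUNT(*) AS n
    FROM files GROUP BY training_id
) f ON f.training_id = t.c_id
""").fetchall()

print(naive)
print(pre_aggregated)
```

The naive query reports 4 mentors (2 mentors times 2 files), while the pre-aggregated version reports the expected 2.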

Try adding this to the WHERE part of your query:
AND pr.p_id IS NOT NULL AND rf.training_id IS NOT NULL

You can add pr.p_id to the GROUP BY to remove the duplicate records there. Since there is no grouping on pr at the moment, there might be multiple records with the same p_id for the same ra:
group by t.c_id, ra.c_id, pr.p_id order by t.c_id desc

mySQL UNION add column into generated query result

I have this query:
SELECT a.*, b.full_name as salesman
from sales a
LEFT JOIN user b ON a.salesman_id = b.id
WHERE a.deleted_at IS NULL AND (a.status = '1' || a.status = '2' )
AND a.balance <= 0
I want to add another column (payment_amount) to the generated result. It is not related to any of the columns from the first query.
After googling for a while, I came up with this query:
SELECT a.*, b.full_name as salesman from sales a
LEFT JOIN user b ON a.salesman_id = b.id
WHERE a.deleted_at IS NULL AND (a.status = '1' || a.status = '2' )
AND a.balance <= 0
UNION ALL
SELECT '','','','','','','','','','','','','','','','','','','','','','','','','','','',payment_amount from transaction
However, I can't see a payment_amount column in the generated result.
Please note that I can't edit the database.
The first query returns 28 columns.
What is the problem here? I have been dealing with it for hours.
Any help is really appreciated. Thank you.

UNION ALL will just add rows to the results of the previous query. What exactly is your problem? Also, if you are adding an extra column in your second query, you need to add a dummy column in the first one.
SELECT a.*, b.full_name as salesman,"" as payment_amount from sales a
LEFT JOIN user b ON a.salesman_id = b.id
WHERE a.deleted_at IS NULL AND (a.status = '1' || a.status = '2' )
AND a.balance <= 0
UNION ALL
SELECT '','','',... till 28 times,payment_amount from transaction

All rows in the result of a SQL query have the same column names. When you use UNION, the column names are taken from the names/aliases from the first subquery in the union. So in your case, the payment_amount will be in the column named salesman, since that's the corresponding column in the first subquery.
If you want it to be in a column of its own, you can add an extra column 0 AS payment_amount to the first subquery, and an extra '' to the second subquery.
SELECT a.*, b.full_name as salesman, 0 AS payment_amount
from sales a
LEFT JOIN user b ON a.salesman_id = b.id
WHERE a.deleted_at IS NULL AND (a.status = '1' || a.status = '2' )
AND a.balance <= 0
UNION ALL
SELECT '', '','','','','','','','','','','','','','','','','','','','','','','','','','','',payment_amount from transaction
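The naming rule can be checked with a tiny sketch in Python using an in-memory SQLite database, which follows the same convention of taking result column names from the first SELECT of a UNION. The two-column sales table and the payments table are hypothetical, reduced stand-ins for the 28-column query in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (id INTEGER, salesman TEXT);
CREATE TABLE payments (payment_amount INTEGER);
INSERT INTO sales VALUES (1, 'Ann');
INSERT INTO payments VALUES (250);
""")

# Same number of columns on both sides: the amount lands under "salesman".
cur = conn.execute("""
SELECT id, salesman FROM sales
UNION ALL
SELECT '', payment_amount FROM payments
""")
names = [d[0] for d in cur.description]
rows = cur.fetchall()

# Padding the first SELECT with a named dummy column gives the amount
# a column of its own.
cur = conn.execute("""
SELECT id, salesman, 0 AS payment_amount FROM sales
UNION ALL
SELECT '', '', payment_amount FROM payments
""")
padded_names = [d[0] for d in cur.description]
padded_rows = cur.fetchall()

print(names, rows)
print(padded_names, padded_rows)
```

In the first result there is no payment_amount column at all; the value 250 appears in the salesman position. In the second result the extra column exists because the first SELECT names it.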

Where clause inside joined select

I'm trying to accommodate a similar solution to this one - what I have is a SELECT query inside a JOIN, and the problem is that the query runs at full for all rows (I'm talking 60,000 rows per table - and it runs on 3 tables!).
So what I want to do, is add a WHERE clause to the SELECTs inside the JOIN.
But, I can't access the outer SELECT and get the proper WHERE condition I need.
The query I'm attempting is here:
SELECT c.compete_id AS id,
s.id AS store_id,
c.enabled AS enabled,
s.store_name AS store_name,
s.store_url AS store_url,
c.verified AS verified,
r.rating_total AS rating,
r.positive_percent AS percent,
r.type AS type
FROM compete_settings c
LEFT JOIN stores s
ON c.compete_id = s.id
LEFT JOIN (
(SELECT store_id, rating_total, positive_percent, 'ebay' AS type FROM ebay_sellers WHERE store_id = c.compete_id)
UNION
(SELECT store_id, rating_total, positive_percent, 'amazon' AS type FROM amazon_sellers WHERE store_id = c.compete_id)
UNION
(SELECT store_id, CASE WHEN rank = 0 THEN NULL ELSE (200000 - rank) END AS rating_total, '100' as positive_percent, 'alexa' AS type FROM alexa_ratings WHERE store_id = c.compete_id)
) AS r
ON c.compete_id = r.store_id
WHERE c.store_id = :store_id
Note, :store_id is a variable bound through the framework - let's imagine it's the number 12345.
How can I do this? Any ideas?

We ended up going with a different approach - we just JOINed everything and only selected the right columns with a CASE. Here's the final query:
SELECT c.id AS id,
s.id AS store_id,
c.enabled AS enabled,
s.store_name AS store_name,
s.store_url AS store_url,
c.verified AS verified,
(CASE WHEN eb.rating_total IS NOT NULL THEN eb.rating_total
WHEN am.rating_total IS NOT NULL THEN am.rating_total
WHEN ax.rank IS NOT NULL THEN ax.rank
END) AS rating,
(CASE WHEN eb.positive_percent IS NOT NULL THEN eb.positive_percent
WHEN am.positive_percent IS NOT NULL THEN am.positive_percent
WHEN ax.rank IS NOT NULL THEN '100'
END) AS percent,
(CASE WHEN eb.positive_percent IS NOT NULL THEN 'ebay'
WHEN am.positive_percent IS NOT NULL THEN 'amazon'
WHEN ax.rank IS NOT NULL THEN 'alexa'
END) AS type
FROM compete_settings c
LEFT JOIN stores s
ON c.compete_id = s.id
LEFT JOIN ebay_sellers eb ON c.compete_id = eb.store_id
LEFT JOIN amazon_sellers am ON c.compete_id = am.store_id
LEFT JOIN alexa_ratings ax ON c.compete_id = ax.store_id
WHERE c.store_id = :store_id
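A reduced sketch of this join-then-CASE approach, run from Python against in-memory SQLite, is shown below. It keeps only two of the three rating tables and a handful of columns, all with invented data, and the alias source_type is a hypothetical replacement for type in the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE compete_settings (compete_id INTEGER, store_id INTEGER);
CREATE TABLE ebay_sellers (store_id INTEGER, rating_total INTEGER);
CREATE TABLE amazon_sellers (store_id INTEGER, rating_total INTEGER);
INSERT INTO compete_settings VALUES (10, 12345), (11, 12345), (12, 99999);
INSERT INTO ebay_sellers VALUES (10, 480);
INSERT INTO amazon_sellers VALUES (11, 910);
""")

# The outer WHERE restricts compete_settings first; each LEFT JOIN then only
# needs the rows matching those compete_id values, and CASE picks whichever
# source actually matched.
rows = conn.execute("""
SELECT c.compete_id,
       CASE WHEN eb.rating_total IS NOT NULL THEN eb.rating_total
            WHEN am.rating_total IS NOT NULL THEN am.rating_total
       END AS rating,
       CASE WHEN eb.rating_total IS NOT NULL THEN 'ebay'
            WHEN am.rating_total IS NOT NULL THEN 'amazon'
       END AS source_type
FROM compete_settings c
LEFT JOIN ebay_sellers eb ON c.compete_id = eb.store_id
LEFT JOIN amazon_sellers am ON c.compete_id = am.store_id
WHERE c.store_id = ?
ORDER BY c.compete_id
""", (12345,)).fetchall()

print(rows)
```

Because the joins are plain table joins rather than a derived UNION table, the database can use indexes on store_id in each rating table instead of building the full union first.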
