Sorting results from joins - mysql

While running this query:
SELECT
a.id,
pub.name AS publisher_name,
pc.name AS placement_name,
b.name AS banner_name,
a.lead_id,
a.partner_id,
a.type,
l.status,
s.correctness,
a.landing_page,
t.name AS tracker_name,
a.date_view,
a.date_action
FROM actions AS a
LEFT JOIN publishers AS pub ON a.publisher_id = pub.id
LEFT JOIN placements AS pc ON pc.publisher_id = pub.id
LEFT JOIN banners AS b ON b.campaign_id = a.campaign_id
LEFT JOIN leads l ON
l.lead_id = a.lead_id
AND l.created = (
SELECT MAX(created) from leads l2 where l2.lead_id = l.lead_id
)
LEFT JOIN statuses AS s ON l.status = s.status
LEFT JOIN trackers AS t ON t.id = a.tracker_id
LIMIT 10
I am able to sort by every column from the actions table. However, when I try to, for example, ORDER BY b.name (from the banners table, joined on actions.banner_id) or ORDER BY l.lead_id (joined from leads on the more complex condition seen above), MySQL runs the query for a very long time (most tables have tens of thousands of records). Is it possible, performance-wise, to sort by joined columns?

You should rewrite the query with an inner join on the table that holds the column you want to sort on.
For example, if you sort on actions.banner_id:
SELECT ...
FROM actions AS a
JOIN banners AS b ON b.campaign_id = a.campaign_id
LEFT JOIN *rest of the query*
You will get the same results unless there are not enough banners that can be joined to actions to produce a total of 10 rows.
I'm guessing that's not the case; otherwise you wouldn't be sorting on banner_id.

You could first filter (order by, where, etc.) your records in a subquery and then join the result with the rest of the tables.
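As a rough sketch of that idea: assuming the sort column lives in actions (date_view is used here purely as an illustration), the derived table picks the 10 rows first and the joins then run only on that small set. Only a few of the original columns and joins are shown.

```sql
-- Sketch only: sort and limit inside the derived table, then join.
-- Assumption: sorting by actions.date_view; an index on that column
-- would let MySQL read the first 10 rows without a full sort.
SELECT
    a.id,
    pub.name AS publisher_name,
    t.name   AS tracker_name,
    a.date_view
FROM (
    SELECT id, publisher_id, tracker_id, date_view
    FROM actions
    ORDER BY date_view DESC
    LIMIT 10
) AS a
LEFT JOIN publishers AS pub ON pub.id = a.publisher_id
LEFT JOIN trackers   AS t   ON t.id = a.tracker_id
ORDER BY a.date_view DESC;
```

Joins that can match several rows per action (placements and banners in the original query) may return more than 10 rows in total. This pattern also does not help directly when the sort column comes from a joined table.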

Related

Multiple COUNT() in JOIN

I'm trying to get the number of rows of two different tables with two LEFT JOINs in a MySQL query. It works well when I have a COUNT on one table, like this:
SELECT a.title, a.image, COUNT(o.id) AS occasions
FROM activity a
LEFT JOIN occasion AS o ON a.id = o.activity_id
WHERE a.user_id = 1
GROUP BY a.id
ORDER BY a.created_at DESC
LIMIT 50
Here, everything works and I get the correct number of "occasions".
But when I add a second COUNT with an additional LEFT JOIN, the result of the second COUNT is wrong:
SELECT a.title, a.image, COUNT(o.id) AS occasions, COUNT(au.id) AS users
FROM activity a
LEFT JOIN occasion AS o ON a.id = o.activity_id
LEFT JOIN activity_user AS au ON a.id = au.activity_id
WHERE a.user_id = 4
GROUP BY a.id
ORDER BY a.created_at DESC
LIMIT 50
Here, I get the correct number of "occasions", but "users" seems to be a copy of the "occasions" count, which is wrong.
So my question is: how do I fix this query so that the two COUNTs work together?
COUNT() counts non-NULL values. The simple way to fix your query is to use COUNT(DISTINCT):
SELECT a.title, a.image,
COUNT(DISTINCT o.id) AS occasions, COUNT(DISTINCT au.id) AS users
. . .
And this will probably work. However, it creates an intermediate table that is the Cartesian product of the two tables (for each title). That could grow very big. The more scalable solution is to use subqueries and aggregate before joining.
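A minimal sketch of the "aggregate before joining" approach, using the table and column names from the question. Each derived table returns at most one row per activity, so the two counts can no longer multiply each other. COALESCE turns the NULL from an unmatched LEFT JOIN into 0.

```sql
-- Sketch: count each child table separately, then join the totals.
SELECT a.title, a.image,
       COALESCE(o.occasions, 0) AS occasions,
       COALESCE(au.users, 0)    AS users
FROM activity a
LEFT JOIN (
    SELECT activity_id, COUNT(*) AS occasions
    FROM occasion
    GROUP BY activity_id
) AS o ON o.activity_id = a.id
LEFT JOIN (
    SELECT activity_id, COUNT(*) AS users
    FROM activity_user
    GROUP BY activity_id
) AS au ON au.activity_id = a.id
WHERE a.user_id = 4
ORDER BY a.created_at DESC
LIMIT 50;
```

The outer query needs no GROUP BY because the join no longer fans out.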
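The LEFT JOIN on activity_user limits your result because the DB is not able to find related data. When you use LEFT OUTER JOIN, it should return all expected rows and their count. Note, however, that in MySQL LEFT JOIN and LEFT OUTER JOIN are synonyms, so this change by itself does not alter the result.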

MySQL - join two tables, group and count

I have two tables:
reviewStatusPhases - id|name
and
userPhase - id|reviewStatusPhase_id|user_id|created_at|updated_at
The reviewStatusPhases table has records inserted (Active, Inactive, On Pause, Terminated...), and userPhase is empty.
The tables are connected via
userPhase.reviewStatusPhase_id = reviewStatusPhases.id
one to one.
Is it possible to get all reviewStatusPhases in one query and count how many users are in each phase? In this case I would get something like this:
Active (0 Users)
Inactive (0 Users)
On Pause (0 Users)
Terminated (0 Users)
I'm making some assumptions here (e.g. INNER JOIN versus LEFT JOIN in the join, and DISTINCT in the count), but it sounds like you just want
SELECT reviewStatusPhases.name, COUNT(DISTINCT userPhase.user_id)
FROM userPhase INNER JOIN reviewStatusPhases
ON userPhase.reviewStatusPhase_id = reviewStatusPhases.id
GROUP BY reviewStatusPhases.name
The query will be as follows:
SELECT r.name AS `name`, COUNT(u.id) AS `count`
FROM reviewStatusPhases r
LEFT OUTER JOIN userPhase u ON r.id = u.reviewStatusPhase_id
GROUP BY r.name
The LEFT OUTER JOIN keeps reviewStatusPhases on the left so that all names are shown.
The result is grouped by the reviewStatusPhases name.
It displays each reviewStatusPhases name with the count of matched userPhase rows (COUNT(u.id) ignores NULL values).
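To make the point about NULL values concrete, here is a small sketch on the question's tables that contrasts COUNT(u.id) with COUNT(*). The column aliases are illustrative only.

```sql
-- A phase with no users still yields one joined row, with every
-- userPhase column set to NULL.
SELECT r.name,
       COUNT(u.id) AS users,        -- ignores the NULL: 0 for an empty phase
       COUNT(*)    AS joined_rows   -- counts the row itself: 1 for an empty phase
FROM reviewStatusPhases r
LEFT JOIN userPhase u ON u.reviewStatusPhase_id = r.id
GROUP BY r.name;
```

This is why the count has to target a column from the right-hand table and not the row.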
Use LEFT JOIN as follows:
SELECT COUNT(m.UserId) FROM Table1 m
LEFT JOIN Table2 k ON k.StatusId = m.StatusId
WHERE k.Status = 'Inactive'
You can easily use the Status column to track the users and their activities. In your case, ReviewStatus.
I hope the following will be helpful
SELECT RPS.Name, COUNT(UP.user_id)
FROM reviewStatusPhases RPS
LEFT OUTER JOIN userPhase UP ON RPS.id = UP.reviewStatusPhase_id
GROUP BY RPS.Name
ORDER BY RPS.Name
SELECT
DISTINCT s.s_level AS 'Level',
COUNT(DISTINCT s.s_id) AS Schools,
COUNT(DISTINCT st.st_id) AS Teachers
FROM schools AS s
JOIN school_teachers AS st ON st.st_school_idFk = s.s_id AND st.st_status = 1
WHERE s.s_status = 1
GROUP BY s.s_level

Multiple LEFT JOIN in SQL Running Slow. How do I optimize it?

I am combining three tables - persons, properties, totals - using LEFT JOIN. I find the following query to be really fast but it does not give me all rows from table-1 for which there is no corresponding data in table-2 or table-3. Basically, it gives me only rows where there is data in table-2 and table-3.
SELECT a.*, b.propery_address, c.person_asset_total
FROM persons AS a
LEFT JOIN properties AS b ON a.id = b.person_id
LEFT JOIN totals AS c ON a.id = c.person_id
WHERE a.city = 'New York' AND
c.description = 'Total Immovable'
Whereas the following query gives me the correct result by including all rows from table-1 irrespective of whether there is corresponding data or no data from table-2 and table-3. However, this query is taking a really long processing time.
SELECT a.*, b.propery_address, c.person_asset_total
FROM persons AS a
LEFT JOIN
properties AS b ON a.id = b.person_id
LEFT JOIN
(SELECT person_id, person_asset_total
FROM totals
WHERE description = 'Total Immovable'
) AS c ON a.id = c.person_id
WHERE a.city = 'New York'
Is there a better way to write a query that will give data equivalent to second query but with speed of execution equivalent to the first query?
Don't use a subquery:
SELECT p.*, pr.propery_address, t.person_asset_total
FROM persons p LEFT JOIN
properties pr
ON p.id = pr.person_id LEFT JOIN
totals t
ON p.id = t.person_id AND t.description = 'Total Immovable'
WHERE p.city = 'New York';
Your approach would be fine in almost any other database. However, MySQL (before version 5.7) always materializes "derived tables", which makes them much harder to optimize. The query above returns the same result.
You will also notice that I changed the table aliases to be abbreviations for the table names. This makes the query much easier to follow.
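The reason the first query in the question lost rows is where the filter on totals sits. A hedged side-by-side sketch, trimmed to two tables for brevity:

```sql
-- Filter in WHERE: a person without a matching totals row has
-- t.description = NULL after the LEFT JOIN, the comparison is not
-- true, and the row is discarded (effectively an inner join).
SELECT p.id, t.person_asset_total
FROM persons p
LEFT JOIN totals t ON t.person_id = p.id
WHERE p.city = 'New York'
  AND t.description = 'Total Immovable';

-- Filter in ON: the condition only decides which totals rows match,
-- so every New York person is kept, with NULL where nothing matches.
SELECT p.id, t.person_asset_total
FROM persons p
LEFT JOIN totals t ON t.person_id = p.id
                  AND t.description = 'Total Immovable'
WHERE p.city = 'New York';
```

Conditions on the left-hand table (p.city) can stay in WHERE, since they do not depend on whether the join matched.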

Writing SQL code using COUNT and GROUP_CONCAT

I've already read every post with a similar title but didn't find the right answer.
What I really need to do is count some data from a MySQL table and then apply GROUP_CONCAT, because I get more than one row.
My table looks like this
and here is how I tried to run the query
SELECT
count(cal.day) * 8,
w.name
FROM claim as c
RIGHT JOIN calendar as cal ON c.id = cal.claim_id
RIGHT JOIN worker as w ON c.worker_id = w.id
GROUP BY c.id
ORDER BY w.name asc
But for some workers I get more than one row, and I can't GROUP_CONCAT them because of COUNT(). I need this for a MySQL procedure I'm writing, so please help me if you can.
I hope I've given you enough information.
Edit for Dylan:
See the difference in output
GROUP BY w.id
GROUP BY c.id
MySQL does not allow nesting one aggregate function inside another, as in GROUP_CONCAT(COUNT(...)).
Therefore, we can use a subquery as a workaround, as below.
SELECT
GROUP_CONCAT(t.cnt_cal_day) as cnt_days,
t.name
FROM
(
SELECT
count(cal.day) * 8 as cnt_cal_day,
w.name
FROM claim as c
RIGHT JOIN calendar as cal ON c.id = cal.claim_id
RIGHT JOIN worker as w ON c.worker_id = w.id
GROUP BY c.id
ORDER BY w.name asc
) t
GROUP BY t.name
While the question is still not clear to me, I will try to guess what you need.
This query:
SELECT
w.name,
COUNT(cal.day) * 8 AS nb_hours
FROM worker w
LEFT JOIN claim c ON w.id = c.worker_id
INNER JOIN calendar cal ON c.id = cal.claim_id
GROUP BY w.id
ORDER BY w.name ASC
returns the name of each worker and the number of hours of vacation approved for them. Because calendar is joined with INNER JOIN, workers without any calendar entries are dropped from the result.
If you use LEFT JOIN calendar instead you will get the number of hours of vacation claimed by each worker (approved and not approved). In order to separate them you should make the query like this:
SELECT
w.name,
c.approved, # <---- I assumed the name of this field
COUNT(cal.day) * 8 AS nb_hours
FROM worker w
LEFT JOIN claim c ON w.id = c.worker_id
LEFT JOIN calendar cal ON c.id = cal.claim_id
GROUP BY w.id, c.approved
ORDER BY w.name ASC
This query should return 1 or 2 rows for each worker, depending on the types of vacation claims they have (none, approved only, not approved only, both). For workers that don't have any vacation claim, the query returns NULL in column approved and 0 in column nb_hours.

mysql subquery inside a LEFT JOIN

I have a query that needs the most recent record from a secondary table called tbl_emails_sent.
That table holds all the emails sent to clients. And most clients have several to hundreds of emails recorded. I want to pull a query that displays the most recent.
Example:
SELECT c.name, c.email, e.datesent
FROM `tbl_customers` c
LEFT JOIN `tbl_emails_sent` e ON c.customerid = e.customerid
I'm guessing a LEFT JOIN with a subquery would be used, but I don't delve into subqueries much. Am I going the right direction?
Currently the query above isn't optimized for specifying the most recent record in the table, so I need a little assistance.
It should look like this. You need a separate subquery to get the maximum (latest) date on which an email was sent to each customer.
SELECT a.*, b.*
FROM tbl_customers a
INNER JOIN tbl_emails_sent b
ON a.customerid = b.customerid
INNER JOIN
(
SELECT customerid, MAX(datesent) maxSent
FROM tbl_emails_sent
GROUP BY customerid
) c ON c.customerid = b.customerid AND
c.maxSent = b.datesent
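If customers without any recorded email should still appear, as the LEFT JOIN in the question suggests, one possible variant is to left-join the aggregated subquery directly. This is only a sketch and assumes datesent is the only column needed from tbl_emails_sent:

```sql
-- Sketch: one row per customer; datesent is NULL when no email exists.
SELECT c.name, c.email, e.datesent
FROM tbl_customers c
LEFT JOIN (
    SELECT customerid, MAX(datesent) AS datesent
    FROM tbl_emails_sent
    GROUP BY customerid
) AS e ON e.customerid = c.customerid;
```

When other columns of the latest email are needed, the extra join back to tbl_emails_sent on customerid and datesent is still required.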
Would this not work?
SELECT t1.datesent,t1.customerid,t2.email,t2.name
FROM
(SELECT MAX(datesent) AS datesent, customerid
FROM `tbl_emails_sent`
GROUP BY customerid
) AS t1
INNER JOIN `tbl_customers` as t2
ON t1.customerid=t2.customerid
The only issue you have then is what happens if two datesent values are the same: what is the deciding factor in which one gets picked?
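One way to make the tie-break explicit is a window function, which requires MySQL 8.0 or later. The sketch below assumes a unique emailid column in tbl_emails_sent. That column is hypothetical and does not appear in the question.

```sql
-- Sketch: number each customer's emails from newest to oldest.
-- emailid is an assumed unique key, used only to break ties
-- between rows with the same datesent.
SELECT c.name, c.email, e.datesent
FROM tbl_customers c
LEFT JOIN (
    SELECT customerid, datesent,
           ROW_NUMBER() OVER (
               PARTITION BY customerid
               ORDER BY datesent DESC, emailid DESC
           ) AS rn
    FROM tbl_emails_sent
) AS e ON e.customerid = c.customerid
      AND e.rn = 1;
```

Exactly one row per customer gets rn = 1, so duplicate datesent values can no longer produce duplicate result rows.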