SQL Subquery optimization multiple aggregate operations - mysql

I'm trying to find the averages and sums of different tables and group by the project. I also want to condense the returned table into a single row.
So for my subquery I get this result
And then, using the outer query, I expect this:
This is my SQL code so far. It works, but its performance is very slow and I'm not sure why or how to optimize it.
select sum(sub.count) as count, avg(sub.opened) as opened,
avg(sub.clicked) as clicked, avg(sub.started_watching) as started_watching,
sum(sub.views) as views
from (
select p.id, count(e.id) as count,
avg(e.opened) as opened, avg(e.read_email) as clicked,
avg(e.started_video) as started_watching, sum(e.views) as views
from projects p
inner join guests g
on g.project_id = p.id
inner join videos v
on v.guest_id = g.id
inner join emails e
on e.video_id=v.id
group by p.id) sub;

Looking at your query, the main cost is in building the subquery result,
so it is important to improve that query, e.g. by adding proper indexes on the columns involved in the joins for each table,
and possibly a composite index that also covers the columns used in the select list (placed after the column used in the join).
Select sum(sub.count) as count
, avg(sub.opened) as opened
, avg(sub.clicked) as clicked
, avg(sub.started_watching) as started_watching
, sum(sub.views) as views
from (
select p.id
, count(e.id) as count
, avg(e.opened) as opened
, avg(e.read_email) as clicked
, avg(e.started_video) as started_watching
, sum(e.views) as views
from projects p
inner join guests g on g.project_id = p.id
inner join videos v on v.guest_id = g.id
inner join emails e on e.video_id=v.id
group by p.id
) sub;
So be sure you have proper indexes on:
guests (project_id)
videos (guest_id)
emails (video_id, id, opened, read_email, started_video, views)
and, obviously, on projects (id)
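A minimal sketch of the index advice above, using Python's built-in sqlite3 module as a stand-in for MySQL. The column types, index names and sample rows are assumptions made for illustration, and SQLite's planner is not MySQL's, so the printed plan only shows the general idea. On MySQL you would run EXPLAIN on the same query instead.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema.
# Column types, index names and sample rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projects (id INTEGER PRIMARY KEY);
CREATE TABLE guests   (id INTEGER PRIMARY KEY, project_id INTEGER);
CREATE TABLE videos   (id INTEGER PRIMARY KEY, guest_id INTEGER);
CREATE TABLE emails   (id INTEGER PRIMARY KEY, video_id INTEGER,
                       opened INTEGER, read_email INTEGER,
                       started_video INTEGER, views INTEGER);

-- Indexes on the join columns, plus a composite index on emails.
CREATE INDEX idx_guests_project ON guests (project_id);
CREATE INDEX idx_videos_guest   ON videos (guest_id);
CREATE INDEX idx_emails_cover   ON emails
    (video_id, id, opened, read_email, started_video, views);

INSERT INTO projects VALUES (1), (2);
INSERT INTO guests   VALUES (1, 1), (2, 2);
INSERT INTO videos   VALUES (1, 1), (2, 2);
INSERT INTO emails   VALUES (1, 1, 1, 1, 0, 4),
                            (2, 1, 0, 0, 0, 2),
                            (3, 2, 1, 0, 1, 6);
""")

QUERY = """
SELECT SUM(sub.count) AS count, AVG(sub.opened) AS opened,
       AVG(sub.clicked) AS clicked,
       AVG(sub.started_watching) AS started_watching,
       SUM(sub.views) AS views
FROM (
    SELECT p.id, COUNT(e.id) AS count,
           AVG(e.opened) AS opened, AVG(e.read_email) AS clicked,
           AVG(e.started_video) AS started_watching,
           SUM(e.views) AS views
    FROM projects p
    INNER JOIN guests g ON g.project_id = p.id
    INNER JOIN videos v ON v.guest_id = g.id
    INNER JOIN emails e ON e.video_id = v.id
    GROUP BY p.id
) sub
"""

# The single summary row produced by the outer query.
row = conn.execute(QUERY).fetchone()
print(row)

# The last column of each plan row is the human-readable step description.
plan = [r[-1] for r in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]
for step in plan:
    print(step)
```

Comparing the plan before and after creating the indexes is usually the quickest way to see whether they are actually being used.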

Related

GROUP_CONCAT(SELECT * FROM table) is something like this possible?

So I'd like to join two tables using an association table because table 'a' connects to multiple rows in table 'b', and I'd like to get all the results from each table that matches my a.value but the values from table 'b' should be concatenated somehow.
Right now my query looks like this:
SELECT *,
GROUP_CONCAT(DISTINCT g.group_name) group_name,
GROUP_CONCAT(DISTINCT g.group_profile_id) group_profile_id,
GROUP_CONCAT(DISTINCT g.group_publications) group_publications,
GROUP_CONCAT(DISTINCT g.group_followers) group_followers,
GROUP_CONCAT(DISTINCT g.group_ongoing_projects) group_ongoing_projects,
GROUP_CONCAT(DISTINCT g.group_finished_projects) group_finished_projects,
GROUP_CONCAT(DISTINCT g.group_members) group_members
FROM projects_groups pg
LEFT JOIN projects p on p.projectid = pg.projectid
RIGHT JOIN groups g on g.groupid = pg.groupid
WHERE p.projectid IN ($projectIds)
GROUP BY p.projectid
The problem is that I'm not even selecting half of the columns of the groups table, but I'd like to get all of them. I could write down all the columns like above, but it looks really ugly and I'd also have to modify this every time I alter my table.
To further explain the issue: a project can connect to multiple groups, and I'd like to get the project data with all the data of its groups. I could query the groups separately, but that doesn't seem logical, because I'd have to do it for each project (e.g. for 100 projects). So I'd get all the projects in one query and then run a query for each project to get its groups. Or I could get the projects in one query joined with the association table, then take the ids from the association table and make a second query using these ids to get all the groups through the association, but I'm looking for a simpler solution.
Pity there is no sample data nor expected result included with this question, however there are a few syntax issues worth mentioning. You start your FROM clause with this:
FROM projects_groups pg
LEFT JOIN projects p on p.projectid = pg.projectid
This allows all rows from projects_groups even if there is no associated project. If there is no associated project then all projects.* columns would be NULL. However in the where clause you require this: WHERE p.projectid IN ($projectIds) which means that any NULLs from the projects table would be ignored and hence that LEFT JOIN is useless.
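The effect described above can be checked with a small sketch. It uses Python's sqlite3 module rather than MySQL, and the sample rows (including the orphaned association row with projectid 99) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projects (projectid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE projects_groups (projectid INTEGER, groupid INTEGER);
INSERT INTO projects VALUES (1, 'alpha');
-- The second association row points at a project that does not exist.
INSERT INTO projects_groups VALUES (1, 10), (99, 20);
""")

# Plain LEFT JOIN: the orphaned row survives, with NULL project columns.
left_only = conn.execute("""
    SELECT pg.projectid, p.projectid
    FROM projects_groups pg
    LEFT JOIN projects p ON p.projectid = pg.projectid
    ORDER BY pg.projectid
""").fetchall()

# Same LEFT JOIN, but the WHERE clause tests a column of the joined table.
# NULL never satisfies IN (...), so the orphaned row is filtered out again.
left_where = conn.execute("""
    SELECT pg.projectid, p.projectid
    FROM projects_groups pg
    LEFT JOIN projects p ON p.projectid = pg.projectid
    WHERE p.projectid IN (1, 99)
    ORDER BY pg.projectid
""").fetchall()

# An INNER JOIN with the same filter returns exactly the same rows.
inner = conn.execute("""
    SELECT pg.projectid, p.projectid
    FROM projects_groups pg
    INNER JOIN projects p ON p.projectid = pg.projectid
    WHERE p.projectid IN (1, 99)
    ORDER BY pg.projectid
""").fetchall()

print(left_only)
print(left_where)
print(inner)
```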
So, as projects seems to be the primary concern, use it in the FROM clause and then join the other tables as appropriate to your needs. My guess is that you can have a project without a project group (even if only temporarily), so you might use a left join here, and the query may look like this:
FROM projects p
LEFT JOIN projects_groups pg on p.projectid = pg.projectid
WHERE p.projectid IN ($projectIds)
but note you would only do that if you wanted to list projects even if they have no associated project groups. So making the assumption you do need that then continue this left join to groups as well:
FROM projects p
LEFT JOIN projects_groups pg on p.projectid = pg.projectid
LEFT JOIN groups g on g.groupid = pg.groupid
WHERE p.projectid IN ($projectIds)
If the assumption is wrong, then it will be more efficient to use inner joins:
FROM projects p
INNER JOIN projects_groups pg on p.projectid = pg.projectid
INNER JOIN groups g on g.groupid = pg.groupid
WHERE p.projectid IN ($projectIds)
Now we need to look at the SELECT & GROUP BY clauses. MySQL has historically allowed a very non-standard GROUP BY syntax and the default settings allowed a query like yours where only one column is listed to GROUP BY.
e.g. in the past this was allowed:
select p.* from projects p group by p.id
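This is very bad practice, and current versions of MySQL (5.7.5 and later) enable ONLY_FULL_GROUP_BY by default, which is much closer to standard SQL. Under those rules every non-aggregated column in the select list must either appear in the GROUP BY clause or be functionally dependent on it, so the example immediately above is rejected with an error unless p.id is the primary key (or a unique NOT NULL key) of projects.
So you should always be precise about the GROUP BY columns, e.g.: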
select p.id, p.name, count(*) from projects p group by p.id, p.name
So to help avoid syntax errors you might prefer to use a "derived table" like this:
SELECT
p.*
, d.*
FROM projects p
INNER JOIN (
SELECT
p.projectid
, GROUP_CONCAT(DISTINCT g.group_name) group_name
, GROUP_CONCAT(DISTINCT g.group_profile_id) group_profile_id
, GROUP_CONCAT(DISTINCT g.group_publications) group_publications
, GROUP_CONCAT(DISTINCT g.group_followers) group_followers
, GROUP_CONCAT(DISTINCT g.group_ongoing_projects) group_ongoing_projects
, GROUP_CONCAT(DISTINCT g.group_finished_projects) group_finished_projects
, GROUP_CONCAT(DISTINCT g.group_members) group_members
FROM projects p
LEFT JOIN projects_groups pg on p.projectid = pg.projectid
LEFT JOIN groups g on g.groupid = pg.groupid
WHERE p.projectid IN ($projectIds)
GROUP BY
p.projectid
) d on p.projectid = d.projectid
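Here is a rough, runnable sketch of the derived-table pattern above, using Python's sqlite3 module instead of MySQL. The column lists are trimmed, the sample rows are invented, and the table name is quoted because groups is a keyword in some engines (including MySQL 8). SQLite's group_concat does not guarantee an order, so the script sorts the concatenated values before printing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projects (projectid INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE "groups" (groupid INTEGER PRIMARY KEY,
                       group_name TEXT, group_members INTEGER);
CREATE TABLE projects_groups (projectid INTEGER, groupid INTEGER);

INSERT INTO projects VALUES (1, 'with groups'), (2, 'without groups');
INSERT INTO "groups" VALUES (10, 'research', 5), (20, 'ops', 3);
INSERT INTO projects_groups VALUES (1, 10), (1, 20);
""")

project_ids = (1, 2)  # plays the role of $projectIds, bound as parameters

rows = conn.execute("""
    SELECT p.projectid, p.title, d.group_name, d.group_members
    FROM projects p
    INNER JOIN (
        -- One row per project; group columns collapsed into lists.
        SELECT p.projectid,
               GROUP_CONCAT(DISTINCT g.group_name)    AS group_name,
               GROUP_CONCAT(DISTINCT g.group_members) AS group_members
        FROM projects p
        LEFT JOIN projects_groups pg ON p.projectid = pg.projectid
        LEFT JOIN "groups" g ON g.groupid = pg.groupid
        WHERE p.projectid IN (?, ?)
        GROUP BY p.projectid
    ) d ON p.projectid = d.projectid
    ORDER BY p.projectid
""", project_ids).fetchall()

for projectid, title, names, members in rows:
    # A project without groups yields NULL (None) for the concatenated columns.
    name_list = sorted(names.split(",")) if names is not None else []
    print(projectid, title, name_list, members)
```

Because the grouping happens inside the derived table, the outer query can select as many projects columns as it likes without touching the GROUP BY clause.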

MySQL JOIN tables with COUNT values

I have the following tables in my database. I only listed the important columns which can be used for joining.
I need to get the following output
Currently I'm using two separate queries, one for each COUNT value
For assigned licenses
select
products.id,products.name,COUNT(assigned_licenses.id)
from
deployment_users
inner join
assigned_licenses
on
deployment_users.id = assigned_licenses.deployment_user_id
inner join
products
on
assigned_licenses.id = products.id
and
deployment_users.customer_id = 10
group by
assigned_licenses.id
;
For total licenses
select
products.id,products.name,COUNT(total_licenses.id)
from
customers
inner join
total_licenses
on
customers.iccode = licenses.iccode
inner join
products
on
total_licenses.id = products.id
and
customers.id = 10
group by
total_licenses.id
;
Since there are more than 1,000 products that need to be listed, I want to combine them into a single query. How can I do that?
Your specification leaves some room for interpretation (e.g. can a user have assigned licenses without total licenses? If so, my query will fail), but I would go with this:
SELECT
products.id,
products.name,
Count(Distinct total_licenses.id) As CountTotalLicenses,
Count(Distinct assigned_licenses.deployment_users_id) As CountAssignedLicenses
FROM products
LEFT JOIN total_licenses ON total_licenses.products_id = products.id
LEFT JOIN customers ON customers.iccode = total_licenses.customers_iccode
LEFT JOIN assigned_licenses ON assigned_licenses.total_licenses_id = total_licenses.id
WHERE
customers.id = 10
GROUP BY
products.id,
products.name
For the future, it would be awesome if you could paste code as code and not as an image. People cannot simply copy and paste snippets of your code and have to type everything again...
Try joining both of your queries as derived tables:
SELECT * FROM (
(First Query) as assigned_licn
INNER JOIN
(Second Query) as total_licn
USING (id)
);
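One way to read the suggestion above is sketched here with Python's sqlite3 module. The original table layout was only posted as an image, so the schema is hypothetical: the product_id columns, the sample rows and the two per-product count queries are all assumptions, not the asker's real queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified schema: each license row points at a product.
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE assigned_licenses (id INTEGER PRIMARY KEY, product_id INTEGER);
CREATE TABLE total_licenses    (id INTEGER PRIMARY KEY, product_id INTEGER);

INSERT INTO products VALUES (1, 'A'), (2, 'B');
INSERT INTO assigned_licenses VALUES (1, 1), (2, 1);
INSERT INTO total_licenses    VALUES (1, 1), (2, 1), (3, 1), (4, 2);
""")

rows = conn.execute("""
    SELECT id, name, assigned_count, total_count
    FROM (
        -- "First query": assigned licenses per product.
        SELECT p.id, p.name, COUNT(a.id) AS assigned_count
        FROM products p
        LEFT JOIN assigned_licenses a ON a.product_id = p.id
        GROUP BY p.id, p.name
    ) AS assigned_licn
    INNER JOIN (
        -- "Second query": total licenses per product.
        SELECT p.id, COUNT(t.id) AS total_count
        FROM products p
        LEFT JOIN total_licenses t ON t.product_id = p.id
        GROUP BY p.id
    ) AS total_licn USING (id)
    ORDER BY id
""").fetchall()

print(rows)
```

Aggregating each count in its own derived table before joining avoids the row multiplication that happens when both license tables are joined to products in a single GROUP BY.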

Combining join results with an at least function

I'm new to SQL and have managed to pick up the basic functions capably enough, however I'm now trying to find the people with at least two tokens from the results of an inner join:
SELECT
users.[First Name],
users.[Last Name],
IssuedTokens.UserID,
IssuedTokens.TokenID,
Tokens.TokenType
FROM IssuedTokens
INNER JOIN users ON users.ID = IssuedTokens.UserID
INNER JOIN Tokens ON Tokens.number = IssuedTokens.TokenID
GROUP BY IssuedTokens.UserID
HAVING COUNT(*) >= 2
ORDER BY IssuedTokens.UserID
This gives the error:
Column 'Users.First Name' is invalid in the select list because it is
not contained in either an aggregate function or the GROUP BY clause.
I'm comfortable using functions on pre-existing tables, but have not seen how to manipulate the results of a join. If anyone could help it would be much appreciated.
You can do a separate aggregation -- before the join -- to get the users with multiple tokens. Then, the rest of the query doesn't need an aggregation:
SELECT u.[First Name], u.[Last Name], it.UserID, it.TokenID, t.TokenType
FROM IssuedTokens it INNER JOIN
users u
ON u.ID = it.UserID INNER JOIN
Tokens t
ON t.number = it.TokenID INNER JOIN
(SELECT it.UserId
FROM IssuedTokens it
GROUP BY it.UserId
HAVING COUNT(*) >= 2
) itu
ON itu.UserId = it.UserId
ORDER BY it.UserID;
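The pre-aggregation idea can be tried out with the following sketch. It runs on SQLite through Python's sqlite3 module (which also accepts the bracketed column names) rather than SQL Server. The sample users and tokens are invented, the inner alias is renamed to i2 for readability, and TokenID is added to the ORDER BY only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (ID INTEGER PRIMARY KEY,
                    [First Name] TEXT, [Last Name] TEXT);
CREATE TABLE Tokens (number INTEGER PRIMARY KEY, TokenType TEXT);
CREATE TABLE IssuedTokens (UserID INTEGER, TokenID INTEGER);

INSERT INTO users  VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
INSERT INTO Tokens VALUES (100, 'hard'), (101, 'soft');
-- Ann holds two tokens, Bob only one.
INSERT INTO IssuedTokens VALUES (1, 100), (1, 101), (2, 100);
""")

rows = conn.execute("""
    SELECT u.[First Name], u.[Last Name], it.UserID, it.TokenID, t.TokenType
    FROM IssuedTokens it
    INNER JOIN users u  ON u.ID = it.UserID
    INNER JOIN Tokens t ON t.number = it.TokenID
    INNER JOIN (
        -- Aggregate first: keep only users with at least two tokens.
        SELECT i2.UserID
        FROM IssuedTokens i2
        GROUP BY i2.UserID
        HAVING COUNT(*) >= 2
    ) itu ON itu.UserID = it.UserID
    ORDER BY it.UserID, it.TokenID
""").fetchall()

for r in rows:
    print(r)
```

Since the grouping is confined to the derived table, the outer select list can contain any columns without triggering the "not contained in either an aggregate function or the GROUP BY clause" error.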

MySQL query optimization: Multiple SELECT IN to LEFT JOIN

I usually go with the join approach but in this case I am a bit confused. I am not even sure that it is possible at all. I wonder if the following query can be converted to a left join query instead of the multiple select in used:
select
users.id, users.first_name, users.last_name, users.description, users.email
from users
where id in (
select assigned.id_user from assigned where id_project in (
select assigned.id_project from assigned where id_user = 1
)
)
or id in (
select projects.id_user from projects where projects.id in (
select assigned.id_project from assigned where id_user = 1
)
)
This query returns the correct result set. However, I guess the repetition of the query that selects assigned.id_project is a waste.
You could start with the project assignments of user 1 a1. Then find all assignments of other people to those projects a2, and the user in the project table p. The users you are looking for are then in either a2 or p. I added distinct to remove users who can be reached in both ways.
select distinct u.*
from assigned a1
left join
assigned a2
on a1.id_project = a2.id_project
left join
projects p
on a1.id_project = p.id
join users u
on u.id = a2.id_user
or u.id = p.id_user
where a1.id_user = 1
Since both subqueries have a condition where assigned.id_user = 1, I start with that query. Let's call that assignment(s) the 'leading assignment'.
Then join the rest, using left joins for the 'optional' tables.
Use an inner join on user that matches either users of assignments linked to the leading assignment or users of projects linked to the leading project.
I use distinct because I assume you'd want each user only once, even if they have an assignment and a project (or multiple projects).
select distinct
u.id, u.first_name, u.last_name, u.description, u.email
from
assigned a
left join assigned ap on ap.id_project = a.id_project
left join projects p on p.id = a.id_project
inner join users u on u.id = ap.id_user or u.id = p.id_user
where
a.id_user = 1
Here's an alternative way to get rid of the repetition:
SELECT
users.id,
users.first_name,
users.last_name,
users.description,
users.email
FROM users
WHERE id IN (
SELECT up.id_user
FROM (
SELECT id_user, id_project FROM assigned
UNION ALL
SELECT id_user, id FROM projects
) up
INNER JOIN assigned a
ON a.id_project = up.id_project
WHERE a.id_user = 1
)
;
That is, the assigned table's pairs of id_user, id_project are UNIONed with those of projects. The resulting set is then joined with the projects of the id_user = 1 user to obtain the list of all users who share projects with that user. And now it only remains to retrieve the details for those users, which in this case is done in the same way as in your query, i.e. using an IN clause.
I'm sorry to say that I don't have MySQL at hand to thoroughly test the performance of this query, so I cannot be quite sure whether it is in any way better or worse than your original query or than the one suggested by both @GolezTrol and @Andomar. Generally I tend to agree with @GolezTrol's comment that a query with simple (semi- or whatever-) joins and repetitive parts might turn out more efficient than an equivalent sophisticated query that doesn't have repetitions. In the end, however, it is testing that must reveal the final answer for you.
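As a starting point for that kind of testing, here is a small sketch that checks the three formulations return the same users. It runs on SQLite via Python's sqlite3 module, with invented sample data and only the id column selected, so it says nothing about MySQL performance. It only confirms that the rewrites are equivalent on this data set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT);
CREATE TABLE projects (id INTEGER PRIMARY KEY, id_user INTEGER);
CREATE TABLE assigned (id_user INTEGER, id_project INTEGER);

INSERT INTO users VALUES (1,'a'), (2,'b'), (3,'c'), (4,'d'), (5,'e');
-- Project 10 is owned by user 3, project 20 by user 5.
INSERT INTO projects VALUES (10, 3), (20, 5);
-- Users 1 and 2 work on project 10, user 4 on project 20.
INSERT INTO assigned VALUES (1, 10), (2, 10), (4, 20);
""")

QUERIES = {
    "nested IN": """
        SELECT users.id FROM users
        WHERE id IN (
            SELECT assigned.id_user FROM assigned WHERE id_project IN (
                SELECT assigned.id_project FROM assigned WHERE id_user = 1))
           OR id IN (
            SELECT projects.id_user FROM projects WHERE projects.id IN (
                SELECT assigned.id_project FROM assigned WHERE id_user = 1))
    """,
    "left joins": """
        SELECT DISTINCT u.id
        FROM assigned a
        LEFT JOIN assigned ap ON ap.id_project = a.id_project
        LEFT JOIN projects p ON p.id = a.id_project
        INNER JOIN users u ON u.id = ap.id_user OR u.id = p.id_user
        WHERE a.id_user = 1
    """,
    "union": """
        SELECT users.id FROM users
        WHERE id IN (
            SELECT up.id_user
            FROM (
                SELECT id_user, id_project FROM assigned
                UNION ALL
                SELECT id_user, id FROM projects
            ) up
            INNER JOIN assigned a ON a.id_project = up.id_project
            WHERE a.id_user = 1)
    """,
}

# Run every variant and collect the sorted user ids it returns.
results = {name: sorted(r[0] for r in conn.execute(sql))
           for name, sql in QUERIES.items()}

for name, ids in results.items():
    print(name, ids)
```

On a real MySQL instance the same queries can be timed, or inspected with EXPLAIN, against production-sized data.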

MySql query to get count of days spent in each country for each purpose? (Get count of all record in second table present in first table)

I have three tables: tl_log, tl_geo_countries and tl_purpose. I am trying to get the count of the number of days spent in each country in table 'tl_log' for each purpose in table 'tl_purpose'.
I tried below mysql query
SELECT t.country_id AS countryID, t.reason_id AS reasonID,
count(t.reason_id) AS days, c.name AS country, p.purpose AS purpose
FROM `tl_log` AS t
LEFT JOIN tl_geo_countries AS c ON t.country_id=c.id
LEFT JOIN tl_purpose AS p ON t.reason_id=p.id
GROUP BY t.reason_id,t.country_id ORDER BY days DESC
But ended up with this:
I am not able to get the count for purposes in each country that are not present in table 'tl_log'. Any help is greatly appreciated. Also, please let me know if the question is difficult to understand.
Expected Output:
Below is the structure of these three tables
tl_log
tl_geo_countries
tl_purpose
If you want all possible combination of countries and purposes, even those that do not appear on the log table (these will be shown with a count of 0), you can do first a cartesian product of the two tables (a CROSS join) and then LEFT join to the log table:
SELECT
c.id AS countryID,
p.id AS reasonID,
COUNT(t.reason_id) AS days,
c.name AS country,
p.purpose AS purpose
FROM
tl_geo_countries AS c
CROSS JOIN
tl_purpose AS p
LEFT JOIN
tl_log AS t
ON t.country_id = c.id
AND t.reason_id = p.id
GROUP BY
p.id,
c.id
ORDER BY
days DESC ;
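The CROSS JOIN plus LEFT JOIN pattern above can be illustrated with a small sketch. It uses Python's sqlite3 module instead of MySQL. The table contents are invented, since the real structures were only posted as images, and c.id and p.id are added to the ORDER BY as tie-breakers so that the output order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tl_geo_countries (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE tl_purpose (id INTEGER PRIMARY KEY, purpose TEXT);
CREATE TABLE tl_log (country_id INTEGER, reason_id INTEGER);

INSERT INTO tl_geo_countries VALUES (1, 'France'), (2, 'Japan');
INSERT INTO tl_purpose VALUES (1, 'Work'), (2, 'Holiday');
-- Two work days in France, one holiday day in Japan.
INSERT INTO tl_log VALUES (1, 1), (1, 1), (2, 2);
""")

rows = conn.execute("""
    SELECT c.id AS countryID,
           p.id AS reasonID,
           COUNT(t.reason_id) AS days,   -- counts only matching log rows
           c.name AS country,
           p.purpose AS purpose
    FROM tl_geo_countries AS c
    CROSS JOIN tl_purpose AS p           -- every country/purpose pair
    LEFT JOIN tl_log AS t
           ON t.country_id = c.id
          AND t.reason_id = p.id
    GROUP BY p.id, c.id
    ORDER BY days DESC, c.id, p.id
""").fetchall()

for r in rows:
    print(r)
```

Pairs that never occur in tl_log still appear, with a count of 0, because COUNT(t.reason_id) ignores the NULLs produced by the LEFT JOIN.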
If you want the records for only the countries that are present in the log table (but still all possible reason/purposes), a slight modification is needed:
SELECT
c.id AS countryID,
p.id AS reasonID,
COUNT(t.reason_id) AS days,
c.name AS country,
p.purpose AS purpose
FROM
( SELECT DISTINCT
country_id
FROM
tl_log
) AS dc
JOIN
tl_geo_countries AS c
ON c.id = dc.country_id
CROSS JOIN
tl_purpose AS p
LEFT JOIN
tl_log AS t
ON t.country_id = c.id
AND t.reason_id = p.id
GROUP BY
p.id,
c.id
ORDER BY
days DESC ;
LEFT JOIN should be replaced by RIGHT JOIN