Combined queries return different results - mysql

A query I created by combining two different queries is not returning the same results as they do.
Query A
SELECT
c.fullname Course,
u.firstname First,
u.lastname Last,
u.id ID,
u.institution Company
FROM (mdl_scorm_scoes_track AS st)
JOIN mdl_user AS u ON st.userid=u.id
JOIN mdl_scorm AS sc ON sc.id=st.scormid
JOIN mdl_course AS c ON c.id=sc.course
JOIN mdl_user_enrolments AS uenr ON uenr.userid=u.id
JOIN mdl_enrol AS enr ON enr.id=uenr.enrolid
WHERE (
(st.value='incomplete' OR st.value='not attempted')
AND DATEDIFF(NOW(), FROM_UNIXTIME(uenr.timecreated))>60)
ORDER BY c.fullname, u.lastname,u.firstname, u.id
Query B
SELECT
c.fullname AS Course,
u.firstname AS Firstname,
u.lastname AS Lastname,
u.id AS ID,
u.institution AS Company,
IF (u.lastaccess = 0,'never',
DATE_FORMAT(FROM_UNIXTIME(u.lastaccess),'%Y-%m-%d')) AS dLastAccess
,(SELECT DATE_FORMAT(FROM_UNIXTIME(timeaccess),'%Y-%m-%d') FROM mdl_user_lastaccess WHERE userid=u.id AND courseid=c.id) AS CourseLastAccess
FROM mdl_user_enrolments AS ue
JOIN mdl_enrol AS e ON e.id = ue.enrolid
JOIN mdl_course AS c ON c.id = e.courseid
JOIN mdl_user AS u ON u.id = ue.userid
LEFT JOIN mdl_user_lastaccess AS ul ON ul.userid = u.id
WHERE ul.timeaccess IS NULL AND (DATEDIFF(NOW(), FROM_UNIXTIME(ue.timecreated))>60)
ORDER BY u.id, c.fullname
I have combined them into Query C
SELECT
c.fullname AS Course,
u.firstname AS Firstname,
u.lastname AS Lastname,
u.id AS IDNumber,
u.institution AS Institution,
IF (u.lastaccess = 0,'never',
DATE_FORMAT(FROM_UNIXTIME(u.lastaccess),'%Y-%m-%d')) AS dLastAccess
,(SELECT DATE_FORMAT(FROM_UNIXTIME(timeaccess),'%Y-%m-%d') FROM mdl_user_lastaccess WHERE userid=u.id AND courseid=c.id) AS CourseLastAccess
FROM mdl_user_enrolments AS ue
JOIN mdl_enrol AS e ON e.id = ue.enrolid
JOIN mdl_course AS c ON c.id = e.courseid
JOIN mdl_user AS u ON u.id = ue.userid
LEFT JOIN mdl_user_lastaccess AS ul ON ul.userid = u.id
WHERE (ul.timeaccess IS NULL OR ue.userid IN
(SELECT u.id
FROM (mdl_scorm_scoes_track AS st)
JOIN mdl_scorm AS sc ON sc.id=st.scormid
WHERE c.id=sc.course AND st.userid=u.id AND (st.value='incomplete' OR st.value='not attempted')
)
) AND (DATEDIFF(NOW(), FROM_UNIXTIME(ue.timecreated))>60)
ORDER BY c.fullname, u.lastname,u.firstname
I have not found where my logic is incorrect in Query C. Query C is adding an incorrect record not found by either A or B and duplicating entries in a couple of cases.
I would like some pointers on where my logic in combining the two went astray.
I fixed the commas in this post; the actual queries did have them.
My intent is to list all users who have been enrolled in a course but who, within a given timeframe, have not logged into the system, have not accessed the course, or have not completed the activity in the course.
So the logic I am looking for is:
If the user has not logged in within 60 days - display name
If logged in but has not accessed the course within 60 days - display name
If logged in and has accessed the course but has not completed the course activity - display name
Query A does list all users that started the activity but have not completed within 60 days
Query B does list all users that have not logged in or accessed the course within 60 days
Combining the two queries to satisfy all three conditions is where I am having problems. I first tried a UNION but could not get it to work.

You are not selecting from the same universe of data in both queries. Query A selects from 6 tables and Query B from 5 (it's missing the mdl_scorm table), and your Query C has the 5 tables that Query B has. So the scope for Query A's part has changed: that extra table could have been eliminating rows in the join that now appear in Query C.
I would run Query A with the mdl_scorm table removed from the query; you will probably get a different result size.
Think of it as
select COLUMNS from TABLES where FILTERS; TABLES (the universe of data) needs to be the same for both queries before you can simply merge the COLUMNS (expanding the selection) and the FILTERS (limiting the results).
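To make the "universe of data" point concrete, here is a minimal sketch. It assumes a made-up two-table schema (enrolment, track) rather than the real Moodle tables, and it runs the SQL through Python's sqlite3 module instead of MySQL purely so it is self-contained. It only shows that adding a table to the join can both drop rows and duplicate rows.

```python
import sqlite3

# Hypothetical two-table schema, not the Moodle tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE enrolment (user_id INTEGER);
    CREATE TABLE track (user_id INTEGER, value TEXT);
    INSERT INTO enrolment VALUES (1), (2), (3);
    -- user 1 has two tracking rows, user 2 has one, user 3 has none
    INSERT INTO track VALUES (1, 'incomplete'), (1, 'incomplete'), (2, 'completed');
""")

# Universe = enrolment only: every enrolled user appears exactly once.
without_join = conn.execute(
    "SELECT user_id FROM enrolment ORDER BY user_id"
).fetchall()

# Universe = enrolment joined to track: user 3 disappears, user 1 is doubled.
with_join = conn.execute("""
    SELECT e.user_id
    FROM enrolment AS e
    JOIN track AS t ON t.user_id = e.user_id
    ORDER BY e.user_id
""").fetchall()

print(without_join)  # [(1,), (2,), (3,)]
print(with_join)     # [(1,), (1,), (2,)]
```

The same filter applied to these two universes would return different rows, which is the kind of mismatch described above.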

Related

Substitute "OR EXISTS" in MySQL query so I can get better performance

This query is taking forever to finish in MySQL 8. After some research I found out that "EXISTS" can be extremely slow in some queries.
When I remove the "OR EXISTS" subquery, it runs in less than a second.
So I need to replace the "OR EXISTS" in this query so that I still get all the users I need:
SELECT u.name,
u.email,
u.cpf,
u.register,
r.name AS role_name,
s.name AS sector_name,
b.name AS branch_name,
u.status
FROM users u
INNER JOIN roles r ON r.id = u.role_id
INNER JOIN sectors s ON s.id = u.sector_id
INNER JOIN branches b ON b.id = u.branch_id
WHERE u.status = 2 OR EXISTS (
SELECT *
FROM user_recovery ur
WHERE ur.user_id = u.id
AND ur.status_recovery = 1
)
Is there a way to do it without the "OR EXISTS"?
OR can force a full scan.
You can't get rid of the EXISTS clause, because it increases the number of returned rows.
Add an INDEX on users (status), on user_recovery (user_id, status_recovery), and on the columns used in the ON clauses, then try:
SELECT u.name,
u.email,
u.cpf,
u.register,
r.name AS role_name,
s.name AS sector_name,
b.name AS branch_name,
u.status
FROM users u
INNER JOIN roles r ON r.id = u.role_id
INNER JOIN sectors s ON s.id = u.sector_id
INNER JOIN branches b ON b.id = u.branch_id
WHERE u.status = 2
UNION
SELECT u.name,
u.email,
u.cpf,
u.register,
r.name AS role_name,
s.name AS sector_name,
b.name AS branch_name,
u.status
FROM users u
INNER JOIN roles r ON r.id = u.role_id
INNER JOIN sectors s ON s.id = u.sector_id
INNER JOIN branches b ON b.id = u.branch_id
WHERE EXISTS (
SELECT 1
FROM user_recovery ur
WHERE ur.user_id = u.id
AND ur.status_recovery = 1
)
"I'll see your UNION; and raise you a derived table."
SELECT u.name,
u.email,
u.cpf,
u.register,
r.name AS role_name,
s.name AS sector_name,
b.name AS branch_name,
u.status
FROM ( SELECT id
FROM users
WHERE status = 2
UNION DISTINCT -- or UNION ALL; see below
SELECT user_id
FROM user_recovery
WHERE status_recovery = 1 -- see new index
) AS u1
JOIN users AS u USING(id) -- self-join to pick up other columns
JOIN roles r ON r.id = u.role_id
JOIN sectors s ON s.id = u.sector_id
JOIN branches b ON b.id = u.branch_id;
Indexes:
user_recovery: INDEX(status_recovery, user_id) -- in this order
users: INDEX(status, id) -- in this order
(I assume `id` is the PRIMARY KEY in each table)
The general rule here: when you have a bunch of JOINs but a single table controls which rows are returned, and picking those rows is messy or slow (e.g. UNION in this case, GROUP BY or LIMIT in other cases),
first optimize finding the ids (users.id, aka user_id).
Then JOIN back to the original table (if needed), plus the other tables.
In doing all that, it became apparent that a new index for user_recovery might be beneficial.
(If UNION ALL won't produce any dups, switch to it for a little more speed.)
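As a sanity check on the rewrite, here is a small sketch. It assumes a cut-down, hypothetical version of the users and user_recovery tables and uses Python's sqlite3 module rather than MySQL, so it says nothing about speed. It compares the rows returned by the OR EXISTS form with the derived-table form, and shows what happens when UNION ALL is used on data that does contain duplicates.

```python
import sqlite3

# Hypothetical miniature of the users / user_recovery tables, run in SQLite.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, status INTEGER);
    CREATE TABLE user_recovery (user_id INTEGER, status_recovery INTEGER);
    INSERT INTO users VALUES (1, 'ann', 2), (2, 'bob', 1), (3, 'cy', 1), (4, 'di', 2);
    -- user 2 has two matching recovery rows; user 4 satisfies both conditions
    INSERT INTO user_recovery VALUES (2, 1), (2, 1), (3, 0), (4, 1);
""")

OR_EXISTS = """
    SELECT u.id, u.name
    FROM users AS u
    WHERE u.status = 2
       OR EXISTS (SELECT 1 FROM user_recovery AS ur
                  WHERE ur.user_id = u.id AND ur.status_recovery = 1)
    ORDER BY u.id
"""

# Find the ids first, then join back to users for the other columns.
DERIVED = """
    SELECT u.id, u.name
    FROM (SELECT id FROM users WHERE status = 2
          {union}
          SELECT user_id FROM user_recovery WHERE status_recovery = 1) AS u1
    JOIN users AS u ON u.id = u1.id
    ORDER BY u.id
"""

or_exists = conn.execute(OR_EXISTS).fetchall()
with_union = conn.execute(DERIVED.format(union="UNION")).fetchall()
with_union_all = conn.execute(DERIVED.format(union="UNION ALL")).fetchall()

print(or_exists)            # [(1, 'ann'), (2, 'bob'), (4, 'di')]
print(with_union)           # same three rows
print(len(with_union_all))  # 5: users 2 and 4 each appear twice
```

On this data UNION ALL is not safe, because a user can match both branches or match the recovery branch more than once; it is only a valid shortcut when neither can happen.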

Using 2 LEFT JOINs doesn't work, but each one works separately

I have three tables: company, user and share. I want to count one company's users and shares; the two are not related to each other.
There may be a row that has a share value but no user value, so I used LEFT JOIN. I can get each result separately, but the two joins don't work together.
Here is my query:
SELECT c.name, count(u.company_id), count(s.company_id)
FROM company c
LEFT JOIN user u
ON c.id=u.company_id and u.company_id=337
WHERE u.company_id is NOT NULL
LEFT JOIN share s
ON c.id=s.id AND s.company_id=337
WHERE s.company_id is NOT NULL
You need to do at least one of the counts in a subquery. Otherwise, both counts will be the same, since you're just counting the rows in the resulting cross product.
SELECT c.name, user_count, share_count
FROM company AS c
JOIN (SELECT company_id, COUNT(*) AS user_count
      FROM user
      GROUP BY company_id) AS u
ON u.company_id = c.id
JOIN (SELECT company_id, COUNT(*) AS share_count
      FROM share
      GROUP BY company_id) AS s
ON s.company_id = c.id
WHERE c.id = 337
Another option is to count the distinct primary keys of the tables you're joining with:
SELECT c.name, COUNT(DISTINCT u.id) AS user_count, COUNT(DISTINCT s.id) AS share_count
FROM company AS c
JOIN user AS u ON u.company_id = c.id
JOIN share AS s ON s.company_id = c.id
WHERE c.id = 337
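To see the inflated counts in action, here is a minimal sketch. It assumes hypothetical tables named company, users and shares (each with an id primary key) and runs in Python's sqlite3 module rather than MySQL. Company 337 has 2 users and 3 shares, so joining both tables at once yields 2 x 3 = 6 rows.

```python
import sqlite3

# Hypothetical schema: one company with 2 users and 3 shares.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE users  (id INTEGER PRIMARY KEY, company_id INTEGER);
    CREATE TABLE shares (id INTEGER PRIMARY KEY, company_id INTEGER);
    INSERT INTO company VALUES (337, 'acme');
    INSERT INTO users  VALUES (1, 337), (2, 337);
    INSERT INTO shares VALUES (10, 337), (11, 337), (12, 337);
""")

JOINS = """
    FROM company AS c
    LEFT JOIN users  AS u ON u.company_id = c.id
    LEFT JOIN shares AS s ON s.company_id = c.id
    WHERE c.id = 337
    GROUP BY c.id
"""

# Plain COUNT counts the joined rows, so both numbers come out as 6.
naive = conn.execute(
    "SELECT c.name, COUNT(u.company_id), COUNT(s.company_id)" + JOINS
).fetchone()

# COUNT(DISTINCT primary key) undoes the multiplication.
distinct = conn.execute(
    "SELECT c.name, COUNT(DISTINCT u.id), COUNT(DISTINCT s.id)" + JOINS
).fetchone()

# Aggregating each table in its own subquery avoids the multiplication entirely.
subquery = conn.execute("""
    SELECT c.name, u.n, s.n
    FROM company AS c
    JOIN (SELECT company_id, COUNT(*) AS n FROM users  GROUP BY company_id) AS u
      ON u.company_id = c.id
    JOIN (SELECT company_id, COUNT(*) AS n FROM shares GROUP BY company_id) AS s
      ON s.company_id = c.id
    WHERE c.id = 337
""").fetchone()

print(naive)     # ('acme', 6, 6)
print(distinct)  # ('acme', 2, 3)
print(subquery)  # ('acme', 2, 3)
```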
Your code looks okay, except for the extra WHERE clause. However, you probably want COUNT(DISTINCT), because the two counts will return the same value:
SELECT c.name, count(distinct u.id), count(distinct s.id)
FROM company c LEFT JOIN
user u
ON c.id = u.company_id and u.company_id=337 LEFT JOIN
share s
ON c.id = s.company_id AND s.company_id=337
WHERE s.company_id is NOT NULL AND u.company_id IS NOT NULL;

MySQL check if value is in result set before summing

I have a table of ratings for comments. When I fetch comments, I also fetch the ratings, and I also want to be able to display which comments the logged-in user has already voted on. This is what I am doing now:
SELECT
c.id,
c.text,
c.datetime,
c.author,
u.email AS author_name,
SUM(cr.vote) AS rating,
cr2.vote AS voted
FROM comments c
LEFT JOIN users u ON u.id = c.author
LEFT JOIN comments_ratings cr ON c.id = cr.comment
LEFT JOIN comments_ratings cr2 ON c.id = cr2.comment AND cr2.user = :logged_user_id
GROUP BY c.id ORDER BY c.id DESC
But I don't like how I'm performing a second join on the same table. I know it is perfectly valid but if I could get the information I want from the first join, which is there anyway, why perform a second one?
Is it possible to figure out if a row with column user equal to :logged_user_id exists on table comments_ratings cr before executing the aggregate function(s)?
P.S.: If someone could come up with a better title that people can find in the future, I'd also appreciate that.
You can do what you want with conditional aggregation:
SELECT c.id, c.text, c.datetime, c.author, u.email AS author_name,
SUM(cr.vote) AS rating,
MAX(cr.user = :logged_user_id) as voted
FROM comments c LEFT JOIN
users u
ON u.id = c.author LEFT JOIN
comments_ratings cr
ON c.id = cr.comment
GROUP BY c.id
ORDER BY c.id DESC;
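Here is a small sketch of the conditional aggregation idea. It assumes a simplified, hypothetical schema (columns comment_id and user_id instead of comment and user) and runs in Python's sqlite3 module rather than MySQL. MAX over the comparison gives a 1/0 "has voted" flag (NULL when a comment has no ratings at all); the extra my_vote column is an addition of mine that uses a CASE expression to recover the vote value itself, which is what the second join returned.

```python
import sqlite3

# Hypothetical, simplified versions of the comments / comments_ratings tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT);
    CREATE TABLE comments_ratings (comment_id INTEGER, user_id INTEGER, vote INTEGER);
    INSERT INTO comments VALUES (1, 'first'), (2, 'second'), (3, 'third');
    INSERT INTO comments_ratings VALUES (1, 7, 1), (1, 8, -1), (1, 9, 1), (2, 8, 1);
""")

QUERY = """
    SELECT c.id,
           SUM(cr.vote)                                    AS rating,
           MAX(cr.user_id = :uid)                          AS voted,
           MAX(CASE WHEN cr.user_id = :uid THEN cr.vote END) AS my_vote
    FROM comments AS c
    LEFT JOIN comments_ratings AS cr ON cr.comment_id = c.id
    GROUP BY c.id
    ORDER BY c.id
"""

as_user_8 = conn.execute(QUERY, {"uid": 8}).fetchall()
as_user_7 = conn.execute(QUERY, {"uid": 7}).fetchall()

print(as_user_8)  # [(1, 1, 1, -1), (2, 1, 1, 1), (3, None, None, None)]
print(as_user_7)  # [(1, 1, 1, 1), (2, 1, 0, None), (3, None, None, None)]
```

The single join is enough because every rating row for a comment is already in the group; the aggregate just has to pick out the one that belongs to the logged-in user.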

Why isn't this query working?

Here is the query in question:
SELECT e.id, e.name, u.x_account_username, ao.id
FROM x_evidence e
LEFT JOIN x_ambition_owner ao
ON e.ambition_id = ao.id
LEFT JOIN x_user u
ON ao.profile_id = u.id
WHERE e.updated BETWEEN '2014-03-19 10:16:00' AND '2014-03-19 11:16:00'
Can anyone see what I'm doing wrong?
Evidence table:
id ambition_id name updated
1jk 2ef abc 2014-03-19 10:33:31
Ambition owner table:
id ambition_id profile_id
1op 2ef 1abc
User table:
id x_account_username
1abc rex hamilton
I want my query to return:
the evidence id, name, user's name, the associated ambition id, between the times stated (but these could vary)
Thanks
One way it is not working is that the where clause is turning the first left join into an inner join. So, you would not be getting all rows from the query that you expect. In fact, you might not be getting any of them.
You can fix this by moving the condition into the on clause:
SELECT ao.id AS a_id, e.id AS e_id, e.name AS evidence_name,
u.x_account_username AS username, 'evidence_edit' AS type
FROM x_ambition_owner ao
LEFT JOIN x_evidence e
ON ao.id = e.ambition_id
AND e.updated BETWEEN '2014-03-19 10:16:00' AND '2014-03-19 11:16:00'
LEFT JOIN x_user u
ON ao.profile_id = u.id;
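The difference between filtering in WHERE and filtering in ON can be sketched like this. It assumes two hypothetical tables (owner, evidence) standing in for the real ones and runs in Python's sqlite3 module rather than MySQL. The filter is on the optional (right-hand) side of the LEFT JOIN: in WHERE it discards the NULL-extended rows, in ON it keeps them.

```python
import sqlite3

# Hypothetical stand-ins for x_ambition_owner and x_evidence.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE owner (id INTEGER PRIMARY KEY);
    CREATE TABLE evidence (id INTEGER PRIMARY KEY, owner_id INTEGER, updated TEXT);
    INSERT INTO owner VALUES (1), (2);
    INSERT INTO evidence VALUES
        (10, 1, '2014-03-19 10:33:31'),   -- inside the time window
        (11, 2, '2014-03-20 09:00:00');   -- outside the time window
""")

# Filter in WHERE: owner 2 is dropped, so the LEFT JOIN behaves like an inner join.
filter_in_where = conn.execute("""
    SELECT o.id, e.id
    FROM owner AS o
    LEFT JOIN evidence AS e ON e.owner_id = o.id
    WHERE e.updated BETWEEN '2014-03-19 10:16:00' AND '2014-03-19 11:16:00'
    ORDER BY o.id
""").fetchall()

# Filter in ON: owner 2 is kept, with NULL for the evidence columns.
filter_in_on = conn.execute("""
    SELECT o.id, e.id
    FROM owner AS o
    LEFT JOIN evidence AS e
           ON e.owner_id = o.id
          AND e.updated BETWEEN '2014-03-19 10:16:00' AND '2014-03-19 11:16:00'
    ORDER BY o.id
""").fetchall()

print(filter_in_where)  # [(1, 10)]
print(filter_in_on)     # [(1, 10), (2, None)]
```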
EDIT (by OP):
This is the query I will be using:
SELECT e.id, e.name, u.x_account_username AS username, a.ambition_id
FROM x_evidence e
LEFT JOIN x_ambition_owner a
ON e.ambition_id = a.ambition_id
LEFT JOIN x_user u
ON a.profile_id = u.id
WHERE e.updated BETWEEN '2014-03-19 10:00:00' AND '2014-03-19 11:16:00'

Flattened string of fields from an associated table in the result sets as a comma separated string

I have the query below (I shortened all the fields in the SELECT to just *'s).
SELECT u.*, e1.*, e2.*
FROM employee_db e1
JOIN employee_db e2 ON e1.manager_id = e2.id
JOIN users u ON u.id = e1.id
There are two more tables involved:
teams (need a flattened version of 'team_name' where user is assigned
to the team)
team_user_associations (team_id, user_id)
(users have many teams through team_user_associations).
What I need is 1 field added to the results that's a comma-separated string of all the 'team_name'(s) a user belongs to. I'm having trouble figuring out what the approach would be here... Would it be something like the results of a subquery, where the 'team_name' field in the subquery's record set is flattened down to a comma-separated string that becomes a field in the main query?
Thanks for any help!
You could try a solution like the one below:
SELECT u.*, e1.*, e2.*,
GROUP_CONCAT(t.team_name ORDER BY t.team_name) AS team_names
FROM employee_db e1
JOIN employee_db e2 ON e1.manager_id = e2.id
JOIN users u ON u.id = e1.id
JOIN team_user_associations ta ON ta.user_id = u.id
JOIN teams t ON ta.team_id = t.id
GROUP BY u.id
Try it and tell me if it works, it's difficult without knowing the structure and purpose of the other tables.
Anyway, the trick is to expand the rows by joining with team/user associations, and reducing them again with the GROUP BY. Having the associations now aggregated, you can use GROUP_CONCAT to retrieve the team name column.
SELECT u.*, e.*, e2.*, GROUP_CONCAT(t.team_name)
FROM employee_db e
JOIN employee_db e2
ON e2.id = e.manager_id
JOIN users u
ON u.id = e.id
JOIN team_user_associations te
ON te.user_id = u.id
JOIN teams t
ON t.id = te.team_id
GROUP BY
u.id
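The expand-then-collapse trick can be tried on a toy dataset. This sketch assumes a minimal, hypothetical version of users, teams and team_user_associations (no employee_db) and runs in Python's sqlite3 module, whose GROUP_CONCAT also joins with commas by default. It uses LEFT JOINs, which is my own variation, so that a user with no teams still shows up; with plain JOINs as in the answers that user would be dropped.

```python
import sqlite3

# Hypothetical miniature schema; ann is on two teams, bob on one, cy on none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE teams (id INTEGER PRIMARY KEY, team_name TEXT);
    CREATE TABLE team_user_associations (team_id INTEGER, user_id INTEGER);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO teams VALUES (1, 'red'), (2, 'blue');
    INSERT INTO team_user_associations VALUES (1, 1), (2, 1), (1, 2);
""")

rows = conn.execute("""
    SELECT u.name, GROUP_CONCAT(t.team_name) AS team_names
    FROM users AS u
    LEFT JOIN team_user_associations AS ta ON ta.user_id = u.id
    LEFT JOIN teams AS t ON t.id = ta.team_id
    GROUP BY u.id
    ORDER BY u.id
""").fetchall()

# The order inside each concatenated string is not guaranteed here,
# so split and sort before comparing or printing.
teams_by_user = {
    name: sorted(team_names.split(",")) if team_names else []
    for name, team_names in rows
}

print(teams_by_user)  # {'ann': ['blue', 'red'], 'bob': ['red'], 'cy': []}
```

In MySQL the order inside the string can be fixed with GROUP_CONCAT(t.team_name ORDER BY t.team_name), as in the first answer.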