SQL Left Inner Join (mysql)

I am using two tables (there are some others, but they are not needed for this question):
assessment_criteria
+-------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+--------------+------+-----+---------+----------------+
| id | mediumint(9) | NO | PRI | NULL | auto_increment |
| scheme_of_work_id | mediumint(9) | NO | | NULL | |
| level | char(255) | YES | | NULL | |
| criteria | char(255) | NO | | NULL | |
+-------------------+--------------+------+-----+---------+----------------+
criteria_completed
+------------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------------------+--------------+------+-----+---------+----------------+
| id | mediumint(9) | NO | PRI | NULL | auto_increment |
| student_ID | mediumint(9) | NO | | NULL | |
| assessment_criteria_id | mediumint(9) | NO | | NULL | |
| date_marked | date | NO | | NULL | |
| notes | varchar(255) | YES | | NULL | |
| attainment | varchar(15) | YES | | NULL | |
| effort | varchar(15) | YES | | NULL | |
| marked_by | varchar(20) | NO | | NULL | |
+------------------------+--------------+------+-----+---------+----------------+
I was using a query like this to display a list of assessment criteria that a student HAS NOT completed:
SELECT DISTINCT assessment_criteria.id, assessment_criteria.level, assessment_criteria.criteria FROM assessment_criteria, criteria_completed
WHERE (assessment_criteria.scheme_of_work_id = '17')
AND (assessment_criteria.id NOT IN (SELECT criteria_completed.assessment_criteria_id FROM criteria_completed WHERE (student_ID = '403')))
ORDER BY level;
This query has become incredibly slow to run, so I have been trying to make it faster using a LEFT JOIN:
SELECT DISTINCT a.id, a.level, a.criteria
FROM assessment_criteria a
LEFT JOIN criteria_completed b
ON a.id = b.assessment_criteria_id
WHERE b.assessment_criteria_id IS NULL
But I have had no success when I try to add clauses for project and student, i.e.:
SELECT DISTINCT a.id, a.level, a.criteria
FROM assessment_criteria a
LEFT JOIN criteria_completed b
ON a.id = b.assessment_criteria_id
WHERE b.assessment_criteria_id IS NULL
AND (b.student_ID = '403')
AND (a.scheme_of_work_id = '17');
MySQL reports "Empty set". I suspect I am referencing these foreign keys incorrectly.

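(Just to confirm: you are using b.assessment_criteria_id IS NULL to detect failed joins.)
Filtering on table b in the WHERE clause discards every row where the join failed, because b.student_ID is NULL on those rows and a comparison of NULL with '403' is never true. Combined with b.assessment_criteria_id IS NULL, no row can satisfy both conditions, which I believe is the cause of the empty set.
You can try moving the b filters into the JOIN condition: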
SELECT DISTINCT a.id, a.level, a.criteria
FROM assessment_criteria a
LEFT JOIN criteria_completed b
ON a.id = b.assessment_criteria_id
AND (b.student_ID = 403)
WHERE b.assessment_criteria_id IS NULL
AND (a.scheme_of_work_id = 17);
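Although personally, I dislike filtering like this in a JOIN. The alternative below keeps everything in the WHERE clause, but note that it is not equivalent: it also returns criteria that student 403 has completed, and it drops criteria that only other students have completed, so check its output before relying on it: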
SELECT DISTINCT a.id, a.level, a.criteria
FROM assessment_criteria a
LEFT JOIN criteria_completed b
ON a.id = b.assessment_criteria_id
WHERE (a.scheme_of_work_id = 17)
AND (b.assessment_criteria_id IS NULL OR b.student_ID = 403);
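The difference between the three placements of the student filter is easy to check on toy data. The sketch below is only an illustration: it uses Python's sqlite3 module as a stand-in for MySQL, the sample rows are made up, and the column lists are trimmed to what the queries need.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assessment_criteria (
    id INTEGER PRIMARY KEY, scheme_of_work_id INTEGER, level TEXT, criteria TEXT);
CREATE TABLE criteria_completed (
    id INTEGER PRIMARY KEY, student_ID INTEGER, assessment_criteria_id INTEGER);
-- Hypothetical sample rows: three criteria in scheme 17.
INSERT INTO assessment_criteria VALUES
    (1, 17, 'A', 'completed by 403 and 500'),
    (2, 17, 'B', 'completed by 500 only'),
    (3, 17, 'C', 'completed by nobody');
INSERT INTO criteria_completed (student_ID, assessment_criteria_id) VALUES
    (403, 1), (500, 1), (500, 2);
""")

BASE = """
    SELECT DISTINCT a.id
    FROM assessment_criteria a
    LEFT JOIN criteria_completed b
      ON a.id = b.assessment_criteria_id {on_extra}
    WHERE a.scheme_of_work_id = 17 AND {where_extra}
    ORDER BY a.id"""

def ids(on_extra, where_extra):
    """Run BASE with the given ON and WHERE fragments and return the ids."""
    sql = BASE.format(on_extra=on_extra, where_extra=where_extra)
    return [row[0] for row in conn.execute(sql)]

# Student filter inside ON: unmatched rows survive, IS NULL finds the gaps.
on_filter = ids("AND b.student_ID = 403", "b.assessment_criteria_id IS NULL")

# Student filter in WHERE with AND: no row can be NULL and 403 at once.
where_and = ids("", "b.assessment_criteria_id IS NULL AND b.student_ID = 403")

# Student filter in WHERE with OR: a different question is answered.
where_or = ids("", "(b.assessment_criteria_id IS NULL OR b.student_ID = 403)")

print(on_filter)  # [2, 3]  criteria student 403 has not completed
print(where_and)  # []      the "empty set" from the question
print(where_or)   # [1, 3]  includes criterion 1, which 403 has completed
```

Only the ON-clause version returns the criteria that student 403 still has to complete.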

Related

SQL query on a JOINED table with multiple conditions

I have two SQL tables: the walls table and the tags table. They are linked by a has_and_belongs_to_many relationship. The tags table also has unique names.
Here are the tables in sql
mysql> describe tags;
+------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+--------------+------+-----+---------+----------------+
| id | bigint(20) | NO | PRI | NULL | auto_increment |
| name | varchar(255) | NO | UNI | NULL | |
| count | int(11) | NO | | NULL | |
| created_at | datetime | NO | | NULL | |
| updated_at | datetime | NO | | NULL | |
+------------+--------------+------+-----+---------+----------------+
mysql> describe tags_walls;
+---------+------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------+------------+------+-----+---------+-------+
| tag_id | bigint(20) | NO | MUL | NULL | |
| wall_id | bigint(20) | NO | | NULL | |
+---------+------------+------+-----+---------+-------+
mysql> describe walls;
+------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+--------------+------+-----+---------+----------------+
| id | bigint(20) | NO | PRI | NULL | auto_increment |
| name | varchar(255) | NO | | NULL | |
| created_at | datetime | NO | | NULL | |
| updated_at | datetime | NO | | NULL | |
+------------+--------------+------+-----+---------+----------------+
I am on Rails 5 and I want to query for walls that have multiple tags.
I'm trying to do
result = Wall.all.includes(:tags).where(tags: {name: 'TAG1'})
result = result.where(tags: {name: 'TAG2'})
and the query that Rails constructs is
SELECT DISTINCT `walls`.`id`
FROM `walls`
LEFT OUTER JOIN `tags_walls` ON `tags_walls`.`wall_id` = `walls`.`id`
LEFT OUTER JOIN `tags` ON `tags`.`id` = `tags_walls`.`tag_id`
WHERE `tags`.`name` = 'TAG1' AND `tags`.`name` = 'TAG2'
It should give me multiple walls as a result, but what is returned is #<ActiveRecord::Relation []>
I want to build a custom sql query and just do a
Wall.includes(:tags).where query
How can I do a WHERE query on a joined table with multiple conditions linked by an AND?
I would write this as:
SELECT tw.wall_id
FROM tags_walls tw JOIN
tags t
ON t.id = tw.tag_id
WHERE t.name IN ('TAG1', 'TAG2')
GROUP BY tw.wall_id
HAVING COUNT(*) = 2;
This assumes that tags are not duplicated on a wall. If that is possible, then use COUNT(DISTINCT t.name) = 2.
Notes:
walls is not needed, so that JOIN is removed.
You are looking for matches, so INNER JOIN is more appropriate than LEFT JOIN.
Table aliases make the query easier to write and to read.
Unnecessary backticks make the query harder to write and to read.
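To see why the AND version returns nothing while the GROUP BY / HAVING version works, here is a small sketch on made-up data. It runs against SQLite through Python's sqlite3 module rather than MySQL, and the helper name walls_with_all_tags is hypothetical, not part of Rails or of the original answer.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE walls (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE tags_walls (tag_id INTEGER NOT NULL, wall_id INTEGER NOT NULL);
INSERT INTO walls VALUES (1, 'north'), (2, 'south'), (3, 'east');
INSERT INTO tags VALUES (1, 'TAG1'), (2, 'TAG2'), (3, 'TAG3');
-- wall 1: TAG1 + TAG2, wall 2: TAG1 only, wall 3: TAG1 + TAG2 + TAG3
INSERT INTO tags_walls VALUES (1, 1), (2, 1), (1, 2), (1, 3), (2, 3), (3, 3);
""")

def walls_with_all_tags(conn, names):
    """Return ids of walls that carry every tag name in `names`."""
    wanted = sorted(set(names))  # duplicates would break the count below
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT tw.wall_id
        FROM tags_walls tw
        JOIN tags t ON t.id = tw.tag_id
        WHERE t.name IN ({placeholders})
        GROUP BY tw.wall_id
        HAVING COUNT(DISTINCT t.name) = ?
        ORDER BY tw.wall_id"""
    return [row[0] for row in conn.execute(sql, [*wanted, len(wanted)])]

# A single joined row has one tag name, so it can never equal both values.
and_on_one_row = conn.execute("""
    SELECT DISTINCT tw.wall_id
    FROM tags_walls tw
    JOIN tags t ON t.id = tw.tag_id
    WHERE t.name = 'TAG1' AND t.name = 'TAG2'""").fetchall()

both = walls_with_all_tags(conn, ["TAG1", "TAG2"])
print(and_on_one_row)  # []
print(both)            # [1, 3]
```

The count in HAVING is passed as a parameter so that it always matches the number of distinct names in the IN list.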
SELECT w.id
FROM walls w
JOIN tags_walls tw
ON tw.wall_id = w.id
JOIN tags t
ON t.id = tw.tag_id
AND t.name IN('TAG1','TAG2')
GROUP BY w.id
HAVING COUNT(DISTINCT t.name) = 2 -- where '2' equals the number of arguments in IN()
WHERE tags.name = 'TAG1' OR tags.name = 'TAG2'
or
WHERE tags.name IN ('TAG1','TAG2')

MYSQL Counting matching results across multiple tables

I have the following tables
Business
+-------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+----------------+
| b_id | bigint(20) | NO | PRI | NULL | auto_increment |
| b_name | varchar(255) | NO | | NULL | |
+-------------+--------------+------+-----+---------+----------------+
Locations
+-------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+----------------+
| l_id | bigint(20) | NO | PRI | NULL | auto_increment |
| l_name | varchar(255) | NO | | NULL | |
| b_id | bigint(20) | NO | | NULL | |
+-------------+--------------+------+-----+---------+----------------+
Jobs
+-------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+----------------+
| j_id | bigint(20) | NO | PRI | NULL | auto_increment |
| j_name | varchar(255) | NO | | NULL | |
| b_id | bigint(20) | NO | | NULL | |
| l_id | bigint(20) | NO | | NULL | |
+-------------+--------------+------+-----+---------+----------------+
People
+-------------+---------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------------+------+-----+---------+----------------+
| u_id | bigint(20) | NO | PRI | NULL | auto_increment |
| salutation | varchar(10) | NO | | NULL | |
| first_name | varchar(25) | NO | | NULL | |
| last_name | varchar(25) | NO | | NULL | |
+-------------+---------------+------+-----+---------+----------------+
People's Jobs
+-------------+------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+------------+------+-----+---------+----------------+
| pj_id | bigint(20) | NO | PRI | NULL | auto_increment |
| u_id | bigint(20) | NO | | NULL | |
| j_id | bigint(20) | NO | | NULL | |
| l_id | bigint(20) | NO | MUL | NULL | |
+-------------+------------+------+-----+---------+----------------+
I need to produce a table that shows
+----------+-------------------------+------------+------------+------------+
| b_id | b_name | Locations | Jobs | People |
+----------+-------------------------+------------+------------+------------+
| 21 | Widgets Inc | 0 | x | 0 |
| 24 | Prince Privates | 0 | 0 | 0 |
| 23 | Halon plc | x | 0 | 0 |
| 18 | Stinky Hotels | x | x | x |
| 20 | Pylon Catering Corps | x | x | x |
| 22 | Skytrain Biscuits | 0 | 0 | 0 |
+----------+-------------------------+------------+------------+------------+
I can achieve a correct count of matching locations for each business with:
SELECT b.b_id,
b.b_name,
count(l.l_id) AS locations
FROM business AS b
LEFT JOIN locations AS l ON b.b_id=l.b_id
GROUP BY b.b_id
ORDER BY b_name
If I extend it to include a count of the jobs at each business and then the count of people at each business it all goes pear shaped.
I know that the following is inherently wrong with regards to getting the count of people (as people can hold more than 1 job). I don't know if I need to use sub selects or COALESCE?
SELECT b.b_id,
b.b_name,
count(l.l_id) AS locations,
count(j.j_id) AS jobs,
count(p.u_id) AS people
FROM business AS b
LEFT JOIN locations AS l ON b.b_id=l.b_id
LEFT JOIN job AS j ON b.b_id=j.b_id
LEFT JOIN people_jobs AS p ON l.l_id=p.l_id
GROUP BY b.b_id
ORDER BY b_name
I think you can do a quick-and-dirty fix of your query by using count(distinct):
SELECT b.b_id, b.b_name,
count(distinct l.l_id) AS locations,
count(distinct j.j_id) AS jobs,
count(distinct p.u_id) AS people
FROM business b LEFT JOIN
locations l
ON b.b_id = l.b_id LEFT JOIN
job j
ON b.b_id = j.b_id LEFT JOIN
people_jobs p
ON l.l_id = p.l_id
GROUP BY b.b_id
ORDER BY b_name ;
It is also possible that the problem is simply that the join to people_jobs needs more conditions:
people_jobs p
ON l.l_id = p.l_id and j.j_id = p.j_id
And maybe a condition on u.
Your problem is that you are trying to do aggregation across multiple dimensions and getting a cartesian product for each business. An alternative that is sometimes necessary is to do the counts in subqueries.
This query should do what you need:
SELECT
b.b_id,
b.b_name,
(SELECT COALESCE(COUNT(l_id ),0) FROM locations WHERE b_id=b.b_id) AS locations,
(SELECT COALESCE(COUNT(j_id ),0) FROM jobs WHERE b_id=b.b_id) AS jobs,
(SELECT COALESCE(COUNT(DISTINCT u_id),0)
FROM jobs j
JOIN people_jobs pj ON pj.j_id=j.j_id
WHERE j.b_id=b.b_id
) AS people
FROM business as b
ORDER BY b_name
You don't need the GROUP BY if you use subSELECTs, as the outer query will return 1 row per b_id, no more.
If instead you do JOIN the 4 tables at the main query level, like you were doing, you have two difficulties:
number of rows increases (avoidable with GROUP BY)
a simple COUNT does not work properly (avoidable with COUNT(DISTINCT ...))
(as shown in Gordon's answer)
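The row multiplication described above can be reproduced on a few made-up rows. This is only a sketch: it uses SQLite through Python's sqlite3 module instead of MySQL, assumes the table is called jobs (the question's query says job), and the sample data is invented.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE business (b_id INTEGER PRIMARY KEY, b_name TEXT);
CREATE TABLE locations (l_id INTEGER PRIMARY KEY, l_name TEXT, b_id INTEGER);
CREATE TABLE jobs (j_id INTEGER PRIMARY KEY, j_name TEXT, b_id INTEGER, l_id INTEGER);
CREATE TABLE people_jobs (pj_id INTEGER PRIMARY KEY, u_id INTEGER, j_id INTEGER, l_id INTEGER);
-- Business 1 has 2 locations and 3 jobs; business 2 has nothing.
INSERT INTO business VALUES (1, 'Halon plc'), (2, 'Skytrain Biscuits');
INSERT INTO locations VALUES (10, 'HQ', 1), (11, 'Depot', 1);
INSERT INTO jobs VALUES (100, 'Fitter', 1, 10), (101, 'Driver', 1, 11), (102, 'Clerk', 1, 10);
-- Person 7 holds two jobs, person 8 holds one.
INSERT INTO people_jobs (u_id, j_id, l_id) VALUES (7, 100, 10), (7, 101, 11), (8, 102, 10);
""")

# Same join shape as in the question: locations x jobs x people_jobs.
JOINS = """
    FROM business b
    LEFT JOIN locations l ON b.b_id = l.b_id
    LEFT JOIN jobs j ON b.b_id = j.b_id
    LEFT JOIN people_jobs p ON l.l_id = p.l_id
    GROUP BY b.b_id
    ORDER BY b.b_id"""

plain = conn.execute(
    "SELECT b.b_id, COUNT(l.l_id), COUNT(j.j_id), COUNT(p.u_id)" + JOINS
).fetchall()

distinct = conn.execute(
    "SELECT b.b_id, COUNT(DISTINCT l.l_id), COUNT(DISTINCT j.j_id),"
    " COUNT(DISTINCT p.u_id)" + JOINS
).fetchall()

# Correlated subqueries: each count is computed on its own, no fan-out.
subqueries = conn.execute("""
    SELECT b.b_id,
           (SELECT COUNT(*) FROM locations l WHERE l.b_id = b.b_id),
           (SELECT COUNT(*) FROM jobs j WHERE j.b_id = b.b_id),
           (SELECT COUNT(DISTINCT pj.u_id)
              FROM jobs j
              JOIN people_jobs pj ON pj.j_id = j.j_id
             WHERE j.b_id = b.b_id)
    FROM business b
    ORDER BY b.b_id""").fetchall()

print(plain)       # [(1, 9, 9, 9), (2, 0, 0, 0)]
print(distinct)    # [(1, 2, 3, 2), (2, 0, 0, 0)]
print(subqueries)  # [(1, 2, 3, 2), (2, 0, 0, 0)]
```

The plain counts all report 9 because every count sees the same 9 joined rows for business 1.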
You can try this query:
SELECT b.b_id,b.b_name,count(l.l_id) AS locations,count(j.j_id) AS jobs,count(p.u_id) AS people
FROM business as b LEFT JOIN locations as l ON b.b_id=l.b_id
LEFT JOIN job as j ON b.b_id=j.b_id
LEFT JOIN people_jobs as p ON l.l_id=p.l_id
GROUP BY b.b_id, b.b_name
ORDER BY b_name
I hope this will work for you.

SELECT statement with a CASE statement

I have the following table for Customer:
+------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| first | varchar(45) | YES | | NULL | |
| last | varchar(45) | YES | | NULL | |
| password | varchar(45) | NO | | NULL | |
| contact_id | int(11) | NO | PRI | NULL | |
| address_id | int(11) | NO | PRI | NULL | |
+------------+-------------+------+-----+---------+----------------+
And the following structure for Appointment:
+-------------+------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| time | datetime | NO | | NULL | |
| cancelled | tinyint(1) | YES | | 0 | |
| confirmed | tinyint(1) | YES | | 0 | |
| customer_id | int(11) | NO | PRI | NULL | |
+-------------+------------+------+-----+---------+----------------+
I want to use a single query to get the customer information. If a customer has appointment information, the query should return it as well; otherwise it should return only the customer.
I am trying to use the following:
CASE
WHEN (SELECT count(a.id) FROM appointment
INNER JOIN customer c ON a.customer_id = c.id)
THEN (SELECT c.first, c.last, c.id, a.id FROM appointent
INNER JOIN customer c ON a.customer_id = c.id)
ELSE
(SELECT c.first, c.last, c.id FROM customer)
END;
Do you have any suggestions?
How about:
SELECT * FROM Customer c LEFT JOIN Appointment a ON a.customer_id = c.id
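You could make two queries and UNION them. The second query is limited to customers without an appointment so that customers who have one are not listed twice.
SELECT c.first, c.last, c.id, a.id FROM appointment a
INNER JOIN customer c ON a.customer_id = c.id
UNION
SELECT c.first, c.last, c.id, null FROM customer c
WHERE NOT EXISTS (SELECT 1 FROM appointment a WHERE a.customer_id = c.id)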
Or use an outer join, where a.id is populated with NULL if there was no match during the join (MySQL needs the LEFT keyword here):
SELECT c.first, c.last, c.id, a.id FROM customer c
LEFT OUTER JOIN appointment a ON a.customer_id = c.id
As per my comment on Zdravko's answer, you could have also used:
select * from customer where id in (select customer_id from appointment where cancelled = 0)
Which would allow you to filter in a nice way.
You can also filter Zdravko's answer like this:
SELECT * FROM Customer c LEFT JOIN Appointment a ON a.customer_id = c.id
WHERE a.cancelled = 0 AND a.confirmed = 1
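The LEFT JOIN suggestions in this thread can be compared on two made-up customers, one with an appointment and one without. The sketch runs on SQLite via Python's sqlite3 module as a stand-in for MySQL; the sample rows are invented and the tables are trimmed to the columns the queries use.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, "first" TEXT, "last" TEXT);
CREATE TABLE appointment (
    id INTEGER PRIMARY KEY,
    cancelled INTEGER DEFAULT 0,
    confirmed INTEGER DEFAULT 0,
    customer_id INTEGER NOT NULL);
-- Ada has one confirmed appointment, Alan has none.
INSERT INTO customer VALUES (1, 'Ada', 'Lovelace'), (2, 'Alan', 'Turing');
INSERT INTO appointment (id, cancelled, confirmed, customer_id) VALUES (50, 0, 1, 1);
""")

# Plain LEFT JOIN: every customer appears, a.id is NULL without a match.
all_customers = conn.execute("""
    SELECT c."first", c."last", c.id, a.id
    FROM customer c
    LEFT JOIN appointment a ON a.customer_id = c.id
    ORDER BY c.id""").fetchall()

# Filtering on a.* in WHERE removes the NULL-extended rows again.
filtered_in_where = conn.execute("""
    SELECT c.id, a.id
    FROM customer c
    LEFT JOIN appointment a ON a.customer_id = c.id
    WHERE a.cancelled = 0 AND a.confirmed = 1
    ORDER BY c.id""").fetchall()

# Filtering in ON keeps customers that have no qualifying appointment.
filtered_in_on = conn.execute("""
    SELECT c.id, a.id
    FROM customer c
    LEFT JOIN appointment a
      ON a.customer_id = c.id AND a.cancelled = 0 AND a.confirmed = 1
    ORDER BY c.id""").fetchall()

print(all_customers)      # [('Ada', 'Lovelace', 1, 50), ('Alan', 'Turing', 2, None)]
print(filtered_in_where)  # [(1, 50)]
print(filtered_in_on)     # [(1, 50), (2, None)]
```

Which of the two filtered variants is wanted depends on whether customers without a matching appointment should still be listed.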

Counting many to many association records

I have a many-to-many association between words and definitions.
words:
+-----------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
+-----------------+--------------+------+-----+---------+----------------+
definitions:
+-------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| language_id | int(11) | YES | MUL | NULL | |
+-------------------+--------------+------+-----+---------+----------------+
definitions_words:
+---------------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+---------+------+-----+---------+-------+
| definition_id | int(11) | NO | PRI | NULL | |
| word_id | int(11) | NO | PRI | NULL | |
+---------------+---------+------+-----+---------+-------+
I would like to get all word records which have exactly one definition with language_id = 1.
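I think the simplest way to express this in SQL is using IN. The subquery joins to definitions, because that is where language_id lives, and groups by word so that the count is per word:
select *
from words
where id in (select dw.word_id
from definitions_words dw
join definitions d on d.id = dw.definition_id
where d.language_id = 1
group by dw.word_id
having count(*) = 1
)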
However, IN with a subquery does not always work efficiently in MySQL. It can be replaced with an EXISTS clause, with the correlation on word_id in the WHERE clause of the subquery:
select *
from words w
where exists (select 1
from definitions_words dw
join definitions d on d.id = dw.definition_id
where d.language_id = 1 and dw.word_id = w.id
having count(*) = 1
)
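The "exactly one definition with language_id = 1" condition can be checked on a handful of made-up rows. This sketch uses SQLite through Python's sqlite3 module in place of MySQL and follows the table names from the question (definitions_words); the data is invented for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE words (id INTEGER PRIMARY KEY);
CREATE TABLE definitions (id INTEGER PRIMARY KEY, language_id INTEGER);
CREATE TABLE definitions_words (
    definition_id INTEGER NOT NULL,
    word_id INTEGER NOT NULL,
    PRIMARY KEY (definition_id, word_id));
INSERT INTO words VALUES (1), (2), (3);
INSERT INTO definitions VALUES (10, 1), (11, 1), (12, 2), (13, 1);
-- word 1: one language-1 definition plus one in language 2
-- word 2: two language-1 definitions
-- word 3: no definitions at all
INSERT INTO definitions_words VALUES (10, 1), (12, 1), (11, 2), (13, 2);
""")

# Count language-1 definitions per word and keep words with exactly one.
exactly_one = [row[0] for row in conn.execute("""
    SELECT w.id
    FROM words w
    WHERE w.id IN (SELECT dw.word_id
                   FROM definitions_words dw
                   JOIN definitions d ON d.id = dw.definition_id
                   WHERE d.language_id = 1
                   GROUP BY dw.word_id
                   HAVING COUNT(*) = 1)
    ORDER BY w.id""")]

print(exactly_one)  # [1]
```

Word 1 qualifies even though it has a second definition, because that one is in another language.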
SELECT a.ID, COUNT(*) totalRecordCount
FROM words a
INNER JOIN definitions_words b
ON a.ID = b.word_ID
INNER JOIN definitions c
ON b.definition_id = c.ID
INNER JOIN
(
SELECT id,
SUM(language_id = 1) totalCount
FROM definitions
GROUP BY id
) d ON c.ID = d.ID AND
d.TotalCount = 1
GROUP BY a.ID

MySQL query to match date and null between two tables

I have two MySQL-tables like this:
desc students;
+---------------------------+---------------+------+-----+---------+
| Field | Type | Null | Key | Default |
+---------------------------+---------------+------+-----+---------+
| student_id | int(11) | NO | PRI | NULL |
| student_firstname | varchar(255) | NO | | NULL |
| student_lastname | varchar(255) | NO | | NULL |
+---------------------------+---------------+------+-----+---------+
desc studentabsence;
+---------------------------+-------------+------+-----+---------+
| Field | Type | Null | Key | Default |
+---------------------------+-------------+------+-----+---------+
| student_absence_id | int(11) | NO | PRI | NULL |
| student_id | int(11) | YES | | NULL |
| student_absence_startdate | date | YES | | NULL |
| student_absence_enddate | date | YES | | NULL |
| student_absence_type | varchar(45) | YES | | NULL |
+---------------------------+-------------+------+-----+---------+
Then I have this MySQL query to list students.
Query:
SELECT s.student_id, s.student_firstname, s.student_lastname,
a.student_absence_startdate, a.student_absence_enddate, a.student_absence_type
FROM students s LEFT JOIN studentabsence a ON a.student_id = s.student_id
Whenever a student has absence information, it is displayed in the columns a.student_absence_startdate, a.student_absence_enddate and a.student_absence_type.
When a student has two or more rows in the table studentabsence, he is listed once per row.
My question is whether the query can be made more specific. I would like to list all students from db.students and, if there is a row in db.studentabsence whose startdate and enddate span a given date (for example 2012-07-30), list the student once with that absence information. The absence columns should be filled only if there is a match on the date.
So something like...
... WHERE (a.student_absence_startdate OR a.student_absence_enddate) IS NULL OR
'2012-07-30' BETWEEN a.student_absence_startdate AND
a.student_absence_enddate ...
It's kinda hard to explain so let me know if you need more information...
I think that you can arrange it with a JOIN on a subselect (derived table):
SELECT s.student_id, s.student_firstname, s.student_lastname,
a.student_absence_startdate, a.student_absence_enddate, a.student_absence_type
FROM students s
LEFT JOIN
(SELECT * FROM studentabsence a1 WHERE ('2012-07-30' BETWEEN a1.student_absence_startdate AND a1.student_absence_enddate) ) a
ON a.student_id = s.student_id
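The same result can also be had by putting the date test directly in the ON clause instead of a derived table. Below is a rough sketch of that variant on invented rows, run on SQLite through Python's sqlite3 module rather than MySQL; dates are stored as ISO strings, which compare correctly as text.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (
    student_id INTEGER PRIMARY KEY,
    student_firstname TEXT NOT NULL,
    student_lastname TEXT NOT NULL);
CREATE TABLE studentabsence (
    student_absence_id INTEGER PRIMARY KEY,
    student_id INTEGER,
    student_absence_startdate TEXT,  -- ISO dates, e.g. '2012-07-30'
    student_absence_enddate TEXT,
    student_absence_type TEXT);
INSERT INTO students VALUES (1, 'Ann', 'Berg'), (2, 'Bo', 'Lind');
-- Student 1 has two absences, only the second covers 2012-07-30.
INSERT INTO studentabsence VALUES
    (1, 1, '2012-07-01', '2012-07-05', 'sick'),
    (2, 1, '2012-07-29', '2012-07-31', 'leave'),
    (3, 2, '2012-06-01', '2012-06-02', 'sick');
""")

def absences_on(conn, day):
    """List every student once, with the absence covering `day` if any."""
    sql = """
        SELECT s.student_id, s.student_firstname, a.student_absence_type
        FROM students s
        LEFT JOIN studentabsence a
          ON a.student_id = s.student_id
         AND ? BETWEEN a.student_absence_startdate AND a.student_absence_enddate
        ORDER BY s.student_id"""
    return conn.execute(sql, (day,)).fetchall()

rows = absences_on(conn, "2012-07-30")
print(rows)  # [(1, 'Ann', 'leave'), (2, 'Bo', None)]
```

A student would still appear twice if two of his absences overlap on the chosen day.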
I'd use parameters with default values (01/01/1900 00:00:00), like this:
AND ( a.student_absence_startdate >= #P_startdate OR #P_startdate = '01/01/1900 00:00:00' )
AND ( a.student_absence_enddate <= #P_enddate OR #P_enddate = '01/01/1900 00:00:00' )