I want to export some data from the DB.
Basically what I want to say is this:
1- Select mbr_name from the members table
2- Choose the ones that exist at the course_registration table (based on mbr_id)
3- Join the course_registration ids with course_comments table
Then I need to apply these WHERE conditions as well:
1- Make sure that crr_status at course_registration table is set to completed
2- Make sure that crr_ts at course_registration table is between "2021-03-07 00:00:00" AND "2022-03-17 00:00:00"
3- Make sure that crm_confirmation from course_comments table is set to accept
So I tried my best and wrote this:
SELECT members.mbr_name
FROM members
INNER JOIN course_registration AS udt ON members.mbr_id = udt.crr_mbr_id
INNER JOIN course_comments AS dot ON udt.crr_cor_id = dot.crm_reference_id
WHERE udt.crr_status = "completed" AND udt.crr_ts >= "2021-03-07 00:00:00" AND udt.crr_ts < "2022-03-17 00:00:00"
AND dot.crm_confirmation = "accept";
But this somehow returns the wrong data.
The actual number of members meeting all these conditions is about 12K, but this query gives me 120K rows, which is obviously wrong!
So what's going wrong here? How can I solve this issue?
UPDATE:
Here are the keys of each table:
members (mbr_id (PK), mbr_name)
course_registration (crr_id (PK), crr_mbr_id (FK), crr_cor_id (FK), crr_status)
course_comments (crm_id (PK), crm_reference_id (FK), crm_confirmation)
You have a so-called cardinality problem. When multiple rows in one table match a single row in the other, a JOIN returns one row per matching pair. Your JOIN as written generates one row for every member x registration x comment combination. That's what JOIN does.
It looks like you want exactly one row in your resultset for each member who ...
has completed one or more courses meeting your criterion.
has submitted one or more comments.
So let's start with a subquery. It gives the mbr_id values for members who have submitted one or more comments on one or more courses that meet your criteria.
SELECT udt.crr_mbr_id
FROM course_registration udt
JOIN course_comments dot ON udt.crr_cor_id = dot.crm_reference_id
WHERE udt.crr_status = "completed"
AND udt.crr_ts >= "2021-03-07 00:00:00"
AND udt.crr_ts < "2022-03-17 00:00:00"
AND dot.crm_confirmation = "accept"
GROUP BY udt.crr_mbr_id
You use the results of that subquery to find your members. The final query is
SELECT members.mbr_name
FROM members
WHERE members.mbr_id IN (
SELECT udt.crr_mbr_id
FROM course_registration udt
JOIN course_comments dot ON udt.crr_cor_id = dot.crm_reference_id
WHERE udt.crr_status = "completed"
AND udt.crr_ts >= "2021-03-07 00:00:00"
AND udt.crr_ts < "2022-03-17 00:00:00"
AND dot.crm_confirmation = "accept"
GROUP BY udt.crr_mbr_id )
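To see the row multiplication concretely, here is a minimal sketch that runs both query shapes against an in-memory SQLite database from Python. SQLite stands in for MySQL here, and the two members, three registrations and five comments are made-up sample data, not anything from the question.

```python
import sqlite3

# In-memory database with the three tables from the question (sample rows are invented).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE members (mbr_id INTEGER PRIMARY KEY, mbr_name TEXT);
CREATE TABLE course_registration (
    crr_id INTEGER PRIMARY KEY, crr_mbr_id INTEGER, crr_cor_id INTEGER,
    crr_status TEXT, crr_ts TEXT);
CREATE TABLE course_comments (
    crm_id INTEGER PRIMARY KEY, crm_reference_id INTEGER, crm_confirmation TEXT);

INSERT INTO members VALUES (1, 'Ann'), (2, 'Bob');
-- Ann completed courses 10 and 11; Bob completed course 10.
INSERT INTO course_registration VALUES
    (1, 1, 10, 'completed', '2021-06-01 00:00:00'),
    (2, 1, 11, 'completed', '2021-07-01 00:00:00'),
    (3, 2, 10, 'completed', '2021-08-01 00:00:00');
-- Course 10 has three accepted comments, course 11 has two.
INSERT INTO course_comments VALUES
    (1, 10, 'accept'), (2, 10, 'accept'), (3, 10, 'accept'),
    (4, 11, 'accept'), (5, 11, 'accept');
""")

# The WHERE conditions shared by both queries.
FILTER = """
    udt.crr_status = 'completed'
    AND udt.crr_ts >= '2021-03-07 00:00:00'
    AND udt.crr_ts <  '2022-03-17 00:00:00'
    AND dot.crm_confirmation = 'accept'
"""

# Plain three-way join: one row per member/registration/comment combination.
joined = conn.execute(f"""
    SELECT members.mbr_name
    FROM members
    JOIN course_registration AS udt ON members.mbr_id = udt.crr_mbr_id
    JOIN course_comments AS dot ON udt.crr_cor_id = dot.crm_reference_id
    WHERE {FILTER}
""").fetchall()

# IN (subquery): members is only filtered, so each member appears at most once.
filtered = conn.execute(f"""
    SELECT members.mbr_name
    FROM members
    WHERE members.mbr_id IN (
        SELECT udt.crr_mbr_id
        FROM course_registration AS udt
        JOIN course_comments AS dot ON udt.crr_cor_id = dot.crm_reference_id
        WHERE {FILTER})
    ORDER BY members.mbr_id
""").fetchall()

print(len(joined), "rows from the plain join")
print(len(filtered), "rows from the IN subquery:", filtered)
```

With this sample data the plain join returns eight rows for two members, while the IN form returns each member once.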
As you only want to select the member name, you can try an EXISTS subquery like the one below and check whether it gives the required result:
select m.mbr_name
from Members m
where Exists ( select 1 from Course_Registration cr
join Course_Comments cm on cr.crr_cor_id = cm.crm_reference_id
where cr.crr_mbr_id = m.mbr_id
And cr.crr_status = "completed" AND cr.crr_ts >= "2021-03-07 00:00:00" AND cr.crr_ts < "2022-03-17 00:00:00"
AND cm.crm_confirmation = "accept"
);
My first guess, without knowing the context, is that:
a member can register to one or more courses,
each course can have one or more comments.
If this is the case, you are getting far more tuples because each member is repeated once per matching course and comment. In that case you just need to put a DISTINCT right after your first SELECT.
Furthermore, since JOINs are often among the more expensive operations in SQL, I would first filter the data and leave the joins as the last operation to improve efficiency. Something like this:
SELECT
members.mbr_name
FROM
(
SELECT DISTINCT
crm_reference_id
FROM
course_comments
WHERE
crm_confirmation = 'accept'
) accepted_comments
INNER JOIN
(
SELECT DISTINCT
crr_mbr_id,
crr_cor_id
FROM
course_registration
WHERE
crr_status = 'completed'
AND
crr_ts >= '2021-03-07 00:00:00' AND crr_ts < '2022-03-17 00:00:00'
) completed_courses
ON
accepted_comments.crm_reference_id = completed_courses.crr_cor_id
INNER JOIN
members
ON
members.mbr_id = completed_courses.crr_mbr_id
I would start at the registration FIRST instead of the members. Getting a DISTINCT list of members who completed a course gives you a smaller subset, and joining that to only the accepted comments gives you the final list of member IDs.
Once you have that, join back to members to get the name. I included the member ID as well as the name because you may have two or more members named "John" or "Karen". The ID at least confirms the unique students.
select
m.mbr_name,
m.mbr_id
from
( select distinct
cr.crr_mbr_id
from
course_registration cr
JOIN course_comments cc
ON cr.crr_cor_id = cc.crm_reference_id
AND cc.crm_confirmation = 'accept'
WHERE
cr.crr_status = 'completed'
AND cr.crr_ts >= '2021-03-07'
AND cr.crr_ts < '2022-03-17' ) PQ
JOIN members m
ON PQ.crr_mbr_id = m.mbr_id
Try this, and if it does not work, try using 'between' for the date field (crr_ts).
select mbr.mbr_name from
(
select distinct udt.crr_mbr_id from course_registration AS udt
INNER JOIN course_comments AS dot ON udt.crr_cor_id = dot.crm_reference_id
where dot.crm_confirmation = "accept" AND udt.crr_status = "completed" AND udt.crr_ts >= "2021-03-07 00:00:00" AND udt.crr_ts < "2022-03-17 00:00:00"
)x
INNER JOIN members mbr on mbr.mbr_id = x.crr_mbr_id
Try this:
SELECT *
FROM members M
INNER JOIN course_registration CR
ON CR.crr_mbr_id = M.mbr_id
AND CR.crr_status = 'completed'
AND CR.crr_ts >= '2021-03-07 00:00:00' AND CR.crr_ts < '2022-03-17 00:00:00'
WHERE EXISTS(
SELECT * FROM course_comments CC
WHERE CC.crm_confirmation = 'accept'
AND CC.crm_reference_id = CR.crr_cor_id
)
ORDER BY M.mbr_id;
Related
I have a table that looks like this:
For each COMPANY there are multiple NATURAL_PERSON_ID values. Every NATURAL_PERSON has a date on which an audit was performed (FECHA_DE_REPORTE), and each company has a date on which its first loan was given.
What I want is to select, for each NATURAL_PERSON, all the FOLIO_CONSULTA rows whose FECHA_DE_REPORTE is less than or equal to FIRST_LOAN (the date on which the first loan was given to that company). Then I need to find the MAX date within each group and keep all the information (the whole row) for the value that fulfills these conditions, for each NATURAL_PERSON.
So for this example the result I expect is all the information of the second row, since this is the MAX() of FECHA_DE_REPORTE by COMPANY and NATURAL_PERSON.
I have tried:
SELECT NPC.COMPANY_ID
,NPC.NATURAL_PERSON_ID
,NPS.DIGITAL_SIGNATURE_ID
,CDC.FOLIO_CONSULTA
,CDC.FECHA_DE_REPORTE
,FIRST_LOAN.FIRST_LOAN
,MAX(CDC.FECHA_DE_REPORTE) MAX_FOLIO_CONSUTA
FROM KONFIO.NATURAL_PERSON_COMPANY NPC
LEFT JOIN KONFIO.NATURAL_PERSON_SIGNATURE NPS ON NPS.NATURAL_PERSON_ID = NPC.NATURAL_PERSON_ID
JOIN KONFIO.CDC_RESPONSE CDC ON CDC.DIGITAL_SIGNATURE_ID= NPS.DIGITAL_SIGNATURE_ID
JOIN
(
SELECT CAPP.COMPANY_ID
,MIN(LOAN.DOCUMENTATION_DATE) FIRST_LOAN
FROM KONFIO.COMPANY_APPLICATION CAPP
JOIN KONFIO.LOAN ON LOAN.APPLICATION_ID = CAPP.APPLICATION_ID
GROUP BY CAPP.COMPANY_ID) FIRST_LOAN ON FIRST_LOAN.COMPANY_ID = NPC.COMPANY_ID
WHERE CDC.FECHA_DE_REPORTE <= FIRST_LOAN.FIRST_LOAN
AND NPC.COMPANY_ID IN (1033)
GROUP BY NPC.COMPANY_ID, NPC.NATURAL_PERSON_ID
but it retrieves the first value it finds, so the FOLIO_CONSULTA does not correspond to the row with the MAX() FECHA_DE_REPORTE.
Any help would be appreciated
You should join the subquery for MAX(FECHA_DE_REPORTE) on table CDC_RESPONSE
SELECT NPC.COMPANY_ID
,NPC.NATURAL_PERSON_ID
,NPS.DIGITAL_SIGNATURE_ID
,CDC.FOLIO_CONSULTA
,CDC.FECHA_DE_REPORTE
,FIRST_LOAN.FIRST_LOAN
,T.MAX_FOLIO_CONSUTA
FROM KONFIO.NATURAL_PERSON_COMPANY NPC
LEFT JOIN KONFIO.NATURAL_PERSON_SIGNATURE NPS ON NPS.NATURAL_PERSON_ID = NPC.NATURAL_PERSON_ID
JOIN KONFIO.CDC_RESPONSE CDC ON CDC.DIGITAL_SIGNATURE_ID= NPS.DIGITAL_SIGNATURE_ID
-- T is joined after NPS and CDC because its ON clause references both
INNER JOIN (
SELECT DIGITAL_SIGNATURE_ID
, MAX(FECHA_DE_REPORTE) MAX_FOLIO_CONSUTA
FROM KONFIO.CDC_RESPONSE
GROUP BY DIGITAL_SIGNATURE_ID
) T ON T.DIGITAL_SIGNATURE_ID = NPS.DIGITAL_SIGNATURE_ID
AND T.MAX_FOLIO_CONSUTA = CDC.FECHA_DE_REPORTE
JOIN
(
SELECT CAPP.COMPANY_ID
,MIN(LOAN.DOCUMENTATION_DATE) FIRST_LOAN
FROM KONFIO.COMPANY_APPLICATION CAPP
JOIN KONFIO.LOAN ON LOAN.APPLICATION_ID = CAPP.APPLICATION_ID
GROUP BY CAPP.COMPANY_ID) FIRST_LOAN ON FIRST_LOAN.COMPANY_ID = NPC.COMPANY_ID
WHERE CDC.FECHA_DE_REPORTE <= FIRST_LOAN.FIRST_LOAN
AND NPC.COMPANY_ID IN (1033)
GROUP BY NPC.COMPANY_ID, NPC.NATURAL_PERSON_ID
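The general pattern (join each row back to its group's maximum) can be sketched on a simplified, hypothetical table. The single `reports` table, its column names and its rows are assumptions made for illustration, and SQLite is used from Python instead of the original database. Note that the date cutoff is applied inside the subquery, so the maximum is taken only over reports made on or before the first loan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical flattened table: one row per report, with the company's first loan date.
CREATE TABLE reports (person_id INTEGER, folio TEXT, report_date TEXT, first_loan TEXT);
INSERT INTO reports VALUES
    (1, 'F-1', '2020-01-10', '2020-06-01'),
    (1, 'F-2', '2020-05-20', '2020-06-01'),
    (1, 'F-3', '2020-09-15', '2020-06-01'),  -- after the first loan, must be ignored
    (2, 'F-4', '2020-03-01', '2020-06-01');
""")

# Step 1 (subquery): latest qualifying report date per person.
# Step 2 (outer join): fetch the whole row that carries that date.
rows = conn.execute("""
    SELECT r.person_id, r.folio, r.report_date
    FROM reports AS r
    JOIN (
        SELECT person_id, MAX(report_date) AS max_date
        FROM reports
        WHERE report_date <= first_loan
        GROUP BY person_id
    ) AS latest
      ON latest.person_id = r.person_id
     AND latest.max_date = r.report_date
    ORDER BY r.person_id
""").fetchall()

for row in rows:
    print(row)
```

Because the folio comes from the row that matched the maximum date, it always corresponds to that date, which is what the GROUP BY attempt in the question could not guarantee.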
...... missing part
Table A [PATIENT] has columns [PATID], [FIRSTVISITDATE]
Table B [APPT] has columns [APPTID], [PATID], [CREATEDATE]
Table C [NOTE] has columns [NOTEID], [NOTETEXT]
Table D [PROCS] has columns [PROCID], [PATID]
Table E [CHARGE] has columns [CHARGEID], [AMOUNT]
I need to sum CHARGE.AMOUNT by PATID for all patients where NOTE.NOTETEXT contains 'text' and the APPT whose note contains the 'text' has an APPT.CREATEDATE equal to the PATIENT.FIRSTVISITDATE.
Simply put, I need to SUM the charges for patients who have an appointment with 'text' in its notes, where that appointment was their first visit to the office.
Other key points:
CHARGE.CHARGEID = PROC.PROCID
NOTE.NOTEID = APPT.APPTID
With my limited knowledge of SQL I was able to sum for all patients, regardless of whether the 'text' was included in their first appointment's notes. For that I used:
select (SUM(AMOUNT)) as 'Cash Payments' from CHARGE where CHARGEID in
(select PROCID from PROC where PATID in
(select PATID from APPT where APPTID in
(select NOTEID from NOTE where NOTETEXT like '%text%')))
You can use the GROUP BY clause to group the AMOUNT by patient. You can filter your patients to just the ones with the text in the notes and FIRSTVISITDATE = CREATEDATE using an inner query that joins the tables on those conditions.
I have not tested the following query, but it should do what you're asking.
SELECT pa.PATID, SUM(c.AMOUNT) AS 'Cash Payments'
FROM PATIENT pa
INNER JOIN PROCS pr
ON pa.PATID = pr.PATID
INNER JOIN CHARGE c
ON pr.PROCID = c.CHARGEID
WHERE pa.PATID IN (
SELECT pa.PATID
FROM PATIENT pa
INNER JOIN APPT a
ON pa.PATID = a.PATID
AND pa.FIRSTVISITDATE = a.CREATEDATE
INNER JOIN NOTE n
ON a.APPTID = n.NOTEID
WHERE n.NOTETEXT LIKE '%text%'
)
GROUP BY pa.PATID;
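As a rough check of this approach, the sketch below builds the five tables in an in-memory SQLite database from Python and runs an equivalent query. The rows are invented: patient 1 has the text on the first visit, patient 2 only on a later visit. SQLite is an assumption here, and the original database may behave differently in details such as LIKE case sensitivity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PATIENT (PATID INTEGER PRIMARY KEY, FIRSTVISITDATE TEXT);
CREATE TABLE APPT (APPTID INTEGER PRIMARY KEY, PATID INTEGER, CREATEDATE TEXT);
CREATE TABLE NOTE (NOTEID INTEGER PRIMARY KEY, NOTETEXT TEXT);
CREATE TABLE PROCS (PROCID INTEGER PRIMARY KEY, PATID INTEGER);
CREATE TABLE CHARGE (CHARGEID INTEGER PRIMARY KEY, AMOUNT REAL);

INSERT INTO PATIENT VALUES (1, '2023-01-05'), (2, '2023-02-01');
INSERT INTO APPT VALUES
    (100, 1, '2023-01-05'),   -- patient 1, first visit
    (101, 2, '2023-02-01'),   -- patient 2, first visit
    (102, 2, '2023-03-01');   -- patient 2, later visit
INSERT INTO NOTE VALUES
    (100, 'paid with text voucher'),
    (101, 'routine'),
    (102, 'text follow-up');
INSERT INTO PROCS VALUES (500, 1), (501, 1), (502, 2);
INSERT INTO CHARGE VALUES (500, 40.0), (501, 60.0), (502, 75.0);
""")

# Sum charges per patient, but only for patients whose first-visit note contains the text.
rows = conn.execute("""
    SELECT pr.PATID, SUM(c.AMOUNT) AS cash_payments
    FROM PROCS pr
    JOIN CHARGE c ON c.CHARGEID = pr.PROCID
    WHERE pr.PATID IN (
        SELECT pa.PATID
        FROM PATIENT pa
        JOIN APPT a ON a.PATID = pa.PATID
                   AND a.CREATEDATE = pa.FIRSTVISITDATE
        JOIN NOTE n ON n.NOTEID = a.APPTID
        WHERE n.NOTETEXT LIKE '%text%')
    GROUP BY pr.PATID
""").fetchall()

print(rows)
```

Only patient 1 qualifies, so the result holds a single row with that patient's two charges added together.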
MySQL Table Diagram:
My query this far:
SELECT tblcourses.CourseStandard,
tblcourses.CourseID,
tblcourses.CourseRef,
tblcourses.CourseStandard,
tblcourses.CourseName,
tblcourses.CourseDuration,
tblcourses.NQFLevel,
tblcourses.CoursePrice,
tblcoursestartdates.StartDate
FROM etcgroup.tblcoursestartdates tblcoursestartdates
INNER JOIN etcgroup.tblcourses tblcourses
ON (tblcoursestartdates.CourseID = tblcourses.CourseID)
WHERE tblcoursestartdates.StartDate >= Now()
If you look at the diagram you will see I have a third table. The query above works fine. It displays all the data as it should.
I want to show all the courses and their respective dates excluding those that the student is already booked for. Keep in mind that there can be 20 start dates for 1 course. This is why I am only choosing dates >= Now().
I want to make sure that a student does not get double booked. Yes, I can check afterwards and warn that the student is already booked, BUT if the query can simply not show the course dates that the student has already booked, then great. Any suggestions?
This is pretty straightforward. Presumably you know the StudentID you'd like to see. Do a left join to the bookings table and select the mismatches.
SELECT tblcourses.CourseStandard,
tblcourses.CourseID,
tblcourses.CourseRef,
tblcourses.CourseStandard,
tblcourses.CourseName,
tblcourses.CourseDuration,
tblcourses.NQFLevel,
tblcourses.CoursePrice,
tblcoursestartdates.StartDate
FROM etcgroup.tblcoursestartdates tblcoursestartdates
INNER JOIN etcgroup.tblcourses tblcourses
ON tblcoursestartdates.CourseID = tblcourses.CourseID
AND tblcoursestartdates.StartDate >= Now()
LEFT JOIN tblbookings
ON tblbookings.CourseId = tblcourses.CourseId
AND tblbookings.StudentId = <<<the student ID in question >>>
WHERE tblbookings.BookingID IS NULL
The trick here is the LEFT JOIN ... IS NULL pattern. It eliminates the rows where the ON condition of the LEFT JOIN hit, leaving only the ones where it missed.
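A minimal sketch of the LEFT JOIN ... IS NULL pattern, run against an in-memory SQLite database from Python. The tables are cut down to a few columns and the rows are made up. The student ID is passed as a bound parameter, and the `available` helper is a hypothetical name used only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblcourses (CourseID INTEGER PRIMARY KEY, CourseName TEXT);
CREATE TABLE tblcoursestartdates (CourseID INTEGER, StartDate TEXT);
CREATE TABLE tblbookings (BookingID INTEGER PRIMARY KEY, CourseID INTEGER, StudentID INTEGER);

INSERT INTO tblcourses VALUES (1, 'First Aid'), (2, 'Fire Safety');
INSERT INTO tblcoursestartdates VALUES
    (1, '2030-01-10'), (1, '2030-02-10'), (2, '2030-01-15');
-- Student 42 is already booked on course 1.
INSERT INTO tblbookings VALUES (1, 1, 42);
""")

def available(student_id):
    """Course dates for courses the given student has no booking on."""
    return conn.execute("""
        SELECT c.CourseName, d.StartDate
        FROM tblcoursestartdates d
        JOIN tblcourses c ON c.CourseID = d.CourseID
        LEFT JOIN tblbookings b
               ON b.CourseID = c.CourseID
              AND b.StudentID = ?
        WHERE b.BookingID IS NULL      -- keep only rows where the LEFT JOIN missed
        ORDER BY c.CourseID, d.StartDate
    """, (student_id,)).fetchall()

print(available(42))  # student with a booking on course 1
print(available(7))   # student with no bookings
```

The student filter has to sit in the ON clause. Moving it into WHERE would discard the unmatched rows (where b.StudentID is NULL) before the IS NULL test can keep them.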
Do a left join to tblBookings on courseID where the bookingID is null (there are no matches). You'll have to provide the studentID as a parameter to the query.
SELECT DISTINCT c.CourseStandard,
c.CourseID,
c.CourseRef,
c.CourseStandard,
c.CourseName,
c.CourseDuration,
c.NQFLevel,
c.CoursePrice,
d.StartDate
FROM etcgroup.tblcoursestartdates d
INNER JOIN etcgroup.tblcourses c ON d.CourseID = c.CourseID
LEFT JOIN etcgroup.tblBookings b on c.CourseID = b.CourseID and b.StudentID = @StudentID
WHERE d.StartDate >= Now() and b.bookingID is null
Use NOT EXISTS or NOT IN to exclude the courses a student has already booked:
SELECT
c.CourseStandard,
c.CourseID,
c.CourseRef,
c.CourseStandard,
c.CourseName,
c.CourseDuration,
c.NQFLevel,
c.CoursePrice,
csd.StartDate
FROM etcgroup.tblcourses c
INNER JOIN etcgroup.tblcoursestartdates csd ON csd.CourseID = c.CourseID
WHERE csd.StartDate >= Now()
AND c.CourseID NOT IN
(
SELECT CourseID
FROM tblbookings
WHERE StudentID = 12345
);
I have 3 tables; events, memberEvents, and members.
Events: eventId, eventName, eventDivision
memberEvents: memberID, eventOne, eventTwo, eventThree, eventFour, eventFive
member: memberID, memberFirstName, memberLastName
I am trying to get it to display events.eventName followed by the memberFirstName & memberLastName of members that are doing that event
This is the query I have been trying:
SELECT * FROM events, memberEvents, members
WHERE events.eventDivision = 'C'
RIGHT JOIN memberEvents.memberID
ON events.eventID = memberEvents.eventOne
OR events.eventID = memberEvents.eventTwo
OR events.eventID = memberEvents.eventThree
OR events.eventID = memberEvents.eventFour
OR events.eventID = memberEvents.eventFive
When I run this I get "#1066 - Not unique table/alias: 'memberEvents'"
Try:
SELECT ev.*, me.* FROM events ev
RIGHT JOIN memberEvents me
ON (ev.eventID = me.eventOne
OR ev.eventID = me.eventTwo
OR ev.eventID = me.eventThree
OR ev.eventID = me.eventFour
OR ev.eventID = me.eventFive)
WHERE ev.eventDivision = 'C'
Did you specifically want to limit the number of events per member to five? Why not just have a memberEvent table that has a primary key made up of foreign keys to member and event?
memberEvent: memberId, eventId
Then your query would be
SELECT
event.eventName,
member.memberFirstName,
member.memberLastName
FROM
event
INNER JOIN
memberEvent
ON
memberEvent.eventID = event.eventId
INNER JOIN
member
ON
memberEvent.memberId = member.memberId
WHERE
event.eventDivision = 'C';
Maybe you have a good reason for the table structure you have chosen but it is a denormalised design and if you ever need to increase the number of events per member you'll need to modify your schema and code to suit.
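Here is a small sketch of the normalised layout, using an in-memory SQLite database from Python. The singular table names follow the query above, and the events and members are invented sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event (eventId INTEGER PRIMARY KEY, eventName TEXT, eventDivision TEXT);
CREATE TABLE member (memberId INTEGER PRIMARY KEY, memberFirstName TEXT, memberLastName TEXT);
-- Junction table: one row per (member, event) pair, so there is no five-event limit.
CREATE TABLE memberEvent (
    memberId INTEGER REFERENCES member(memberId),
    eventId  INTEGER REFERENCES event(eventId),
    PRIMARY KEY (memberId, eventId));

INSERT INTO event VALUES (1, 'Anatomy', 'C'), (2, 'Astronomy', 'C'), (3, 'Bridges', 'B');
INSERT INTO member VALUES (1, 'Ada', 'Lovelace'), (2, 'Alan', 'Turing');
INSERT INTO memberEvent VALUES (1, 1), (1, 2), (2, 1), (2, 3);
""")

# Event name followed by each participating member, division C only.
rows = conn.execute("""
    SELECT e.eventName, m.memberFirstName, m.memberLastName
    FROM event e
    JOIN memberEvent me ON me.eventId = e.eventId
    JOIN member m ON m.memberId = me.memberId
    WHERE e.eventDivision = 'C'
    ORDER BY e.eventName, m.memberLastName
""").fetchall()

for row in rows:
    print(row)
```

Adding a sixth event for a member is then just another row in memberEvent, with no schema change and no extra OR branch in the join.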
I think you should have a closer look at the definition
JOIN Syntax
It seems that you have misunderstood the JOIN syntax.
SELECT *
FROM table1 t1 right join
table2 t2 on t1.key1 = t2.key1
and t1.key2 = t2.key2
where t1.somecolumn = somevalue
I have a rather complex-seeming query that will form the basis for an online classroom scheduling tool. My challenge is to develop a method to identify which classes a user is signed up for in the st_schedule table, then deduce from the overall table of classes, st_classes, which other classes are available that don't conflict with the user's current classes.
For example, if a user has an entry in st_schedule assigning them to a class from 8:00am to 9:00am, they would be ineligible for any class whose time fell between 8:00am and 9:00am. A class that ran 7:15am - 8:15am would also conflict, so the user would be ineligible for it. I store the start times and end times of classes in the database separately for comparison purposes. It's important that this be as flexible as possible, so the concept of "blocking" times and assigning times to blocks is not a possibility.
Here are excerpts from the tables:
table st_classes (holds class information)
id
start_time
end_time
table st_schedule (holds schedule information)
id
user_id
class_id
I certainly could do this in a series of loops server-side, but I have to think that there's a MySQL method that can do this type of operation in one fell swoop.
You want to join the two tables together to represent the user's classes, and then find unregistered classes where the start time and end time do not fall between the start and end time of the user's classes.
Something like this. Completely off the cuff and untested:
SELECT
*
FROM
st_schedule s
INNER JOIN st_classes c ON c.id = s.class_id
INNER JOIN st_classes all_classes
ON all_classes.start_time NOT BETWEEN c.start_time AND c.end_time
AND all_classes.end_time NOT BETWEEN c.start_time AND c.end_time
WHERE
s.user_id = 1
Edit: Try #2
I only have a moment to look at this. I think I reversed the second join clauses. The all_classes alias represents the full list of classes, where the "c" alias represents the classes that the student is signed up for.
SELECT DISTINCT
all_classes.*
FROM
st_schedule s
INNER JOIN st_classes c ON c.id = s.class_id
INNER JOIN st_classes all_classes
ON c.start_time NOT BETWEEN all_classes.start_time AND all_classes.end_time
AND c.end_time NOT BETWEEN all_classes.start_time AND all_classes.end_time
WHERE
s.user_id = 1
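A different way to express the same idea, sketched here as an alternative rather than as a corrected version of the queries above, is the usual interval-overlap test: two classes conflict when each one starts before the other ends. Combined with NOT EXISTS it keeps only classes that conflict with none of the user's classes, including the case where one class sits entirely inside another. The example uses an in-memory SQLite database from Python with invented times stored as 'HH:MM:SS' text, and `open_classes` is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE st_classes (id INTEGER PRIMARY KEY, start_time TEXT, end_time TEXT);
CREATE TABLE st_schedule (id INTEGER PRIMARY KEY, user_id INTEGER, class_id INTEGER);

INSERT INTO st_classes VALUES
    (1, '08:00:00', '09:00:00'),   -- the class user 1 is signed up for
    (2, '07:15:00', '08:15:00'),   -- overlaps the start of class 1
    (3, '09:00:00', '10:00:00'),   -- starts exactly when class 1 ends
    (4, '07:00:00', '10:00:00'),   -- completely contains class 1
    (5, '08:15:00', '08:45:00');   -- completely inside class 1
INSERT INTO st_schedule VALUES (1, 1, 1);
""")

def open_classes(user_id):
    """IDs of classes that overlap none of the user's scheduled classes."""
    rows = conn.execute("""
        SELECT a.id
        FROM st_classes a
        WHERE NOT EXISTS (
            SELECT 1
            FROM st_schedule s
            JOIN st_classes c ON c.id = s.class_id
            WHERE s.user_id = ?
              AND a.start_time < c.end_time     -- a starts before c ends
              AND a.end_time   > c.start_time)  -- and a ends after c starts
        ORDER BY a.id
    """, (user_id,)).fetchall()
    return [r[0] for r in rows]

print(open_classes(1))  # user with the 8:00-9:00 class
print(open_classes(2))  # user with nothing scheduled
```

The strict comparisons treat back-to-back classes as compatible. Switching to <= and >= would instead count a class that starts at the moment another ends as a conflict.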
This uses table variables in MSSQL, but the SELECT statements should translate to MySQL.
First the sample data
DECLARE @st_classes TABLE
(
ID INT NOT NULL,
Title VARCHAR(40) NOT NULL,
StartTime DATETIME NOT NULL,
EndTime DATETIME NOT NULL
)
DECLARE @st_schedule TABLE
(
ID INT NOT NULL,
UserID INT NOT NULL,
ClassID INT NOT NULL
)
INSERT INTO @st_classes (ID, Title, StartTime, EndTime)
SELECT 1,'Class1','08:00:00','09:30:00' UNION
SELECT 2,'Class2','09:30:00','11:30:00' UNION
SELECT 3,'Class3','11:30:00','16:00:00' UNION
SELECT 4,'Class4','16:00:00','17:30:00' UNION
SELECT 5,'Class5','09:00:00','11:45:00' UNION
SELECT 6,'Class6','07:00:00','18:00:00'
INSERT INTO @st_schedule(ID, UserID, ClassID)
SELECT 1,1,1 UNION
SELECT 2,1,2 UNION
SELECT 3,2,6
Next a bit of SQL to confirm the tables join OK (selecting scheduled courses for the user with an ID of 1) - Returns class 1 and 2
SELECT *
FROM @st_schedule AS S INNER JOIN
@st_classes AS C ON S.ClassID = C.ID
WHERE S.UserID = 1
Now we need to select the IDs of all courses that overlap time-wise with the user's scheduled ones (including the scheduled ones themselves) - Returns 1,2,5,6
SELECT AC.ID
FROM @st_classes AS AC
INNER JOIN ( SELECT C.StartTime,
C.EndTime
FROM @st_schedule AS S
INNER JOIN @st_classes AS C ON S.ClassID = C.ID
WHERE S.UserID = 1
) AS UC ON ( AC.StartTime < DATEADD(ss, -1, UC.EndTime)
AND DATEADD(ss, -1, AC.EndTime) > UC.StartTime
)
GROUP BY AC.ID
Now we need to select all courses where the Course ID is not in our list of overlapping course IDs. - Returns course 3 and 4
SELECT *
FROM @st_classes
WHERE ID NOT IN (
SELECT AC.ID
FROM @st_classes AS AC
INNER JOIN ( SELECT C.StartTime,
C.EndTime
FROM @st_schedule AS S
INNER JOIN @st_classes AS C ON S.ClassID = C.ID
WHERE S.UserID = 1
) AS UC ON ( AC.StartTime < DATEADD(ss, -1, UC.EndTime)
AND DATEADD(ss, -1, AC.EndTime) > UC.StartTime
)
GROUP BY AC.ID )
Change the user ID filter to 2 and you should not get any returned as the course assigned to that user overlaps all courses.