MYSQL Search returning wrong and duplicate values - mysql

I'm doing a project at university and I'm running into issues when I try to run a search to collect some results.
I am trying to display the results which give the StudentName, ModuleName, and DegreeID. When I do this, it appears to be duplicating the values and returning wrong results.
For example - Owen Barnes is only studying Computer Science, not Philosophy yet it is simply returning all values instead of the specified 3 it should. Further, Connor Borne is studying Philosophy yet it is suggesting he is studying every module including those in Computer Science.
I was hoping someone could help me. I'm using 2 tables (ModulesFormDegree & StudiesModules) which are used to link Modules to Degree (using 2 foreign keys) and Students with Modules (also using 2 foreign keys).
I've attached my problem below, if any more data is required please let me know.
Inquiry & Results
Description of Tables
Query:
select StudentName, ModuleName, DegreeID
from Student, Modules, Degree, StudiesModules, ModulesFormDegree
where Student.StudentID=StudiesModules.StudentID and
Modules.ModuleID=ModulesFormDegree.ModID and
Degree.DegreeID=ModulesFormDegree.DegID

It's difficult to say for sure because you have not posted all your table definitions, but your query is missing a condition in the WHERE clause: nothing links StudiesModules to ModulesFormDegree, so the student rows and the module rows are combined as a Cartesian product. It can be fixed as follows:
select StudentName,
ModuleName,
DegreeID
from Student,
Modules,
Degree,
StudiesModules,
ModulesFormDegree
where Student.StudentID=StudiesModules.StudentID and
Modules.ModuleID=ModulesFormDegree.ModID and
Degree.DegreeID=ModulesFormDegree.DegID and
StudiesModules.ModuleID = ModulesFormDegree.ModID
However, joining tables through conditions in the WHERE clause is fairly antiquated and has been superseded by ANSI join syntax, as follows:
SELECT StudentName,
ModuleName,
DegreeID
FROM StudiesModules sm
JOIN ModulesFormDegree md
ON sm.ModuleID = md.ModID
JOIN Degree d
ON d.DegreeID = md.DegID
JOIN Modules m
ON m.ModuleID = md.ModID
JOIN Student s
ON s.StudentID = sm.StudentID
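To see the effect of the missing condition on row counts, here is a minimal sketch. It assumes a tiny version of the schema (table and column names are taken from the question, the rows and module names are invented) and uses Python's built-in sqlite3 module as a stand-in for MySQL, so treat it as an illustration rather than a reproduction of the original database.

```python
import sqlite3

# Hypothetical miniature data set; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (StudentID INTEGER PRIMARY KEY, StudentName TEXT);
CREATE TABLE Modules (ModuleID INTEGER PRIMARY KEY, ModuleName TEXT);
CREATE TABLE Degree (DegreeID INTEGER PRIMARY KEY);
CREATE TABLE StudiesModules (StudentID INTEGER, ModuleID INTEGER);
CREATE TABLE ModulesFormDegree (ModID INTEGER, DegID INTEGER);

INSERT INTO Student VALUES (1, 'Owen Barnes'), (2, 'Connor Borne');
INSERT INTO Modules VALUES (10, 'Algorithms'), (11, 'Databases'), (20, 'Ethics');
INSERT INTO Degree VALUES (100), (200);
INSERT INTO StudiesModules VALUES (1, 10), (1, 11), (2, 20);
INSERT INTO ModulesFormDegree VALUES (10, 100), (11, 100), (20, 200);
""")

# The question's query: students and modules are never linked to each other.
base = """
SELECT StudentName, ModuleName, DegreeID
FROM Student, Modules, Degree, StudiesModules, ModulesFormDegree
WHERE Student.StudentID = StudiesModules.StudentID
  AND Modules.ModuleID = ModulesFormDegree.ModID
  AND Degree.DegreeID = ModulesFormDegree.DegID
"""
missing = conn.execute(base).fetchall()

# Adding the link between the two junction tables removes the cross product.
fixed = conn.execute(
    base + " AND StudiesModules.ModuleID = ModulesFormDegree.ModID"
).fetchall()

print(len(missing), len(fixed))  # 9 3
```

With three enrolment rows and three module rows, the unlinked query returns 3 x 3 = 9 rows, while the fixed query returns one row per enrolment.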

Related

count students from table where join mysql

I have the database shown in the image below. I want
to count all students from Subject with name Psychology and class with name Class5.
the percentage of students with status "Something" from subject with name Psychology and class with name Class5.
All students and the class name from Class "Class6" that are male.
I've tried for example
(in english:)
SELECT COUNT(student_name) AS NumberOfStudents FROM student_srms JOIN class_srms JOIN subject_srms WHERE class_srms.class_name='Class5' AND subject_srms.subject_name='Psychology'
But this returns NumberOfStudents = 20, and 20 is the total number of student entries.
The issue likely stems from your FROM clause. It's not enough to just say JOIN. You need to specify the relationship of the columns between the two tables being joined with an ON clause:
FROM student_srms
JOIN class_srms
ON student_srms.student_id = class_srms.student_id
JOIN subject_srms
ON class_srms.subject_id = subject_srms.subject_id
I believe MySQL has a NATURAL JOIN, which joins on the columns that have the same name in both tables without needing an ON clause, but that feels dirty to me and could cause failures later in an application's lifecycle if new columns are introduced that share names but not relationships, so I would steer clear of it.
I have a suspicion that your diagram showing tables/columns is incorrect based on the error you are reporting in the comments. Instead, try (and I'm totally guessing blind here at this point):
FROM student_srms
JOIN student_class
ON student_srms.student_id = student_class.student_id
JOIN class_srms
ON student_class.class_id = class_srms.class_id
JOIN subject_srms
ON class_srms.subject_id = subject_srms.subject_id
That adds in that student_class relationship table so you can make the jump from student to class tables. Fingers crossed.
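A small sketch of why the count comes back as "all students" when JOIN has no ON clause. The two-table schema below is hypothetical (in particular the class_id link column and all rows are invented, since the real diagram is not available), and Python's sqlite3 stands in for MySQL; both treat a JOIN with no condition as a cross join.

```python
import sqlite3

# Hypothetical schema: the class_id link column and all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE class_srms (class_id INTEGER PRIMARY KEY, class_name TEXT);
CREATE TABLE student_srms (
    student_id INTEGER PRIMARY KEY, student_name TEXT, class_id INTEGER);

INSERT INTO class_srms VALUES (1, 'Class5'), (2, 'Class6');
INSERT INTO student_srms VALUES
    (1, 'Ann', 1), (2, 'Bob', 1), (3, 'Cy', 2), (4, 'Dee', 2);
""")

# No ON clause: every student is paired with every class, so the WHERE
# filter only narrows the class side and every student is still counted.
without_on = conn.execute("""
    SELECT COUNT(student_name)
    FROM student_srms JOIN class_srms
    WHERE class_srms.class_name = 'Class5'
""").fetchone()[0]

# With ON: only students whose class_id matches the Class5 row are counted.
with_on = conn.execute("""
    SELECT COUNT(student_name)
    FROM student_srms
    JOIN class_srms ON student_srms.class_id = class_srms.class_id
    WHERE class_srms.class_name = 'Class5'
""").fetchone()[0]

print(without_on, with_on)  # 4 2
```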

SQL Select & Join Tables to display results

I've just started learning about databases and SQL, and I've been working on a problem that I just can't quite nail. This is what I'm trying to do, given two tables called CHARTER and CUSTOMER:
Give the relational algebra statement (or SQL) as well as the table that would
result from applying SELECT & JOIN relational operators to the CHARTER and
CUSTOMER tables to return only the CHAR_TRIP, CUS_LNAME and CHAR_DESTINATION
attributes for charters flown by pilot 109.
Note that 109 is sometimes the pilot and other times the co-pilot.
Display all these flights.
This is the SQL that I've tried:
select CHAR_TRIP, CUS_LNAME, CHAR_DESTINATION from CHARTER natural join CUSTOMER where CHARTER.CHAR_PILOT and CHARTER.CHAR_COPILOT="109";
But this just doesn't seem to give me what I want; I should be getting 6 records, but I'm only getting 3. I figured that it may be due to something in the SQL. Have I overlooked something with my code?
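Not sure if MySQL has some special syntax here, but each column needs its own comparison; as written, CHAR_PILOT on its own is only tested for being non-zero. Try separating the WHERE conditions: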
select CHAR_TRIP, CUS_LNAME, CHAR_DESTINATION
from CHARTER natural join CUSTOMER
where CHARTER.CHAR_PILOT ="109"
or CHARTER.CHAR_COPILOT="109";
I changed the second one to an or since 109 can be either the main pilot or the co-pilot.
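The difference between the two WHERE clauses can be checked on a few rows. This is a sketch with invented data and only the columns needed for the filter, run through Python's sqlite3 as a stand-in for MySQL; both engines give = a higher precedence than AND, so the original condition reads as "CHAR_PILOT is non-zero AND CHAR_COPILOT = 109".

```python
import sqlite3

# Invented rows; only the columns relevant to the filter are modelled.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CHARTER (CHAR_TRIP INTEGER, CHAR_PILOT INTEGER, CHAR_COPILOT INTEGER);
INSERT INTO CHARTER VALUES
    (1, 109, NULL),   -- 109 flies as pilot, no co-pilot
    (2, 104, 109),    -- 109 flies as co-pilot
    (3, 109, 105),    -- 109 flies as pilot
    (4, 101, 104);    -- 109 not on board
""")

# Original condition: CHAR_PILOT is used as a bare truth value.
broken = conn.execute("""
    SELECT CHAR_TRIP FROM CHARTER
    WHERE CHAR_PILOT AND CHAR_COPILOT = 109
    ORDER BY CHAR_TRIP
""").fetchall()

# Corrected condition from the answer.
fixed = conn.execute("""
    SELECT CHAR_TRIP FROM CHARTER
    WHERE CHAR_PILOT = 109 OR CHAR_COPILOT = 109
    ORDER BY CHAR_TRIP
""").fetchall()

# Equivalent shorthand: test the value against both columns at once.
shorthand = conn.execute("""
    SELECT CHAR_TRIP FROM CHARTER
    WHERE 109 IN (CHAR_PILOT, CHAR_COPILOT)
    ORDER BY CHAR_TRIP
""").fetchall()

print(broken)  # [(2,)]
print(fixed)   # [(1,), (2,), (3,)]
```

The broken form only finds the trips where 109 is the co-pilot, which matches the symptom of getting roughly half the expected records.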

SQL to retrieve related records with many to many relation

I am building an application for registration of agreements between institutes. These agreements may include more than 2 partners. As such, I quickly dropped the idea of having partner1 and partner2 in a contracts table.
Current design is (Note: simplified for question):
Table Institutes: ID, Name , ..
Table Contract_institutes: ContractID, InstituteID
Table Contracts: ID, Title, ...
How would I go about showing a list of all contracts including the involved partners, assuming you know one partner: A user is logged in, and wants to see all the contracts that his institute has, and all the partners in the contract; e.g.:
Contract1: (Title) Institute1Name, Institute2Name
Contract2: (Title) Institute1Name, Institute2Name, Institute3Name
Contract3: (Title) Institute1Name
I could first get all the contracts IDs
select *fields*
from Contracts
left join Contract_institutes on Contracts.ID = Contract_institutes.ContractID
where Contract_institutes.InstituteID = *SomeValue*
And then get all the related institutes with a separate query for each contract (Or using an IN statement in the query), and use a lot of foreach php loops to format. Not pretty, and probably not efficient.
There must be a better way to do this, and get the list in a single sql statement. Can someone help me?
Ideally, I get output rows with: [contract ID][InstituteID][Institute.Name]. I can easily modify this in a per-contract view in the output.
PS:
- This is design phase of the application: The database is empty and can be modified to needs.
select C.ID, I.ID, I.Name
from Contracts C
join Contract_institutes CI on C.ID = CI.ContractID
join Institutes I on I.ID=CI.InstituteId
where CI.InstituteID <> *SomeValue*
and CI.ContractID in (select CI2.ContractId
from Contract_institutes CI2
where CI2.InstituteID = *SomeValue*)
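Here is a sketch of that query running against a small invented data set, using the three tables from the question. Python's sqlite3 stands in for MySQL, and the *SomeValue* placeholder is passed as a bound parameter; institute names and contract titles are made up.

```python
import sqlite3

# Tables follow the question's simplified design; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Institutes (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Contracts (ID INTEGER PRIMARY KEY, Title TEXT);
CREATE TABLE Contract_institutes (ContractID INTEGER, InstituteID INTEGER);

INSERT INTO Institutes VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
INSERT INTO Contracts VALUES (1, 'C1'), (2, 'C2'), (3, 'C3');
INSERT INTO Contract_institutes VALUES
    (1, 1), (1, 2),
    (2, 1), (2, 2), (2, 3),
    (3, 2), (3, 3);
""")

query = """
SELECT C.ID, I.ID, I.Name
FROM Contracts C
JOIN Contract_institutes CI ON C.ID = CI.ContractID
JOIN Institutes I ON I.ID = CI.InstituteID
WHERE CI.InstituteID <> ?
  AND CI.ContractID IN (SELECT CI2.ContractID
                        FROM Contract_institutes CI2
                        WHERE CI2.InstituteID = ?)
ORDER BY C.ID, I.ID
"""
my_institute = 1  # the logged-in user's institute (hypothetical value)
partners = conn.execute(query, (my_institute, my_institute)).fetchall()

print(partners)  # [(1, 2, 'Beta'), (2, 2, 'Beta'), (2, 3, 'Gamma')]
```

Note that the `<>` condition hides the user's own institute, so a contract whose only partner is that institute (like Contract3 in the question) returns no rows; dropping that condition lists every partner, including the user's own.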

More efficient to use subquery before inner joins?

I'm just in the process of learning MYSQL, and have something I've been wondering about.
Let's take this simple scenario: A hypothetical website for taking online courses, comprised of 4 tables: Students, Teachers, Courses and Registrations (one entry per course that a student has registered for)
You can find the DB generation code on github.
While the provided DB is tiny for clarity, to keep it relevant to what I need help with, let's assume that this is with a large enough database where efficiency would be a real issue - let's say hundreds of thousands of students, teachers, etc.
As far as I understand with MYSQL, if we want a table of students being taught by 'Charles Darwin', one possible query would be this:
Method 1
SELECT Students.name FROM Teachers
INNER JOIN Courses ON Teachers.id = Courses.teacher_id
INNER JOIN Registrations ON Courses.id = Registrations.course_id
INNER JOIN Students ON Registrations.student_id = Students.id
WHERE Teachers.name = "Charles Darwin"
which does indeed return what we want.
+----------------+
| name |
+----------------+
| John Doe |
| Jamie Heineman |
| Claire Doe |
+----------------+
So Here's my question:
With my (very) limited MYSQL knowledge, it seems to me that here we are JOIN-ing elements onto the teachers table, which could be quite large, while we are ultimately only after a single teacher, who we filter out at the very very end of the query.
My 'Intuition' Says that it would be much more efficient to first get a single row for the teacher we need, and then join the remaining stuff onto that instead:
Method 2
SELECT Students.name FROM (SELECT Teachers.id FROM Teachers WHERE Teachers.name =
"Charles Darwin") as Teacher
INNER JOIN Courses ON Teacher.id = Courses.teacher_id
INNER JOIN Registrations ON Courses.id = Registrations.course_id
INNER JOIN Students ON Registrations.student_id = Students.id
But is that really the case? Assuming thousands of teachers and students, is this more efficient than the first query? It could be that MYSQL is smart enough to parse the method 1 query in such a way that it runs more efficiently.
Also, if anyone could suggest an even more efficient query, I would be quite interested to hear it too.
Note: I've read before to use EXPLAIN to figure out how efficient a query is, but I don't understand MYSQL well enough to be able to decipher the result. Any insight here would be much appreciated as well.
My 'Intuition' Says that it would be much more efficient to first get
a single row for the teacher we need, and then join the remaining
stuff onto that instead:
You are getting a single row for teacher in method 1 by using the predicate Teachers.name = "Charles Darwin". The query optimiser should determine that it is more efficient to restrict the Teacher set using this predicate before joining the other tables.
If you don't trust the optimiser or want to lessen the work it does, you can force the table read order by using SELECT STRAIGHT_JOIN ..., or by writing STRAIGHT_JOIN instead of INNER JOIN, so that MySQL reads the tables in the order you have specified in the query.
Your second query returns the same result but may be less efficient, because MySQL may materialize your teacher subquery as a temporary table.
The EXPLAIN documentation is a good source on how to interpret the EXPLAIN output.
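As a quick sanity check that the two methods are interchangeable in terms of results, here is a sketch that runs both against invented rows. The table and column names come from the queries in the question; the data is made up, and Python's sqlite3 stands in for MySQL, so this says nothing about how MySQL's optimiser actually plans either query.

```python
import sqlite3

# Column names are taken from the question's queries; rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Teachers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Students (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Courses (id INTEGER PRIMARY KEY, teacher_id INTEGER);
CREATE TABLE Registrations (student_id INTEGER, course_id INTEGER);

INSERT INTO Teachers VALUES (1, 'Charles Darwin'), (2, 'Marie Curie');
INSERT INTO Students VALUES
    (1, 'John Doe'), (2, 'Jamie Heineman'), (3, 'Claire Doe'), (4, 'Sam Roe');
INSERT INTO Courses VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO Registrations VALUES (1, 1), (2, 1), (3, 2), (4, 3);
""")

# Method 1: filter on the teacher name in the WHERE clause.
method_1 = """
SELECT Students.name FROM Teachers
INNER JOIN Courses ON Teachers.id = Courses.teacher_id
INNER JOIN Registrations ON Courses.id = Registrations.course_id
INNER JOIN Students ON Registrations.student_id = Students.id
WHERE Teachers.name = ?
"""

# Method 2: select the teacher row in a derived table first.
method_2 = """
SELECT Students.name
FROM (SELECT Teachers.id FROM Teachers WHERE Teachers.name = ?) AS Teacher
INNER JOIN Courses ON Teacher.id = Courses.teacher_id
INNER JOIN Registrations ON Courses.id = Registrations.course_id
INNER JOIN Students ON Registrations.student_id = Students.id
"""

rows_1 = sorted(conn.execute(method_1, ("Charles Darwin",)).fetchall())
rows_2 = sorted(conn.execute(method_2, ("Charles Darwin",)).fetchall())

print(rows_1 == rows_2)  # True
```

On the real database, running EXPLAIN on each form in MySQL is the way to compare the plans; the sketch only confirms the two forms agree on the result set.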

Getting stuck doing a complicated SQL query for patent research purposes

I am trying to gather data for a research study for my university thesis. Unfortunately I am not a computer science or programming expert and do not have any SQL experience.
For my thesis I need to write an SQL query answering the question: "Give me all patents of a company X where there is more than one applicant (other company) in a specific time span". The data I want to extract is stored in a database called PATSTAT (for which I have a 1-month trial), which uses, don't be surprised, SQL.
I tried a lot of queries but I keep getting different syntax errors.
This is what the interface looks like:
http://www10.pic-upload.de/07.07.13/7u5bqf7jsow.png
I think I have a really good understanding of what (also from an SQL POV) needs to be done but I cannot execute it.
My idea: As result I want the names of the companies (with reference to the company entered below)
SELECT person_name from tls206_person table
Now because I need a criteria like
WHERE nb_applicants > 1 from tls201_appln table
I need to join these two tables, tls206 and tls201. I read a brief introductory guide to SQL (provided by the European Patent Office), and because the two tables have no common "reference key" we need to use the table tls207_pers_appln as an "intermediate", so to speak. That's the point where I am getting stuck. I tried the following, but it is not working:
SELECT person_name, tls201_appln.nb_applicants
FROM tls206_person
INNER JOIN tls207_pers_appln ON tls206_person.person_id= tls207_pers_appln.person_id
INNER JOIN tls207_pers_appln ON tls201_appln.appln_id=tls201_appln.appln_id
WHERE person_name = "%Samsung%"
AND tls201_appln.nb_applicants > 1
AND tls201_appln.ipr_type = "PI"
I get the following error: "0:37:11 [SELECT - 0 row(s), 0 secs] [Error Code: 1064, SQL State: 0] Not unique table/alias: 'tls207_pers_appln'"
I think for just 4 hours of SQL my approach is not too bad, but I really need some guidance on how to proceed because I am not making any progress.
Ideally I would like to count (for every company) and for every row respectively how many "nb_applicants" were found.
If you need further information for giving me guidance, just let me know.
Looking forward to your answers.
Best regards
Kendels
Another way of doing the same thing, which you might find easier to understand (if you are new to SQL it is impressive you have got so far), is:
SELECT tls206_person.person_name, tls201_appln.nb_applicants
FROM tls206_person, tls207_pers_appln, tls201_appln
WHERE tls206_person.person_id = tls207_pers_appln.person_id
AND tls207_pers_appln.appln_id = tls201_appln.appln_id
AND tls206_person.person_name LIKE "%Samsung%"
AND tls201_appln.nb_applicants > 1
AND tls201_appln.ipr_type = "PI"
(It's equivalent to the other answer, but instead of using the JOIN syntax you list the tables and write all the join logic in the WHERE clause, and SQL is smart enough to make it work. This is usually called the implicit or comma join syntax, if you want to google for more info. It is the older form, and the explicit JOIN ... ON syntax in the other answer is the newer ANSI one, so it should be supported by the database you are using.)
You are referencing the table tls201_appln, but it is not in the from clause. I am guessing that the second reference to tls207_pers_appln should be to the other table:
SELECT person_name, tls201_appln.nb_applicants
FROM tls206_person
INNER JOIN tls207_pers_appln ON tls206_person.person_id = tls207_pers_appln.person_id
INNER JOIN tls201_appln ON tls201_appln.appln_id = tls207_pers_appln.appln_id
WHERE person_name LIKE '%Samsung%'
AND tls201_appln.nb_applicants > 1
AND tls201_appln.ipr_type = "PI"
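To illustrate both the join through tls207_pers_appln and why LIKE is needed for the % wildcards, here is a minimal sketch. It uses only the columns mentioned in the question, a handful of invented rows, and Python's sqlite3 in place of the PATSTAT MySQL interface; the short aliases p, pa and a are added for readability.

```python
import sqlite3

# Only the columns named in the question are modelled; rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tls206_person (person_id INTEGER PRIMARY KEY, person_name TEXT);
CREATE TABLE tls201_appln (
    appln_id INTEGER PRIMARY KEY, nb_applicants INTEGER, ipr_type TEXT);
CREATE TABLE tls207_pers_appln (person_id INTEGER, appln_id INTEGER);

INSERT INTO tls206_person VALUES (1, 'Samsung Electronics'), (2, 'Sony Corp');
INSERT INTO tls201_appln VALUES (10, 2, 'PI'), (11, 1, 'PI');
INSERT INTO tls207_pers_appln VALUES (1, 10), (2, 10), (1, 11);
""")

# {op} is filled with either "=" or "LIKE" to compare the two behaviours.
template = """
SELECT p.person_name, a.nb_applicants
FROM tls206_person p
INNER JOIN tls207_pers_appln pa ON p.person_id = pa.person_id
INNER JOIN tls201_appln a ON a.appln_id = pa.appln_id
WHERE p.person_name {op} ?
  AND a.nb_applicants > 1
  AND a.ipr_type = 'PI'
"""

# "=" compares against the literal string '%Samsung%', so nothing matches.
with_equals = conn.execute(template.format(op="="), ("%Samsung%",)).fetchall()

# LIKE treats % as a wildcard and finds the Samsung applicant.
with_like = conn.execute(template.format(op="LIKE"), ("%Samsung%",)).fetchall()

print(with_equals)  # []
print(with_like)    # [('Samsung Electronics', 2)]
```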
For my thesis I need to do a SQL query answering the question: "Give me all patents of a company X where there is more than one applicant (other company) in a specific time span".
Let me rephrase that for you :
SELECT * FROM patents p -- : "Give me all patents
WHERE p.company = 'X' -- of a company X
AND EXISTS ( -- where there is
SELECT *
FROM applicants x1
WHERE x1.patent_id = p.patent_id
AND x1.company <> 'X' -- another company:: exclude ourselves
AND x1.application_date >= $begin_date -- in a specific time span
AND x1.application_date < $end_date
-- more than one applicant (other company)
-- To avoid aggregation: Just repeat the same subquery
AND EXISTS ( -- where there is
SELECT *
FROM applicants x2
WHERE x2.patent_id = p.patent_id
AND x2.company <> 'X' -- another company:: exclude ourselves
AND x2.company <> x1.company -- :: exclude other other company, too
AND x2.application_date >= $begin_date -- in a specific time span
AND x2.application_date < $end_date
)
)
;
[Note: Since the OP did not give any table definitions, I had to invent these]
This is not the perfect query, but it does express your intentions. Given sane keys/indexes it will perform reasonably, too.
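A sketch of the nested EXISTS query on sample data, to show which patents it keeps. The patents and applicants tables are the invented ones from the answer above, the rows are made up, and the $begin_date / $end_date placeholders become named parameters. Python's sqlite3 stands in for MySQL.

```python
import sqlite3

# Invented tables (as in the answer) and invented rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patents (patent_id INTEGER PRIMARY KEY, company TEXT);
CREATE TABLE applicants (patent_id INTEGER, company TEXT, application_date TEXT);

INSERT INTO patents VALUES (1, 'X'), (2, 'X'), (3, 'Y');
INSERT INTO applicants VALUES
    (1, 'A', '2010-01-15'), (1, 'B', '2010-06-01'),  -- two other applicants
    (2, 'A', '2010-03-01'),                          -- only one other applicant
    (3, 'A', '2010-01-15'), (3, 'B', '2010-02-01');  -- patent of company Y
""")

query = """
SELECT p.patent_id
FROM patents p
WHERE p.company = :company
  AND EXISTS (
    SELECT 1 FROM applicants x1
    WHERE x1.patent_id = p.patent_id
      AND x1.company <> :company
      AND x1.application_date >= :begin_date
      AND x1.application_date < :end_date
      AND EXISTS (
        SELECT 1 FROM applicants x2
        WHERE x2.patent_id = p.patent_id
          AND x2.company <> :company
          AND x2.company <> x1.company
          AND x2.application_date >= :begin_date
          AND x2.application_date < :end_date
      )
  )
"""
params = {"company": "X", "begin_date": "2010-01-01", "end_date": "2011-01-01"}
matching = conn.execute(query, params).fetchall()

print(matching)  # [(1,)]
```

Only patent 1 qualifies: it belongs to X and has two distinct other applicants inside the date range. Dates are stored as ISO strings here so that plain string comparison orders them correctly.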