retrieve correct data from my schemas - relational-database

I would like to retrieve the Names of teachers who teach more than 1 course
I am really stuck on how to do it. All I know is that I need to consider the Course schema and operate on it. Could I get some advice in terms of pi (projection), sigma (condition), rho (rename), etc. (not in any specific database language)?

Since this is homework and you basically need to read & work through a textbook's absolutely introductory text & exercises on relational model basics & the relational algebra, I give some guiding questions tailored to your assignment.
A relation (given or query result) comes with a predicate, ie a statement template parameterized by attributes. A relation holds the tuples that make a true statement from its predicate. PKs & FKs are not needed for querying.
What is a relation expression for the tuples where...
the person identified by teacherid teaches the course identified by cid, which is named name? (Answer: Course.)
teacherid teaches cid named name? (Same answer. Why?)
teacherid teaches cid AND cid is named name? (Same answer. Why?)
(We can infer from your assignment query that the Course & Teacher predicates refer to persons or you couldn't get at teacher names.)
t teaches c named n?
t teaches c named n AND c = 101?
t teaches c named n AND t occupies o?
t teaches some course named some name?
for some c, t teaches c named some name? (Same answer. Why?)
for some c, t teaches c named something AND c = 101? (Why do we need FOR SOME?)
i ids a student AND NOT i takes some course taught by some teacher?
some student takes some course taught by t OR t occupies some office?
Thus: We compose a query predicate for the tuples we want using logic operators and the given predicates. Then we get an expression that calculates them by converting logic operators to relation operators and given predicates to given relations. (It can be tedious rearranging to get two relations with the same attributes in order to use AND NOT (MINUS) or OR (UNION).)
See this.
retrieve the Names of teachers who teach more than 1 course
You want tuples where attribute Name is the name of a person and for some two values the person teaches the first one's course and they teach the second one's course and those values/courses are not the same.
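As a rough illustration of that predicate, here is a minimal Python sketch that treats relations as sets of tuples. The attribute layouts Teacher(teacherid, name) and Course(cid, cname, teacherid) and all sample rows are assumptions made for the example, not the schemas from the assignment.

```python
# Hypothetical relations as sets of tuples; the attribute layout is assumed.
# Teacher(teacherid, name), Course(cid, cname, teacherid)
teacher = {(1, "Ada"), (2, "Bob"), (3, "Cy")}
course = {(101, "Algebra", 1), (102, "Logic", 1), (103, "Sets", 2)}

# rho:   pair Course with a renamed copy of itself,
# sigma: keep pairs with the same teacher but different courses,
# pi:    project onto the teacher id.
busy_ids = {
    t1
    for (c1, _, t1) in course
    for (c2, _, t2) in course
    if t1 == t2 and c1 != c2
}

# Join with Teacher and project onto the name.
busy_names = {name for (tid, name) in teacher if tid in busy_ids}
print(busy_names)
```

The two nested loops play the role of the product of Course with its renamed copy; the `if` is the selection, and the set comprehension result is the projection.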

Related

How to join 2 sql tables where one table contains multiple values in a single column

Currently, this is what my SELECT code looks like:
SELECT student.stu_code, user.f_name, user.l_name
FROM user
INNER JOIN student
ON student.stu_code = user.user_id
INNER JOIN course
ON course.stu_code ?????;
Basically, to elaborate the student table inherits from user table, therefore I had user_id = stu_code. What I'm confused about is how to join course table with student table.
Let's say that the course table has a course code (PK), a few other attributes and a stu_code column, however, the student code column has multiple values inside a single column to represent that multiple students are taking the course and stored as VARCHAR.
Example: Student table has stu_code string value of '123' and course table has a stu_code with string value of '123, 246, 369'.
How would I go about joining these two tables together and separating the stu_code in the course table so that it represents 3 separate stu_code values -> i.e. '123', '246', '369'.
Any help is greatly appreciated!
however, the student code column has multiple values inside a single column to represent that multiple students are taking the course and stored as VARCHAR.
Your data model is broken. Put your effort into fixing the data model. You want a junction/association table courseStudents or perhaps enrolled, with columns like:
stu_code (foreign key to students)
course_code (foreign key to courses)
enrollment_date
and so on
What is wrong with your data model? Here are a few things:
You are storing numbers as a string.
You are putting multiple values into a string column.
You cannot define foreign key relationships.
SQL has poor string handling capabilities.
SQL has a great way to store lists of things. It is not called "string". It is called "table".
Your data model is ~broken~ hindering you from elegant solutions.
You cannot join your two tables efficiently. While they might both contain strings they do not contain data with the same rules. Thus, you must transform the data in order to join them so you could do this in a few ways but one way is using regular expression function.
You can use it to evaluate a test on whether the stu_code matches the list of codes. Further, you can do this dynamically ... constructing the test string itself based upon values from the left and right
join based on REGEXP
SELECT student.stu_code, user.f_name, user.l_name
FROM user
INNER JOIN student
ON student.stu_code = user.user_id
INNER JOIN course
ON course.stu_code REGEXP CONCAT('[[:<:]]', student.stu_code, '[[:>:]]')
Assuming tables and data:
Student
- - - -
stu_code
123
Course
- - - -
stu_code
'123, 246, 369'
Example:
http://sqlfiddle.com/#!9/672b57f/4
about the regular expression
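In MySQL the regex syntax can be a little bit different. [[:<:]] and [[:>:]] are the word-boundary markers (start of word and end of word) in Henry Spencer's notation.
If you have a new enough version of MySQL/MariaDB you can use the more typical ICU notation of \b. In MySQL 8.0.4 and later, which uses ICU, the Spencer markers are no longer supported, so \b is the one to use there.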
more about that here : https://dev.mysql.com/doc/refman/8.0/en/regexp.html
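To see why the word boundaries matter, here is a small sketch using Python's `re` module rather than MySQL itself. The helper name `takes_course` is hypothetical; `\b` is Python's word-boundary assertion, which behaves like the boundary markers described above for this kind of data.

```python
import re

def takes_course(stu_code: str, code_list: str) -> bool:
    """Return True if stu_code appears as a whole token in code_list."""
    pattern = r"\b" + re.escape(stu_code) + r"\b"
    return re.search(pattern, code_list) is not None

print(takes_course("123", "123, 246, 369"))  # whole code present
print(takes_course("23", "123, 246, 369"))   # only a fragment of "123"
```

Without the boundaries, "23" would match inside "123" and the join would return students who are not on the course.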
about efficiency
On large datasets the performance will be awful: every row has to be scanned and the function evaluated on each one. On a large set you might get some gains by joining on LIKE first (which is faster than REGEXP). LIKE is much faster at filtering out rows that cannot match, and REGEXP can then confirm the rows that remain.
Perhaps your model was based upon an assumption of having a courses table with very few rows?
It is ironic, because you have made your course table unnecessarily large. You would actually be better off with an intermediary table that represents the many-to-many nature (the fact that students can take many courses and courses can have many students) with 1 row per unique relationship. While this table would be an order of magnitude "longer", it would be leaner, it could be indexed, and query performance would be faster.
The courses table does not need to have any awareness of the student list and thus you can alter courses by removing courses.stu_code once you change the model (aside: It might be useful if courses cached a hint of the expected student count for that course)
possible link table
would be a new table like this (note how it only ever needs these 2 columns)
stu_course_lnk
- - - - - - - -
stu_code course_id
123 ABC
124 ABC
...
123 XYZ
...
124 LMN
then you add joins of
...
student.stu_code = stu_course_lnk.stu_code
and
stu_course_lnk.course_id = course.id
...
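A minimal sketch of the link-table approach, run against an in-memory SQLite database through Python's `sqlite3` module. The column types, the key constraints, and the sample rows are assumptions; only the table and column names come from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (stu_code INTEGER PRIMARY KEY);
    CREATE TABLE course  (id TEXT PRIMARY KEY);
    -- One row per (student, course) relationship.
    CREATE TABLE stu_course_lnk (
        stu_code  INTEGER NOT NULL REFERENCES student(stu_code),
        course_id TEXT    NOT NULL REFERENCES course(id),
        PRIMARY KEY (stu_code, course_id)
    );
    INSERT INTO student VALUES (123), (124);
    INSERT INTO course  VALUES ('ABC'), ('XYZ'), ('LMN');
    INSERT INTO stu_course_lnk VALUES
        (123, 'ABC'), (124, 'ABC'), (123, 'XYZ'), (124, 'LMN');
""")

# Plain equality joins; no string parsing is needed.
rows = conn.execute("""
    SELECT student.stu_code, course.id
    FROM student
    JOIN stu_course_lnk ON student.stu_code = stu_course_lnk.stu_code
    JOIN course         ON stu_course_lnk.course_id = course.id
    ORDER BY student.stu_code, course.id
""").fetchall()
print(rows)
```

The composite primary key prevents the same enrollment from being recorded twice and gives the database an index to use for the join.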

How to design a simple database

I want to model a student, teacher, class relationship. Every student is associated with one teacher (the teacher can have many students). There are only three classes. The way I think of this is that there are three tables:
Student Table -> (student_id, student_name, class_id)
Teacher Table -> (student_id, student_name, class_id)
Class Table -> (class_id, class_name)
I'm not sure how to show the student-teacher relationship within the tables. How would we know which teacher is assigned to which student?
This can be accomplished with some simple joins.
Assuming that you want to find all the students associated with a certain teacher, you would start off by grabbing the row for the teacher. You would then join in the classes that the teacher teaches. Finally, you would join in the students that are in those classes.
This is known as a many-to-many relationship, and is an important concept in databases.
select
t.student_name, -- I suspect this col might actually be named teacher_name
s.student_name
from
-- Find the classes that a teacher teaches
teacher_table t join class_table c on (t.class_id=c.class_id)
-- Find the students in those classes
join student_table s on (s.class_id=c.class_id)
where
t.student_id = ? -- Again, I suspect this should be "teacher_id"
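The same query can be tried end to end with Python's `sqlite3` module. This sketch assumes the teacher columns are really named teacher_id and teacher_name (as suspected in the comments above) and uses made-up sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE class_table   (class_id INTEGER PRIMARY KEY, class_name TEXT);
    CREATE TABLE teacher_table (teacher_id INTEGER PRIMARY KEY,
                                teacher_name TEXT, class_id INTEGER);
    CREATE TABLE student_table (student_id INTEGER PRIMARY KEY,
                                student_name TEXT, class_id INTEGER);
    INSERT INTO class_table   VALUES (1, 'Math'), (2, 'Art');
    INSERT INTO teacher_table VALUES (10, 'Ms. Lee', 1), (11, 'Mr. Roy', 2);
    INSERT INTO student_table VALUES (100, 'Ann', 1), (101, 'Ben', 1), (102, 'Cat', 2);
""")

# Teacher -> class -> students in that class, for one teacher id.
rows = conn.execute("""
    SELECT t.teacher_name, s.student_name
    FROM teacher_table t
    JOIN class_table   c ON t.class_id = c.class_id
    JOIN student_table s ON s.class_id = c.class_id
    WHERE t.teacher_id = ?
    ORDER BY s.student_name
""", (10,)).fetchall()
print(rows)
```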
This is a few more tables than you want, but several of the .NET examples that Microsoft has created revolve around a similar relational database.
Here is a link to that database:
https://msdn.microsoft.com/en-us/library/bb399731(v=vs.100).aspx
In this example, the student and the teacher are both kept in the person table and are related to the course table through two different Joining tables . . student grade and course instructor.
And here is the Contoso University Schema with link:
https://learn.microsoft.com/en-us/aspnet/mvc/overview/getting-started/getting-started-with-ef-using-mvc/creating-a-more-complex-data-model-for-an-asp-net-mvc-application

How to get redundant consistent (duplicate) data from the database using relational algebra?

I have 2 relations (tables):
Shops (Postcode (PK), SuburbName, BottleShopName (PK), Address),
People (PersonName (PK), Postcode (PK), AlcoholConsumption)
If I write down a query that will return the name of the each bottle shop in the
database and its suburb using relational algebra, it would look like this:
π BottleShopName, SuburbName (Shops).
The limitation of this query is that it would not show any redundant data. Say if there are two different Bottle shop names with the same name and are in the same suburb having the same post code, the above query would ignore the second one.
What modifications should I make to this query to get both results explicitly using relational algebra?
Since this question is missing the used query, here's what I understood of your question:
SELECT BottleShopName, SuburbName FROM Shops JOIN People ON People.Postcode = Shops.Postcode WHERE People.PersonName = ?
Say if there are two different Bottle shop names with the same name and are in the same suburb having the same post code, the above query would ignore the second one.
Relational algebra relations are sets of tuples. They cannot have two tuples with all the same values for attributes.
A relation holds the rows that make a true statement from a given predicate--a fill-in-the-blanks sentence template parameterized by attribute names.
Suppose that Shops holds rows where "a bottle shop has postal code Postcode, suburb name SuburbName, name BottleShopName and address Address". If, for some particular values of Postcode, SuburbName, BottleShopName & Address, there is more than one shop with postal code Postcode, suburb name SuburbName, name BottleShopName and address Address then there is still only one tuple with those values in the relation.
If there could be more than one shop and you want to know that then you need a different predicate. You could keep a count of similar shops: "N bottle shops have postal code Postcode, suburb name SuburbName, name BottleShopName and address Address". Or you could assign a unique name or id to each shop and then use: "a bottle shop identified by id has postal code Postcode, suburb name SuburbName, name BottleShopName and address Address". (Still no duplicates possible.)
The reason relations are sets of similar tuples in the relational model is that if relations R and S have predicates r and s then:
R JOIN S has predicate (holds the tuples where) r AND s,
R UNION S has predicate (holds the tuples where) r OR s,
R MINUS S has predicate (holds the tuples where) r AND NOT s,
RESTRICT condition R has predicate (holds the tuples where) r AND condition,
RENAME A N R has the predicate (holds the tuples where) r with A replaced by N, and
PROJECT A R has predicate (holds the tuples where) FOR SOME attributes other than A, r.
Ie a query also has a predicate and its tuples are also the ones that make its predicate into a true statement. So querying means writing a predicate for which we want the satisfying tuples.
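A small Python sketch of the set behaviour described above, with relations modelled as sets of tuples. The sample rows are invented; the attribute order Shops(Postcode, SuburbName, BottleShopName, Address) follows the question.

```python
from collections import Counter

# Shops(Postcode, SuburbName, BottleShopName, Address); rows are made up.
shops = {
    (2000, "Sydney", "Cellar", "1 George St"),
    (2000, "Sydney", "Cellar", "9 Pitt St"),
    (3000, "Melbourne", "Vine", "5 Collins St"),
}

# PROJECT BottleShopName, SuburbName:
# "for some Postcode and Address, a bottle shop has ...".
# The result is a set, so the two Sydney "Cellar" shops collapse into one tuple.
names = {(shop, suburb) for (_, suburb, shop, _) in shops}
print(len(shops), len(names))

# A different predicate keeps the information:
# "N bottle shops are named BottleShopName in suburb SuburbName".
counts = Counter((shop, suburb) for (_, suburb, shop, _) in shops)
print(counts[("Cellar", "Sydney")])
```

The projection loses nothing that its predicate claims to state; if the number of shops matters, the count has to be part of the predicate, as in the second result.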
In SQL, tables can have duplicate rows. So you cannot reason this way about what rows should be in a table in a given application situation, what a table states about the application situation in terms of the rows in it, or what rows a query is asking for in terms of its inputs, its operator and the application situation--except in special cases. (Eg it doesn't make sense to talk about a row stating something by being in a table.) This is perhaps the most profound way that SQL violates the relational model and is pointlessly harder to use.

Is un-normalised data always contained in a single table?

For an assignment I have created a database driven web application. I have to show my understanding of normalisation by showing my database in de-normalised form, and then normalising it gradually, explaining what was done at each stage.
The normalisation process at stages 1 to 3 (which is as far as we have to go) I have no trouble understanding.
My database contains 20+ tables and I don't know how I am supposed to represent this in 0NF. The main difficulty is due to the fact that, as I have understood, 0NF data is in a single table. In fact, I don't see any way around this because 0NF has no primary keys, and therefore there would be no way to reference data in other tables.
Am I right in thinking this? Or can I represent 0NF data in multiple tables, which would make this task a lot easier as I wouldn't have a 100+ column table.
0NF is a single table - like a spreadsheet of data. You wouldn't reference any other tables, you would simply repeat the data in the one table.
For example, imagine a messaging system:
Customer | Recipient | Message
Bob | John | Hello John
John | Bob | Hello
Bob | John | Have you got time to answer a question?
John | Bob | No way
We don't have a table containing the Person to link to, we repeat Bob or John in the customer column and in the recipient column.
0NF data can occur in multiple tables, each of which may be 0NF, but one table for everything is the worst form.
This may very well be the case of an assignment where you first have to mess up your spontaneous solution, so you can show the process of how to make it better.
You mean "unnormalized" not "de-normalized". The latter is when normalized base tables are replaced by others whose values are always the join of the originals. You need to find out from whoever gave you the assignment whether unnormalized form here means your first design attempt or specifically a "universal relation" that is an appropriate join of all those. That would be de-normalizing.
Every base table and query result holds the rows that make some predicate (statement parameterized by columns) into a true proposition (statement).
SELECT * FROM EMP "employee [E] is named [N] and has dependent [D]"
SELECT * FROM DEP "employee [E] works for department [D]"
query SELECT E, N FROM EMP
for some D, "employee [E] is named [N] and has dependent [D]"
("employee [E] is named [N] and has some dependent")
An SQL FROM makes a temporary table that you can think of as having columns T.C for each column C of each table T. For inner JOINs (ie INNER, CROSS and plain) this temporary table is a cross join. Its predicate is the AND of the predicates of the joined tables. ON and WHERE conditions are also ANDed into the predicate. The SELECT clause renames the temporary columns so there are no "."s. (Although SQL does that implicitly if there's no ambiguity.)
query SELECT EMP.E AS E, N, DEP.D AS D FROM EMP JOIN DEP
"for some EMP.D, employee [EMP.E] is named [EMP.N] and has dependent [EMP.D]"
AND "employee [DEP.E] works for department [DEP.D]"
(ie "employee [E] is named [N] and has some dependent and works for department [D]")
Note that it doesn't matter what constraints hold. (Including UNIQUE, PRIMARY KEY,FOREIGN KEY & CHECK). Constraints just tell you that tables are limited in the values they will ever hold. In fact the constraints are determined by the predicates and the situations that can arise.
If you know that it's always the case that T1.C = T2.C for some column C of tables T1 & T2 then you only have to SELECT one of them, AS C. If every column C is always equal in every table then NATURAL JOIN does the appropriate = and the AS without having to mention any columns.
(More re predicates & SQL.)
PS The single-base version of a database is not a base whose value is the FULL ( OUTER ) JOIN of separate bases. First, normalization does not deal with NULLs, so you would have to remove them from any OUTER JOIN result, more or less giving you your tables back. Second, FULL JOIN is in general not associative, ie (T1 FULL JOIN T2) FULL JOIN T3 <> T1 FULL JOIN (T2 FULL JOIN T3), so there is no such thing as "the FULL JOIN" of more than two tables. Third, even with just two tables their FULL JOIN does not in general allow you to reconstruct their values.
PPS There is no "0th Normal Form". There are different uses of "1st Normal Form". Sometimes it just means being a relation, and sometimes it means being a relation with no relation-valued attributes, and it is also frequently used in various other confused/nonsensible ways that are really about aspects of good design.

Dealing with a curriculum database

Okay, I have 3 tables:
Students
Offered_subjects
Enrollees_list
Students will contain all the information of the students of the school
Offered_subjects will contain all the subjects offered by the school
Enrollees_list will contain all information about what subject a students is enrolled in and will also contain remarks for that subject (pass or fail).
Now, the subjects in offered_subjects contain courses that have prerequisites (ie. before qualifying for MySQL101, the student must have a passing remark in DBMS101)
The categories of prerequisites are:
Academic Year
Semester
Subject
Note that not all the subjects listed in offered_subjects have all the categories for its prerequisite. Some require to finish a certain subject, some require that the student must be in a certain academic year (ie, 3rd year), and some have all three.
What's required for the program is to display all the students that are qualified for a selected subject.
Let's say: MySQL101 has prerequisite of 2nd year, 2nd sem, DBMS101
I need to list all the students that are on their 2nd year, 2nd sem, and have a passing remark in DBMS101.
This would be easy if all the subjects have same categories for its prerequisites where I can put the same queries in the where clause, but my problem is that, again, not all the subjects listed in offered_subjects have all the categories for its prerequisite.
I'm new to MySQL and it's kind of confusing to me at the moment.
How do I do it?
I'm guessing you have the following columns on the Offered_subjects table:
Req_year
Req_semester
Req_subject
If one of your subjects doesn't have a value in one of these columns, you can skip the check for that column. Something like:
Select ... where (Subject.req_year is null OR Student.year >= Subject.req_year) ...
This way, required year will not be taken into account if it's null.
(note: I'm not completely sure of syntax right now, but that's the idea).
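Here is a minimal runnable sketch of that idea using Python's `sqlite3` module. All column names (year, semester, req_year, req_semester, req_subject, remark, and so on), the 'pass' value, and the sample rows are guesses at the schema, and treating the year as a minimum but the semester as an exact match is also an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT,
                           year INTEGER, semester INTEGER);
    CREATE TABLE offered_subjects (code TEXT PRIMARY KEY, req_year INTEGER,
                                   req_semester INTEGER, req_subject TEXT);
    CREATE TABLE enrollees_list (student_id INTEGER, subject_code TEXT, remark TEXT);
    INSERT INTO students VALUES (1, 'Ann', 2, 2), (2, 'Ben', 2, 2), (3, 'Cat', 1, 1);
    INSERT INTO offered_subjects VALUES
        ('DBMS101', NULL, NULL, NULL),
        ('MySQL101', 2, 2, 'DBMS101');
    INSERT INTO enrollees_list VALUES
        (1, 'DBMS101', 'pass'), (2, 'DBMS101', 'fail'), (3, 'DBMS101', 'pass');
""")

# Each prerequisite is checked only when it is not NULL for the subject.
QUALIFIED = """
    SELECT st.name
    FROM students st
    JOIN offered_subjects sub ON sub.code = ?
    WHERE (sub.req_year IS NULL OR st.year >= sub.req_year)
      AND (sub.req_semester IS NULL OR st.semester = sub.req_semester)
      AND (sub.req_subject IS NULL OR EXISTS (
            SELECT 1 FROM enrollees_list e
            WHERE e.student_id = st.id
              AND e.subject_code = sub.req_subject
              AND e.remark = 'pass'))
    ORDER BY st.name
"""

def qualified(subject_code):
    """Names of students meeting every non-NULL prerequisite of the subject."""
    return [row[0] for row in conn.execute(QUALIFIED, (subject_code,))]

print(qualified("MySQL101"))
print(qualified("DBMS101"))
```

A subject with no prerequisites at all (every req_ column NULL) lets every student through, which is the behaviour the answer describes.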