Tricky intersection in Relational Algebra

Hi there. For my exam revision I picked up the following sample question on relational algebra:
employee (+person_name, street, city)
works (+person_name, company_name, salary)
company (+company_name, city)
manages (+person_name, manager_name)
+ indicates the underlined primary keys.
Find the names of all employees who live in the same city and on the same street as their managers.
My solution:
JOIN manages and employee (OVER person_name) GIVING T1
JOIN manages and employee (OVER manager_name) GIVING T2
PROJECT T1 over person_name, street, city GIVING T3
PROJECT T2 over street, city GIVING T4
T3 intersect T4 GIVING T5
PROJECT T5 over person_name GIVING RESULT
This was my solution until I found out that the operands of an intersection have to be union-compatible (the same number of columns, with matching headings).
Since then I haven't been able to find a solution to this problem, because if I make the following change to line 3:
PROJECT T1 over street, city GIVING T3
then I can never link the result of the intersection back to person_name.
On the other hand, if I make the following change to line 4:
PROJECT T2 over person_name, street, city GIVING T4
then the intersection would never return a person who has a manager other than himself.
I would appreciate any hints. Perhaps this online sample I picked up is simply ambiguous.

Another way to phrase the question: for every manager+person pair, find those for which the related city+street are the same for both people. You almost did that:
JOIN manages AND employee (OVER person_name) GIVING T1
JOIN T1 AND employee (OVER manager_name, street, city) GIVING T2
PROJECT T2 OVER person_name, manager_name, street, city GIVING RESULT
The problem statement does not require the names to be returned in a single column, and this answer provides a useful result. If need be, you could repeat the above query, taking the union of two projections: one of person_name and the other of manager_name.
Just one thing: many managers would object to one column named "person" and the other "manager" because almost every manager -- your experience notwithstanding, perhaps? -- considers himself a person. More acceptable pairs might be manager/worker, lord/serf, master/slave, etc.
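For checking the idea on real rows, here is a minimal sketch in SQL, run through Python's built-in sqlite3 module. The sample names and addresses are made up for illustration and are not part of the exercise. The manager's address is found by joining employee a second time, matching manages.manager_name against that second copy's person_name, which is what the second JOIN above amounts to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (person_name TEXT PRIMARY KEY, street TEXT, city TEXT);
CREATE TABLE manages  (person_name TEXT PRIMARY KEY, manager_name TEXT);
-- Illustrative rows only (assumed, not from the exercise)
INSERT INTO employee VALUES
  ('Ann', 'Oak St', 'Leeds'),
  ('Bob', 'Oak St', 'Leeds'),
  ('Cat', 'Elm St', 'Leeds'),
  ('Dan', 'Oak St', 'York');
INSERT INTO manages VALUES ('Bob', 'Ann'), ('Cat', 'Ann'), ('Dan', 'Ann');
""")

# e is the employee's own row, mgr is the row of that employee's manager.
same_address = conn.execute("""
SELECT e.person_name
FROM manages m
JOIN employee e   ON e.person_name   = m.person_name
JOIN employee mgr ON mgr.person_name = m.manager_name
WHERE e.street = mgr.street AND e.city = mgr.city
ORDER BY e.person_name
""").fetchall()

print(same_address)
```

Only Bob shares both street and city with his manager in this sample, so he is the only name returned. No intersection is needed, which sidesteps the union-compatibility problem entirely.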

Related

Retrieve an overview of all countries that have at least one city, how many cities they have, and the average population of these cities

Another day and another MySQL problem. I've been scratching my head over this question for quite a while now.
My task, using a database called "world", is to retrieve an overview of countries with at least one city, how many cities they have, and the average population of these cities. I would also like to name the average population column with "(AS AverageCityPopulation)" and the number of cities column with "(AS NumberOfCities)".
I've just started to learn about JOIN, LEFT JOIN and RIGHT JOIN as well, and I am pretty certain that I have to use one of those three to complete the task. I'm still trying to find a helpful way to memorize when to use each of them (if you have a tip, please leave it below).
Anyway, I feel the data should be laid out like this:
countrycode | countryname
First | row
Second | row

cityname | citycountrycode
First | row
Second | row

averagecitypop | numberofcities
First | row
Second | row
Of course the data should be displayed side by side, but that is a bit hard to make work on Stack Overflow. Anyway, I have tried multiple queries so far, but still haven't found the answer. The closest I got was the average population over all cities, returned in a single row for Aruba.
my current query is:
SELECT
country.name,
country.code,
city.name,
AVG(city.population) AS averageCityPop,
city.countrycode
FROM
world.city
right JOIN
world.country ON city.CountryCode = country.code
where city.CountryCode > 1
Again, I am relatively new, so besides answers to my question, any reading material or curriculum is appreciated. If you also know any good YouTube channels or forums that are helpful for learning MySQL, that would be great!
Thanks for any helpful answers <3
Here are a few screenshots of the two tables I'm trying to connect:
world.city
world.country
Note that the database I use is MySQL sample database - World.
For beginners: both tables have primary keys (for table country, it is 'code'; for table city, it is 'id'), and since only countries with at least one city are wanted, an inner join is enough.
SELECT co.code AS country_code,
co.name AS country_name,
COUNT(*) AS num_cities,
AVG(ci.population) AS avg_city_pop
FROM country co INNER JOIN city ci ON (co.code = ci.countrycode)
GROUP BY co.code;
Or if you want to show the name of each city:
SELECT co.code AS country_code,
co.name AS country_name,
ci.name AS city_name,
COUNT(*) OVER w AS num_cities,
AVG(ci.population) OVER w AS avg_city_pop
FROM country co INNER JOIN city ci ON (co.code = ci.countrycode)
WINDOW w AS (PARTITION BY co.code);
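On the question of when to use which join, a small comparison may help. This is only a sketch using Python's built-in sqlite3 module with three made-up countries (not the real world database): an INNER JOIN keeps only countries that have a matching city, while a LEFT JOIN from country keeps every country and fills the city side with NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country (code TEXT PRIMARY KEY, name TEXT);
CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT,
                   countrycode TEXT, population INTEGER);
-- Made-up sample rows; 'CCC' deliberately has no city.
INSERT INTO country VALUES ('AAA', 'Alpha'), ('BBB', 'Beta'), ('CCC', 'Gamma');
INSERT INTO city (name, countrycode, population) VALUES
  ('A1', 'AAA', 100), ('A2', 'AAA', 300), ('B1', 'BBB', 50);
""")

# INNER JOIN: countries without a city drop out, as the task requires.
inner_rows = conn.execute("""
SELECT co.code,
       COUNT(*)           AS NumberOfCities,
       AVG(ci.population) AS AverageCityPopulation
FROM country co INNER JOIN city ci ON co.code = ci.countrycode
GROUP BY co.code
ORDER BY co.code
""").fetchall()

# LEFT JOIN: every country is kept; COUNT(ci.id) ignores the NULL city row.
left_rows = conn.execute("""
SELECT co.code,
       COUNT(ci.id)       AS NumberOfCities,
       AVG(ci.population) AS AverageCityPopulation
FROM country co LEFT JOIN city ci ON co.code = ci.countrycode
GROUP BY co.code
ORDER BY co.code
""").fetchall()

print(inner_rows)
print(left_rows)
```

A RIGHT JOIN is the same as a LEFT JOIN with the two tables swapped, so in practice it is enough to remember inner (matches only) versus left (keep everything from the first table).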

MySQL Change column value to the least value in its category

Assuming only one school can be identified in a specific town, you are required to move students from duplicated schools to originally created schools. We also assume that the lowest school id implies the first school to be created in the database, hence the original school.
I want to achieve this simply by changing the school ids for duplicate school+town to the smallest school id (the original) in that category. This will take care of the student records table that is linked to this one via foreign key (school id).
How would I go about doing this on the table attached? I'm thinking along the lines of SELECT MIN, CASE STATEMENTS as well as GROUP BY and COUNT() but I'm at a loss on how to combine these. Anyone have an idea/code on how I would achieve the requirement above?
I'd assume that the school id is a unique identifier (key). Therefore you can't just update it in the schools table. You'd rather need to update the school_id column in the students table to point to the original school's id.
If this is the case you can do something along the lines of
-- get all students and their current school info for update
UPDATE students st JOIN schools sc
ON st.school_id = sc.id JOIN (
-- get ids of original schools
SELECT town, name, MIN(id) id
FROM schools
GROUP BY town, name
) q -- join students with the list of original school ids
ON sc.town = q.town AND sc.name = q.name
-- change the school id to the original one
SET st.school_id = q.id
-- but only for students that were associated with non-original schools
WHERE st.school_id <> q.id
Here is dbfiddle demo
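In case the demo is not reachable, the same repointing can be tried locally. This is a sketch under assumptions: it uses Python's built-in sqlite3 module, the table layout and sample rows are made up, and because SQLite has no UPDATE ... JOIN form the MySQL statement is rewritten with a correlated subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE schools (id INTEGER PRIMARY KEY, name TEXT, town TEXT);
CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT,
                       school_id INTEGER REFERENCES schools(id));
-- Made-up rows: school 3 duplicates school 1 (same name and town).
INSERT INTO schools VALUES
  (1, 'North High', 'Springfield'),
  (2, 'East High',  'Shelbyville'),
  (3, 'North High', 'Springfield');
INSERT INTO students VALUES (10, 'Amy', 1), (11, 'Ben', 3), (12, 'Cal', 2);
""")

# For each student, look up the current school, then take the smallest id
# among all schools sharing its name and town.
conn.execute("""
UPDATE students
SET school_id = (
    SELECT MIN(orig.id)
    FROM schools cur
    JOIN schools orig ON orig.town = cur.town AND orig.name = cur.name
    WHERE cur.id = students.school_id
)
WHERE school_id IN (SELECT id FROM schools)
""")

assignments = conn.execute(
    "SELECT name, school_id FROM students ORDER BY id"
).fetchall()
print(assignments)
```

Ben moves from the duplicate (id 3) to the original (id 1), and the other two students keep their schools. Once no student points at a duplicate, the duplicate school rows can be deleted safely.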

Why would a SQL query need to be so complicated like this feature allows?

I am studying for SQL exam, and I came across this fact, regarding subqueries:
2. Main query and subquery can get data from different tables
When is a case when this feature would be useful? I find it difficult to imagine such a case.
Millions of situations call for finding information in different tables; it's the basis of relational data. Here's an example:
Find the emergency contact information for all students who are in a chemistry class:
SELECT Emergency_Name, Emergency_Phone
FROM tbl_StudentInfo
WHERE StudentID IN (SELECT b.StudentID
FROM tbl_ClassEnroll b
WHERE Subject = 'Chemistry')
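To see the chemistry query above return something, here is a small sketch with Python's built-in sqlite3 module. The column types and sample rows are assumptions made for the demo. It also runs an equivalent JOIN, to show that the subquery is one of two common ways to combine the tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_StudentInfo (StudentID INTEGER PRIMARY KEY,
                              Emergency_Name TEXT, Emergency_Phone TEXT);
CREATE TABLE tbl_ClassEnroll (StudentID INTEGER, Subject TEXT);
-- Made-up sample rows
INSERT INTO tbl_StudentInfo VALUES
  (1, 'Mrs. Lee', '555-0101'),
  (2, 'Mr. Diaz', '555-0102'),
  (3, 'Ms. Roy',  '555-0103');
INSERT INTO tbl_ClassEnroll VALUES
  (1, 'Chemistry'), (1, 'Math'), (2, 'Math'), (3, 'Chemistry');
""")

# Main query reads tbl_StudentInfo, subquery reads tbl_ClassEnroll.
via_subquery = conn.execute("""
SELECT Emergency_Name, Emergency_Phone
FROM tbl_StudentInfo
WHERE StudentID IN (SELECT b.StudentID
                    FROM tbl_ClassEnroll b
                    WHERE b.Subject = 'Chemistry')
ORDER BY Emergency_Name
""").fetchall()

# Same result written as a join; DISTINCT guards against duplicate enrolments.
via_join = conn.execute("""
SELECT DISTINCT s.Emergency_Name, s.Emergency_Phone
FROM tbl_StudentInfo s
JOIN tbl_ClassEnroll e ON e.StudentID = s.StudentID
WHERE e.Subject = 'Chemistry'
ORDER BY s.Emergency_Name
""").fetchall()

print(via_subquery)
```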
SELECT * FROM tableA
WHERE id IN (SELECT id FROM tableB)
There are plenty of reasons to get data from different tables, such as selecting something in the main query based on one or more subqueries against other tables. The range of uses is huge.
For example, choose customers in the main query based on regions and their values:
SELECT * FROM customers
WHERE country IN(SELECT name FROM country WHERE name LIKE '%land%')
Or choose products in the main query that are above or below the average income of customers, and so on.
You could do something like,
SELECT SUM(trans) as 'Transactions', branch.city as 'city'
FROM account
INNER JOIN branch
ON branch.bID = account.bID
GROUP BY branch.city
HAVING SUM(account.trans) < 0;
This would allow a company to identify which branches are making a loss (without the HAVING clause, the same query lists every branch, including the most profitable). It would help identify whether the company has to change its marketing approach in certain regions, in theory allowing the company to become more dynamic and reactive to changes in the economy at any given time.
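A quick way to try the branch query is the sketch below, using Python's built-in sqlite3 module. The table definitions and the transaction amounts are made up for the demo. Only the column names bID, city and trans come from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE branch  (bID INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE account (aID INTEGER PRIMARY KEY, bID INTEGER, trans INTEGER);
-- Made-up rows: Leeds nets +300, York nets -200.
INSERT INTO branch  VALUES (1, 'Leeds'), (2, 'York');
INSERT INTO account VALUES (1, 1, 500), (2, 1, -200), (3, 2, -300), (4, 2, 100);
""")

# HAVING filters on the aggregated total, after GROUP BY has run.
loss_making = conn.execute("""
SELECT SUM(account.trans) AS Transactions, branch.city AS city
FROM account
INNER JOIN branch ON branch.bID = account.bID
GROUP BY branch.city
HAVING SUM(account.trans) < 0
""").fetchall()

print(loss_making)
```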

How Do I Join Tables to Return a Specific Number of People Per Team Who Fit Certain Criteria?

This is a pretty specific one:
I have two tables (t1 and t2) -- t1 is my all-person table, where everybody in my database and all their data is housed, and t2 is my much smaller table of people who are actually going to do the work of talking to the people in t1.
As you can see in this sample SQL Fiddle, the people in t1 each have specific criteria assigned to them (age, rating, and team). My end-result will hopefully be: for every one worker in t2, the query will return 2 specific people from t1 if their teams match (the idea behind this is that I'm matching the workers in t2 to the person they're going to talk to in t1 based on their team).
But what makes it trickier is that there are two more sets of criteria that I want the query to satisfy also:
Only return people from t1 if their age is in between 30 and 60 (everybody outside of those age-ranges should be ignored) --
And if there are more than two people that fit the above criteria, return the ones with the best rating first. (For example, if there are four people on a team and we only need two, the query should return the two with the best ratings, which would be whichever ratings are closest to 100.)
The final thing that is difficult to wrap my head around is that there are multiple workers per team as well, so if there are two t2 workers on 'Team A', the query should return four distinct t1 people on Team A: two attached to one worker and two attached to the other (and like I said above, they should be the four best-rated people, though it doesn't matter which two people go to which worker).
My hopeful output will look something like the following for all teams:
ID (t1) | Person (t1) | Team (t1) | Worker (t2)
539184 | Smith, Jane | Team A | Smith, Bob
539186 | Smith, Jim | Team A | Smith, Bob
537141 | Smith, Danny | Team A | Smith, Bill
537162 | Smith, James | Team A | Smith, Bill
Etc.
In reality, I'm doing something similar to this with tens of thousands of records -- which is why this is the only way I can imagine doing it, but I barely even know where to start. Any help would be greatly appreciated, and I'll add any additional information that would be helpful!
The SQL fiddle did not work. But still going ahead with the query format :)
SET @rnk := 1, @curr := 0;
SELECT * FROM (
SELECT @rnk := IF(@curr = t1.id, @rnk + 1, 1) AS rnk, @curr := t1.id AS curr, <field_list>
FROM t1
INNER JOIN t2 ON t1.id = t2.ref_id
WHERE t2.age BETWEEN 30 AND 60 < AND "whatever">
ORDER BY t1.id, t2.ranking DESC
) t WHERE rnk <= 2;
Replace <field_list> with the fields you want to select, like name, id, gender.
Replace < AND "whatever"> with any other conditions you have, like "ranking > 10", etc.
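An alternative that avoids user variables is ROW_NUMBER(), available in MySQL 8 and in SQLite 3.25 or later. The sketch below is run through Python's built-in sqlite3 module and guesses at the schema: the column names (person, age, rating, team, worker) and all rows are assumptions, since the fiddle was not available. Eligible people are numbered per team by rating, workers are numbered per team, and worker number k receives candidates 2k-1 and 2k.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (id INTEGER PRIMARY KEY, person TEXT, age INTEGER,
                 rating INTEGER, team TEXT);
CREATE TABLE t2 (id INTEGER PRIMARY KEY, worker TEXT, team TEXT);
-- Made-up rows; P3 and P8 fall outside the 30-60 age range.
INSERT INTO t1 VALUES
  (1, 'P1', 35, 90, 'Team A'),
  (2, 'P2', 45, 80, 'Team A'),
  (3, 'P3', 25, 99, 'Team A'),
  (4, 'P4', 50, 70, 'Team A'),
  (5, 'P5', 59, 60, 'Team A'),
  (6, 'P6', 40, 50, 'Team A'),
  (7, 'P7', 33, 75, 'Team B'),
  (8, 'P8', 61, 95, 'Team B');
INSERT INTO t2 VALUES (1, 'W1', 'Team A'), (2, 'W2', 'Team A'), (3, 'W3', 'Team B');
""")

PER_WORKER = 2  # how many t1 people each t2 worker should get

rows = conn.execute("""
WITH candidates AS (
    -- rank eligible people within each team, best rating first
    SELECT id, person, team,
           ROW_NUMBER() OVER (PARTITION BY team ORDER BY rating DESC, id) AS rn
    FROM t1
    WHERE age BETWEEN 30 AND 60
),
workers AS (
    -- number the workers within each team
    SELECT worker, team,
           ROW_NUMBER() OVER (PARTITION BY team ORDER BY id) AS wn
    FROM t2
)
SELECT c.id, c.person, c.team, w.worker
FROM workers w
JOIN candidates c
  ON c.team = w.team
 AND c.rn BETWEEN (w.wn - 1) * :n + 1 AND w.wn * :n
ORDER BY c.team, w.wn, c.rn
""", {"n": PER_WORKER}).fetchall()

for row in rows:
    print(row)
```

In this sample W1 gets the two best-rated eligible people on Team A, W2 gets the next two, and W3 gets only one person because Team B has a single eligible candidate. Each t1 person is assigned at most once, since the rank ranges of different workers on the same team never overlap.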

Normalizing MySQL table with records of another table

I have 2 tables. The city table is not normalized because the country information is in plain text. I have added an id_country column to the 'city' table (that column is currently empty).
I need to check for matches between city.country and country.country, and then update the matching city records with the id_country from the country table. At the end I will be able to delete the 'country' column from the city table.
City table
id_city (1, 2, 3...)
city (Washington, Guayaquil, Bonn...)
country (Germany, Ecuador, USA...)
id_country (currently empty)
Country table
id_country (1, 2, 3...)
code (GE, EC, US...)
country (Germany, Ecuador, USA...)
I have no idea where to start, or whether it can be done with a SQL query. My original idea was to search for matches in a PHP loop, but that seems to be a much harder implementation.
You can do this with a JOIN on an UPDATE statement.
UPDATE city c1 INNER JOIN country c2 ON c1.country=c2.country
SET c1.id_country=c2.id_country;
Using an INNER JOIN will make sure that updates only occur for cities that have a matching country value.
Once you've run it, you'll be able to select all those cities that still have a null id_country, just in case some of them didn't match. Conversely, once you've determined that all your cities have an id_country, you can drop the country column from the city table.
"The city tables is not normalized because the country information is in plain text."
Nonsense. Normalization doesn't mean "replace plain text with id numbers". Find whoever taught you that and poke him in the eye with a sharp stick.
Your real problem is that "city" plus "country" isn't sufficient to identify cities, at least in the USA. I think there are at least a dozen different cities named "Washington" in the USA.
Instead of replacing the country name with an id number, you'd be far better off replacing it with the two-letter country code. The codes are human-readable; the id numbers will require an additional JOIN in every query that uses your table of cities.
Something like this should work:
UPDATE city set id_country = (SELECT country.id_country from country WHERE country.country = city.country)
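Before dropping the country column it is worth checking for cities that did not match. Here is a sketch with Python's built-in sqlite3 module, using the table layout from the question plus one made-up unmatched city ('Lyon', 'France') added for the demo. SQLite has no UPDATE ... JOIN, so the correlated-subquery form is used, and unmatched cities simply end up with a NULL id_country.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country (id_country INTEGER PRIMARY KEY, code TEXT, country TEXT);
CREATE TABLE city (id_city INTEGER PRIMARY KEY, city TEXT,
                   country TEXT, id_country INTEGER);
INSERT INTO country VALUES (1, 'GE', 'Germany'), (2, 'EC', 'Ecuador'), (3, 'US', 'USA');
-- 'Lyon' is an assumed extra row with no matching country.
INSERT INTO city (id_city, city, country) VALUES
  (1, 'Washington', 'USA'),
  (2, 'Guayaquil',  'Ecuador'),
  (3, 'Bonn',       'Germany'),
  (4, 'Lyon',       'France');
""")

# Fill id_country by matching on the plain-text country name.
conn.execute("""
UPDATE city
SET id_country = (SELECT co.id_country
                  FROM country co
                  WHERE co.country = city.country)
""")

matched = conn.execute(
    "SELECT city, id_country FROM city "
    "WHERE id_country IS NOT NULL ORDER BY id_city"
).fetchall()
unmatched = conn.execute(
    "SELECT city FROM city WHERE id_country IS NULL"
).fetchall()

print(matched)
print(unmatched)
```

Anything listed in unmatched needs fixing by hand (a misspelled or missing country) before the text column goes away.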