Querying across hierarchical tables SQL

I'm having issues querying a table with a subclass. To illustrate, if I had the following tables in a MySQL database
userTable:
id name gender_id
1 bob 1
....
genderTable:
gender_id term
1 male
2 female
....
How can I write a query for all of the males using term from genderTable, not just using the gender_id?

You seem to be looking for a simple join. There is no hierarchy involved here; genderTable is what is called a reference (lookup) table.
The following query will give you all users with a 'male' gender:
select u.*
from userTable u
inner join genderTable g
on g.gender_id = u.gender_id
and g.term = 'male'
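
To try the join on the sample data, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, and the second user, alice, is made up for illustration so that the filter has something to exclude.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE genderTable (gender_id INTEGER PRIMARY KEY, term TEXT);
    CREATE TABLE userTable (id INTEGER PRIMARY KEY, name TEXT, gender_id INTEGER);
    INSERT INTO genderTable VALUES (1, 'male'), (2, 'female');
    -- 'alice' is a hypothetical row, not part of the question's data
    INSERT INTO userTable VALUES (1, 'bob', 1), (2, 'alice', 2);
""")

# Join the users to the lookup table and filter on the term, not the id.
males = conn.execute(
    """
    SELECT u.name
    FROM userTable u
    INNER JOIN genderTable g
        ON g.gender_id = u.gender_id
    WHERE g.term = ?
    """,
    ("male",),
).fetchall()

print(males)
```

With an inner join it makes no difference to the result whether the g.term condition sits in the ON clause or in a WHERE clause.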

Related

Get count of rows from joining 2 tables

I have 2 tables:
Table 1: users
id name faculty_id level_id
1 john 1 1
2 mark 1 1
3 sam 1 2
Table 2: subjects
id title faculty_id
1 physics 1
2 chemistry 1
3 english 2
SQL query:
SELECT count(subjects.id) FROM users INNER JOIN subjects ON users.faculty_id = subjects.faculty_id WHERE users.level_id = 1
I'm trying to get the count of subjects where users.level_id = 1, which should be 2 in this case (physics and chemistry).
But it's returning more than 2.
Why is that and how to get only 2?

I would recommend exists:
SELECT COUNT(*)
FROM subjects s
WHERE EXISTS (SELECT 1
FROM users u
WHERE u.faculty_id = s.faculty_id AND
u.level_id = 1
);
This counts subjects where a user exists with a level of 1.

You are joining users and subjects on faculty_id; this produces every combination of user and subject rows (2 users and 2 subjects makes 4 combined rows); change your query to SELECT users.*, subjects.* FROM... to see how this works.
count(subjects.id) counts the number of non-null subjects.id values in your results; you can just do count(distinct subjects.id).

The two tables are not directly related as none is parent to the other. The faculty table is parent to both tables and this is what relates the two tables indirectly.
When joining the faculties' students with the faculties' subjects per faculty, you get all combinations (john|physics, mark|physics, sam|physics, john|chemistry, mark|chemistry, ...). Whether John really has the subject Physics cannot even be gathered from the database. We see that John studies a faculty containing the subjects Physics and Chemistry, but does every student have every subject belonging to their faculty? You probably know but we don't. That shows that in order to write proper queries, one should know their database :-)
Now you are joining the tables and get all students per faculty multiplied with all subjects per faculty. You limit this to level_id = 1, which gets you 2 students x 2 subjects = 4. You could use COUNT(*) for this, because you are counting rows. By applying COUNT(subjects.id) instead you are only counting rows for which the subject ID is not null, but that is true for all rows, because all four combined rows have either subject ID 1 (Physics) or 2 (Chemistry). Counting something that cannot be null makes no sense, except for counting distinct, as has already been suggested. You can COUNT(DISTINCT subjects.id) to get the distinct number of subjects matching your conditions.
This, however, has two drawbacks. First, the query doesn't clearly show your intention. Why do you join all students with all subjects, when you are not really interested in the (four) combinations? Secondly, you are building an unnecessary intermediate result (four rows in your small example) that must be searched for duplicates, so these can be removed from the counting. That means more memory consumed and more work for the DBMS.
What you want to count is subjects. So select from the subjects table. Your condition is that a student exists with level 1 for the same faculty. Conditions belong in the WHERE clause. Use EXISTS as Gordon suggests in his answer or use IN which is slightly shorter to write and may hence be considered a tad more readable (but that boils down to personal preference, as EXISTS and IN express exactly the same thing here).
select count(*)
from subjects
where faculty_id in (select faculty_id from users where level_id = 1);
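
The difference between the variants discussed in these answers can be checked on the question's sample data. This is a minimal sketch, assuming SQLite via Python's sqlite3 module in place of MySQL; the helper name scalar is made up for the example.

```python
import sqlite3

# In-memory SQLite database loaded with the question's sample rows (assumption: SQLite, not MySQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER, name TEXT, faculty_id INTEGER, level_id INTEGER);
    CREATE TABLE subjects (id INTEGER, title TEXT, faculty_id INTEGER);
    INSERT INTO users VALUES (1, 'john', 1, 1), (2, 'mark', 1, 1), (3, 'sam', 1, 2);
    INSERT INTO subjects VALUES (1, 'physics', 1), (2, 'chemistry', 1), (3, 'english', 2);
""")


def scalar(sql):
    """Hypothetical helper: run a query and return its single value."""
    return conn.execute(sql).fetchone()[0]


# Original query: counts every user/subject combination.
join_count = scalar("""
    SELECT COUNT(subjects.id)
    FROM users INNER JOIN subjects ON users.faculty_id = subjects.faculty_id
    WHERE users.level_id = 1
""")

# Same join, but duplicates removed while counting.
distinct_count = scalar("""
    SELECT COUNT(DISTINCT subjects.id)
    FROM users INNER JOIN subjects ON users.faculty_id = subjects.faculty_id
    WHERE users.level_id = 1
""")

# Count subjects directly, with the user condition as a semi-join.
exists_count = scalar("""
    SELECT COUNT(*) FROM subjects s
    WHERE EXISTS (SELECT 1 FROM users u
                  WHERE u.faculty_id = s.faculty_id AND u.level_id = 1)
""")

in_count = scalar("""
    SELECT COUNT(*) FROM subjects
    WHERE faculty_id IN (SELECT faculty_id FROM users WHERE level_id = 1)
""")

print(join_count, distinct_count, exists_count, in_count)
```

The plain join counts 4 (2 students x 2 subjects), while the DISTINCT, EXISTS and IN variants all count 2.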

You can just add DISTINCT before subjects.id, so that your SQL query looks like this:
SELECT count(distinct subjects.id) FROM users INNER JOIN subjects ON users.faculty_id = subjects.faculty_id WHERE users.level_id = 1

You want to count by level_id, but your code counts subjects.id. I would suggest first joining the two tables.
SELECT users.name, users.level_id,
subjects.title
FROM users
INNER JOIN subjects ON
users.faculty_id = subjects.faculty_id
After joining the tables you can get the count, using the join as a derived table named new_table. Note that WHERE has to come before GROUP BY.
SELECT level_id, COUNT(level_id)
FROM (
SELECT users.level_id
FROM users
INNER JOIN subjects ON users.faculty_id = subjects.faculty_id
) AS new_table
WHERE level_id = 1
GROUP BY level_id
(You have not mentioned group by in your code.)

How to JOIN SELECT from multiple tables, where the SELECTS is based on different conditions?

I have three tables: one with notes (Notes), one with users (Users), and a relational table between users and notes (NoteUsers).
Users
user_id first_name last_name
1 John Smith
2 Jane Doe
Notes
note_id note_name owner_id
1 Math 1
2 Science 1
3 English 2
NoteUsers
user_id note_id
1 1
2 1
2 2
2 3
Hopefully, from the select statement you can tell what I'm trying to do. I am trying to select the notes that user_id = 2 has access to but doesn't necessarily own, but also along with this I'm trying to get the first and last name of the owner.
SELECT Notes.note_id, note_name
FROM Notes, NoteUsers
WHERE NoteUsers.note_id = Notes.note_id AND NoteUsers.user_id = 2
JOIN SELECT first_name, last_name FROM Users, Notes WHERE Notes.owner_id = Users.user_id
My problem is that because the WHERE clause for first_name and last_name is different from the one for notes, I don't know how to query the data. I understand that this is not how a JOIN works and I don't necessarily want to use a JOIN, but I'm not sure how to structure the statement, so I left it in there so that you can understand what I'm trying to do.

You can join Notes with NoteUsers to check for access and with Users to add the owner's details to the result:
SELECT n.note_id, n.note_name, u.first_name, u.last_name
FROM Notes n
JOIN NoteUsers nu ON n.note_id = nu.note_id AND nu.user_id = 2
JOIN Users u ON n.owner_id = u.user_id
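
To check the two-join approach against the question's sample rows, here is a minimal sketch. It assumes SQLite via Python's sqlite3 module instead of MySQL, uses the column name note_id from the table listing, and adds an ORDER BY only to make the output order predictable.

```python
import sqlite3

# In-memory SQLite database with the question's sample data (assumption: SQLite, not MySQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (user_id INTEGER, first_name TEXT, last_name TEXT);
    CREATE TABLE Notes (note_id INTEGER, note_name TEXT, owner_id INTEGER);
    CREATE TABLE NoteUsers (user_id INTEGER, note_id INTEGER);
    INSERT INTO Users VALUES (1, 'John', 'Smith'), (2, 'Jane', 'Doe');
    INSERT INTO Notes VALUES (1, 'Math', 1), (2, 'Science', 1), (3, 'English', 2);
    INSERT INTO NoteUsers VALUES (1, 1), (2, 1), (2, 2), (2, 3);
""")

# NoteUsers decides which notes the user may see; Users supplies the owner's name.
rows = conn.execute(
    """
    SELECT n.note_id, n.note_name, u.first_name, u.last_name
    FROM Notes n
    JOIN NoteUsers nu ON n.note_id = nu.note_id AND nu.user_id = ?
    JOIN Users u ON n.owner_id = u.user_id
    ORDER BY n.note_id
    """,
    (2,),
).fetchall()

for row in rows:
    print(row)
```

User 2 has access to all three notes; Math and Science are owned by John Smith and English by Jane Doe.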

Here you need to use a query inside the main query. The subquery returns all the note_id values that the user with user_id = 2 has access to from NoteUsers, and the outer query then returns the first_name and last_name of each note's owner.
SELECT u.first_name, u.last_name, n.note_name, n.note_id
FROM Notes AS n
LEFT JOIN Users AS u ON u.user_id = n.owner_id
WHERE n.note_id IN
(SELECT nu.note_id FROM NoteUsers AS nu WHERE nu.user_id = 2)

SQL query using outer join and limiting child records for each parent

I'm having trouble figuring out how to structure a SQL query. Let's say we have a User table and a Pet table. Each user can have many pets and Pet has a breed column.
User:
id | name
______|________________
1 | Foo
2 | Bar
Pet:
id | owner_id | name | breed |
______|________________|____________|_____________|
1 | 1 | Fido | poodle |
2 | 2 | Fluffy | siamese |
The end goal is to provide a query that will give me all the pets for each user that match the given where clause while allowing sort and limit parameters to be used. So the ability to limit each user's pets to say 5 and sorted by name.
I'm working on building these queries dynamically for an ORM so I need a solution that works in MySQL and Postgresql (though it can be two different queries).
I've tried something like this which doesn't work:
SELECT "user"."id", "user"."name", "pet"."id", "pet"."owner_id", "pet"."name",
"pet"."breed"
FROM "user"
LEFT JOIN "pet" ON "user"."id" = "pet"."owner_id"
WHERE "pet"."id" IN
(SELECT "pet"."id" FROM "pet" WHERE "pet"."breed" = 'poodle' LIMIT 5)

In Postgres (8.4 or later), use the window function row_number() in a subquery:
SELECT user_id, user_name, pet_id, owner_id, pet_name, breed
FROM (
SELECT u.id AS user_id, u.name AS user_name
, p.id AS pet_id, owner_id, p.name AS pet_name, breed
, row_number() OVER (PARTITION BY u.id ORDER BY p.name, p.id) AS rn
FROM "user" u
LEFT JOIN pet p ON p.owner_id = u.id
AND p.breed = 'poodle'
) sub
WHERE rn <= 5
ORDER BY user_name, user_id, pet_name, pet_id;
When using a LEFT JOIN, you can't combine that with WHERE conditions on the right table (here: pet). That forcibly converts the LEFT JOIN to a plain [INNER] JOIN (and possibly removes rows from the result you did not want removed). Pull such conditions up into the join clause.
The way I have it, users without pets are included in the result - as opposed to your query stub.
The additional id columns in the ORDER BY clauses are supposed to break possible ties between non-unique names.
Never use a reserved word like user as identifier.
Work on your naming convention. id or name are terrible, non-descriptive choices, even if some ORMs suggest this nonsense. As you can see in the query, it leads to complications when joining a couple of tables, which is what you do in SQL.
Should be something like pet_id, pet, user_id, username etc. to begin with.
With a proper naming convention we could just SELECT * in the subquery.
MySQL did not support window functions before version 8.0; for older versions there are fidgety substitutes ...
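
The per-user limit and the placement of the breed condition can be tried out with a small sketch. It assumes SQLite 3.25 or later (for window functions) via Python's sqlite3 module rather than Postgres, a limit of 2 instead of 5, and two extra poodles (Ace and Rex) that are made up so that the limit has something to cut.

```python
import sqlite3

# In-memory SQLite database; needs SQLite >= 3.25 for row_number() (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "user" (id INTEGER, name TEXT);
    CREATE TABLE pet (id INTEGER, owner_id INTEGER, name TEXT, breed TEXT);
    INSERT INTO "user" VALUES (1, 'Foo'), (2, 'Bar');
    INSERT INTO pet VALUES (1, 1, 'Fido', 'poodle'), (2, 2, 'Fluffy', 'siamese');
    -- hypothetical extra rows so that user Foo has more poodles than the limit
    INSERT INTO pet VALUES (3, 1, 'Ace', 'poodle'), (4, 1, 'Rex', 'poodle');
""")

# Breed filter in the ON clause: users without a poodle survive with NULL pet columns.
limited = conn.execute("""
    SELECT user_name, pet_name
    FROM (
        SELECT u.name AS user_name, p.name AS pet_name,
               row_number() OVER (PARTITION BY u.id ORDER BY p.name, p.id) AS rn
        FROM "user" u
        LEFT JOIN pet p ON p.owner_id = u.id AND p.breed = 'poodle'
    ) sub
    WHERE rn <= 2            -- at most 2 pets per user
    ORDER BY user_name, pet_name
""").fetchall()

# Breed filter in WHERE: the NULL-extended row for Bar fails the test and disappears.
filtered_in_where = conn.execute("""
    SELECT DISTINCT u.name
    FROM "user" u
    LEFT JOIN pet p ON p.owner_id = u.id
    WHERE p.breed = 'poodle'
""").fetchall()

print(limited)
print(filtered_in_where)
```

In the first result Bar is kept with an empty pet and Foo is cut to the first two poodles by name; in the second, Bar is gone because the WHERE condition behaves like an inner join.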

SELECT user.id, user.name, pet.id, pet.name, pet.breed, pet.owner_id,
SUBSTRING_INDEX(group_concat(pet.owner_id order by pet.owner_id DESC), ',', 5)
FROM user
LEFT JOIN pet on user.id = pet.owner_id GROUP BY user.id
The above is rough and untested, but this source has exactly what you need; see step 4. Also, you don't need any of those double quotes.

mysql left join duplicates

I've been searching for hours but can't find a solution. It's a bit complicated, so I'll break it down into a very simple example.
I have two tables: people and cars
people:
name_id firstname
1 john
2 tony
3 peter
4 henry
cars:
name_id car_name
1 vw gulf
1 ferrari
2 mustang
4 toyota
As can be seen, they are linked by name_id, and john has 2 cars, tony has 1, peter has 0 and henry has 1.
I simply want to do a single MySQL search for who has a car (1 or more), so the answer should be john, tony, henry.
The people table is the master table, and I'm using LEFT JOIN to add the cars. My problem arises from the duplicates: the table I'm joining has 2 entries for 1 id in the master.
I'm playing around with DISTINCT and GROUP BY but I can't seem to get it to work.
Any help is much appreciated.
EDIT: adding the query:
$query = "
SELECT profiles.*, invoices.paid, COUNT(*) as num
FROM profiles
LEFT JOIN invoices ON (profiles.id=invoices.profileid)
WHERE (profiles.id LIKE '%$id%')
GROUP BY invoices.profileid
";

try this
select distinct p.name_id, firstname
from people p, cars c
where p.name_id = c.name_id
or use joins
select distinct p.name_id, firstname
from people p
inner join cars c
on p.name_id = c.name_id

If you only want to show people that have a car, then you should use a RIGHT JOIN. This will stop rows from the left table (people) from being returned if they don't have a match in the cars table.
Group by the person's name to remove duplicates.
SELECT firstname
FROM people P
RIGHT JOIN cars C ON C.name_id = P.name_id
GROUP BY firstname

SELECT DISTINCT firstname
FROM people
JOIN cars ON cars.name_id = people.name_id;
If this doesn't work you might have to show us the full problem.
The way you describe it, there's no need for a left join, since you need at least one car per person. A left join is implicitly an OUTER join and is intended to also return rows with 0 corresponding records in the joined table.
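
The duplicate problem and the DISTINCT inner join from the answers above can be reproduced on the sample data. This is a minimal sketch, assuming SQLite via Python's sqlite3 module in place of MySQL; name_id is added to the select list only so the result can be ordered predictably.

```python
import sqlite3

# In-memory SQLite database with the question's people and cars (assumption: SQLite, not MySQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (name_id INTEGER, firstname TEXT);
    CREATE TABLE cars (name_id INTEGER, car_name TEXT);
    INSERT INTO people VALUES (1, 'john'), (2, 'tony'), (3, 'peter'), (4, 'henry');
    INSERT INTO cars VALUES (1, 'vw gulf'), (1, 'ferrari'), (2, 'mustang'), (4, 'toyota');
""")

# LEFT JOIN: john appears twice (two cars) and peter appears once with no car.
left_rows = conn.execute("""
    SELECT p.firstname, c.car_name
    FROM people p
    LEFT JOIN cars c ON c.name_id = p.name_id
""").fetchall()

# INNER JOIN drops peter; DISTINCT collapses john's two rows into one.
owners = conn.execute("""
    SELECT DISTINCT p.name_id, p.firstname
    FROM people p
    INNER JOIN cars c ON c.name_id = p.name_id
    ORDER BY p.name_id
""").fetchall()

print(len(left_rows))
print(owners)
```

The left join yields five rows, while the distinct inner join yields exactly john, tony and henry.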

MYSQL self referential logic with a self join

I have a table called friends which has id and name, and a self-join table called friendship which stores the relationship and includes friend_id and friend2_id.
How do I get the names of related friends if the name of a particular friend is given?
example
id name
1 jack
2 kurt
3 jim
and
friendship
f_id f1_id
1 3
So if i give 'jack' i should get jim back

You could do this in one query or two queries, depending on what you want to accomplish.
A simple one could be:
SELECT
f_id,
f1_id
FROM
friendship
WHERE
f_id=1
OR
f1_id=1
And then you can get the specific friends with a statement like:
SELECT name FROM friends WHERE id IN(2,3)
An alternative is a self join, but the hard part here is that your id might be in either f_id or f1_id, so that would need a UNION or something like (untested):
SELECT
p1.name,
p2.name
FROM
friendship
INNER JOIN
friends AS p1
ON friendship.f_id = p1.id
INNER JOIN
friends AS p2
ON friendship.f1_id = p2.id
WHERE
p1.id=1 OR p2.id=1
I would thoroughly check the speed of these options, since they are quite heavy on huge numbers of records. If measurements show that you need more performance, try an alternative. For example, if you always put the smaller friends.id in f_id and the bigger one in f1_id, you can run 2 queries which you union. Another alternative is to denormalize a little to cache the results if you need them frequently.
It would save you lots of joins, for example, if you added the names to the friendship table:
SELECT
f_id,
f1_id,
f_name,
f1_name
FROM
friendship
WHERE
f_id=1
OR
f1_id=1
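
The UNION idea mentioned in this answer, looking the friend up by name in both directions, can be sketched as follows. This assumes SQLite via Python's sqlite3 module in place of MySQL, the table name friends from the question, and a made-up helper name friends_of.

```python
import sqlite3

# In-memory SQLite database with the question's sample data (assumption: SQLite, not MySQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE friends (id INTEGER, name TEXT);
    CREATE TABLE friendship (f_id INTEGER, f1_id INTEGER);
    INSERT INTO friends VALUES (1, 'jack'), (2, 'kurt'), (3, 'jim');
    INSERT INTO friendship VALUES (1, 3);
""")


def friends_of(name):
    """Hypothetical helper: names of everyone linked to `name` in either column."""
    rows = conn.execute(
        """
        -- the given person is stored in f_id, the friend in f1_id
        SELECT other.name
        FROM friends me
        JOIN friendship fs ON fs.f_id = me.id
        JOIN friends other ON other.id = fs.f1_id
        WHERE me.name = ?
        UNION
        -- the given person is stored in f1_id, the friend in f_id
        SELECT other.name
        FROM friends me
        JOIN friendship fs ON fs.f1_id = me.id
        JOIN friends other ON other.id = fs.f_id
        WHERE me.name = ?
        """,
        (name, name),
    ).fetchall()
    return sorted(r[0] for r in rows)


print(friends_of("jack"))
print(friends_of("jim"))
print(friends_of("kurt"))
```

Giving 'jack' returns jim, giving 'jim' returns jack, and kurt has no friendships in the sample data.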
