SQL How to Count table for each fields - mysql

I have 3 tables which is:
Courses
courses_id
name
QnAs
qna_id
student_id
courses_id
name
question
Students
student_id
name
Now I'm trying to count how many qna's there are for each courses. How do i make the query?
I've tried doing this :
SELECT (SELECT COUNT(qna_id) AS Expr1
FROM QnAs) AS Count
FROM QnAs AS QnAs_1 CROSS JOIN
Courses
GROUP BY Courses.courses_id
It does counts how many QnA's there are but not for each Courses
The output i got is each Courses names and QnAs count number but what i want is the QnA's number for each of the Courses

It seems you merely want to aggregate QNAs by course ID:
select courses_id, count(*)
from qnas
group by courses_id
order by courses_id;
Along with the course names:
select c.course_id, c.name, coalesce(q.cnt, 0) as qna_count
from courses c
left join
(
select courses_id, count(*) as cnt
from qnas
group by courses_id
) q on q.course_id = c.course_id
order by c.course_id;

Why not just use GROUP BY?
SELECT q.courses_id, COUNT(qna_id) as cnt
FROM QnAs q
GROUP BY q.courses_id;

This is not an answer, but just an explanation what your query does.
In your own query you first cross join all QnAs with all courses for no apparent reason, thus getting all possible combinations. So with two courses, each with three QNAs (that makes six QNAs in total), you'd construct 2 x 6 = 12 rows.
For each of these rows you select the total number of rows in the QNA table, which is six in above example. So you'd select 12 rows, all showing the number 6.
But then you group by course ID, thus ending up with two rows only in my example. You should apply an aggregate function on your subquery, e.g. MAX or SUM, but you don't, which makes your query invalid (because you are dealing with many rows, but treat this as if it were a single value). MySQL however silently applies ANY_VALUE, so your query becomes:
SELECT
ANY_VALUE( (SELECT COUNT(*) FROM QnAs) ) AS Count
FROM QnAs AS QnAs_1
CROSS JOIN Courses
GROUP BY Courses.courses_id;
I hope this explanation helps you understand how joins and aggregation work. You may want to set ONLY_FULL_GROUP_BY mode (https://dev.mysql.com/doc/...) in order to have MySQL report the syntax error instead of silently "fixing" the query by applying ANY_VALUE.

Related

Join mysql table with distinct value from another table

I encountered a problem on a database I am working with. I have a table of counsels which may hold repeating values, but their is an enrolment number filed which is unique and can be used to fetch them. However, I want to join from a cases_counsel table on the "first" unique value of the counsel table that matches that column on the cases counsel table.
I want to list the cases belonging to a particular counsel using the enrolment_number as the counsel_id on the cp_cases_counsel table. That means I want to pick just a distinct value of a counsel, then use it to join the cp_cases_counsel table and also return the count for such.
However, I keep getting duplicates. This was the mysql query I tried
SELECT T.suitno, T.counsel_id, COUNT(*) as total from cp_cases_counsel T
INNER JOIN (SELECT
enrolment_number as id, MIN(counsel)
FROM
cp_counsel
GROUP BY
enrolment_number
) A
ON A.id = T.counsel_id
GROUP BY T.suitno, T.counsel_id
and
SELECT enrolment_number as id, MIN(counsel) as counsel, COUNT(*) as total FROM cp_counsel
JOIN cp_cases_counsel ON cp_cases_counsel.counsel_id = cp_counsel.enrolment_number
GROUP BY enrolment_number
For the second query, it's joining twice and I am having like double of what I am supposed to get.
The columns that you want in the results are councel (actually only one of all its values) from cp_counsel and counsel_id from cp_cases_counsel, so you must group by them and select them:
SELECT a.counsel, t.counsel_id, COUNT(*) AS total
FROM cp_cases_counsel t
INNER JOIN (
SELECT enrolment_number, MIN(counsel) AS counsel
FROM cp_counsel
GROUP BY enrolment_number
) a ON a.enrolment_number = t.counsel_id
GROUP BY a.counsel, t.counsel_id;

Good alternatives of comporting two subquery in the 'Where' clause

I have this schema:
CLUB(Name, Address, City)
TEAM(TeamName, club)
PLAYER(Badge, teamName)
MATCH(matchNumber, player1, player2, club, winner)
I need to make this query:
For each club, find the number of players in that club that have won
at least two games.
I wrote this:
SELECT teamName
From TEAM t join Match m1 on t.club=m1.club
WHERE Q2 >= ALL Q1
Q1:
SELECT Count (Distinct winner)
FROM MATCH
WHERE match m join player p on m. winner=player.badge
GROUP BY teamName
Q2:
SELECT Count (distinct winner)
FROM match m2
WHERE m2.club=m1.club
I don’t know if it is correct, however I heard that using this form where I confront two counts is not the best. Why?
Try something like this:
SELECT club, COUNT(*) as PlayerCount
FROM (SELECT club, winner
FROM match
GROUP BY club, winner
HAVING COUNT(*) > 1) a
GROUP BY club
The inner query should limit results to club/player combinations that have 2 or more wins, and the outer query will count the number of these players per club.
I don’t know if it is correct, however I heard that using this form where I confront two counts is not the best. Why?
Comparing two count subqueries is fine if you need to, but a good rule of thumb is to hit each table as few times as possible. Using multiple subqueries will end up hitting each table multiple times, and will usually result in longer execution times.
Try this query
SELECT t.club, COUNT(*)
FROM TEAM t
JOIN PLAYER p ON p.teamName = t.TeamName
JOIN (
-- Won at least 2 matches.
SELECT club, winner, COUNT(*) AS TheCount
FROM MATCH
GROUP BY club, winner
HAVING COUNT(*) > 1
) w ON w.winner = p.badge AND w.club = t.club
GROUP BY t.club

Mysql: Count wins but only once per opponent

I'm looking for help using sum() in my SQL query:
Task: Count tournament wins of all players. (one number per player) (battles.result = 1 means Player1 wins)
SELECT members.id, members.name,
(
SELECT SUM(battles.result = 1)
FROM battles
WHERE members.id = battles.player1 AND battles.result=1 order by battles.gametime
( as wins,
FROM members
Next: Only count ONE result per two players.
So if there are multiple results of two players, count only the first result (first gametime).
I've already tried using order by battles.player2, but i guess there is a much better solution?
You can easily get the result by doing a join and aggregation instead. Try:
SELECT A.id, A.name, SUM(IFNULL(B.result,0)) wons
FROM members A LEFT JOIN battles B
ON A.id=B.player1
GROUP BY A.id, A.name;

MySQL avg() and count() in one statement with group by

today I'm fighting with MySQL: I've got two tables, that contain records like that (actually there are more columns, but I don't think it's relevant):
Table Metering:
id, value
1000, 0.117
1000, 0.689
1001, 0.050
...
Table Res (there is no more than one record per id in this table):
id, number_residents
1001, 2
...
I try to get results in the following format:
number_residents, avg, count(id)
2, 0.1234, 456
3, 0.5678, 567
...
In words: I try to find out the average of the value-fields with the same number_residents. The id-field is the connection between the two tables. The count(id)-column should show how many ids have been found with that number_residents. The query I could come up with was the following:
select number_residents,count(distinct Metering.id),avg(value)
from Metering, Res
where Metering.id = Res.id
group by number_residents;
The results look like what I searched for but when I tried to validate them I became insecure. I tried it without the distinct at first but that leads to too high values in the count-column of the results.
Is my statement right to get what I want? I thought it might have to to something with the order of execution like asked here, but I actually can't find any official documentation on that...
Thanks for helping!
Judging by the table names, Res is the "parent" table and Metering us the "child" table - that is there are 0-n meterings for each residence.
You have use "old school" joins (and I mean old - the join syntax has been around for 25 years now), which are inner joins, meaning residences without meterings won't participate in the results.
Use an outer join:
select
number_residents,
count(distinct r.id) residences_count,
avg(value) average_value
from Res r
left join Metering m on m.id = r.id
group by number_residents
Although meterings.id = res.id, with a left join counting them may produce different results: I've changed the count to count residences, which for a left join means residences that don't have meterings still count.
Now, nulls (which are what you get from a left-joined table that doesn't have a matching row) don't participate in avg() - either for the numerator or denominator, if you want residences without any meterings to count when calcukating the average (as if they have a single zero metering for the purposes of dividing the total value), use this query:
select
number_residents,
count(distinct r.id) residences_count,
sum(value) / count(r.id) average_value
from Res r
left join Metering m on m.id = r.id
group by number_residents
Because res.id is never null, count(r.id) counts the number of meterings plus 1 for every residence without any meterings.

MySQL is not using INDEX in subquery

I have these tables and queries as defined in sqlfiddle.
First my problem was to group people showing LEFT JOINed visits rows with the newest year. That I solved using subquery.
Now my problem is that that subquery is not using INDEX defined on visits table. That is causing my query to run nearly indefinitely on tables with approx 15000 rows each.
Here's the query. The goal is to list every person once with his newest (by year) record in visits table.
Unfortunately on large tables it gets real sloooow because it's not using INDEX in subquery.
SELECT *
FROM people
LEFT JOIN (
SELECT *
FROM visits
ORDER BY visits.year DESC
) AS visits
ON people.id = visits.id_people
GROUP BY people.id
Does anyone know how to force MySQL to use INDEX already defined on visits table?
Your query:
SELECT *
FROM people
LEFT JOIN (
SELECT *
FROM visits
ORDER BY visits.year DESC
) AS visits
ON people.id = visits.id_people
GROUP BY people.id;
First, is using non-standard SQL syntax (items appear in the SELECT list that are not part of the GROUP BY clause, are not aggregate functions and do not sepend on the grouping items). This can give indeterminate (semi-random) results.
Second, ( to avoid the indeterminate results) you have added an ORDER BY inside a subquery which (non-standard or not) is not documented anywhere in MySQL documentation that it should work as expected. So, it may be working now but it may not work in the not so distant future, when you upgrade to MySQL version X (where the optimizer will be clever enough to understand that ORDER BY inside a derived table is redundant and can be eliminated).
Try using this query:
SELECT
p.*, v.*
FROM
people AS p
LEFT JOIN
( SELECT
id_people
, MAX(year) AS year
FROM
visits
GROUP BY
id_people
) AS vm
JOIN
visits AS v
ON v.id_people = vm.id_people
AND v.year = vm.year
ON v.id_people = p.id;
The: SQL-fiddle
A compound index on (id_people, year) would help efficiency.
A different approach. It works fine if you limit the persons to a sensible limit (say 30) first and then join to the visits table:
SELECT
p.*, v.*
FROM
( SELECT *
FROM people
ORDER BY name
LIMIT 30
) AS p
LEFT JOIN
visits AS v
ON v.id_people = p.id
AND v.year =
( SELECT
year
FROM
visits
WHERE
id_people = p.id
ORDER BY
year DESC
LIMIT 1
)
ORDER BY name ;
Why do you have a subquery when all you need is a table name for joining?
It is also not obvious to me why your query has a GROUP BY clause in it. GROUP BY is ordinarily used with aggregate functions like MAX or COUNT, but you don't have those.
How about this? It may solve your problem.
SELECT people.id, people.name, MAX(visits.year) year
FROM people
JOIN visits ON people.id = visits.id_people
GROUP BY people.id, people.name
If you need to show the person, the most recent visit, and the note from the most recent visit, you're going to have to explicitly join the visits table again to the summary query (virtual table) like so.
SELECT a.id, a.name, a.year, v.note
FROM (
SELECT people.id, people.name, MAX(visits.year) year
FROM people
JOIN visits ON people.id = visits.id_people
GROUP BY people.id, people.name
)a
JOIN visits v ON (a.id = v.id_people and a.year = v.year)
Go fiddle: http://www.sqlfiddle.com/#!2/d67fc/20/0
If you need to show something for people that have never had a visit, you should try switching the JOIN items in my statement with LEFT JOIN.
As someone else wrote, an ORDER BY clause in a subquery is not standard, and generates unpredictable results. In your case it baffled the optimizer.
Edit: GROUP BY is a big hammer. Don't use it unless you need it. And, don't use it unless you use an aggregate function in the query.
Notice that if you have more than one row in visits for a person and the most recent year, this query will generate multiple rows for that person, one for each visit in that year. If you want just one row per person, and you DON'T need the note for the visit, then the first query will do the trick. If you have more than one visit for a person in a year, and you only need the latest one, you have to identify which row IS the latest one. Usually it will be the one with the highest ID number, but only you know that for sure. I added another person to your fiddle with that situation. http://www.sqlfiddle.com/#!2/4f644/2/0
This is complicated. But: if your visits.id numbers are automatically assigned and they are always in time order, you can simply report the highest visit id, and be guaranteed that you'll have the latest year. This will be a very efficient query.
SELECT p.id, p.name, v.year, v.note
FROM (
SELECT id_people, max(id) id
FROM visits
GROUP BY id_people
)m
JOIN people p ON (p.id = m.id_people)
JOIN visits v ON (m.id = v.id)
http://www.sqlfiddle.com/#!2/4f644/1/0 But this is not the way your example is set up. So you need another way to disambiguate your latest visit, so you just get one row per person. The only trick we have at our disposal is to use the largest id number.
So, we need to get a list of the visit.id numbers that are the latest ones, by this definition, from your tables. This query does that, with a MAX(year)...GROUP BY(id_people) nested inside a MAX(id)...GROUP BY(id_people) query.
SELECT v.id_people,
MAX(v.id) id
FROM (
SELECT id_people,
MAX(year) year
FROM visits
GROUP BY id_people
)p
JOIN visits v ON (p.id_people = v.id_people AND p.year = v.year)
GROUP BY v.id_people
The overall query (http://www.sqlfiddle.com/#!2/c2da2/1/0) is this.
SELECT p.id, p.name, v.year, v.note
FROM (
SELECT v.id_people,
MAX(v.id) id
FROM (
SELECT id_people,
MAX(year) year
FROM visits
GROUP BY id_people
)p
JOIN visits v ON ( p.id_people = v.id_people
AND p.year = v.year)
GROUP BY v.id_people
)m
JOIN people p ON (m.id_people = p.id)
JOIN visits v ON (m.id = v.id)
Disambiguation in SQL is a tricky business to learn, because it takes some time to wrap your head around the idea that there's no inherent order to rows in a DBMS.