Inequality in MySQL with COUNT()

I have the following structure :
Table Author :
idAuthor,
Name
+----------+-------+
| idAuthor | Name |
+----------+-------+
| 1 | Renee |
| 2 | John |
| 3 | Bob |
| 4 | Bryan |
+----------+-------+
Table Publication:
idPublication,
Title,
Type,
Date,
Journal,
Conference
+---------------+--------------+------+-------------+------------+-----------+
| idPublication | Title | Date | Type | Conference | Journal |
+---------------+--------------+------+-------------+------------+-----------+
| 1 | Flower thing | 2008 | book | NULL | NULL |
| 2 | Bees | 2009 | article | NULL | Le Monde |
| 3 | Wasps | 2010 | inproceding | KDD | NULL |
| 4 | Whales | 2010 | inproceding | DPC | NULL |
| 5 | Lyon | 2011 | article | NULL | Le Figaro |
| 6 | Plants | 2012 | book | NULL | NULL |
| 7 | Walls | 2009 | proceeding | KDD | NULL |
| 8 | Juices | 2010 | proceeding | KDD | NULL |
| 9 | Fruits | 2010 | proceeding | DPC | NULL |
| 10 | Computers | 2010 | inproceding | DPC | NULL |
| 11 | Phones | 2010 | inproceding | DPC | NULL |
| 12 | Creams | 2010 | proceeding | DPC | NULL |
| 13 | Love | 2010 | proceeding | DPC | NULL |
+---------------+--------------+------+-------------+------------+-----------+
Table author_has_publication :
Author_idAuthor,
Publication_idPublication
+-----------------+---------------------------+
| Author_idAuthor | Publication_idPublication |
+-----------------+---------------------------+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
| 1 | 5 |
| 2 | 5 |
| 3 | 5 |
| 3 | 6 |
| 4 | 7 |
| 4 | 8 |
| 4 | 9 |
| 4 | 10 |
| 3 | 11 |
| 3 | 12 |
| 2 | 13 |
+-----------------+---------------------------+
I want to obtain the list of all authors who have published at least 2 times at conference DPC in 2010.
I managed to get the list of authors that have published something, and the number of publications for each, but I can't get my 'at least 2' condition to work.
My following query
SELECT author.name, COUNT(name)
FROM author INNER JOIN author_has_publication
ON author.idAuthor=author_has_publication.Author_idAuthor
INNER JOIN publication
ON author_has_publication.Publication_idPublication=publication.idPublication
AND publication.date=2010 AND publication.conference='DPC'
GROUP BY author.name;
returns the following result (which is good)
+-------+-------------+
| name | COUNT(name) |
+-------+-------------+
| Bob | 2 |
| Bryan | 3 |
| John | 1 |
+-------+-------------+
but when I try to select only the ones with COUNT(name) >= 2, I get an error.
I tried this query:
SELECT author.name, COUNT(name)
FROM author INNER JOIN author_has_publication
ON author.idAuthor=author_has_publication.Author_idAuthor
INNER JOIN publication
ON author_has_publication.Publication_idPublication=publication.idPublication
AND publication.date=2010 AND publication.conference='DPC'
GROUP BY author.name
WHERE COUNT(name)>=2;

When you use an aggregate function, you can filter on its result with the HAVING clause.
HAVING works on the result of the query (and therefore on aggregated values such as COUNT()), whereas WHERE works on the original values of the table rows, before any grouping takes place.
SELECT author.name, COUNT(name)
FROM author INNER JOIN author_has_publication
ON author.idAuthor=author_has_publication.Author_idAuthor
INNER JOIN publication
ON author_has_publication.Publication_idPublication=publication.idPublication
AND publication.date=2010 AND publication.conference='DPC'
GROUP BY author.name
HAVING COUNT(name)>=2;
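To see the two kinds of filter side by side, here is a minimal sketch that rebuilds the question's tables (the column types are my assumption; the rows are copied from the question) and moves the year and conference conditions into WHERE, so that only the count remains in HAVING. The alias publications is a name made up for illustration:

```sql
-- Column types are assumptions; the rows are taken from the question.
CREATE TABLE author (idAuthor INT PRIMARY KEY, name VARCHAR(50));
CREATE TABLE publication (
  idPublication INT PRIMARY KEY,
  title VARCHAR(100), date INT, type VARCHAR(20),
  conference VARCHAR(20), journal VARCHAR(50)
);
CREATE TABLE author_has_publication (
  Author_idAuthor INT, Publication_idPublication INT
);

INSERT INTO author VALUES (1,'Renee'),(2,'John'),(3,'Bob'),(4,'Bryan');
INSERT INTO publication VALUES
  (1,'Flower thing',2008,'book',NULL,NULL),
  (2,'Bees',2009,'article',NULL,'Le Monde'),
  (3,'Wasps',2010,'inproceding','KDD',NULL),
  (4,'Whales',2010,'inproceding','DPC',NULL),
  (5,'Lyon',2011,'article',NULL,'Le Figaro'),
  (6,'Plants',2012,'book',NULL,NULL),
  (7,'Walls',2009,'proceeding','KDD',NULL),
  (8,'Juices',2010,'proceeding','KDD',NULL),
  (9,'Fruits',2010,'proceeding','DPC',NULL),
  (10,'Computers',2010,'inproceding','DPC',NULL),
  (11,'Phones',2010,'inproceding','DPC',NULL),
  (12,'Creams',2010,'proceeding','DPC',NULL),
  (13,'Love',2010,'proceeding','DPC',NULL);
INSERT INTO author_has_publication VALUES
  (1,1),(2,2),(3,3),(4,4),(1,5),(2,5),(3,5),(3,6),
  (4,7),(4,8),(4,9),(4,10),(3,11),(3,12),(2,13);

SELECT a.name, COUNT(*) AS publications
FROM author AS a
INNER JOIN author_has_publication AS ap
  ON ap.Author_idAuthor = a.idAuthor
INNER JOIN publication AS p
  ON p.idPublication = ap.Publication_idPublication
WHERE p.date = 2010 AND p.conference = 'DPC'  -- row filter: applied before grouping
GROUP BY a.idAuthor, a.name
HAVING publications >= 2;                     -- group filter: applied after COUNT(*)
```

On the sample data this should leave only Bob (2) and Bryan (3). For an INNER JOIN, keeping the row conditions in the ON clause gives the same result; putting them in WHERE simply makes the before-grouping/after-grouping split easier to read.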

Related

Comparing all data to the data of specifically selected user ID?

I have a mysql table that holds data for team games.
Objective:
Count the number of times other SquadIDs have shared the same Team value as SquadID=21
// Selections table
+--------+---------+------+
| GameID | SquadID | Team |
+--------+---------+------+
| 1 | 5 | A |
| 1 | 7 | B |
| 1 | 11 | A |
| 1 | 21 | A |
| 2 | 5 | A |
| 2 | 7 | B |
| 2 | 11 | A |
| 2 | 21 | A |
| 3 | 5 | A |
| 3 | 7 | B |
| 3 | 11 | A |
| 3 | 21 | A |
| 4 | 5 | A |
| 4 | 11 | B |
| 4 | 21 | A |
| 5 | 5 | A |
| 5 | 11 | B |
| 5 | 21 | A |
| 6 | 5 | A |
| 6 | 11 | B |
| 6 | 21 | A |
+--------+---------+------+
// Desired Result
+---------+----------+
| SquadID | TeamMate |
+---------+----------+
| 5 | 6 |
| 7 | 0 |
| 11 | 3 |
| 21 | 6 |
+---------+----------+
I've attempted to use a subquery specifying the player I wish to compare with, and because this subquery returns multiple rows, I've used IN instead of =.
// Current Query
SELECT
SquadID,
COUNT(Team IN (SELECT Team FROM selections WHERE SquadID=21) AND GameID IN (SELECT GameID FROM selections WHERE SquadID=21)) AS TeamMate
FROM
selections
GROUP BY
SquadID;
The result I'm getting is the number of games a user has played, rather than the number of games in which a user has been on the same team as SquadID=21.
// Current Result
+---------+----------+
| SquadID | TeamMate |
+---------+----------+
| 5 | 6 |
| 7 | 3 |
| 11 | 6 |
| 21 | 6 |
+---------+----------+
What am I missing?
// DESCRIBE selections;
+---------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------+---------+------+-----+---------+-------+
| GameID | int(11) | NO | PRI | 0 | |
| SquadID | int(4) | NO | PRI | NULL | |
| Team | char(1) | NO | | NULL | |
| TeamID | int(11) | NO | | 1 | |
+---------+---------+------+-----+---------+-------+
As a general rule, try to avoid nested selects and look for a better way of logically arranging joins. Let's look at a self join:
FROM selections s1
INNER JOIN selections s2 ON s1.gameid = s2.gameid AND s1.team = s2.team
This produces a list pairing each SquadID with every SquadID it played alongside (i.e. they were in the same game and on the same team). We are only interested in the times a squad played alongside squad 21, so add a WHERE clause:
WHERE s2.squadid = 21
Then it's simply a matter of choosing the field/count you want:
SELECT s1.squadid, COUNT(1) AS teammate
Since we select a non-aggregated column alongside the aggregate, we need a GROUP BY:
GROUP BY s1.squadid
Combine it all together and give it a go. Oddly, this will produce a list in which squad 21 shows up as playing on its own team all 6 times. Adding a condition to the WHERE clause eliminates this:
AND s1.squadid <> s2.squadid
SELECT SquadID, count(t1.team) as TeamMate
FROM selections as t1 join
(select distinct team, gameid from selections where SquadID=21) as t2
on t1.Team=t2.Team and t1.Gameid=t2.Gameid
GROUP BY SquadID
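
Both approaches above use an inner join, so a squad that never shared a team with squad 21 (squad 7 in the sample data) drops out of the result instead of showing 0. A possible variant, sketched here on the assumption that the table matches the DESCRIBE output in the question (TeamID is left out because it is not used), moves the squad-21 condition into a LEFT JOIN:

```sql
-- Table definition is an assumption based on the DESCRIBE output; TeamID omitted.
CREATE TABLE selections (
  GameID  INT     NOT NULL,
  SquadID INT     NOT NULL,
  Team    CHAR(1) NOT NULL,
  PRIMARY KEY (GameID, SquadID)
);

INSERT INTO selections (GameID, SquadID, Team) VALUES
  (1,5,'A'),(1,7,'B'),(1,11,'A'),(1,21,'A'),
  (2,5,'A'),(2,7,'B'),(2,11,'A'),(2,21,'A'),
  (3,5,'A'),(3,7,'B'),(3,11,'A'),(3,21,'A'),
  (4,5,'A'),(4,11,'B'),(4,21,'A'),
  (5,5,'A'),(5,11,'B'),(5,21,'A'),
  (6,5,'A'),(6,11,'B'),(6,21,'A');

SELECT s1.SquadID,
       COUNT(s2.SquadID) AS TeamMate   -- counts only rows where squad 21 matched
FROM selections AS s1
LEFT JOIN selections AS s2
  ON  s2.GameID  = s1.GameID
  AND s2.Team    = s1.Team
  AND s2.SquadID = 21                  -- keep unmatched s1 rows, with s2 as NULL
GROUP BY s1.SquadID;
```

On the sample data this should give 6, 0, 3 and 6 for squads 5, 7, 11 and 21. It also hints at what went wrong in the original query: COUNT(expr) counts every non-NULL value, and a boolean expression evaluates to 0 or 1, neither of which is NULL, so every row was counted.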

How can I find this query in MySQL Workbench

For each student, find the number of courses they take and sort
the rows in descending order. (e.g. student id, the number of
courses taken by that student)
STUDENT TABLE
| ID | name | dept_name | tot_cred |
| S0901 | Alice | Comp.Sci. | 83 |
| S0902 | Martha | Comp.Sci. | 75 |
| S0903 | Micheal | Comp.Sci. | 45 |
| S0904 | Rose | Comp.Sci. | 77 |
| S0905 | Alfie | Comp.Sci. | 91 |
| S1901 | Brad | Biology | 23 |
TAKES TABLE
| ID | course_id | sec_id | semester | year | grade
| S0901 | CS-101 | 1 | Fall | 2009 | A
| S0901 | CS-315 | 1 | Spring | 2010 | B+
| S0901 | HIS-351 | 1 | Spring | 2010 | A-
| S0901 | MTH-101 | 1 | Fall | 2009 | A
| S0901 | MTH-102 | 1 | Spring | 2009 | B+
| S0902 | CS-101 | 1 | Fall | 2009 | A
| S0902 | CS-315 | 1 | Spring | 2010 | B+
| S0902 | CS-319 | 1 | Spring | 2010 | B
| S0902 | HIS-351 | 1 | Spring | 2010 | A-
| S0902 | MTH-101 | 1 | Fall | 2009 | A
| S0902 | MTH-102 | 1 | Spring | 2009 | B+
| S1901 | CS-101 | 1 | Fall | 2009 | B+
| S1901 | CS-190 | 1 | Spring | 2009 | C
| S1901 | CS-315 | 1 | Spring | 2010 | A-
| S1901 | HIS-351 | 1 | Spring | 2010 | A-
Technically, @Marcinek's answer may not be sufficient because it omits students who take zero classes. I would use this instead:
SELECT STUDENT.ID, COUNT(TAKES.ID)
FROM STUDENT LEFT JOIN TAKES ON STUDENT.ID = TAKES.ID
GROUP BY STUDENT.ID
ORDER BY COUNT(TAKES.ID) DESC;
By using LEFT JOIN, you can capture a student whose ID does not appear in the TAKES table.
Just join and group up the result.
SELECT COUNT(*), s.id FROM student s, takes t where t.id = s.id group by s.id order by count(*) desc

How to get count of combinations from database?

I have two database tables and want to get the count of combinations. Does anybody know how to put this into a single database query, so that I don't need a DB request for each trip?
Trips
| ID | Driver | Date |
|----|--------|------------|
| 1 | A | 2015-12-15 |
| 2 | A | 2015-12-16 |
| 3 | B | 2015-12-17 |
| 4 | A | 2015-12-18 |
| 5 | A | 2015-12-19 |
Passengers
| ID | PassengerID | TripID |
|----|-------------|--------|
| 1 | B | 1 |
| 2 | C | 1 |
| 3 | D | 1 |
| 4 | B | 2 |
| 5 | D | 2 |
| 6 | A | 3 |
| 7 | B | 4 |
| 8 | D | 4 |
| 9 | B | 5 |
| 10 | C | 5 |
Expected result
| Driver | B-C-D | B-D | A | B-C |
|--------|-------|-----|---|-----|
| A | 1 | 2 | - | 1 |
| B | - | - | 1 | - |
Alternative
| Driver | Passengers | Count |
|--------|------------|-------|
| A | B-C-D | 1 |
| A | B-D | 2 |
| A | B-C | 1 |
| B | A | 1 |
Has anybody an idea?
Thanks a lot!
Try this:
SELECT Driver, Passengers, COUNT(*) AS `Count`
FROM (
SELECT t.ID, t.Driver,
GROUP_CONCAT(p.PassengerID
ORDER BY p.PassengerID
SEPARATOR '-') AS Passengers
FROM Trips AS t
INNER JOIN Passengers AS p ON t.ID = p.TripID
GROUP BY t.ID, t.Driver) AS t
GROUP BY Driver, Passengers
The above query will produce the alternative result set. The other result set can only be achieved using dynamic sql.
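If the set of passenger combinations is known in advance, the pivoted layout can also be approximated without dynamic SQL by using conditional aggregation. This is only a sketch: the four column names are hard-coded from the sample data, the column types are assumed, and a missing combination shows up as 0 rather than '-':

```sql
-- Column types are assumptions; the rows are taken from the question.
CREATE TABLE Trips (ID INT PRIMARY KEY, Driver CHAR(1), Date DATE);
CREATE TABLE Passengers (ID INT PRIMARY KEY, PassengerID CHAR(1), TripID INT);

INSERT INTO Trips VALUES
  (1,'A','2015-12-15'),(2,'A','2015-12-16'),(3,'B','2015-12-17'),
  (4,'A','2015-12-18'),(5,'A','2015-12-19');
INSERT INTO Passengers VALUES
  (1,'B',1),(2,'C',1),(3,'D',1),(4,'B',2),(5,'D',2),
  (6,'A',3),(7,'B',4),(8,'D',4),(9,'B',5),(10,'C',5);

SELECT Driver,
       -- In MySQL a comparison yields 1 or 0, so SUM() counts the matching trips.
       SUM(Passengers = 'B-C-D') AS `B-C-D`,
       SUM(Passengers = 'B-D')   AS `B-D`,
       SUM(Passengers = 'A')     AS `A`,
       SUM(Passengers = 'B-C')   AS `B-C`
FROM (
  -- One row per trip with its sorted passenger list, as in the query above.
  SELECT t.ID, t.Driver,
         GROUP_CONCAT(p.PassengerID
                      ORDER BY p.PassengerID
                      SEPARATOR '-') AS Passengers
  FROM Trips AS t
  INNER JOIN Passengers AS p ON t.ID = p.TripID
  GROUP BY t.ID, t.Driver
) AS per_trip
GROUP BY Driver;
```

Any combination that is not listed in the SELECT is silently ignored, which is why dynamic SQL is still needed when the combinations are not known up front.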

MySQL query is returning duplicates when it shouldn't

I have three tables:
Person
+--------+-----------+
| fName | lName |
+--------+-----------+
| Paul | McCartney |
| John | Lennon |
| Jon | Stewart |
| Daniel | Tosh |
| Steven | Colbert |
| Pink | Floyd |
| The | Beatles |
| Arcade | Fire |
| First | Last |
| Andrew | Bird |
+--------+-----------+
Publication
+----+---------------------------------------+------+-----------+---------+
| id | title | year | pageStart | pageEnd |
+----+---------------------------------------+------+-----------+---------+
| 9 | The Dark Side of the Moon | 1973 | 0 | 0 |
| 10 | Piper At The Gates of Dawn | 1967 | 0 | 0 |
| 11 | Sgt. Pepper's Lonely Hearts Band Club | 1967 | 0 | 0 |
| 12 | Happy Thoughts | 2007 | 0 | 60 |
| 13 | Wish You Were Here | 1975 | 0 | 0 |
| 14 | Funeral | 2004 | 0 | 0 |
+----+---------------------------------------+------+-----------+---------+
Person_Publication
+-----------+----------------+--------+---------------+
| person_id | publication_id | editor | author_number |
+-----------+----------------+--------+---------------+
| 11 | 11 | 0 | 1 |
| 12 | 11 | 0 | 1 |
| 16 | 9 | 0 | 1 |
| 17 | 11 | 0 | 1 |
+-----------+----------------+--------+---------------+
I'm trying to select all authors of a certain publication using the following query:
SELECT fName , lName
FROM Publication , Person, Person_Publication
WHERE Person.id = Person_Publication.person_id
AND Person_Publication.publication_id = 11;
But the results I get are always duplicates (always 6x for some reason). The results:
+-------+-----------+
| fName | lName |
+-------+-----------+
| Paul | McCartney |
| John | Lennon |
| The | Beatles |
| Paul | McCartney |
| John | Lennon |
| The | Beatles |
| Paul | McCartney |
| John | Lennon |
| The | Beatles |
| Paul | McCartney |
| John | Lennon |
| The | Beatles |
| Paul | McCartney |
| John | Lennon |
| The | Beatles |
| Paul | McCartney |
| John | Lennon |
| The | Beatles |
+-------+-----------+
18 rows in set (0.03 sec)
Can somebody please tell me why this is happening and how to fix this?
You are getting 6x your results, exactly one for each Publication row.
Remove your Publication from your FROM clause:
SELECT fName , lName
FROM Person, Person_Publication
WHERE Person.id = Person_Publication.person_id
AND Person_Publication.publication_id = 11;
You are including three tables in your query:
FROM Publication, Person, Person_Publication
but you have only one join condition:
WHERE Person.id = Person_Publication.person_id
You end up with a cartesian product between Publication and Person JOIN Person_Publication
Add the following condition to your WHERE block:
AND Publication.id = Person_Publication.publication_id
A perfect example of why the explicit JOIN syntax is preferred. With the following syntax:
SELECT fName, lName
FROM Publication
JOIN Person_Publication ON Person_Publication.publication_id = Publication.id
JOIN Person ON Person.id = Person_Publication.person_id
WHERE Person_Publication.publication_id = 11;
.. such a mistake is much harder to make, because each join condition sits right next to the table it joins.

MySQL LEFT JOIN of three tables returns too many rows

I've been using MySQL for quite a while and am really confused by the result of a simple LEFT JOIN on three tables.
I have the following three tables (I created an example to narrow it down):
a) persons
+----------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------+-------------+------+-----+---------+----------------+
| PersonID | int(11) | NO | PRI | NULL | auto_increment |
| Name | varchar(50) | YES | | NULL | |
| Age | int(11) | YES | | NULL | |
+----------+-------------+------+-----+---------+----------------+
b) person_fav_artists
+----------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+--------------+------+-----+---------+----------------+
| FavInterpretID | int(10) | NO | PRI | NULL | auto_increment |
| PersonID | int(10) | NO | | 0 | |
| Interpret | varchar(100) | YES | | NULL | |
+----------------+--------------+------+-----+---------+----------------+
c) person_fav_movies
+------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+--------------+------+-----+---------+----------------+
| FavMovieID | int(10) | NO | PRI | NULL | auto_increment |
| PersonID | int(10) | NO | | 0 | |
| Movie | varchar(100) | YES | | NULL | |
+------------+--------------+------+-----+---------+----------------+
My example tables are used to store any number of artists and movies for a single person.
Whether this makes sense or not doesn't really matter since it's just a simple example.
Now I have the following data in the tables:
mysql> SELECT * FROM persons;
+----------+------+------+
| PersonID | Name | Age |
+----------+------+------+
| 1 | Jeff | 22 |
| 2 | Lisa | 15 |
| 3 | Jon | 30 |
+----------+------+------+
mysql> SELECT * FROM person_fav_artists;
+----------------+----------+----------------+
| FavInterpretID | PersonID | Interpret |
+----------------+----------+----------------+
| 1 | 1 | Linkin Park |
| 2 | 1 | Muse |
| 3 | 2 | Madonna |
| 4 | 2 | Katy Perry |
| 5 | 2 | Britney Spears |
| 6 | 1 | Fort Minor |
| 7 | 1 | Jay Z |
+----------------+----------+----------------+
mysql> SELECT * FROM person_fav_movies;
+------------+----------+-------------------+
| FavMovieID | PersonID | Movie |
+------------+----------+-------------------+
| 1 | 1 | American Pie 1 |
| 2 | 1 | American Pie 2 |
| 3 | 1 | American Pie 3 |
| 4 | 3 | A Game of Thrones |
| 5 | 3 | Eragon |
+------------+----------+-------------------+
Now I'm simply joining the tables with the following query:
Select * FROM persons
LEFT JOIN person_fav_artists USING (PersonID)
LEFT JOIN person_fav_movies USING (PersonID);
which returns the following result:
+----------+------+------+----------------+----------------+------------+-------------------+
| PersonID | Name | Age | FavInterpretID | Interpret | FavMovieID | Movie |
+----------+------+------+----------------+----------------+------------+-------------------+
| 1 | Jeff | 22 | 1 | Linkin Park | 1 | American Pie 1 |
| 1 | Jeff | 22 | 1 | Linkin Park | 2 | American Pie 2 |
| 1 | Jeff | 22 | 1 | Linkin Park | 3 | American Pie 3 |
| 1 | Jeff | 22 | 2 | Muse | 1 | American Pie 1 |
| 1 | Jeff | 22 | 2 | Muse | 2 | American Pie 2 |
| 1 | Jeff | 22 | 2 | Muse | 3 | American Pie 3 |
| 1 | Jeff | 22 | 6 | Fort Minor | 1 | American Pie 1 |
| 1 | Jeff | 22 | 6 | Fort Minor | 2 | American Pie 2 |
| 1 | Jeff | 22 | 6 | Fort Minor | 3 | American Pie 3 |
| 1 | Jeff | 22 | 7 | Jay Z | 1 | American Pie 1 |
| 1 | Jeff | 22 | 7 | Jay Z | 2 | American Pie 2 |
| 1 | Jeff | 22 | 7 | Jay Z | 3 | American Pie 3 |
| 2 | Lisa | 15 | 3 | Madonna | NULL | NULL |
| 2 | Lisa | 15 | 4 | Katy Perry | NULL | NULL |
| 2 | Lisa | 15 | 5 | Britney Spears | NULL | NULL |
| 3 | Jon | 30 | NULL | NULL | 4 | A Game of Thrones |
| 3 | Jon | 30 | NULL | NULL | 5 | Eragon |
+----------+------+------+----------------+----------------+------------+-------------------+
17 rows in set (0.00 sec)
So far so good.
My question now is whether it's "normal" that 12 rows are returned for the person 'Jeff' despite the fact that he only has four artists and three movies assigned to him.
I think I may understand why the result is as it is, but I think it's quite stupid to return so many rows for so little actual data.
So is there something wrong with my query, or is this behaviour on purpose?
The result I'd like to have would be like the following (only for Jeff):
+----------+------+------+----------------+----------------+------------+-------------------+
| PersonID | Name | Age | FavInterpretID | Interpret | FavMovieID | Movie |
+----------+------+------+----------------+----------------+------------+-------------------+
| 1 | Jeff | 22 | 1 | Linkin Park | 1 | American Pie 1 |
| 1 | Jeff | 22 | 2 | Muse | 2 | American Pie 2 |
| 1 | Jeff | 22 | 3 | Fort Minor | 3 | American Pie 3 |
| 1 | Jeff | 22 | 4 | Jay Z | 1 | NULL | <- 'American Pie 1/2/3' would be OK as well.
+----------+------+------+----------------+----------------+------------+-------------------+
Thanks for your help!
Nothing wrong with query or the results, it is just returning all possible combinations. One option would be to split into two separate queries if the amount of data is going to be large.
You are getting the correct result with the 12 records because that is the correct set of tuples for the way you are asking for the data. I am not sure why you are joining these 3 tables together, because the 2 related tables inherently do not hold the same type of data. What I would suggest is that you select person & movies and then UNION that with person & artists. Because a union requires the columns to be the same, I would suggest adding a type column to differentiate artists from movies, and exposing the name under a common alias such as string_value.
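A minimal sketch of that UNION idea, assuming the tables and rows shown in the question (the type and string_value column names follow the suggestion above and are otherwise arbitrary):

```sql
-- Table definitions follow the DESCRIBE output in the question.
CREATE TABLE persons (
  PersonID INT AUTO_INCREMENT PRIMARY KEY,
  Name VARCHAR(50), Age INT
);
CREATE TABLE person_fav_artists (
  FavInterpretID INT AUTO_INCREMENT PRIMARY KEY,
  PersonID INT NOT NULL DEFAULT 0, Interpret VARCHAR(100)
);
CREATE TABLE person_fav_movies (
  FavMovieID INT AUTO_INCREMENT PRIMARY KEY,
  PersonID INT NOT NULL DEFAULT 0, Movie VARCHAR(100)
);

INSERT INTO persons VALUES (1,'Jeff',22),(2,'Lisa',15),(3,'Jon',30);
INSERT INTO person_fav_artists VALUES
  (1,1,'Linkin Park'),(2,1,'Muse'),(3,2,'Madonna'),(4,2,'Katy Perry'),
  (5,2,'Britney Spears'),(6,1,'Fort Minor'),(7,1,'Jay Z');
INSERT INTO person_fav_movies VALUES
  (1,1,'American Pie 1'),(2,1,'American Pie 2'),(3,1,'American Pie 3'),
  (4,3,'A Game of Thrones'),(5,3,'Eragon');

-- One row per favourite, tagged with its kind, instead of one row per
-- (artist, movie) combination.
SELECT p.PersonID, p.Name, 'artist' AS type, a.Interpret AS string_value
FROM persons AS p
INNER JOIN person_fav_artists AS a USING (PersonID)
UNION ALL
SELECT p.PersonID, p.Name, 'movie' AS type, m.Movie AS string_value
FROM persons AS p
INNER JOIN person_fav_movies AS m USING (PersonID)
ORDER BY PersonID, type;
```

With this shape Jeff gets 4 + 3 = 7 rows rather than 4 x 3 = 12, and the row count grows additively as more favourites are stored.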
This behaviour is on purpose.
Your first table has 1 row for Jeff.
The second table has 4 rows for Jeff, so the joined table gives
1x4 rows.
The third table has 3 rows for Jeff, so the joined table gives
1x4x3 = 12 rows.
You now have all possible combinations.
I think it's normal since it's taking all the combinations of fav movie and fav artist. I think this is how the joining works.
Try replacing LEFT JOIN with INNER JOIN (note that this only drops persons who lack artists or movies; Jeff still gets all 12 combinations):
SELECT *
FROM persons
INNER JOIN person_fav_artists USING (PersonID)
INNER JOIN person_fav_movies USING (PersonID);