I have a simple setup:
two tables linked by a many-to-many relation, which gives me three tables.
Table author:
idAuthor INT
name VARCHAR
Table publication:
idPublication INT,
title VARCHAR,
date YEAR,
type VARCHAR,
conference VARCHAR,
journal VARCHAR
Table author_has_publication:
Author_idAuthor,
Publication_idPublication
I am trying to get the names of all authors who have published at least 2 papers in conference SIGMOD and at least 2 in conference PVLDB.
So far I have the query below, but each author shows up twice in the result. My query:
SELECT author.name, publication.journal, COUNT(*)
FROM author
INNER JOIN author_has_publication
ON author.idAuthor = author_has_publication.Author_idAuthor
INNER JOIN publication
ON author_has_publication.Publication_idPublication = publication.idPublication
GROUP BY publication.journal, author.name
HAVING COUNT(*) >= 2
AND (publication.journal = 'PVLDB' OR publication.journal = 'SIGMOD');
returns
+-------+---------+----------+
| name | journal | COUNT(*) |
+-------+---------+----------+
| Renee | PVLDB | 2 |
| Renee | SIGMOD | 2 |
+-------+---------+----------+
As you can see, the counts are correct, but the author appears twice and I want each name only once.
Another question: how can I set a different minimum for each conference, for example to get all the authors who published at least 3 papers in SIGMOD and at least 1 in PVLDB?
If you don't care about the journal, don't select or group by it, because that is what splits your results into one row per journal. Also, plain row filters belong in the WHERE clause, not the HAVING clause; HAVING is for conditions on aggregates:
SELECT author.name, COUNT(*)
FROM author
INNER JOIN author_has_publication
ON author.idAuthor = author_has_publication.Author_idAuthor
INNER JOIN publication
ON author_has_publication.Publication_idPublication =
publication.idPublication
WHERE publication.journal IN('PVLDB','SIGMOD')
GROUP BY author.name
HAVING COUNT(CASE WHEN publication.journal = 'SIGMOD' THEN 1 END) >= 2
AND COUNT(CASE WHEN publication.journal = 'PVLDB' THEN 1 END) >= 2;
For the second question, use this HAVING clause:
HAVING COUNT(CASE WHEN publication.journal = 'SIGMOD' THEN 1 END) >= 3
AND COUNT(CASE WHEN publication.journal = 'PVLDB' THEN 1 END) >= 1;
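A minimal sketch of the per-journal HAVING counts above, run against an in-memory SQLite database from Python instead of the asker's MySQL server. The sample rows (Renee, Bob) and the helper names `build_sample_db` and `authors_with_minimums` are made up for illustration; only the table and column names come from the question.

```python
import sqlite3


def build_sample_db():
    """Create the three tables from the question with a few invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE author (idAuthor INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE publication (idPublication INTEGER PRIMARY KEY,
                                  title TEXT, journal TEXT);
        CREATE TABLE author_has_publication (Author_idAuthor INT,
                                             Publication_idPublication INT);
        INSERT INTO author VALUES (1, 'Renee'), (2, 'Bob');
        INSERT INTO publication VALUES
            (1, 'p1', 'SIGMOD'), (2, 'p2', 'SIGMOD'), (3, 'p3', 'SIGMOD'),
            (4, 'p4', 'PVLDB'), (5, 'p5', 'PVLDB');
        -- Renee: 2 SIGMOD + 2 PVLDB; Bob: 3 SIGMOD + 1 PVLDB.
        INSERT INTO author_has_publication VALUES
            (1, 1), (1, 2), (1, 4), (1, 5),
            (2, 1), (2, 2), (2, 3), (2, 4);
    """)
    return conn


def authors_with_minimums(conn, min_sigmod, min_pvldb):
    """Return names of authors meeting a separate minimum for each journal."""
    rows = conn.execute(
        """
        SELECT a.name
        FROM author a
        JOIN author_has_publication ap ON a.idAuthor = ap.Author_idAuthor
        JOIN publication p ON ap.Publication_idPublication = p.idPublication
        WHERE p.journal IN ('PVLDB', 'SIGMOD')
        GROUP BY a.idAuthor, a.name
        HAVING COUNT(CASE WHEN p.journal = 'SIGMOD' THEN 1 END) >= ?
           AND COUNT(CASE WHEN p.journal = 'PVLDB' THEN 1 END) >= ?
        ORDER BY a.name
        """,
        (min_sigmod, min_pvldb),
    ).fetchall()
    return [row[0] for row in rows]


conn = build_sample_db()
print(authors_with_minimums(conn, 2, 2))  # at least 2 in each journal
print(authors_with_minimums(conn, 3, 1))  # at least 3 SIGMOD and 1 PVLDB
```

Because the journal is no longer in the GROUP BY, each author produces one row, and the two thresholds can be changed independently.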
I have a kind of tricky question for this query. First the code:
SELECT user_type.user_type_description,COUNT(incident.user_id) as Quantity
FROM incident
INNER JOIN user ON incident.user_id=user.user_id
INNER JOIN user_type ON user.user_type=user_type.user_type
WHERE incident.code=2
GROUP BY user.user_type
What am I doing?
For example, I am counting police reports of robbery made by different kinds of users. In my example, "admin" users reported 6 incidents with code 2 (robbery), and so on, as shown in the WHERE clause (the incident must be a robbery, i.e. code 2).
This brings the following result:
+-----------------------+----------+
| user_type_description | Quantity |
+-----------------------+----------+
| Admin | 6 |
| Moderator | 8 |
| Fully_registered_user | 8 |
| anonymous_user | 9 |
+-----------------------+----------+
Basically, Admin, Moderator and Fully_registered_user are all properly registered users. I need to add their counts together so that the result looks like this:
+--------------+------------+
| Proper_users | Anonymous |
+--------------+------------+
| 22 | 9 |
+--------------+------------+
I am not good with sql. Any help is appreciated. Thanks.
You can apply a conditional aggregate to your current result set: wrap it in a subquery and use SUM with a CASE WHEN expression.
SELECT SUM(CASE WHEN user_type_description IN ('Admin','Moderator','Fully_registered_user') THEN Quantity END) Proper_users,
SUM(CASE WHEN user_type_description = 'anonymous_user' THEN Quantity END) Anonymous
FROM (
SELECT user_type.user_type_description,COUNT(incident.user_id) as Quantity
FROM incident
INNER JOIN user ON incident.user_id=user.user_id
INNER JOIN user_type ON user.user_type=user_type.user_type
WHERE incident.code=2
GROUP BY user.user_type
) t1
You just need conditional aggregation:
SELECT SUM( ut.user_type_description IN ('Admin', 'Moderator', 'Fully_registered_user') ) as Proper_users,
SUM( ut.user_type_description IN ('anonymous_user') ) as anonymous
FROM incident i INNER JOIN
user u
ON i.user_id = u.user_id INNER JOIN
user_type ut
ON u.user_type = ut.user_type
WHERE i.code = 2;
Notes:
Table aliases make the query easier to write and to read.
This uses a MySQL shortcut for adding values: it just sums the boolean expressions, which evaluate to 1 or 0.
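To see that the CASE form and the boolean shortcut count the same thing, here is a small sketch using Python's sqlite3 module (SQLite, like MySQL, evaluates a comparison to 1 or 0). The sample rows and the `JOINS` helper string are invented for illustration; table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_type (user_type INT, user_type_description TEXT);
    CREATE TABLE "user" (user_id INT, user_type INT);
    CREATE TABLE incident (incident_id INT, user_id INT, code INT);
    INSERT INTO user_type VALUES
        (1, 'Admin'), (2, 'Moderator'),
        (3, 'Fully_registered_user'), (4, 'anonymous_user');
    INSERT INTO "user" VALUES (10, 1), (20, 2), (30, 3), (40, 4);
    -- Five robbery reports (code 2) and one unrelated report (code 1).
    INSERT INTO incident VALUES
        (1, 10, 2), (2, 20, 2), (3, 30, 2), (4, 40, 2), (5, 40, 2), (6, 10, 1);
""")

# Shared FROM/WHERE part, so only the SELECT list differs between the two forms.
JOINS = """
    FROM incident i
    JOIN "user" u ON i.user_id = u.user_id
    JOIN user_type ut ON u.user_type = ut.user_type
    WHERE i.code = 2
"""

# Portable form: SUM over an explicit CASE expression.
case_form = conn.execute("""
    SELECT SUM(CASE WHEN ut.user_type_description
                         IN ('Admin', 'Moderator', 'Fully_registered_user')
                    THEN 1 ELSE 0 END) AS Proper_users,
           SUM(CASE WHEN ut.user_type_description = 'anonymous_user'
                    THEN 1 ELSE 0 END) AS Anonymous
""" + JOINS).fetchone()

# Shortcut form: sum the boolean expressions directly.
bool_form = conn.execute("""
    SELECT SUM(ut.user_type_description
               IN ('Admin', 'Moderator', 'Fully_registered_user')) AS Proper_users,
           SUM(ut.user_type_description = 'anonymous_user') AS Anonymous
""" + JOINS).fetchone()

print(case_form, bool_form)
```

The CASE form is the safer choice when the query may have to run on a database that does not treat booleans as integers.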
I would solve it with a CTE, but it would be better to have this association in a table.
WITH
user_type_categories
AS
(
SELECT 'Admin' AS [user_type_description] , 'Proper_users' AS [user_type_category]
UNION SELECT 'Moderator' AS [user_type_description] , 'Proper_users' AS [user_type_category]
UNION SELECT 'Fully_registered_user' AS [user_type_description] , 'Proper_users' AS [user_type_category]
UNION SELECT 'anonymous_user' AS [user_type_description] , 'Anonymous' AS [user_type_category]
)
SELECT
-- Count incidents per category; the CASE must sit inside the aggregate
COUNT(CASE WHEN utc.[user_type_category] = 'Proper_users' THEN 1 END) AS [Proper_Users_Quantity]
, COUNT(CASE WHEN utc.[user_type_category] = 'Anonymous' THEN 1 END) AS [Anonymous_Quantity]
FROM
[incident]
INNER JOIN [user] ON [incident].[user_id] = [user].[user_id]
INNER JOIN [user_type] ON [user].[user_type] = [user_type].[user_type]
LEFT JOIN user_type_categories AS utc ON utc.[user_type_description] = [user_type].[user_type_description]
WHERE
[incident].[code] = 2
There is an error in my query and I would like some help. I have three tables
Rooms{id,number,name,type(ECO/LUX),active(0/1)}
Men{passport,roomid,status(YOUTH/ADULT)}
Women{passport,roomid,status(YOUTH/ADULT)}
In each room there can be more than one woman or man.
I want to count how many women and men have the same room with roomid in (1,2,3), status='ADULT', type='LUX' and active=1. Therefore I need a result like this:
+----+--------+-----------+----------+------------+
| id | number | name | CountMen | CountWomen |
+----+--------+-----------+----------+------------+
| 1 | 23 | 1st suite | 2 | 4 |
| 3 | 4 | 2nd suite | 1 | 2 |
+----+--------+-----------+----------+------------+
SELECT id,number,name,
sum(case when Men.status='ADULT' then 1 else 0 end) as CountMen,
sum(case when Women.status='ADULT' then 1 else 0 end) as CountWomen
FROM Rooms left join Men
on Rooms.id=Men.roomid
left join Women on Rooms.id=Women.roomid where
(type='LUX') and (active=true) and (id in (1,2,3))
group by id;
The problem is that I get sometimes wrong results in the counters.
In a left join, conditions on the second table need to be in the on clause. It would help if you qualified all column names in the query.
However, your problem is that you are getting a Cartesian product between the gender tables: within a room, every man row is paired with every woman row, which inflates both counters. This is definitely a case where gender segregation is not a good thing. You should have just one table for people (and this doesn't even bring up other issues with defining binary genders).
SELECT r.id, r.number, r.name,
(SELECT COUNT(*)
FROM men m
WHERE m.status = 'ADULT' AND r.id = m.roomid
) as CountMen,
(SELECT COUNT(*)
FROM women w
WHERE w.status = 'ADULT' AND r.id = w.roomid
) as CountWomen
FROM Rooms r
WHERE r.type = 'LUX' AND r.active = true AND r.id IN (1, 2, 3);
However, you should fix your data model so you have people rather than segregated gender tables.
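The inflation caused by joining both child tables at once can be reproduced with a small sketch in Python's sqlite3 module. The single sample room with 2 men and 4 women is invented; the table and column names are the asker's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Rooms (id INT, number INT, name TEXT, type TEXT, active INT);
    CREATE TABLE Men (passport TEXT, roomid INT, status TEXT);
    CREATE TABLE Women (passport TEXT, roomid INT, status TEXT);
    INSERT INTO Rooms VALUES (1, 23, '1st suite', 'LUX', 1);
    INSERT INTO Men VALUES ('m1', 1, 'ADULT'), ('m2', 1, 'ADULT');
    INSERT INTO Women VALUES ('w1', 1, 'ADULT'), ('w2', 1, 'ADULT'),
                             ('w3', 1, 'ADULT'), ('w4', 1, 'ADULT');
""")

# Joining both tables to the room pairs each man with each woman (2 x 4 rows),
# so both sums count the same 8 combined rows.
double_join = conn.execute("""
    SELECT r.id,
           SUM(CASE WHEN m.status = 'ADULT' THEN 1 ELSE 0 END) AS CountMen,
           SUM(CASE WHEN w.status = 'ADULT' THEN 1 ELSE 0 END) AS CountWomen
    FROM Rooms r
    LEFT JOIN Men m ON r.id = m.roomid
    LEFT JOIN Women w ON r.id = w.roomid
    WHERE r.type = 'LUX' AND r.active = 1 AND r.id IN (1, 2, 3)
    GROUP BY r.id
""").fetchall()

# Correlated subqueries count each table independently of the other.
subqueries = conn.execute("""
    SELECT r.id,
           (SELECT COUNT(*) FROM Men m
            WHERE m.status = 'ADULT' AND m.roomid = r.id) AS CountMen,
           (SELECT COUNT(*) FROM Women w
            WHERE w.status = 'ADULT' AND w.roomid = r.id) AS CountWomen
    FROM Rooms r
    WHERE r.type = 'LUX' AND r.active = 1 AND r.id IN (1, 2, 3)
""").fetchall()

print(double_join)  # inflated counts
print(subqueries)   # one count per table
```

This also explains why the original query is only "sometimes" wrong: a room with at most one man and one woman produces a single joined row, so the counts happen to come out right.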
Let's say I've got this database:
book
| idBook | name |
|--------|----------|
| 1 |Book#1 |
category
| idCateg| category |
|--------|----------|
| 1 |Adventures|
| 2 |Science F.|
book_categ
| id | idBook | idCateg | DATA |
|--------|--------|----------|--------|
| 1 | 1 | 1 | (null) |
| 2 | 1 | 2 | (null) |
I'm trying to select only the books which are in category 1 AND category 2, something like this:
SELECT book.* FROM book,book_categ
WHERE book_categ.idCateg = 1 AND book_categ.idCateg = 2
Obviously, this gives 0 results because each row has only one idCateg. It does work with OR, but the results are not what I need. I've also tried to use a join, but I just can't get the results I expect.
Here is the SQLFiddle of my current project, with my current DB; the data at the beginning is just a sample.
Any help will be really appreciated.
Solution using EXISTS:
select *
from book b
where exists (select 'x'
from book_categ x
where x.idbook = b.idbook
and x.idcateg = 1)
and exists (select 'x'
from book_categ x
where x.idbook = b.idbook
and x.idcateg = 2)
Solution using join with an inline view:
select *
from book b
join (select idbook
from book_categ
where idcateg in (1, 2)
group by idbook
having count(*) = 2) x
on b.idbook = x.idbook
You could try using ALL instead of IN (if you only want values that match all criteria to be returned):
SELECT book.*
FROM book, book_categ
WHERE book_categ.idCateg = ALL(1 , 2)
One way to get the result is to join to the book_categ table twice, something like
SELECT b.*
FROM book b
JOIN book_categ c1
ON c1.idBook = b.idBook
AND c1.idCateg = 1
JOIN book_categ c2
ON c2.idBook = b.idBook
AND c2.idCateg = 2
This assumes that (idBook, idCateg) is constrained to be unique in the book_categ table. If it isn't unique, then this query can return duplicate rows. Adding a GROUP BY clause or the DISTINCT keyword will eliminate any generated duplicates.
There are several other queries that can generate the same result.
For example, another approach to finding idBook values that are in two categories is to get all the rows with idCateg values of 1 or 2, then GROUP BY idBook and count the DISTINCT idCateg values:
SELECT b.*
FROM book b
JOIN ( SELECT d.idBook
FROM book_categ d
WHERE d.idCateg IN (1,2)
GROUP BY d.idBook
HAVING COUNT(DISTINCT d.idCateg) = 2
) c
ON c.idBook = b.idBook
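The COUNT(DISTINCT ...) approach generalises to any number of required categories. Below is a rough sketch using Python's sqlite3 module; the helper name `books_in_all_categories`, the extra books, and the deliberately duplicated link row are assumptions added for illustration, while the table and column names are taken from the question.

```python
import sqlite3


def books_in_all_categories(conn, categ_ids):
    """Return idBook values of books that belong to every category in categ_ids."""
    wanted = sorted(set(categ_ids))
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT b.idBook
        FROM book b
        JOIN book_categ bc ON bc.idBook = b.idBook
        WHERE bc.idCateg IN ({placeholders})
        GROUP BY b.idBook
        HAVING COUNT(DISTINCT bc.idCateg) = ?
        ORDER BY b.idBook
    """
    # The last parameter is the number of distinct categories that must match.
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book (idBook INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book_categ (id INTEGER PRIMARY KEY, idBook INT, idCateg INT);
    INSERT INTO book VALUES (1, 'Book#1'), (2, 'Book#2'), (3, 'Book#3');
    INSERT INTO book_categ (idBook, idCateg) VALUES
        (1, 1), (1, 2), (1, 2),   -- duplicate link on purpose
        (2, 1),
        (3, 2), (3, 3);
""")

print(books_in_all_categories(conn, [1, 2]))
print(books_in_all_categories(conn, [2, 3]))
```

Counting DISTINCT category ids (rather than rows) keeps the result correct even when the link table contains duplicate pairs, as the sample data does for book 1.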
Ok, I have an example table with the following information and query.
First up is the data, with the question following at the end.
Here's the SQL Dump:
http://pastie.org/private/o7zzajdpm6lzcbqrjolgg
Or you can use the visual included below:
Purchases Table
| id | brand | date       |
|----|-------|------------|
| 1  | b1    | 2000-01-01 |
| 2  | b1    | 2000-01-03 |
| 3  | b2    | 2000-01-04 |
| 4  | b3    | 2000-01-08 |
| 5  | b4    | 2000-01-14 |
Owners Table
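| id | firstname | lastname | purchaseid | itemCoupon | itemReturned | Accessories |
|----|-----------|----------|------------|------------|--------------|-------------|
| 1  | Jane      | Doe      | 1          | yes        | no           | 4           |
| 2  | Jane      | Doe      | 2          | yes        | no           | 2           |
| 3  | Jane      | Doe      | 3          | no         | no           | 1           |
| 4  | Jane      | Doe      | 4          | no         | no           | 3           |
| 5  | Jane      | Doe      | 5          | no         | yes          | 6           |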
The Query
SELECT brand, COALESCE( SUM( inTime.Accessories ) , 0 ) AS accessory_sum
FROM purchases
INNER JOIN owners AS person ON person.purchaseid = purchases.id
AND person.firstname = 'Jane'
AND person.lastname = 'Doe'
LEFT JOIN owners AS inTime ON person.id = inTime.id
AND purchases.date
BETWEEN DATE( '2000-01-01' )
AND DATE( '2000-01-05' )
GROUP BY purchases.brand
This gives the following expected result:
| brand | accessory_sum |
|-------|---------------|
| b1    | 6             |
| b2    | 1             |
| b3    | 0             |
| b4    | 0             |
The question
Now, I would like to add to the query:
WHERE itemCoupon = 'yes' OR itemReturned = 'yes'
But this overrides the last join and when I do the same search above I get:
| brand | accessory_sum |
|-------|---------------|
| b1    | 6             |
| b2    | 1             |
Similarly, I still want the brands with no matching rows (for example the purchases on 2000-01-04 and 2000-01-08) to be returned when using WHERE itemCoupon = 'yes' OR itemReturned = 'yes'. Removing the WHERE and trying it another way gives me zeros for all brands.
Basically, I want to keep the way the WHERE filters but also keep the format of the expected output in the first example.
As it is now, using WHERE destroys the way the last LEFT JOIN works with COALESCE, which fills the remaining brand rows with zeros.
Your WHERE turns the outer join into an inner join.
You need to move your additional condition into the LEFT JOIN condition:
LEFT JOIN owners as inTime
ON person.id = inTime.id
AND purchases.date BETWEEN DATE('2000-01-01') AND DATE('2000-01-05')
AND (inTime.itemCoupon = 'yes' or inTime.itemReturned = 'yes')
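The difference between filtering in the ON clause and filtering in WHERE can be checked with a simplified sketch in Python's sqlite3 module. It uses the sample rows from the question but, as a simplification, joins owners only once instead of twice and compares dates as ISO text; the variable names `filter_in_on` and `filter_in_where` are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchases (id INT, brand TEXT, date TEXT);
    CREATE TABLE owners (id INT, firstname TEXT, lastname TEXT, purchaseid INT,
                         itemCoupon TEXT, itemReturned TEXT, Accessories INT);
    INSERT INTO purchases VALUES
        (1, 'b1', '2000-01-01'), (2, 'b1', '2000-01-03'), (3, 'b2', '2000-01-04'),
        (4, 'b3', '2000-01-08'), (5, 'b4', '2000-01-14');
    INSERT INTO owners VALUES
        (1, 'Jane', 'Doe', 1, 'yes', 'no', 4), (2, 'Jane', 'Doe', 2, 'yes', 'no', 2),
        (3, 'Jane', 'Doe', 3, 'no', 'no', 1), (4, 'Jane', 'Doe', 4, 'no', 'no', 3),
        (5, 'Jane', 'Doe', 5, 'no', 'yes', 6);
""")

# Filter inside ON: unmatched purchases survive with NULLs, so every brand
# is still listed and COALESCE turns the missing sums into 0.
filter_in_on = conn.execute("""
    SELECT p.brand, COALESCE(SUM(o.Accessories), 0) AS accessory_sum
    FROM purchases p
    LEFT JOIN owners o
           ON o.purchaseid = p.id
          AND p.date BETWEEN '2000-01-01' AND '2000-01-05'
          AND (o.itemCoupon = 'yes' OR o.itemReturned = 'yes')
    GROUP BY p.brand
    ORDER BY p.brand
""").fetchall()

# Same filter in WHERE: rows where the owner columns are NULL fail the test
# and are removed, which makes the LEFT JOIN behave like an INNER JOIN.
filter_in_where = conn.execute("""
    SELECT p.brand, COALESCE(SUM(o.Accessories), 0) AS accessory_sum
    FROM purchases p
    LEFT JOIN owners o
           ON o.purchaseid = p.id
          AND p.date BETWEEN '2000-01-01' AND '2000-01-05'
    WHERE o.itemCoupon = 'yes' OR o.itemReturned = 'yes'
    GROUP BY p.brand
    ORDER BY p.brand
""").fetchall()

print(filter_in_on)
print(filter_in_where)
```

With the filter in ON, b2 drops to 0 because its only in-range purchase has neither a coupon nor a return, yet the brand itself stays in the output.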
The ON clause of a JOIN works much like a WHERE clause. So instead of trying to use WHERE, just add another AND to your query (and don't forget the parentheses around the OR condition):
SELECT brand,
COALESCE(SUM(inTime.Accessories),0) as accessory_sum
FROM purchases
INNER JOIN owners AS person
ON person.purchaseid = purchases.id
AND person.firstname = 'Jane'
AND person.lastname = 'Doe'
AND (person.itemCoupon = 'yes' OR person.itemReturned = 'yes')
LEFT JOIN owners AS inTime
ON person.id= inTime.id
AND purchases.date
BETWEEN DATE( '2000-01-01' )
AND DATE( '2000-01-05' )
GROUP BY purchases.brand
I have a table called user_scores as below:
id | af_id | uid | level | record_date
----------------------------------------
1 | 1.1 | 1 | 3 | 2012-01-01
2 | 1.1 | 1 | 4 | 2012-02-01
3 | 1.2 | 1 | 3 | 2012-01-01
4 | 1.2 | 1 | 5 | 2012-03-01
...
I have another table called user_info as below:
uid | forename | surname | gender
-----------------------------------
1 | Homer | Simpson | M
2 | Marge | Simpson | F
3 | Bart | Simpson | M
4 | Lisa | Simpson | F
...
In user scores uid is the user id of a registered user on the system, af_id identifies a particular test a user submits. A user scores a level between 1 - 5 for each test, which can be submitted every month.
My problem is I need to produce an analysis at the end of the year to COUNT the number of users that have achieved each level for a particular test. The analysis is to show a gender split for male and female.
So, for example, an administrator would select test 1.1 and the system would generate stats that COUNT the users by the MAX level each one achieved in the year, with a gender split.
Any help is much appreciated. Thank you in advance.
-
I think I need to clarify myself a bit. Because a user can complete the test multiple times throughout the year, there will be multiple scores for the same test. The query should take the highest level achieved and include this in the count. An example result would be:
Male Results:
level1 | level2 | level3 | level4 | level5
------------------------------------------
2 | 5 | 10 | 8 | 1
I am not certain I get exactly what you mean, but as always I'll have a go. As I understand it you want to know how many people from each gender reached each level in a certain year.
SELECT MaxLevel,
COUNT(CASE WHEN ui.Gender = 'M' THEN 1 END) AS Males,
COUNT(CASE WHEN ui.Gender = 'F' THEN 1 END) AS Females
FROM User_Info ui
INNER JOIN
( SELECT MAX(Level) AS MaxLevel,
UID
FROM User_Scores us
WHERE af_ID = '1.1'
AND YEAR(Record_Date) = 2012
GROUP BY UID
) AS MaxUs
ON MaxUs.uid = ui.UID
GROUP BY MaxLevel
I've put some sample data on SQL Fiddle so you can see if it is what you were after.
EDIT
To transpose the data so levels are along the top and Gender in the rows the following will work:
SELECT Gender,
COUNT(CASE WHEN MaxLevel = 1 THEN 1 END) AS Level1,
COUNT(CASE WHEN MaxLevel = 2 THEN 1 END) AS Level2,
COUNT(CASE WHEN MaxLevel = 3 THEN 1 END) AS Level3,
COUNT(CASE WHEN MaxLevel = 4 THEN 1 END) AS Level4,
COUNT(CASE WHEN MaxLevel = 5 THEN 1 END) AS Level5
FROM User_Info ui
INNER JOIN
( SELECT MAX(Level) AS MaxLevel,
UID
FROM User_Scores us
WHERE af_ID = '1.1'
AND YEAR(Record_Date) = 2012
GROUP BY UID
) AS MaxUs
ON MaxUs.uid = ui.UID
GROUP BY Gender
Note, that if there are ever more than 5 levels you will need to add more to the select statement, or start building dynamic SQL.
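A minimal sketch of the transposed query, run through Python's sqlite3 module. SQLite has no YEAR() function, so the year filter is written as a date range over ISO text dates; the user_scores rows beyond those shown in the question are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_info (uid INT, forename TEXT, surname TEXT, gender TEXT);
    CREATE TABLE user_scores (id INTEGER PRIMARY KEY, af_id TEXT, uid INT,
                              level INT, record_date TEXT);
    INSERT INTO user_info VALUES
        (1, 'Homer', 'Simpson', 'M'), (2, 'Marge', 'Simpson', 'F'),
        (3, 'Bart', 'Simpson', 'M'), (4, 'Lisa', 'Simpson', 'F');
    INSERT INTO user_scores (af_id, uid, level, record_date) VALUES
        ('1.1', 1, 3, '2012-01-01'), ('1.1', 1, 4, '2012-02-01'),
        ('1.2', 1, 5, '2012-03-01'),                                -- other test
        ('1.1', 2, 4, '2012-01-01'),
        ('1.1', 3, 2, '2012-01-01'), ('1.1', 3, 5, '2011-06-01'),  -- other year
        ('1.1', 4, 1, '2012-01-01'), ('1.1', 4, 5, '2012-05-01');
""")

# Step 1 (subquery): best level per user for test 1.1 in 2012.
# Step 2 (outer query): one row per gender, one counter column per level.
gender_split = conn.execute("""
    SELECT ui.gender,
           COUNT(CASE WHEN s.maxlevel = 1 THEN 1 END) AS level1,
           COUNT(CASE WHEN s.maxlevel = 2 THEN 1 END) AS level2,
           COUNT(CASE WHEN s.maxlevel = 3 THEN 1 END) AS level3,
           COUNT(CASE WHEN s.maxlevel = 4 THEN 1 END) AS level4,
           COUNT(CASE WHEN s.maxlevel = 5 THEN 1 END) AS level5
    FROM user_info ui
    JOIN (SELECT uid, MAX(level) AS maxlevel
          FROM user_scores
          WHERE af_id = '1.1'
            AND record_date >= '2012-01-01' AND record_date < '2013-01-01'
          GROUP BY uid) s ON s.uid = ui.uid
    GROUP BY ui.gender
    ORDER BY ui.gender
""").fetchall()

for row in gender_split:
    print(row)
```

Each user contributes to exactly one level column, because the subquery has already reduced their attempts to a single maximum.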
Assuming record_date holds only dates (without time parts):
SELECT
s.maxlevel,
COUNT(NULLIF(gender, 'F')) AS M,
COUNT(NULLIF(gender, 'M')) AS F
FROM user_info u
INNER JOIN (
SELECT
uid,
MAX(level) AS maxlevel
FROM user_scores
WHERE record_date > DATE_SUB(CURDATE(), INTERVAL DAYOFYEAR(CURDATE()) DAY)
AND af_id = '1.1'
GROUP BY
uid
) s ON s.uid = u.uid
GROUP BY
s.maxlevel
That will show you only the maximum levels found in the user_scores table. If you have a Levels table where all possible levels (1 to 5) are listed, you could use that table to get a complete list of levels. If some levels are not present in the requested subset of data, the corresponding rows will show 0s in both columns.
Here's the above script with minor changes to show the complete chart of levels:
SELECT
l.level AS maxlevel,
COUNT(NULLIF(gender, 'F')) AS M,
COUNT(NULLIF(gender, 'M')) AS F
FROM user_info u
INNER JOIN (
SELECT
uid, MAX(level) AS maxlevel
FROM user_scores
WHERE record_date > DATE_SUB(CURDATE(), INTERVAL DAYOFYEAR(CURDATE()) DAY)
AND af_id = '1.1'
GROUP BY
uid
) s ON s.uid = u.uid
RIGHT JOIN Levels l ON s.maxlevel = l.level
GROUP BY
l.level
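The idea of listing every level, including those nobody reached, can be sketched with Python's sqlite3 module. This version starts from the levels table and uses LEFT JOINs, which should give the same rows as the RIGHT JOIN form and also runs on SQLite versions without RIGHT JOIN support; it uses CASE inside COUNT instead of NULLIF and a fixed 2012 date range instead of CURDATE(). The `levels` table contents and the score rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE levels (level INT);
    CREATE TABLE user_info (uid INT, forename TEXT, surname TEXT, gender TEXT);
    CREATE TABLE user_scores (af_id TEXT, uid INT, level INT, record_date TEXT);
    INSERT INTO levels VALUES (1), (2), (3), (4), (5);
    INSERT INTO user_info VALUES
        (1, 'Homer', 'Simpson', 'M'), (2, 'Marge', 'Simpson', 'F'),
        (3, 'Bart', 'Simpson', 'M'), (4, 'Lisa', 'Simpson', 'F');
    INSERT INTO user_scores VALUES
        ('1.1', 1, 3, '2012-01-01'), ('1.1', 1, 4, '2012-02-01'),
        ('1.1', 2, 4, '2012-01-01'),
        ('1.1', 3, 2, '2012-01-01'),
        ('1.1', 4, 5, '2012-05-01');
""")

# Every row of levels is kept; levels with no matching user get NULLs,
# which COUNT(CASE ...) ignores, so they show up as 0 / 0.
full_chart = conn.execute("""
    SELECT l.level AS maxlevel,
           COUNT(CASE WHEN u.gender = 'M' THEN 1 END) AS M,
           COUNT(CASE WHEN u.gender = 'F' THEN 1 END) AS F
    FROM levels l
    LEFT JOIN (SELECT uid, MAX(level) AS maxlevel
               FROM user_scores
               WHERE af_id = '1.1'
                 AND record_date >= '2012-01-01' AND record_date < '2013-01-01'
               GROUP BY uid) s ON s.maxlevel = l.level
    LEFT JOIN user_info u ON u.uid = s.uid
    GROUP BY l.level
    ORDER BY l.level
""").fetchall()

for row in full_chart:
    print(row)
```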
Hope this is what you're looking for!
This shows the number of records, grouped by user id and gender, together with the maximum level for af_id '1.1'.
select count(*), info.uid, info.gender, max(score.level)
from user_info as info
join user_scores as score
on info.uid = score.uid
where score.af_id = '1.1'
group by info.uid, info.gender;
EDITED based on your edit.
select sum(if(a.gender="M",1,0)) Male_users, sum(if(a.gender="F",1,0)) Female_users
from myTable a where
a.level = (select max(b.level) from myTable b where a.uid=b.uid)
group by af_id;
I typed this in a rush. But it should work or at least get you where you need to go. E.G. if you need to specify time frame, add that.
You need something like
SELECT
uid,
MAX(level)
FROM
user_scores
WHERE
record_date BETWEEN '2012-01-01' AND '2012-12-31'
AND af_id='1.1'
GROUP BY uid
If you need the gender split then, depending on what stat you need per gender, you can either add a JOIN on the user_info table to this query (to get the MAX per gender) or wrap this as a sub-query and JOIN on the whole thing.