SQL Query to count IDs with exactly 91 duplicates - mysql

I've got a MySQL database of survey responses. Each one has a userID, a questionID and the actual answer. Now I'm trying to write a report that will tell me how many people actually completed the survey as opposed to stopping halfway through. So I'm trying to figure out how to write a query that will count all of the userIDs that are duplicated exactly 91 times.
Be gentle, this is my first Stack Overflow question.

You have to use GROUP BY together with HAVING COUNT(*) = 91:
select userId from myTable group by userId having count(*) = 91
http://dev.mysql.com/doc/refman/5.0/en/group-by-hidden-columns.html

I don't have data to test this, but this may help:
SELECT COUNT(*) AS userCount, userId
FROM tbl
GROUP BY userId
HAVING userCount = 91
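Both queries above list the matching userIds rather than counting them. A minimal sketch of wrapping the grouped query in an outer COUNT(*), run here against an in-memory SQLite database through Python's sqlite3 module rather than MySQL. The table name `responses` and the demo data are assumptions for illustration only:

```python
import sqlite3

def count_users_with_n_answers(conn, n):
    """Return how many userIds appear exactly n times in the (hypothetical) responses table."""
    sql = """
        SELECT COUNT(*)
        FROM (
            SELECT userId
            FROM responses
            GROUP BY userId
            HAVING COUNT(*) = ?
        ) AS completed
    """
    return conn.execute(sql, (n,)).fetchone()[0]

def build_demo():
    """Create a throwaway database: users 1 and 2 answer 91 questions, user 3 stops after 40."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE responses (userId INTEGER, questionId INTEGER, answer TEXT)"
    )
    rows = []
    for user, answered in ((1, 91), (2, 91), (3, 40)):
        rows += [(user, q, "x") for q in range(1, answered + 1)]
    conn.executemany("INSERT INTO responses VALUES (?, ?, ?)", rows)
    return conn

conn = build_demo()
print(count_users_with_n_answers(conn, 91))  # number of users who finished the survey
```

The inner SELECT is the same GROUP BY / HAVING query as above; the outer SELECT only counts the rows it returns.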

If the questions are ordered, then you could check for the last question ID:
SELECT COUNT(*) FROM SurveyResponses WHERE QuestionId= [last_question_id]

Related

SQL Selecting Repeated Rows in a table

I am currently refreshing my SQL knowledge.
I have a table - Sessions. It stores information about user login activity, i.e. how long each user is logged in for. See the table below.
So I am trying to select all repeated rows from a table (not just validate that repeated rows exist).
So far I have managed to get the output of the entire table, however, I only need the userId and duration columns. How can I go about selecting only these two columns?
I thought it would have been SELECT a.userId instead of a.* etc however I get the error "ambiguous column name: userId". Not sure what is going on. Sorry if it's a stupid question but any help is appreciated. Thanks.
SELECT a.*
FROM sessions a
JOIN ( SELECT userId,duration
FROM sessions
GROUP BY userId
HAVING COUNT(userId) > 1 ) b
ON a.userId = b.userId
ORDER BY userId;
The problem is due to the ORDER BY clause, which does not scope the userId reference to one of the tables. Use this version:
ORDER BY a.userId;
Here is your updated query, with the select clause of the subquery also corrected by removing the incorrect (and unnecessary) reference to duration:
SELECT a.*
FROM sessions a
INNER JOIN
(
SELECT userId
FROM sessions
GROUP BY userId
HAVING COUNT(userId) > 1
) b
ON a.userId = b.userId
ORDER BY
a.userId;
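To return only the userId and duration columns, qualify them with the outer alias in the SELECT list as well. A minimal sketch using Python's sqlite3 module with an in-memory database; the sample rows are made up, and ordering by duration as a second key is only there to make the output deterministic:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (userId INTEGER, duration INTEGER)")
# Hypothetical data: users 1 and 2 have two sessions each, user 3 has one.
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?)",
    [(1, 30), (2, 45), (1, 10), (3, 5), (2, 20)],
)

REPEATED = """
    SELECT a.userId, a.duration
    FROM sessions a
    INNER JOIN (
        SELECT userId
        FROM sessions
        GROUP BY userId
        HAVING COUNT(userId) > 1
    ) b ON a.userId = b.userId
    ORDER BY a.userId, a.duration
"""

def repeated_sessions(conn):
    """Return (userId, duration) for every session of a user with more than one session."""
    return conn.execute(REPEATED).fetchall()

for row in repeated_sessions(conn):
    print(row)
```

Every column reference outside the subquery carries the `a.` prefix, so nothing is left for the database to resolve between `a` and `b`.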

Get top 5 most popular values in mysql with where clause

I'm working on a project and I have a problem. I have a table named friends with three columns: id, from_email and to_email (it's a social networking site and "from_email" is the person that follows the "to_email"). I want a query to return the top 5 friends I follow according to the number of their followers. I know that the query for top 5 is:
SELECT
to_email,
COUNT(*) AS friendsnumber
FROM
friends
GROUP BY
to_email
ORDER BY
friendsnumber DESC
LIMIT 5
Any ideas?
I would also like to return friends with the same number of followers ordered by their name. Is it possible?
You should use COUNT(from_email) instead of COUNT(*); because you want to calculate the number of followers, which is represented by from_email.
Thus, your select clause would be something like:
SELECT to_email, COUNT(from_email) as magnitude
As for getting the most popular people that you follow, you could use an IN clause:
WHERE to_email IN (SELECT to_email FROM friends WHERE from_email='MY_EMAIL');
As for the name, you would need to join this query with the other table that contains the name value.
Since you've got the essentials now, I hope you can try to compose the full query on your own =)
Join again to the table for the 2nd tier count:
SELECT f1.to_email
FROM friends f1
JOIN friends f2 on f2.to_email = f1.to_email
WHERE f1.from_email = 'myemail'
GROUP BY 1
ORDER BY count(*) DESC
LIMIT 5
If an index is defined on to_email, the join can use it, so this should perform well.
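A minimal sketch of the self-join approach, with a tie-breaker added to the ORDER BY. It runs against an in-memory SQLite database through Python's sqlite3 module, and the addresses are invented. Since the table holding names is not shown in the question, ties are broken by to_email as a stand-in; it also assumes each (from_email, to_email) pair is stored only once:

```python
import sqlite3

TOP_FOLLOWED = """
    SELECT f1.to_email, COUNT(f2.from_email) AS followers
    FROM friends f1
    JOIN friends f2 ON f2.to_email = f1.to_email
    WHERE f1.from_email = ?
    GROUP BY f1.to_email
    ORDER BY followers DESC, f1.to_email ASC
    LIMIT 5
"""

def top_followed(conn, my_email):
    """Return up to five (to_email, followers) pairs for the people my_email follows."""
    return conn.execute(TOP_FOLLOWED, (my_email,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE friends (id INTEGER PRIMARY KEY, from_email TEXT, to_email TEXT)"
)
follows = [
    ("me@example.com", "ann@example.com"),
    ("me@example.com", "bob@example.com"),
    ("me@example.com", "cat@example.com"),
    ("x@example.com", "ann@example.com"),
    ("y@example.com", "ann@example.com"),
    ("x@example.com", "bob@example.com"),
    ("z@example.com", "cat@example.com"),
    ("x@example.com", "dan@example.com"),  # not followed by me, so it is filtered out
]
conn.executemany("INSERT INTO friends (from_email, to_email) VALUES (?, ?)", follows)

for row in top_followed(conn, "me@example.com"):
    print(row)
```

Here f1 restricts the result to the people I follow, and f2 supplies one row per follower of each of them, so the count per group is that person's follower total.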

MySQL Find users that are part of two different projects

I'm searching my submission fields for users that signed up for two different projects. This is what I have, that is not working correctly. Any help would be great!
SELECT
user_id, COUNT(*)
FROM submissions
WHERE
project_id = 125
or project_id = 81
group by
user_id
HAVING COUNT(*) >= 2
So to clarify, I want to know what users have a submission from project_id 81 AND project_id 125. Each of the submissions
The right syntax is this one; you're missing a *:
SELECT
user_id, COUNT(*)
FROM
submissions
WHERE
project_id = 125 or project_id = 81
GROUP BY
user_id
HAVING
COUNT(*) >= 2
In case a user can submit to the same project multiple times, it's better to write your HAVING condition like this:
HAVING COUNT(DISTINCT project_id)>=2
so we can be sure that it matches two distinct projects, and not just one project submitted multiple times.
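The difference between the two HAVING conditions shows up as soon as one user submits to the same project twice. A small sketch with made-up rows, using Python's sqlite3 module and an in-memory database instead of MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE submissions (user_id INTEGER, project_id INTEGER)")
conn.executemany(
    "INSERT INTO submissions VALUES (?, ?)",
    [
        (1, 125), (1, 81),           # one submission in each project
        (2, 125), (2, 125),          # same project twice
        (3, 81),                     # only one project
        (4, 125), (4, 81), (4, 81),  # both projects, one of them twice
        (5, 125), (5, 7),            # second project is not one we filter on
    ],
)

BASE = """
    SELECT user_id
    FROM submissions
    WHERE project_id IN (125, 81)
    GROUP BY user_id
    HAVING {condition}
    ORDER BY user_id
"""

def users_matching(condition):
    # condition is one of the two fixed strings below, never user input
    return [row[0] for row in conn.execute(BASE.format(condition=condition))]

by_rows = users_matching("COUNT(*) >= 2")
by_projects = users_matching("COUNT(DISTINCT project_id) >= 2")
print(by_rows)      # includes user 2, who only ever touched project 125
print(by_projects)  # only users with submissions in both projects
```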

SQL query - select max where a count greater than value

I've got two tables with the following structure:
Question table
id int,
question text,
answer text,
level int
Progress table
qid int,
attempts int,
completed boolean (qid means question id)
Now my question is how to construct a query that selects the max level where the count of correct questions is greater than, let's say, 30.
I created this query, but it doesn't work and I don't know why.
SELECT MAX(Questions.level)
FROM Questions, Progress
WHERE Questions.id = Progress.qid AND Progress.completed = 1
GROUP BY Questions.id, Questions.level
Having COUNT(*) >= 30
I would like to have it in one query as I suspect this is possible and probably the most 'optimized' way to query for it. Thanks for the help!
This sort of construct will work. You can figure out the details.
select max(something) maxvalue
from SomeTables
join (select id, count(*) records
from ATable
group by id) temp on SomeTables.id = temp.id
where records >= 30
Do it step by step rather than joining the two tables. In an inner select find the questions (i.e. the question ids) that were answered 30 times correctly. In an outer select find the corresponding levels and get the maximum value:
select max(level)
from questions
where id in
(
select qid
from progress
where completed = 1
group by qid
having count(*) >= 30
);
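The query above counts rows per question. If the intent is instead "the highest level in which at least N questions have been completed" (an assumption about the question, since Progress appears to hold one row per question), the grouping would be by level rather than by qid. A hedged sketch of that reading, using Python's sqlite3 module with an in-memory database, invented data and a small threshold in place of 30:

```python
import sqlite3

MAX_LEVEL = """
    SELECT MAX(level)
    FROM (
        SELECT q.level AS level
        FROM questions q
        JOIN progress p ON p.qid = q.id
        WHERE p.completed = 1
        GROUP BY q.level
        HAVING COUNT(*) >= ?
    ) AS levels
"""

def max_level_with_completed(conn, threshold):
    """Highest level with at least `threshold` completed questions, or None if there is none."""
    return conn.execute(MAX_LEVEL, (threshold,)).fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE questions (id INTEGER, question TEXT, answer TEXT, level INTEGER)")
conn.execute("CREATE TABLE progress (qid INTEGER, attempts INTEGER, completed INTEGER)")

# Hypothetical data: level 1 has questions 1-4, level 2 has 5-7, level 3 has 8-10.
levels = {1: range(1, 5), 2: range(5, 8), 3: range(8, 11)}
for level, ids in levels.items():
    for qid in ids:
        conn.execute("INSERT INTO questions VALUES (?, 'q', 'a', ?)", (qid, level))
        # Every question is completed except question 10.
        conn.execute("INSERT INTO progress VALUES (?, 1, ?)", (qid, 0 if qid == 10 else 1))

print(max_level_with_completed(conn, 3))
```

With this data level 3 has only two completed questions, so a threshold of 3 stops at level 2.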

How to avoid distinct

I have a query which works when I use DISTINCT. However, I have a feeling I could rewrite the query in a way that would help me avoid the use of DISTINCT, which would make it easier (quicker) for the database to process the query.
If there is no point in rewriting the query, please explain, if there is, please look at simplified query and give me a hint how to reformulate it so I wouldn't get duplicates in the first place.
SELECT Us.user_id, COUNT( DISTINCT Or.order_id ) AS orders
FROM users AS Us
LEFT JOIN events AS Ev ON Ev.user_id = Us.user_id
LEFT JOIN orders AS Or ON Or.event_id = Ev.event_id
OR Or.user_id = Us.user_id
GROUP BY Us.user_id
Short description of the query: I have a table of users, of their events and orders. Sometimes orders have column user_id, but mostly it is null and they have to be connected via event table.
Edit:
These are results of the simplified query I wrote, first without distinct and then including distinct.
user_id orders
3952 263
3953 7
3954 2
3955 6
3956 1
3957 0
...
user_id orders
3952 79
3953 7
3954 2
3955 6
3956 1
3957 0
...
Problem fixed:
SELECT COALESCE( Or.user_id, Ev.user_id ) AS user, COUNT( Or.order_id ) AS orders
FROM orders AS Or
LEFT JOIN events AS Ev ON Ev.event_id = Or.event_id
GROUP BY COALESCE( Or.user_id, Ev.user_id )
If an order can be associated with multiple events, or a user with an event multiple times, then it is possible for the same order to be associated with the same user multiple times. In this scenario, using DISTINCT will count that order only once per user whereas omitting it will count that order once for each association with the user.
If you're after the former, then your existing query is your best option.
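A small reproduction of how the duplicates arise, using Python's sqlite3 module and an in-memory database with invented rows. The aliases are renamed to u, e and o because OR is a keyword. User 1 has two events and one order linked directly by user_id, so that order matches once per event row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (user_id INTEGER);
    CREATE TABLE events (event_id INTEGER, user_id INTEGER);
    CREATE TABLE orders (order_id INTEGER, event_id INTEGER, user_id INTEGER);
    INSERT INTO users  VALUES (1), (2);
    INSERT INTO events VALUES (10, 1), (11, 1);
    -- order 100 is linked through an event, order 101 directly through user_id
    INSERT INTO orders VALUES (100, 10, NULL), (101, NULL, 1);
""")

COUNTS = """
    SELECT u.user_id,
           COUNT(o.order_id)          AS joined_rows,
           COUNT(DISTINCT o.order_id) AS distinct_orders
    FROM users AS u
    LEFT JOIN events AS e ON e.user_id = u.user_id
    LEFT JOIN orders AS o ON o.event_id = e.event_id
                          OR o.user_id = u.user_id
    GROUP BY u.user_id
    ORDER BY u.user_id
"""

order_counts = conn.execute(COUNTS).fetchall()
for row in order_counts:
    print(row)  # (user_id, count without DISTINCT, count with DISTINCT)
```

User 1 really has two orders, but the join produces three rows for them, which is the same inflation visible in the first result set in the question.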
You are not getting anything from the users table, nor the events table, so why join them? Your last "OR" clause makes explicit reference to a user_id column on the orders table. I would hope your orders table has an index on the user ID placing the order; then you could just do
select
user_id,
count(*) as Orders
from
orders
group by
user_id
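One caveat, given that the question says orders.user_id is mostly NULL: grouping on that column alone puts every order without a user_id into a single NULL group. A sketch comparing it with the COALESCE version from the question's edit, using Python's sqlite3 module, an in-memory database and made-up rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (event_id INTEGER, user_id INTEGER);
    CREATE TABLE orders (order_id INTEGER, event_id INTEGER, user_id INTEGER);
    INSERT INTO events VALUES (10, 1), (11, 2);
    -- orders 100 and 102 only know their event; order 101 only knows its user
    INSERT INTO orders VALUES (100, 10, NULL), (101, NULL, 1), (102, 11, NULL);
""")

PLAIN = """
    SELECT user_id, COUNT(*) AS orders
    FROM orders
    GROUP BY user_id
    ORDER BY user_id
"""

WITH_EVENTS = """
    SELECT COALESCE(o.user_id, e.user_id) AS resolved_user,
           COUNT(o.order_id)              AS orders
    FROM orders AS o
    LEFT JOIN events AS e ON e.event_id = o.event_id
    GROUP BY COALESCE(o.user_id, e.user_id)
    ORDER BY resolved_user
"""

plain_counts = conn.execute(PLAIN).fetchall()
resolved_counts = conn.execute(WITH_EVENTS).fetchall()
print(plain_counts)     # orders without a user_id land under None
print(resolved_counts)  # the same orders attributed through their event
```

Because each order joins to at most one event here, the COALESCE version needs no DISTINCT; that stops being true if an order can match several event rows.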