Count the number of occurrences of another count result - mysql

We have a table called entries which stores user information against a date. Users are allowed to enter the database once per day. Some example data:
+----+------------------+----------+
| id | email | date |
+----+------------------+----------+
| 1 | marty#domain.com | 04.09.13 |
| 2 | john#domain.com | 04.09.13 |
| 3 | marty#domain.com | 05.09.13 |
+----+------------------+----------+
I need to work out how many email addresses have X entries in the database. For example, the result should look like this for the above data, where one email address has 1 entry and one email address has 2 entries:
+---------------+-------+
| times_entered | total |
+---------------+-------+
| 1 | 1 |
| 2 | 1 |
+---------------+-------+
I've tried a few things, but the furthest I have got is a count of the number of times each email address was found. It seems like all I need to do from here is collate those results and perform another COUNT on those groups to get the totals, but I'm unsure how to do that.

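Usually it can be something like
select times_entered, count(*) as total from
( select count(*) as times_entered from entries group by email ) as t
group by times_entered
Note that MySQL requires an alias on the derived table (the "as t" here); without one the query fails with the error "Every derived table must have its own alias".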

The following will get the number of records per email:
SELECT COUNT(1) AS times_entered, email
FROM entries
GROUP BY email
Therefore, using this query as a derived table, we can group by the number of records to get the count (we do not need to select the email column in the subquery because we don't care about it):
SELECT times_entered, COUNT(1) AS total
FROM
(
SELECT COUNT(1) AS times_entered
FROM entries
GROUP BY email
) x
GROUP BY times_entered
SQL Fiddle demo
SQL Fiddle demo with a slightly larger data set
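The derived-table approach above can be checked locally with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (the query itself uses only standard SQL) and loads the three sample rows from the question; the column types are a guess.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL `entries` table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, email TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO entries (id, email, date) VALUES (?, ?, ?)",
    [
        (1, "marty#domain.com", "04.09.13"),
        (2, "john#domain.com", "04.09.13"),
        (3, "marty#domain.com", "05.09.13"),
    ],
)

# Inner query: number of entries per email.
# Outer query: how many emails share each of those counts.
QUERY = """
SELECT times_entered, COUNT(*) AS total
FROM (
    SELECT COUNT(*) AS times_entered
    FROM entries
    GROUP BY email
) AS x
GROUP BY times_entered
ORDER BY times_entered
"""

rows = conn.execute(QUERY).fetchall()
print(rows)  # [(1, 1), (2, 1)]
```

The ORDER BY is only there to make the output order predictable; it is not part of the original answer.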

It could be something like this:
SELECT times_entered, COUNT( times_entered ) AS total
FROM (
SELECT COUNT( email ) AS times_entered
FROM `entries`
WHERE 1
GROUP BY email
) AS tmp
GROUP BY times_entered;

Related

Query to Find Duplicate entries

I am looking for an SQL query to give me a list of duplicate entries in a table. However, there are 3 different columns to take into account. First is an ID, Second is a Name, and third is a Date. The situation is that there are multiple Names that are assigned with the same ID, and there are multiple records of those in a day, which makes THOUSANDS of different records per day.
I already filtered it so that only results for the past 7 days will show, but the number of records is still too large for me to extract. I just want to decrease the number of rows in the output in order to properly extract the results.
Sample
|--id-|--name--|-------date------|
| 1 | a |5-9-2015, 10:00am|
| 1 | a |5-8-2015, 10:02am|
| 1 | a |5-8-2015, 11:00am|
| 1 | b |5-8-2015, 10:00am|
| 1 | b |5-8-2015, 10:02am|
| 1 | c |5-8-2015, 10:00am|
| 2 | d |5-8-2015, 10:00am|
expected output
|--id-|--name--|
| 1 | a |
| 1 | b |
| 1 | c |
| 2 | d |
Inclusion of entries without any duplicates is fine. The important thing is to only return a single record of a unique id-name combination for a day.
Thanks in advance for any help that you can give.
You can get the combinations as:
select distinct id, name
from sample;
If you want only the duplicates, use group by and having:
select id, name
from sample
group by id, name
having count(*) > 1;
EDIT:
If you want this by date, then add date(date) to the group by and perhaps select clauses.
To return single id-name data per day you can use this:
select id, name
from tab
group by id, name, date(date)
The DATE() function extracts the date part of a date or date/time expression.
select id,name
from sample
group by id,name,DATE(date)
having count(*)>1;
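To see how the per-day grouping and the HAVING filter differ, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. It assumes the sample timestamps are stored in ISO format (the question shows them as "5-8-2015, 10:00am", which DATE() could not parse directly), and the `day` and `n` aliases are illustrative names only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (id INTEGER, name TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?, ?)",
    [
        # Assumption: the question's sample rows, rewritten as ISO timestamps.
        (1, "a", "2015-05-09 10:00:00"),
        (1, "a", "2015-05-08 10:02:00"),
        (1, "a", "2015-05-08 11:00:00"),
        (1, "b", "2015-05-08 10:00:00"),
        (1, "b", "2015-05-08 10:02:00"),
        (1, "c", "2015-05-08 10:00:00"),
        (2, "d", "2015-05-08 10:00:00"),
    ],
)

# One row per id/name combination per calendar day.
per_day = conn.execute(
    """
    SELECT id, name, DATE(date) AS day
    FROM sample
    GROUP BY id, name, DATE(date)
    ORDER BY id, name, day
    """
).fetchall()

# Only the combinations that occur more than once on the same day.
duplicates = conn.execute(
    """
    SELECT id, name, DATE(date) AS day, COUNT(*) AS n
    FROM sample
    GROUP BY id, name, DATE(date)
    HAVING COUNT(*) > 1
    ORDER BY id, name, day
    """
).fetchall()

print(per_day)
print(duplicates)
```

With this data the first query returns five rows (id 1 / name a appears once for each of its two days), while the second keeps only the two combinations that repeat within a day.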

MAX function in MySQL does not return proper key value

I have a table called tbl_user_sal:
| id | user_id | salary | date |
| 1 | 1 | 1000 | 2014-12-01 |
| 2 | 1 | 2000 | 2014-12-02 |
Now I want to get the id of the maximum date. I used the following query:
SELECT MAX(date) AS from_date, id, user_id, salary
FROM tbl_user_sal
WHERE user_id = 1
But it gave me this output:
| id | user_id | salary | from_date |
| 1 | 1 | 2000 | 2014-12-02 |
Which is correct as far as the max date being 2014-12-02, but the corresponding id is not correct. This happens for other records as well. I used order by to check but that was not successful either. Can anyone shed some light on this?
Note: For my needs, the row with the max date will not necessarily have the max id. A record can have the max date while its id is older.
If you only want to retrieve that information for a single user, which you seem to, because of your WHERE clause, just use ORDER BY and LIMIT:
SELECT *
FROM tbl_user_sal
WHERE user_id = 1
ORDER BY date DESC
LIMIT 1
If you want to do that for every user, however, you will have to get a little bit fancier. Something like this should do it (note that MySQL needs a space after "--" for a comment, and the id is wrapped in MIN() so the query stays valid when ONLY_FULL_GROUP_BY is enabled):
SELECT MIN(t2.id) AS id, user_id, date
-- find max date for each user_id
FROM (SELECT user_id, MAX(date) AS date
FROM tbl_user_sal
GROUP BY user_id) AS t1
-- join ids for each max date/user_id combo
JOIN tbl_user_sal AS t2
USING (user_id, date)
-- limit to 1 id for every user_id
GROUP BY
user_id, date
You are missing a GROUP BY clause. Try this:
select max(awrd_date) as from_date,awrd_id
from tbl_user_sal
where awrd_user_id = 106
group by awrd_id
What I believe you should do here is have a subquery that pulls the max date, and your outer query looks for the row with that date.
It looks like this:
SELECT *
FROM myTable
WHERE date = (SELECT MAX(date) FROM myTable);
Additional things may need to be added if you want to search for a specific user_id, or get the largest date for each user_id, but this gives your expected results for this example here.
Here is the SQL Fiddle.
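The two approaches discussed above (ORDER BY with LIMIT for one user, and a join against each user's MAX(date) for all users) can be sketched as follows. This uses Python's sqlite3 module as a stand-in for MySQL; the rows for user 2 are hypothetical and were added only to show a case where the latest date belongs to the older id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_user_sal ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, salary INTEGER, date TEXT)"
)
conn.executemany(
    "INSERT INTO tbl_user_sal VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1000, "2014-12-01"),
        (2, 1, 2000, "2014-12-02"),
        # Hypothetical rows: user 2's latest date sits on the older id.
        (3, 2, 1500, "2014-12-05"),
        (4, 2, 1700, "2014-12-03"),
    ],
)

# Single user: sort by date and keep the first row.
latest_for_user_1 = conn.execute(
    "SELECT id, user_id, salary, date FROM tbl_user_sal "
    "WHERE user_id = 1 ORDER BY date DESC LIMIT 1"
).fetchone()

# Every user: compute each user's max date, then join back for the full row.
latest_per_user = conn.execute(
    """
    SELECT t2.id, t2.user_id, t2.salary, t2.date
    FROM (
        SELECT user_id, MAX(date) AS max_date
        FROM tbl_user_sal
        GROUP BY user_id
    ) AS t1
    JOIN tbl_user_sal AS t2
      ON t2.user_id = t1.user_id AND t2.date = t1.max_date
    ORDER BY t2.user_id
    """
).fetchall()

print(latest_for_user_1)  # (2, 1, 2000, '2014-12-02')
print(latest_per_user)    # [(2, 1, 2000, '2014-12-02'), (3, 2, 1500, '2014-12-05')]
```

If two rows of the same user share the max date, the join returns both; an extra GROUP BY or tie-break column would be needed to keep just one.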

In mysql: how can I select the most recently added row when selecting by MAX if two values are equal (application is a games high score table)

I am trying to construct a highscore table from entries in a table with the layout
id(int) | username(varchar) | score(int) | modified (timestamp)
selecting the highest scores per day for each user is working well using the following:
SELECT id, username, MAX( score ) AS hiscore
FROM entries WHERE DATE( modified ) = CURDATE( )
Where I am stuck is that in some cases players may achieve the same score multiple times in the same day, in which case I need to make sure that it is always the earliest one that is selected, because when two scores match, the first player to have reached that score wins.
if my table contains the following:
id | username | score | modified
________|___________________|____________|_____________________
1 | userA | 22 | 2014-01-22 08:00:14
2 | userB | 22 | 2014-01-22 12:26:06
3 | userA | 22 | 2014-01-22 16:13:22
4 | userB | 15 | 2014-01-22 18:49:01
The returned winning table in this case should be:
id | username | score | modified
________|___________________|____________|_____________________
1 | userA | 22 | 2014-01-22 08:00:14
2 | userB | 22 | 2014-01-22 12:26:06
I tried to achieve this by adding ORDER BY modified desc to the query, but it always returns the later score. I tried ORDER BY modified asc as well, but I got the same result
This is the classic greatest-n-per-group problem, which has been answered frequently on StackOverflow. Here's a solution for your case:
SELECT e.*
FROM entries e
JOIN (
SELECT DATE(modified) AS modified_date, MAX(score) AS score
FROM entries
GROUP BY modified_date
) t ON DATE(e.modified) = t.modified_date AND e.score = t.score
WHERE DATE(e.modified) = CURDATE()
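As a sketch of the greatest-n-per-group idea applied to the question's exact requirement (best score per user for one day, earliest row winning a tie), the following uses Python's sqlite3 module as a stand-in for MySQL. It is one possible formulation rather than the query from the answer above; the fixed `:day` parameter replaces CURDATE() so the sample data stays reproducible, and the `best` and `e2` aliases are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries ("
    "id INTEGER PRIMARY KEY, username TEXT, score INTEGER, modified TEXT)"
)
conn.executemany(
    "INSERT INTO entries VALUES (?, ?, ?, ?)",
    [
        (1, "userA", 22, "2014-01-22 08:00:14"),
        (2, "userB", 22, "2014-01-22 12:26:06"),
        (3, "userA", 22, "2014-01-22 16:13:22"),
        (4, "userB", 15, "2014-01-22 18:49:01"),
    ],
)

QUERY = """
SELECT e.id, e.username, e.score, e.modified
FROM entries AS e
JOIN (
    -- best score per user on the given day
    SELECT username, MAX(score) AS score
    FROM entries
    WHERE DATE(modified) = :day
    GROUP BY username
) AS best
  ON best.username = e.username AND best.score = e.score
WHERE DATE(e.modified) = :day
  AND e.modified = (
      -- earliest time this user reached that best score on that day
      SELECT MIN(e2.modified)
      FROM entries AS e2
      WHERE e2.username = e.username
        AND e2.score = e.score
        AND DATE(e2.modified) = :day
  )
ORDER BY e.score DESC, e.modified ASC
"""

winners = conn.execute(QUERY, {"day": "2014-01-22"}).fetchall()
print(winners)
```

For the sample table this yields rows 1 and 2, matching the expected winning table in the question. The final ORDER BY ranks equal scores by who got there first.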
I think this would work for you and is the simplest way:
SELECT username, MAX(score), MIN(modified)
FROM entries
GROUP BY username
This returns this in your case:
"userB";22;"2014-01-22 12:26:06"
"userA";22;"2014-01-22 08:00:14"
However, I think what you actually want is the most recent row (in which case your example would be wrong). To get it, you need this:
SELECT username, MAX(score), MAX(modified)
FROM entries
GROUP BY username
Which returns:
"userB";22;"2014-01-22 18:49:01"
"userA";22;"2014-01-22 16:13:22"

SUM of Multiple COUNTs on Different Tables

This topic appears to be a popular one and definitely saturated in terms of the number of related posts, however, I've been working on this for 3 days and I cannot get this figured out.
I've been scouring this site and many others with potential solutions to this and some are executing, but I am not getting the expected results.
Here's what I'm trying to do...
SELECT and COUNT the number of reviews a user has submitted in the reviews table.
SELECT and COUNT the number of up-votes a user has in the reviewVotes table.
GROUP BY username (which is a key in both tables - usernames are unique, but exist in multiple rows).
Order the result set by the SUM of those COUNTs DESC. (This is something I keep trying, but can't get to even execute, so I am ordering by userReviewNum DESC right now.)
LIMIT the result set to the first 10.
The result set should give me the top 10 reviewers which is calculated by the number of reviews plus (+) the number of up-votes.
Here is my latest attempt which executes, but appears to be multiplying userReviewNum * reviewVotesNum and I need it to add them (but I have been extremely unsuccessful at any attempt to include the SUM command - so bad in fact that I am embarrassed to even show my attempts).
SELECT
reviews.username,
count(reviews.username) userReviewNum,
count(reviewVotes.username) reviewVotesNum
FROM reviews
LEFT JOIN reviewVotes ON reviews.username = reviewVotes.username
GROUP by reviews.username
ORDER BY userReviewNum DESC
LIMIT 0, 10
I've tried using a JOIN and a UNION and I can't seem to get either of them to work.
Any help anyone can provide is greatly appreciated!
UPDATE:
Here is the structure and some sample data.
Reviews Table (there are other fields, but these are the important ones):
| username | comment | rating | productID |
| foo | this is awesome! | 5 | xxxx |
| bar | i don't like this | 1 | xxxx |
| foo2 | it's ok | 3 | xxxx |
| foo | bleh - nasty | 1 | xxxx |
reviewVotes Table (again, more fields than this, but these are the important ones):
| username | voterUsername | productID |
| foo | foo2 | xxxx |
| foo2 | foo | xxxx | (the simple idea here is one user is up-voting another user's post)
So I need to count the number of reviews a user has in the Reviews table, then count the number of upvotes a user has in the reviewVotes table, and then order by the sum of those two numbers.
Additional UPDATE:
In the example above, here are the expected results:
Username | # Reviews
foo | 2
bar | 1
foo2 | 1
Username | # Up-Votes
foo | 1
foo2 | 1
Username | Total Sum
foo | 3
bar | 1
foo2 | 2
Try counting distinct reviews and votes like this:
SELECT
reviews.username,
COUNT(DISTINCT reviews.id) AS userReviewNum,
COUNT(DISTINCT reviewVotes.id) AS reviewVotesNum,
COUNT(DISTINCT reviews.id) + COUNT(DISTINCT reviewVotes.id) AS userRating
FROM
reviews
LEFT JOIN reviewVotes ON reviews.username = reviewVotes.username
GROUP by reviews.username
ORDER BY userRating DESC
LIMIT 10
Try this:
SELECT username, SUM(userReviewNum + reviewVotesNum) AS userRank
FROM (
SELECT
reviews.username,
count(reviews.username) userReviewNum,
count(reviewVotes.username) reviewVotesNum
FROM reviews
LEFT JOIN reviewVotes ON reviews.username = reviewVotes.username
GROUP by reviews.username
ORDER BY userReviewNum DESC
LIMIT 0, 10)
AS result_set
GROUP BY username
The group by there is, I think, required for the SUM to work.
Try this:
SELECT Res1.*, SUM(IF(reviewVotes.Username IS NULL, 0, 1)) AS UpVotes,
userReviewNum + SUM(IF(reviewVotes.Username IS NULL, 0, 1)) AS TotalSum FROM (
SELECT username, Count(*) AS userReviewNum
FROM reviews
GROUP BY username) AS Res1
LEFT OUTER JOIN reviewVotes ON Res1.username = reviewVotes.username
GROUP BY Res1.username
ORDER BY TotalSum DESC
The result would be this:
foo 2 1 3
foo2 1 1 2
bar 1 0 1
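The reason the original attempt inflates the counts is that joining the raw tables pairs every review with every vote for the same username before counting. A minimal sketch of that effect, and of aggregating each table separately before joining, is shown below. It uses Python's sqlite3 module as a stand-in for MySQL with the sample rows from the question; the aliases `r`, `v` and `totalSum` are illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reviews (username TEXT, comment TEXT, rating INTEGER, productID TEXT)"
)
conn.execute(
    "CREATE TABLE reviewVotes (username TEXT, voterUsername TEXT, productID TEXT)"
)
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?, ?, ?)",
    [
        ("foo", "this is awesome!", 5, "xxxx"),
        ("bar", "i don't like this", 1, "xxxx"),
        ("foo2", "it's ok", 3, "xxxx"),
        ("foo", "bleh - nasty", 1, "xxxx"),
    ],
)
conn.executemany(
    "INSERT INTO reviewVotes VALUES (?, ?, ?)",
    [("foo", "foo2", "xxxx"), ("foo2", "foo", "xxxx")],
)

# Naive join: foo's 2 reviews x 1 vote produce 2 joined rows,
# so the vote count is reported as 2 instead of 1.
naive = conn.execute(
    """
    SELECT reviews.username,
           COUNT(reviews.username) AS userReviewNum,
           COUNT(reviewVotes.username) AS reviewVotesNum
    FROM reviews
    LEFT JOIN reviewVotes ON reviews.username = reviewVotes.username
    WHERE reviews.username = 'foo'
    GROUP BY reviews.username
    """
).fetchone()

# Aggregate each table on its own, then join the per-user totals.
top = conn.execute(
    """
    SELECT r.username,
           r.userReviewNum,
           COALESCE(v.reviewVotesNum, 0) AS reviewVotesNum,
           r.userReviewNum + COALESCE(v.reviewVotesNum, 0) AS totalSum
    FROM (SELECT username, COUNT(*) AS userReviewNum
          FROM reviews GROUP BY username) AS r
    LEFT JOIN (SELECT username, COUNT(*) AS reviewVotesNum
               FROM reviewVotes GROUP BY username) AS v
      ON v.username = r.username
    ORDER BY totalSum DESC, r.username
    LIMIT 10
    """
).fetchall()

print(naive)  # ('foo', 2, 2)
print(top)    # [('foo', 2, 1, 3), ('foo2', 1, 1, 2), ('bar', 1, 0, 1)]
```

COALESCE turns the NULL that the LEFT JOIN produces for users without votes (bar here) into 0, so the addition still works.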

Select distinct column, and a count of the entries

I have a database table which stores competition entries from users.
I want to select each distinct email address along with the number of entries for it, so that I can see how many times each email address has been used for entries.
I am thinking something along the lines of
SELECT DISTINCT `email` FROM `tablename` but have a count in there somewhere?
Sorry, probably a very basic question really. But I can't seem to get it.
Is this what you want?
SELECT email, COUNT(*) totalCount
FROM tableName
GROUP BY email
This will give you the unique set of emails along with the total number of records for each one.
COUNT() is an aggregate function that counts the number of records in each group when GROUP BY is specified (here, each email); if there is no GROUP BY, it counts all records in the table.
CREATE TABLE tbl (`email` varchar(10));
INSERT INTO tbl (`email`)
VALUES
('a#b.com'),
('b#b.com'),
('c#b.com'),
('d#b.com'),
('e#b.com'),
('a#b.com'),
('b#b.com'),
('c#b.com'),
('c#b.com');
SELECT email, COUNT(*)
FROM tbl
GROUP BY email;
Result
| EMAIL | COUNT(*) |
----------------------
| a#b.com | 2 |
| b#b.com | 2 |
| c#b.com | 3 |
| d#b.com | 1 |
| e#b.com | 1 |
See a demo