How to count distinct values from two columns into one number - mysql

The two tables I'm working on are these:
Submissions:
+----+------------+
| id | student_id |
+----+------------+
|  1 |          1 |
|  2 |          2 |
|  3 |          3 |
+----+------------+
Group_submissions:
+----+---------------+------------+
| id | submission_id | student_id |
+----+---------------+------------+
|  1 |             1 |          2 |
|  2 |             2 |          1 |
+----+---------------+------------+
Only one student actually makes the submission and goes into the submissions table, while the others go into the group_submissions table (if the submission is a group submission).
I want to count the unique number of students who have made a submission, either as part of a group or alone.
I want just the number to be returned in the end (3, based on the data in the tables above).
A student who is in the submissions table should not be counted twice if he is also in the group_submissions table, and vice versa.
Students who have only made individual submissions (and are not in the group_submissions table) should also be counted, regardless of whether they have ever been in a group submission.
I'm already doing some other operations on these tables in a query I'm building, so a solution based on joining these two tables would help.
This is what I have tried:
count(distinct case when group_submissions.student_id is not null then group_submissions.student_id end) + count(distinct case when submissions.student_id is not null then submissions.student_id end)
But it gives me duplicates: a student who is in both tables is counted twice.
Any ideas?
NOTE: This is a MySQL database.

I think you want union and a count:
select count(*)
from ((select student_id
from submissions
)
union -- on purpose to remove duplicates
(select student_id
from group_submissions
)
) s;
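As a rough check of the UNION approach above, here is a minimal sketch that loads the sample rows from the question and runs the query. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in for MySQL; the query itself is plain SQL, and the variable names `total` and `naive` are only illustrative.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the UNION query is standard SQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE submissions (id INTEGER PRIMARY KEY, student_id INTEGER);
    CREATE TABLE group_submissions (
        id INTEGER PRIMARY KEY, submission_id INTEGER, student_id INTEGER);
    INSERT INTO submissions VALUES (1, 1), (2, 2), (3, 3);
    INSERT INTO group_submissions VALUES (1, 1, 2), (2, 2, 1);
""")

# UNION (not UNION ALL) drops duplicate student ids before counting.
total = conn.execute("""
    SELECT COUNT(*)
    FROM (SELECT student_id FROM submissions
          UNION
          SELECT student_id FROM group_submissions) s
""").fetchone()[0]

# For contrast: adding two separate distinct counts double-counts students
# who appear in both tables, which is the problem described in the question.
naive = conn.execute("""
    SELECT (SELECT COUNT(DISTINCT student_id) FROM submissions)
         + (SELECT COUNT(DISTINCT student_id) FROM group_submissions)
""").fetchone()[0]

print(total, naive)
```

With the sample data, the UNION version counts each student once, while the summed version counts students 1 and 2 twice.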

After reading the clarification, I don't think it is wise to force this computation into the join. Instead, compute the count as a separate, simple expression in the final result. A UNION of the two student_id columns removes the duplicates, so counting its rows gives such an expression.
OLD ANSWER BELOW THAT DOES NOT FIT THE PROBLEM:
Only a very simple fix to your current version is needed:
count(distinct case when group_submissions.student_id is not null then
group_submissions.student_id when assignment_submissions.student_id is
not null then assignment_submissions.student_id end)
Note:
Your original expression is an addition of two CASE expressions, each with a single WHEN inside.
I have turned it into a single CASE expression with two WHENs.

Related

SQL query to find duplicate rows and return both IDs

I have a table of customers:
id | name | email
--------------------------
1 | Rob | spam@email.com
2 | Jim | spam@email.com
3 | Dave | ham@email.com
4 | Fred | eggs@email.com
5 | Ben | ham@email.com
6 | Tom | ham@email.com
I'm trying to write an SQL query that returns all the rows with duplicate email addresses but... I'd like the query result to return the original ID and the duplicate ID. (The original ID is the first occurrence of the duplicate email.)
The desired result:
original_id | duplicate_id | email
-------------------------------------------
1 | 2 | spam@email.com
3 | 5 | ham@email.com
3 | 6 | ham@email.com
My research so far has indicated it might involve some kind of self join, but I'm stuck on the actual implementation. Can anyone help?
We could handle this using a join, but I might actually go for an option which generates a CSV list of the ids that share each duplicated email:
SELECT
email,
GROUP_CONCAT(id ORDER BY id) AS duplicate_ids
FROM yourTable
GROUP BY email
HAVING COUNT(*) > 1
Functionally speaking, this gives you the same information you wanted in your question, but in what is a much simplified form in my opinion. Because we order the id values when concatenating, the original id will always appear first, on the left side of the CSV list. Also, if you have many duplicates your requested output could become verbose and harder to read.
select
orig.original_id,
t.id as duplicate_id,
orig.email
from t
inner join (select min(id) as original_id, email
from t
group by email
having count(*)>1) orig on orig.email = t.email
where t.id != orig.original_id
The subquery finds, for every email that occurs more than once, the minimal id, which we treat as the original.
We then join the subquery back to the table by email and keep every other row with that email as a duplicate.
UPDATE: http://rextester.com/BLIHK20984 cloned @Tim Biegeleisen's answer
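The join-based answer above can be tried out with the following sketch. It assumes SQLite (via Python's sqlite3 module) in place of MySQL and a table named `customers`; the non-equality test is placed in the join condition, which is one possible way to exclude the original row.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the table name "customers" is illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
    INSERT INTO customers VALUES
        (1, 'Rob',  'spam@email.com'),
        (2, 'Jim',  'spam@email.com'),
        (3, 'Dave', 'ham@email.com'),
        (4, 'Fred', 'eggs@email.com'),
        (5, 'Ben',  'ham@email.com'),
        (6, 'Tom',  'ham@email.com');
""")

# The subquery picks the lowest id per duplicated email (the "original");
# the join pairs it with every other row that has the same email.
pairs = conn.execute("""
    SELECT orig.original_id, t.id AS duplicate_id, orig.email
    FROM customers t
    JOIN (SELECT MIN(id) AS original_id, email
          FROM customers
          GROUP BY email
          HAVING COUNT(*) > 1) orig
      ON orig.email = t.email
     AND t.id <> orig.original_id
    ORDER BY orig.original_id, t.id
""").fetchall()

for row in pairs:
    print(row)
```

The row for eggs@email.com does not appear, because the HAVING clause keeps only emails that occur more than once.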

How to create virtual column with multiple value using MySQL SELECT?

I can add virtual columns as
SELECT '1' as id
| id |
-------
| 1 |
But I want to add multiple values, for example:
SELECT ('1','2','3') as id
| id |
-------
| 1 |
| 2 |
| 3 |
But this doesn't work.
As Marc B said in a comment, you can't have a single query split a single row into multiple rows, but you can have multiple queries, each producing one of the values, and chain them together with UNION.
SELECT 1 id
UNION
SELECT 2
UNION
SELECT 3
As the answer was provided in a couple of comments I'll post it as a community wiki.
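A small sketch of the chained-SELECT idea, run against SQLite through Python's sqlite3 module as an assumed stand-in for MySQL. It uses UNION ALL rather than UNION, which skips the duplicate-removal step; that choice is mine and only matters if the same value is listed twice.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; no tables are needed for this query.
conn = sqlite3.connect(":memory:")

# Each SELECT contributes one row; the column name comes from the first SELECT.
rows = conn.execute("""
    SELECT 1 AS id
    UNION ALL
    SELECT 2
    UNION ALL
    SELECT 3
""").fetchall()

# Row order is not guaranteed without ORDER BY, so sort before using the values.
ids = sorted(row[0] for row in rows)
print(ids)
```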

MySQL - Is it possible to sort data from one table based on the presence of data in another?

I have a table that I want to be able to sort based on the existence of data in another, related table, but I'm not sure what I want to do is possible in a single query.
For example, say I have a Products table and a Notifications table. Each table has a bunch of columns, but the important ones for this purpose are an Active column, and a foreign key in the Notifications table that references the Products table. Each row in the Products table may be referenced 0 to N times in the Notifications table.
Products
ProductID | Active
----------+-------
1         | 1
2         | 1
3         | 1
4         | 1
5         | 1
Notifications
NotificationID | ProductID | Active | Type
---------------+-----------+--------+-----
1              | 2         | 1      | 2
2              | 3         | 0      | 1
3              | 3         | 1      | 1
4              | 5         | 1      | 1
5              | 3         | 1      | 1
One use case I'd like to support is to sort the data from the Products table based on whether or not there is an active Notification of a particular Type (Type=1) for the Product. So in the above example, Products 3 and 5 should be collated first or last, but all five products should still be in the result set.
I haven't been able to figure out a way to manage this in a single SELECT statement. I can easily pull just the Products that do or don't have an active Notification of a certain type, but I can't figure out a way to get them all at once and sort them based on that. Is it possible or do I just need to run a couple of separate queries?
What you want is accomplished through a join and aggregation. I would suggest summarizing the notifications table as a subquery to get what you want:
select p.*
from products p left join
(select productId, count(*) as cnt
from notifications n
where active = 1 and type = 1
group by productid
) n
on p.productid = n.productid
order by (n.productid is not null) desc;
This structure gives you the flexibility of using the existence (as shown above), or the count, or including the count in the select list.
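To see the ordering on the sample data from the question, here is a sketch that assumes SQLite (via Python's sqlite3 module) in place of MySQL. The subquery filters on both Active = 1 and Type = 1, as the question describes, and a secondary sort on the product id is added only to make the output deterministic.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (productid INTEGER PRIMARY KEY, active INTEGER);
    CREATE TABLE notifications (
        notificationid INTEGER PRIMARY KEY,
        productid INTEGER, active INTEGER, type INTEGER);
    INSERT INTO products VALUES (1, 1), (2, 1), (3, 1), (4, 1), (5, 1);
    INSERT INTO notifications VALUES
        (1, 2, 1, 2), (2, 3, 0, 1), (3, 3, 1, 1), (4, 5, 1, 1), (5, 3, 1, 1);
""")

# LEFT JOIN keeps every product; products with a matching active Type-1
# notification get a non-NULL n.productid and therefore sort first.
ordered = [row[0] for row in conn.execute("""
    SELECT p.productid
    FROM products p
    LEFT JOIN (SELECT productid, COUNT(*) AS cnt
               FROM notifications
               WHERE active = 1 AND type = 1
               GROUP BY productid) n
      ON p.productid = n.productid
    ORDER BY (n.productid IS NOT NULL) DESC, p.productid
""")]

print(ordered)
```

Products 3 and 5 come first, and all five products remain in the result.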

Getting Distinct Value Counts Across Multiple Fields/Tables in One MySQL Query

I have a table called visits which contains the following
link_id, id, browser, country, referer
This basically records visits to a certain link and stores the browser, country and referer of whoever visited that link in a database.
Now I need to show statistics for each link
I used the following query to get me all the browsers
SELECT browser, COUNT(browser) FROM visits GROUP BY browser
Which produced something like
Browser | Count(Browser)
Internet Explorer | 5
Chrome | 3
Now this worked as expected for browsers only but I'm looking for a way to count all occurrences of referers, browsers and countries in one single query.
Is there a way to do this?
Counting the occurrences of values across several different fields can easily be done in just one query.
Keep in mind that SELECT COUNT(column) on its own returns only one column, with only one numeric value. With a GROUP BY, every distinct value gets a row with two columns: Value, Count. To count across different fields, you'll need three: Field, Value, Count; and if you want to count different fields in different tables, you'll need four: Table, Field, Value, Count.
Observe how I am using UNION below for two different tables:
SELECT
"Table1" AS TableName,
"Field1" AS Field,
Field1 AS Value,
COUNT(Field1) AS COUNT
FROM Table1
GROUP BY Value
UNION
SELECT
"Table2" as TableName,
"Field2" as Field,
Field2 as Value,
COUNT(Field2) AS COUNT
FROM Table2
GROUP BY Value
You'll notice I use aliases such as "Table2" as TableName. UNION takes its column headers from the first SELECT, so the aliases in the later SELECTs are not strictly required, but keeping them matched makes the query easier to read.
So you can visualize what this returns, take a look:
+-------------------+----------------+----------+--------+
| TableName | Field | Value | COUNT |
+-------------------+----------------+----------+--------+
| ItemFee | PaymentType | | 228 |
| ItemFee | PaymentType | All | 1 |
| ItemFee | PaymentType | PaidOnly | 1 |
| Person | Presenter | | 692258 |
| Person | Presenter | N | 590 |
| Person | Presenter | Y | 8103 |
+-------------------+----------------+----------+--------+
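Applied to the visits table from the question, the same pattern might look like the sketch below. It assumes SQLite (via Python's sqlite3 module) in place of MySQL, and the four sample rows are made up purely for illustration. UNION ALL is used because the literal field label already keeps the three result sets distinct.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE visits (
        link_id INTEGER, id INTEGER PRIMARY KEY,
        browser TEXT, country TEXT, referer TEXT);
    INSERT INTO visits VALUES
        (1, 1, 'Chrome',  'US', 'google.com'),
        (1, 2, 'Chrome',  'DE', 'google.com'),
        (1, 3, 'Firefox', 'US', 'bing.com'),
        (2, 4, 'Chrome',  'US', 'google.com');
""")

# One grouped SELECT per field, each tagged with a literal field name.
stats = conn.execute("""
    SELECT 'browser' AS field, browser AS value, COUNT(*) AS cnt
    FROM visits GROUP BY browser
    UNION ALL
    SELECT 'country', country, COUNT(*) FROM visits GROUP BY country
    UNION ALL
    SELECT 'referer', referer, COUNT(*) FROM visits GROUP BY referer
    ORDER BY field, value
""").fetchall()

for row in stats:
    print(row)
```

To get statistics for a single link, each of the three SELECTs would need the same WHERE link_id = ... filter.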

Counting votes in a MySQL table only once or twice

I've got the following table:
+-----------------+
| id| user | vote |
+-----------------+
| 1 | 1 | text |
| 2 | 1 | text2|
| 3 | 2 | text |
| 4 | 3 | text3|
| 5 | 2 | text |
+-----------------+
What I want to do is to count the "votes"
SELECT COUNT(vote), vote FROM table GROUP BY vote
That works fine. Output:
+-------------------+
| count(vote)| vote |
+-------------------+
| 3 | text |
| 1 | text2|
| 1 | text3|
+-------------------+
But now I only want to count the first vote, or the first and second votes, from each user.
So the result I want is (if I count only the first vote):
+-------------------+
| count(vote)| vote |
+-------------------+
| 2 | text |
| 1 | text3|
+-------------------+
I tried to work with count(distinct...) but can't get it to work.
Any hint in the right direction?
You can do this in a single SQL statement with something like this:
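SELECT t.vote, COUNT(*)
FROM table1 t
INNER JOIN
(
    SELECT user, MIN(id) AS first_id  -- lowest id per user, taken as the first vote
    FROM table1
    GROUP BY user
) d ON d.first_id = t.id
GROUP BY t.vote
Note that this only counts each user's first vote, not the first two.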
The easiest way would be to use one of the "row numbering" solutions listed in this SO question. Then your original query's almost there:
SELECT
COUNT(vote),
vote
FROM tableWithRowNumberAdded
WHERE MadeUpRowNumber IN (1,2)
GROUP BY vote
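One possible way to get such a row number without window functions is a correlated subquery that counts how many of the same user's votes have an id less than or equal to the current one. The sketch below assumes SQLite (via Python's sqlite3 module) in place of MySQL, a table named `votes`, and that a lower id means an earlier vote; `count_votes` is a hypothetical helper, not part of any library.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; lower id means earlier vote.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE votes (id INTEGER PRIMARY KEY, user INTEGER, vote TEXT);
    INSERT INTO votes VALUES
        (1, 1, 'text'), (2, 1, 'text2'), (3, 2, 'text'),
        (4, 3, 'text3'), (5, 2, 'text');
""")


def count_votes(max_per_user):
    """Count votes, keeping only each user's first max_per_user votes."""
    return conn.execute("""
        SELECT v.vote, COUNT(*) AS cnt
        FROM votes v
        WHERE (SELECT COUNT(*)            -- position of this vote for its user
               FROM votes earlier
               WHERE earlier.user = v.user
                 AND earlier.id <= v.id) <= ?
        GROUP BY v.vote
        ORDER BY v.vote
    """, (max_per_user,)).fetchall()


first_only = count_votes(1)
first_two = count_votes(2)
print(first_only)
print(first_two)
```

The correlated subquery runs once per row, so on a large table an index on (user, id) would likely be needed for this to perform acceptably.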
My alternative is much longer winded and calls for working tables. These can be "real" tables in your schema, or whatever flavour of intermediate resultsets you are comfortable with.
Start by getting the first vote for each user:
SELECT user, min(id) FROM table GROUP BY user
Put this in a working table; let's call it FirstVote. Next we can get each user's second vote, if any:
SELECT user, min(id) FROM table WHERE id not in (select id from FirstVote) GROUP BY user
Let's call the result of this SecondVote. UNION FirstVote to SecondVote, join this to the original table and group by vote. There's your answer!
SELECT
vote,
COUNT(*)
FROM table
INNER JOIN
(
SELECT id FROM FirstVote
UNION ALL
SELECT id FROM SecondVote
) as BothVotes
ON BothVotes.id = table.id
GROUP BY vote
Of course it could be structured as a single statement with multiple sub-queries but that would be horrendous to maintain, or read in this forum.
This is a very tricky question for MySQL. Other systems have window functions, which perform a calculation across a set of table rows that are somehow related to the current row.
MySQL lacks this functionality before version 8.0, so one should look for a workaround. Here is the problem description and a couple of suggested solutions: MySQL and window functions.
I also assume that the first 2 votes by a user can be determined by Id: an earlier vote has a smaller Id.
Based on this I would suggest this solution to your problem:
SELECT
    Vote,
    COUNT(*)
FROM
    table1,
    (
        -- first two ids per user, as a comma-separated string
        SELECT
            user, SUBSTRING_INDEX(GROUP_CONCAT(Id ORDER BY Id ASC), ',', 2) AS top_IDs_per_user
        FROM
            table1
        GROUP BY
            user
    ) s_top_IDs_per_User
WHERE
    table1.user = s_top_IDs_per_User.user AND
    FIND_IN_SET(Id, s_top_IDs_per_User.top_IDs_per_user)
GROUP BY Vote
;