MYSQL - GROUP_CONCAT show duplicates only

I have 2 SQL tables.
The first table has a mapping of department-ids to student-ids. (A student may belong to more than one department and a department has many students).
The second table has a mapping of students and their hobbies. (which also has a many to many relationship)
Sample Snapshots of tables:
DESC DEPARTMENT_STUDENTS;
----------------------------
DEPT_ID STUDENT_ID
----------------------------
Physics 123
Mathematics 111
Physics 111
CS 45
CS 56
Mathematics 89
DESC STUDENTS_HOBBIES;
-------------------------------
STUDENT_ID HOBBY
-------------------------------
111 Skiing
111 Singing
111 Browsing
123 Singing
123 Browsing
123 Reading
45 Origami
56 Origami
56 Making Prank Calls
89 Reading
I am looking for a query that will provide a list of common hobbies among all the students per department.
Required output would look like :
-----------------------------------------
DEPT_ID GROUP_CONCAT(...)
-----------------------------------------
Physics (Singing, Browsing)
Mathematics ()
CS (Origami)
I am almost there: after a little fiddling with MySQL, I was able to group ALL the hobbies of a department (not only the common ones). GROUP_CONCAT also provides a way to eliminate duplicates. One approach I could think of is to extract the duplicates programmatically from the output of the query below:
SELECT DEPT_ID, GROUP_CONCAT(OP.HOBBIES) FROM DEPARTMENT_STUDENTS DS
INNER JOIN (SELECT STUDENT_ID, GROUP_CONCAT(HOBBY)AS HOBBIES FROM STUDENTS_HOBBIES
GROUP BY STUDENT_ID) OP
ON OP.STUDENT_ID = DS.STUDENT_ID
GROUP BY DS.DEPT_ID;
Is there a way to preserve duplicates alone? Even in this query, any optimizations are welcome. Thanks!

Here you go:
SELECT DEPT_ID, GROUP_CONCAT(HOBBY) FROM (
SELECT
d.DEPT_ID
, s.HOBBY
FROM
department_students d
INNER JOIN students_hobbies s USING(STUDENT_ID)
GROUP BY d.DEPT_ID, s.HOBBY
HAVING COUNT(DISTINCT s.STUDENT_ID) > 1
)sq
GROUP BY DEPT_ID
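To check the query above against the sample data, here is a minimal sketch that runs it through Python's sqlite3 module. SQLite is only a stand-in for MySQL here (an assumption; both support GROUP_CONCAT and this GROUP BY / HAVING form). The second query, strict_query, is not from the answer: it is a hypothetical variant that keeps only hobbies shared by every student in the department rather than by more than one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department_students (dept_id TEXT, student_id INTEGER);
CREATE TABLE students_hobbies (student_id INTEGER, hobby TEXT);
INSERT INTO department_students VALUES
  ('Physics', 123), ('Mathematics', 111), ('Physics', 111),
  ('CS', 45), ('CS', 56), ('Mathematics', 89);
INSERT INTO students_hobbies VALUES
  (111, 'Skiing'), (111, 'Singing'), (111, 'Browsing'),
  (123, 'Singing'), (123, 'Browsing'), (123, 'Reading'),
  (45, 'Origami'), (56, 'Origami'), (56, 'Making Prank Calls'),
  (89, 'Reading');
""")

# The answer's query: hobbies held by more than one student of a department.
query = """
SELECT dept_id, GROUP_CONCAT(hobby)
FROM (
  SELECT d.dept_id, s.hobby
  FROM department_students d
  JOIN students_hobbies s USING (student_id)
  GROUP BY d.dept_id, s.hobby
  HAVING COUNT(DISTINCT s.student_id) > 1
) sq
GROUP BY dept_id
"""

# Hypothetical stricter variant: hobbies held by ALL students of a department.
strict_query = """
SELECT dept_id, GROUP_CONCAT(hobby)
FROM (
  SELECT d.dept_id, s.hobby
  FROM department_students d
  JOIN students_hobbies s USING (student_id)
  GROUP BY d.dept_id, s.hobby
  HAVING COUNT(DISTINCT s.student_id) =
         (SELECT COUNT(DISTINCT x.student_id)
          FROM department_students x
          WHERE x.dept_id = d.dept_id)
) sq
GROUP BY dept_id
"""

def run(sql):
    # GROUP_CONCAT order is not guaranteed, so compare hobbies as sets.
    return {dept: set(hobbies.split(",")) for dept, hobbies in conn.execute(sql)}

shared = run(query)
shared_by_all = run(strict_query)
print(shared)
```

On the sample data both queries give Singing and Browsing for Physics and Origami for CS. Mathematics has no shared hobby, so it produces no row at all rather than an empty list; a LEFT JOIN from the list of departments would be needed to show it.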

Related

SQL nested query under WHERE

One of my test questions came with the following schemas and asked for the best doctor in terms of:
Best score;
The most times/attempts;
For each medical procedure (by name)
[doctor] table
--------------------------------------
id  first_name  last_name  age
--------------------------------------
1   Phillip     Singleton  50
2   Heidi       Elliott    34
3   Beulah      Townsend   35
4   Gary        Pena       36
5   Doug        Lowe       45
[medical_procedure] table
--------------------------------------
id  doctor_id  name           score
--------------------------------------
1   3          colonoscopy    44
2   1          colonoscopy    37
3   4          ulcer surgery  98
4   2          angiography    79
5   3          angiography    84
6   3          embolization   87
and the list goes on...
The given solution is as follows:
WITH cte AS (
    SELECT
        name,
        first_name,
        last_name,
        COUNT(*) AS procedure_count,
        RANK() OVER (
            PARTITION BY name
            ORDER BY COUNT(*) DESC) AS place
    FROM
        medical_procedure p JOIN doctor d
        ON p.doctor_id = d.id
    WHERE
        score >= (
            SELECT AVG(score)
            FROM medical_procedure pp
            WHERE pp.name = p.name)
    GROUP BY
        name,
        first_name,
        last_name
)
SELECT
    name,
    first_name,
    last_name
FROM cte
WHERE place = 1;
I would really appreciate an explanation of how the WHERE clause with the subquery works:
How it works in general
Why pp.name and p.name must be matched for it to return the correct rows
...
WHERE
score >= (
SELECT AVG(score)
FROM medical_procedure pp
WHERE pp.name = p.name)
...
Thanks a heap!

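The query above joins doctor with medical_procedure and groups by procedure name and doctor, because you need, for each procedure, the doctor with the best scores and the most attempts.
The subquery is correlated: for each outer row p it computes the average score over the rows with the same procedure name (pp.name = p.name), so only attempts that score at or above the average of their own procedure pass the filter. Without that condition the subquery would return the average over all procedures.
Several doctors can be above the average for the same procedure, so RANK() orders them by procedure count within each name, which puts the doctor with the most attempts first, and WHERE place = 1 then picks the top one.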
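The effect of matching pp.name to p.name can be seen on the six sample rows of medical_procedure. This is a small sketch using Python's sqlite3 module as a stand-in database (an assumption, the thread does not name an engine); it compares the correlated filter with a hypothetical uncorrelated one that uses a single overall average.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE medical_procedure (id INTEGER, doctor_id INTEGER, name TEXT, score INTEGER);
INSERT INTO medical_procedure VALUES
  (1, 3, 'colonoscopy', 44), (2, 1, 'colonoscopy', 37),
  (3, 4, 'ulcer surgery', 98), (4, 2, 'angiography', 79),
  (5, 3, 'angiography', 84), (6, 3, 'embolization', 87);
""")

# Correlated: the average is recomputed per outer row, for that row's procedure only.
correlated = """
SELECT p.id FROM medical_procedure p
WHERE p.score >= (SELECT AVG(pp.score)
                  FROM medical_procedure pp
                  WHERE pp.name = p.name)
ORDER BY p.id
"""

# Uncorrelated: one overall average (71.5 on this data) applied to every row.
uncorrelated = """
SELECT p.id FROM medical_procedure p
WHERE p.score >= (SELECT AVG(pp.score) FROM medical_procedure pp)
ORDER BY p.id
"""

kept_ids = [row[0] for row in conn.execute(correlated)]
global_ids = [row[0] for row in conn.execute(uncorrelated)]
print(kept_ids)    # [1, 3, 5, 6]
print(global_ids)  # [3, 4, 5, 6]
```

With the correlated subquery, colonoscopy row 1 (score 44) is kept because the colonoscopy average is 40.5. Against the overall average it would be dropped, and angiography row 4 (score 79, below the angiography average of 81.5) would wrongly be kept.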

SQL group by function to categorize only the most recent data

I have a table called title that stores every title held by each employee. It looks like this:
-----------------------------------------
emp_no  title            start_date
-----------------------------------------
101     Engineer         2019-01-01
101     Senior Engineer  2020-02-01
102     Engineer         2019-01-11
102     Senior Engineer  2020-02-11
103     Engineer         2019-01-21
104     Engineer         2019-01-31
105     Associate        2019-01-01
106     Associate        2019-01-11
106     Manager          2020-02-11
107     Associate        2019-01-21
107     Manager          2020-02-21
108     Associate        2019-01-31
Notice that each employee can have more than one title. For example, employee 101 was an Engineer on 1 January 2019 but was promoted to Senior Engineer about a year later.
Now let's say I want to count the employees in each position. I tried using the count function with group by (grouping by title), but the problem is that the query also counts the past positions of every employee.
To be exact, I only want to include the most recent role that each employee holds. So in this case, the result I am expecting is
Engineer: 2 employees (because the other 2 have been promoted to Senior Engineer),
Senior Engineer: 2 employees,
Associate: 2 employees (because the other 2 have been promoted to Manager),
Manager: 2 employees
Is there a way to achieve that?
NOTE: this table format comes from an online SQL course I'm taking, so I did not design the table. The original table also contains tens of thousands of rows.

You can use not exists as follows:
select title, count(*) as Count
from your_table t
where not exists
(select 1 from your_table tt
where tt.emp_no = t.emp_no and tt.start_date> t.start_date)
group by title
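As a quick check of the NOT EXISTS approach, the sketch below loads the sample rows into an in-memory SQLite database through Python's sqlite3 module (SQLite is an assumption, used only because it is easy to run) and counts each employee's latest title. The table is named title as in the question, and ISO-formatted dates are stored as text so that they compare correctly as strings.

```python
import sqlite3

rows = [
    (101, "Engineer", "2019-01-01"), (101, "Senior Engineer", "2020-02-01"),
    (102, "Engineer", "2019-01-11"), (102, "Senior Engineer", "2020-02-11"),
    (103, "Engineer", "2019-01-21"), (104, "Engineer", "2019-01-31"),
    (105, "Associate", "2019-01-01"),
    (106, "Associate", "2019-01-11"), (106, "Manager", "2020-02-11"),
    (107, "Associate", "2019-01-21"), (107, "Manager", "2020-02-21"),
    (108, "Associate", "2019-01-31"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE title (emp_no INTEGER, title TEXT, start_date TEXT)")
conn.executemany("INSERT INTO title VALUES (?, ?, ?)", rows)

# Keep a row only if the same employee has no row with a later start_date.
query = """
SELECT t.title, COUNT(*) AS cnt
FROM title t
WHERE NOT EXISTS (
    SELECT 1 FROM title tt
    WHERE tt.emp_no = t.emp_no AND tt.start_date > t.start_date)
GROUP BY t.title
"""
latest_counts = dict(conn.execute(query).fetchall())
print(latest_counts)
```

Each of the four titles comes out with a count of 2, which matches the result the question asks for.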

select title,COUNT(*) numberOfEmp from
(
select distinct emp_no
,(select top 1 title from [dbo].[Tbl_title] a where a.emp_no=m.emp_no
order by [start_date] desc
) title
from [dbo].[Tbl_title] m
) mTable
group by title

I am going to recommend a correlated subquery, but for a very particular reason:
select title, count(*)
from t
where t.start_date = (select max(t2.start_date)
                      from t t2
                      where t2.emp_no = t.emp_no
                     )
group by title;
The particular reason for suggesting this is that it is easy to modify for the number of employees "as of" a particular date. For instance, if you want the number of employees as of 2019-01-01, you change the where inside the subquery to:
where t2.emp_no = t.emp_no and t2.start_date <= '2019-01-01'
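The "as of" variant can be sketched as a small parameterised helper. This uses Python's sqlite3 module with the sample rows; SQLite and the function name counts_as_of are assumptions made for illustration, and the table is called title as in the question rather than t.

```python
import sqlite3

rows = [
    (101, "Engineer", "2019-01-01"), (101, "Senior Engineer", "2020-02-01"),
    (102, "Engineer", "2019-01-11"), (102, "Senior Engineer", "2020-02-11"),
    (103, "Engineer", "2019-01-21"), (104, "Engineer", "2019-01-31"),
    (105, "Associate", "2019-01-01"),
    (106, "Associate", "2019-01-11"), (106, "Manager", "2020-02-11"),
    (107, "Associate", "2019-01-21"), (107, "Manager", "2020-02-21"),
    (108, "Associate", "2019-01-31"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE title (emp_no INTEGER, title TEXT, start_date TEXT)")
conn.executemany("INSERT INTO title VALUES (?, ?, ?)", rows)

def counts_as_of(conn, as_of):
    """Count employees per title, using each employee's latest title on or before as_of."""
    query = """
    SELECT t.title, COUNT(*)
    FROM title t
    WHERE t.start_date = (SELECT MAX(t2.start_date)
                          FROM title t2
                          WHERE t2.emp_no = t.emp_no
                            AND t2.start_date <= ?)
    GROUP BY t.title
    """
    return dict(conn.execute(query, (as_of,)).fetchall())

print(counts_as_of(conn, "2019-01-01"))
print(counts_as_of(conn, "2020-02-11"))
```

Employees with no title on or before the given date drop out, because the subquery returns NULL for them and the equality test fails. On 2019-01-01 only employees 101 and 105 have started, so the result is one Engineer and one Associate.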

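You can find each employee's latest start date in a subquery and join back to it, so that only the current title is counted. Query as follows:
select t.title, count(*) as Count
from your_table t
join (select emp_no, max(start_date) as start_date
      from your_table
      group by emp_no) subset
  on subset.emp_no = t.emp_no and subset.start_date = t.start_date
group by t.title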

Second max based on Category in SQL

I am trying to find the second max based on two different categories. I could use an analytical function or plain query logic to get this; so far I have been trying to do it with plain logic.
My question: I am trying to fetch, for each country, the exam taken by the second-highest number of unique students.
T1
Exam_ID Student_ID
123 553
123 457
345 563
567 765
678 543
678 543
987 123
678 123
T2
Exam_ID Exam_name Country_name
123 SAT USA
345 CAT USA
567 GRE USA
678 TOEFL UK
987 IELTS UK
222 CBAP UK
This is what I tried so far,
select count(distinct T1.Student_ID) count_user,
t2.Country_name,t2.Exam_name
from T1
join T2
on T1.Exam_ID = T2.Exam_ID
group by t2.Exam_name, t2.Country_name
By doing this I am able to get the unique student count based on each exam and country.
How can I get the second max no of exams taken by unique students based on the country?

I'm not sure I fully understand what you mean by your question.
Could you post the expected result along with what you are getting now?
In the meantime, I'm guessing that exam_id 678 in the UK (3 rows, 2 distinct students) is the top result and 987 in the UK is the "second top result"?
If so, Row_number () might work for you. Bear in mind that row_number is usually an expensive operation in relational databases as it involves a redistribution and a sort. A similar function Rank () may be better for you depending upon how you want to handle ties. The syntax is similar, you could try both.
Try modifying your query as follows:
select count(distinct T1.student_id) count_user, Country_name, Exam_name,
row_number () over (partition by country_name order by count_user desc) as row_num
...
If that gives you the numbering you want, you can then restrict the output using the qualify clause i.e.
qualify row_num = 2
You may need to wrap the whole thing in a derived table as follows:
select count_user, country_name, exam_name,
       row_number () over (partition by country_name order by count_user desc) as row_num
from (
    select count(distinct T1.Student_ID) count_user,
           t2.Country_name, t2.Exam_name
    from T1 join T2
      on T1.Exam_ID = T2.Exam_ID
    group by t2.Exam_name, t2.Country_name
) detail_recs
qualify row_num = 2
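The qualify clause is not available in every database. Where it is missing, the same filter can be written by wrapping the ranked rows in one more derived table and using an ordinary WHERE. The sketch below assumes SQLite 3.25 or later (for window functions) via Python's sqlite3 module; the helper name second_place and the exam_name tie-breaker are illustrative choices, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (exam_id INTEGER, student_id INTEGER);
CREATE TABLE t2 (exam_id INTEGER, exam_name TEXT, country_name TEXT);
INSERT INTO t1 VALUES (123, 553), (123, 457), (345, 563), (567, 765),
                      (678, 543), (678, 543), (987, 123), (678, 123);
INSERT INTO t2 VALUES (123, 'SAT', 'USA'), (345, 'CAT', 'USA'), (567, 'GRE', 'USA'),
                      (678, 'TOEFL', 'UK'), (987, 'IELTS', 'UK'), (222, 'CBAP', 'UK');
""")

def second_place(conn, window_expr):
    """Return (country, exam, unique students) rows whose window value equals 2."""
    sql = f"""
    SELECT country_name, exam_name, count_user
    FROM (
        SELECT country_name, exam_name, count_user,
               {window_expr} AS place
        FROM (
            SELECT COUNT(DISTINCT t1.student_id) AS count_user,
                   t2.country_name, t2.exam_name
            FROM t1 JOIN t2 ON t1.exam_id = t2.exam_id
            GROUP BY t2.exam_name, t2.country_name
        ) detail_recs
    ) ranked
    WHERE place = 2
    ORDER BY country_name, exam_name
    """
    return conn.execute(sql).fetchall()

# ROW_NUMBER returns exactly one row per country; exam_name breaks ties deterministically.
by_row_number = second_place(
    conn,
    "ROW_NUMBER() OVER (PARTITION BY country_name ORDER BY count_user DESC, exam_name)")

# DENSE_RANK returns every exam tied for the second-highest count.
by_dense_rank = second_place(
    conn,
    "DENSE_RANK() OVER (PARTITION BY country_name ORDER BY count_user DESC)")

print(by_row_number)
print(by_dense_rank)
```

In the USA, CAT and GRE both have one unique student, so ROW_NUMBER has to pick one of them (CAT, given the tie-breaker) while DENSE_RANK returns both. CBAP has no rows in T1, so the inner join drops it.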

complex query in mysql

I have three tables in MySQL like this:
triz_sti
stu_id name
-----------------
1 x1
2 x2
triz_sub
sub_id sub_name
------------------
1 english
2 maths
3 science
triz
stu_id sub_id marks
-------------------------
1 1 23
1 2 56
1 3 83
2 1 78
2 2 23
2 3 50
I want a result that lists every subject with the highest mark in that subject and the name of the student who scored it:
max_marks sub_name student_name
--------------------------------------
78 english x2
56 maths x1
83 science x1
Please help me get this output. I have tried, but I am not getting the desired result.

How about something like this?
SELECT
t.stu_id, t.sub_id, t.marks
FROM
triz t
JOIN (SELECT sub_id, MAX(marks) max_mark FROM triz GROUP BY sub_id) a ON (a.sub_id = t.sub_id AND a.max_mark = t.marks)
Of course you'll need to join it with lookup tables for names.
Have to say, it's early here so I might have missed something.
BR
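Joining the query above to the two lookup tables might look like the sketch below. It runs against the sample data in an in-memory SQLite database through Python's sqlite3 module (SQLite is an assumption standing in for MySQL; the SQL itself uses only plain joins and MAX).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE triz_sti (stu_id INTEGER, name TEXT);
CREATE TABLE triz_sub (sub_id INTEGER, sub_name TEXT);
CREATE TABLE triz (stu_id INTEGER, sub_id INTEGER, marks INTEGER);
INSERT INTO triz_sti VALUES (1, 'x1'), (2, 'x2');
INSERT INTO triz_sub VALUES (1, 'english'), (2, 'maths'), (3, 'science');
INSERT INTO triz VALUES (1, 1, 23), (1, 2, 56), (1, 3, 83),
                        (2, 1, 78), (2, 2, 23), (2, 3, 50);
""")

# Derived table "a" holds the top mark per subject; joining back to triz
# finds which student scored it, and the lookup tables supply the names.
query = """
SELECT t.marks AS max_marks, sb.sub_name, st.name AS student_name
FROM triz t
JOIN (SELECT sub_id, MAX(marks) AS max_mark
      FROM triz
      GROUP BY sub_id) a
  ON a.sub_id = t.sub_id AND a.max_mark = t.marks
JOIN triz_sti st ON st.stu_id = t.stu_id
JOIN triz_sub sb ON sb.sub_id = t.sub_id
ORDER BY sb.sub_id
"""
top_marks = conn.execute(query).fetchall()
for row in top_marks:
    print(row)
```

If two students tie for the top mark in a subject, this form returns one row for each of them.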

The general, simplified syntax in this case is
SELECT stuff FROM joined tables ORDER BY whatever
The easiest is the ORDER BY: you want to sort descending by marks, so you ORDER BY marks DESC.
Where do the data come from? From triz, joined to the others. So
triz JOIN triz_sti USING (stu_id) JOIN triz_sub USING (sub_id)
And you want to display the marks.
So you get
SELECT marks, sub_name, name AS student_name
FROM triz JOIN triz_sti USING (stu_id) JOIN triz_sub USING (sub_id)
ORDER BY marks DESC
The rest I leave to you. :-)

how to use union resolve this full outer join problem under mySQL

Here is the table
stuid stuname subject grade
1 alex algo 99
1 alex dastr 100
2 bob algo 90
2 bob dastr 95
3 casy algo 100
4 Daisy dastr 100
case1: assuming there are only two subjects in the table
Following is the expected output
stuname algo dastr
alex 99 100
bob 90 95
casy 100 0
Daisy 0 100
I think the following would be a workable query:
select g1.stuname,
COALESCE(g1.grade,0) as algo,
COALESCE(g2.grade,0) as dastr
from grades g1
full outer join grades g2 on g1.stuid = g2.stuid
where g1.subject = 'algo' and g2.subject = 'dastr';
However, MySQL doesn't support full outer join. Is there any other way to solve the problem?
Also, case 2:
assuming there is an unknown number of subjects in the table,
the expected output would be
stuname subj1 subj2 subj3 ... subjn
I know I could use a stored procedure to solve it, but is there any other way to build the columns in MySQL?

Your queries would work better if you re-structured your tables. You are attempting to store too much information in one table. Here is a proposed structure:
Students
student_id student_name
1 Alex
2 Bob
3 Casy
4 Daisy
Subjects
subject_id subject_name
1 Algo
2 Dastr
Grades
student_id subject_id grade
1 1 99
1 2 100
2 1 90
2 2 95
3 1 100
4 2 100
In grades, student_id and subject_id would be a composite key, meaning a unique combination of the two becomes the unique identifier (student 1, subject 1 is unique from student 1, subject 2)
To return the data based on your comment, try:
SELECT a.student_name, b.subject_name, c.grade
FROM students a, subjects b, grades c
WHERE a.student_id = c.student_id
AND b.subject_id = c.subject_id
ORDER BY a.student_id

Have you tried something along the line of:
SELECT a.stuid as sidA, a.grade as grA, b.grade as grB
FROM grades a JOIN grades b ON (a.stuname = b.stuname)
But as D.N. suggested, it may be worth restructuring your tables

From your existing data...
select
stuid,
max( stuName ) stuName,
max( if( subject = "algo", grade, 000 )) as Algo,
max( if( subject = "dastr", grade, 000 )) as Dastr
from
Grades
group by
stuid
order by
stuName
Grouping by the unique ID keeps students apart even if several of them share the same "StuName", which is why I've included the ID column in the final query.
That said, restructuring the data as suggested by @D.N. would be a cleaner approach.
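For case 2, where the subjects are not known in advance, one option is to generate the conditional-aggregation columns in application code. The sketch below is a minimal illustration using Python's sqlite3 module; SQLite stands in for MySQL, pivot_grades is a hypothetical helper name, and CASE is used in place of MySQL's IF() so the SQL stays portable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE grades (stuid INTEGER, stuname TEXT, subject TEXT, grade INTEGER);
INSERT INTO grades VALUES
  (1, 'alex', 'algo', 99), (1, 'alex', 'dastr', 100),
  (2, 'bob', 'algo', 90), (2, 'bob', 'dastr', 95),
  (3, 'casy', 'algo', 100), (4, 'Daisy', 'dastr', 100);
""")

def pivot_grades(conn):
    """Build one MAX(CASE ...) column per distinct subject and run the pivot."""
    subjects = [r[0] for r in conn.execute(
        "SELECT DISTINCT subject FROM grades ORDER BY subject")]
    # Subject values are bound as parameters; only the column alias is
    # interpolated, with double quotes escaped so it stays a valid identifier.
    columns = ", ".join(
        'MAX(CASE WHEN subject = ? THEN grade ELSE 0 END) AS "{}"'.format(
            s.replace('"', '""'))
        for s in subjects)
    sql = ("SELECT MAX(stuname) AS stuname, " + columns +
           " FROM grades GROUP BY stuid ORDER BY stuid")
    cur = conn.execute(sql, subjects)
    header = [d[0] for d in cur.description]
    return header, cur.fetchall()

header, rows = pivot_grades(conn)
print(header)
for row in rows:
    print(row)
```

The same idea can be kept inside MySQL with a prepared statement built from GROUP_CONCAT, but that is not shown here.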
