I have to find pairs of students who take exactly the same classes, from a table that has studentID and courseID columns.
studentID | courseID
1 1
1 2
1 3
2 1
3 1
3 2
3 3
The query should return (1, 3).
The result should also not contain duplicate rows such as both (1, 3) and (3, 1).
Given sample data:
CREATE TABLE student_course (
student_id integer,
course_id integer,
PRIMARY KEY (student_id, course_id)
);
INSERT INTO student_course (student_id, course_id)
VALUES (1, 1), (1, 2), (1, 3), (2, 1), (3, 1), (3, 2), (3, 3) ;
Use array aggregation
One option is to use a CTE to join on the ordered lists of courses each student is taking:
WITH student_coursearray(student_id, courses) AS (
SELECT student_id, array_agg(course_id ORDER BY course_id)
FROM student_course
GROUP BY student_id
)
SELECT a.student_id, b.student_id
FROM student_coursearray a INNER JOIN student_coursearray b ON (a.courses = b.courses)
WHERE a.student_id < b.student_id;
array_agg is actually part of the SQL standard, as is the WITH common-table expression syntax. MySQL has no array_agg, and versions before 8.0 have no CTEs either, so you'll have to express this a different way if you want to support MySQL.
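For engines without array_agg, one way to express the same comparison is to count shared courses: two students take exactly the same classes when both have the same number of courses and they share that many. The sketch below is an illustration, not part of the original answer. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in engine and reuses the sample data above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student_course (
    student_id INTEGER,
    course_id  INTEGER,
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO student_course VALUES
    (1, 1), (1, 2), (1, 3), (2, 1), (3, 1), (3, 2), (3, 3);
""")

# cnt holds the number of courses per student. A pair qualifies when both
# students have the same course count and share exactly that many courses.
query = """
WITH cnt AS (
    SELECT student_id, COUNT(*) AS n
    FROM student_course
    GROUP BY student_id
)
SELECT a.student_id, b.student_id
FROM student_course a
JOIN student_course b
  ON a.course_id = b.course_id AND a.student_id < b.student_id
JOIN cnt ca ON ca.student_id = a.student_id
JOIN cnt cb ON cb.student_id = b.student_id
WHERE ca.n = cb.n
GROUP BY a.student_id, b.student_id, ca.n
HAVING COUNT(*) = ca.n
ORDER BY 1, 2
"""
pairs = conn.execute(query).fetchall()
print(pairs)  # [(1, 3)]
```

The a.student_id < b.student_id condition keeps each pair once, so (3, 1) never appears next to (1, 3).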
Find missing course pairings per-student
Another way to think about this would be "for every student pairing, find out if one is taking a class the other is not". This would lend itself to a FULL OUTER JOIN, but it's pretty awkward to express. You have to determine the pairings of student IDs of interest, then for each pairing do a full outer join across the set of classes each takes. If there are any null rows then one took a class the other didn't, so you can use that with a NOT EXISTS filter to exclude such pairings. That gives you this monster:
WITH student_id_pairs(left_student, right_student) AS (
SELECT DISTINCT a.student_id, b.student_id
FROM student_course a
INNER JOIN student_course b ON (a.student_id < b.student_id)
)
SELECT left_student, right_student
FROM student_id_pairs
WHERE NOT EXISTS (
SELECT 1
FROM (SELECT course_id FROM student_course WHERE student_id = left_student) a
FULL OUTER JOIN (SELECT course_id FROM student_course b WHERE student_id = right_student) b
ON (a.course_id = b.course_id)
WHERE a.course_id IS NULL or b.course_id IS NULL
);
The CTE is optional and may be replaced by a CREATE TEMPORARY TABLE AS SELECT ... or whatever if your DB doesn't support CTEs.
Which to use?
I'm very confident that the array approach will perform better in all cases, particularly because for a really large data set you can take the WITH expression, create a temporary table from the query instead, add an index on (courses, student_id) to it and do crazy-fast equality searching that'll well and truly pay off the cost of the index creation time. You can't do that with the subquery joins approach.
select courses, group_concat(studentID) as students from
(select studentID,
group_concat(courseID order by courseID) as courses
from Table1 group by studentID) abc
-- keep only course lists shared by more than one student
group by courses having count(*) > 1;
Test case:
I created a somewhat realistic test case:
CREATE TEMP TABLE student_course (
student_id integer
,course_id integer
,PRIMARY KEY (student_id, course_id)
);
INSERT INTO student_course
SELECT *
FROM (VALUES (1, 1), (1, 2), (1, 3), (2, 1), (3, 1), (3, 2), (3, 3)) v
-- to include some non-random values in test
UNION ALL
SELECT DISTINCT student_id, normal_rand((random() * 30)::int, 1000, 35)::int
FROM generate_series(4, 5000) AS student_id;
DELETE FROM student_course WHERE random() > 0.9; -- create some dead tuples
ANALYZE student_course; -- needed for temp table
Note the use of normal_rand() to populate the dummy table with a normal distribution of values. It ships with the tablefunc module, and since I am going to use that further down anyway ...
Also note the numbers 5000, 30 and 35 in the INSERT: these are the values I am going to manipulate for the benchmark to simulate various test cases.
Plain SQL
The question is rather basic and unclear. Find the first two students with matching courses? Or find all? Find couples of them or groups of students sharing the same courses?
Craig answers to:
Find all couples sharing the same courses.
C1 - Craig's first query
Plain SQL with a CTE and grouping by arrays, slightly formatted:
WITH student_coursearray(student_id, courses) AS (
SELECT student_id, array_agg(course_id ORDER BY course_id)
FROM student_course
GROUP BY student_id
)
SELECT a.student_id, b.student_id
FROM student_coursearray a
JOIN student_coursearray b ON (a.courses = b.courses)
WHERE a.student_id < b.student_id
ORDER BY a.student_id, b.student_id;
The second query in Craig's answer dropped out of the race right away. With more than just a few rows, performance quickly deteriorates badly. The CROSS JOIN is poison.
E1 - Improved version
There is one major weakness: ORDER BY per aggregate is a bad performer, so I rewrote it with ORDER BY in a subquery:
WITH cte AS (
SELECT student_id, array_agg(course_id) AS courses
FROM (SELECT student_id, course_id FROM student_course ORDER BY 1, 2) sub
GROUP BY student_id
)
SELECT a.student_id, b.student_id
FROM cte a
JOIN cte b USING (courses)
WHERE a.student_id < b.student_id
ORDER BY 1,2;
E2 - Alternative interpretation of question
I think the generally more useful case is:
Find all students sharing the same courses.
So I return arrays of students with matching courses.
WITH s AS (
SELECT student_id, array_agg(course_id) AS courses
FROM (SELECT student_id, course_id FROM student_course ORDER BY 1, 2) sub
GROUP BY student_id
)
SELECT array_agg(student_id)
FROM s
GROUP BY courses
HAVING count(*) > 1
ORDER BY array_agg(student_id);
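To make the "groups instead of pairs" interpretation concrete, here is a minimal sketch that does the same grouping on the client side in Python rather than with array_agg. It is a swapped-in technique for illustration only: it assumes SQLite via the sqlite3 module, and student 4 is a hypothetical extra row added so that a second group shows up.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student_course ("
    "student_id INTEGER, course_id INTEGER, "
    "PRIMARY KEY (student_id, course_id))"
)
conn.executemany(
    "INSERT INTO student_course VALUES (?, ?)",
    # sample data from the question plus a hypothetical student 4
    [(1, 1), (1, 2), (1, 3), (2, 1), (3, 1), (3, 2), (3, 3), (4, 1)],
)

# Step 1: collect the set of courses for each student.
courses_by_student = defaultdict(set)
rows = conn.execute("SELECT student_id, course_id FROM student_course")
for student_id, course_id in rows:
    courses_by_student[student_id].add(course_id)

# Step 2: group students by their (order-independent) course set.
students_by_courses = defaultdict(list)
for student_id in sorted(courses_by_student):
    key = frozenset(courses_by_student[student_id])
    students_by_courses[key].append(student_id)

# Step 3: keep only course sets shared by more than one student.
groups = sorted(g for g in students_by_courses.values() if len(g) > 1)
print(groups)  # [[1, 3], [2, 4]]
```

Using a frozenset as the grouping key plays the role of the sorted array in the SQL version: two students land in the same group only when their course sets are identical.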
F1 - Dynamic PL/pgSQL function
To make this generic and fast I wrapped it into a plpgsql function with dynamic SQL:
CREATE OR REPLACE FUNCTION f_same_set(_tbl regclass, _id text, _match_id text)
RETURNS SETOF int[] AS
$func$
BEGIN
RETURN QUERY EXECUTE format(
$f$
WITH s AS (
SELECT %1$I AS id, array_agg(%2$I) AS courses
FROM (SELECT %1$I, %2$I FROM %3$s ORDER BY 1, 2) s
GROUP BY 1
)
SELECT array_agg(id)
FROM s
GROUP BY courses
HAVING count(*) > 1
ORDER BY array_agg(id)
$f$
,_id, _match_id, _tbl
);
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM f_same_set('student_course', 'student_id', 'course_id');
Works for any table with numeric columns. It's trivial to extend for other data types, too.
crosstab()
For a relatively small number of courses (and arbitrarily big number of students) crosstab() provided by the additional tablefunc module is another option in PostgreSQL. More general info here:
PostgreSQL Crosstab Query
Simple case
A simple case for the simple example in the question, much like explained in the linked answer:
SELECT array_agg(student_id)
FROM crosstab('
SELECT student_id, course_id, TRUE
FROM student_course
ORDER BY 1'
,'VALUES (1),(2),(3)'
)
AS t(student_id int, c1 bool, c2 bool, c3 bool)
GROUP BY c1, c2, c3
HAVING count(*) > 1;
F2 - Dynamic crosstab function
For the simple case, the crosstab variant was faster, so I built a plpgsql function with dynamic SQL and included it in the test. Functionally identical to F1.
CREATE OR REPLACE FUNCTION f_same_set_x(_tbl regclass, _id text, _match_id text)
RETURNS SETOF int[] AS
$func$
DECLARE
_ids int[]; -- for array of match_ids (course_id in example)
BEGIN
-- Get list of match_ids
EXECUTE format(
'SELECT array_agg(DISTINCT %1$I ORDER BY %1$I) FROM %2$s',_match_id, _tbl)
INTO _ids;
-- Main query
RETURN QUERY EXECUTE format(
$f$
SELECT array_agg(%1$I)
FROM crosstab('SELECT %1$I, %2$I, TRUE FROM %3$s ORDER BY 1'
,'VALUES (%4$s)')
AS t(%1$I int, c%5$s bool)
GROUP BY c%6$s
HAVING count(*) > 1
ORDER BY array_agg(%1$I)
$f$
,_id
,_match_id
,_tbl
,array_to_string(_ids, '),(') -- values
,array_to_string(_ids, ' bool,c') -- column def list
,array_to_string(_ids, ',c') -- names
);
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM f_same_set_x('student_course', 'student_id', 'course_id');
Benchmark
I tested on my small PostgreSQL test server.
PostgreSQL 9.1.9 on Debian Linux on a roughly 6-year-old AMD Opteron server. I ran 5 test sets with the above settings and each of the presented queries. Best of 5 with EXPLAIN ANALYZE.
I used these values for the numbers (students, max. courses, standard deviation) in the above test case to populate the table:
nr. of students / max. nr. of courses / standard deviation (results in more distinct course_ids)
1. 1000 / 30 / 35
2. 5000 / 30 / 50
3. 10000 / 30 / 100
4. 10000 / 10 / 10
5. 10000 / 5 / 5
C1
1. Total runtime: 57 ms
2. Total runtime: 315 ms
3. Total runtime: 663 ms
4. Total runtime: 543 ms
5. Total runtime: 2345 ms (!) - deteriorates with many pairs
E1
1. Total runtime: 46 ms
2. Total runtime: 251 ms
3. Total runtime: 529 ms
4. Total runtime: 338 ms
5. Total runtime: 734 ms
E2
1. Total runtime: 45 ms
2. Total runtime: 245 ms
3. Total runtime: 515 ms
4. Total runtime: 218 ms
5. Total runtime: 143 ms
F1 victor
1. Total runtime: 14 ms
2. Total runtime: 77 ms
3. Total runtime: 166 ms
4. Total runtime: 80 ms
5. Total runtime: 54 ms
F2
1. Total runtime: 62 ms
2. Total runtime: 336 ms
3. Total runtime: 1053 ms (!) crosstab() deteriorates with many distinct values
4. Total runtime: 195 ms
5. Total runtime: 105 ms (!) but performs well with fewer distinct values
The PL/pgSQL function with dynamic SQL, sorting rows in a subquery, is the clear victor.
Naive relational division implementation, with CTE:
WITH pairs AS (
SELECT DISTINCT a.student_id AS aaa
, b.student_id AS bbb
FROM student_course a
JOIN student_course b ON a.course_id = b.course_id
)
SELECT *
FROM pairs p
WHERE p.aaa < p.bbb
AND NOT EXISTS (
SELECT * FROM student_course nx1
WHERE nx1.student_id = p.aaa
AND NOT EXISTS (
SELECT * FROM student_course nx2
WHERE nx2.student_id = p.bbb
AND nx2.course_id = nx1.course_id
)
)
AND NOT EXISTS (
SELECT * FROM student_course nx1
WHERE nx1.student_id = p.bbb
AND NOT EXISTS (
SELECT * FROM student_course nx2
WHERE nx2.student_id = p.aaa
AND nx2.course_id = nx1.course_id
)
)
;
The same, without CTEs:
SELECT *
FROM (
SELECT DISTINCT a.student_id AS aaa
, b.student_id AS bbb
FROM student_course a
JOIN student_course b ON a.course_id = b.course_id
) p
WHERE p.aaa < p.bbb
AND NOT EXISTS (
SELECT * FROM student_course nx1
WHERE nx1.student_id = p.aaa
AND NOT EXISTS (
SELECT * FROM student_course nx2
WHERE nx2.student_id = p.bbb
AND nx2.course_id = nx1.course_id
)
)
AND NOT EXISTS (
SELECT * FROM student_course nx1
WHERE nx1.student_id = p.bbb
AND NOT EXISTS (
SELECT * FROM student_course nx2
WHERE nx2.student_id = p.aaa
AND nx2.course_id = nx1.course_id
)
)
;
The non-CTE version is faster, since a CTE acts as an optimization fence in PostgreSQL versions before 12.
Process to get this done in MySQL:
Create table student_course_agg
(
student_id int,
courses varchar(150)
);
INSERT INTO student_course_agg
select studentID ,GROUP_CONCAT(courseID ORDER BY courseID) courses
FROM STUDENTS
GROUP BY 1;
SELECT master.student_id m_student_id,child.student_id c_student_id
FROM student_course_agg master
JOIN student_course_agg child
ON master.student_id<child.student_id and master.courses=child.courses;
Direct query.
SELECT master.student_id m_student_id,child.student_id c_student_id
FROM (select studentID ,GROUP_CONCAT(courseID ORDER BY courseID) courses
FROM STUDENTS
GROUP BY 1) master
JOIN (select studentID ,GROUP_CONCAT(courseID ORDER BY courseID) courses
FROM STUDENTS
GROUP BY 1) child
ON master.studentID <child.studentID and master.courses=child.courses;
Write a SQL statement that generates the list of customers whose minutesStreamed is consistently less than the previous minutesStreamed. That is, minutesStreamed in the nth order is less than minutesStreamed in the (n-1)th order, and that order in turn is less than the one before it. Put another way, list the customers who watch fewer and fewer minutes each time they watch a movie.
I have come up with the following query:
select distinct c1.customer_Id
from Customer c1
join Customer c2
where c1.customer_Id = c2.customer_Id
and c1.purchaseDate > c2.purchaseDate
and c1.minutesStreamed < c2.minutesStreamed;
This query doesn't deal with the (n-1)th and (n-2)th comparison, i.e. the condition "minutesStreamed in the nth order is less than minutesStreamed in the (n-1)th order, and the order before that is also less".
I have attached a link for sqlfiddle, where I have created the table.
Hello Continuous Learner,
The following statement works for the n-1 and n-2 relation:
select distinct c1.customer_Id
from Customer c1
join Customer c2
on c1.customer_Id = c2.customer_Id
join Customer c3
on c1.customer_Id = c3.customer_Id
where c1.purchaseDate < c2.purchaseDate
and c1.minutesStreamed > c2.minutesStreamed
and c2.purchaseDate < c3.purchaseDate
and c2.minutesStreamed > c3.minutesStreamed
However, I don't currently have a general solution that covers an arbitrary number of orders.
Cheers
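The three-way self-join above checks a fixed number of orders. One way to cover any number of orders is to note that a strictly decreasing sequence has no earlier order with minutes less than or equal to a later one, which can be written as a NOT EXISTS over pairs. This is a sketch under assumptions, not the answerer's method: it uses SQLite via Python's sqlite3 module, the sample rows are hypothetical, purchase dates are assumed to be distinct per customer, and customers with a single order are excluded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (
    customer_Id     INTEGER,
    purchaseDate    TEXT,      -- ISO dates compare correctly as text
    minutesStreamed INTEGER
);
-- hypothetical sample rows
INSERT INTO Customer VALUES
    (1, '2020-01-01', 120), (1, '2020-01-05', 90), (1, '2020-01-09', 45),
    (2, '2020-01-02', 60),  (2, '2020-01-06', 80), (2, '2020-01-10', 30),
    (3, '2020-01-03', 50);
""")

# A customer qualifies when there is no earlier order (c1) whose minutes
# are less than or equal to those of a later order (c2).
query = """
SELECT c.customer_Id
FROM (
    SELECT customer_Id
    FROM Customer
    GROUP BY customer_Id
    HAVING COUNT(*) > 1
) c
WHERE NOT EXISTS (
    SELECT 1
    FROM Customer c1
    JOIN Customer c2 ON c2.customer_Id = c1.customer_Id
    WHERE c1.customer_Id = c.customer_Id
      AND c1.purchaseDate < c2.purchaseDate
      AND c1.minutesStreamed <= c2.minutesStreamed
)
ORDER BY c.customer_Id
"""
declining = [row[0] for row in conn.execute(query)]
print(declining)  # [1]
```

Customer 2 drops out because 80 minutes follows 60 minutes, and customer 3 drops out because a single order has nothing to compare against.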
I would use the ROW_NUMBER() function, partitioned by customer id,
and then do a self join on customer id and row number = row number - 1, to bring each new row and its previous row onto the same level.
Like:
create table temp_rank_table as
(
select
customer_Id,
purchaseDate,
minutesStreamed,
ROW_NUMBER() OVER (PARTITION BY customer_Id ORDER BY purchaseDate, minutesStreamed) as cust_row
from Customer
);
-- self join
select customer_Id
from
( select
newval.customer_Id,
sum(case when newval.minutesStreamed < oldval.minutesStreamed then 1 else 0 end) as LessThanPrevCount,
max(newval.cust_row) as totalStreamCount
from temp_rank_table newval
left join temp_rank_table oldval
on newval.customer_id = oldval.customer_id
and newval.cust_row-1 = oldval.cust_row -- cust_row 2 matches to cust_row 1
group by newval.customer_id
)A
where A.LessThanPrevCount = (A.totalStreamCount-1)
-- get customers who always stream lesser than previous
--you can use having clause instead of a subquery too
DECLARE @TBL AS TABLE ( [NO] INT, [CODE] VARCHAR(50), [AREA] VARCHAR(50) )
/* EXAMPLE 1 */
INSERT INTO @TBL([NO],[CODE],[AREA]) VALUES (1,'001','A00')
INSERT INTO @TBL([NO],[CODE],[AREA]) VALUES (2,'001','A00')
INSERT INTO @TBL([NO],[CODE],[AREA]) VALUES (3,'001','B00')
INSERT INTO @TBL([NO],[CODE],[AREA]) VALUES (4,'001','C00')
INSERT INTO @TBL([NO],[CODE],[AREA]) VALUES (5,'001','C00')
INSERT INTO @TBL([NO],[CODE],[AREA]) VALUES (6,'001','A00')
INSERT INTO @TBL([NO],[CODE],[AREA]) VALUES (7,'001','A00')
/* EXAMPLE 2 */
/* ***** USE THIS CODE TO ENTER DATA FROM DIRECT TABLE *****
SELECT ROW_NUMBER() OVER(ORDER BY [FIELD_DATE]) AS [NO]
,[FIELD_CODE] AS [CODE]
,[FIELD_AREA] AS [AREA]
FROM TABLE_A
WHERE CAST([FIELD_DATE] AS DATE) >= CAST('20200307' AS DATE)
ORDER BY [FIELD_DATE],[FIELD_CODE]
*/
SELECT A.NO AS ANO
,A.CODE AS ACODE
,A.AREA AS AAREA
,B.NO AS BNO
,B.CODE AS BCODE
,B.AREA AS BAREA
,CASE WHEN A.AREA=B.AREA THEN 'EQUAL' ELSE 'NOT EQUAL' END AS [COMPARE AREA]
FROM @TBL A
LEFT JOIN @TBL B ON A.NO=B.NO+1
Considering I have the following two sets of rows (same type) in a WHERE clause:
A B
1 1
2 2
3 4
I need to find what percentage of the values in A are also in B.
For example, for the table above it would be 66%, since 2 out of 3 numbers in A are in B.
Another example:
A B
1 1
2 2
3 4
  5
  3
This would give 100%, since all of the numbers in A are in B.
Here is what I tried myself: (Doesn't work on all test cases..)
DROP PROCEDURE IF EXISTS getProductsByDate;
DELIMITER //
CREATE PROCEDURE getProductsByDate (IN d_given date)
BEGIN
SELECT
Product,
COUNT(*) AS 'total Number',
(SELECT
(SELECT COUNT(DISTINCT Part) FROM products WHERE Product=B.Product) - COUNT(*)
FROM
products AS b2
WHERE
b2.SOP < B.SOP AND b2.Part != B.Part) AS 'New Parts',
CONCAT(round((SELECT
(SELECT COUNT(DISTINCT Part) FROM products WHERE Product=B.Product) - COUNT(*)
FROM
products AS b2
WHERE
b2.SOP < B.SOP AND b2.Part != B.Part)/count(DISTINCT part)*100, 0), '%') as 'Share New'
FROM
products AS B
WHERE
b.SOP < d_given
GROUP BY Product;
END//
DELIMITER ;
CALL getProductsByDate (date("2018-01-01"));
Thanks.
Naming your tables TA and TB respectively, you could try something like this (tested on MSSQL and MySQL so far):
SELECT ROUND(SUM(PERC) ,4)AS PERC_TOT
FROM (
SELECT DISTINCT TA.ID , 1.00/ (SELECT COUNT(DISTINCT ID) FROM TA) AS PERC
FROM TA
WHERE EXISTS ( SELECT DISTINCT ID FROM TB WHERE TB.ID=TA.ID)
) C;
Output with your first sample data set:
PERC_TOT
0,6667
Output with your second sample data set:
PERC_TOT
1,0000
Update: I wrote the original for two tables, as that was how I was thinking about the solution. This version is for one single table and is almost the same as the former query (I used ID1 for column A and ID2 for column B):
SELECT ROUND(SUM(PERC) ,4)AS PERC_TOT
FROM (
SELECT DISTINCT TA.ID1 , 1.00/ (SELECT COUNT(DISTINCT ID1) FROM TA) AS PERC
FROM TA
WHERE EXISTS ( SELECT DISTINCT ID2 FROM TA AS TB WHERE TB.ID2=TA.ID1)
) C;
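As a quick check of the idea, the following sketch computes the same share with a single COUNT(DISTINCT ...) ratio and runs it against both sample data sets from the question. It assumes SQLite via Python's sqlite3 module and the two-table naming (TA, TB, column ID) used above. The helper name share_of_a_in_b is made up for this illustration.

```python
import sqlite3


def share_of_a_in_b(a_values, b_values):
    """Return the percentage of distinct values in A that also occur in B."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE TA (ID INTEGER)")
    conn.execute("CREATE TABLE TB (ID INTEGER)")
    conn.executemany("INSERT INTO TA VALUES (?)", [(v,) for v in a_values])
    conn.executemany("INSERT INTO TB VALUES (?)", [(v,) for v in b_values])
    # Matched distinct A values divided by all distinct A values;
    # 100.0 forces floating-point division.
    (pct,) = conn.execute("""
        SELECT 100.0 * COUNT(DISTINCT TA.ID)
               / (SELECT COUNT(DISTINCT ID) FROM TA)
        FROM TA
        WHERE EXISTS (SELECT 1 FROM TB WHERE TB.ID = TA.ID)
    """).fetchone()
    conn.close()
    return pct


first = share_of_a_in_b([1, 2, 3], [1, 2, 4])
second = share_of_a_in_b([1, 2, 3], [1, 2, 4, 5, 3])
print(round(first, 2), round(second, 2))  # 66.67 100.0
```

The EXISTS test means extra or repeated values in B do not affect the result, which is why the second data set reaches 100%.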
table a
no name
2001 jon
2002 jonny
2003 mik
2004 mike
2005 mikey
2006 tom
2007 tomo
2008 tommy
table b
code name credits courseCode
A2 JAVA 25 wer
A3 php 25 wer
A4 oracle 25 wer
B2 p.e 50 oth
B3 sport 50 oth
C2 r.e 25 rst
C3 science 25 rst
C4 networks 25 rst
table c
studentNumber grade coursecode
2003 68 A2
2003 72 A3
2003 53 A4
2005 48 A2
2005 52 A3
2002 20 A2
2002 30 A3
2002 50 A4
2008 90 B2
2007 73 B2
2007 63 B3
SELECT a.num, a.Fname,
b.courseName, b.cMAXscore, b.cCode, c.stuGrade
FROM a
INNER JOIN c
ON a.no = c.no
INNER JOIN b
ON c.moduleCode = b.cCode
INNER JOIN b
ON SUM(b.cMAXscore) / (c.stuGrade)
AND b.cMAXscore = c.stug=Grade
GROUP BY a.Fname, b.cMAXscore, b.cCode, b.courseName,c.stuGrade
"calculate and display every student name(a.Fname) and their ID number(a.num) along with their grade (c.grade) versus the coursse name(b.courseName) and the courses max score(b.cMAXscoure). "
I can't figure out how to divide the MAX by the grade. Can someone help?
From the specification, it doesn't look like an aggregate function or a GROUP BY would be necessary. But the specification is ambiguous. There are no table definitions (beyond the unfortunate names and some column references).
Definitions of the tables, along with example data and an example of the desired resultset, would go a long way toward removing the ambiguity.
Based on the join predicates in the OP query, I'd suggest something like this query, as a starting point:
SELECT a.Fname
, a.num
, c.grade
, b.courseName
, b.cMAXscore
FROM a
JOIN c
ON c.no = a.no
JOIN b
ON b.cCode = c.moduleCode
ORDER
BY a.Fname
, a.num
, c.grade
, b.courseName
, b.cMAXscore
It seems like that would return the specified result (based on my interpretation of the vague specification.) If that's insufficient i.e. if that doesn't return the desired resultset, then in what way does the desired result differ from the result from this query?
(For more help with your question, I suggest you setup a sqlfiddle example with tables and example data. That will make it easier for someone to help you.)
FOLLOWUP
Based on the additional information provided in the question (table definitions and example data):
To get the maximum (highest) grade for a given course, you could use a query like this:
SELECT MAX(c.grade)
FROM c
WHERE c.coursecode = 'A2'
To get the highest grade for all courses:
SELECT c.coursecode
, MAX(c.grade) AS max_grade
FROM c
GROUP BY c.coursecode
ORDER BY c.coursecode
To match the highest grade for each course to each student grade, use that previous query as an inline view in another query. Something like this:
SELECT g.studentNumber
, g.grade
, g.coursecode
, h.coursecode
, h.highest_grade
FROM c g
JOIN ( SELECT c.coursecode
, MAX(c.grade) AS highest_grade
FROM c
GROUP BY c.coursecode
) h
ON h.coursecode = g.coursecode
To perform a calculation, you can use an expression in the SELECT list of the outer query.
For example, to divide the value of one column by another, you can use the division operator:
SELECT g.studentNumber AS student_number
, g.grade AS student_grade
, g.coursecode AS student_coursecode
, h.coursecode
, h.highest_grade
, g.grade / h.highest_grade AS `student_grade_divided_by_highest_grade`
FROM c g
JOIN ( SELECT c.coursecode
, MAX(c.grade) AS highest_grade
FROM c
GROUP BY c.coursecode
) h
ON h.coursecode = g.coursecode
If you also want to return the name of the student, you can perform a join operation to (the unfortunately named) table a. Assuming that no is UNIQUE in a:
LEFT
JOIN a
ON a.no = g.studentNumber
And include a.name AS student_name in the SELECT list.
If you also need columns from table b, then join that table as well. Assuming that code is UNIQUE in b:
LEFT
JOIN b
ON b.code = g.coursecode
Then b.credits can be referenced in an expression in the SELECT list.
Beyond that, you need to be a little more explicit about what result should be returned by the query.
If you are after a "total overall grade" for a student, you'd need to specify how that result should be obtained.
Without knowing the table definitions it is very hard to provide a solution to your problem.
Here is my version of what you are trying to look for:
DECLARE @Student TABLE
(StudentID INT IDENTITY,
FirstName VARCHAR(255),
LastName VARCHAR(255)
);
DECLARE @Course TABLE
(CourseID INT IDENTITY,
CourseCode VARCHAR(25),
CourseName VARCHAR(255),
MaxScore INT
);
DECLARE @Grade TABLE
(ID INT IDENTITY,
CourseID INT,
StudentID INT,
Score INT
);
--Student
insert into @Student(FirstName, LastName)
values ('Test', 'B'), ('Test123', 'K')
--Course
insert into @Course(CourseCode, CourseName, MaxScore)
values ('MAT101', 'MATH',100.00), ('ENG101', 'ENGLISH',100.00)
--Grade
insert into @Grade(CourseID, StudentID, Score)
values (1, 1,93), (1, 1,65), (1, 1,100),
(2, 1,100), (2, 1,69), (2, 1,95),
(1, 2,100), (1, 2,65), (1, 2,100),
(2, 2,100), (2, 2,88), (2, 2,96)
SELECT a.StudentID,
a.FirstName,
a.LastName,
c.CourseCode,
SUM(b.Score) AS 'StudentScore',
SUM(c.MaxScore) AS 'MaxCourseScore',
SUM(CAST(b.Score AS DECIMAL(5, 2))) / SUM(CAST(c.MaxScore AS DECIMAL(5, 2))) AS 'Grade'
FROM @Student a
INNER JOIN @Grade b ON a.StudentID = b.StudentID
INNER JOIN @Course c ON c.CourseID = b.CourseID
GROUP BY a.StudentID,
a.FirstName,
a.LastName,
c.CourseCode;
The problem statement doesn't say anything about dividing by the max; I think you're misunderstanding it.
You need to write a subquery that gets the maximum score for each class, using MAX and GROUP BY. You can then join this with the other tables.
SELECT s.name AS student_name, c.name AS course_name, g.grade, m.max_grade
FROM student AS s
JOIN grade AS g ON s.no = g.studentNumber
JOIN course AS c ON c.code = g.courseCode
JOIN (SELECT courseCode, MAX(grade) AS max_grade
FROM grade
GROUP BY courseCode) AS m
ON m.courseCode = g.courseCode
If you did need to divide the grade by the maximum, you can use g.grade/m.max_grade.
Please take a look at this fiddle.
I'm working on a search filter select box and I want to insert the field names of a table as rows.
Here's the table schema:
CREATE TABLE general
(`ID` int, `letter` varchar(21), `double-letters` varchar(21))
;
INSERT INTO general
(`ID`,`letter`,`double-letters`)
VALUES
(1, 'A','BB'),
(2, 'A','CC'),
(3, 'C','BB'),
(4, 'D','DD'),
(5, 'D','EE'),
(6, 'F','TT'),
(7, 'G','UU'),
(8, 'G','ZZ'),
(9, 'I','UU')
;
CREATE TABLE options
(`ID` int, `options` varchar(15))
;
INSERT INTO options
(`ID`,`options`)
VALUES
(1, 'letter'),
(2, 'double-letters')
;
The ID field in options table acts as a foreign key, and I want to get an output like the following and insert into a new table:
id field value
1 1 A
2 1 C
3 1 D
4 1 F
5 1 G
6 1 I
7 2 BB
8 2 CC
9 2 DD
10 2 EE
11 2 TT
12 2 UU
13 2 ZZ
My failed attempt:
select DISTINCT(a.letter),'letter' AS field
from general a
INNER JOIN
options b ON b.options = field
union all
select DISTINCT(a.double-letters), 'double-letters' AS field
from general a
INNER JOIN
options b ON b.options = field
Pretty sure you want this:
select distinct a.letter, 'letter' AS field
from general a
cross JOIN options b
where b.options = 'letter'
union all
select distinct a.`double-letters`, 'double-letters' AS field
from general a
cross JOIN options b
where b.options = 'double-letters'
Fiddle: http://sqlfiddle.com/#!2/bbf0b/18/0
A couple of things to point out: you can't join on a column alias. Because the column you're aliasing is a literal that you're selecting, you can specify that literal as criteria in the WHERE clause.
You're not really joining on anything between GENERAL and OPTIONS, so what you really want is a CROSS JOIN; the criteria that you're putting into the ON clause actually belongs in the WHERE clause.
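To get from there to the exact id / field / value layout asked for in the question, the literal field name can be joined to options to pick up its ID, and the running id can be assigned afterwards. The sketch below is one possible way under stated assumptions: it uses SQLite via Python's sqlite3 module instead of MySQL, quotes the hyphenated column with double quotes (SQLite style), and numbers the rows on the client with enumerate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE general (ID INTEGER, letter TEXT, "double-letters" TEXT);
INSERT INTO general VALUES
    (1,'A','BB'), (2,'A','CC'), (3,'C','BB'), (4,'D','DD'), (5,'D','EE'),
    (6,'F','TT'), (7,'G','UU'), (8,'G','ZZ'), (9,'I','UU');
CREATE TABLE options (ID INTEGER, options TEXT);
INSERT INTO options VALUES (1,'letter'), (2,'double-letters');
""")

# UNION removes duplicate values; the literal in the "name" column is what
# links each value back to its row in options.
query = """
SELECT o.ID AS field, v.value
FROM (
    SELECT 'letter' AS name, letter AS value FROM general
    UNION
    SELECT 'double-letters', "double-letters" FROM general
) v
JOIN options o ON o.options = v.name
ORDER BY o.ID, v.value
"""
rows = [
    (row_id, field, value)
    for row_id, (field, value) in enumerate(conn.execute(query), start=1)
]
for row in rows:
    print(row)
```

The first printed row is (1, 1, 'A') and the last is (13, 2, 'ZZ'), matching the desired output listed in the question.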
I just made this query on Oracle.
It works and produces the output you described :
SELECT ID, CASE WHEN LENGTH(VALUE) = 2 THEN 2 ELSE 1 END AS FIELD, VALUE
FROM (
SELECT rownum AS ID, letter AS VALUE FROM (SELECT DISTINCT letter FROM general ORDER BY letter)
UNION
SELECT (SELECT COUNT(DISTINCT LETTER) FROM general) +rownum AS ID, double_letters AS VALUE
FROM (
SELECT DISTINCT double_letters FROM general ORDER BY double_letters)
)
It should also run on MySQL, provided rownum (which is Oracle-specific) is replaced with an equivalent row counter.
I did not use the options table; I do not understand its role, and for this example and this type of output it seems unnecessary.
Hope this helps you too.
I have a table that contains two columns
ID | Name
----------------
1 | John
2 | Sam
3 | Peter
6 | Mike
It has missing IDs. In this case these are 4 and 5.
How do I find and insert them together with random names into this table?
Update: cursors and temp tables are not allowed. The random name should be 'Name_' + some random number. It could also be a fixed value like 'Abby', so it doesn't matter.
Using a recursive CTE you can determine the missing IDs as follows
DECLARE @Table TABLE(
ID INT,
Name VARCHAR(10)
)
INSERT INTO @Table VALUES (1, 'John'),(2, 'Sam'),(3,'Peter'),(6, 'Mike')
DECLARE @StartID INT,
@EndID INT
SELECT @StartID = MIN(ID),
@EndID = MAX(ID)
FROM @Table
;WITH IDS AS (
SELECT @StartID IDEntry
UNION ALL
SELECT IDEntry + 1
FROM IDS
WHERE IDEntry + 1 <= @EndID
)
SELECT IDS.IDEntry [ID]
FROM IDS LEFT JOIN
@Table t ON IDS.IDEntry = t.ID
WHERE t.ID IS NULL
OPTION (MAXRECURSION 0)
The option MAXRECURSION 0 lets the query run without hitting SQL Server's default recursion limit.
From Query Hints and WITH common_table_expression (Transact-SQL)
MAXRECURSION number Specifies the maximum number of recursions
allowed for this query. number is a nonnegative integer between 0 and
32767. When 0 is specified, no limit is applied. If this option is not specified, the default limit for the server is 100.
When the specified or default number for MAXRECURSION limit is reached
during query execution, the query is ended and an error is returned.
Because of this error, all effects of the statement are rolled back.
If the statement is a SELECT statement, partial results or no results
may be returned. Any partial results returned may not include all rows
on recursion levels beyond the specified maximum recursion level.
Generating the random names will largely depend on the requirements for such a name and on the column type. What exactly does this random name entail?
You can do this using a recursive Common Table Expression (CTE). Here's an example:
DECLARE @MaxId INT
SELECT @MaxId = MAX(ID) from MyTable
;WITH Numbers(Number) AS
(
SELECT 1
UNION ALL
SELECT Number + 1 FROM Numbers WHERE Number < @MaxId
)
SELECT n.Number, 'Random Name'
FROM Numbers n
LEFT OUTER JOIN MyTable t ON n.Number=t.ID
WHERE t.ID IS NULL
Here are a couple of articles about CTEs that will be helpful: Using Common Table Expressions and Recursive Queries Using Common Table Expressions.
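The queries above only list the missing IDs. A minimal sketch of the full find-and-insert step follows, with 'Name_' plus a random number as the generated name. It is an illustration under assumptions rather than T-SQL: it uses SQLite via Python's sqlite3 module, reads the ID bounds into Python variables (standing in for the T-SQL variables), and borrows the table name MyTable from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MyTable (ID INTEGER PRIMARY KEY, Name TEXT);
INSERT INTO MyTable VALUES (1,'John'), (2,'Sam'), (3,'Peter'), (6,'Mike');
""")

# Read the bounds first, so the recursive CTE only depends on parameters.
start_id, end_id = conn.execute(
    "SELECT MIN(ID), MAX(ID) FROM MyTable"
).fetchone()

# Generate start_id..end_id recursively, keep the numbers with no matching
# row, and insert them with a generated name.
conn.execute("""
WITH RECURSIVE Numbers(Number) AS (
    SELECT ?
    UNION ALL
    SELECT Number + 1 FROM Numbers WHERE Number < ?
)
INSERT INTO MyTable (ID, Name)
SELECT n.Number, 'Name_' || abs(random() % 100000)
FROM Numbers n
LEFT JOIN MyTable t ON t.ID = n.Number
WHERE t.ID IS NULL
""", (start_id, end_id))

rows = conn.execute("SELECT ID, Name FROM MyTable ORDER BY ID").fetchall()
for row in rows:
    print(row)
```

After the insert, IDs 1 through 6 are all present, and the two new rows (4 and 5) carry names of the form Name_ followed by a number.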
Start by selecting the highest number in the table (select top 1 id ... order by id desc, or select max(id)), then run a while loop to iterate from 1 to max.
See this article about looping.
For each iteration, see if the row exists, and if not, insert into table, with that ID.
I think recursive CTE is a better solution, because it's going to be faster, but here is what worked for me:
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[TestTable]') AND type in (N'U'))
DROP TABLE [dbo].[TestTable]
GO
CREATE TABLE [dbo].[TestTable](
[Id] [int] NOT NULL,
[Name] [varchar](50) NOT NULL,
CONSTRAINT [PK_TestTable] PRIMARY KEY CLUSTERED
(
[Id] ASC
))
GO
INSERT INTO [dbo].[TestTable]([Id],[Name]) VALUES (1, 'John')
INSERT INTO [dbo].[TestTable]([Id],[Name]) VALUES (2, 'Sam')
INSERT INTO [dbo].[TestTable]([Id],[Name]) VALUES (3, 'Peter')
INSERT INTO [dbo].[TestTable]([Id],[Name]) VALUES (6, 'Mike')
GO
declare @mod int
select @mod = MAX(number)+1 from master..spt_values where [type] = 'P'
INSERT INTO [dbo].[TestTable]
SELECT y.Id,'Name_' + cast(newid() as varchar(45)) Name from
(
SELECT TOP (select MAX(Id) from [dbo].[TestTable]) x.Id from
(
SELECT
t1.number*@mod + t2.number Id
FROM master..spt_values t1
CROSS JOIN master..spt_values t2
WHERE t1.[type] = 'P' and t2.[type] = 'P'
) x
WHERE x.Id > 0
ORDER BY x.Id
) y
LEFT JOIN [dbo].[TestTable] on [TestTable].Id = y.Id
where [TestTable].Id IS NULL
GO
select * from [dbo].[TestTable]
order by Id
GO
http://www.sqlfiddle.com/#!3/46c7b/18
It's actually very simple :
Create a table called #All_numbers which should contain all the natural numbers in the range that you are looking for.
#list is a table containing your data
select a.num as missing_number ,
'Random_Name' + convert(varchar, a.num)
from #All_numbers a left outer join #list l on a.num = l.Id
where l.id is null