sql queries using group by - mysql

I have a quiz_tracking table (in MySQL) that contains the following fields:
subjectid - the id of the subject ( think of your subjects in school)
assessmentname - the name of the assessment ( think of it as exam or like chapter exams, the exams you found at the end of each chapter) found in each subject
questionid - the id of the question
userid - the id of the user who took the exam
attempt - attempt number
answer - answer given
score - score gained for this question
rownum - disregard this column; I only added it so it's easier to point out which rows I am particularly interested in.
When a user takes a quiz, the records are stored here. The assessments have different numbers of questions. For this example, the first assessment has 3, the second assessment has 3, and the third assessment has 1. The user can leave the quiz in the middle; in this case the answer column will be null for the questions the quiz taker abandoned.
I created a sql fiddle for the table and the data.
http://sqlfiddle.com/#!9/a41473/1
Basically, I need to filter this data set and produce a data set that only contains completed assessment attempts. Meaning, if the assessment has 3 questions and all of these questions were answered (answer is not null), then that attempt should be in the filtered data set.
The way I figured out how to determine whether an assessment attempt is complete is via this sql:
select max(a.question_count) from (
select count(distinct qt.questionid) as 'question_count'
from quiz_tracking qt
where qt.subjectid=22380
and qt.assessmentname = 'first assessment'
and qt.userid in (555,121)
group by qt.attempt, qt.userid ) a
This counts the distinct question ids per attempt and user and takes the maximum. Then I filter with a
having sum(if(answer is not null,1,0)) = (result of the subquery above)
The assumptions here are,
the subjectid is provided,
all the assessment names are provided
all the userids are provided.
In the sql fiddle, I can do it for 1 assessment (e.g. 'first assessment'), but what I need is a filtered data set that covers all the assessments (first assessment, second assessment, third assessment). The expected result should be row numbers 1,2,3,4,5,6,12,13,14,15,16,17,21,22.

Simply join the main table to an aggregate derived table that calculates the question count and the answer count per user, subject, assessment and attempt, then in the outer query keep only the rows where both counts are equal:
select z.*, q_cnt.question_count, q_cnt.answer_count
from quiz_tracking z
inner join
(select c.userid, c.subjectid, c.assessmentname, c.attempt,
count(distinct c.questionid) as 'question_count',
sum(if(c.answer is not null,1,0)) as 'answer_count'
from quiz_tracking c
group by c.userid, c.subjectid, c.assessmentname, c.attempt) q_cnt
on q_cnt.userid = z.userid
and q_cnt.subjectid = z.subjectid
and q_cnt.assessmentname = z.assessmentname
and q_cnt.attempt = z.attempt
where q_cnt.question_count = q_cnt.answer_count
SQL Fiddle DEMO
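For anyone who wants to try the idea without a MySQL server, here is a minimal sketch of the same derived-table join run against an in-memory SQLite database from Python. The sample rows are made up for illustration (they are not the fiddle data), and `COUNT(answer)` is used in place of MySQL's `SUM(IF(answer IS NOT NULL,1,0))`, since `COUNT` skips NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE quiz_tracking (
        rownum INTEGER, subjectid INTEGER, assessmentname TEXT,
        questionid INTEGER, userid INTEGER, attempt INTEGER,
        answer TEXT, score INTEGER
    )
""")
# Hypothetical data: user 555 completes attempt 1 and abandons attempt 2;
# user 121 completes the one-question third assessment.
sample = [
    (1, 22380, "first assessment", 1, 555, 1, "a", 1),
    (2, 22380, "first assessment", 2, 555, 1, "b", 0),
    (3, 22380, "first assessment", 3, 555, 1, "c", 1),
    (4, 22380, "first assessment", 1, 555, 2, "a", 1),
    (5, 22380, "first assessment", 2, 555, 2, None, 0),
    (6, 22380, "first assessment", 3, 555, 2, None, 0),
    (7, 22380, "third assessment", 9, 121, 1, "d", 1),
]
conn.executemany("INSERT INTO quiz_tracking VALUES (?,?,?,?,?,?,?,?)", sample)

completed = conn.execute("""
    SELECT z.rownum
    FROM quiz_tracking z
    JOIN (
        SELECT userid, subjectid, assessmentname, attempt,
               COUNT(DISTINCT questionid) AS question_count,
               COUNT(answer) AS answer_count  -- COUNT(col) ignores NULLs
        FROM quiz_tracking
        GROUP BY userid, subjectid, assessmentname, attempt
    ) q ON q.userid = z.userid
       AND q.subjectid = z.subjectid
       AND q.assessmentname = z.assessmentname
       AND q.attempt = z.attempt
    WHERE q.question_count = q.answer_count
    ORDER BY z.rownum
""").fetchall()

completed_rownums = [r[0] for r in completed]
print(completed_rownums)  # [1, 2, 3, 7]
```

Note that the comparison is made within each attempt's own rows, so it relies on unanswered questions being stored with a NULL answer, as the question describes.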

Related

How can i get a specific data from my dataset using SQL and PHPmyadmin?

I'm new to sql.
I have 3 datasets ,
patient (columns : id age Zip_Code, size, weight, sex)
blood_tests (columns : test_ID, test_date, blood_sugar, laboratory_ID, patient_ID)
laboratory (columns : id, name, Zip_code, departments)
How can I get the number of patients per center?
I wrote this query, but it doesn't give the number per center:
select DISTINCT patient_ID, laboratory_ID from patient,blood_tests where patient.id = blood_tests.patient_ID AND blood_tests.laboratory_ID = laboratory.id;
But I don't know how to get the total number of patients per center, because some of them did more than one exam in the same center and some have done tests in many labs.
For the second question, he asks us to get the 4 tests that a specific patient carried out in a laboratory called 'NWB'.
I wrote the query below and noticed that this is the patient with ID = 25, but how can I get that without specifying that the id is 25?
select patient_ID, laboratory.name from patient, blood_tests,laboratory where patient_ID = blood_tests.patient_ID AND blood_tests.laboratory_ID = laboratory.id HAVING laboratory.name = "NWB";
Thank You in advance.
This query will give you the number of distinct patients that had a blood test done for each laboratory. The query GROUPs all the records with the same laboratory_ID and then counts the number of DISTINCT patients per GROUP. Distinct in this case means that each patient is only counted once per GROUP even if there are multiple records with the same patient_ID and laboratory_ID. However the patient can still be counted in other GROUPs.
SELECT laboratory_ID, COUNT(DISTINCT patient_ID)
FROM blood_tests
GROUP BY laboratory_ID;
You could join the laboratory table to show the name of the laboratory instead of the id.
SELECT laboratory.name, COUNT(DISTINCT blood_tests.patient_ID)
FROM blood_tests
JOIN laboratory ON laboratory.id = blood_tests.laboratory_ID
GROUP BY blood_tests.laboratory_ID;
Your other question is confusing because there is no 'the' test for a patient who carried out 4 tests. There would be 4 tests, so which one of them is 'the' test?
Here is a query that will list the ids of all patients that have had exactly 4 tests carried out at the laboratory with the id 42.
SELECT patient_ID
FROM blood_tests
WHERE laboratory_ID = 42
GROUP BY patient_ID
HAVING COUNT(patient_ID) = 4;
EDIT based on comment from OP:
I hope this is useful for anyone trying to learn how to query a database.
When you query a database you're asking it for information based on what is found in the database, not based on what you already know to be true. You first need to figure out what it is that you are asking.
If you say you know that there is some (some means at least one) specific patient that did exactly 4 tests at laboratory NWB, you could ask the database to list all patients that carried out exactly 4 tests at laboratory NWB.
If you know the specific patient and want to get the tests, then you would ask the database for all the tests carried out at laboratory NWB for this specific patient. In this case the fact that you know there are 4 of them is not a criterion for selection, so it wouldn't appear in your query.
If you know there is some patient that did exactly 4 tests and you want to get their tests. What are you asking the database? You could ask: Find me all tests for any patient that has done exactly 4 tests at laboratory NWB. But if there are other patients that also had exactly 4 tests at that laboratory you would get their tests as well.
Above you saw how to group records together by a criterion and how to count the members of each group.
The HAVING clause allows you to limit results based on criteria that apply to a group.
The WHERE clause allows you to limit results based on criteria that apply to a record.
You should be able to construct a query for any of those scenarios using what was shown above.
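As a rough illustration of those scenarios, here is a small sketch using Python's built-in sqlite3 module instead of MySQL. The two laboratories and seven tests are invented sample data, and the tables are trimmed to the columns the queries need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE laboratory (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE blood_tests (
        test_ID INTEGER PRIMARY KEY, patient_ID INTEGER, laboratory_ID INTEGER
    );
    INSERT INTO laboratory VALUES (1, 'NWB'), (2, 'Other');
    -- patient 25: four tests at NWB; patient 7: two at NWB, one at Other
    INSERT INTO blood_tests VALUES
        (1, 25, 1), (2, 25, 1), (3, 25, 1), (4, 25, 1),
        (5, 7, 1), (6, 7, 1), (7, 7, 2);
""")

# Distinct patients per laboratory: each patient counts once per group.
per_lab = conn.execute("""
    SELECT l.name, COUNT(DISTINCT bt.patient_ID)
    FROM blood_tests bt
    JOIN laboratory l ON l.id = bt.laboratory_ID
    GROUP BY l.id, l.name
    ORDER BY l.name
""").fetchall()

# WHERE filters records (laboratory NWB), HAVING filters groups (exactly 4 tests).
four_test_patients = [r[0] for r in conn.execute("""
    SELECT bt.patient_ID
    FROM blood_tests bt
    JOIN laboratory l ON l.id = bt.laboratory_ID
    WHERE l.name = 'NWB'
    GROUP BY bt.patient_ID
    HAVING COUNT(*) = 4
""")]

# All NWB tests of any patient that has exactly 4 tests at NWB.
their_tests = [r[0] for r in conn.execute("""
    SELECT bt.test_ID
    FROM blood_tests bt
    JOIN laboratory l ON l.id = bt.laboratory_ID
    WHERE l.name = 'NWB'
      AND bt.patient_ID IN (
          SELECT bt2.patient_ID
          FROM blood_tests bt2
          JOIN laboratory l2 ON l2.id = bt2.laboratory_ID
          WHERE l2.name = 'NWB'
          GROUP BY bt2.patient_ID
          HAVING COUNT(*) = 4
      )
    ORDER BY bt.test_ID
""")]

print(per_lab)             # [('NWB', 2), ('Other', 1)]
print(four_test_patients)  # [25]
print(their_tests)         # [1, 2, 3, 4]
```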
You need to use COUNT(DISTINCT ~ ) and GROUP BY. We don't need to use the table patients because we have patient_ID in blood_tests.
SELECT
COUNT( DISTINCT bt.patient_ID) number_patients ,
bt.laboratory_ID
FROM
blood_tests bt
JOIN
laboratory l
ON bt.laboratory_ID = l.id
GROUP BY
laboratory_ID;
For the second request:
SELECT
GROUP_CONCAT(
bt.test_ID
SEPARATOR '\n' -- SEPARATOR takes a string literal; '\n' is a line feed
) test_IDs,
bt.patient_ID patientID,
bt.laboratory_ID labo_ID,
l.name lab_name,
l.Zip_code lab_zip,
l.departments lab_dept
FROM
blood_tests bt
JOIN
laboratory l
ON
bt.laboratory_ID = l.id
WHERE
l.name = 'NWB'
GROUP BY
bt.patient_ID ,
bt.laboratory_ID,
l.name,
l.Zip_code,
l.departments
HAVING
COUNT(bt.test_ID) = 4;
Assuming you're using MySQL with phpMyAdmin (from now on I'll call it PMA):
You get the COUNT of patients for each laboratory using GROUP BY keyword.
You can read the (unofficial) documentation here.
Using GROUP BY you can get the count of tests for each laboratory with
SELECT laboratory_ID, COUNT(*) As num_tests FROM blood_tests WHERE 1 GROUP BY laboratory_ID;
So the query returns 2 columns one for the laboratory ID and one for the number of tests for that one.
You can show the zip code of each laboratory by joining the laboratory table; in this case you should use an INNER JOIN (click here for the (unofficial) documentation).

Group by one field and then group the result by another field

I have a Query that Groups by a column which is needed so I get the result that I need, and then I need to return results which should be done by Grouping the previous results by another field.
So basically I have a Survey table,
sql = SELECT * FROM Survey S
WHERE S.UserId = 79
Group By S.SurveyNumber
Having SUM (S.Counter) <> 0 ORDER BY S.SubmittedDate DESC
This returns the surveys grouped by SurveyNumber, and then I need to group that result by SurveyName and return the last submitted survey for each SurveyName (Max(SubmittedDate)).
Can I achieve this in using one query ? If I have
GroupBy S.SurveyNumber, S.SurveyName
then it will group only the rows where BOTH columns are the same.
How do I do this ?
i think it works:
SELECT
S2.SurveyName
,SUM(S2.Count) as SurveyCount
FROM (
select
SUM(S.SurveyNumber) as Count
,S.SurveyName
FROM Survey S
where S.SurveyNumber <> 0
Group By S.SurveyNumber,S.SurveyName
) as S2
Group By S2.SurveyName
This is how I understand this:
Every survey belongs to one user. You want the survey of one particular user.
The table is actually not a Survey table (with one record representing a survey), but a kind of survey chronology table. There are multiple records per survey.
You must look at all records per survey in order to know whether it's paid.
For each paid survey you want the last chronology record.
You have already shown how to check whether a survey is paid. Now select the last date for them (the maximum date). Based on this get the related records from the table.
select *
from survey
where (surveynumber, submitteddate) in
(
select surveynumber, max(submitteddate)
from survey
where userid = 79
group by surveynumber
having sum(counter) <> 0
);
I may be wrong though, because you make it sound like a survey number is somehow independent from the survey name. (One survey number with various names? The same survey name for multiple survey numbers?)
You would certainly benefit from a better data model with at least two separate tables for survey and survey details.
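Here is a small sketch of that approach, run with Python's sqlite3 module rather than MySQL. It writes the same idea as a join to the aggregated subquery instead of the row-value IN, which should behave the same way. The four sample rows are invented, and the column names follow the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE survey (
        surveynumber INTEGER, surveyname TEXT, userid INTEGER,
        counter INTEGER, submitteddate TEXT
    );
    INSERT INTO survey VALUES
        (1, 'A', 79,  1, '2020-01-01'),
        (1, 'A', 79,  1, '2020-02-01'),
        (2, 'B', 79,  1, '2020-01-15'),
        (2, 'B', 79, -1, '2020-03-01');  -- counters cancel out: sum is 0
""")

latest = conn.execute("""
    SELECT s.surveynumber, s.submitteddate
    FROM survey s
    JOIN (
        -- one row per survey whose counters do not sum to zero,
        -- carrying the latest submitted date
        SELECT surveynumber, MAX(submitteddate) AS last_date
        FROM survey
        WHERE userid = 79
        GROUP BY surveynumber
        HAVING SUM(counter) <> 0
    ) m ON m.surveynumber = s.surveynumber
       AND m.last_date = s.submitteddate
    ORDER BY s.surveynumber
""").fetchall()

print(latest)  # [(1, '2020-02-01')]
```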

Mysql Calculate Percentage Of Selected Rows

I'm sorry if my question is irrelevant; I'm not an expert in MySQL.
I have surveys on my website. If someone votes in my survey, I want to show the survey results to the voter.
Here is the DB Structure;
surveys Table (Includes Survey Name Description and survey_id)
survey_choices Table (Includes Survey Choices and is related to the surveys Table by survey_id)
I'm trying to calculate the percentage for only the visible survey choices.
If I have only 1 survey, it calculates correct results. But if I have more than 1 survey, MySQL sums the survey_vote_count values of the whole table.
Here is my MySQL query;
SELECT
survey_vote_id, survey_vote_name, survey_vote_count, survey_id,
survey_vote_count * 100 / t.s AS survey_percentage
FROM survey_votes
CROSS JOIN (SELECT SUM(survey_vote_count) AS s FROM survey_votes) t
WHERE visibility = :visibility
AND survey_id = :surveyId
ORDER BY `survey_votes`.`order_id` ASC
How can I calculate the percentages so that, e.g., only the choices with survey_id = 1 and visibility = 1 add up to 100%?
Any help will be greatly appreciated.
You should add the same condition to the subquery in your cross join. At the moment, that subquery sums survey_vote_count over the whole table, without the WHERE conditions (visibility and survey_id) that the outer select uses.
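A minimal sketch of that fix, run against SQLite from Python rather than MySQL. The vote rows are made-up sample data, and `100.0` is used so the division is not truncated to an integer in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE survey_votes (
        survey_vote_id INTEGER PRIMARY KEY, survey_vote_name TEXT,
        survey_vote_count INTEGER, survey_id INTEGER,
        visibility INTEGER, order_id INTEGER
    );
    INSERT INTO survey_votes VALUES
        (1, 'Yes',    30, 1, 1, 1),
        (2, 'No',     10, 1, 1, 2),
        (3, 'Maybe',  60, 1, 0, 3),  -- hidden choice, must not count
        (4, 'Red',   500, 2, 1, 1);  -- belongs to another survey
""")

params = {"visibility": 1, "surveyId": 1}
percentages = conn.execute("""
    SELECT survey_vote_name,
           survey_vote_count * 100.0 / t.s AS survey_percentage
    FROM survey_votes
    CROSS JOIN (
        -- same filter as the outer query, so the total only covers
        -- the visible choices of this one survey
        SELECT SUM(survey_vote_count) AS s
        FROM survey_votes
        WHERE visibility = :visibility AND survey_id = :surveyId
    ) t
    WHERE visibility = :visibility AND survey_id = :surveyId
    ORDER BY order_id
""", params).fetchall()

print(percentages)  # [('Yes', 75.0), ('No', 25.0)]
```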

How to select all subjects that are taken by all students such that all students took grade > 60 in them

I have a table called records containing 3 columns: student, subject and grade
Each {x,y,z} entry in the table means student x took class y with grade z. If student x didn't take class y then the entry doesn't exist in the table. So this table is like a university record of all students and the subjects taken by them.
I want to select all subjects that are taken by all students such that the grades of all students in these subjects are above 60.
I tried creating the table
CREATE TABLE temp SELECT subject FROM records WHERE grade > 60;
Then I used temp to create a new table that has subject and count, where count counts the number of students that took that subject, and then I deleted all rows that have count< number of students. But I know this is very inefficient.
How can I do this more efficiently using MySQL ?
Also if you can provide me with good MySQL resource/tutorial link so that I can practice, I would be thankful. I am new to MySQL and I am working on large databases and I need to make my queries more efficient and straight forward.
How about
SELECT subject FROM records
WHERE subject NOT IN
(
SELECT subject FROM records
WHERE grade <=60
)
AND subject IN
(
SELECT subject FROM records
GROUP BY subject
HAVING count(*) = (SELECT COUNT(DISTINCT student) FROM records)
)
As further reading I'd recommend this and this
EDITED: now includes "subjects taken by ALL students"
I would use a NOT EXISTS clause because it only checks whether the subquery returns any rows and does not need to build the full result set. It is often more efficient than a NOT IN clause.
SELECT subject FROM records r1
WHERE NOT EXISTS
(
SELECT 1 FROM records r2
WHERE r2.subject=r1.SUBJECT AND grade<=60
)
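Here is a sketch that combines the NOT EXISTS check with the "taken by all students" condition, run with Python's sqlite3 module instead of MySQL. The students, subjects and grades are invented for illustration, and DISTINCT is added so each subject is listed once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE records (student TEXT, subject TEXT, grade INTEGER);
    INSERT INTO records VALUES
        ('ann', 'math',  90), ('bob', 'math', 70),  -- everyone, all above 60
        ('ann', 'art',   80), ('bob', 'art',  55),  -- bob is at 60 or below
        ('ann', 'music', 95);                       -- bob did not take it
""")

subjects = [r[0] for r in conn.execute("""
    SELECT DISTINCT r1.subject
    FROM records r1
    WHERE NOT EXISTS (
              -- no grade of 60 or below in this subject
              SELECT 1 FROM records r2
              WHERE r2.subject = r1.subject AND r2.grade <= 60
          )
      AND (SELECT COUNT(DISTINCT r3.student) FROM records r3
           WHERE r3.subject = r1.subject)
          = (SELECT COUNT(DISTINCT student) FROM records)
    ORDER BY r1.subject
""")]

print(subjects)  # ['math']
```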

Finding the sum of a field in a linked table per group with additional search criteria

My actual tables are much more complex but here is a simplified example of the problem I am trying to work out.
Table contact: ContactID, ContactName, Pending
Table purchase: PurchaseID, ContactID, Amount, Pending, Date
Table contact_purchase_link: ContactID, PurchaseID (although it may seem like the link table is not necessary in this simplified example it is necessary in the large table schema)
Here is the query that I currently have:
SELECT DISTINCT contact.ContactID,
( SELECT SUM(Amount)
FROM purchase
WHERE purchase.ContactID = contact.ContactID
AND purchase.Pending = 0
) totalpurchase
FROM contact
INNER JOIN ( contact_purchase_link JOIN purchase
ON (contact_purchase_link.PurchaseID = purchase.PurchaseID
))
USING (ContactID)
WHERE purchase.Date > '2013-12-06' AND
AND contact.Pending =0
The problem is that I want the totalpurchase (the sum of the amount field) to be limited to the search criteria of the purchase table - meaning the query should only return the sum of the purchases after the specified date per contact. I think in order to use a group by clause the query would have to be based off the purchase table but I need the query to use the contact table so that all contacts are listed with their total purchase amounts and other relevant client data.
Is there any way to do this within one query?
To further clarify:
This query is being generated as part of a search engine. An example of why a query like this would be done is if a user wanted to generate a contact list of lastnames starting with A with purchases of a specific item or as in this example of purchases for a specific date. So that in general the query would have to generate a list of all contacts and their data (with possible search criteria on the type of contact such as all lastnames starting with 'A' etc.) and the query can also include search criteria on the purchase table such as the date of the purchase and whether the purchase was for specific items etc.
I am trying to add in the option to also list the sum of the purchases for the contact however that sum has to be limited to the search criteria for the purchase table as well and not the sum of all the contacts purchases.
If I understand your question correctly, you need to move the date comparison inside the first subquery:
SELECT DISTINCT contact.ContactID,
( SELECT SUM(Amount)
FROM purchase
WHERE purchase.ContactID = contact.ContactID
AND purchase.Pending = 0
AND purchase.Date > '2013-12-06'
) totalpurchase
FROM contact
INNER JOIN ( contact_purchase_link JOIN purchase
ON (contact_purchase_link.PurchaseID = purchase.PurchaseID
))
USING (ContactID)
WHERE purchase.Date > '2013-12-06'
AND contact.Pending =0
But the comments are right - I corrected a couple of what appears to be syntax errors, and I'm not sure about the join to contact_purchase_link. Improve your question and my answer will be less like guesswork.
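For what it's worth, here is a rough sketch of one way to get a per-contact sum that respects the purchase criteria, run against SQLite from Python rather than MySQL. It uses a plain join with GROUP BY instead of the correlated subquery. It also leaves out contact_purchase_link and joins on purchase.ContactID directly, which is an assumption based on the simplified schema in the question. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contact (
        ContactID INTEGER PRIMARY KEY, ContactName TEXT, Pending INTEGER
    );
    CREATE TABLE purchase (
        PurchaseID INTEGER PRIMARY KEY, ContactID INTEGER,
        Amount INTEGER, Pending INTEGER, Date TEXT
    );
    INSERT INTO contact VALUES (1, 'Adams', 0), (2, 'Allen', 0), (3, 'Avery', 1);
    INSERT INTO purchase VALUES
        (1, 1, 100, 0, '2013-12-01'),  -- before the cutoff date
        (2, 1,  50, 0, '2013-12-10'),
        (3, 1,  25, 0, '2013-12-20'),
        (4, 2,  40, 0, '2013-11-01'),  -- before the cutoff date
        (5, 3,  10, 0, '2013-12-15');  -- contact 3 is pending
""")

# The purchase filters sit in the WHERE clause, so the same rows that
# select a contact are the rows that get summed.
totals = conn.execute("""
    SELECT c.ContactID, SUM(p.Amount) AS totalpurchase
    FROM contact c
    JOIN purchase p ON p.ContactID = c.ContactID
    WHERE c.Pending = 0
      AND p.Pending = 0
      AND p.Date > '2013-12-06'
    GROUP BY c.ContactID
    ORDER BY c.ContactID
""").fetchall()

print(totals)  # [(1, 75)]
```

Contacts with no purchase after the date drop out of the result, which matches the INNER JOIN in the original query. A LEFT JOIN with the purchase conditions moved into the ON clause would keep them with a NULL total.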