How do I compare two queries by two columns in MySQL? - mysql

What's the best way to compare two queries by two columns? These are my tables:
This table shows exam questions
idEvaluation | Question | AllowMChoice | CorrectAnswer|
1 1 0 3
1 2 1 4
1 2 1 5
1 3 0 9
This table shows a completed exam
idExam| idEvaluation | Question | ChosenAnswer|
25 1 1 2
25 1 2 4
25 1 2 5
25 1 3 8
I have to calculate the percentage of correct answers, considering that certain questions may allow multiple selections:
Correct Answers / Total Answers * 100
thanks for your tips!

This code will show you a listing of Questions and whether or not they were answered correctly.
select
A.Question,
min(1) as QuestionsCount,
-- if this evaluates to null, they got A) the answer wrong or B) this portion of the answer wrong
-- we use MIN() here because we want to mark multi-answer questions as wrong if any part of the answer is wrong.
min(case when Q.idEvaluation IS NULL then 0 else 1 end) as QuestionsCorrect
from
ExamAnswers as A
left join ExamQuestions as Q on Q.Question = A.Question and Q.CorrectAnswer = A.ChosenAnswer
group by
A.Question -- We group by question to merge multi-answer-questions into 1
Output Confirmed:
Note, the columns are intentionally named this way, as they are to be included as a subquery in part-2 below.
This code will give you the test score.
select
sum(I.QuestionsCorrect) as AnswersCorrect,
sum(I.QuestionsCount) as QuestionTotal,
convert(float,sum(I.QuestionsCorrect)) / sum(I.QuestionsCount) as PercentCorrect -- SQL Server cast syntax; in MySQL the / operator already returns a decimal, so sum(I.QuestionsCorrect) / sum(I.QuestionsCount) needs no cast
from
(select
A.Eval,
A.Question,
min(1) as QuestionsCount,
min(case when Q.idEvaluation IS NULL then 0 else 1 end) as QuestionsCorrect
from
ExamAnswers as A
left join ExamQuestions as Q on Q.Question = A.Question and Q.CorrectAnswer = A.ChosenAnswer
where
A.Eval = 25
group by
A.Question, A.Eval) as I
group by
I.Eval
Output Confirmed:
This will communicate the general concept. Your column names idEvaluation and Eval are difficult for me to understand, but I'm sure you can adjust the code above to suit your purposes.
Note, I did this in sql server, but I used fairly basic SQL functionality, so it should translate to MySQL well.
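To make the idea above easy to try out, here is a minimal sketch that runs the same left-join-and-MIN() logic against the sample rows from the question. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and it assumes that the Eval column in the answer corresponds to idExam in the question. The helper names build_exam_db and exam_score are made up for this illustration.

```python
import sqlite3


def build_exam_db():
    """In-memory database holding the sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE ExamQuestions (idEvaluation INT, Question INT,
                                    AllowMChoice INT, CorrectAnswer INT);
        CREATE TABLE ExamAnswers (idExam INT, idEvaluation INT,
                                  Question INT, ChosenAnswer INT);
        INSERT INTO ExamQuestions VALUES
            (1, 1, 0, 3), (1, 2, 1, 4), (1, 2, 1, 5), (1, 3, 0, 9);
        INSERT INTO ExamAnswers VALUES
            (25, 1, 1, 2), (25, 1, 2, 4), (25, 1, 2, 5), (25, 1, 3, 8);
        """
    )
    return conn


def exam_score(conn, exam_id):
    """Return (questions_correct, questions_total, percent) for one exam."""
    return conn.execute(
        """
        SELECT SUM(QuestionsCorrect), COUNT(*),
               100.0 * SUM(QuestionsCorrect) / COUNT(*)
        FROM (
            SELECT A.Question,
                   -- MIN() marks a multi-answer question wrong as soon as
                   -- one chosen answer has no matching correct answer
                   MIN(CASE WHEN Q.idEvaluation IS NULL THEN 0 ELSE 1 END)
                       AS QuestionsCorrect
            FROM ExamAnswers AS A
            LEFT JOIN ExamQuestions AS Q
                   ON Q.idEvaluation = A.idEvaluation
                  AND Q.Question = A.Question
                  AND Q.CorrectAnswer = A.ChosenAnswer
            WHERE A.idExam = ?          -- assumption: Eval means idExam
            GROUP BY A.Question
        ) AS I
        """,
        (exam_id,),
    ).fetchone()


correct, total, percent = exam_score(build_exam_db(), 25)
print(correct, total, round(percent, 2))
```

One thing to keep in mind: this only checks that every chosen answer is correct. A multi-answer question where the student picked just one of the correct options would still count as correct, so you may need an extra check if that matters for your scoring.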

Related

Query to list users who have answered similarly to questions as the specified user

I'm making a project in which users answer questions with yes/no choices and an option of must-match, e.g.:
Question 1) Stack Overflow is helpful? [Yes / No] [MustMatch]
Question 2) Hills are better than beaches? [Yes / No] [MustMatch]
etc.
Users can skip questions if they want.
I need MySQL query to calculate match percentage or match ratio between two specified users; the match ratio being number of same answers / number of common questions. (The MustMatch option should be ignored for this query.) Example:
User 3 has answered as: Q1) Yes, Q2) No, Q3) Skip, Q4) Yes
User 5 has answered as: Q1) Yes, Q2) Yes, Q3) No, Q4) Yes
User 9 has answered as: Q1) Yes, Q2) No, Q3) Yes, Q4) No
Suppose we want the match ratio of user 3 and user 5. The match ratio is same-answers / common-questions = 2/3: they gave the same answer to 2 questions (Q1 and Q4), and because user 3 skipped one question they have 3 questions in common.
If we want match ratio of user 5 and user 9, it'll be 1/4.
I also need MySQL query to show top 20 matches in descending order of match ratio for a user. The MustMatch option should be considered for this query. This list of top 20 matches will contain userid, no. of common answers, no. of common questions and (match percentage or match ratio). For top 20 matches of example user 5: user 3 will come before user 9 in descending order of match ratio:
UserID CommonAnswers CommonQuestions Percentage
...
3 2 3 66%
...
9 1 4 25%
...
"MustMatch" explained:
Simply, the answer must match. If the answer to a question is important compatibility match to a user then they can mark their answer as MustMatch. If a user marks their answer as MustMatch and that particular answer doesn't match with the other user then don't show their match in top 20 list of either of the users.
For example, if user 5 marks their answer to the 4th question as MustMatch then in user 5's top 20 list, user 9 won't display because of mismatching answers to 4th question. Also then in user 9's top 20 list, user 5 won't display.
If user 5 marks their answer to the 3rd question as MustMatch then user 5 won't match with user 3 as user 3 has skipped that question and user 5 won't match with user 9 because of mismatching answers to that question.
Skipped question can't be marked MustMatch, as you can see in below database design that it's not possible.
This is my table for storing answers:
create table `answers`(
`userid` mediumint unsigned not null,
`qid` tinyint unsigned not null, # question id
`answer` bit(1) not null,
`mustMatch` bit(1) not null default b'0',
primary key (`userid`,`qid`)
);
If a question is skipped then don't have its record in this answers table.
I've a separate table for storing questions' texts:
create table `questions`(
`qid` tinyint unsigned not null auto_increment primary key,
`question` tinytext not null
);
In calculating top 20 matches, apart from MustMatch, there's another factor: "MinMatch", in which users can set the minimum matching answers they want from their matches. If user 5 set their MinMatch as 3 then in top 20 list of any user, user 5 will match with only those users who have minimum 3 same answers as user 5. MinMatch is set by the user; it should be >=1 and <= total questions answered by the user.
create table `users`(
`userid` mediumint unsigned not null auto_increment primary key,
`minMatch` tinyint unsigned default 1 # minimum matching answers
);
Can you please tell how to make the two queries? (my two bolded sentences)
Here's how the 2nd query may be: (you can skip this)
SELECT
U2.userid,
SUM(CASE
WHEN A1.answer = A2.answer THEN 1
ELSE 0
END) AS common_answers,
SUM(CASE
WHEN A1.answer IS NOT NULL AND A2.answer IS NOT NULL THEN 1
ELSE 0
END) AS common_questions,
common_answers/common_questions AS ratio
FROM
questions Q
LEFT OUTER JOIN answers A1 ON
A1.qid = Q.qid AND
A1.userid = 1
LEFT OUTER JOIN answers A2 ON
A2.qid = A1.qid AND
A2.userid <> A1.userid
LEFT OUTER JOIN users U2 ON
U2.userid = A2.userid
GROUP BY
U2.userid
ORDER BY
ratio DESC
LIMIT 20
Here I've modified a query by Tom H that I found on Stack Overflow for a similar question. But this query doesn't work (error: Unknown column 'common_answers' in 'field list') and doesn't include the MustMatch and MinMatch factors.
More edit: (More explanation:)
How my first example is stored in the database:
answers table:
userid qid answer mustMatch
3 1 1 0
3 2 0 0
3 4 1 0
5 1 1 0
5 2 1 0
5 3 0 0
5 4 1 0
9 1 1 0
9 2 0 0
9 3 1 0
9 4 0 0
As discussed, match ratio of user 3 and user 5 = 2/3, match ratio of user 5 and user 9 = 1/4.
Let's add a new user: User 4:
User 4 has answered as: Q1) Yes, Q2) No, Q3) No, Q4) Yes
userid qid answer mustMatch
4 1 1 0
4 2 0 0
4 3 0 0
4 4 1 0
Let's start calculating top 20 but before that we need to know the users' minMatch values:
users table:
userid minMatch
3 1
5 1
9 1
4 1
Top 20 matches of user 5:
Result:
UserID CommonAnswers CommonQuestions Percentage
4 3 4 75%
3 2 3 66%
9 1 4 25%
If user 5 marks their answer to 3rd question as must match, i.e. this record:
userid qid answer mustMatch
5 3 0 0
changes to this:
5 3 0 1
Then user 5's top 20 list will be:
UserID CommonAnswers CommonQuestions Percentage
4 3 4 75%
Because only user 4 matches with user 5's answer to 3rd question.
Also then user 3's top 20 list will be:
UserID CommonAnswers CommonQuestions Percentage
4 3 3 100%
9 2 3 66%
Now if user 9 increases their min match to 3, i.e. this record:
userid minMatch
9 1
changes to this:
9 3
Then user 3's top 20 list will be:
UserID CommonAnswers CommonQuestions Percentage
4 3 3 100%
User 9 got removed from user 3's top 20 list, because user 3 and user 9 have only 2 answers in common while user 9 requires minimum 3 answers to be in common.
Also then user 9's top 20 list will be empty as no user has at least 3 matching answers with user 9.
Instead of storing answers as I do now, I can store them as strings in the users table if that's a better method, by adding an answers field to the users table:
create table `users`(
`userid` mediumint unsigned not null auto_increment primary key,
`minMatch` tinyint unsigned default 1, # minimum matching answers
`answers` char(5) not null default 'sssss' # 5 questions
);
Then answers can be stored as strings e.g. yYNns, where y=yes to 1st and 2nd questions in this string, n=no to 3rd and 4th questions, capitalizations to 2nd and 3rd questions ('Y' and 'N') means must match, s=skip the 5th question.
Or maybe some other better storage method. You can give me the 2 SQL queries according to whichever method you think is the best.
For the first part where you just want to get the ratio of equally answered questions to common answered questions, I'd use this query:
select a1.userid, a2.userid, sum(a1.answer = a2.answer) / count(*) as ratio
from answers a1
join answers a2 on a2.qid = a1.qid and a2.userid > a1.userid
group by a1.userid, a2.userid
order by a1.userid, a2.userid;
I am using a2.userid > a1.userid to get each user pair only once. (If I used a2.userid <> a1.userid, I'd get users 3 and 5 twice for instance, one time as 3/5 and one time as 5/3.)
If you want to look up two particular users, we add the appropriate WHERE clause and remove the "and a2.userid > a1.userid" condition from the join. We can also get rid of GROUP BY, because there is just one row to return.
select sum(a1.answer = a2.answer) / count(*) as ratio
from answers a1
join answers a2 on a2.qid = a1.qid
where a1.userid = 3 and a2.userid = 5;
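As a quick sanity check of the two-user query, here is a small sketch that loads the answers rows from the question and returns the two counts behind the ratio. It runs on SQLite through Python's sqlite3 module rather than on MySQL (an assumption made only so the example is self-contained), and the helper names are invented for the illustration. The counts are returned separately because SQLite, unlike MySQL, truncates integer division.

```python
import sqlite3

# (userid, qid, answer, mustMatch) rows as listed in the question
ANSWER_ROWS = [
    (3, 1, 1, 0), (3, 2, 0, 0), (3, 4, 1, 0),
    (5, 1, 1, 0), (5, 2, 1, 0), (5, 3, 0, 0), (5, 4, 1, 0),
    (9, 1, 1, 0), (9, 2, 0, 0), (9, 3, 1, 0), (9, 4, 0, 0),
]


def build_answers_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE answers (userid INT, qid INT, answer INT, "
        "mustMatch INT DEFAULT 0, PRIMARY KEY (userid, qid))"
    )
    conn.executemany("INSERT INTO answers VALUES (?, ?, ?, ?)", ANSWER_ROWS)
    return conn


def match_counts(conn, user_a, user_b):
    """Return (same_answers, common_questions) for two users."""
    return conn.execute(
        """
        SELECT SUM(a1.answer = a2.answer), COUNT(*)
        FROM answers a1
        JOIN answers a2 ON a2.qid = a1.qid   -- only questions both answered
        WHERE a1.userid = ? AND a2.userid = ?
        """,
        (user_a, user_b),
    ).fetchone()


conn = build_answers_db()
print(match_counts(conn, 3, 5))  # users 3 and 5
print(match_counts(conn, 5, 9))  # users 5 and 9
```

Skipped questions need no special handling here: they have no row in answers, so the inner join drops them automatically.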
For the top 20 query we have one user given and need a HAVING clause to make sure that neither mustmatch nor minmatch is violated.
select
a1.userid,
a2.userid,
count(*) as common_questions,
sum(a1.answer = a2.answer) as common_answers,
sum(a1.answer = a2.answer) / count(*) as ratio
from answers a1
join answers a2 on a2.qid = a1.qid and a2.userid <> a1.userid
where a2.userid = 9
group by a1.userid, a2.userid
having sum(a1.answer <> a2.answer and (a1.mustmatch = 1 or a2.mustmatch = 1)) = 0
and sum(a1.answer = a2.answer) >=
all (select minmatch from users u where u.userid in (a1.userid, a2.userid))
order by ratio desc
limit 20;
Demo: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=21edc084a0843987a14cb8b49f0f0213
Both queries use sum( <boolean expression> ). As true is 1 and false is 0 in MySQL this counts matches. The standard SQL way of writing this would be sum(case when <boolean expression> then 1 else 0 end) or count(*) filter (where <boolean expression>).
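To check the top 20 logic against the worked example in the question, here is a hedged sketch using SQLite through Python's sqlite3 module (SQLite has no ">= ALL (subquery)", so the minMatch test is written with two MAX() comparisons instead). It also adds one rule that the question describes: a pair is dropped when one user marked a question as must-match and the other user skipped it. That extra NOT EXISTS filter and the helper names are my own additions, not part of the query above.

```python
import sqlite3


def build_match_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (userid INT PRIMARY KEY, minMatch INT DEFAULT 1);
        CREATE TABLE answers (userid INT, qid INT, answer INT,
                              mustMatch INT DEFAULT 0,
                              PRIMARY KEY (userid, qid));
        INSERT INTO users VALUES (3, 1), (4, 1), (5, 1), (9, 1);
        INSERT INTO answers VALUES
            (3, 1, 1, 0), (3, 2, 0, 0), (3, 4, 1, 0),
            (4, 1, 1, 0), (4, 2, 0, 0), (4, 3, 0, 0), (4, 4, 1, 0),
            (5, 1, 1, 0), (5, 2, 1, 0), (5, 3, 0, 0), (5, 4, 1, 0),
            (9, 1, 1, 0), (9, 2, 0, 0), (9, 3, 1, 0), (9, 4, 0, 0);
        """
    )
    return conn


TOP_MATCHES_SQL = """
SELECT a2.userid AS other_user,
       SUM(a1.answer = a2.answer) AS common_answers,
       COUNT(*) AS common_questions,
       1.0 * SUM(a1.answer = a2.answer) / COUNT(*) AS ratio
FROM answers a1
JOIN answers a2 ON a2.qid = a1.qid AND a2.userid <> a1.userid
JOIN users u1 ON u1.userid = a1.userid
JOIN users u2 ON u2.userid = a2.userid
WHERE a1.userid = :me
  -- added rule: a must-match question that the other user skipped
  AND NOT EXISTS (
        SELECT 1 FROM answers m
        WHERE m.mustMatch = 1
          AND m.userid IN (a1.userid, a2.userid)
          AND (SELECT COUNT(*) FROM answers o
               WHERE o.qid = m.qid
                 AND o.userid IN (a1.userid, a2.userid)) < 2)
GROUP BY a2.userid
HAVING SUM(a1.answer <> a2.answer
           AND (a1.mustMatch = 1 OR a2.mustMatch = 1)) = 0
   AND SUM(a1.answer = a2.answer) >= MAX(u1.minMatch)
   AND SUM(a1.answer = a2.answer) >= MAX(u2.minMatch)
ORDER BY ratio DESC, other_user
LIMIT 20
"""


def top_matches(conn, user_id):
    """Return [(other_user, common_answers, common_questions), ...]."""
    rows = conn.execute(TOP_MATCHES_SQL, {"me": user_id}).fetchall()
    return [row[:3] for row in rows]


conn = build_match_db()
print(top_matches(conn, 5))
```

On MySQL the same statement should need only small changes (for example a different parameter placeholder), but I have only run this form on SQLite.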
Here is my answer, I tried to break it up into many steps for clarification.
Please note that we choose userid 5 here as the base user to compare all the others against. This can easily be changed in the user1 and user2 CTEs:
WITH
answers (userid, qid, answer, must_match) AS
(
VALUES ROW(3,1,'y',false), ROW(3,2,'n',false), ROW(3,3,'s',false), ROW(3,4,'y',false),
ROW(5,1,'y',false), ROW(5,2,'y',false), ROW(5,3,'n',false), ROW(5,4,'y',true),
ROW(9,1,'y',false), ROW(9,2,'n',false), ROW(9,3,'y',false), ROW(9,4,'n',false)
),
user1 AS -- the desired user
(
SELECT *
FROM answers
WHERE userid = 5
AND answer != 's'
),
user2 AS -- all the rest of users
(
SELECT *
FROM answers
WHERE userid != 5
AND answer != 's'
),
join_t AS
(
SELECT user1.qid qid, user2.userid u2_id,
user1.answer = user2.answer answers_match,
( user1.must_match OR user2.must_match ) AND ( user1.answer != user2.answer ) breaks_match
FROM user1
JOIN user2
ON user1.qid = user2.qid
),
count_t AS
(
SELECT u2_id,
count(*) common_questions,
count(case when answers_match then 1 end) same_answers,
max(breaks_match) breaks_match
FROM join_t
GROUP BY u2_id
),
ratio_t AS
(
SELECT u2_id userid,
same_answers,
common_questions,
concat(same_answers, '/', common_questions, ' = ') r,
round( (same_answers / common_questions) * 100, 2) as ratio,
breaks_match
FROM count_t
)
SELECT *
FROM ratio_t
WHERE not breaks_match
ORDER BY ratio DESC
LIMIT 20
;
This answers the second, more general question.
To answer the first question, replace the final SELECT with this:
SELECT *
FROM ratio_t
WHERE userid = 9
;
Link to try: https://paiza.io/projects/e/5vyWa_aUb-zK8MQ4N7x3Kg
A direction to consider... Have a bit string. Have 3 bits per question, one for each of Yes/No/Skip. Then count the number of identical responses via:
BIT_COUNT(user1.bits & user2.bits)
Before MySQL 8.0, BIGINT UNSIGNED holds 64 bits, hence handling up to 21 questions in one table column. For more questions, break into multiple bigints. With BIGINT, 1 << n gives a mask for bit #n (n=0..63). That can be or'd (via operator |) to set a bit. Other operations are almost as easy.
For 8.0, BLOB can handle bit operations, thereby allowing an essentially unlimited number of questions in a single column.
An optimization: 2 bits per question: one for Yes, one for No. Skip is indicated by both bits being 0. The BIT_COUNT(...) works as described above.
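Here is a rough sketch of the 2-bits-per-question layout, written in plain Python so the bit arithmetic is easy to follow. The bit positions (even bit for Yes, odd bit for No) and the function names are choices made for this illustration. The common_questions helper goes slightly beyond what is described above: it folds each question's two bits together to find the questions that both users answered.

```python
# Every even bit position of a 64-bit value: the "Yes" bit of each question.
EVEN_BITS = 0x5555555555555555


def encode(answers):
    """Pack {qid: True/False} into an int; skipped questions are left out.

    Question q (1-based) uses bit 2*(q-1) for Yes and bit 2*(q-1)+1 for No,
    so a 64-bit column would hold up to 32 questions.
    """
    bits = 0
    for qid, said_yes in answers.items():
        base = 2 * (qid - 1)
        bits |= 1 << (base if said_yes else base + 1)
    return bits


def popcount(value):
    """Number of set bits, the same idea as MySQL's BIT_COUNT()."""
    return bin(value).count("1")


def same_answers(a, b):
    """A bit survives the AND only when both users gave that same answer."""
    return popcount(a & b)


def answered(bits):
    """One bit per answered question, folded onto the even positions."""
    return (bits | (bits >> 1)) & EVEN_BITS


def common_questions(a, b):
    return popcount(answered(a) & answered(b))


user3 = encode({1: True, 2: False, 4: True})            # Q3 skipped
user5 = encode({1: True, 2: True, 3: False, 4: True})
print(same_answers(user3, user5), common_questions(user3, user5))
```

In MySQL the equivalent of same_answers would be BIT_COUNT(user1.bits & user2.bits) as shown above.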

Different entries within 5 seconds

Sorry, I don't know how to describe the topic.
I have a database where I store the Unix time of each entry plus some other data, in this case the column "name" for the user and the column "type", which can be 1 or 2.
I want to check whether there are entries where the name is the same and the type switches from 1 to 2 and back to 1 (or 2, 1, 2) within 5 seconds.
So it shows me something like this:
Unixtime Name type
1550293559 Peter 2
1550293560 Peter 1
1550293561 Peter 2
Is there a query that can help me do this?
Sorry I really hope you guys understand that, I don't know how to explain the problem properly.
Thanks.
You can do that with a 3x self join on that table and the necessary conditions (All 3 rows have the same name etc.). See http://www.mysqltutorial.org/mysql-self-join/ for more info.
Note that because the join produces all possible permutations as input, you don't have to 'permute' the conditions in the WHERE part of the query. For example, to get the 5 second rule, you can just say
... where e1.unixtime > e2.unixtime and e2.unixtime > e3.unixtime and e3.unixtime+6 > e1.unixtime ...
Edit: since the original answer was downvoted, here is the full query (grumble grumble) assuming the table name 'sotest':
SELECT
*
FROM
sotest e1
JOIN
sotest e2
JOIN
sotest e3
WHERE
(e1.name = e2.name AND e2.name = e3.name
AND e1.unixtime > e2.unixtime
AND e2.unixtime > e3.unixtime
AND e3.unixtime + 6 > e1.unixtime)
AND ((e1.type = 1 AND e2.type = 2
AND e3.type = 1)
OR (e1.type = 2 AND e2.type = 1
AND e3.type = 2))
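If you want to try the triple self-join on sample data, here is a small sketch using SQLite via Python's sqlite3 module (an assumption for the sake of a self-contained example; the SQL itself is kept close to the query above). The second user, Anna, and the helper names are invented test data: her entries also flip 1, 2, 1 but are spread over 10 seconds, so they should not be reported.

```python
import sqlite3


def build_sotest_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sotest (unixtime INT, name TEXT, type INT)")
    conn.executemany(
        "INSERT INTO sotest VALUES (?, ?, ?)",
        [
            (1550293559, "Peter", 2),
            (1550293560, "Peter", 1),
            (1550293561, "Peter", 2),
            (1550293600, "Anna", 1),   # made-up rows: same pattern,
            (1550293605, "Anna", 2),   # but 10 seconds from first to last
            (1550293610, "Anna", 1),
        ],
    )
    return conn


def find_flips(conn, window_seconds=5):
    """Return (name, first, middle, last) for every 1-2-1 or 2-1-2 run
    whose first and last entries are at most window_seconds apart."""
    return conn.execute(
        """
        SELECT e1.name, e3.unixtime, e2.unixtime, e1.unixtime
        FROM sotest e1
        JOIN sotest e2 ON e2.name = e1.name AND e2.unixtime < e1.unixtime
        JOIN sotest e3 ON e3.name = e2.name AND e3.unixtime < e2.unixtime
        WHERE e1.unixtime - e3.unixtime <= ?
          AND ((e1.type = 1 AND e2.type = 2 AND e3.type = 1)
            OR (e1.type = 2 AND e2.type = 1 AND e3.type = 2))
        ORDER BY e1.name, e3.unixtime
        """,
        (window_seconds,),
    ).fetchall()


print(find_flips(build_sotest_db()))
```

Note that the join does not require the three rows to be consecutive entries for that name; any three rows in the right time order and with the right types will match.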

combining two select queries from the same table

I need to do something like this:
id tag status user trial Value (other columns)...
1 A Pass peter first 0
2 A Pass peter second 1
3 A Fail peter third 3
4 B Pass peter first 4
5 B Pass peter second 5
6 B Pass peter third 6
Select the rows where tag equals A and status equals Pass, and find the value for the other tag (e.g. B) for the same user and trial:
id tag status user trial Value_tag_A Value_tag_B (other columns)...
1 A Pass peter first 0 4
2 A Pass peter second 1 5
I can do some processing using php to get this result, but i'm wondering if i can do it directly using sql
I've tried numerous variations and can't seem to get close to the result.
Solution: http://sqlfiddle.com/#!9/e9068d/17
I don't know why the rows where tag=A also have a Value_tag_B. I will ignore this; maybe the following query is an approach.
SELECT DISTINCT y.status, y.`user`, y.trial,
(SELECT Value FROM toto WHERE y.`user` = `user` and y.trial = trial and tag = 'A' ) AS Value_tag_A,
(SELECT Value FROM toto WHERE y.`user` = `user` and y.trial = trial and tag = 'B' ) AS Value_tag_B
FROM toto y
WHERE y.trial NOT IN (SELECT DISTINCT trial FROM toto WHERE `status` <> 'Pass')
The code has been modified.
SQL Fiddle
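Another way to get the layout asked for in the question is a plain self-join: take the passing tag A rows and join each one to the tag B row with the same user and trial. The sketch below is only an illustration of that alternative, run on SQLite through Python's sqlite3 module; it reuses the table name toto from the query above, and the helper names are made up.

```python
import sqlite3


def build_toto_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE toto (id INT, tag TEXT, status TEXT, "user" TEXT, '
        "trial TEXT, Value INT)"
    )
    conn.executemany(
        "INSERT INTO toto VALUES (?, ?, ?, ?, ?, ?)",
        [
            (1, "A", "Pass", "peter", "first", 0),
            (2, "A", "Pass", "peter", "second", 1),
            (3, "A", "Fail", "peter", "third", 3),
            (4, "B", "Pass", "peter", "first", 4),
            (5, "B", "Pass", "peter", "second", 5),
            (6, "B", "Pass", "peter", "third", 6),
        ],
    )
    return conn


def values_side_by_side(conn):
    """Passing tag A rows with the tag B value of the same user and trial."""
    return conn.execute(
        """
        SELECT a.id, a."user", a.trial,
               a.Value AS Value_tag_A,
               b.Value AS Value_tag_B
        FROM toto a
        JOIN toto b ON b."user" = a."user"
                   AND b.trial = a.trial
                   AND b.tag = 'B'
        WHERE a.tag = 'A' AND a.status = 'Pass'
        ORDER BY a.id
        """
    ).fetchall()


for row in values_side_by_side(build_toto_db()):
    print(row)
```

If a tag A row might have no tag B partner and should still be listed, a LEFT JOIN would keep it with a NULL in Value_tag_B.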

Using IN clause in sql server

My query is like below. I want to select values if Type = 1 and SubType is 1, 3 or 2.
select sum(case when Type = 1 and SubType in (1, 3 or 2) then 1 else 0 end) as 'WorkStations'
Is this right way?
As far as I can see, you're only trying to count the workstations that meet the criteria, so:
SELECT COUNT(*) AS Workstations FROM MyWorkStationTable WHERE Type = 1 AND SubType IN (1, 2, 3)
Also, an IN clause is by nature already an OR. Writing OR inside the list is neither valid syntax nor necessary.
If you're simply counting records, your best bet is to use the COUNT function provided by SQL Server. Consider using the following:
SELECT COUNT(*) FROM [Table] WHERE TYPE = 1
AND (SUBTYPE = 1
OR SUBTYPE = 2
OR SUBTYPE = 3)
It is best to avoid using 'IN' as it can lead to unnecessary calls to the SQL engine.
SELECT COUNT(*) [Workstations] FROM [YourTable] t WHERE t.Type = 1 AND t.SubType IN (1, 2, 3)
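For anyone who wants to compare the variants side by side, here is a small sketch that runs the conditional SUM from the question (with the list written correctly as IN (1, 2, 3)), the COUNT(*) with a WHERE clause, and the chain of OR comparisons on the same rows. It uses SQLite via Python's sqlite3 module instead of SQL Server, and the sample rows and helper names are made up for the illustration.

```python
import sqlite3


def build_workstation_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE MyWorkStationTable (Type INT, SubType INT)")
    conn.executemany(
        "INSERT INTO MyWorkStationTable VALUES (?, ?)",
        # made-up rows: three of them have Type = 1 and SubType in 1..3
        [(1, 1), (1, 2), (1, 3), (1, 4), (2, 1), (2, 3)],
    )
    return conn


def count_workstations(conn):
    """Return the count computed three ways: (case_sum, in_count, or_count)."""
    case_sum = conn.execute(
        "SELECT SUM(CASE WHEN Type = 1 AND SubType IN (1, 2, 3) "
        "THEN 1 ELSE 0 END) FROM MyWorkStationTable"
    ).fetchone()[0]
    in_count = conn.execute(
        "SELECT COUNT(*) FROM MyWorkStationTable "
        "WHERE Type = 1 AND SubType IN (1, 2, 3)"
    ).fetchone()[0]
    or_count = conn.execute(
        "SELECT COUNT(*) FROM MyWorkStationTable "
        "WHERE Type = 1 AND (SubType = 1 OR SubType = 2 OR SubType = 3)"
    ).fetchone()[0]
    return case_sum, in_count, or_count


print(count_workstations(build_workstation_db()))
```

All three forms return the same count on this data. The conditional SUM is mainly useful when the count is one of several columns in a larger SELECT.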
Try to avoid IN predicates and use joins instead, because IN iterates unnecessarily even when there are only one or two matches. I will explain it with an example.
Suppose I have two list objects.
List 1 List 2
1 12
2 7
3 8
4 98
5 9
6 10
7 6
Using IN, it will search for each List-1 item in List-2 that means iteration will happen 49 times !!!

How to transform rows into columns having no column with unique values?

I am trying to transform some rows into columns in MySQL. I know it has been asked and answered previously, like here.
My problem is, there is nothing in my rows on which I can apply the 'if' construct. (At least I think so.)
E.g. For the following input,
2 5 1000
2 6 2000
I can run this query:
INSERT INTO SUMMARY
(user_id,valueA,valueB)
SELECT d.user_id,
MAX(CASE WHEN d.code = 5 THEN d.value ELSE NULL END),
MAX(CASE WHEN d.code = 6 THEN d.value ELSE NULL END)
FROM DETAILS d
GROUP BY d.user_id
and get this output:
2 1000 2000
But my problem is, my input is something like this:
2 6 1000
2 6 2000
(The values in the second column are not unique.)
And i still need the same output, i.e.:
2 1000 2000
Can it be done in MySQL? If yes, can anyone help me with this?
Well, if you don't have any idea of how many columns there will be in your pivot table, nor have a value to decide which value should go in a given column, the best solution I can recommend is to use a GROUP_CONCAT function and then do some parsing in your code:
SELECT d.user_id, GROUP_CONCAT(
d.value SEPARATOR ','
) AS val
FROM details d
GROUP BY d.user_id;
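As a sketch of the "parse it in your code" step, here is how the concatenated string could be split back into values. The example uses SQLite via Python's sqlite3 module, where the function is spelled group_concat(value, ',') instead of MySQL's GROUP_CONCAT(value SEPARATOR ','). The extra user 3 row and the helper names are invented for the illustration.

```python
import sqlite3


def build_details_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE details (user_id INT, code INT, value INT)")
    conn.executemany(
        "INSERT INTO details VALUES (?, ?, ?)",
        [(2, 6, 1000), (2, 6, 2000), (3, 6, 500)],  # user 3 is made up
    )
    return conn


def values_per_user(conn):
    """Return {user_id: [values...]} parsed from the concatenated column."""
    rows = conn.execute(
        "SELECT d.user_id, group_concat(d.value, ',') AS val "
        "FROM details d GROUP BY d.user_id"
    ).fetchall()
    # The order inside the concatenated string is not guaranteed,
    # so sort after splitting to get a stable column order.
    return {
        user_id: sorted(int(v) for v in joined.split(","))
        for user_id, joined in rows
    }


print(values_per_user(build_details_db()))
```

From the parsed list you can then fill valueA, valueB and so on by position when inserting into the summary table.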