Suggestions on Database Design

Suggestions on Database Design - mysql

I am building a sample online examination platform (I'm in the process of learning Ruby on Rails) with the following specifications:
There are 1000 different multiple choice questions.
Each question can have up to 5 possible answers, 1 of them is correct.
A user is presented with 10 random questions at a time (let's call this a test). If a user answers a question correctly 2 times, then this question will not be shown to him again.
A user passes the exam if he has answered every question correctly 2 times, in other words when there are no more questions left to show to him.
A first try :
student
-student_id
-name
question
-question_id
-text
option
-option_id
-text
-is_correct
-question_id
student_answer
-student_id
-question_id
-option_id
Although we could only store the correct questions, I've decided to include the 'option_id' in the student_answer table in case I need to display statistics (hardest question etc) in the future.
So up to this point, a question has many options, every option belongs to a single question and every time a student answers a question a student_answer row is created.
It seems to me that this approach would have some performance issues since for each test we'd have to select all the answers given by the user, group them by the question_id, for each question calculate the correct times it has been answered, get a set of question_id that shouldn't be displayed and finally select 10 random questions out of the 1000 initial ones minus those we just excluded.
Another thought I had was to have a JSON array in the form of {[0,0,1,...,1]} for every user. Each cell would be the number of correct answers for the question with an id matching the array index but I find that a bad idea.
Since I'm relatively a beginner when it comes to database design I'd like some feedback on my approach. Please feel free to suggest ways that are completely different than the above.
Thank you very much.

I think you may need to include the question_id in the option table.
One approach would be to move some of the processing into Ruby.
Select the 1000 questions.
Select the answers given by the user: SELECT count(*) counter, question_id, option_id FROM student_answer JOIN option USING (question_id,option_id) WHERE student_id=x AND option.is_correct=1 GROUP BY question_id,option_id HAVING counter>1
Randomize the 1000 questions in ruby, iterate though them, and exclude any that were found in your query of correct answers for this student. Stop after 10 questions.

If only one answer can be correct, then why store the correctness in the option table, the question record should contain the foreign key of the correct answer.
You describe some entities not addresed by your design. You maybe don't need to store 'a test' but this, and a primary key on student_answer makes for a model which makes it a bit easier to answer different questions about the data.

I think you have a good approach going, I would probably do the same. - symcbean does make a good point above, but my solution for this would be to store a boolean column within the student_answer table for whether the answer is correct.

Related

Surveys and Answers, relation on many-to-many

I have the following database scheme (I don't know if it's perfect but I think it's allright?)
It's a system where a User has many Surveys, the Users provide Answers for Questions in the Survey_Answers table.
Now a User can have multiple Surveys, it's the same questions but in a later time of the year they have to fill in the survey again.
I'm nearly there, I'm just wondering how to connect the answers to the survey. Should I make a relation between survey_answers and user_surveys.. thus adding an id to the user_surveys table
Or do you think it's ok to make a relation to the surveys table? I'm not sure which one is correct.
I outlined the 2 possibilities in the second screenshot.
Looking forward to your responses!
Thank you.

This probably depends on how your system is most likely/most frequently going to navigate the relationship.
If you are more likely to be looking at a Users answers and saying - hey let me see when this question was answered, as part of which dated survey, then you should join on user_surveys (I am assuming that the employee_id you are storing would match the user_id in user_surveys)
If you're more likely to be looking at a Users answers and saying - hey what survey did this question belong to, then you should join on Surveys.
You can still answer either question whichever join you use, it will just be a matter of more optimal performance (fewer table joins when trying to answer the most common query).
In reality there probably isn't much in it, so you could always toss a coin :)

Storing Sub-Questions in a database for a Survey/Questionnaire System

I am currently in the process of creating my tables for a survey/questionnaire system. As I got to creating the questions table, I think I came across a slight issue that could impact the whole application if I continue. Within my questions table, I have a column called "subBelongsToQuestion", which is an integer value for identifying which sub-questions belong to which parent questions (if any). Then in my answers table, I have a column called "responseRevealSubQuestion", which is an integer value for identifying which sub-questions to reveal if the trigger answer within the "responseRevealSubQuestion" column value matches with the "response" column value.
So for example, if a user answered yes to a question such as "Do you like cheese?", then a sub-question would appear saying "What do you like about cheese?".
I am wanting to convert this vision into a database format and I wasn't sure if I should continue with the approach I am using, or to change? It is so that if say a user deletes a question that contains sub-questions, then the application can run the required code to also delete the sub-questions and trigger answers as well.

Usually for survey apps you dont use SubQuestions, you define flow conditions
Imagine you have this questions on your db
Q_ID Question
1 Do you like cheese?
2 What do you like about cheese?
3 Do you like meat?"
4 What do you like about meat?
5 ...
Then you have a flow table to validate after one answer.
Q_FROM Q_VALUE Q_TO
1 NO 3
3 NO 5
In this case you only take detour for NO answer. otherwise you continue with the next question.
After your end each question you do
SELECT Q_to
FROM FlowTable
WHERE Q_from = #CurrentQuestion
AND Q_value = #CurrentAnswer

Creating a Poll application. Need advice on database structure for the vote count

I'm writing an application to allow users to create a Poll. They ask a question and set n number of predefined answers to the question. Other users can vote on the answers provided for that question.
Current structure
I have designed the database like this:
Storing the vote count
Current thinking is, I create a new column called vote_count on the link table and every time that answer gets voted, it updates the record.
This works. But is it right? I'm new to database systems, so I can't imagine I'm doing much right. What are some more efficient ways to achieve this?

As far as it goes yes that's OK. However these tables will be incomplete. When your second quiz is created, you'll have to extend the QUESTIONS table. If this second quiz's Q1 also has a yes/no answer, you're going to have to extend the LINK/VOTES table.
You also have to think about how it's going to be queried and design indexes to support those queries.
Cheers -

Database structuration suggestions for a poll web app

Im designing a poll application, where the user creates one or several polls whith questions and predefined answers for each question, so far no problem, im thinking the easiest way to do this is with 3 tables:
Polls Table:
id title description
Questions table:
id poll_id question
Answers table:
id question_id answer
The problem is, the user may select a different behavior on the questioning flow of the poll, for example a normal poll will go from question 1 to question N (being N the final question), but in my case the user may want if the user choose answer 2 of question 4 to jump to question 7 and ignore the rest between them.
Im a bit confuse about how to store in database this behavior, any suggestions?

Looks like you need something similar to this:
Look at the construction of keys here:
The QUESTION is in identifying relationship with POLL and the resulting natural key provides not just uniqueness but ordering as well: QUESTION_NO can monotonically increase while keeping the same POLL_ID.
The equivalent effect is accomplished by ANSWER_NO in POSSIBLE_ANSWER.
The user can pick at most one answer for any given question. That's why ANSWER_NO is outside of the ACTUAL_ANSWER primary key.
On the other hand, USER_ID is kept inside the ACTUAL_ANSWER PK, to allow the same answer to be picked by more than one user.
Theoretically, there should be a key in QUESTION_TABLE on {POLL_ID, QUESTION_TEXT}, to prevent two different questions having the same text in the same poll. However, QUESTION_TEXT might be long and is potentially implemented as BLOB, which most DBMSes can't index or constrain by a key. The similar dilemma exists for POSSIBLE_ANSWER.ANSWER_TEXT.
If user skips a question, just omit the corresponding ACTUAL_ANSWER.

Answer > NextQuestion table
AnswerID NextQuestionID
Based on your answer, the next question is defined in here

'questions' and 'answers' with multiple answers

This question is related to this post:
SQL design for survey with answers of different data types
I have a survey app where most questions have a set of answers that are 1-5. Now we have to do questions that could have a variety of different answer types -- numeric, date, string, etc. Thanks to suggestions from stack, I went with a string column to store answers. Some questions are multiple choice, so along with the table 'questions', I have a table 'answers' which has the set of possible answers for a question.
Now: how should I store answers for a question that is "pick all that apply"? Should I make a child table that is "chosen_answers" or something like that? Or should the answers table have a 'chosen' column that indicates that a respondent chose that answer?

a possible solution is a UsersAnswers table with 4 columns: primary key, user's id, question's id, and answer's id
with multiple entries for any questions where more than one answer can be selected

I have two suggestions.
Normalize your database, and create a table called question_answer, or something that fits more in line with the nomenclature of your schema. This is how I would lay it out.
CREATE TABLE question_answer (
id INT NOT NULL AUTO INCREMENT PRIMARY KEY,
user_id INT NOT NULL,
question_id INT NOT NULL,
answer_id INT NOT NULL
);
Create five columns in your answers table, each of which refers to a specific answer. In MySQL I would use set these columns up as bit(1) values.
IMHO, unless you see the number of choices changing, I would stick with option 2. It's a faster option and will most likely also save you space.

As you're not going to have many options selected I'd be tempted to store the answers as a comma-separated list of values in your string answer column.
If the user is selecting their answers from a group of checkboxes on the web page with the question (assuming it is a web app) then you'll get back a comma-separated list from there too. (Although you won't just be able to compare the lists as strings since the answer "red,blue" is the same as "blue,red".)

Another option, (And I've seen cases where this was how questions like this were scored as well), is to treat each possible answer as a separate Yes/No question, and record the testee's response (Chose it, or didn't) as a boolean...

these survey questions always have one, universal answer: it depends on what you want to do with the answers when you're done.
for example, if all you want to to is keep a record of each individual answer (and never do any totaling or find all users that answered question x with answer y), then the simplest design is to denormalize the answers in to a serialized field.
if you need totals, you can probably also get away with denormalized answers in to a serialized table if you calculate the totals in a summary table and update the values when a quiz is submitted.
so for your specific question, you need to decide if it's more useful to your final product to store 5 when you mean "all of the above" or if it's more useful to have each of the four options individually selected.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008