'questions' and 'answers' with multiple answers - mysql

This question is related to this post:
SQL design for survey with answers of different data types
I have a survey app where most questions have a set of answers that are 1-5. Now we have to do questions that could have a variety of different answer types -- numeric, date, string, etc. Thanks to suggestions from stack, I went with a string column to store answers. Some questions are multiple choice, so along with the table 'questions', I have a table 'answers' which has the set of possible answers for a question.
Now: how should I store answers for a question that is "pick all that apply"? Should I make a child table that is "chosen_answers" or something like that? Or should the answers table have a 'chosen' column that indicates that a respondent chose that answer?

A possible solution is a UsersAnswers table with four columns: primary key, user's id, question's id, and answer's id,
with multiple entries for any question where more than one answer can be selected.
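A minimal sketch of that layout, using Python's built-in sqlite3 module as a stand-in for MySQL. The table and column names, the sample data, and the UNIQUE constraint are illustrative assumptions, not part of the original suggestion.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; names below are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users_answers (
        id          INTEGER PRIMARY KEY,
        user_id     INTEGER NOT NULL,
        question_id INTEGER NOT NULL,
        answer_id   INTEGER NOT NULL,
        -- assumption: a user cannot pick the same answer twice
        UNIQUE (user_id, question_id, answer_id)
    )
""")

# User 7 ticks answers 2 and 4 on "pick all that apply" question 10;
# user 8 ticks only answer 2. One row per chosen answer.
with conn:
    conn.executemany(
        "INSERT INTO users_answers (user_id, question_id, answer_id) "
        "VALUES (?, ?, ?)",
        [(7, 10, 2), (7, 10, 4), (8, 10, 2)],
    )

# How often was each answer chosen for question 10?
totals = conn.execute(
    "SELECT answer_id, COUNT(*) FROM users_answers "
    "WHERE question_id = ? GROUP BY answer_id ORDER BY answer_id",
    (10,),
).fetchall()
print(totals)
```

Because every chosen answer is its own row, totals per answer come from a plain GROUP BY.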

I have two suggestions.
1. Normalize your database, and create a table called question_answer, or something that fits more in line with the nomenclature of your schema. This is how I would lay it out.
CREATE TABLE question_answer (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
user_id INT NOT NULL,
question_id INT NOT NULL,
answer_id INT NOT NULL
);
2. Create five columns in your answers table, each of which refers to a specific answer. In MySQL I would set these columns up as bit(1) values.
IMHO, unless you see the number of choices changing, I would stick with option 2. It's a faster option and will most likely also save you space.

As you're not going to have many options selected I'd be tempted to store the answers as a comma-separated list of values in your string answer column.
If the user is selecting their answers from a group of checkboxes on the web page with the question (assuming it is a web app) then you'll get back a comma-separated list from there too. (Although you won't just be able to compare the lists as strings since the answer "red,blue" is the same as "blue,red".)
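The ordering caveat can be handled by comparing the lists as sets, or by storing them in a canonical sorted form. A small sketch; the helper names are hypothetical, and it assumes the answer values themselves never contain commas.

```python
def parse_choices(raw):
    """Split a comma-separated answer string into an order-independent set."""
    return frozenset(part.strip() for part in raw.split(",") if part.strip())


def same_choices(a, b):
    """True when both strings name the same selections, in any order."""
    return parse_choices(a) == parse_choices(b)


def canonical(raw):
    """Sorted form, so equal selections are stored as identical strings."""
    return ",".join(sorted(parse_choices(raw)))


print(same_choices("red,blue", "blue,red"))
print(canonical("red, blue"))
```

Storing the canonical form would let plain string equality work again in SQL, at the cost of normalizing on every write.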

Another option, (And I've seen cases where this was how questions like this were scored as well), is to treat each possible answer as a separate Yes/No question, and record the testee's response (Chose it, or didn't) as a boolean...

These survey questions always have one universal answer: it depends on what you want to do with the answers when you're done.
For example, if all you want to do is keep a record of each individual answer (and never do any totaling or find all users that answered question x with answer y), then the simplest design is to denormalize the answers into a serialized field.
If you need totals, you can probably also get away with denormalized, serialized answers if you calculate the totals in a summary table and update the values when a quiz is submitted.
So for your specific question, you need to decide if it's more useful to your final product to store 5 when you mean "all of the above" or if it's more useful to have each of the four options individually selected.

Related

Suggestions on Database Design

I am building a sample online examination platform (I'm in the process of learning Ruby on Rails) with the following specifications:
There are 1000 different multiple choice questions.
Each question can have up to 5 possible answers, 1 of them is correct.
A user is presented with 10 random questions at a time (let's call this a test). If a user answers a question correctly 2 times, then this question will not be shown to him again.
A user passes the exam if he has answered every question correctly 2 times, in other words when there are no more questions left to show to him.
A first try :
student
-student_id
-name
question
-question_id
-text
option
-option_id
-text
-is_correct
-question_id
student_answer
-student_id
-question_id
-option_id
Although we could only store the correct questions, I've decided to include the 'option_id' in the student_answer table in case I need to display statistics (hardest question etc) in the future.
So up to this point, a question has many options, every option belongs to a single question and every time a student answers a question a student_answer row is created.
It seems to me that this approach would have some performance issues since for each test we'd have to select all the answers given by the user, group them by the question_id, for each question calculate the correct times it has been answered, get a set of question_id that shouldn't be displayed and finally select 10 random questions out of the 1000 initial ones minus those we just excluded.
Another thought I had was to have a JSON array in the form of {[0,0,1,...,1]} for every user. Each cell would be the number of correct answers for the question with an id matching the array index but I find that a bad idea.
Since I'm relatively a beginner when it comes to database design I'd like some feedback on my approach. Please feel free to suggest ways that are completely different than the above.
Thank you very much.
I think you may need to include the question_id in the option table.
One approach would be to move some of the processing into Ruby.
Select the 1000 questions.
Select the questions the user has already answered correctly at least twice (option is a reserved word in MySQL, so the table name is quoted with backticks): SELECT count(*) counter, question_id, option_id FROM student_answer JOIN `option` USING (question_id,option_id) WHERE student_id=x AND `option`.is_correct=1 GROUP BY question_id,option_id HAVING counter>1
Randomize the 1000 questions in Ruby, iterate through them, and exclude any that were found in your query of correct answers for this student. Stop after 10 questions.
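The steps above can be sketched as follows. This is written in Python with sqlite3 rather than Ruby and MySQL, purely for illustration; the function names, the 20-question sample data, and the student id are assumptions.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE question (question_id INTEGER PRIMARY KEY, text TEXT);
    CREATE TABLE "option" (
        option_id   INTEGER PRIMARY KEY,
        question_id INTEGER NOT NULL,
        text        TEXT,
        is_correct  INTEGER NOT NULL
    );
    CREATE TABLE student_answer (
        student_id  INTEGER NOT NULL,
        question_id INTEGER NOT NULL,
        option_id   INTEGER NOT NULL
    );
""")

# Sample data: 20 questions, each with one right and one wrong option.
with conn:
    for qid in range(1, 21):
        conn.execute("INSERT INTO question VALUES (?, ?)", (qid, f"Question {qid}"))
        conn.execute('INSERT INTO "option" (question_id, text, is_correct) '
                     "VALUES (?, 'right', 1)", (qid,))
        conn.execute('INSERT INTO "option" (question_id, text, is_correct) '
                     "VALUES (?, 'wrong', 0)", (qid,))
    # Student 1: question 1 right twice, question 2 right once,
    # question 3 wrong twice.
    conn.executemany("INSERT INTO student_answer VALUES (1, ?, ?)",
                     [(1, 1), (1, 1), (2, 3), (3, 6), (3, 6)])


def mastered_questions(conn, student_id):
    """Question ids the student has answered correctly at least twice."""
    rows = conn.execute(
        '''
        SELECT sa.question_id
        FROM student_answer AS sa
        JOIN "option" AS o
          ON o.question_id = sa.question_id AND o.option_id = sa.option_id
        WHERE sa.student_id = ? AND o.is_correct = 1
        GROUP BY sa.question_id
        HAVING COUNT(*) >= 2
        ''',
        (student_id,),
    ).fetchall()
    return {qid for (qid,) in rows}


def next_test(conn, student_id, size=10):
    """Pick up to `size` random questions the student has not mastered yet."""
    done = mastered_questions(conn, student_id)
    all_ids = [qid for (qid,) in conn.execute("SELECT question_id FROM question")]
    remaining = [qid for qid in all_ids if qid not in done]
    return random.sample(remaining, min(size, len(remaining)))


test = next_test(conn, 1)
print(sorted(test))
```

When `next_test` returns an empty list, no questions are left, which matches the pass condition in the question.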
If only one answer can be correct, then why store the correctness in the option table? The question record should contain the foreign key of the correct answer.
You describe some entities not addressed by your design. You may not need to store 'a test', but storing it, along with a primary key on student_answer, gives a model that makes it a bit easier to answer different questions about the data.
I think you have a good approach going, I would probably do the same. - symcbean does make a good point above, but my solution for this would be to store a boolean column within the student_answer table for whether the answer is correct.

Updating the dependent table based on the primary table

I am trying to build a table for a multiple-choice question system where each question has an unbounded number of choices to select from (not a fixed number of choices). The number of choices varies from question to question. I am trying to build a database which stores the question as well as the choices.
Table Question
{ // Though just two fields are shown, there are many fields in the table actually
questionId;
question;
}
Table Choices{
choiceId;
questionId;
choice;
}
One could argue that we can enter each choice directly into the Question table by adding a field, but this duplicates the other fields' data. For example, if we have 10 choices for a single question, we would have 10 rows in the Question table with a lot of duplication. So I have separated the tables into Question and Choices.
The main problem is this: we do not know what the question ID is until the question is actually created, so we cannot use the question ID from the Question table when entering data into the Choices table. Any suggestions on how to do this?
Your structure would be able to handle the requirement you are looking for. In the Choices table you can use a primary key combining questionId and choiceId, so that the choiceIds can start from 1 for each question rather than you having to find out which ID the choices start at for each question.
As for your problem on not knowing what questionID is generated, assuming your questionID is an auto_increment column, you can use your last_insert_id function in whatever programming language you are using to find out what questionID was generated by the last insert. As you will be having multiple entries for the choices, it would be hard for you to do this in a single SQL insert command.
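A minimal sketch of that two-step insert, with Python's sqlite3 standing in for MySQL. Here the cursor's `lastrowid` attribute plays the role of the last-insert-id function; the `add_question` helper and the sample question are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Question (
        questionId INTEGER PRIMARY KEY AUTOINCREMENT,
        question   TEXT NOT NULL
    );
    CREATE TABLE Choices (
        questionId INTEGER NOT NULL REFERENCES Question(questionId),
        choiceId   INTEGER NOT NULL,
        choice     TEXT NOT NULL,
        PRIMARY KEY (questionId, choiceId)  -- choiceId restarts at 1 per question
    );
""")


def add_question(conn, text, choices):
    """Insert a question, then its choices, using the generated question id."""
    with conn:  # one transaction: commit on success, roll back on error
        cur = conn.execute("INSERT INTO Question (question) VALUES (?)", (text,))
        question_id = cur.lastrowid  # id generated by the insert above
        conn.executemany(
            "INSERT INTO Choices (questionId, choiceId, choice) VALUES (?, ?, ?)",
            [(question_id, n, c) for n, c in enumerate(choices, start=1)],
        )
    return question_id


qid = add_question(conn, "Favourite colour?", ["Red", "Green", "Blue"])
rows = conn.execute(
    "SELECT choiceId, choice FROM Choices WHERE questionId = ? ORDER BY choiceId",
    (qid,),
).fetchall()
print(qid, rows)
```

Wrapping both inserts in one transaction means a failed choice insert does not leave behind a question with no choices.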
If you are using Entity Framework...
You should save Question (Even field "question" is empty) and get ID...
If User Cancels everything just remove that question by ID...

Database structuration suggestions for a poll web app

I'm designing a poll application where the user creates one or several polls with questions and predefined answers for each question. So far no problem; I'm thinking the easiest way to do this is with 3 tables:
Polls Table:
id title description
Questions table:
id poll_id question
Answers table:
id question_id answer
The problem is that the poll's creator may select a different behavior for the questioning flow of the poll. For example, a normal poll will go from question 1 to question N (N being the final question), but in my case the creator may want a respondent who chooses answer 2 of question 4 to jump to question 7 and skip the questions in between.
I'm a bit confused about how to store this behavior in the database. Any suggestions?
Looks like you need something similar to this:
Look at the construction of keys here:
The QUESTION is in identifying relationship with POLL and the resulting natural key provides not just uniqueness but ordering as well: QUESTION_NO can monotonically increase while keeping the same POLL_ID.
The equivalent effect is accomplished by ANSWER_NO in POSSIBLE_ANSWER.
The user can pick at most one answer for any given question. That's why ANSWER_NO is outside of the ACTUAL_ANSWER primary key.
On the other hand, USER_ID is kept inside the ACTUAL_ANSWER PK, to allow the same answer to be picked by more than one user.
Theoretically, there should be a key in QUESTION_TABLE on {POLL_ID, QUESTION_TEXT}, to prevent two different questions having the same text in the same poll. However, QUESTION_TEXT might be long and is potentially implemented as BLOB, which most DBMSes can't index or constrain by a key. The similar dilemma exists for POSSIBLE_ANSWER.ANSWER_TEXT.
If user skips a question, just omit the corresponding ACTUAL_ANSWER.
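The key structure described above can be reconstructed roughly as follows. This is a sketch based only on the prose, using sqlite3 in place of MySQL; the column types, the sample data, and the `record_answer` helper are assumptions, and there is no USER table in this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE POLL (POLL_ID INTEGER PRIMARY KEY, TITLE TEXT);
    CREATE TABLE QUESTION (
        POLL_ID       INTEGER NOT NULL REFERENCES POLL (POLL_ID),
        QUESTION_NO   INTEGER NOT NULL,
        QUESTION_TEXT TEXT NOT NULL,
        PRIMARY KEY (POLL_ID, QUESTION_NO)
    );
    CREATE TABLE POSSIBLE_ANSWER (
        POLL_ID     INTEGER NOT NULL,
        QUESTION_NO INTEGER NOT NULL,
        ANSWER_NO   INTEGER NOT NULL,
        ANSWER_TEXT TEXT NOT NULL,
        PRIMARY KEY (POLL_ID, QUESTION_NO, ANSWER_NO),
        FOREIGN KEY (POLL_ID, QUESTION_NO)
            REFERENCES QUESTION (POLL_ID, QUESTION_NO)
    );
    CREATE TABLE ACTUAL_ANSWER (
        POLL_ID     INTEGER NOT NULL,
        QUESTION_NO INTEGER NOT NULL,
        USER_ID     INTEGER NOT NULL,
        ANSWER_NO   INTEGER NOT NULL,
        -- ANSWER_NO is outside the key: one answer per user per question
        PRIMARY KEY (POLL_ID, QUESTION_NO, USER_ID),
        FOREIGN KEY (POLL_ID, QUESTION_NO, ANSWER_NO)
            REFERENCES POSSIBLE_ANSWER (POLL_ID, QUESTION_NO, ANSWER_NO)
    );
""")

with conn:
    conn.execute("INSERT INTO POLL VALUES (1, 'Sample poll')")
    conn.execute("INSERT INTO QUESTION VALUES (1, 1, 'Tea or coffee?')")
    conn.executemany("INSERT INTO POSSIBLE_ANSWER VALUES (1, 1, ?, ?)",
                     [(1, "Tea"), (2, "Coffee")])


def record_answer(conn, poll_id, question_no, user_id, answer_no):
    """Store a user's pick; return False if a key constraint rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO ACTUAL_ANSWER "
                "(POLL_ID, QUESTION_NO, USER_ID, ANSWER_NO) VALUES (?, ?, ?, ?)",
                (poll_id, question_no, user_id, answer_no),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = record_answer(conn, 1, 1, 100, 1)    # user 100 picks answer 1
other = record_answer(conn, 1, 1, 200, 1)    # another user, same answer
again = record_answer(conn, 1, 1, 100, 2)    # user 100 tries a second pick
bogus = record_answer(conn, 1, 1, 300, 9)    # answer 9 does not exist
print(first, other, again, bogus)
```

A skipped question needs no special handling: it simply has no row in ACTUAL_ANSWER.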
Answer > NextQuestion table
AnswerID NextQuestionID
Based on your answer, the next question is defined in here
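A rough sketch of how such a jump table could be consulted, using sqlite3 in place of MySQL. The table name `answer_next_question`, the `next_question` helper, the sample ids, and the fallback of "next question by id in the same poll" are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE questions (
        id       INTEGER PRIMARY KEY,
        poll_id  INTEGER NOT NULL,
        question TEXT NOT NULL
    );
    CREATE TABLE answers (
        id          INTEGER PRIMARY KEY,
        question_id INTEGER NOT NULL REFERENCES questions (id),
        answer      TEXT NOT NULL
    );
    -- Only answers that change the flow get a row here.
    CREATE TABLE answer_next_question (
        answer_id        INTEGER PRIMARY KEY REFERENCES answers (id),
        next_question_id INTEGER NOT NULL REFERENCES questions (id)
    );
""")

with conn:
    conn.executemany("INSERT INTO questions VALUES (?, 1, ?)",
                     [(n, f"Question {n}") for n in range(1, 8)])
    conn.executemany("INSERT INTO answers VALUES (?, 4, ?)",
                     [(41, "Answer 1"), (42, "Answer 2")])
    # Answer 2 of question 4 jumps straight to question 7.
    conn.execute("INSERT INTO answer_next_question VALUES (42, 7)")


def next_question(conn, question_id, answer_id):
    """Return the id of the question to show next, or None at the end."""
    row = conn.execute(
        "SELECT next_question_id FROM answer_next_question WHERE answer_id = ?",
        (answer_id,),
    ).fetchone()
    if row is not None:
        return row[0]
    # Assumed default flow: the next question by id within the same poll.
    row = conn.execute(
        "SELECT MIN(id) FROM questions "
        "WHERE poll_id = (SELECT poll_id FROM questions WHERE id = ?) AND id > ?",
        (question_id, question_id),
    ).fetchone()
    return row[0]


print(next_question(conn, 4, 42))
print(next_question(conn, 4, 41))
```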

Simple survey database design

I'm creating a survey for visitors of my event.
However it's been a while since I created a database. So I need some help.
I found some solutions, but they are way too extensive, and that is something I don't need.
Visitors need to stay anonymous, but they can leave their email behind (a separate Emails table that isn't linked to anything at the moment).
They have about 20 questions, some are open, some are one option(radio) and some are multiple options (checkboxes).
The questions need to be reusable.
That's about it.
I just don't know how to go beyond the many-to-many in the diagram you see below.
How do I go from here? An Answers table needs to have a relationship with? The Surveys_have_Questions, or with Questions?
Edit:
As the answer in the following links mentions, most surveys are based upon classic design patterns. Mine is one of those surveys. More info in the link below:
What mysql database tables and relationships would support a Q&A survey with conditional questions?
I would probably model the event of a user taking a survey, perhaps a table called "User_Answer_Session", which has links to the survey and the user; and then "User_Answers", which are tied to the session and the question and include the actual blob of the answer. How exactly I modeled the answers would depend on a few things (mainly how robustly I wanted to be able to look them up). For instance, do I want to be able to index multiple-choice answers for extremely rapid reporting? If so, then you need to model for that. This may include creating a "Question_Options" table, which is a one-to-many between a question and the available options...
This should get you thinking along a good path. :-)
Well, I don't see a reason why you need all these tables! I think it can be much simpler than that.
surverys
desc VarChar
startDate timestamp
endDate timestamp
isOpen boolean
survery_questions
survery_id int (FK)
question Text
vote_count unsigned INT
user_survery
user_id
survery_id
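unique key (user_id, survery_id) #to ensure no duplicate votes
That's all :).
Whenever a user votes, just run:
insert into user_survery (user_id,survery_id) VALUES (1,1);
-- restrict the update to the row that was voted for; without a WHERE clause every row is incremented
update survery_questions set vote_count = vote_count+1 where survery_id = 1 and question = 'the question voted for';
When you need to get a survey result:
select * from survery_questions where survery_id = X;
etc.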

A Beginner Question on database design

This is a follow-up question to my previous one. We junior-year students are doing website development for the university as volunteer work. We are using PHP and MySQL.
Now I am mainly responsible for the database development using MySQL, but I am new to MySQL. I am asking for some hints on writing my first table to get my hands on it, so that I can then work well with the other tables.
The question is this: the first thing our website is going to do is present a survey to users to collect their preferences on when they want to use the bus service,
and this is where I am going to start my database development.
The User Requirement Document specifies that for the survey, there should be:
Customer side:
The survey will be available to customers, with a set of predefined questions and answers, and should be easy to fill out.
Business side:
Survey info will be stored, output, and displayable for analysis.
It doesn't sound like too much work, and I don't need to care about any PHP, but I am just confused: should I create a single table called "Survey", or two tables, "Survey_business" and "Survey_Customer", and how can the database store the info?
I would be grateful if you could give me some help so I can move along, because the first step is always the hardest and most important.
Thanks.
I would use multiple tables. One for the surveys themselves, and another for the questions. Maybe one more for the answer options, if you want to go with multiple-choice questions. Another table for the answers with a record per question per answerer. The complexity escalates as you consider multiple types of answers (choice, fill-in-the-blank single-line, free-form multiline, etc.) and display options (radio button, dropdown list, textbox, yada yada), but for a simple multiple-choice example with a single rendering type, this would work, I think.
Something like:
-- Survey info such as title, publish dates, etc.
create table Surveys
(
survey_id int,
survey_title varchar(200)
);
-- one record per question, associated with the parent survey
create table Questions
(
question_id int,
survey_id int,
question varchar(200)
);
-- one record per multiple-choice option in a question
create table Choices
(
choice_id int,
question_id int,
choice varchar(200)
);
-- one record per question per answerer to keep track of who
-- answered each question
create table Answers
(
answer_id int,
answerer_id int,
choice_id int
);
Then use application code to:
Insert new surveys and questions.
Populate answers as people take the surveys.
Report on the results after the survey is in progress.
You, as the database developer, could work with the web app developer to design the queries that would both populate and retrieve the appropriate data for each task.
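As a rough illustration of the populate-and-report steps, here is a sketch using Python's sqlite3 as a stand-in for PHP and MySQL. The four tables follow the layout above; the primary keys, the sample survey data, and the `results` function are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Surveys   (survey_id INTEGER PRIMARY KEY, survey_title TEXT);
    CREATE TABLE Questions (question_id INTEGER PRIMARY KEY,
                            survey_id INTEGER, question TEXT);
    CREATE TABLE Choices   (choice_id INTEGER PRIMARY KEY,
                            question_id INTEGER, choice TEXT);
    CREATE TABLE Answers   (answer_id INTEGER PRIMARY KEY,
                            answerer_id INTEGER, choice_id INTEGER);
""")

with conn:
    # Insert a survey, one question, and its choices.
    conn.execute("INSERT INTO Surveys VALUES (1, 'Bus service survey')")
    conn.execute("INSERT INTO Questions VALUES (1, 1, 'When do you ride the bus?')")
    conn.executemany("INSERT INTO Choices VALUES (?, 1, ?)",
                     [(1, "Morning"), (2, "Afternoon"), (3, "Evening")])
    # Populate answers as people take the survey.
    conn.executemany("INSERT INTO Answers (answerer_id, choice_id) VALUES (?, ?)",
                     [(101, 1), (102, 1), (103, 3)])


def results(conn, survey_id):
    """Votes per choice for every question in a survey, including zeros."""
    return conn.execute(
        """
        SELECT q.question, c.choice, COUNT(a.answer_id) AS votes
        FROM Questions q
        JOIN Choices c ON c.question_id = q.question_id
        LEFT JOIN Answers a ON a.choice_id = c.choice_id
        WHERE q.survey_id = ?
        GROUP BY q.question_id, c.choice_id
        ORDER BY q.question_id, c.choice_id
        """,
        (survey_id,),
    ).fetchall()


report = results(conn, 1)
for question, choice, votes in report:
    print(question, choice, votes)
```

The LEFT JOIN keeps choices nobody picked in the report with a count of zero, which is usually what a results page needs.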
Only one table; you only change the way you use the table on each occasion:
the customer side inserts data into the table,
the business side reads the data and results from the same table.
Survey.Customer sounds like a storage function, while Survey.Business sounds like a retrieval function.
The only tables you need are for storage. The retrieval operations will take place using queries and reports of the existing storage tables, so you don't need additional tables for those.
Use a single table only. If you were to use two tables, then anytime you make a change you would in effect have to do everything twice. That's a big pain for maintenance for you and anyone else who comes in to do it in the future.
Most of the advice/answers so far are applicable but make certain (unstated!) assumptions about your domain.
Try to make a logical model of the entities and attributes that are required to capture the requirements, examine the relationships, consider how the data will be used on both sides of the process, and then design the tables. Talk to the users, talk to the people that will be running the reports, talk to whoever is designing the user interface (screens and reports) to get the complete picture.
Pay close attention to the reporting requirements, as they often imply additional attributes and entities not extant in the data-entry schema.
I think 2 tables are needed:
A survey table for storing questions and answer choices. Each survey will be stored in one row with a unique survey id.
The other table is for storing answers. I think it's better to store each customer's answers in one row with a survey id and a customer id if necessary.
Then you can compute results and store them in a surveyResults view.
Is the data you're presenting as the questions and answers going to be dynamic? Is this a long-term project that's going to have questions swapped in and out? If so, you'll probably want to have the questions and answers in your database as well.
The way I'd do it would be to define your entities and figure out how to design your tables so relationships are straightforward. Sounds to me like you have three entities:
Question
Answer
Completed Survey
Just a sample elaboration of what Steven and Chris have mentioned above.
There are going to be multiple tables if there are multiple surveys, each survey has a different set of questions, and the same user can take multiple surveys.
1. A Customer table with CustID as the primary key.
2. A Questions table with a QuestionID as the primary key. If a question cannot belong to more than one survey (an N:1 relationship), it can also hold the SurveyID (from the Survey table mentioned in point 3) as one of its values.
3. A Survey table. If the Survey-to-Question relationship is N:M, then (SurveyID, QuestionID) would become a composite key for the Survey table; otherwise it would just have the SurveyID with the high-level details of the survey, such as a description.
4. A UserSurvey table, which would contain (UserID, SurveyID, QuestionID, AnswerGiven).
[Note: if the same user can take the same survey again and again, either the old survey has to be updated or the repeat attempts have to be stored as additional rows with some serial number.]