Best Approach for Storing Huge Amount of Data - mysql

I am working on a quiz project where each user is given questions based on the category they choose. There will be a lot of questions in each data set, and each user gets them in a randomized order. The server needs to keep track of which questions the user has answered and which are left. The user can switch category at any time and come back to the previous category later. They can answer the questions they have not answered yet, but not the ones they have already answered (correctly or not). What is the best approach for this?
1) Should the questions be stored in tables, with one table per category? The problem with this approach is:
a) Keeping track of which questions the user has already answered. I can have a data structure for that, but every time the user asks for another question, or for a question from a different category, the query would have to ensure that it doesn't return a question they have already answered.
2) Or should the questions be hardcoded in data structures?

One table per category - NO. Instead, keep all questions in one table with a category_id column.
Tables: Categories, Questions, Users, Responses (user_id, question_id, response, etc)
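A minimal sketch of how those four tables could fit together, and how "give me a random question this user has not answered in this category" becomes a single query. The column names beyond user_id, question_id, response and category_id are my own guesses, and Python's built-in sqlite3 stands in for MySQL so that the example runs anywhere.

```python
import sqlite3

# Hypothetical column names; sqlite3 is used here only as a stand-in for MySQL.
SCHEMA = """
CREATE TABLE categories (category_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE questions (
    question_id INTEGER PRIMARY KEY,
    category_id INTEGER NOT NULL REFERENCES categories(category_id),
    body        TEXT NOT NULL
);
CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE responses (
    user_id     INTEGER NOT NULL REFERENCES users(user_id),
    question_id INTEGER NOT NULL REFERENCES questions(question_id),
    response    TEXT,
    PRIMARY KEY (user_id, question_id)  -- at most one answer per user per question
);
"""


def next_question(conn, user_id, category_id):
    """Return a random (question_id, body) the user has not answered, or None."""
    return conn.execute(
        """
        SELECT q.question_id, q.body
        FROM questions q
        LEFT JOIN responses r
               ON r.question_id = q.question_id AND r.user_id = ?
        WHERE q.category_id = ? AND r.question_id IS NULL
        ORDER BY RANDOM()  -- MySQL spells this RAND()
        LIMIT 1
        """,
        (user_id, category_id),
    ).fetchone()


def record_response(conn, user_id, question_id, response):
    """Store an answer; the primary key rejects a second answer to the same question."""
    conn.execute(
        "INSERT INTO responses (user_id, question_id, response) VALUES (?, ?, ?)",
        (user_id, question_id, response),
    )


def build_demo():
    """Create an in-memory database with two categories and four questions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO categories VALUES (?, ?)",
                     [(1, "history"), (2, "science")])
    conn.executemany("INSERT INTO questions VALUES (?, ?, ?)",
                     [(1, 1, "Q1"), (2, 1, "Q2"), (3, 1, "Q3"), (4, 2, "Q4")])
    conn.execute("INSERT INTO users VALUES (1, 'alice')")
    return conn
```

Switching category is just a different category_id argument, and nothing about the other category's progress is lost. Random ordering in SQL sorts every candidate row, so for very large categories you may want to pick the random row differently.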

Related

Store lists/checklist into MySQL database

I want to make a demo application that is able to ask a user if he/she has followed the correct methods to build an item.
I have created a 'checklist' for the user to fill in as he/she builds the item. For example some of the questions could be:
Have you received the correct parts?
Are the parts in good condition?
Are you building a chair?
Do you have the correct specifications for the chair?
...
...
...
And so on...
So these questions have yes/no answers only. My plan was to create a table and name each column after the question's number. So column 1 will be called '1' and holds the first question, column 2 will be called '2' and holds the second question, and so on.
So this table will be called Chair inspection. I then have another table called Table inspection with its own set of checklist questions.
This data is captured using an android application. The development of the application is done. Just need advice on the database part.
Is this the correct approach to storing the user's inputs?
I advise you to have three tables: one for the questions, one for the users who answer those questions, and one for the answers. Then establish the relationships between those three tables. Many users can answer many questions, so there is a many-to-many relationship between users and questions, and each answer row links one user to one question.
I think that way you will avoid redundancy and simplify updating and retrieving your data.
A normalised schema might be as follows (incomplete, and ignoring 'tables' for the time being) :
inspection (inspection_id*, item, inspected_by, date)
inspection_detail (inspection_id*, checklist_id*, status)
* = (component of) PRIMARY KEY
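Here is one way the two tables above might look in practice. This is only a sketch: the checklist table that holds the question text is my own assumption (the schema above only shows checklist_id), the status encoding of 1 = yes and 0 = no is a guess, and sqlite3 stands in for MySQL.

```python
import sqlite3

SCHEMA = """
-- Assumed table: one row per checklist question, shared by all inspections.
CREATE TABLE checklist (
    checklist_id INTEGER PRIMARY KEY,
    item         TEXT NOT NULL,      -- e.g. 'chair' or 'table'
    question     TEXT NOT NULL
);
CREATE TABLE inspection (
    inspection_id INTEGER PRIMARY KEY,
    item          TEXT NOT NULL,
    inspected_by  TEXT NOT NULL,
    date          TEXT NOT NULL
);
CREATE TABLE inspection_detail (
    inspection_id INTEGER NOT NULL REFERENCES inspection(inspection_id),
    checklist_id  INTEGER NOT NULL REFERENCES checklist(checklist_id),
    status        INTEGER NOT NULL CHECK (status IN (0, 1)),  -- 1 = yes, 0 = no
    PRIMARY KEY (inspection_id, checklist_id)
);
"""


def record_inspection(conn, item, inspected_by, date, answers):
    """Store one inspection; answers maps checklist_id -> True/False."""
    cur = conn.execute(
        "INSERT INTO inspection (item, inspected_by, date) VALUES (?, ?, ?)",
        (item, inspected_by, date),
    )
    inspection_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO inspection_detail VALUES (?, ?, ?)",
        [(inspection_id, cid, int(ok)) for cid, ok in answers.items()],
    )
    return inspection_id


def failed_checks(conn, inspection_id):
    """Return the text of every question answered 'no' in this inspection."""
    rows = conn.execute(
        """
        SELECT c.question
        FROM inspection_detail d
        JOIN checklist c ON c.checklist_id = d.checklist_id
        WHERE d.inspection_id = ? AND d.status = 0
        ORDER BY c.checklist_id
        """,
        (inspection_id,),
    ).fetchall()
    return [r[0] for r in rows]


def build_demo():
    """Create an in-memory database with three chair checklist questions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO checklist VALUES (?, ?, ?)", [
        (1, "chair", "Have you received the correct parts?"),
        (2, "chair", "Are the parts in good condition?"),
        (3, "chair", "Do you have the correct specifications for the chair?"),
    ])
    return conn
```

With this layout, adding a question or a new kind of item is a new row rather than a new column or a new table.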

Suggestions on Database Design

I am building a sample online examination platform (I'm in the process of learning Ruby on Rails) with the following specifications:
There are 1000 different multiple choice questions.
Each question can have up to 5 possible answers, 1 of them is correct.
A user is presented with 10 random questions at a time (let's call this a test). If a user answers a question correctly 2 times, then this question will not be shown to him again.
A user passes the exam if he has answered every question correctly 2 times, in other words when there are no more questions left to show to him.
A first try :
student
-student_id
-name
question
-question_id
-text
option
-option_id
-text
-is_correct
-question_id
student_answer
-student_id
-question_id
-option_id
Although we could store only the correctly answered questions, I've decided to include the 'option_id' in the student_answer table in case I need to display statistics (hardest question etc.) in the future.
So up to this point, a question has many options, every option belongs to a single question and every time a student answers a question a student_answer row is created.
It seems to me that this approach would have some performance issues since for each test we'd have to select all the answers given by the user, group them by the question_id, for each question calculate the correct times it has been answered, get a set of question_id that shouldn't be displayed and finally select 10 random questions out of the 1000 initial ones minus those we just excluded.
Another thought I had was to have a JSON array in the form of {[0,0,1,...,1]} for every user. Each cell would be the number of correct answers for the question with an id matching the array index but I find that a bad idea.
Since I'm a relative beginner when it comes to database design, I'd like some feedback on my approach. Please feel free to suggest ways that are completely different from the above.
Thank you very much.
I think you may need to include the question_id in the option table.
One approach would be to move some of the processing into Ruby:
Select the 1000 questions.
Select the questions this user has already answered correctly at least twice (option is a reserved word in MySQL, so the table name needs backticks):
SELECT count(*) counter, question_id, option_id FROM student_answer JOIN `option` USING (question_id, option_id) WHERE student_id = x AND `option`.is_correct = 1 GROUP BY question_id, option_id HAVING counter > 1
Randomize the 1000 questions in Ruby, iterate through them, and exclude any that were found in your query of correct answers for this student. Stop after 10 questions.
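The steps above can be sketched roughly as follows. The answer suggests doing this in Ruby; I have written it in Python with the built-in sqlite3 module purely so that it is self-contained and runnable, and the helper names (mastered_ids, pick_test, build_demo) are made up for the illustration. The tables follow the "first try" schema from the question.

```python
import random
import sqlite3

SCHEMA = """
CREATE TABLE question (question_id INTEGER PRIMARY KEY, text TEXT NOT NULL);
CREATE TABLE `option` (
    option_id   INTEGER PRIMARY KEY,
    text        TEXT NOT NULL,
    is_correct  INTEGER NOT NULL,
    question_id INTEGER NOT NULL REFERENCES question(question_id)
);
CREATE TABLE student_answer (
    student_id  INTEGER NOT NULL,
    question_id INTEGER NOT NULL,
    option_id   INTEGER NOT NULL
);
"""


def mastered_ids(conn, student_id):
    """Question ids this student has answered correctly at least twice."""
    rows = conn.execute(
        """
        SELECT sa.question_id
        FROM student_answer sa
        JOIN `option` o
          ON o.option_id = sa.option_id AND o.question_id = sa.question_id
        WHERE sa.student_id = ? AND o.is_correct = 1
        GROUP BY sa.question_id
        HAVING COUNT(*) >= 2
        """,
        (student_id,),
    )
    return {r[0] for r in rows}


def pick_test(conn, student_id, size=10, rng=random):
    """Pick up to `size` random questions the student has not yet mastered."""
    all_ids = [r[0] for r in conn.execute("SELECT question_id FROM question")]
    done = mastered_ids(conn, student_id)
    remaining = [q for q in all_ids if q not in done]
    return rng.sample(remaining, min(size, len(remaining)))


def build_demo(n_questions=20):
    """In-memory database where question q has options q*10+1 (right) and q*10+2 (wrong)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    for q in range(1, n_questions + 1):
        conn.execute("INSERT INTO question VALUES (?, ?)", (q, f"Question {q}"))
        conn.execute("INSERT INTO `option` VALUES (?, 'right', 1, ?)", (q * 10 + 1, q))
        conn.execute("INSERT INTO `option` VALUES (?, 'wrong', 0, ?)", (q * 10 + 2, q))
    return conn
```

An empty list from pick_test means there is nothing left to show, which is the "passed the exam" condition from the question. With only 1000 questions, pulling all the ids into the application and filtering there should be cheap.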
If only one answer can be correct, why store the correctness in the option table? The question record could instead hold a foreign key to the correct option.
You describe some entities not addressed by your design. You may not need to store 'a test', but storing it, together with a primary key on student_answer, gives a model that makes it a bit easier to answer different questions about the data.
I think you have a good approach going; I would probably do the same. symcbean does make a good point above, but my solution would be to store a boolean column in the student_answer table recording whether the answer was correct.

Surveys and Answers, relation on many-to-many

I have the following database schema (I don't know if it's perfect, but I think it's all right?)
It's a system where a User has many Surveys, and the Users provide Answers for Questions in the Survey_Answers table.
A User can have multiple Surveys: the questions are the same, but later in the year they have to fill in the survey again.
I'm nearly there, I'm just wondering how to connect the answers to the survey. Should I make a relation between survey_answers and user_surveys, thus adding an id to the user_surveys table?
Or do you think it's OK to make a relation to the surveys table? I'm not sure which one is correct.
I outlined the 2 possibilities in the second screenshot.
Looking forward to your responses!
Thank you.
This probably depends on how your system is most frequently going to navigate the relationship.
If you are more likely to be looking at a user's answers and asking "when was this question answered, as part of which dated survey?", then you should join on user_surveys (I am assuming that the employee_id you are storing would match the user_id in user_surveys).
If you're more likely to be looking at a user's answers and asking "what survey did this question belong to?", then you should join on surveys.
You can still answer either question whichever join you use; the difference is only in performance (fewer table joins when answering the most common query).
In reality there probably isn't much in it, so you could always toss a coin :)

Reusable Questions in MySQL Survey Design

I'm working on a reusable survey database design. The idea is this:
A Client has many users. A Client has categories, which consist of questions. Every User has to answer all questions to complete the Survey. Those answers are stored in the Answers table.
The hard part
Some users are coaches, and a coach can fill in the survey for a user, providing a score for what they think the user would answer. That way we can later compare what the user answered and what the coach answered for each user. That's not too hard! The next requirement is:
After some months we should be able to let the users redo the surveys, giving new answers to all the questions that still exist.
I'm wondering if my db design is all right for this.
I have the feeling that this isn't optimal.
For example the following queries seem difficult with my design
For a given Scan, give me all categories and questions
(because of the many tables in between)
Looking forward to your responses!
Think about how you are going to want to use this information. Are you going to want to compare users' scores to coaches' scores to their new scores? I think that is likely. Will they end up taking the survey multiple times if they don't improve enough? Are there going to be questions that do not have integer answers, and how are you going to store those results? When they create a new survey, are they going to want to reuse some previous questions or answers (like yes/no)? How are you going to identify a unique user? Names are not unique, and autogenerated IDs are unique, but how will you know which John Smith belongs to which of the 12 IDs you have?
I would rename the Answer table as SurveyResponse.
To it I would add the datetime of the survey response (so people can answer it multiple times and you can compare the answers) and a Survey ID (from the new table in the next suggestion).
I would create a Survey table that stores the questions that belong to a particular survey.
I would create a new Answer table that just has possible answers and an ID.
I would create a table called SurveyQuestionAnswer which stores the allowed answers to the question for each survey (different surveys might have different possible responses to the same question).
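To make the suggested tables concrete, here is a rough sketch. The original diagram is not available, so every column name here is an assumption, including the answered_by column I added to tell a user's own response from a coach's. sqlite3 stands in for MySQL, and the composite foreign key is one possible way to make the database reject answers that are not allowed for that survey question.

```python
import sqlite3

SCHEMA = """
CREATE TABLE question (question_id INTEGER PRIMARY KEY, text TEXT NOT NULL);
CREATE TABLE answer   (answer_id   INTEGER PRIMARY KEY, label TEXT NOT NULL);
-- Which questions belong to which survey.
CREATE TABLE survey (
    survey_id   INTEGER NOT NULL,
    question_id INTEGER NOT NULL REFERENCES question(question_id),
    PRIMARY KEY (survey_id, question_id)
);
-- Which answers are allowed for a question within a given survey.
CREATE TABLE survey_question_answer (
    survey_id   INTEGER NOT NULL,
    question_id INTEGER NOT NULL,
    answer_id   INTEGER NOT NULL REFERENCES answer(answer_id),
    PRIMARY KEY (survey_id, question_id, answer_id),
    FOREIGN KEY (survey_id, question_id) REFERENCES survey(survey_id, question_id)
);
-- One row per response; the timestamp lets the same survey be retaken later.
CREATE TABLE survey_response (
    user_id      INTEGER NOT NULL,
    answered_by  TEXT NOT NULL,      -- assumed values: 'user' or 'coach'
    survey_id    INTEGER NOT NULL,
    question_id  INTEGER NOT NULL,
    answer_id    INTEGER NOT NULL,
    responded_at TEXT NOT NULL,      -- ISO 8601 text, so it sorts chronologically
    FOREIGN KEY (survey_id, question_id, answer_id)
        REFERENCES survey_question_answer(survey_id, question_id, answer_id)
);
"""


def open_db():
    """Create an in-memory database with one yes/no question in survey 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO question VALUES (1, 'Do you exercise weekly?')")
    conn.executemany("INSERT INTO answer VALUES (?, ?)",
                     [(1, "yes"), (2, "no"), (3, "maybe")])
    conn.execute("INSERT INTO survey VALUES (1, 1)")
    conn.executemany("INSERT INTO survey_question_answer VALUES (1, 1, ?)",
                     [(1,), (2,)])  # 'maybe' is not allowed in survey 1
    return conn


def respond(conn, user_id, answered_by, survey_id, question_id, answer_id, when):
    """Record one response; raises sqlite3.IntegrityError for a disallowed answer."""
    conn.execute(
        "INSERT INTO survey_response VALUES (?, ?, ?, ?, ?, ?)",
        (user_id, answered_by, survey_id, question_id, answer_id, when),
    )


def history(conn, user_id, question_id):
    """All responses about one user for one question, oldest first."""
    return conn.execute(
        """
        SELECT r.responded_at, r.answered_by, a.label
        FROM survey_response r
        JOIN answer a ON a.answer_id = r.answer_id
        WHERE r.user_id = ? AND r.question_id = ?
        ORDER BY r.responded_at, r.answered_by
        """,
        (user_id, question_id),
    ).fetchall()
```

Because every response row carries its own timestamp and answered_by value, the user-versus-coach comparison and the before-versus-after comparison are both plain queries over survey_response.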

categories in questions algorithm

Situation
I'm a learner and I'm building a web app in which people can log in and ask/answer questions, sort of like Quora.
Problem
I want questions to be asked in particular categories like funny, tech, news, etc., and each question can have multiple categories. I'm having difficulty designing a database for this.
Possible Solution
Each question will have a unique questionid, and I can have a whole other table just for categories -> (questionid, category), which may have multiple entries per questionid. But is there any better solution?
Each question can have multiple categories and each category can have multiple questions, so this is a many-to-many relationship. Like this:
CATEGORY_TABLE
-categoryId(PK)
-categoryName
QUESTION_TABLE
-questionId(PK)
-question
QUESTION_CATEGORY_TABLE
-questionId(PK)
-categoryId(PK)
I guess this is what you have suggested as a solution, right?
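For illustration, here is a small runnable sketch of the three tables above, with the two (PK) columns of QUESTION_CATEGORY_TABLE treated as one composite primary key. The table names are shortened, the helper functions are made up for the example, and sqlite3 stands in for MySQL.

```python
import sqlite3

SCHEMA = """
CREATE TABLE category (
    categoryId   INTEGER PRIMARY KEY,
    categoryName TEXT NOT NULL UNIQUE
);
CREATE TABLE question (
    questionId INTEGER PRIMARY KEY,
    question   TEXT NOT NULL
);
-- Junction table: one row per (question, category) pair.
CREATE TABLE question_category (
    questionId INTEGER NOT NULL REFERENCES question(questionId),
    categoryId INTEGER NOT NULL REFERENCES category(categoryId),
    PRIMARY KEY (questionId, categoryId)  -- composite key, no duplicate pairs
);
"""


def questions_in(conn, category_name):
    """Text of every question tagged with the given category, in id order."""
    rows = conn.execute(
        """
        SELECT q.question
        FROM question q
        JOIN question_category qc ON qc.questionId = q.questionId
        JOIN category c           ON c.categoryId = qc.categoryId
        WHERE c.categoryName = ?
        ORDER BY q.questionId
        """,
        (category_name,),
    ).fetchall()
    return [r[0] for r in rows]


def categories_of(conn, question_id):
    """Names of every category attached to one question, alphabetically."""
    rows = conn.execute(
        """
        SELECT c.categoryName
        FROM category c
        JOIN question_category qc ON qc.categoryId = c.categoryId
        WHERE qc.questionId = ?
        ORDER BY c.categoryName
        """,
        (question_id,),
    ).fetchall()
    return [r[0] for r in rows]


def build_demo():
    """In-memory database with three categories and two tagged questions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO category VALUES (?, ?)",
                     [(1, "funny"), (2, "tech"), (3, "news")])
    conn.executemany("INSERT INTO question VALUES (?, ?)",
                     [(1, "Why do programmers prefer dark mode?"),
                      (2, "What changed in the latest release?")])
    conn.executemany("INSERT INTO question_category VALUES (?, ?)",
                     [(1, 1), (1, 2), (2, 2), (2, 3)])
    return conn
```

Storing categoryId rather than the category name in the junction table means that renaming a category is a single-row update.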