sql good practice for holding long strings as fields - mysql

I am creating a SQL database with a table holding questionnaire answers. The questions are full sentences (about 150 characters each), and I want to know the best way to store that information as the fields. I am still new to SQL, but I see two options:
Set each question as a number (1, 2, 3, 4...) and have a separate table holding the actual question text, linked to the number in the first table.
Use some feature of CREATE TABLE that lets you set the field name to a sentence. I thought quotes would work, but they do not.
EDIT:
A quick example of what I am trying to do:
CREATE TABLE survey(
index_id INT PRIMARY KEY,
'between 1 and 10, how do you feel about the transparency of the scientific community?' VARCHAR(5)
);
Thanks!

You are mixing up the data in a table and creating the table.
When you create the table, you define its structure.
Then you can add data to the table.
Then you can query the table.
So, for example, create a table:
create table questionanswer (
questionnumber integer,
answer varchar(200)
);
Add data to the table:
insert into questionanswer (questionnumber, answer)
values (1, 'election day');
Query the table for values:
select answer
from questionanswer
where questionnumber = 1;

Generally using VARCHAR(255) with encoding utf8mb4 is a good default. If you need long-form data, like essays, multiple paragraphs, etc. then use TEXT or LONGTEXT.
This is really a one-table problem:
CREATE TABLE questions (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
questionnaire_id INT NOT NULL,
num INT NOT NULL DEFAULT 0,
question VARCHAR(255) NOT NULL
);
If you want multiple questionnaires, you can add a separate questionnaire table that questionnaire_id refers to, or just use that number as-is to partition the questions.
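As a rough sketch of how this plays out, the snippet below stores each question as a row and records answers in a second table. It uses Python's built-in sqlite3 module as a stand-in for MySQL (so AUTO_INCREMENT becomes SQLite's INTEGER PRIMARY KEY), and the answers table with its columns is an assumption for illustration, not part of the schema above.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE questions (
    id INTEGER PRIMARY KEY,            -- SQLite's auto-incrementing key
    questionnaire_id INT NOT NULL,
    num INT NOT NULL DEFAULT 0,
    question VARCHAR(255) NOT NULL
);
-- Hypothetical table: one row per respondent per question.
CREATE TABLE answers (
    respondent_id INT NOT NULL,
    question_id INT NOT NULL REFERENCES questions(id),
    answer VARCHAR(5) NOT NULL,
    PRIMARY KEY (respondent_id, question_id)
);
""")

# The long sentence is data in a row, not a column name.
conn.execute(
    "INSERT INTO questions (questionnaire_id, num, question) VALUES (?, ?, ?)",
    (1, 1, "Between 1 and 10, how do you feel about the transparency "
           "of the scientific community?"),
)
conn.execute(
    "INSERT INTO answers (respondent_id, question_id, answer) VALUES (?, ?, ?)",
    (42, 1, "7"),
)

# Join answers back to the question text when reporting.
rows = conn.execute("""
    SELECT q.num, q.question, a.answer
    FROM answers a
    JOIN questions q ON q.id = a.question_id
    ORDER BY q.num
""").fetchall()
print(rows)
```

Adding a new question is then an INSERT rather than an ALTER TABLE, which is the main reason to keep the sentences out of the column names.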

Related

How to design a table, where the values of one column may come from either of two other tables in a database?

I am trying to implement the following in a SQL database:
I have three tables.
CREATE TABLE company1(
id int NOT NULL,
prod_id int NOT NULL,
company1_quality_value int);
CREATE TABLE company2(
id int NOT NULL,
prod_id int NOT NULL,
company2_quality_value int);
CREATE TABLE production(
id int NOT NULL,
prod_id int NOT NULL,
corrected_quality_value int);
Suppose two companies calculate some quality value for a product and I store this in two separate tables. Now also suppose that I want to make a production table for this product, with a column corrected_quality_value. Since I trust company2 more than company1, I would like this column to be company2_quality_value whenever company2 has the value stored for the specific product ID prod_id. However, this value may be NULL. Only in this case I will take the value company1_quality_value, which we can suppose will always exist.
My question is: should I try to design the relationship between these values from the get-go (and if so, how would I best do it), or should I just do it in the backend and leave the values without a relationship?
The second option just seems odd, since in my mind this would create at least some data duplication (since the values in corrected_quality_value are already stored somewhere else).
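One possible way to express the fallback rule described above without storing the value twice is a view built on COALESCE. The sketch below is only an illustration under stated assumptions: it runs on SQLite through Python's sqlite3 module, the view name production_view is made up, and it assumes at most one row per prod_id in each company table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in; the SQL itself is fairly portable
conn.executescript("""
CREATE TABLE company1 (id INT NOT NULL, prod_id INT NOT NULL, company1_quality_value INT);
CREATE TABLE company2 (id INT NOT NULL, prod_id INT NOT NULL, company2_quality_value INT);

-- Hypothetical view: prefer company2's value, fall back to company1's when
-- company2 has no row or a NULL value for the product.
CREATE VIEW production_view AS
SELECT c1.prod_id,
       COALESCE(c2.company2_quality_value, c1.company1_quality_value)
           AS corrected_quality_value
FROM company1 c1
LEFT JOIN company2 c2 ON c2.prod_id = c1.prod_id;

INSERT INTO company1 VALUES (1, 100, 5), (2, 200, 6);
INSERT INTO company2 VALUES (1, 100, 9), (2, 200, NULL);
""")

rows = conn.execute(
    "SELECT prod_id, corrected_quality_value FROM production_view ORDER BY prod_id"
).fetchall()
print(rows)  # [(100, 9), (200, 6)]
```

Product 100 takes company2's value, while product 200 falls back to company1's because company2 stored NULL.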

What's the best practice to design a table that would have different fields on different conditions?

I need advice in creating tables where there would be different fields based on a condition. I'm pretty new to psql, so I don't really know if I'm going the right way and would appreciate any tips / advice!
Currently I have a table to represent a meeting_note, which can either be a voice recording OR a text.
When the meeting note is of type text, it must have a meeting_content, and can have an optional meeting_summary. audio_source should be null.
When the meeting note is of type audio, it must have an audio_source and the fields meeting_content and meeting_summary should be null.
I was also thinking of creating two tables - one for type audio and another for text, but there is a unique constraint on created_at which represents a date like May 11th. I wasn't sure how to add this constraint between two tables.
Here are the fields for the table meeting_note
id serial PRIMARY KEY,
meeting_id integer REFERENCES meeting(id),
meeting_note_type enum('audio', 'text') NOT NULL,
meeting_content text,
summary varchar(255),
created_at varchar(10) NOT NULL,
recording_source varchar(255)
and the constraints:
UNIQUE (to_char(created_at, 'YYYY-MM-DD')),
CHECK (NOT (meeting_note_type = 'text' AND meeting_content IS NULL)),
CHECK (NOT (meeting_note_type = 'audio' AND audio_source IS NULL)),
CHECK (NOT (meeting_content IS NULL AND audio_source IS NULL)),
CHECK (NOT (meeting_content IS NOT NULL AND audio_source IS NOT NULL)),
CHECK (NOT (audio_source IS NOT NULL AND summary IS NOT NULL))
Appreciate any help on this. Thank you so much in advance!
There are two common approaches to this problem - using a table-per-type and using one table for everything. The approach you describe in the question is one table for everything; your definition is pretty accurate.
Here is how to do a table-per-type solution: make a "master" table for all notes, and then a table for each note sub-type, like this:
create table note_master (
id serial PRIMARY KEY,
meeting_id integer REFERENCES meeting(id),
created_at varchar(10) NOT NULL
);
create table note_text (
id integer REFERENCES note_master(id), -- reuses the master row's id, so plain integer rather than serial
meeting_content text,
summary varchar(255)
);
create table note_audio (
id integer REFERENCES note_master(id), -- reuses the master row's id
recording_source varchar(255)
);
To query for everything you do left-outer joins to note_text and note_audio. This approach lets you skip the enum because you can always figure out what kind of note it is by examining the results of the join.
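The left-outer-join query described above might look roughly like the following. This is a sketch on SQLite via Python's sqlite3 module rather than PostgreSQL; the meeting table is omitted, the sample rows are invented, and the UNIQUE constraint on created_at and the NOT NULL constraints are assumptions carried over from the question's requirements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for PostgreSQL
conn.executescript("""
CREATE TABLE note_master (
    id INTEGER PRIMARY KEY,
    meeting_id INTEGER,                      -- meeting table omitted in this sketch
    created_at VARCHAR(10) NOT NULL UNIQUE   -- one note per date, enforced once
);
CREATE TABLE note_text (
    id INTEGER PRIMARY KEY REFERENCES note_master(id),
    meeting_content TEXT NOT NULL,
    summary VARCHAR(255)
);
CREATE TABLE note_audio (
    id INTEGER PRIMARY KEY REFERENCES note_master(id),
    recording_source VARCHAR(255) NOT NULL
);
INSERT INTO note_master (id, meeting_id, created_at)
VALUES (1, 10, '2020-05-11'), (2, 10, '2020-05-12');
INSERT INTO note_text (id, meeting_content) VALUES (1, 'Agreed on the roadmap.');
INSERT INTO note_audio (id, recording_source) VALUES (2, 'recordings/standup.mp3');
""")

# The note type is derived from which sub-type table matched, so no enum is stored.
rows = conn.execute("""
    SELECT m.id,
           m.created_at,
           CASE WHEN t.id IS NOT NULL THEN 'text'
                WHEN a.id IS NOT NULL THEN 'audio'
           END AS note_type,
           t.meeting_content,
           a.recording_source
    FROM note_master m
    LEFT JOIN note_text t ON t.id = m.id
    LEFT JOIN note_audio a ON a.id = m.id
    ORDER BY m.id
""").fetchall()
for row in rows:
    print(row)
```

Because the date lives only in note_master, the uniqueness rule is declared in one place. Note that nothing in this sketch stops a master row from having both a text and an audio row; that would need an extra check.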

MySQL - How can I make column be a sum of the two other columns from different tables?

I am new to MySQL and databases overall. Is it possible to create a table where a column is the sum of two other columns from two other tables?
For instance, if I have the table books:
CREATE TABLE `books` (
`book_id` int(100) NOT NULL,
`book_name` varchar(20) NOT NULL,
`book_author` varchar(20) NOT NULL,
`book_co-authors` varchar(20) NOT NULL,
`book_edition` tinyint(4) NOT NULL,
`book_creation` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`book_amount` int(100) NOT NULL COMMENT 'Amount of book copies in both University libraries'
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
How can I make the column book_amount be the sum of the two book_amount columns from the library1 and library2 tables where the book_id values match?
Library1 :
CREATE TABLE `library1` (
`book_id` int(11) NOT NULL,
`book_amount` int(11) NOT NULL,
`available_amount` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
You can define a column with whatever type you want, so long as it's valid, and then populate it as you will with data from other tables. This is generally called "denormalizing" as under ideal circumstances you'll want that data stored in other tables and computed on demand so there's never a chance of your saved value and the source data falling out of sync.
You can also define a VIEW which is like a saved query that behaves as if it's a table. This can do all sorts of things, like dynamically query other tables, and presents the result as a column. A "materialized view" is something some databases support where the view is automatically updated and saved based on some implicit triggers. A non-materialized view is slower, but in your case the speed difference might not be a big deal.
So you have options in how you represent this.
One thing to note is that you should use plain INT as the default integer type, not wonky things like INT(100). The number in parentheses is only a display width and does not change the range of values the column can store; since a signed INT never needs more than 11 characters (10 digits plus a sign), 100 is wildly out of line.
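To make the view option mentioned above concrete, here is a minimal sketch. It runs on SQLite through Python's sqlite3 module as a stand-in for MySQL, the view name book_totals is made up, and library2 is assumed to have the same columns as library1, since its definition is not shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL
conn.executescript("""
CREATE TABLE library1 (book_id INT NOT NULL, book_amount INT NOT NULL);
CREATE TABLE library2 (book_id INT NOT NULL, book_amount INT NOT NULL);

-- Hypothetical view name; the total is computed on demand, never stored.
CREATE VIEW book_totals AS
SELECT l1.book_id,
       l1.book_amount + l2.book_amount AS book_amount
FROM library1 l1
JOIN library2 l2 ON l2.book_id = l1.book_id;

INSERT INTO library1 VALUES (1, 3), (2, 5);
INSERT INTO library2 VALUES (1, 4), (2, 0);
""")

print(conn.execute("SELECT * FROM book_totals ORDER BY book_id").fetchall())
# [(1, 7), (2, 5)]

# Changing a source table is reflected the next time the view is queried.
conn.execute("UPDATE library2 SET book_amount = 10 WHERE book_id = 2")
totals = dict(conn.execute("SELECT book_id, book_amount FROM book_totals"))
print(totals[2])  # 15
```

The inner join means a book held by only one library would not appear in the view; a LEFT JOIN with COALESCE would be one way to include it.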
Not directly, however there are a few ways to achieve what you're after.
Either create a pseudo column in your select clause that adds the other two columns:
select *, columna+columnb AS `addition` from books
Don't forget to swap out columna and columnb for the names of your columns, and addition for the name you'd like the pseudo column to have.
Alternatively, you could use a view to add the pseudo field automatically in the same way. However, views do not have indexes, so performing lookups in them and joining them can get rather slow very easily.
You could also use triggers to set the values upon insert and update, or simply calculate the value within the language that inserts into the DB.
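The trigger route could be sketched as follows. This is an illustration on SQLite through Python's sqlite3 module, not MySQL: MySQL's trigger syntax differs in detail, the trigger names are invented, library2 is assumed to mirror library1, and only UPDATE is handled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL
conn.executescript("""
CREATE TABLE books (
    book_id INT PRIMARY KEY,
    book_name VARCHAR(20) NOT NULL,
    book_amount INT NOT NULL DEFAULT 0   -- denormalized total, kept by triggers
);
CREATE TABLE library1 (book_id INT NOT NULL, book_amount INT NOT NULL);
CREATE TABLE library2 (book_id INT NOT NULL, book_amount INT NOT NULL);
INSERT INTO books VALUES (1, 'SQL Basics', 7);
INSERT INTO library1 VALUES (1, 3);
INSERT INTO library2 VALUES (1, 4);
""")

# One trigger per library table; each recomputes the stored total for the
# affected book. INSERT and DELETE would need their own triggers.
for lib in ("library1", "library2"):
    conn.execute(f"""
        CREATE TRIGGER {lib}_sync AFTER UPDATE OF book_amount ON {lib}
        FOR EACH ROW
        BEGIN
            UPDATE books
            SET book_amount =
                COALESCE((SELECT SUM(book_amount) FROM library1
                          WHERE book_id = NEW.book_id), 0)
              + COALESCE((SELECT SUM(book_amount) FROM library2
                          WHERE book_id = NEW.book_id), 0)
            WHERE book_id = NEW.book_id;
        END
    """)

conn.execute("UPDATE library1 SET book_amount = 6 WHERE book_id = 1")
total = conn.execute(
    "SELECT book_amount FROM books WHERE book_id = 1"
).fetchone()[0]
print(total)  # 10
```

The stored total is fast to read but is only as reliable as the set of triggers maintaining it, which is the trade-off against computing the sum at query time.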
The following query will work if the library1 and library2 tables have a schema similar to that of the books table:
Insert into books
select l1.book_id,
l1.book_name,
l1.book_authors,
l1.book_co_authors,
l1.book_edition,
l1.book_creation,
(l1.book_amount + l2.book_amount)
from library1 l1
inner join library2 l2
on l1.book_id = l2.book_id
group by l1.book_id
;

MySQL db structure help

I'm working on a quiz project and I want to create a MySQL structure such that:
questionID: A unique question identification number(primary key)
testID: A unique test identification number(question belongs to this test)(primary key)
questionOrder: The order of the question within the quiz questions, ie this question is n-th question in the quiz. I want this value to come from mysql, so that when I insert a new question to db, I don't have to calculate it
One question can be in multiple different tests.
I have a couple of questions:
1) I have the following code but I get:
Incorrect table definition; there can be only one auto column and it must be defined as a key
How can I fix this?
2) This structure doesn't allow a question to belong to multiple quizzes. Any idea to avoid this?
3) Do you think this structure is good/optimum, can you suggest anything better?
CREATE TABLE `quiz_question` (
`questionID` int(11) NOT NULL auto_increment,
`quizID` int(11) NOT NULL default '0',
`questionOrder` int(11) NOT NULL AUTO_INCREMENT,
`question` varchar(256) NOT NULL default '',
`answer` varchar(256) NOT NULL default '',
PRIMARY KEY (`questionID`),
UNIQUE KEY (`quizID`, `questionOrder`),
KEY `par_ind` (`quizID`, `questionOrder`)
) ENGINE=MyISAM;
ALTER TABLE `quiz_question`
ADD CONSTRAINT `0_133` FOREIGN KEY (`quizID`) REFERENCES `quiz_quiz` (`quizID`);
CREATE TABLE `quiz_quiz` (
`quizID` int(11) NOT NULL auto_increment,
`topic` varchar(100) NOT NULL default '',
`information` varchar(100) NOT NULL default '',
PRIMARY KEY (`quizID`)
) ENGINE=MyISAM;
Thanks for reading this.
1) You can only have one AUTO_INCREMENT column per table. It should be a key. Generally, it's part of / is the PK.
2) A 'quiz' would be an entity composed of questions. You should have 3 tables:
1 - quiz_question: quest_id, question, answer
2 - quiz_quiz: quiz_id, topic, info
3 - quiz_fact: quiz_id, quest_id, quest_order
The quiz and question tables hold the per-item (quiz/question) information. The quiz_fact defines how a quiz is composed (this quiz has this question in this order).
3) My only suggestion would be to use Drizzle instead ; ) Seriously though, play with things - 'good enough' often is. If it suits your needs, why tinker? Otherwise you can ask more detailed questions once you have this up and running (i.e., my queries are too slow on such and such operations).
1) Do the order increment yourself. The DB will only do it if it's part of a PK. You might be able to hack it by making a composite key containing the order column but it's not worth it.
2) Rename quiz_question to question (and quiz_quiz to quiz). Make a new quiz-question join table called quiz_question. It should have a quiz ID and a question ID, linking a quiz to a question. As the same question will have different orders on different quizzes, put the question order on the new quiz_question. You no longer need a quiz ID on the question table.
Remove AUTO_INCREMENT from the questionOrder field.
As for having MySQL set the value of the questionOrder field, you can do that in a subsequent UPDATE query. Usually, you'd want the administrator of the test, using your admin utility, to be able to adjust the ordering of questions. In that case, you just enter an initial value one higher than the highest existing ordering value (on that test). Then you can let them adjust it in something like the manner of adjusting a Netflix queue :)
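Putting the suggestions from these answers together, a three-table layout with the order assigned as "highest existing value plus one" might look like this. It is a sketch on SQLite via Python's sqlite3 module rather than MySQL; the helper add_question_to_quiz and the sample data are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL
conn.executescript("""
CREATE TABLE quiz (quiz_id INTEGER PRIMARY KEY, topic VARCHAR(100) NOT NULL);
CREATE TABLE question (
    quest_id INTEGER PRIMARY KEY,
    question VARCHAR(256) NOT NULL,
    answer VARCHAR(256) NOT NULL
);
-- Join table: which question appears in which quiz, and in what position.
CREATE TABLE quiz_question (
    quiz_id INT NOT NULL REFERENCES quiz(quiz_id),
    quest_id INT NOT NULL REFERENCES question(quest_id),
    quest_order INT NOT NULL,
    PRIMARY KEY (quiz_id, quest_id),
    UNIQUE (quiz_id, quest_order)
);
""")

def add_question_to_quiz(conn, quiz_id, quest_id):
    """Append a question to a quiz at position (highest existing order + 1)."""
    conn.execute(
        """
        INSERT INTO quiz_question (quiz_id, quest_id, quest_order)
        SELECT ?, ?, COALESCE(MAX(quest_order), 0) + 1
        FROM quiz_question
        WHERE quiz_id = ?
        """,
        (quiz_id, quest_id, quiz_id),
    )

conn.executemany("INSERT INTO quiz (quiz_id, topic) VALUES (?, ?)",
                 [(1, "SQL"), (2, "History")])
conn.executemany("INSERT INTO question (quest_id, question, answer) VALUES (?, ?, ?)",
                 [(1, "What does PK stand for?", "Primary key"),
                  (2, "What year is it?", "It depends")])

add_question_to_quiz(conn, 1, 1)
add_question_to_quiz(conn, 1, 2)
add_question_to_quiz(conn, 2, 2)   # the same question reused in another quiz

order = conn.execute(
    "SELECT quiz_id, quest_id, quest_order FROM quiz_question "
    "ORDER BY quiz_id, quest_order"
).fetchall()
print(order)  # [(1, 1, 1), (1, 2, 2), (2, 2, 1)]
```

Two concurrent inserts could compute the same position; the UNIQUE constraint on (quiz_id, quest_order) would reject the second one, which the application would then need to retry.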

MySQL column with various types

I seem to often find myself wanting to store data of more than one type (usually specifically integers and text) in the same column in a MySQL database. I know this is horrible, but the reason it happens is when I'm storing responses that people have made to questions in a questionnaire. Some questions need an integer response, some need a text response and some might be an item selected from a list.
The approaches I've taken in the past have been:
Store everything as text and convert to int (or whatever) when needed later.
Have two columns - one for text and one for int. Then you just fill one in per row per response, and leave the other one as null.
Have two tables - one for text responses and one for integer responses.
I don't really like any of those, though, and I have a feeling there must be a much better way to deal with this kind of situation.
To make it more concrete, here's an example of the tables I might have:
CREATE TABLE question (
id int(11) NOT NULL auto_increment,
text VARCHAR(200) NOT NULL default '',
PRIMARY KEY (id)
);
CREATE TABLE response (
id int(11) NOT NULL auto_increment,
question int(11) NOT NULL,
user int(11) NOT NULL,
response VARCHAR(200) NOT NULL default '',
PRIMARY KEY (id)
);
or, if I went with using option 2 above:
CREATE TABLE response (
id int(11) NOT NULL auto_increment,
question int(11) NOT NULL,
user int(11) NOT NULL,
text_response VARCHAR(200),
numeric_response int(11),
PRIMARY KEY (id)
);
and if I used option 3 there'd be a responseInteger table and a responseText table.
Is any of those the right approach, or am I missing an obvious alternative?
[Option 2 is] NOT the most normalized option [as @Ray claims]. The most normalized would have no nullable fields, and obviously option 2 would require a null on every row.
At this point in your design you have to think about the usage, the queries you'll do, the reports you'll write. Will you want to do math on all of the numeric responses at the same time? i.e. WHERE numeric_response IS NOT NULL? Probably unlikely.
More likely would be, What's the average response WHERE Question = 11. In those cases you can either choose the INT table or the INT column and neither would be easier to do than the other.
If you did do two tables, you'd more than likely be constantly unioning them together for questions like, what % of questions have a response etc.
Can you see how the questions you ask your database to answer start to drive the design?
I'd opt for Option 1. The answers are always text strings, but sometimes the text string happens to be the representation of an integer. What is less easy is to determine what constraints, if any, should be placed on the answer to a given question. If some answer should only be a sequence of one or more digits, how do you validate that? Most likely, the Questions table should contain information about the possible answers, and that should guide the validation.
I note that the combination of QuestionID and UserID is (or should be) unique (for a given questionnaire). So, you really don't need the auto-increment column in the answer. You should also have a unique constraint (or primary key constraint) on the QuestionID and UserID anyway (regardless of whether you keep the auto-increment column).
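A minimal sketch of Option 1 along these lines follows, with the question table carrying the information used for validation. It runs on SQLite via Python's sqlite3 module as a stand-in for MySQL; the answer_type column, the save_response helper, and the sample questions are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL
conn.executescript("""
CREATE TABLE question (
    id INTEGER PRIMARY KEY,
    text VARCHAR(200) NOT NULL,
    answer_type VARCHAR(10) NOT NULL   -- hypothetical column: 'int' or 'text'
);
CREATE TABLE response (
    question INT NOT NULL REFERENCES question(id),
    user INT NOT NULL,
    response VARCHAR(200) NOT NULL,
    PRIMARY KEY (question, user)       -- one answer per user per question
);
INSERT INTO question VALUES (1, 'How many siblings do you have?', 'int'),
                            (2, 'Any other comments?', 'text');
""")

def save_response(conn, question_id, user_id, value):
    """Validate value against the question's answer_type, then store it as text."""
    row = conn.execute(
        "SELECT answer_type FROM question WHERE id = ?", (question_id,)
    ).fetchone()
    if row is None:
        raise ValueError(f"unknown question {question_id}")
    if row[0] == "int" and not (value.isascii() and value.isdigit()):
        raise ValueError(f"question {question_id} expects digits, got {value!r}")
    conn.execute(
        "INSERT INTO response (question, user, response) VALUES (?, ?, ?)",
        (question_id, user_id, value),
    )

save_response(conn, 1, 7, "3")
save_response(conn, 2, 7, "Nice survey")

# Numeric answers are converted only when a query needs them as numbers.
avg = conn.execute(
    "SELECT AVG(CAST(response AS INTEGER)) FROM response WHERE question = 1"
).fetchone()[0]
print(avg)  # 3.0
```

With the composite primary key in place, a second answer from the same user to the same question is rejected by the database rather than by application code.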
Option 2 is the correct, most normalized option.