MySQL data structure for questions needing multiple inputs to be true - mysql

Apologies for the title as I'm a bit unsure how to phrase this myself so hopefully an example might help.
I've got a MySQL table that holds questions
question_id, PK Int
text - Text
I also have a table called value that looks like this
value_id, PK Int
value - varchar
I think I might need a mapping table for this along the lines of
question_to_value
question_id int
value_id int
Though if my example looks like I don't need one then I can change the structure
Basically, given a single or multiple value_ids I want to pick the question that should be asked
so if I am given value_ids 1,2 I should have a unique question_id relating to those ids in the database. Given a value_id of 3 there should be a different question_id, and an input of value_ids (1,2,3) or 1,3 should again retrieve unique question_ids for both permutations.
I'm struggling with how I should go about it: a) should I use a sort of joining table for this, and b) what is the most efficient way of querying it?
My initial thought is to have a question_to_value table that holds a question_id and value_id on a 1-1 basis, then doing the following
select question_id from question_to_value WHERE value_id in (?,?,?) but I'm not sure if this is the optimal way to structure this. The trouble with the 'IN' query above is that if I'm just given the value_id of 1, it would bring back all the questions where value_id 1 is either the only value or part of the group of values that produce a particular question, e.g.
question_id 1 maps to value 1
question_id 2 maps to values of 1 and 2.
My IN statement would bring back question_ids 1 and 2 for a value_id of 1, when I only want question_id 1 because it is the only one that matches the input exactly.
Any ideas on how I should structure this?
*** editing I'm trying to come up with another way of phrasing this to avoid confusion so hopefully the following will help
consider I have 4 'questions'
a
b
c
d
If I'm given an input of 1 I only want to retrieve a.
If I'm given 2 inputs of 1 and 2 I only want to retrieve b.
If I'm given 3 inputs of 1,2,3 I only want to retrieve c.
If I'm given 2 inputs of 1 and 3 I only want to retrieve d.

I'm trying to break the problem apart into component parts to define the question in a way that can be expressed in code.
I'm working on the idea that the proper question must match two rules:
The question must have an entry for each of the input values
The question must not have entries for any value that is not among the inputs
First, consider a query that returns all the value ids for the inputs.
SELECT value_id FROM `value` WHERE `value` IN <inputs>
I don't know what language you are using, so I can't tell you how to build the inputs list. In PHP, it would be something like:
"('" . implode("','", $inputs) . "')"
to properly wrap each option in quotes (assuming the values don't also contain quotes, but that's a separate, language-specific problem to solve).
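If the driver supports bound parameters, the quoting problem mentioned above can be sidestepped by generating one placeholder per input instead of concatenating strings. The following is only a sketch in Python, using the standard-library sqlite3 module as a stand-in for MySQL; the helper name find_value_ids and the sample values are made up for illustration.

```python
import sqlite3

def find_value_ids(conn, inputs):
    """Return the value_ids whose text matches any of the given inputs."""
    if not inputs:
        return []
    # One "?" per input; the driver does the quoting, so a quote
    # character inside a value cannot break the statement.
    placeholders = ", ".join("?" for _ in inputs)
    sql = (
        'SELECT value_id FROM "value" '
        f'WHERE "value" IN ({placeholders}) ORDER BY value_id'
    )
    return [row[0] for row in conn.execute(sql, list(inputs))]

# Hypothetical sample data, not taken from the question.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "value" (value_id INTEGER PRIMARY KEY, "value" TEXT)')
conn.executemany(
    'INSERT INTO "value" VALUES (?, ?)',
    [(1, "red"), (2, "it's blue"), (3, "green")],
)

matched_ids = find_value_ids(conn, ["red", "it's blue"])
print(matched_ids)
```

Most MySQL drivers offer the same idea with their own placeholder syntax, so the shape of the code should carry over even though the details differ.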
Now it is simple to use that query to create a query that returns questions that have ANY of the input values:
SELECT question_id
FROM mapping_table
WHERE value_id IN (SELECT value_id FROM `value` WHERE `value` IN <inputs>)
Finally, we can tweak that query to only return the question_ids with the right number of matches. We want one row for each question, and we only want questions that match all the values:
SELECT question_id
FROM mapping_table
WHERE value_id IN (SELECT value_id FROM `value` WHERE `value` IN <inputs>)
GROUP BY question_id
HAVING COUNT(question_id) = <number of inputs>
That will give you a list of question_ids that match constraint 1 above. It will not test for constraint 2 in every case; if the input is (1,2) then this will match the question for (1,2) and the question for (1,2,3).
However, if you only get one row back from this query, there is no need to filter that list with the second query.
If you did get more than one match for the above, the second query is pretty simple - it's similar to the above query but with a different WHERE clause. This will get all the mapping table entries for the ids from the first query, and match the ones that don't have any extras:
SELECT question_id
FROM mapping_table
WHERE question_id IN <ids from first query>
GROUP BY question_id
HAVING COUNT(question_id) = <number of inputs>
You could combine the two queries, but unless performance is an issue, I'd keep them separate to make things easier to maintain.
NOTE: there is nothing here to constrain your data to make sure it is valid. In other words, there's nothing to prevent two questions from matching (1,2) when the questions are added to the database.
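For what it's worth, here is one way the two rules could be combined into a single query, as a sketch rather than a tuned solution. It assumes the mapping table is called question_to_value as in the question, that it has a composite primary key so a pair cannot be stored twice, and that the inputs are already value_ids. It runs against SQLite from Python so it can be tried without a server; the SQL itself only uses GROUP BY, HAVING and CASE.

```python
import sqlite3

def find_question(conn, value_ids):
    """Return question_ids whose mapped values are exactly the given set."""
    ids = sorted(set(value_ids))
    if not ids:
        return []
    placeholders = ", ".join("?" for _ in ids)
    sql = f"""
        SELECT question_id
        FROM question_to_value
        GROUP BY question_id
        -- rule 2: the question has no more rows than there are inputs
        HAVING COUNT(*) = ?
        -- rule 1: every one of those rows is one of the inputs
           AND SUM(CASE WHEN value_id IN ({placeholders}) THEN 1 ELSE 0 END) = ?
        ORDER BY question_id
    """
    params = [len(ids), *ids, len(ids)]
    return [row[0] for row in conn.execute(sql, params)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE question_to_value ("
    " question_id INTEGER, value_id INTEGER,"
    " PRIMARY KEY (question_id, value_id))"
)
# Questions a, b, c, d from the edited question, numbered 1 to 4.
conn.executemany(
    "INSERT INTO question_to_value VALUES (?, ?)",
    [(1, 1), (2, 1), (2, 2), (3, 1), (3, 2), (3, 3), (4, 1), (4, 3)],
)

print(find_question(conn, [1]))        # question a
print(find_question(conn, [1, 2]))     # question b
print(find_question(conn, [1, 2, 3]))  # question c
print(find_question(conn, [1, 3]))     # question d
```

This version groups the whole mapping table, so on a large table the two-step approach with a WHERE filter may well be faster; it is worth measuring before choosing.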

Related

How to select comma-separated values from a field in one table joined to another table with a specific where condition?

I'm working on a mysql database select and cannot find a solution for this tricky problem.
There's one table "words" with id and names of objects (in this case possible objects in a picture).
words
ID object
house
tree
car
…
The other table, "pictures", stores all the information about a picture. Besides details such as the resolution, it records which objects appear in the picture. They are saved in the column objectsinpicture as a list of ids from the table words, like 1,5,122,345, etc.
The table pictures also has a column "location", which holds the id of the place where I took the picture.
pictures
location objectsinpicture ...
1 - 1,2,3,4
2 - 1,5,122,34
1 - 50,122,345
1 - 91,35,122,345
2 - 1,14,32
1 - 1,5,122,345
To tag new pictures of a particular place I want to get suggestions based on the information already saved. Then I can create buttons in PHP to update the database instead of using a dropdown with multiple select.
What I have tried so far is the following:
SELECT words.id, words.object
FROM words, pictures
WHERE location = 2 AND FIND_IN_SET(words.id, pictures.objectsinpicture)
GROUP BY words.id
ORDER BY words.id
This nearly shows the expected values. But some information is missing. It doesn't show all the possible objects and I cannot find any reason for this.
What I want is, for example, all ids for location 2 joined to the table words, with duplicate entries of objectsinpicture grouped:
1,5,122,34
1,14,32
1,5,14,32,34,122
house
...
...
...
...
...
Maybe I need to use group_concat with a comma separator, but this doesn't work either. The problem seems to be the WHERE condition on the location.
I hope that someone has an idea for solving this.
Thanks in advance for any support!!!
This is a classic case of denormalization causing problems.
What you need to do is store each object/picture association separately, in another table:
create table objectsinpicture (
picture_id int,
object_id int,
primary key (picture_id, object_id)
);
Instead of storing a comma-separated list, you would store one association per row in this table. It will grow to a large number of rows of course, but each row is just a pair of id's so the total size won't be too great.
Then you can query:
SELECT w.id, w.object
FROM pictures AS p
JOIN objectsinpicture AS o ON o.picture_id = p.id
JOIN words AS w ON o.object_id = w.id
WHERE p.location = 2;
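To see how this plays out with the sample rows from the question, here is a rough sketch that splits the comma-separated lists into the association table and then asks for the objects at location 2. It uses Python with SQLite in place of MySQL. The picture ids, the placeholder names like object_5, and the DISTINCT (added because the question asks for double entries to be grouped) are my assumptions, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE words (id INTEGER PRIMARY KEY, object TEXT);
    CREATE TABLE pictures (id INTEGER PRIMARY KEY, location INTEGER);
    CREATE TABLE objectsinpicture (
        picture_id INTEGER,
        object_id INTEGER,
        PRIMARY KEY (picture_id, object_id)
    );
""")

# (location, objectsinpicture) pairs as listed in the question.
legacy_rows = [
    (1, "1,2,3,4"),
    (2, "1,5,122,34"),
    (1, "50,122,345"),
    (1, "91,35,122,345"),
    (2, "1,14,32"),
    (1, "1,5,122,345"),
]

for picture_id, (location, csv_ids) in enumerate(legacy_rows, start=1):
    conn.execute(
        "INSERT INTO pictures (id, location) VALUES (?, ?)",
        (picture_id, location),
    )
    for token in csv_ids.split(","):
        object_id = int(token.strip())
        # Placeholder object names; the real names live in the words table.
        # INSERT OR IGNORE is SQLite syntax (MySQL spells it INSERT IGNORE).
        conn.execute(
            "INSERT OR IGNORE INTO words (id, object) VALUES (?, ?)",
            (object_id, f"object_{object_id}"),
        )
        conn.execute(
            "INSERT OR IGNORE INTO objectsinpicture VALUES (?, ?)",
            (picture_id, object_id),
        )

def objects_at(conn, location):
    """Return each (id, object) seen in any picture taken at the location."""
    sql = """
        SELECT DISTINCT w.id, w.object
        FROM pictures AS p
        JOIN objectsinpicture AS o ON o.picture_id = p.id
        JOIN words AS w ON o.object_id = w.id
        WHERE p.location = ?
        ORDER BY w.id
    """
    return conn.execute(sql, (location,)).fetchall()

suggested_ids = [row[0] for row in objects_at(conn, 2)]
print(suggested_ids)
```

The ids for location 2 come out as the merged list the question was after, each one only once.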

How can I combine these two tables so that I can sort with information on each table, but not get duplicate answers?

I have two tables. The first is named master_list. It has these fields: master_id, item_id, name, img, item_code, and length. My second table is named types_join. It has these fields: master_id and type_id. (There is a third table, but it is not being used in the queries. It is more for reference.) I need to be able to combine these two tables so that I can sift the results to only show certain ones but part of the information to sift is on one table and the other part is on the other one. I don't want duplicate answers.
For example say I only want items that have a type_id of 3 and a length of 18.
When I use
SELECT * FROM master_list LEFT JOIN types_join ON master_list.master_id=types_join.master_id WHERE types_join.type_id = 3 AND master_list.length = 18
it finds the same thing twice.
How can I query this so I won't get duplicate answers?
Here are the samples from my tables and the result I am getting.
This is what I get with an INNER JOIN:
BTW, master_id and name both only have unique information on the master_list table. However, the types_join table does use the master_id multiple times later on, but not for Lye. That is why I know it is duplicating information.
If you want unique rows from master_list, use exists:
SELECT ml.*
FROM master_list ml
WHERE ml.length = 18 AND
EXISTS (SELECT 1
FROM types_join tj
WHERE ml.master_id = tj.master_id AND tj.type_id = 3
);
Any duplicates you get will be duplicates in master_list. If you want to remove them, you need to provide more information -- I would recommend a new question.
Thank you for the data. But as you can see at the link, there is nothing wrong with your query.
Have you tried creating a unique index on master_id, just to make sure that you do not have duplicated rows?
CREATE UNIQUE INDEX MyMasterUnique
ON master_list(master_id);
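One possible cause of the doubled row, and this is only a guess since the sample data is not visible here, is that types_join contains the same (master_id, type_id) pair twice. The sketch below sets up that situation in SQLite from Python and compares a plain JOIN with the EXISTS version from the answer above. The table contents are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE master_list (
        master_id INTEGER PRIMARY KEY,
        name TEXT,
        length INTEGER
    );
    CREATE TABLE types_join (master_id INTEGER, type_id INTEGER);

    INSERT INTO master_list VALUES (1, 'Lye', 18);
    -- Assumed problem: the same pair was inserted twice.
    INSERT INTO types_join VALUES (1, 3);
    INSERT INTO types_join VALUES (1, 3);
""")

# A join returns one output row per matching types_join row.
join_rows = conn.execute("""
    SELECT ml.master_id, ml.name
    FROM master_list ml
    JOIN types_join tj ON ml.master_id = tj.master_id
    WHERE tj.type_id = 3 AND ml.length = 18
""").fetchall()

# EXISTS only asks whether at least one match exists, so each
# master_list row appears at most once.
exists_rows = conn.execute("""
    SELECT ml.master_id, ml.name
    FROM master_list ml
    WHERE ml.length = 18
      AND EXISTS (SELECT 1
                  FROM types_join tj
                  WHERE ml.master_id = tj.master_id AND tj.type_id = 3)
""").fetchall()

print(len(join_rows), len(exists_rows))
```

If that turns out to be the cause, a unique index on types_join (master_id, type_id) would stop the duplicates from being stored in the first place.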

Is there any way to know which values in the set of options for a WHERE IN clause were the ones that matched?

I am trying to figure out if this is possible (I think it's not).
I have a query
SELECT ID FROM table WHERE table.someCode IN (code1,code2,code3...)
As a result of this query, I will get all the rows that match this parameter.
My question basically is: is there a way to return which code, or codes, matched? For example: code1,code3 matched?
Thanks
EDIT ----------------------------
For example, I have rows like this
ID 1
name somename
somecode abc
ID 2
name someothername
somecode def
ID 3
name someotherothername
somecode qwer
So, I want to run a select ID from table where somecode IN (abc,asdf,wefwerw,qwer, etc...)
But I also want to know (without looping over each result in application code to collect the codes) which codes from the IN list matched; in my example, abc and qwer.
Any idea?
If you need the select * then you'll have to either have your invoking code figure it out while parsing/looping through results or you'll need another query (eg SELECT distinct(id) FROM table where table.id IN (...), -or- SELECT id, count(*) from table where table.id IN (...) group by id;)
--- EDITED ANSWER:
nah you cant do that. sorry man.
here's examples of some mysql-only things you can do:
(i'll fill this in in a few minutes)
You could select the unique identifier that identifies each of them. Each of them must have a primary key. Your select by printing ID for each of them will print the ones that matched, code1 is printed iff code1 = ID, code2 is printed iff code2 = ID and so on.
UPDATE: Just change to SELECT ID, code.
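As a small sketch of the "select the code as well" idea, using the three sample rows from the question: selecting the distinct somecode values with the same IN list gives the matched codes directly, and the unmatched ones fall out by comparing against the list that was sent. This is Python with SQLite standing in for MySQL, and the table name items is made up because table itself is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, somecode TEXT)"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [
        (1, "somename", "abc"),
        (2, "someothername", "def"),
        (3, "someotherothername", "qwer"),
    ],
)

candidates = ["abc", "asdf", "wefwerw", "qwer"]
placeholders = ", ".join("?" for _ in candidates)

# Same IN list as the original query, but returning the code column.
matched = [
    row[0]
    for row in conn.execute(
        f"SELECT DISTINCT somecode FROM items "
        f"WHERE somecode IN ({placeholders}) ORDER BY somecode",
        candidates,
    )
]
# Codes that were asked for but found in no row.
unmatched = [code for code in candidates if code not in matched]

print(matched, unmatched)
```

If the ids are needed too, selecting id and somecode together in one query gives both without a second round trip.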

The optimal way to store multiple-selection survey answers in a database

I'm currently working on a survey creation/administration web application with PHP/MySQL. I have gone through several revisions of the database tables, and I once again find that I may need to rethink the storage of a certain type of answer.
Right now, I have a table that looks like this:
survey_answers
id PK
eid
sesid
intvalue Nullable
charvalue Nullable
id = unique value assigned to each row
eid = Survey question that this answer is in reply to
sesid = The survey 'session' (information about the time and date of a survey take) id
intvalue = The value of the answer if it is a numerical value
charvalue = the value of the answer if it is a textual representation
This allowed me to continue using MySQL's mathematical functions to speed up processing.
I have however found a new challenge: storing questions that have multiple responses.
An example would be:
Which of the following do you enjoy eating? (choose all that apply)
Girl Scout Cookies
Bacon
Corn
Whale Fat
Now, when I want to store the result, I'm not sure of the best way to handle it.
Currently, I have a table just for multiple choice options that looks like this:
survey_element_options
id PK
eid
value
id = unique value associated with each row
eid = question/element that this option is associated with
value = textual value of that option
With this setup, I then store my returned multiple selection answers in 'survey_answers' as strings of comma separated id's of the element_options rows that were selected in the survey. (ie something like "4,6,7,9") I'm wondering if that is indeed the best solution, or if it would be more practical to create a new table that would hold each answer chosen, and then reference back to a given answer row which in turn references back to the element and ultimately the survey.
EDIT
for anyone interested, here is the approach I ended up taking (In PhpMyAdmin Relations View):
And a rudimentary query to gather the counts for a multiple select question would look like this:
SELECT e.question AS question, eo.value AS value, COUNT(eo.value) AS count
FROM survey_elements e, survey_element_options eo, survey_answer_options ao
WHERE e.id = 19
AND eo.eid = e.id
AND ao.oid = eo.id
GROUP BY eo.value
This really depends on a lot of things.
Generally, storing lists of comma separated values in a database is bad, especially if you plan to do anything remotely intelligent with that data, such as any kind of advanced reporting on the answers.
The best relational way to store this is to also define the answers in a second table and then link them to the user's response to a question in a third table (with multiple entries per user-question, or possibly user-survey-question if the user could take multiple surveys with the same question on it).
This can get slightly complex. A possible scenario, as a simple example:
Example tables:
Users (Username, UserID)
Questions (qID, QuestionsText)
Answers (AnswerText [in this case example could be reusable, but this does cause an extra layer of complexity as well], aID)
Question_Answers ([Available answers for this question, multiple entries per question] qaID, qID, aID),
UserQuestionAnswers (qaID, uID)
Note: Meant as an example, not a recommendation
Convert the primary key to a non-unique index and add all answers for the same question under the same id.
For example.
id | eid | sesid | intval | charval
3 45 30 2
3 45 30 4
You can still add another column for regular unique PK if needed.
Keep things simple. No need for relation here.
It's a horses for courses thing really.
You can store as a comma separated string (but then what happens when you have a literal comma in one of your answers?).
You can store as a one-to-many table, such as:
survey_element_answers
id PK
survey_answers_id FK
intvalue Nullable
charvalue Nullable
And then loop over that table. If you picked one answer, it would create one row in this table. If you pick two answers, it will create two rows in this table, etc. Then you would remove the intvalue and charvalue from the survey_answers table.
Another choice, since you're already storing the element options in their own table, is to create a many-to-many table, such as:
survey_element_answers
id PK
survey_answers_id FK
survey_element_options_id FK
Again, one row per option selected.
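To make the many-to-many variant a bit more concrete, here is a sketch of how counting selections per option might look with one row per selected option. It uses Python with SQLite rather than MySQL, and the element id 19, the session ids and the two sample responses are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE survey_element_options (
        id INTEGER PRIMARY KEY, eid INTEGER, value TEXT
    );
    CREATE TABLE survey_answers (
        id INTEGER PRIMARY KEY, eid INTEGER, sesid INTEGER
    );
    CREATE TABLE survey_element_answers (
        id INTEGER PRIMARY KEY,
        survey_answers_id INTEGER REFERENCES survey_answers(id),
        survey_element_options_id INTEGER REFERENCES survey_element_options(id)
    );

    INSERT INTO survey_element_options VALUES
        (1, 19, 'Girl Scout Cookies'), (2, 19, 'Bacon'),
        (3, 19, 'Corn'), (4, 19, 'Whale Fat');

    -- Two hypothetical sessions answering element 19.
    INSERT INTO survey_answers VALUES (1, 19, 100), (2, 19, 101);

    -- Session 100 picked Bacon and Corn; session 101 picked Bacon.
    INSERT INTO survey_element_answers VALUES (1, 1, 2), (2, 1, 3), (3, 2, 2);
""")

def option_counts(conn, eid):
    """Count how often each option of one element was selected."""
    sql = """
        SELECT eo.value, COUNT(sea.id) AS picks
        FROM survey_element_options eo
        LEFT JOIN survey_element_answers sea
               ON sea.survey_element_options_id = eo.id
        WHERE eo.eid = ?
        GROUP BY eo.id, eo.value
        ORDER BY eo.id
    """
    return conn.execute(sql, (eid,)).fetchall()

counts = option_counts(conn, 19)
print(counts)
```

The LEFT JOIN keeps options nobody picked in the result with a count of zero, which tends to be what a report wants.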
Another option yet again is to store a bitmask value. This will remove the need for a many-to-many table.
survey_element_options
id PK
eid FK
value Text
optionnumber unique for each eid
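optionbitmask 2 ^ (optionnumber - 1)
optionnumber should be unique for each eid, and increment starting with one, so the first option gets bitmask 1. (Here ^ means "to the power of"; in MySQL the ^ operator is XOR, so use POW() or a left shift to compute it.) This will impose a limit of 63 options if you are using bigint, or 31 options if you are using int.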
And then in your survey_answers
id PK
eid
sesid
answerbitmask bigint
Answerbitmask is calculated by adding all of the optionbitmask's together, for each option the user selected. For example, if 7 were stored in Answerbitmask, then that means that the user selected the first three options.
Joins can be done by:
WHERE survey_answers.answerbitmask & survey_element_options.optionbitmask > 0
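Here is a short sketch of the bitmask variant, assuming the first option gets bitmask 1, the second 2, the third 4 and so on. It runs on SQLite from Python instead of MySQL; the & operator behaves the same way for this purpose. The element id, session ids and stored answers are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE survey_element_options (
        id INTEGER PRIMARY KEY, eid INTEGER, value TEXT,
        optionnumber INTEGER, optionbitmask INTEGER
    );
    CREATE TABLE survey_answers (
        id INTEGER PRIMARY KEY, eid INTEGER, sesid INTEGER,
        answerbitmask INTEGER
    );
""")

option_texts = ["Girl Scout Cookies", "Bacon", "Corn", "Whale Fat"]
for number, text in enumerate(option_texts, start=1):
    # Option n owns bit n - 1, so the masks are 1, 2, 4, 8.
    conn.execute(
        "INSERT INTO survey_element_options VALUES (?, ?, ?, ?, ?)",
        (number, 19, text, number, 1 << (number - 1)),
    )

# 7 = 1 + 2 + 4 (first three options); 10 = 2 + 8 (second and fourth).
conn.executemany(
    "INSERT INTO survey_answers VALUES (?, ?, ?, ?)",
    [(1, 19, 100, 7), (2, 19, 101, 10)],
)

def selected_options(conn, answer_id):
    """Return the option texts whose bit is set in the stored answer."""
    sql = """
        SELECT o.value
        FROM survey_answers a
        JOIN survey_element_options o
          ON o.eid = a.eid
         AND (a.answerbitmask & o.optionbitmask) > 0
        WHERE a.id = ?
        ORDER BY o.optionnumber
    """
    return [row[0] for row in conn.execute(sql, (answer_id,))]

print(selected_options(conn, 1))
print(selected_options(conn, 2))
```

One trade-off to keep in mind is that a condition on a bitwise expression generally cannot use an ordinary index, so this layout suits small tables better than heavy reporting.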
So yeah, there's a few options to consider.
If you don't use the id as a foreign key in another query, or if you can query results using the sesid, try a many to one relationship.
Otherwise I'd store multiple choice answers as a serialized array, such as JSON or through php's serialize() function.

Newsletter Categories in one row like 1,2 - Mysql Simple Database Design

I'm using a simple newsletter script where different categories for one user are possible. But I want to get the different categories in one row like 1,2,3
The tables:
newsletter_emails
id email category
1 test#test.com 1
2 test#test.com 2
newsletter_categories
id name
1 firstcategory
2 secondcategory
But what I am looking for is like this:
newsletter_emails
user_id email category
1 test#test.com 1,2
2 person#person.com 1
What's the best solution for this?
PS: The user can select their own categories on the profile page. (maybe with a MySQL UPDATE?)
SQL and the relational data model aren't exactly made for this kind of thing. You can do one of the following:
use a simple SELECT query on the first table, then in your consuming code, iterate over the result, fetching the corresponding rows from the second table and combining them into a string (how you'd do this exactly depends on the language you're using)
use a JOIN on both tables, iterate over the result set and accumulate values from table 2 as long as the ID from table 1 remains the same. This is harder to code than the first solution, and the result set you're pulling from the DB is larger, but you'll get away with just one query.
use DBMS-specific extensions to the SQL standard (e.g. GROUP_CONCAT) to achieve this. You'll get exactly what you asked for, but your SQL queries won't be as portable.
This is a many-to-many relationship case. Instead of having comma separated category ids make an associative table between newsletter_emails and newsletter_categories like user_category having the following schema:
user_id category
1 1
1 2
2 1
This way you won't have to do string processing if a user unsubscribes from a category. You will just have to remove the row from the user_category table.
Try this (completely untested):
SELECT MIN(id) AS user_id, email, GROUP_CONCAT(category) AS category FROM newsletter_emails GROUP BY email ORDER BY user_id ASC;
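Putting the associative-table suggestion and GROUP_CONCAT together, a sketch could look like the following. It keeps one row per (user, category) pair for storage and only builds the 1,2 string when reading. This is Python with SQLite in place of MySQL (both have a GROUP_CONCAT aggregate); the table layout follows the user_category idea above, and the sample rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE newsletter_emails (user_id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE user_category (
        user_id INTEGER,
        category INTEGER,
        PRIMARY KEY (user_id, category)
    );

    INSERT INTO newsletter_emails VALUES
        (1, 'test#test.com'), (2, 'person#person.com');
    INSERT INTO user_category VALUES (1, 1), (1, 2), (2, 1);
""")

def categories_per_user(conn):
    """Return (user_id, email, comma separated categories) per user."""
    # SQLite does not promise an order inside GROUP_CONCAT; in MySQL,
    # GROUP_CONCAT(uc.category ORDER BY uc.category) pins it down.
    sql = """
        SELECT e.user_id, e.email, GROUP_CONCAT(uc.category) AS category
        FROM newsletter_emails e
        JOIN user_category uc ON uc.user_id = e.user_id
        GROUP BY e.user_id, e.email
        ORDER BY e.user_id
    """
    return conn.execute(sql).fetchall()

before = categories_per_user(conn)

# Unsubscribing is a plain DELETE of one row, with no string editing.
conn.execute("DELETE FROM user_category WHERE user_id = ? AND category = ?", (1, 2))
after = categories_per_user(conn)

print(before)
print(after)
```

Subscribing from the profile page would then be an INSERT into user_category rather than an UPDATE of a comma-separated column.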