Storing boolean options in SQL - mysql

I am writing code (using MySQL) to solve a problem similar to the following:
There are 20 boolean options (per every user).
Should I store 20 ENUM('false','true') or put into a table only IDs of these options which are true (so probably having less than 20 rows per user)?

If new options are likely to appear and you don't filter by the options, you may as well go with an EAV structure (a record per option).
This way, you can add new options more easily (no change to metadata).
Assuming that the option values are either TRUE or FALSE (no NULL possible), you should create records only for non-default option values (TRUE in your case). The absence of a record would mean FALSE.
To retrieve all options, you could use this:
SELECT u.id AS user_id, GROUP_CONCAT(CONCAT(o.id, ': ', ov.user IS NOT NULL) ORDER BY o.id SEPARATOR ', ') AS options
FROM users u
CROSS JOIN
options o
LEFT JOIN
option_value ov
ON (ov.user, ov.option) = (u.id, o.id)
GROUP BY
u.id
This would give you dynamic output:
user_id options
1 1: 0, 2: 1, 3: 0
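
As a rough, runnable illustration of the "record only for TRUE" idea, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. The column names user_id and option_id and the sample option names are assumptions made for the example, and the per-user string is assembled in Python instead of with GROUP_CONCAT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE options (id INTEGER PRIMARY KEY, name TEXT);
    -- One row per option that is TRUE for a user; no row means FALSE.
    CREATE TABLE option_value (
        user_id INTEGER REFERENCES users(id),
        option_id INTEGER REFERENCES options(id),
        PRIMARY KEY (user_id, option_id)
    );
    INSERT INTO users VALUES (1, 'alice');
    INSERT INTO options VALUES (1, 'newsletter'), (2, 'dark_mode'), (3, 'beta');
    INSERT INTO option_value VALUES (1, 2);  -- only option 2 is TRUE
""")

# CROSS JOIN lists every (user, option) pair; the LEFT JOIN finds the TRUE ones.
rows = conn.execute("""
    SELECT u.id, o.id, ov.user_id IS NOT NULL
    FROM users u
    CROSS JOIN options o
    LEFT JOIN option_value ov
           ON ov.user_id = u.id AND ov.option_id = o.id
    ORDER BY u.id, o.id
""").fetchall()

options_by_user = {}
for user_id, option_id, is_set in rows:
    options_by_user.setdefault(user_id, {})[option_id] = bool(is_set)

print(options_by_user)
```

Adding a fourth option later only needs a new row in options; no existing table has to be altered.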

I'd suggest creating an Options table with the different options.
+---Options---+
ID
Option
+---Users---+
ID
Name
+---User_Options---+
User_id
Option_id
Now if you need more options, insert them into the Options table; you don't need to alter your database this way.
EDIT: Removed the condition in User_Options: as Quassnoi mentioned, it would be better to just add records in case of "TRUE", and the absence of a record should be considered "FALSE".

I would recommend storing it as a TINYINT with 0 or 1. Many frameworks work out of the box with the TINYINT data type and handle it as a boolean.

Create three tables. The first is 'user_table'; it contains username and user_id.
Now create another table called 'options_table'. It contains option_name and option_id for each option.
Now create a third table called 'selected_options', which maps users to options. It contains user_id and option_id.
In the sample data, user1 selected option1 and option2, and user2 selected option1, option2 and option3; i.e. option1, option2 and option3 are true for user2.
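
A minimal sketch of how the three tables might be used, again with Python's sqlite3 module standing in for MySQL. The helper names set_option and selected are hypothetical, and INSERT OR IGNORE is SQLite syntax (MySQL spells it INSERT IGNORE). Setting an option to true inserts a mapping row; setting it to false deletes the row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_table (user_id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE options_table (option_id INTEGER PRIMARY KEY, option_name TEXT);
    CREATE TABLE selected_options (
        user_id INTEGER REFERENCES user_table(user_id),
        option_id INTEGER REFERENCES options_table(option_id),
        PRIMARY KEY (user_id, option_id)
    );
    INSERT INTO user_table VALUES (1, 'user1'), (2, 'user2');
    INSERT INTO options_table VALUES (1, 'option1'), (2, 'option2'), (3, 'option3');
    INSERT INTO selected_options VALUES (1, 1), (1, 2), (2, 1), (2, 2), (2, 3);
""")


def set_option(user_id, option_id, value):
    """TRUE is stored as a row; FALSE is stored as the absence of a row."""
    if value:
        conn.execute(
            "INSERT OR IGNORE INTO selected_options VALUES (?, ?)",
            (user_id, option_id),
        )
    else:
        conn.execute(
            "DELETE FROM selected_options WHERE user_id = ? AND option_id = ?",
            (user_id, option_id),
        )


def selected(user_id):
    """Return the names of the options that are true for this user."""
    query = """
        SELECT o.option_name
        FROM selected_options s
        JOIN options_table o ON o.option_id = s.option_id
        WHERE s.user_id = ?
        ORDER BY o.option_id
    """
    return [name for (name,) in conn.execute(query, (user_id,))]


set_option(1, 3, True)   # user1 turns option3 on
set_option(2, 1, False)  # user2 turns option1 off
print(selected(1), selected(2))
```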

I would recommend using a bitmask column. If you have numerous options, rather than creating a new column per option, you would be able to quickly perform bit-wise comparisons.
For additional info, see:
SELECT users from MySQL database by privileges bitmask?
Implement bitmask or relational ACL in PHP
Using bitmasks to indicate status
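
To make the bitmask idea concrete, here is a small sketch in plain Python. The option names are invented for illustration; the integer in stored is what would be saved in the single bitmask column.

```python
# Each option gets its own bit; the names are hypothetical examples.
NEWSLETTER = 1 << 0     # 1
DARK_MODE = 1 << 1      # 2
BETA_FEATURES = 1 << 2  # 4

# Combine options with bitwise OR to get the value stored in the column.
stored = NEWSLETTER | BETA_FEATURES
initial = stored

# Test a single option with bitwise AND.
has_beta = bool(stored & BETA_FEATURES)
has_dark_mode = bool(stored & DARK_MODE)

# Turn one option on and another off.
stored |= DARK_MODE
stored &= ~NEWSLETTER

print(initial, has_beta, has_dark_mode, stored)
```

The same AND test can be written in a WHERE clause, for example WHERE (options & 4) > 0, although such a condition generally cannot use an ordinary index on the column.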


SQL IN Clause only returning rows with first match in comma separated list of IDs

I have 5 users which have a column 'shop_access' (which is a list of shop IDs eg: 1,2,3,4)
I am trying to get all users from the DB which have a shop ID (eg. 2) in their shop_access
Current Query:
SELECT * FROM users WHERE '2' IN (shop_access)
BUT, it only returns users which have shop_access starting with the number 2.
E.g
User 1 manages shops 1,2,3
User 2 manages shops 2,4,5
User 3 manages shops 1,3,4
User 4 manages shops 2,3
The only ones returned when running the IN clause are User 2 and User 4.
User 1 is ignored (which it shouldn't as it has number 2 in the list) as it does not start with the number 2.
I'm not in a position to currently go back and change the way this is set up, eg convert it to JSON and handle it with PHP first, so if someone can try to make this work without having to change the column data (shop_access) that would be ideal.
A portable solution is to use LIKE:
where concat(',', shop_access, ',') like '%,2,%'
Or if the value to search for is given as a parameter:
where concat(',', shop_access, ',') like concat('%,', ?, ',%')
Depending on your database, there may be neater options available. In MySQL:
where find_in_set('2', shop_access)
That said, I would highly recommend fixing your data model. Storing CSV data in a database defeats the purpose of a relational database in many ways. You should have a separate table to store the user/shop relations, with each tuple on a separate row. Recommended reading: Is storing a delimited list in a database column really that bad?
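
The delimiter-wrapping LIKE test can be tried out with the sketch below, which uses Python's sqlite3 module in place of MySQL (SQLite concatenates with || where MySQL uses concat()). The first four rows come from the question; the fifth user with shops 12 and 20 is an invented row added to show why the commas matter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, shop_access TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [
        (1, "1,2,3"),
        (2, "2,4,5"),
        (3, "1,3,4"),
        (4, "2,3"),
        (5, "12,20"),  # hypothetical row: contains the digit 2 but not shop 2
    ],
)

shop_id = "2"

# Wrap both the list and the ID in commas so only whole IDs can match.
wrapped = conn.execute(
    "SELECT id FROM users "
    "WHERE ',' || shop_access || ',' LIKE '%,' || ? || ',%' "
    "ORDER BY id",
    (shop_id,),
).fetchall()
matches = [row[0] for row in wrapped]

# A naive substring search also matches shops 12 and 20.
naive = conn.execute(
    "SELECT id FROM users WHERE shop_access LIKE '%' || ? || '%' ORDER BY id",
    (shop_id,),
).fetchall()
naive_matches = [row[0] for row in naive]

print(matches, naive_matches)
```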
Also, you might want to consider using REGEXP here as an option:
SELECT *
FROM users
WHERE shop_access REGEXP '[[:<:]]2[[:>:]]';
-- [[:<:]] and [[:>:]] are word boundaries (MySQL 8.0.4 and later use ICU regular expressions, where the equivalent is \\b)
Another option is to test each possible position of the ID explicitly:
SELECT * FROM users WHERE shop_access = '2' OR shop_access LIKE '2,%' OR shop_access LIKE '%,2,%' OR shop_access LIKE '%,2'

SELECT grouping by value in field

Given the following (greatly simplified) example table:
CREATE TABLE `permissions` (
`name` varchar(64) NOT NULL DEFAULT '',
`access` enum('read_only','read_write') NOT NULL DEFAULT 'read_only'
);
And the following example contents:
| name | access     |
=====================
| foo  | read_only  |
| foo  | read_write |
| bar  | read_only  |
What I want to do is run a SELECT query that fetches one row for each unique value in name, favouring those with an access value of read_write. Is there a way that this can be done? That is, the results I would get are:
foo | read_write |
bar | read_only |
I may need to add new options to the access column in future, but they will always be in order of importance (lowest to highest) so, if possible, a solution that can cope with this would be especially useful.
Also, to clarify, my actual table includes other fields than these, which is why I'm not using a unique key on the name column; there will be multiple rows by name by design to suit various criteria.
The following will work on your data:
select name, max(access)
from permissions
group by name;
However, this orders by the string values, not the indexes. Here is another method:
select name,
substring_index(group_concat(access order by access desc), ',', 1) as access
from permissions
group by name;
It is rather funky that order by goes by the index but min() and max() use the character value. Some might even call that a bug.
You can create another table with the priority of the access (so you can add new options), and then group by and find the MIN() value of the priority table:
E.g. create a table called Priority with the values
| PriorityID | access     |
===========================
| 1          | read_write |
| 2          | read_only  |
And then,
SELECT A.Name, B.Access
FROM (
SELECT A.name, MIN(B.PriorityID) AS Most_Valued_Option -- This will be 1 if there is a read_write for that name
FROM permissions A
INNER JOIN Priority B
ON A.Access = B.Access
GROUP BY A.Name ) A
INNER JOIN Priority B
ON A.Most_Valued_Option = B.PriorityID
-- Join that ID with the actual access
-- (and we will select the value of the access in the select statement)
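
The priority-table approach can be checked end to end with the following sketch, which runs an equivalent query against SQLite through Python's sqlite3 module. The lower-case table and column names (priority, priority_id) are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE permissions (name TEXT NOT NULL, access TEXT NOT NULL);
    INSERT INTO permissions VALUES
        ('foo', 'read_only'), ('foo', 'read_write'), ('bar', 'read_only');

    -- Lower priority_id means more important; new access levels are new rows.
    CREATE TABLE priority (priority_id INTEGER PRIMARY KEY, access TEXT NOT NULL);
    INSERT INTO priority VALUES (1, 'read_write'), (2, 'read_only');
""")

result = conn.execute("""
    SELECT best.name, pr.access
    FROM (
        -- Most important priority_id found for each name
        SELECT p.name, MIN(pr.priority_id) AS best_id
        FROM permissions p
        JOIN priority pr ON pr.access = p.access
        GROUP BY p.name
    ) AS best
    JOIN priority pr ON pr.priority_id = best.best_id
    ORDER BY best.name
""").fetchall()

print(result)
```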
The solution proposed by Gordon is sufficient for the current requirements.
If we anticipate a future requirement for a priority order to be other than alphabetical string order (or by enum index value)...
As a modified version of Gordon's answer, I would be tempted to use the MySQL FIELD function and (its converse) ELT function, something like this:
SELECT p.name
, ELT(
MIN(
FIELD(p.access
,'read_only','read_write','read_some'
)
)
,'read_only','read_write','read_some'
) AS access
FROM `permissions` p
GROUP BY p.name
If the specification is to pull the entire row, and not just the value of the access column, we could use an inline view query to find the preferred access, and a join back to the permissions table to pull the whole row...
SELECT p.*
FROM ( -- inline view, to get the highest priority value of access
SELECT r.name
, MIN(FIELD(r.access,'read_only','read_write','read_some')) AS ax
FROM `permissions` r
GROUP BY r.name
) q
JOIN `permissions` p
ON p.name = q.name
AND p.access = ELT(q.ax,'read_only','read_write','read_some')
Note that this query returns not just the access with the highest priority, but can also return any columns from that row.
With the FIELD and ELT functions, we can implement any ad-hoc ordering of a list of specific, known values. Not just alphabetic ordering, or ordering by the enum index value.
That logic for "priority" can be contained within the query, and won't rely on an extra column(s) in the permissions table, or the contents of any other table(s).
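To get the behavior we are looking for, just specifying a priority for access, the "list of values" used in the FIELD function needs to match the "list of values" in the ELT function, in the same order, and the lists should include all possible values of access. Because the queries take MIN() of the FIELD position, the value listed first has the highest priority; to favour 'read_write' over 'read_only', list 'read_write' before 'read_only'.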
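
For readers without a MySQL instance at hand, the FIELD/ELT round trip can be imitated in plain Python. The field and elt functions below are illustrative re-implementations written for this sketch, not MySQL APIs (MySQL's FIELD compares strings according to the collation, while this version uses exact equality), and the priority list puts 'read_write' first because the question asks for it to be favoured.

```python
def field(value, *candidates):
    """Imitates MySQL FIELD(): 1-based position of value, or 0 if not listed."""
    for position, candidate in enumerate(candidates, start=1):
        if candidate == value:
            return position
    return 0


def elt(n, *candidates):
    """Imitates MySQL ELT(): the n-th candidate (1-based), or None if out of range."""
    if 1 <= n <= len(candidates):
        return candidates[n - 1]
    return None


# Highest priority first, because the smallest position is kept per name.
PRIORITY = ("read_write", "read_only")

rows = [("foo", "read_only"), ("foo", "read_write"), ("bar", "read_only")]

# Equivalent of: SELECT name, MIN(FIELD(access, ...)) ... GROUP BY name
best_position = {}
for name, access in rows:
    position = field(access, *PRIORITY)
    best_position[name] = min(position, best_position.get(name, position))

# Equivalent of wrapping the minimum in ELT(...) to get the value back.
result = {name: elt(position, *PRIORITY) for name, position in best_position.items()}
print(result)
```

A value that is missing from the list makes field() return 0, which then wins the minimum and maps back to None; that is one reason the lists should cover every possible value.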
Reference:
http://dev.mysql.com/doc/refman/5.7/en/string-functions.html#function_elt
http://dev.mysql.com/doc/refman/5.7/en/string-functions.html#function_field
ADVANCED USAGE
Not that you have a requirement to do this, but considering possible future requirements... we note that...
A different order of the "list of values" will result in a different ordering of priority of access. So a variety of queries could each implement their own different rules for the "priority". Which access value to look for first, second and so on, by reordering the complete "list of values".
Beyond just reordering, it is also possible to omit a possible value from the "list of values" in the FIELD and ELT functions. Consider for example, omitting the 'read_only' value from the list on this line:
, MIN(FIELD(r.access,'read_write','read_some')) AS ax
and from this line:
AND p.access = ELT(q.ax,'read_write','read_some')
That will effectively limit the name rows returned: only names that have an access value of 'read_write' or 'read_some' can be returned. Put another way, a name that has only a 'read_only' row for access will not be returned by the query.
Other modifications to the "list of values", where the lists don't "match" are also possible, to implement even more powerful rules. For example, we could exclude a name that has a row with 'read_only'.
For example, in the ELT function, in place of the 'read_only' value, we use a value that we know does not (and cannot) exist on any rows. To illustrate,
we can include 'read_only' as the "highest priority" on this line...
, MIN(FIELD(r.access,'read_only','read_write','read_some')) AS ax
                     ^^^^^^^^^^^
so if a row with 'read_only' is found, that will take priority. But in the ELT function in the outer query, we can translate that back to a different value...
AND p.access = ELT(q.ax,'eXcluDe','read_write','read_some')
                        ^^^^^^^^^
If we know that 'eXcluDe' doesn't exist in the access column, we have effectively excluded any name which has a 'read_only' row, even if there is a 'read_write' row.
Not that you have a specification or current requirement to do any of that. Something to keep in mind for future queries that do have these kinds of requirements.
You can use distinct statement (or Group by)
SELECT distinct name, access
FROM tab;
This works too:
SELECT name, MAX(access)
FROM permissions
GROUP BY name ORDER BY MAX(access) desc

What's the difference between Rails active record's select and group?

I've been reading through tutorials on Rails' active record model operations. And I'm a little confused on the difference between .select and .group. If I wanted to get all the names of all my users in table User I believe I could do:
myUsers = User.select(:name)
so how would that be different from saying:
myUsers = User.group(:name)
thanks,
Will
The two differ like this:
User.select(:name)
is equivalent to this SQL statement
SELECT name from users;
and
User.group(:name)
is equivalent to
SELECT * from users GROUP BY name;
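The difference is that with select(:name) you get every row, but only the name column. With group(:name) you get all columns, but the rows are collapsed to one per distinct name. Databases that do not allow selecting non-grouped columns (for example PostgreSQL, or MySQL with ONLY_FULL_GROUP_BY enabled) will reject that query unless you also narrow the select list.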
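
The effect of the two generated SQL statements can be seen by running them directly. The sketch below does so with Python's sqlite3 module; SQLite is used because it accepts SELECT * together with GROUP BY, and the three sample users are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
    INSERT INTO users VALUES
        (1, 'ann', 'ann@example.com'),
        (2, 'bob', 'bob@example.com'),
        (3, 'ann', 'ann2@example.com');
""")

# What User.select(:name) boils down to: every row, one column.
selected_names = [row[0] for row in conn.execute("SELECT name FROM users ORDER BY id")]

# What User.group(:name) boils down to: all columns, one row per distinct name.
grouped_rows = conn.execute("SELECT * FROM users GROUP BY name").fetchall()
grouped_names = sorted(row[1] for row in grouped_rows)

print(selected_names)
print(grouped_names)
```

Which of the two 'ann' rows survives the grouping is up to the database, so the grouped result should not be relied on for the other columns.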
User.pluck(:name) will be the fastest way to pull all the names from your db.
There is a #to_sql method to check what DB query is being built. By looking at the query, you can confirm for yourself what is going on. Look at the example below:
arup#linux-wzza:~/Rails/tv_sms_voting> rails c
Loading development environment (Rails 4.1.4)
>> Vote.group(:choice).to_sql
=> "SELECT \"votes\".* FROM \"votes\" GROUP BY choice"
>> Vote.select(:choice).to_sql
=> "SELECT \"votes\".\"choice\" FROM \"votes\""
>>
Now it is clear that Vote.select(:choice) is actually, SELECT "votes"."choice" FROM "votes", which means, select choice column from all rows of the table votes.
Vote.group(:choice) is grouping the rows of the votes table, based on the column choice and selecting all columns.
Regarding "If I wanted to get all the names of all my users in table User":
User.pluck(:name) is the better choice.

The optimal way to store multiple-selection survey answers in a database

I'm currently working on a survey creation/administration web application with PHP/MySQL. I have gone through several revisions of the database tables, and I once again find that I may need to rethink the storage of a certain type of answer.
Right now, I have a table that looks like this:
survey_answers
id PK
eid
sesid
intvalue Nullable
charvalue Nullable
id = unique value assigned to each row
eid = Survey question that this answer is in reply to
sesid = The survey 'session' (information about the time and date of a survey take) id
intvalue = The value of the answer if it is a numerical value
charvalue = the value of the answer if it is a textual representation
This allowed me to continue using MySQL's mathematical functions to speed up processing.
I have however found a new challenge: storing questions that have multiple responses.
An example would be:
Which of the following do you enjoy eating? (choose all that apply)
Girl Scout Cookies
Bacon
Corn
Whale Fat
Now, when I want to store the result, I'm not sure of the best way to handle it.
Currently, I have a table just for multiple choice options that looks like this:
survey_element_options
id PK
eid
value
id = unique value associated with each row
eid = question/element that this option is associated with
value = textual value of that option
With this setup, I then store my returned multiple selection answers in 'survey_answers' as strings of comma-separated IDs of the element_options rows that were selected in the survey (i.e. something like "4,6,7,9"). I'm wondering if that is indeed the best solution, or if it would be more practical to create a new table that holds each chosen answer and references back to a given answer row, which in turn references back to the element and ultimately the survey.
EDIT
for anyone interested, here is the approach I ended up taking (In PhpMyAdmin Relations View):
And a rudimentary query to gather the counts for a multiple select question would look like this:
SELECT e.question AS question, eo.value AS value, COUNT(eo.value) AS count
FROM survey_elements e, survey_element_options eo, survey_answer_options ao
WHERE e.id = 19
AND eo.eid = e.id
AND ao.oid = eo.id
GROUP BY eo.value
This really depends on a lot of things.
Generally, storing lists of comma-separated values in a database is bad, especially if you plan to do anything remotely intelligent with that data, such as any kind of advanced reporting on the answers.
The best relational way to store this is to also define the answers in a second table and then link them to the user's response to a question in a third table (with multiple entries per user-question, or possibly user-survey-question if the user could take multiple surveys with the same question on it).
This can get slightly complex. A simple example of a possible scenario:
Example tables:
Users (Username, UserID)
Questions (qID, QuestionsText)
Answers (AnswerText [in this case example could be reusable, but this does cause an extra layer of complexity as well], aID)
Question_Answers ([Available answers for this question, multiple entries per question] qaID, qID, aID),
UserQuestionAnswers (qaID, uID)
Note: Meant as an example, not a recommendation
Convert the primary key to a non-unique index and add the answers to the same question under the same id.
For example.
id | eid | sesid | intval | charval
3  | 45  | 30    | 2      |
3  | 45  | 30    | 4      |
You can still add another column for regular unique PK if needed.
Keep things simple. No need for relation here.
It's a horses for courses thing really.
You can store as a comma-separated string (but then what happens when you have a literal comma in one of your answers?).
You can store as a one-to-many table, such as:
survey_element_answers
id PK
survey_answers_id FK
intvalue Nullable
charvalue Nullable
And then loop over that table. If you picked one answer, it would create one row in this table. If you pick two answers, it will create two rows in this table, etc. Then you would remove the intvalue and charvalue from the survey_answers table.
Another choice, since you're already storing the element options in their own table, is to create a many-to-many table, such as:
survey_element_answers
id PK
survey_answers_id FK
survey_element_options_id FK
Again, one row per option selected.
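
A minimal sketch of the many-to-many variant, using Python's sqlite3 module in place of MySQL. The question id 19 is borrowed from the query in the question, and the two survey sessions and their selections are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE survey_element_options (id INTEGER PRIMARY KEY, eid INTEGER, value TEXT);
    CREATE TABLE survey_answers (id INTEGER PRIMARY KEY, eid INTEGER, sesid INTEGER);
    -- One row per option selected within an answer.
    CREATE TABLE survey_element_answers (
        id INTEGER PRIMARY KEY,
        survey_answers_id INTEGER REFERENCES survey_answers(id),
        survey_element_options_id INTEGER REFERENCES survey_element_options(id)
    );

    INSERT INTO survey_element_options VALUES
        (1, 19, 'Girl Scout Cookies'), (2, 19, 'Bacon'),
        (3, 19, 'Corn'), (4, 19, 'Whale Fat');
    INSERT INTO survey_answers VALUES (1, 19, 100), (2, 19, 101);
    -- Session 100 picked Bacon and Corn; session 101 picked Bacon only.
    INSERT INTO survey_element_answers VALUES (1, 1, 2), (2, 1, 3), (3, 2, 2);
""")

# LEFT JOIN keeps options that nobody selected, with a count of zero.
counts = conn.execute("""
    SELECT eo.value, COUNT(ea.id)
    FROM survey_element_options eo
    LEFT JOIN survey_element_answers ea
           ON ea.survey_element_options_id = eo.id
    WHERE eo.eid = 19
    GROUP BY eo.id
    ORDER BY eo.id
""").fetchall()

print(counts)
```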
Another option yet again is to store a bitmask value. This will remove the need for a many-to-many table.
survey_element_options
id PK
eid FK
value Text
optionnumber unique for each eid
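optionbitmask 2 to the power of optionnumber (1 << optionnumber; in MySQL, ^ is XOR)
optionnumber should be unique for each eid, and increment starting with zero, so that the first option gets bitmask 1. This imposes a limit of 63 options if you are using a signed bigint, or 31 options if you are using a signed int.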
And then in your survey_answers
id PK
eid
sesid
answerbitmask bigint
Answerbitmask is calculated by adding all of the optionbitmask's together, for each option the user selected. For example, if 7 were stored in Answerbitmask, then that means that the user selected the first three options.
Joins can be done by:
WHERE survey_answers.answerbitmask & survey_element_options.optionbitmask > 0
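
The bitmask join can be tried with the sketch below, which uses Python's sqlite3 module as a stand-in for MySQL. It assumes optionnumber starts at 0 so that optionbitmask equals 1 << optionnumber and a stored value of 7 means the first three options; the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE survey_element_options (
        id INTEGER PRIMARY KEY, eid INTEGER, value TEXT,
        optionnumber INTEGER, optionbitmask INTEGER
    );
    CREATE TABLE survey_answers (
        id INTEGER PRIMARY KEY, eid INTEGER, sesid INTEGER, answerbitmask INTEGER
    );
""")

options = ["Girl Scout Cookies", "Bacon", "Corn", "Whale Fat"]
for number, value in enumerate(options):
    # optionbitmask is 1, 2, 4, 8 for optionnumber 0, 1, 2, 3
    conn.execute(
        "INSERT INTO survey_element_options (eid, value, optionnumber, optionbitmask) "
        "VALUES (?, ?, ?, ?)",
        (19, value, number, 1 << number),
    )

# 7 = 1 + 2 + 4: the first three options were selected in this session.
conn.execute(
    "INSERT INTO survey_answers (eid, sesid, answerbitmask) VALUES (19, 100, 7)"
)

chosen = [
    row[0]
    for row in conn.execute("""
        SELECT eo.value
        FROM survey_answers sa
        JOIN survey_element_options eo
          ON eo.eid = sa.eid
         AND (sa.answerbitmask & eo.optionbitmask) > 0
        WHERE sa.sesid = 100
        ORDER BY eo.optionnumber
    """)
]

print(chosen)
```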
So yeah, there's a few options to consider.
If you don't use the id as a foreign key in another query, or if you can query results using the sesid, try a many to one relationship.
Otherwise I'd store multiple choice answers as a serialized array, such as JSON or through PHP's serialize() function.

Multiple "where" -s from one table into one view

I have a table called "users" with 4 fields: ID, UNAME, NAME, SHOW_NAME.
I wish to put this data into one view so that if SHOW_NAME is not set, "UNAME" should be selected as "NAME", otherwise "NAME".
My current query:
SELECT id AS id, uname AS name
FROM users
WHERE show_name != 1
UNION
SELECT id AS id, name AS name
FROM users
WHERE show_name = 1
This generally works, but it does seem to lose the primary key (NaviCat telling me "users_view does not have a primary key...") - which I think is bad.
Is there a better way?
That should be fine. I'm not sure why it's complaining about the loss of a primary key.
I will offer one piece of advice. When you know that there can be no duplicates in your union (such as the two parts being when x = 1 and when x != 1), you should use union all.
The union clause will attempt to remove duplicates which, in this case, is a waste of time.
If you want more targeted assistance, it's probably best if you post the details of the view and the underlying table. Views themselves don't tend to have primary keys or indexes, relying instead on the underlying tables.
So this may well be a problem with your "NaviCat" product (whatever that is) expecting to see a primary key (in other words, it's not built very well for views).
If I am understanding your question correctly, you should be able to just use a CASE expression like the one below for your logic:
SELECT
CASE WHEN SHOW_NAME = 1 THEN NAME ELSE UNAME END AS NAME
FROM users
This can likely be better written as the following:
SELECT id AS id, IF(show_name = 1, name, uname) AS name
FROM users
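
As a quick check of the single-query approach, the sketch below builds the view with a CASE expression in SQLite via Python's sqlite3 module (CASE is used because SQLite has no IF() function). The sample users are invented, including one whose show_name is NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY, uname TEXT, name TEXT, show_name INTEGER
    );
    INSERT INTO users VALUES
        (1, 'jdoe', 'John Doe', 1),
        (2, 'asmith', 'Alice Smith', 0),
        (3, 'nobody', 'No Body', NULL);

    -- One pass over the table; no UNION needed.
    CREATE VIEW users_view AS
    SELECT id,
           CASE WHEN show_name = 1 THEN name ELSE uname END AS name
    FROM users;
""")

view_rows = conn.execute("SELECT id, name FROM users_view ORDER BY id").fetchall()

# The UNION version from the question, for comparison.
union_ids = [
    row[0]
    for row in conn.execute("""
        SELECT id FROM users WHERE show_name != 1
        UNION ALL
        SELECT id FROM users WHERE show_name = 1
        ORDER BY id
    """)
]

print(view_rows)
print(union_ids)
```

One behavioural difference to be aware of: a row whose show_name is NULL matches neither branch of the UNION and disappears, whereas the CASE version keeps it and falls back to uname.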