Integer values for status fields - mysql

Often I find myself creating 'status' fields for database tables. I set these up as TINYINT(1), as more often than not I only need a handful of status values. I cross-reference these values to array lookups in my code; an example is as follows:
0 - Pending
1 - Active
2 - Denied
3 - On Hold
This all works very well, except I'm now trying to create better database structures and realise that from a database point of view, these integer values don't actually mean anything.
Now a solution to this may be to create separate tables for statuses - but there could be several status columns across the database and to have separate tables for each status column seems a bit of overkill? (I'd like each status to start from zero - so having one status table for all statuses wouldn't be ideal for me).
Another option is to use the ENUM data type - but there are mixed opinions on this. I see many people not recommending to use ENUM fields.
So what would be the way to go? Do I absolutely need to be putting this data in to its own table?

I think the best approach is to have a single status table for each kind of status. For example, order_status ("placed", "paid", "processing", "completed") is qualitatively different from contact_status ("received", "replied", "resolved"), but the latter might work just as well for customer contacts as for supplier contacts.
This is probably already what you're doing — it's just that your "tables" are in-memory arrays rather than database tables.
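To make that concrete, here is a minimal sketch of moving the in-memory array into a lookup table. It uses Python's built-in sqlite3 as a stand-in for MySQL so it runs anywhere; the table names account_status and accounts are made up for illustration, and only the four status values come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    -- Hypothetical lookup table holding what used to be the in-code array.
    CREATE TABLE account_status (
        id   INTEGER PRIMARY KEY,   -- a TINYINT column would do in MySQL
        name TEXT NOT NULL UNIQUE
    );
    INSERT INTO account_status (id, name) VALUES
        (0, 'Pending'), (1, 'Active'), (2, 'Denied'), (3, 'On Hold');

    -- Hypothetical main table; status_id must exist in the lookup table.
    CREATE TABLE accounts (
        id        INTEGER PRIMARY KEY,
        status_id INTEGER NOT NULL DEFAULT 0 REFERENCES account_status (id)
    );
    INSERT INTO accounts (id, status_id) VALUES (1, 1), (2, 3);
""")


def status_names():
    """Return (account id, status name) pairs resolved through the lookup table."""
    return conn.execute(
        "SELECT a.id, s.name FROM accounts AS a "
        "JOIN account_status AS s ON s.id = a.status_id ORDER BY a.id"
    ).fetchall()


def accepts_status(account_id, status_id):
    """Try to insert an account; report whether the database accepted it."""
    try:
        conn.execute("INSERT INTO accounts (id, status_id) VALUES (?, ?)",
                     (account_id, status_id))
        return True
    except sqlite3.IntegrityError:
        return False
```

With this in place the integers mean something to the database too: a status of 9 is rejected by the foreign key rather than silently stored.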

I agree with "ruakh" on creating another table structured as (id, statusName). I would add that for such a table you can still use TINYINT for the id field. A signed TINYINT accepts values from -128 to 127 (0 to 255 when declared UNSIGNED), which would cover all the status cases you might need. The (1) in TINYINT(1) is only a display width and does not change that range.

Can you add (or remove) a status value without changing code?
If yes, then consider a separate lookup table for each status "type". You are already treating this data in a generic way in your code, so you should have a generic data structure for it.
If no, then keep the ENUM (or well-documented integer). You are treating each value in a special way, so there isn't much purpose in trying to generalize the data model.
(I'd like each status to start from zero - so having one status table for all statuses wouldn't be ideal for me)
You should never mix several distinct sets of values within the same lookup table (regardless of your "zero issue"). Reasons:
A simple FOREIGN KEY alone won't be able to prevent referencing a value from the wrong set.
All values are forced into the same type, which may not always be desirable.
That's such a common anti-pattern that it even has a name: "one true lookup table".
Instead, keep each lookup "type" within a separate table. That way, FKs work predictably and you can tweak datatypes as necessary.
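A rough sketch of the first reason, again using Python's sqlite3 in place of MySQL. The table names lookup, orders_otlt and orders are invented for the demonstration; the status names are borrowed from the other answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Anti-pattern: every status set shares one lookup table.
    CREATE TABLE lookup (
        id   INTEGER PRIMARY KEY,
        kind TEXT NOT NULL,          -- 'order' or 'contact'
        name TEXT NOT NULL
    );
    INSERT INTO lookup VALUES (1, 'order', 'paid'), (2, 'contact', 'replied');
    CREATE TABLE orders_otlt (
        id        INTEGER PRIMARY KEY,
        status_id INTEGER NOT NULL REFERENCES lookup (id)
    );

    -- One lookup table per status type.
    CREATE TABLE order_status (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    INSERT INTO order_status VALUES (0, 'placed'), (1, 'paid');
    CREATE TABLE orders (
        id        INTEGER PRIMARY KEY,
        status_id INTEGER NOT NULL REFERENCES order_status (id)
    );
""")


def accepts(sql):
    """Run an INSERT and report whether the constraints let it through."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Status 2 is a *contact* status, yet the shared table's FK is satisfied.
otlt_accepts_contact_status = accepts("INSERT INTO orders_otlt VALUES (1, 2)")
# With a dedicated table, 2 is simply not an order status, so the FK rejects it.
separate_accepts_unknown = accepts("INSERT INTO orders VALUES (1, 2)")
```

You could patch the shared table with a composite key on (id, kind), but at that point you are rebuilding separate tables the hard way.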

Related

Best database table design for a table with dependent column values

I would like to know the best way of designing a table structure for dependent column values.
If I have a scenario like this:
if the status of the field is alive, there is nothing to do;
if the status is died, some other column values are stored somehow.
What is the best way to handle this situation:
to create a table containing all the columns, i.e. 'Died in the hospital', 'Cause of death', 'Date of Death' and 'Please narrate the event', and let them be NULL when the status is alive,
or
to use a separate table for storing all the other attributes, using the entity-attribute-value (EAV) concept?
In the above scenario, signs and symptoms may be single, multiple, or "others" with a specification. How should this be stored?
What is the best way for performance and querying:
either to provide 15 columns in a single table and store NULL if there is no value, or to store a foreign key of symptoms in another table (and in this strategy, how should the "other symptom" description column be stored)?
In general, if you know what the columns are, you should include those in the table. So, a table with columns such as: died_in_hospital, cause_of_death, and so on seems like a reasonable solution.
Entity-attribute-value models are useful under two circumstances:
The attributes are not known and new ones are added over time.
The number of attributes is so large and sparsely populated that most columns would be NULL.
In your case, you know the attributes, so you should put them into a table as columns.
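As a sketch of the plain-columns approach (run here with Python's sqlite3; the table name patient, the column event_narrative and the sample row are my own invention), a CHECK constraint can also stop death details from being filled in for a living patient. Note that MySQL only enforces CHECK from 8.0.16 onwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient (
        id               INTEGER PRIMARY KEY,
        status           TEXT NOT NULL CHECK (status IN ('alive', 'died')),
        died_in_hospital INTEGER,   -- 0/1, NULL while alive
        cause_of_death   TEXT,
        date_of_death    TEXT,
        event_narrative  TEXT,
        -- Death details may only be present when status = 'died'.
        CHECK (
            status = 'died'
            OR (died_in_hospital IS NULL AND cause_of_death IS NULL
                AND date_of_death IS NULL AND event_narrative IS NULL)
        )
    );
    -- Made-up sample rows, purely for illustration.
    INSERT INTO patient (id, status) VALUES (1, 'alive');
    INSERT INTO patient VALUES (2, 'died', 1, 'sepsis', '2020-01-05', NULL);
""")


def is_accepted(sql):
    """Run an INSERT and report whether the constraints let it through."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Ordinary column predicates are all that is needed to query the data.
deaths = conn.execute(
    "SELECT id, cause_of_death FROM patient WHERE status = 'died'"
).fetchall()
```

The same query against an EAV layout would need a join or a pivot for every attribute you want back.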
Entity-attribute-value models are the best method; they will be helpful in data filtering/searching. Keeping the columns in the base table itself is against normalization rules.

MS Access 2013 query criteria can't assess if value A is contained in value B string

Issue:
I am developing a simple issue tracking database and have hit a stumbling block that I’m not sure how to resolve. Have tried several approaches using queries, sql statement etc but still not working. I may have to rethink how I am doing this but hoping someone may be able to address the issue as it stands, though if a more elegant way of doing it happy to implement that.
Scenario:
A table called tblUsers has a field called Access that is a lookup to a table called tblCategory and allows for multiple values to be stored (one to many). In essence this is saying which category(s) of “issue” the user is allowed to
A simple msgbox test in code shows that this is correctly storing the values selected in the following format "1, 2, 3, 4"
In turn, each issue can only have a single category (one to one) which is stored in a field called Category in table tblGMPIssues and is also populated from a lookup to the tblCategory table.
So far so good ….
I then have a query called qryUserIssues that should show all issues from the table tblGMPIssues that are a) “Open” (status = 1) and that b) match any of the categories that the user is permitted to view.
I can get this to work with a single value i.e. as it stands query prompts for input and if you enter a single valid integer it returns expected results
But I can’t work out the syntax to get the criteria to accommodate multiple values. For example, in above scenario our user should be allowed to see 4 different category or calls “1, 2, 3, 4”
Tried using INNER joins, tried assigning to variables and using a LIKE criteria but can’t seem to get the syntax right.
If anyone could let me know if this can be done and if so how as it’s driving me nuts.
All help and suggestions gratefully received.
Updated relationship diagram --> 1
For precisely the reason that you've asked this question I would recommend never using the multi-select lookup option for columns in MS Access tables. Instead create an intersection table which tells you the combinations of values from the two main tables that are allowed. So instead of having the multi-select Access column in tblUsers, you should have a separate table called tblUserAccess with two columns (UserID and CategoryID). The two columns together will form a composite Primary Key for this table, and individually they will be Foreign Keys to tblUsers and tblCategory respectively. (You should do the same kind of thing with tblType - remove the Categories column and set up a separate table called tblTypeCategories).
Coming to your query, are you expecting this to show you all the relevant Issues for a particular user? At the moment, it is not doing this. The reason it is prompting you for input is because it doesn't understand ([tblUsers].[Access]) - tblUsers is not referenced in your query, and the query has no way of knowing which particular user you're interested in.
With your new table in place (and populated with the relevant data) you should add tblUserAccess to the query, joining tblGMPIssues.Category to tblUserAccess.CategoryID. Take the ([tblUsers].[Access]) condition off the Category column. Add the UserID column to the grid and set the criteria to [Input UserID]. Now when you run the query it will ask you for a user ID, and it should hopefully show you all the Issues that the given user can access.
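Here is roughly what that looks like as SQL, run through Python's sqlite3 rather than Access so the sketch is self-contained. The table names and the UserID, CategoryID, Category and status columns follow the answer above; IssueID, UserName, CategoryName and the sample rows are assumptions, and Access SQL syntax will differ slightly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblUsers    (UserID INTEGER PRIMARY KEY, UserName TEXT);
    CREATE TABLE tblCategory (CategoryID INTEGER PRIMARY KEY, CategoryName TEXT);

    -- Intersection table: one row per (user, category) the user may see.
    CREATE TABLE tblUserAccess (
        UserID     INTEGER NOT NULL REFERENCES tblUsers (UserID),
        CategoryID INTEGER NOT NULL REFERENCES tblCategory (CategoryID),
        PRIMARY KEY (UserID, CategoryID)
    );

    CREATE TABLE tblGMPIssues (
        IssueID  INTEGER PRIMARY KEY,
        Category INTEGER NOT NULL REFERENCES tblCategory (CategoryID),
        Status   INTEGER NOT NULL            -- 1 = Open
    );

    INSERT INTO tblUsers VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO tblCategory VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
    INSERT INTO tblUserAccess VALUES (1, 1), (1, 2), (2, 3);
    INSERT INTO tblGMPIssues VALUES (10, 1, 1), (11, 2, 0), (12, 3, 1), (13, 2, 1);
""")


def open_issues_for(user_id):
    """Open issues whose category the given user is allowed to view."""
    rows = conn.execute(
        """
        SELECT i.IssueID
        FROM tblGMPIssues AS i
        JOIN tblUserAccess AS a ON a.CategoryID = i.Category
        WHERE a.UserID = ? AND i.Status = 1
        ORDER BY i.IssueID
        """,
        (user_id,),
    )
    return [row[0] for row in rows]
```

The user ID parameter plays the role of the [Input UserID] prompt in the Access query grid.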
Good luck!
First, I suggest you normalize your data a bit:
You have a number of tables that are reference data (e.g. tables tblStatus, tblSeverity, tblLocation). You have as a primary key a (system generated) ID. That is wrong! The primary key of these should be their data, i.e. status, severity, location.
I can't see what the relationships are between the data. It should be one-to-many, mandatory (i.e. one Status can occur in many tblGMPIssues and a status is mandatory).
Your table tblType is unclear to me but it contains the categories. I am not familiar with the '-' before Categories followed by a Categories.Value but I assume an occurrence of tblType can contain exactly one Categories.Value. If not, then you must decompose this table.
If a User has access to a number of Categories, then there must be a many-to-many relationship between Users and Categories. From this relationship you do your select query, but I don't see this relationship.
Use following query to get any of the Category IDs 1, 2, 3 or 4
Select * from tblGMPIssues where tblGMPIssues.Category in (Select UserAccess from tblUserAccess)
I still have many problems with your relational design, or actually the lack of a proper relational design. As an example, below is a diagram from my Access 2007 showing a part of your database with a proper design. Access automatically shows the "one" and "many" symbols (which I don't see in your diagrams). I also show the relationship dialog with the proper fields checked. Note that none of the tables, except tblIssue, has a system generated primary key. The keys are all plain text, which allows better understanding when inspecting the data and, as said, the database automatically updates child tables when the primary key value of a parent table changes.
Note table tblCategoryType: it implements a many-to-many relation between categories and types, meaning a category can be of zero or more types and a type can be in zero or more categories. In addition to "update cascades", this table has the "delete cascades" checkbox checked so if a category is deleted, all its relations with types are deleted (not the types).

Database design - which would be better?

I have multiple tables.
They all have the following fields in them:
item_title | item_description | item_thumbnail | item_keywords
Would I be better off having a single items_table with an extra item_type field and then joining with the respective table, or just keep them all in separate tables?
Depends on the context. If your items have very little differentiation and you’re certain you’re not going to have a scenario in 6 months, 12 months, 2 years where you need items separated, then go the route of one generic “items” table. If a particular item type does have specific requirements, then you can create a separate table that contains this data and create a LEFT JOIN when querying to include the extra data.
I’d also suggest looking at other database types. Judging from your scenario (lots of item types with little variance in the data stored) I think you may benefit from a document-based database engine like MongoDB rather than a relational data-based database engine like MySQL.
OK, so the tables share fields. Do they also share constraints¹?
If yes, then go ahead and merge them together.
If not, you may keep them separate, or may merge them together, depending on what kind of tradeoff you are willing to make.
For example, if tables have separate foreign keys, you may keep them separate, or you may merge them into a single table, but keep FKs separate:
item_title
item_description
item_thumbnail
item_keywords
table1_id REFERENCES table1 (table1_id)
table2_id REFERENCES table2 (table2_id)
...
CHECK (
(table1_id IS NOT NULL AND table2_id IS NULL ...)
OR (table1_id IS NULL AND table2_id IS NOT NULL ...)
...
)
(NOTE: MySQL versions before 8.0.16 parse CHECK but don't enforce it, so on those versions you'll need to do the equivalent enforcement from a trigger or client code, or use a different DBMS if you can.)
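A runnable sketch of the "merged table, separate FKs" layout, using Python's sqlite3 (which does enforce CHECK) as a stand-in. The table name item and the item_id column are assumptions; the other names follow the outline above, trimmed to two parent tables and one shared column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE table1 (table1_id INTEGER PRIMARY KEY);
    CREATE TABLE table2 (table2_id INTEGER PRIMARY KEY);
    INSERT INTO table1 VALUES (1);
    INSERT INTO table2 VALUES (1);

    CREATE TABLE item (
        item_id    INTEGER PRIMARY KEY,
        item_title TEXT NOT NULL,
        table1_id  INTEGER REFERENCES table1 (table1_id),
        table2_id  INTEGER REFERENCES table2 (table2_id),
        -- Exactly one of the two FKs must be set.
        CHECK (
            (table1_id IS NOT NULL AND table2_id IS NULL)
            OR (table1_id IS NULL AND table2_id IS NOT NULL)
        )
    );
""")


def try_insert(title, table1_id, table2_id):
    """Insert an item; return True if the DBMS accepted it, False otherwise."""
    try:
        conn.execute(
            "INSERT INTO item (item_title, table1_id, table2_id) VALUES (?, ?, ?)",
            (title, table1_id, table2_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

Rows pointing at both parents, at neither, or at a parent that doesn't exist are all rejected by the database itself, with no application code involved.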
I'd need to know more about your database to figure out which is better.
with an extra item_type field and then joining with the respective table,
Never enforce FKs in code, if you can help it. Even if you merge the tables together, don't merge FKs, instead do something like the above. Enforcing FKs in code in the context of the concurrent environment (where multiple clients can try to modify the same data at the same time) is difficult to do correctly and with good performance - it's much better to let the DBMS do it for you.
BTW, what is item_keywords? If it's a comma-separated list of keywords (or similar), you'll need to normalize further and extract the keywords into their own separate table.
¹ Domain (data type and CHECK), key (PRIMARY KEY and UNIQUE) and referential (FOREIGN KEY) constraints.
I believe that it is good to have as few tables as possible; they are easier to maintain. Imagine you had 3,000 kinds of item_type: there would then be 3,000 different tables. So a single table seems a good idea to me in your case. In the future, if you run into a situation where you need to separate the tables, you can easily do so.
So the short answer, YES.
If I understand correctly, you only need to normalize your schema:
items:
item_id
item_name
item_description
items_types
item_id
type_id
types
type_id
item_file_name
So this way you can have any number of items with any number of types
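A quick sketch of that three-table layout in Python's sqlite3. The item_id and type_id keys follow the outline above; type_name and the sample rows are placeholders of mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE items (item_id INTEGER PRIMARY KEY, item_name TEXT NOT NULL);
    CREATE TABLE types (type_id INTEGER PRIMARY KEY, type_name TEXT NOT NULL);

    -- Junction table: an item may have many types and vice versa.
    CREATE TABLE items_types (
        item_id INTEGER NOT NULL REFERENCES items (item_id),
        type_id INTEGER NOT NULL REFERENCES types (type_id),
        PRIMARY KEY (item_id, type_id)
    );

    INSERT INTO items VALUES (1, 'logo'), (2, 'intro');
    INSERT INTO types VALUES (1, 'image'), (2, 'video');
    INSERT INTO items_types VALUES (1, 1), (2, 1), (2, 2);
""")


def types_of(item_id):
    """Names of all types attached to one item, alphabetically."""
    rows = conn.execute(
        """
        SELECT t.type_name
        FROM items_types AS it
        JOIN types AS t ON t.type_id = it.type_id
        WHERE it.item_id = ?
        ORDER BY t.type_name
        """,
        (item_id,),
    )
    return [row[0] for row in rows]
```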
Is this what you want to do?
I would suggest you to use one table for item and one table for type for the following reasons (assume there are 10 types).
I am not sure which programming language you are using. As a Java developer, I would have to create an entity class for each type if I had multiple tables. So I would rather have only one class and have the type as an attribute.
When you have to display all of the types in the same page, you will have to execute the select query from all 10 tables for 10 types.
When you introduce a new type, you have to write the code for the CRUD and business-specific operations. The developer will keep on adding code for every new type.
Basically, if you have one table for item and one table for type, you won't have to change the database schema and code for each new type you introduce. But if you are sure that the number of types is small and won't change, you can consider using multiple tables.
Create two separate tables and join them as per your required output.
i.e.
1. First table (master table: item_type)
item_type(item_type_id, item_type_name, status)
2. Second table (child table: item_details)
item_details(item_id, item_type_id, item_title, item_description, item_thumbnail, item_keywords)
I feel a single table would be more suitable. It avoids extra joins, complications in the program code, and errors compared with multiple tables. It will also be better from a management point of view, e.g. for db clustering.
If you have so many tables which needs to have the same repeated columns then yes it is a good way to create a separate table for the common fields. This is more efficient if these repeated columns are not fixed and can be changed like adding one more column to the list of common default columns.
So how could you do that?
The idea is to create a separate table and put the common default columns there.
This table is like a dummy table i.e. the columns can be added/deleted as needed.
For example-
Table - DefaultFields
Columns - item_title | item_description | item_thumbnail | item_keywords
You can then also insert the values into the DefaultFields table dynamically in a loop like:
"INSERT INTO DefaultFields (item_table, item_title, item_description, item_thumbnail, item_keywords) VALUES ('" + field.item_table + "','" + field.item_title + "','" + field.item_description + "','" + field.item_thumbnail + "','" + field.item_keywords + "')";
NOTE: field is the object that holds the values in a table wise loop.
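If the values come from anywhere you don't fully control, building the statement by string concatenation invites quoting bugs and SQL injection. A sketch of the same loop with bound parameters, using Python's sqlite3 here (the sample rows are made up, and the placeholder syntax varies by driver):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE DefaultFields (
        item_table       TEXT,
        item_title       TEXT,
        item_description TEXT,
        item_thumbnail   TEXT,
        item_keywords    TEXT
    )
""")

# Made-up rows; note the apostrophes, which would break naive concatenation.
fields = [
    {"item_table": "pages", "item_title": "Editor's pick",
     "item_description": "It's a page", "item_thumbnail": "p.png",
     "item_keywords": "page"},
    {"item_table": "posts", "item_title": "Hello",
     "item_description": "First post", "item_thumbnail": "h.png",
     "item_keywords": "post"},
]

# Named placeholders: the driver handles quoting and escaping of every value.
conn.executemany(
    """
    INSERT INTO DefaultFields
        (item_table, item_title, item_description, item_thumbnail, item_keywords)
    VALUES
        (:item_table, :item_title, :item_description, :item_thumbnail, :item_keywords)
    """,
    fields,
)

stored_titles = [row[0] for row in conn.execute(
    "SELECT item_title FROM DefaultFields ORDER BY item_table")]
```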
Then further you can alter your tables to create these default fields from DefaultFields table like:
"ALTER TABLE " + item_table+ " ADD COLUMN [" + field.item_title + "] Text"
This can be repeated for each table to alter it as needed.
In this design pattern, even if you want to:
1) add one more column or
2) delete pre existing column or
3) change pre existing column name
Then you can do so in the dummy table and the rest is updated by the ALTER table command in corresponding tables.
In my opinion... I would say no, never.
There are two reasons for that:
You really want to preserve a logical meaning in your database. For now it's pretty obvious to you how it's organised. But in two months (or a year), will it be so evident? If somebody joins the project, isn't it easier for them to understand if the different logical blocks of your app are separated? I mean... It's true that a human and a cat are both animals. Is it still logical to store both of them inside the same box?
Performance. The shorter the table, the faster your request will be. The data will still take as much space on your disk. And I'm not even talking about the comparison needed to find out which type of item you are looking for. I mean, if you want to select all the pages of your application, just compare the two requests:
Multiple tables:
Select * from pages_tbl;
Single table:
Select * from item_tbl where type = 'page';
What will you gain from this design? No performance, no disk space, no readability. I really don't see a good reason for it.

MySQL: Table entry representing all 'ids'?

I have a single value in a table that I want selected every time a query is made on the table.
Let me break it down.
I have the following entry:
Instead of making a new entry for every different user_id, can I use some kind of primitive to represent ALL user_ids instead of specific ids? Example below:
For reasons that I would rather not take the time to explain, this is what I need. Is there any way to do this?
Thanks in advance!
If I'm correct in assuming that that means you want tag_id linked to every user_id (as some sort of a catch-all clause), you have a few ways of going about it. Depending on your application, you can simply request it to add a row for tag_id = 1 whenever you add a user. If you would, however, want to do it in a single row, well ... it kind of misrepresents the relational model.
You could, presumably use the NULL special "value" (essentially, declare it without a value) and then check in your application logic with
WHERE user_id = [uid] OR user_id IS NULL
or some such. I'd prefer keeping the relations intact with the former approach, however; you lose foreign keys (although using NULL won't violate the constraint) and similar constraints if you don't.
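For what it's worth, a small sketch of the NULL catch-all variant, run through Python's sqlite3 in place of MySQL. The table name user_tags and the sample rows are assumptions, since the original entry isn't shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_tags (
        tag_id  INTEGER NOT NULL,
        user_id INTEGER            -- NULL means "applies to every user"
    );
    INSERT INTO user_tags VALUES (1, NULL), (2, 7), (3, 8);
""")


def tags_for(uid):
    """Tags linked to this user plus any catch-all (NULL user_id) tags."""
    rows = conn.execute(
        "SELECT tag_id FROM user_tags "
        "WHERE user_id = ? OR user_id IS NULL ORDER BY tag_id",
        (uid,),
    )
    return [row[0] for row in rows]
```

Every query on the table has to remember the OR user_id IS NULL clause, which is part of why I'd still lean towards one row per user.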

Implementing custom fields with ALTER TABLE

We are currently thinking about different ways to implement custom fields for our web application. Users should be able to define custom fields for certain entities and fill in/view this data (and possibly query the data later on).
I understand that there are different ways to implement custom fields (e.g. using a name/value table or using alter table etc.) and we are currently favoring using ALTER TABLE to dynamically add new user fields to the database.
After browsing through other related SO topics, I couldn't find any big drawbacks of this solution. In contrast, having the option to query the data in fast way (e.g. by directly using SQL's where statement) is a big advantage for us.
Are there any drawbacks you could think of by implementing custom fields this way? We are talking about a web application that is used by up to 100 users at the same time (not concurrent requests..) and can use both MySQL and MS SQL Server databases.
Just as an update, we decided to add new columns via ALTER TABLE to the existing database table to implement custom fields. After some research and tests, this looks like the best solution for most database engines. A separate table with meta information about the custom fields provides the needed information to manage, query and work with the custom fields.
The first drawback I see is that you need to grant your application service with ALTER rights.
This implies that your security model needs careful attention as the application will be able to not only add fields but to drop and rename them as well and create some tables (at least for MySQL).
Secondly, how would you distinguish the fields that belong to each user? Or can the fields created by user A be accessed by user B?
Note that the number of columns may also grow significantly. If every user adds 2 fields, we are already talking about 200 fields.
Personally, I would use one of the two approaches or a mix of them:
Using a serialized field
I would add one text field to the table in which I would store a serialized dictionary of dictionaries:
{
user_1: {key1: val1, key2: val2,...},
user_2: {key1: val1, key2: val2,...},
...
}
The drawback is that the values are not easily searchable.
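A minimal sketch of the serialized-field idea, using JSON as the serialization format and Python's sqlite3 as the database. The customer table and custom_fields column are made-up names.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, custom_fields TEXT)"
)

# One dictionary per user, all serialized into a single text column.
custom = {
    "user_1": {"key1": "val1", "key2": "val2"},
    "user_2": {"key1": "other"},
}
conn.execute(
    "INSERT INTO customer (id, name, custom_fields) VALUES (?, ?, ?)",
    (1, "Acme", json.dumps(custom)),
)


def custom_value(row_id, user, key):
    """Load the blob, deserialize it, and pick one user's value (or None)."""
    (blob,) = conn.execute(
        "SELECT custom_fields FROM customer WHERE id = ?", (row_id,)
    ).fetchone()
    return json.loads(blob).get(user, {}).get(key)
```

Note that the lookup happens in application code after fetching the whole blob, which is exactly the searchability drawback mentioned above.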
Using a multi-type name/value table
fields table:
field_id: int # primary key, referenced by the values table
user_id: int
field_name: varchar(100)
type: enum('INT', 'REAL', 'STRING')
values table:
field_id: int
row_id: int # the main table row id
int_value: int
float_value: float
text_value: text
Of course, it requires a join and is a bit more complicated to implement but far more generic and, if indexed properly, quite efficient.
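Sketched out with Python's sqlite3, the two tables and the join could look like the following. I've assumed a field_id key on the fields table so the values table has something to reference, renamed the values table to field_values because VALUES is a reserved word, and invented the sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE fields (
        field_id   INTEGER PRIMARY KEY,
        user_id    INTEGER NOT NULL,
        field_name TEXT NOT NULL,
        type       TEXT NOT NULL CHECK (type IN ('INT', 'REAL', 'STRING'))
    );
    CREATE TABLE field_values (
        field_id    INTEGER NOT NULL REFERENCES fields (field_id),
        row_id      INTEGER NOT NULL,     -- the main table row id
        int_value   INTEGER,
        float_value REAL,
        text_value  TEXT,
        PRIMARY KEY (field_id, row_id)    -- doubles as the lookup index
    );
    INSERT INTO fields VALUES (1, 42, 'age', 'INT'), (2, 42, 'nickname', 'STRING');
    INSERT INTO field_values VALUES (1, 100, 30, NULL, NULL),
                                    (2, 100, NULL, NULL, 'Bob');
""")


def custom_fields(row_id):
    """Return {field_name: value} for one main-table row, picking the
    value column that matches each field's declared type."""
    rows = conn.execute(
        """
        SELECT f.field_name,
               CASE f.type
                   WHEN 'INT'  THEN v.int_value
                   WHEN 'REAL' THEN v.float_value
                   ELSE v.text_value
               END
        FROM field_values AS v
        JOIN fields AS f ON f.field_id = v.field_id
        WHERE v.row_id = ?
        """,
        (row_id,),
    )
    return dict(rows.fetchall())
```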
I see nothing wrong with adding new custom fields to the database table.
With this approach, the specific/most appropriate type can be used i.e. need an int field? define it as int. Whereas with a name/value type table, you'd be storing multiple data types as one type (nvarchar probably) - unless you complete that name/value table with multiple columns of different types and populate the appropriate one but that is a bit horrible.
Also, adding new columns makes it easier to query/no need to involve a join to a new name/value table.
It may not feel as generic, but I feel that's better than having a "one-size fits all" name/value table.
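For illustration, a hedged sketch of the ALTER TABLE route with a small metadata table, run on Python's sqlite3. The names customer, custom_field_meta, the cf_ prefix and the whitelist rules are all assumptions of mine. Column names can't be passed as bound parameters, so they have to be validated before being spliced into the statement.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    -- Metadata describing which custom columns exist and their types.
    CREATE TABLE custom_field_meta (
        table_name  TEXT NOT NULL,
        column_name TEXT NOT NULL,
        sql_type    TEXT NOT NULL,
        PRIMARY KEY (table_name, column_name)
    );
""")

ALLOWED_TABLES = {"customer"}
ALLOWED_TYPES = {"INTEGER", "REAL", "TEXT"}
IDENTIFIER = re.compile(r"[a-z][a-z0-9_]{0,29}")


def add_custom_field(table, column, sql_type):
    """Add a user-defined column after whitelisting every identifier."""
    if (table not in ALLOWED_TABLES
            or not IDENTIFIER.fullmatch(column)
            or sql_type not in ALLOWED_TYPES):
        raise ValueError("rejected custom field definition")
    physical = f"cf_{column}"  # prefix keeps custom columns apart from core ones
    conn.execute(f'ALTER TABLE "{table}" ADD COLUMN "{physical}" {sql_type}')
    conn.execute("INSERT INTO custom_field_meta VALUES (?, ?, ?)",
                 (table, physical, sql_type))


add_custom_field("customer", "loyalty_points", "INTEGER")
conn.execute(
    "INSERT INTO customer (id, name, cf_loyalty_points) VALUES (1, 'Acme', 120)"
)

# The new column is queryable with a plain, typed WHERE clause.
big_spenders = conn.execute(
    "SELECT id FROM customer WHERE cf_loyalty_points > 100"
).fetchall()
```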
From an SQL Server point of view (2005 onwards)....
An alternative would be to create one "custom data" field of type XML - this would be truly generic and would require no field creation and no separate name/value table. It also has the benefit that not all records have to have the same custom data (i.e. the one field is common, but what it contains doesn't have to be). I'm not 100% sure about the performance impact, but XML data can be indexed.