MySQL xref table: add columns or a separate table? - mysql

I have a cross reference table that contains three major columns:
object id
different object id
relation type between the two
The problem is that in some cases I need two more columns that help define the relation between the two objects.
My question is, what is the proper way to deal with the situation?
Should I create another table with five columns, and have two tables for practically the same purpose?
Or is it OK to add two more columns that will almost always contain NULL? Will that needlessly affect response time and size?
Thanks
Edit:
I've been asked for more information, so here it is:
The database holds philosophical arguments.
This specific table records which statements are connected by which logic.
These are the columns:
statement_id
logic_id
direction
which are enough for two-way logic (such as 'if-then').
But for multiple-statement logic (such as 'and' or 'or') I need two more columns:
exit
inner-logic type
I'm not sure whether this extra information is helpful or just more confusing. Feel free to ignore it and answer the question on a purely academic basis.

It is OK to have two ids and any number of columns describing the relationship. Those extra columns can be NULLable if they are optional.
It sounds like the two ids JOIN to a single table, correct? In that case, you may need to UNION two SELECTs to check for an id in either of the columns, and have two indexes: one starting with one id, one starting with the other.
It would help if you provided SHOW CREATE TABLE and a SELECT or two. That might give us a better feel for what the tables are for.
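As a rough illustration of the nullable-columns layout with a UNION lookup, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere. The table name xref, the column names exit_flag and inner_logic_type, and the sample rows are assumptions made for the example, not the asker's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE xref (
    object_id        INTEGER NOT NULL,
    other_object_id  INTEGER NOT NULL,
    relation_type    TEXT    NOT NULL,
    exit_flag        INTEGER,          -- optional, NULL for most rows (assumed name)
    inner_logic_type TEXT,             -- optional, NULL for most rows (assumed name)
    PRIMARY KEY (object_id, other_object_id, relation_type)
);
-- The primary key leads with object_id; this second index leads with the other id.
CREATE INDEX idx_xref_other ON xref (other_object_id, object_id);
""")
conn.executemany(
    "INSERT INTO xref VALUES (?, ?, ?, ?, ?)",
    [
        (1, 2, "if-then", None, None),
        (3, 1, "and", 1, "or"),
        (4, 5, "if-then", None, None),
    ],
)

def relations_of(object_id):
    """Return every row that mentions object_id on either side."""
    sql = """
        SELECT object_id, other_object_id, relation_type
        FROM xref WHERE object_id = ?
        UNION
        SELECT object_id, other_object_id, relation_type
        FROM xref WHERE other_object_id = ?
        ORDER BY 1, 2
    """
    return conn.execute(sql, (object_id, object_id)).fetchall()

print(relations_of(1))
```

Each half of the UNION can use its own index, which is the reason for having one index per leading id.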

Related

Best database table design for a table with dependent column values

I would like to know the best way of designing a table structure for dependent column values.
If I have a scenario like this:
if the status of the field is alive, there is nothing to do;
if the status is died, some other column values have to be stored somehow.
What is the best way to handle this situation:
create a table containing all the columns, i.e. 'Died in the hospital', 'Cause of death', 'Date of Death' and 'Please narrate the event', and leave them NULL when the status is alive,
or
use a separate table for storing all the other attributes using the Entity-attribute-value (EAV) concept?
In the above scenario, signs and symptoms may be single, multiple, or 'other' with a specification. How should this be stored?
What is the best way for performance and querying:
either provide 15 columns in a single table and store NULL if there is no value, or store a foreign key to the symptoms in another table (in this strategy, how do I store the description column for other symptoms)?
In general, if you know what the columns are, you should include those in the table. So, a table with columns such as: died_in_hospital, cause_of_death, and so on seems like a reasonable solution.
Entity-attribute-value models are useful under two circumstances:
The attributes are not known and new ones are added over time.
The number of attributes is so large and sparsely populated that most columns would be NULL.
In your case, you know the attributes, so you should put them into a table as columns.
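A minimal sketch of the "known attributes as nullable columns" option, again using Python's sqlite3 module as a stand-in for MySQL. The table name patient, the column names, and the sample rows are assumptions derived from the field labels in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE patient (
    patient_id       INTEGER PRIMARY KEY,
    status           TEXT NOT NULL CHECK (status IN ('alive', 'died')),
    died_in_hospital INTEGER,   -- the four columns below stay NULL while status = 'alive'
    cause_of_death   TEXT,
    date_of_death    TEXT,
    event_narrative  TEXT
)
""")
conn.execute("INSERT INTO patient (patient_id, status) VALUES (1, 'alive')")
conn.execute(
    "INSERT INTO patient VALUES "
    "(2, 'died', 1, 'cardiac arrest', '2014-03-02', 'hypothetical sample row')"
)

# Ordinary column predicates work directly; no pivoting of attribute rows is needed.
deaths_in_hospital = conn.execute(
    "SELECT patient_id, cause_of_death FROM patient "
    "WHERE status = 'died' AND died_in_hospital = 1"
).fetchall()
print(deaths_in_hospital)
```

With an EAV layout the same filter would need one join or subquery per attribute, which is part of why fixed columns tend to be easier to query when the attributes are known.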
Entity-attribute-value models are the best method; they are helpful for data filtering and searching. Keeping the columns in the base table itself is against normalization rules.

Database Design - properties of two equal objects

In my database there is one table 't_object' and one table 't_search_object'. These two tables are quite similar to each other.
Both tables have one column called 'properties' where the properties are stored separated with commas, e.g.: "1,4,8".
That's why there is an additional table called 't_object_properties' with two columns (id, name) and data records like: (1, propertie1) ...
The problem with having one column 'properties' and one additional table is that I have several values in just one column.
So I want to know if this is a good way of designing a database.
I am wondering whether it wouldn't be better to have columns like 'is_propertie1', 'is_propertie2', and so on in both tables 't_object' and 't_search_object'. The problem would be having to update two tables whenever another property is added.
So what would you advise? 1) or 2) or is there another way to solve this issue?
It's always wise to have an extra table rather than a comma-separated list of values in MySQL (and all RDBMS systems) to represent one-to-many relationships like object-property. Relational data management is designed around this very concept. Read about "normalization."
Comma-separated lists of values and long lists of columns both give rise to real performance and usability problems, especially when your database gets larger.
Go with your first choice, get rid of the properties column containing lists of values like '1,4,8', and don't look back.
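One way the extra-table approach might look, sketched with Python's sqlite3 module as a stand-in for MySQL. The junction table name t_object_property_link and the helper split_legacy_list are hypothetical; t_object and t_object_properties come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t_object (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE t_object_properties (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Hypothetical junction table: one row per (object, property) pair.
CREATE TABLE t_object_property_link (
    object_id   INTEGER NOT NULL REFERENCES t_object (id),
    property_id INTEGER NOT NULL REFERENCES t_object_properties (id),
    PRIMARY KEY (object_id, property_id)
);
""")
conn.execute("INSERT INTO t_object VALUES (10, 'sample object')")
conn.executemany(
    "INSERT INTO t_object_properties VALUES (?, ?)",
    [(1, "propertie1"), (4, "propertie4"), (8, "propertie8")],
)

def split_legacy_list(object_id, csv_value):
    """Turn a legacy value such as '1,4,8' into junction rows."""
    rows = [(object_id, int(part)) for part in csv_value.split(",") if part.strip()]
    conn.executemany("INSERT INTO t_object_property_link VALUES (?, ?)", rows)

split_legacy_list(10, "1,4,8")

# Listing an object's properties is now a join rather than string parsing.
names = conn.execute("""
    SELECT p.name
    FROM t_object_property_link AS l
    JOIN t_object_properties AS p ON p.id = l.property_id
    WHERE l.object_id = ?
    ORDER BY p.id
""", (10,)).fetchall()
print(names)
```

Adding a new property then means inserting a row into t_object_properties, with no schema change to either object table.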

Integer values for status fields

Often I find myself creating 'status' fields for database tables. I set these up as TINYINT(1) as more often than not I only need a handful of status values. I cross-reference these values to array-lookups in my code, an example is as follows:
0 - Pending
1 - Active
2 - Denied
3 - On Hold
This all works very well, except I'm now trying to create better database structures and realise that from a database point of view, these integer values don't actually mean anything.
Now a solution to this may be to create separate tables for statuses - but there could be several status columns across the database and to have separate tables for each status column seems a bit of overkill? (I'd like each status to start from zero - so having one status table for all statuses wouldn't be ideal for me).
Another option is to use the ENUM data type - but there are mixed opinions on this. I see many people not recommending to use ENUM fields.
So what would be the way to go? Do I absolutely need to be putting this data in to its own table?
I think the best approach is to have a single status table for each kind of status. For example, order_status ("placed", "paid", "processing", "completed") is qualitatively different from contact_status ("received", "replied", "resolved"), but the latter might work just as well for customer contacts as for supplier contacts.
This is probably already what you're doing — it's just that your "tables" are in-memory arrays rather than database tables.
I agree with "ruakh" about creating another table structured as (id, statusName). I would add that for such a table you can still use TINYINT for the id field: a signed TINYINT accepts values from -128 to 127 (0 to 255 when UNSIGNED), which covers all the status cases you might need. The (1) in TINYINT(1) is only a display width and does not limit the range.
Can you add (or remove) a status value without changing code?
If yes, then consider a separate lookup table for each status "type". You are already treating this data in a generic way in your code, so you should have a generic data structure for it.
If no, then keep the ENUM (or well-documented integer). You are treating each value in a special way, so there isn't much purpose in trying to generalize the data model.
(I'd like each status to start from zero - so having one status table for all statuses wouldn't be ideal for me)
You should never mix several distinct sets of values within the same lookup table (regardless of your "zero issue"). Reasons:
A simple FOREIGN KEY alone won't be able to prevent referencing a value from the wrong set.
All values are forced into the same type, which may not always be desirable.
That's such a common anti-pattern that it even has a name: "one true lookup table".
Instead, keep each lookup "type" within a separate table. That way, FKs work predictably and you can tweak datatypes as necessary.
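A small sketch of one lookup table per status "type", using Python's sqlite3 module as a stand-in for MySQL. The table names order_status and contact_status reuse the examples from an earlier answer; the orders table is an assumption added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed in SQLite; InnoDB enforces FKs by default
conn.executescript("""
-- Each lookup table has its own id space, so both can start from zero.
CREATE TABLE order_status   (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE contact_status (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
INSERT INTO order_status   VALUES (0, 'placed'), (1, 'paid'), (2, 'processing'), (3, 'completed');
INSERT INTO contact_status VALUES (0, 'received'), (1, 'replied'), (2, 'resolved');
CREATE TABLE orders (
    id        INTEGER PRIMARY KEY,
    status_id INTEGER NOT NULL REFERENCES order_status (id)
);
""")

conn.execute("INSERT INTO orders VALUES (1, 3)")  # 'completed' exists

try:
    conn.execute("INSERT INTO orders VALUES (2, 7)")  # there is no order status 7
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

label = conn.execute(
    "SELECT s.name FROM orders AS o "
    "JOIN order_status AS s ON s.id = o.status_id WHERE o.id = 1"
).fetchone()[0]
print(rejected, label)
```

Because orders.status_id references only order_status, a contact status can never be stored there by mistake, which a single shared lookup table could not guarantee with a plain foreign key.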

Database design - which would be better?

I have multiple tables.
They all have the following fields in them:
item_title | item_description | item_thumbnail | item_keywords
Would I be better off having a single items_table with an extra item_type field and then joining with the respective table, or just keep them all in separate tables?
Depends on the context. If your items have very little differentiation and you’re certain you’re not going to have a scenario in 6 months, 12 months, 2 years where you need items separated, then go the route of one generic “items” table. If a particular item type does have specific requirements, then you can create a separate table that contains this data and create a LEFT JOIN when querying to include the extra data.
I’d also suggest looking at other database types. Judging from your scenario (lots of item types with little variance in the data stored) I think you may benefit from a document-based database engine like MongoDB rather than a relational data-based database engine like MySQL.
OK, so the tables share fields. Do they also share constraints [1]?
If yes, then go ahead and merge them together.
If not, you may keep them separate, or may merge them together, depending on what kind of tradeoff you are willing to make.
For example, if tables have separate foreign keys, you may keep them separate, or you may merge them into a single table, but keep FKs separate:
item_title
item_description
item_thumbnail
item_keywords
table1_id REFERENCES table1 (table1_id)
table2_id REFERENCES table2 (table2_id)
...
CHECK (
(table1_id IS NOT NULL AND table2_id IS NULL ...)
OR (table1_id IS NULL AND table2_id IS NOT NULL ...)
...
)
(NOTE: MySQL versions before 8.0.16 parse but don't enforce CHECK, so on those versions you'll need to do the equivalent enforcement from a trigger or client code, or use a different DBMS if you can.)
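The merged table with separate, mutually exclusive foreign keys could be sketched as follows. This uses Python's sqlite3 module, which enforces CHECK constraints (as does MySQL 8.0.16 and later). The table name items, the surrogate item_id, and the helper try_insert are assumptions for the example; only two parent tables are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed in SQLite to enforce REFERENCES
conn.executescript("""
CREATE TABLE table1 (table1_id INTEGER PRIMARY KEY);
CREATE TABLE table2 (table2_id INTEGER PRIMARY KEY);
CREATE TABLE items (
    item_id    INTEGER PRIMARY KEY,
    item_title TEXT NOT NULL,
    table1_id  INTEGER REFERENCES table1 (table1_id),
    table2_id  INTEGER REFERENCES table2 (table2_id),
    -- Exactly one of the two parent references must be set.
    CHECK (
        (table1_id IS NOT NULL AND table2_id IS NULL)
        OR (table1_id IS NULL AND table2_id IS NOT NULL)
    )
);
INSERT INTO table1 VALUES (1);
INSERT INTO table2 VALUES (1);
""")

def try_insert(title, table1_id, table2_id):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO items (item_title, table1_id, table2_id) VALUES (?, ?, ?)",
            (title, table1_id, table2_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False

print(try_insert("belongs to table1", 1, None))
print(try_insert("belongs to both", 1, 1))
print(try_insert("belongs to neither", None, None))
```

The database rejects the last two rows on its own, so no client code has to police which parent an item belongs to.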
I'd need to know more about your database to figure out which is better.
with an extra item_type field and then joining with the respective table,
Never enforce FKs in code, if you can help it. Even if you merge the tables together, don't merge FKs, instead do something like the above. Enforcing FKs in code in the context of the concurrent environment (where multiple clients can try to modify the same data at the same time) is difficult to do correctly and with good performance - it's much better to let the DBMS do it for you.
BTW, what is item_keywords? If it's a comma-separated list of keywords (or similar), you'll need to normalize further and extract the keywords into their own separate table.
[1] Domain (data type and CHECK), key (PRIMARY KEY and UNIQUE) and referential (FOREIGN KEY) constraints.
I believe that it is good to have as few tables as possible, because that is easier to maintain. Imagine having 3,000 values of item_type: you would then need 3,000 different tables. So a single table seems like a good idea to me in your case. In the future, if you run into a situation where you need to separate the table, you can easily do so.
So the short answer: yes.
If I understand correctly, you only need to normalize your schema:
items:
    item_id
    item_name
    item_description
items_types:
    item_id
    type_id
types:
    type_id
    item_file_name
This way you can have any number of items with any number of types.
Is this what you want to do?
I would suggest using one table for items and one table for types, for the following reasons (assume there are 10 types):
I am not sure which programming language you are using. As a Java developer, I would have to create an entity class for each type if I had multiple tables. So I would rather have only one class and have the type as an attribute.
When you have to display all of the types on the same page, you will have to execute a select query against all 10 tables for the 10 types.
When you introduce a new type, you have to write the code for the CRUD and business-specific operations. The developer will keep on adding code for every new type.
Basically, if you have one table for items and one table for types, you won't have to change the database schema and code for each new type you introduce. But if you are sure that the number of types is small and won't change, you can consider using multiple tables.
Create two separate tables and join them as per your required output.
i.e.
1. First table (master table: item_type)
item_type(item_type_id, item_type_name, status)
2. Second table (child table: item_details)
item_details(item_id,item_type_id,item_title,item_description,item_thumbnail,item_keywords)
I feel a single table would be more suitable. Compared with multiple tables, it avoids extra joins, complications in the program code, and errors. It will also be better from a management point of view, for things like database clustering.
If you have many tables that need the same repeated columns, then yes, it is a good idea to create a separate table for the common fields. This is more efficient if these repeated columns are not fixed and can change, for example by adding one more column to the list of common default columns.
So how could you do that?
The idea is to create a separate table and put the common default columns there.
This table is like a dummy table, i.e. the columns can be added or deleted as needed.
For example-
Table - DefaultFields
Columns - item_title | item_description | item_thumbnail | item_keywords
You can then insert the values into the DefaultFields table dynamically in a loop, like:
"INSERT INTO DefaultFields (item_table, item_title, item_description, item_thumbnail, item_keywords) VALUES ('" + field.item_table + "','" + field.item_title + "','" + field.item_description + "','" + field.item_thumbnail + "','" + field.item_keywords + "')"
NOTE: field is the object that holds the values in a table-wise loop.
Then further you can alter your tables to create these default fields from DefaultFields table like:
"ALTER TABLE " + item_table+ " ADD COLUMN [" + field.item_title + "] Text"
This can be repeated for each table to alter it as needed.
In this design pattern, even if you want to:
1) add one more column or
2) delete pre existing column or
3) change pre existing column name
Then you can do so in the dummy table and the rest is updated by the ALTER table command in corresponding tables.
In my opinion... I would say no, never.
There are two reasons for that:
You really want to preserve logical meaning in your database. For now it's pretty obvious to you how it's organised. But in two months (or one year), will it still be so evident? If somebody joins the project, isn't it easier for them to understand if the different logical blocks of your app are separated? I mean... It's true that a human and a cat are both animals. Is it still logical to store both of them inside the same box?
Performance. The shorter the table, the faster your requests will be. The data will still take as much space on your disk. And I'm not even talking about the comparison needed to find out which type of item you are looking for. I mean, if you want to select all the pages of your application, just compare the two requests:
Multiple tables:
Select * from pages_tbl;
Single table:
Select * from item_tbl where type = 'page';
What will you gain from this design? No performance, no disk space, no readability. I really don't see a good reason for it.

Will multiple table reduce the speed of the result?

I have a database with multiple tables.
These tables are all related to a single name, for example:
Table 1 contains name of the person, joined date, position, salary, etc.
Table 2 contains name of the person, current projects, finished, assigned, etc.
Table 3 contains name of the person, time sheets, in, out, etc.
Table 4 contains name of the person, personal details, skill set, previous experience, etc.
All tables contain more than 50000 names and their details.
So my question is: all tables contain information related to a name, say Jose20856, and this name is a unique index in all 4 tables. When I search for Jose20856, all four tables return a result that is output to a front-end application/HTML.
So do I need to keep multiple tables, or combine them into a single table?
If so
CASE 1
Single table -> what are the advantages? Will results be faster? What about system resource usage?
CASE 2
Multiple tables -> what are the advantages? Will results be faster? What about system resource usage?
As I am new to MySQL, I would like to have your valuable opinion to move ahead.
You can combine these into a single table but only if it makes sense. It's hard to tell if the relationships in your tables are one-to-one or one-to-many but seem to be one-to-many. e.g. A single employee from table 1 should be able to have multiple projects, skills, time sheets in the other tables. These are all one-to-many relationships.
So, keep the multiple table design. You also should consider using an integer-based primary key for the employee rather than the name. Use this pkey as the fkey in your other tables and you'll see performance improvement. (Also consider the amount of work you need to do if and when you want to change the name. You have to change all the names in all the tables. If you use a surrogate key, the int pkey, as suggested above, you only have to update a single row.)
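A minimal sketch of the surrogate-key idea, using Python's sqlite3 module as a stand-in for MySQL. The table names employee and project, their columns, and the sample rows are assumptions; only the name Jose20856 comes from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed in SQLite to enforce REFERENCES
conn.executescript("""
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,      -- surrogate integer key
    name        TEXT NOT NULL UNIQUE,
    position    TEXT
);
CREATE TABLE project (
    project_id  INTEGER PRIMARY KEY,
    employee_id INTEGER NOT NULL REFERENCES employee (employee_id),
    title       TEXT NOT NULL
);
CREATE INDEX idx_project_employee ON project (employee_id);
INSERT INTO employee VALUES (1, 'Jose20856', 'developer');
INSERT INTO project VALUES (1, 1, 'billing'), (2, 1, 'reporting');
""")

# Renaming touches one row; the child rows keep pointing at employee_id = 1.
renamed = conn.execute(
    "UPDATE employee SET name = 'Jose20857' WHERE employee_id = 1"
).rowcount

projects = conn.execute("""
    SELECT e.name, p.title
    FROM employee AS e
    JOIN project AS p ON p.employee_id = e.employee_id
    WHERE e.name = 'Jose20857'
    ORDER BY p.project_id
""").fetchall()
print(renamed, projects)
```

The name is still unique and searchable, but it is stored once, and the one-to-many child tables join on a compact integer.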
Read on the web about database normalization.
E.g. http://en.wikipedia.org/wiki/Database_normalization
I think you can even add more tables to it. It all depends on the data and the relations.
Table1 = users incl. userdata
Table2 = Projects (if multiple users work on the same project)
Table3 = Linking user to projects (if multiple users work on the same project)
Table4 = Time spent? Contains the links to the user and to the project.
I think your table 4 can be merged into table 1 because it also contains data specific to one user.
There is probably more you can do, but as already stated, it all depends on the data and the relations.
What we're talking about here is vertical table partitioning (as opposed to horizontal table partitioning). It is a valid database design pattern, which can be useful in these cases:
There are too many columns to fit into one table. That's pretty obvious.
There are columns which are accessed relatively often, and some that are accessed relatively rarely. For example, if you very often need to display the columns joined date, position, salary, and only very rarely the columns personal details, skill set, previous experience, then it makes sense to move the latter to a separate table, as it will (probably) improve performance when accessing the most commonly used ones. In MySQL this is especially true for TEXT and BLOB columns, since they can be stored apart from the rest of the fields, so accessing them takes more time.
There are NULLable columns where the majority of rows are NULL. Once again, if a column is mostly NULL, moving it to a separate table will let you reduce the size of your 'main' table and improve performance. The new table should not allow NULL values and should have entries only for rows where the value is set. This way you reduce the amount of storage/memory resources as well.
MySQL specific: you might want to move some of your columns from an InnoDB table to MyISAM so that you can use full-text indexing while still being able to use some of the features InnoDB provides (this applies only before MySQL 5.6, which added FULLTEXT indexes to InnoDB). It's not a good design generally speaking, though; it's better to use a full-text search engine like Sphinx.
Last but not least. I'd suggest using a numeric field as a key joining all these tables, not a string.
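The vertical split described above might look like this, sketched with Python's sqlite3 module as a stand-in for MySQL. The table names employee and employee_details, their columns, and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,      -- numeric key shared by all the tables
    name        TEXT NOT NULL,
    position    TEXT NOT NULL
);
-- Rarely read, mostly absent data lives in its own table; no row means "not set".
CREATE TABLE employee_details (
    employee_id         INTEGER PRIMARY KEY REFERENCES employee (employee_id),
    previous_experience TEXT NOT NULL
);
INSERT INTO employee VALUES (1, 'Jose20856', 'developer'), (2, 'Ana10001', 'analyst');
INSERT INTO employee_details VALUES (1, 'five years at a hypothetical employer');
""")

# The common query never touches the side table.
listing = conn.execute(
    "SELECT name, position FROM employee ORDER BY employee_id"
).fetchall()

# The rare query pulls the details in with a LEFT JOIN; missing rows come back as NULL.
full = conn.execute("""
    SELECT e.name, d.previous_experience
    FROM employee AS e
    LEFT JOIN employee_details AS d ON d.employee_id = e.employee_id
    ORDER BY e.employee_id
""").fetchall()
print(listing)
print(full)
```

The side table holds a row only where a value exists, so the main table stays narrow and the optional column never needs to store NULL.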
Additional reading about MySQL partitioning (a bit outdated, since MySQL 5.5 added some new features)