normalization of database structure - mysql

I was reading about database normalization and got confused by the following situation in my project.
I have two tables, TableA and TableB.
The tables are independent of each other and have no relationship at all.
They represent completely different data.
Each table will have its own parameters. However, a parameter as an object has the same properties in both cases.
So my question is: should I have a single Parameter table that serves both TableA and TableB,
or
should I have a separate Parameter table for each of TableA and TableB?
The structure looks like this:
Case I:
TableA
    ID
    Name
    Description
TableB
    ID
    Name
    SomeFlag
Parameter
    ID
    TableA_ID
    TableB_ID
    Name
    Description
    Type
Case II:
TableA
    ID
    Name
    Description
Parameter_A
    ID
    TableA_ID
    Name
    Description
    Type
TableB
    ID
    Name
    SomeFlag
Parameter_B
    ID
    TableB_ID
    Name
    Description
    Type
I personally prefer Case I, as it does not make sense to create another table representing the same type of data.
According to normalization, a table should represent only one thing, so I guess I should have only one Parameter table. But what if that table means something completely different when viewed from TableA than when viewed from TableB?

I would use Case I, but with some changes. The Parameter entity does hold one thing: parameters for a table. Each parameter row should relate to only one table (based on your analysis that the tables are not related).
Parameter
----------
PK Param_ID
FK Main_Table_ID
Main_Table_name (A or B)
param_Name
param_Description
param_Type
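
A minimal sketch of this single-table layout, using Python's built-in sqlite3 module as a stand-in for MySQL. The lowercase column names, the CHECK constraint, the sample rows, and the params_for helper are assumptions made for illustration. Note that because main_table_id can point at either table, the database cannot declare it as an ordinary foreign key, so that integrity check falls to the application.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parameter (
    param_id          INTEGER PRIMARY KEY,
    main_table_id     INTEGER NOT NULL,   -- id of the owning row in TableA or TableB
    main_table_name   TEXT NOT NULL CHECK (main_table_name IN ('A', 'B')),
    param_name        TEXT NOT NULL,
    param_description TEXT,
    param_type        TEXT
);
""")

# Hypothetical sample data: row 1 exists in both tables, with different parameters.
conn.executemany(
    "INSERT INTO parameter (main_table_id, main_table_name, param_name) VALUES (?, ?, ?)",
    [(1, "A", "width"), (1, "B", "enabled"), (2, "A", "height")],
)

def params_for(table_name, row_id):
    """Return the parameter names that belong to one row of one main table."""
    rows = conn.execute(
        "SELECT param_name FROM parameter "
        "WHERE main_table_name = ? AND main_table_id = ? ORDER BY param_id",
        (table_name, row_id),
    )
    return [r[0] for r in rows]

print(params_for("A", 1))
print(params_for("B", 1))
```

Every lookup has to filter on both main_table_name and main_table_id; filtering on the id alone would mix parameters of TableA and TableB rows that happen to share an id.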

If it makes logical sense for a Parameter to have both Table A and Table B in the same instance (not an either/or), then Case I is better.
In Relational Theory, every table is a type. Even if they may have common data, types are based around their usage. And though it's a little more complicated, Case II is more normalized.
There is another possibility that hasn't been mentioned; I'll call it Case III.
TableA
ID
Name
Description
PropertyID
TableB
ID
Name
SomeFlag
PropertyID
Parameter
ID
Name
Description
Type
If the parameters (referenced through PropertyID above) will always be common to both tables, this is probably going to be the best solution.

MySQL - convert names and values into columns

I have the following two tables:
Parameters table: ID, EntityType, ParamName, ParamType
Entity table: ID, Type, Name, ParamID, StringValue, NumberValue, DateValue
Entity.ParamID is linked to Parameters.ID
Entity.Type is linked to Parameters.EntityType
StringValue, NumberValue, and DateValue contain data depending on Parameters.ParamType (1, 2, 3)
The query result should contain:
Entity.ID, Entity.Name, Parameters.ParamName1, Parameters.ParamName2... Parameters.ParamNameX
The content of each ParamNameX column is the value chosen according to the correlation above. How can I turn the parameter names into columns and their values into the data of those columns? I don't even know where to begin.
To clarify: an entity X can, for example, be of entity type 1 and entity type 2. The Parameters table contains parameter names for both types, but I need (for example) only the parameter names of entity type 1.
What you are trying to achieve is an EAV (Entity-Attribute-Value) model.
But the way you have set up your tables is wrong for that.
You should have one table per value type:
entity_string, entity_number, and entity_date, plus a main entity table that holds the id and general columns such as create_time and update_time.
Look at Magento and how it sets up its tables. Organized like this, the data is much easier to query and maintain.
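
For the tables as the question describes them, one common way to turn parameter names into columns is conditional aggregation (MAX over a CASE expression per parameter). This is a swapped-in technique, not the answer's table-per-type layout. The sketch below uses Python's sqlite3 module as a stand-in for MySQL; the parameter names color, weight, and owner and all sample rows are hypothetical, and the column list has to be known when the query is written.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parameters (
    id INTEGER PRIMARY KEY, entity_type INTEGER, param_name TEXT, param_type INTEGER);
CREATE TABLE entity (
    id INTEGER, type INTEGER, name TEXT, param_id INTEGER,
    string_value TEXT, number_value REAL, date_value TEXT);
-- Hypothetical sample data: one entity row per (entity, parameter) pair.
INSERT INTO parameters VALUES (1, 1, 'color', 1), (2, 1, 'weight', 2), (3, 2, 'owner', 1);
INSERT INTO entity VALUES
    (1, 1, 'box',   1, 'red',  NULL, NULL),
    (1, 1, 'box',   2, NULL,   2.5,  NULL),
    (2, 1, 'crate', 1, 'blue', NULL, NULL);
""")

# One MAX(CASE ...) per output column; MAX skips the NULLs produced by
# rows that belong to a different parameter.
pivot_sql = """
SELECT e.id, e.name,
       MAX(CASE WHEN p.param_name = 'color'  THEN e.string_value END) AS color,
       MAX(CASE WHEN p.param_name = 'weight' THEN e.number_value END) AS weight
FROM entity e
JOIN parameters p ON p.id = e.param_id AND p.entity_type = e.type
WHERE e.type = ?
GROUP BY e.id, e.name
ORDER BY e.id
"""
rows = conn.execute(pivot_sql, (1,)).fetchall()
for row in rows:
    print(row)
```

An entity that has no value for a parameter gets NULL in that column, as the crate row does for weight here.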

auto_increment index depends on another table

I have two tables:
Friends:
id  name
1   jhon
2   peter
Teammates:
id  name
3   juan
I am looking for a way to auto-increment the id of the second table (Teammates) according to the first table (Friends).
When I add a new record to Teammates, its id should never match an id in Friends.
I think this is not good practice. If you do so, you are introducing an implicit functional dependency between both tables outside of the declared design. If you want to do it anyway, you can use a trigger to assign the value instead of making the column auto-increment.
I would suggest having one table for all people with the real auto-increment id; then you can use one of several approaches:
i) Make your two existing tables take their id values as foreign keys to this new table, with the corresponding integrity constraint.
ii) Simply create two views of the table: one for friends, the other for teammates.
Table_Friends: (id, name, role)
View_Friends: SELECT id, name FROM Table_Friends WHERE role = value_for_friend_role
View_Mates: SELECT id, name FROM Table_Friends WHERE role = value_for_teammate_role
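
Approach ii) can be sketched as follows, again with Python's sqlite3 module standing in for MySQL. The table name people, the role values 'friend' and 'teammate', and the CHECK constraint are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (
    id   INTEGER PRIMARY KEY AUTOINCREMENT,   -- the single real id sequence
    name TEXT NOT NULL,
    role TEXT NOT NULL CHECK (role IN ('friend', 'teammate'))
);
CREATE VIEW view_friends AS SELECT id, name FROM people WHERE role = 'friend';
CREATE VIEW view_mates   AS SELECT id, name FROM people WHERE role = 'teammate';
""")

# All inserts go through the one table, so ids can never collide across roles.
conn.executemany(
    "INSERT INTO people (name, role) VALUES (?, ?)",
    [("jhon", "friend"), ("peter", "friend"), ("juan", "teammate")],
)

friends = conn.execute("SELECT id, name FROM view_friends ORDER BY id").fetchall()
mates = conn.execute("SELECT id, name FROM view_mates ORDER BY id").fetchall()
print(friends)
print(mates)
```

Because there is only one id sequence, the uniqueness the question asks for comes from the primary key itself rather than from a trigger.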

Index data from multiple table into solr

I have three tables: TableA, TableB, and TableC.
TableA
idA ------------ PK
col1A
TableB
idB ----------- PK
col1B
TableC
idC ---------- PK
col1C
I am indexing all the data in Solr in a single core, so there is a chance of TableC data being overwritten by TableB or TableA data, and vice versa, because the primary keys are auto-generated and the same value can occur in different tables. How do I solve this problem?
I have two solutions in mind:
1) Append the table name as a suffix (pk_tablename) to make the id unique in Solr.
2) Create a separate core for each table.
Which do you suggest is best?
In my business domain, a table can have millions of records.
Please advise.
Solution 1 should be fine. You can store data from different tables in a single core if you want to search them all with a single query. Your primary key is fine. Along with that you can also store the table name in another field, so your docs will look like:
{
unique_id: 1234_A,
id: 1234,
table: A,
data: <text field>
}
Storing the table name will help you perform searches restricted to some table(s) only.
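
A small sketch of building such documents before they are sent to Solr. The helper name to_solr_doc and the sample values are hypothetical, and the step that actually posts the documents to a core is left out on purpose.

```python
def to_solr_doc(table, pk, data):
    """Build one document whose unique_id combines the row's primary key and its table name."""
    return {
        "unique_id": f"{pk}_{table}",  # e.g. "1234_A"; unique across tables
        "id": pk,                      # original primary key, kept for lookups
        "table": table,                # lets a query be restricted to one table
        "data": data,
    }

# The same auto-generated key in two tables no longer collides.
docs = [
    to_solr_doc("A", 1234, "text from TableA"),
    to_solr_doc("B", 1234, "text from TableB"),
]
for doc in docs:
    print(doc["unique_id"])
```

With the table name stored as its own field, a search can then be limited to one table, for example with a filter query on the table field.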

Handling custom user fields with possibility of grow and shrink

This question is mostly about how to approach the problem: ideas, design, and so on.
I have a situation where a user can create as many custom fields as he likes, of type Number, Text, or Date, and use them to build a form. I have to design a table model that can store the values so that they can be queried once saved.
Previously I hard-coded the format for 25 user-defined fields (UDFs). I made a table with 25 columns (10 Number, 10 Text, and 5 Date) and stored the label in it whenever a user made use of a field. I then mapped it to another table with the same format, which stores the values. The mapping records which field has which label, but I suspect this is not an efficient approach.
Any suggestions would be appreciated.
Users have permission to create any number of UDFs of the above types. These can then be used to build forms; the number of forms is also unbounded, and the data has to be saved for each form type.
E.g., say a user created 10 Number, 10 Date, and 10 Text fields, used the first 5 of each to make form1 and all 10 of each to make form2, and then saved data.
My thoughts on it:
Make a table1 with [id, name (as UDF_xxx, where xxx is the data type), UserLabel],
a table2 to map forms to table1 [id (f_key table1_id), F_id (form id)],
and one table per data type as [id (f_key of table1), F_id (form number), R_id (row id for the data, the same across all data types), value].
Thanks to all. Both the DataSet entry and the JSON approach look good, as they allow wider extensibility, so I'm going to implement one of them. I still have to figure out which best fits the existing format.
There are two approaches I have used.
XML: To create dynamic user attributes, you may use XML. The XML is stored in a CLOB column, say user_attributes. Store the entire user data as XML key-value pairs, with the type as an attribute or another field. This gives you maximum freedom. You can use XOM or any other XML object model API to display or operate on the data. A typical node will look like this:
<userdata>
...
...
<datanode>
<key type="Date">User Birth</key>
<value>1994-02-25</value>
</datanode>
...
</userdata>
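
A minimal sketch of reading such a document back into typed values. It uses Python's standard xml.etree.ElementTree instead of XOM; the second datanode, the type names Number and Text, and the CONVERTERS table are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Sample document; the "Lucky Number" node is hypothetical.
xml_text = """<userdata>
  <datanode>
    <key type="Date">User Birth</key>
    <value>1994-02-25</value>
  </datanode>
  <datanode>
    <key type="Number">Lucky Number</key>
    <value>7</value>
  </datanode>
</userdata>"""

# Maps the type attribute to a function that converts the stored text.
CONVERTERS = {"Date": date.fromisoformat, "Number": float, "Text": str}

def parse_userdata(text):
    """Return {key: typed value} for every <datanode> in the document."""
    root = ET.fromstring(text)
    result = {}
    for node in root.findall("datanode"):
        key = node.find("key")
        value = node.find("value")
        convert = CONVERTERS.get(key.get("type"), str)  # unknown types stay strings
        result[key.text] = convert(value.text)
    return result

attrs = parse_userdata(xml_text)
print(attrs)
```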
Attribute-AttributeValue: This is the same idea as above, but using tables. You create a table, attributes, with user_id as FK, and another table, attribute_values, with attr_id as FK. attributes contains the field names and types for each user, and attribute_values contains the values of those attributes. So basically:
users
user_id
attributes
attr_id
user_id (FK)
attr_type
attr_name
attribute_values
attr_val_id
attr_id (FK)
attr_val
As you can see, in both approaches you are not limited in how many fields you have or what type of data they hold. The downside is parsing: in either case, you will have to do a small amount of processing to display or analyze the data.
The best-of-both-worlds approach (between a rigid column structure and completely dynamic data) is to have a users table with must-have columns (like user_name, age, sex, address, etc.) and keep user-created data (like favorite pet, house, etc.) in either XML or attribute/attribute_value tables.
What do you want to achieve?
One table per form permutation, or might each dataset consist of a different set of fields?
Two possibilities come to mind:
1. Create a table that describes one field of a dataset, i.e. the key might be dataset id + field id, and additional columns could contain the value stored as a string and the type of that value (number, string, boolean, etc.).
That way each dataset can be different, but when reading a dataset and storing it into an object you can create the appropriate value types (Integer, Double, String, Boolean, etc.).
2. Create a table per form, using some naming convention. When the form layout is changed, execute ALTER TABLE statements to add, remove, or rename columns or change their type.
When the user changes the type of a column or deletes it, you might need to either deny that if the values are not null or at least ask the user whether she's willing to drop values that don't match the new requirements.
Edit: Example for approach 1
Table UDF //describes the available fields
--------
id (PK)
user_id (FK)
type
name

Table FORM //describes a form's general attributes
--------
id (PK)
user_id (FK)
name
description

Table FORM_LAYOUT //describes a form's field layout
--------
form_id (FK)
udf_id (FK)
mapping //mapping info like column index, form field name etc.

Table DATASET_ENTRY //describes one entry of a dataset, i.e. the value of one UDF in
--------
id (PK)
row_id
form_id (FK)
udf_id (FK)
value
Selecting the content for a specific form might then be done like this:
SELECT e.value, f.type, l.mapping from DATASET_ENTRY e
JOIN UDF f ON e.udf_id = f.id
JOIN FORM_LAYOUT l ON e.form_id = l.form_id AND e.udf_id = l.udf_id
WHERE e.row_id = ? AND e.form_id = ?
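
The query above can be exercised end to end with a small sketch, using Python's sqlite3 module as a stand-in for the real database. The FORM table is left out for brevity; the sample fields, the mapping values col0 and col1, the type names, and the load_row helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE udf (id INTEGER PRIMARY KEY, user_id INTEGER, type TEXT, name TEXT);
CREATE TABLE form_layout (form_id INTEGER, udf_id INTEGER, mapping TEXT);
CREATE TABLE dataset_entry (
    id INTEGER PRIMARY KEY, row_id INTEGER, form_id INTEGER, udf_id INTEGER, value TEXT);
-- Hypothetical sample data: form 1 uses two fields and has two saved rows.
INSERT INTO udf VALUES (1, 10, 'Number', 'Age'), (2, 10, 'Text', 'City');
INSERT INTO form_layout VALUES (1, 1, 'col0'), (1, 2, 'col1');
INSERT INTO dataset_entry VALUES
    (1, 1, 1, 1, '42'), (2, 1, 1, 2, 'Oslo'), (3, 2, 1, 1, '37');
""")

# Values are stored as strings; the UDF type decides how to convert them back.
CONVERTERS = {"Number": float, "Text": str}

def load_row(row_id, form_id):
    """Return {mapping: typed value} for one saved row of one form."""
    rows = conn.execute(
        """
        SELECT e.value, f.type, l.mapping FROM dataset_entry e
        JOIN udf f ON e.udf_id = f.id
        JOIN form_layout l ON e.form_id = l.form_id AND e.udf_id = l.udf_id
        WHERE e.row_id = ? AND e.form_id = ?
        """,
        (row_id, form_id),
    )
    return {mapping: CONVERTERS.get(type_, str)(value) for value, type_, mapping in rows}

first_row = load_row(1, 1)
print(first_row)
```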
Create a table that tracks which fields exist. Then create one table for each data type you want to support, into which the users' values are stored.
create table Fields(
fieldid int not null,
fieldname text not null,
fieldtype int not null
);
create table FieldDate
(
ValueId int not null,
fieldid int not null,
value date
);
create table FieldNumber
(
ValueId int not null,
fieldid int not null,
value number
);
..
Another possibility would be to use ALTER TABLE to create the custom fields. If your application has the rights to perform this command and the custom fields change only rarely, this is the option I would choose.

MySQL: LIKE Query Help?

I have a column in my table called student_id, and I am storing the student IDs associated with a particular record in that column, delimited with a | character. Here are a couple sample entries of the data in that column:
243|244|245
245|1013|289|1012
549|1097|1098|245|1099
I need to write a SQL query that will return records that have a student_id of 245. Any help will be greatly appreciated.
Don't store multiple values in the student_id field, as having exactly one value for each row and column intersection is a requirement of First Normal Form. This is a Good Thing for many reasons, but an obvious one is that it resolves having to deal with cases like having a student_id of "1245".
Instead, it would be much better to have a separate table for storing the student IDs associated with the records in this table. For example (you'd want to add proper constraints to this table definition as well),
CREATE TABLE mytable_student_id (
mytable_id INTEGER,
student_id INTEGER
);
And then you could query using a join:
SELECT * FROM mytable JOIN mytable_student_id
ON (mytable.id=mytable_student_id.mytable_id) WHERE mytable_student_id.student_id = 245
Note that since you didn't post any schema details regarding your original table other than that it contains a student_id field, I'm calling it mytable for the purpose of this example (and assuming it has a primary key field called id -- having a primary key is another requirement of 1NF).
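
A runnable sketch of this junction-table design, using Python's sqlite3 module as a stand-in for MySQL. The composite primary key and the foreign key are one possible choice for the "proper constraints" mentioned above, and record 4 (containing 1245) is a hypothetical row added to show that it does not match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (id INTEGER PRIMARY KEY);
CREATE TABLE mytable_student_id (
    mytable_id INTEGER NOT NULL REFERENCES mytable(id),
    student_id INTEGER NOT NULL,
    PRIMARY KEY (mytable_id, student_id)   -- one row per (record, student) pair
);
""")

# The question's pipe-delimited values, split into one row per student id.
samples = {
    1: "243|244|245",
    2: "245|1013|289|1012",
    3: "549|1097|1098|245|1099",
    4: "1245|2450",
}
for record_id, packed in samples.items():
    conn.execute("INSERT INTO mytable (id) VALUES (?)", (record_id,))
    conn.executemany(
        "INSERT INTO mytable_student_id VALUES (?, ?)",
        [(record_id, int(s)) for s in packed.split("|")],
    )

matching_ids = [
    row[0]
    for row in conn.execute(
        "SELECT mytable.id FROM mytable JOIN mytable_student_id "
        "ON mytable.id = mytable_student_id.mytable_id "
        "WHERE mytable_student_id.student_id = ? ORDER BY mytable.id",
        (245,),
    )
]
print(matching_ids)
```

Because student_id is compared as an integer, 1245 and 2450 cannot match 245, and the column can be indexed in the usual way.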
@Donut is totally right about First Normal Form: if you have a one-to-many relation you should use a separate table; other solutions lead to ad-hoccery and unmaintainable code.
But if you're faced with data that is in fact stored like that, one common way of doing it is this:
WHERE CONCAT('|',student_id,'|') LIKE '%|245|%'
Again, I agree with Donut, but this is the proper query to use if you can't do anything about the data for now.
WHERE student_id = '245' OR student_id LIKE '%|245|%' OR student_id LIKE '%|245' OR student_id LIKE '245|%'
This takes care of 245 being the only value or appearing at the start, middle, or end of the string. But if you aren't stuck with this design, please, please do what Donut recommends.
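
The delimiter-wrapping idea can be checked against the sample data with a short sketch. It uses Python's sqlite3 module, where string concatenation is written with the || operator rather than MySQL's CONCAT(); the table name mytable and rows 4 and 5 are hypothetical additions that cover the false-positive and single-value cases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, student_id TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        (1, "243|244|245"),
        (2, "245|1013|289|1012"),
        (3, "549|1097|1098|245|1099"),
        (4, "1245|2450"),   # contains the digits 245 but not the id 245
        (5, "245"),         # 245 as the only value
    ],
)

# Naive substring match: also hits row 4.
naive = [r[0] for r in conn.execute(
    "SELECT id FROM mytable WHERE student_id LIKE '%245%' ORDER BY id")]

# Wrap the column in delimiters so every id, including the first and last,
# is surrounded by '|' before matching.
wrapped = [r[0] for r in conn.execute(
    "SELECT id FROM mytable "
    "WHERE ('|' || student_id || '|') LIKE '%|245|%' ORDER BY id")]

print(naive)
print(wrapped)
```

Either form still has to scan every row, since a pattern with a leading wildcard cannot use an ordinary index, which is one more reason to prefer the separate table.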