There is a requirement to handle dynamic data fields at the database level. Say we have a table called Employee with name, surname, and contact no fields (3 basic fields). As the application progresses, the requirement is that the database and the application should be able to handle dynamic data fields, added with a type, into the database.
Ex: A user will add date of birth and address fields dynamically to the Employee table, which mainly has the 3 basic fields.
The problem is how to cater to this requirement in the optimum way.
There is a picture of the tables I have designed to cater to this, but I am open to an industry-standard, optimal way of achieving this without future problems.
Please advise on this.
You basically have four options for handling such dynamic fields:
Modify the base table structure whenever a new column is added.
Using JSON to store the values.
Using an EAV model (entity-attribute-value).
Basically (1), but storing the additional values in a separate table, or in a separate table per user.
You have not provided enough information in the question to determine which of these is most appropriate for your data model.
However, here is a quick run-down of strengths and weaknesses (a short DDL sketch of the first three options follows at the end):
For modifying the table: On the downside, modifying a table is an expensive operation (especially as the table gets bigger). On the upside, the columns will be visible to all users and have the appropriate type.
For JSON: JSON is quite flexible. However, JSON incurs very large storage overheads because the name of each field is repeated every time it is used. In addition, you don't have a list of all the added fields (unless you maintain that in a separate table).
For EAV: EAV is flexible, but not quite as flexible as JSON. The problem is that the value column has a single type (usually a string), or else accessing the data gets more complicated. Like JSON, this repeats the "name" of the value every time it is used. However, this is often a key to another table, so the overhead is less.
For a separate table for each user: The primary advantage here is isolating users from each other. If this is a requirement, then this might be the way to go (although adding a userId to the EAV model would also work).
So, the most appropriate method depends on factors such as:
Will the fields be shared among all users?
Do the additional fields all have the same type?
What are your concerns about performance and data size?
How often will new fields be added?
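To make the first three options concrete, here is a minimal DDL sketch; the table and column names are illustrative, and the JSON column assumes MySQL 5.7+:

-- Option 1: alter the base table
ALTER TABLE Employee ADD COLUMN date_of_birth DATE NULL;

-- Option 2: one JSON column holding all dynamic fields
ALTER TABLE Employee ADD COLUMN extra_fields JSON NULL;

-- Option 3: an EAV table keyed to the employee
CREATE TABLE employee_attribute (
    employee_id INT NOT NULL,
    attribute   VARCHAR(50) NOT NULL,
    value       VARCHAR(255),
    PRIMARY KEY (employee_id, attribute)
);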
To have dynamic fields, you can use another table where you can set properties of the user.
user table has columns
userid, name, surname, contact
user_props table has columns
propertyid, userid, property, value
in user_props you can insert user properties like
INSERT INTO user_props (userid, property, value)
VALUES (1, 'date_of_birth', '2010-01-10'), (1, 'hobby', 'Stackoverflow');
Like this you can dynamically set any number of properties for a user.
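To read the properties back, you can join the two tables (a minimal sketch using the column names above):

SELECT u.name, u.surname, p.property, p.value
FROM user u
JOIN user_props p ON p.userid = u.userid
WHERE u.userid = 1;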
You might be better off using MongoDB or some other NoSQL/schemaless database, which stores your data in key => value pairs. For the fields you are certain about in advance, you can set a type (in MongoDB) so those columns have a schema. Dynamic fields would be stored as strings, and you would have to figure out the types in your code somehow.
If you need to use MySQL, in your Employee table you could have a fourth column for custom fields, of database type json. Then whenever you add a new custom field, you add the field_name, field_value and field_type. Your schema could look like:
//Schema for Employee table in mysql
id: int
name: varchar
surname: varchar
custom_fields: json //eg [ {"field_name": "DOB", "field_value": "06/09/2020", "field_type": "date"}, ... ]
//Schema for contacts table
id: int
employee_id: int
contact: varchar
In MySQL, you could also get rid of the type (if you can do without it) in custom_fields and structure the JSON as simple key => value pairs, so it looks like:
{
  "Age": "10",
  "salary": "40,000",
  "DOB": "06/09/2020"
}
What you seem to be designing here is a variation of the Entity-Attribute-Value model. It works, but it would be very cumbersome to query against a schema like that. Using a JSON column is a lot neater and a lot faster. Best is to use MongoDB and figure out the types in your code.
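For what it's worth, if you stay on MySQL 5.7+ the JSON column can be queried directly; a minimal sketch, assuming the simple key => value layout above:

-- Employees that have a DOB custom field, with its value extracted
SELECT id, name, custom_fields->>'$.DOB' AS dob
FROM Employee
WHERE JSON_CONTAINS_PATH(custom_fields, 'one', '$.DOB');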
You may deal with this condition using a stored procedure containing an ALTER TABLE statement. Something like:
DROP PROCEDURE IF EXISTS set_dynamic_table;
DELIMITER //
CREATE PROCEDURE set_dynamic_table (IN _field_name VARCHAR(50),
                                    IN _field_type VARCHAR(20),
                                    IN _last_field_name VARCHAR(50))
BEGIN
  SET @sql := CONCAT('ALTER TABLE dynamic_table ',
                     'ADD COLUMN ', _field_name, ' ', _field_type,
                     ' NULL AFTER ', _last_field_name);
  PREPARE _stmt FROM @sql;
  EXECUTE _stmt;
  DEALLOCATE PREPARE _stmt;
END//
DELIMITER ;
You would then call it with the new column's name and type, plus the existing column it should follow, for example (values are illustrative):
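-- adds a DATE column after the existing contact column
CALL set_dynamic_table('date_of_birth', 'DATE', 'contact');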
Related
I want to store a single column with just one value in a MySQL DB, and this column is not related to any other table in the DB. Just to store a single value, should I create an entire table, or is there a better way to store a single key-value pair in an SQL DB?
For example, a boolean variable isActive needs to be stored and queried
It is not uncommon to do this, but don't create a new table for every singleton value. Make one table for this purpose, with two columns, where the first column identifies a name for the value, and the second column is the actual value. Let the data type of the value be string. This is a minor disadvantage when you actually need a boolean, but that way your table is more flexible.
For instance:
create table params(
paramName varchar(100) not null,
paramValue varchar(100)
);
insert into params values ('isActive', '1');
commit;
See also Variant data type in DB, which touches on the need to store different data types in the same column. The consensus is to use the string data type unless the specific data type is really essential, in which case I would suggest creating a separate "parameterXXX" table per data type, so all booleans go in one table, all dates in another, etc.
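If strict typing really matters, the per-type tables could look like this (a sketch; names are illustrative):

create table paramsBoolean(
  paramName varchar(100) not null,
  paramValue boolean
);

create table paramsDate(
  paramName varchar(100) not null,
  paramValue date
);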
I have a 'one to many' relationship between two tables. I want to write a stored procedure in MySQL that can accept a list of child table objects and update the tables. The challenge I am facing is what the data type of the IN parameter for the list of objects should be.
You can try to use VARCHAR(65535) in MySQL.
There is no list data type in MySQL.
Given the info that you are coming from Oracle DB, you might want to know that MySQL does not have a strict concept of objects. And, as answered here, unfortunately you cannot create a custom data type of your own.
The way to work around it is to imagine a table as a class. Thus, your objects will become records of the said table.
You have to settle for one of the following approaches:
Concatenated IDs: Store the concatenated IDs you want to operate on in a string-equivalent datatype like VARCHAR(5000) or TEXT. This way you can either split and loop over the string, or compose a prepared statement dynamically and execute it.
Use a temporary table: Fetch the child table objects, on the fly, into a temporary table and process them. Once you create the temporary table with the fields and constraints that you like, you can populate it with CREATE TEMPORARY TABLE ... AS SELECT ... (or INSERT INTO ... SELECT ...); the SELECT statement should fetch the properties you need (see the sketch below).
Depending on the size of the data, you might want to choose the temp table approach for larger data sets.
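A minimal sketch of the temporary-table approach inside a procedure body; child_table and its columns are hypothetical:

-- _parent_id would be an IN parameter of the procedure
CREATE TEMPORARY TABLE tmp_children AS
  SELECT child_id, some_property
  FROM child_table
  WHERE parent_id = _parent_id;

-- ... process tmp_children here ...

DROP TEMPORARY TABLE tmp_children;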
You can use the TEXT data type to store a large amount of data in a single variable.
You can define it in the SP as:
IN variable_name TEXT
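As a sketch of how such a TEXT parameter might be consumed, assuming it carries a comma-separated ID list and a hypothetical child_table:

DELIMITER //
CREATE PROCEDURE update_children (IN _child_ids TEXT)
BEGIN
  -- _child_ids is e.g. '1,2,3'; FIND_IN_SET matches a value against such a list
  UPDATE child_table
  SET processed = 1
  WHERE FIND_IN_SET(child_id, _child_ids) > 0;
END//
DELIMITER ;

Note that FIND_IN_SET cannot use an index, so this suits small lists.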
Is there any way to issue a MySQL statement to create a table without having to specify the number of columns? I am working with the MySQL C API to grab some variables and then store them in a table. The issue I am encountering is that I have to create the table (obviously) before inserting the variables into it. These variables are sometimes structures (two, three or four variables going into a single table), so I am looking for a way of not having to say:
CREATE TABLE Structures(ID varchar(10) primary key, name varchar(25))
but instead create a table into which any number of columns can be inserted?
Let me know if I am being a bit vague in here.
No, you can't. You can however add columns at runtime using ALTER TABLE.
However, personally, I wouldn't recommend that. You should know what your database looks like, before you start implementing it.
The other way to code this is to use two tables and a one-to-many between them.
For instance, you might have tables like this (pseudocode):
table experiment
experiment_id: long
experiment_header: varchar(50)
table experiment_data
experiment_data_id: long
experiment_id: long
key: varchar(20)
value: long
#id = insert into experiment (experiment_header) values ('test run')
insert into experiment_data (experiment_id, `key`, value) values (#id, 'x', 1)
insert into experiment_data (experiment_id, `key`, value) values (#id, 'y', 20)
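Reading the data back is then a simple join (`key` is backticked because it is a reserved word in MySQL):

SELECT e.experiment_header, d.`key`, d.value
FROM experiment e
JOIN experiment_data d ON d.experiment_id = e.experiment_id
WHERE e.experiment_id = 1;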
As @Mark and @attis said:
You can't. You can however add columns at runtime using ALTER
TABLE.
However, personally, I wouldn't recommend that. You should know what
your database looks like, before you start implementing it.
I think the best solution could be:
Create two tables :
columns with (id, name)
values with (id, column_id, value)
then you just have to join them to easily get your results, and you can easily add other "columns" (see the sketch below)
You can also store everything in the values table, but your data may be inconsistent, and, in my mind, it's faster to look up a number than to compare strings (table locks, indexes, etc...)
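A sketch of what that could look like; the names columns_def and column_values are used here because a bare "values" would clash with a reserved word:

CREATE TABLE columns_def (
  id   INT PRIMARY KEY AUTO_INCREMENT,
  name VARCHAR(50)
);

CREATE TABLE column_values (
  id        INT PRIMARY KEY AUTO_INCREMENT,
  column_id INT,
  value     VARCHAR(255)
);

-- Join to get the name/value pairs back
SELECT c.name, v.value
FROM column_values v
JOIN columns_def c ON c.id = v.column_id;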
I wanted to comment on @Mark's post, but can't (reputation too low)
I'm new to MySQL, and just learned about the importance of data normalization. My database has a simple structure:
I have 1 table called users with fields:
userName (string)
userEmail (string)
password (string)
requests (an array of dictionaries in JSON string format)
data (another array of dictionaries in JSON string format)
deviceID (string)
Right now, this is my structure. Being very new to MySQL, I'm really not seeing why my above structure is a bad idea. Why would I need to normalize this and make separate tables? That's the first question: why? (Some have also said not to put JSON in my table. Why or why not?)
The second question is how? With the above structure, how many tables should I have, and what would be in each table?
Edit:
So maybe normalization is not absolutely necessary here, but maybe there's a better way to implement my data field? The data field is an array of dictionaries: each dictionary is just a note item with a few keys (title, author, date, body). What I do now, which I think might be inefficient, is this: every time a user composes a new note, I send that note from my app to PHP to handle. I take the JSON array of dictionaries already part of that user's data, convert it to a PHP array, append the new note to the end, convert the whole thing back to JSON, and put it back in the table. This process is repeated every time a new note is composed. Is there a better way to do this? Maybe a user's data should be a table, with each row being a note, but I'm not really sure how this would work?
The answer to all your questions really depends on what the JSON data is for, and whether you'll ever need to use some property of that data to determine which rows are returned.
If your data truly has no schema, and you're really just using it to store data that will be used by an application that knows how to retrieve the correct row by some other criteria (such as one of the other fields) every time, there's no reason to store it as anything other than exactly as that application expects it (in this case, JSON).
If the JSON data DOES contain some structure that is the same for all entries, and if it's useful to query this data directly from the database, you would want to create one or more tables (or maybe just some more fields) to hold this data.
As a practical example of this, if the data field contains JSON enumerating services for that user in an array, and each service has a unique id, type, and price, you might want a separate table with the following fields (using your own naming conventions):
serviceId (integer)
userName (string)
serviceType (string)
servicePrice (float)
And each service for that user would get its own entry. You could then query for users that have a particular service, which, depending on your needs, could be very useful (see the query sketch below). In addition to easy querying, indexing certain fields of the separate tables can also make for very quick queries.
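For example, assuming those fields live in a hypothetical user_services table:

-- Which users have a particular service type?
SELECT DISTINCT userName
FROM user_services
WHERE serviceType = 'backup';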
Update: Based on your explanation of the data stored, and the way you use it, you probably do want it normalized. Something like the following:
# user table
userId (integer, auto-incrementing)
userName (string)
userEmail (string)
password (string)
deviceID (string)
# note table
noteId (integer, auto-incrementing)
userId (integer, matches user.userId)
noteTime (datetime)
noteData (string, possibly split into separate fields depending on content, such as subject, etc.)
# request table
requestId (integer, auto-incrementing)
userId (integer, matches user.userId)
requestTime (datetime)
requestData (string, again split as needed)
You could then query like so:
# Get a user
SELECT * FROM user WHERE userId = '123';
SELECT * FROM user WHERE userNAme = 'foo';
# Get all requests for a user
SELECT * FROM request WHERE userId = '123';
# Get a single request
SELECT * FROM request WHERE requestId = '325325';
# Get all notes for a user
SELECT * FROM note WHERE userId = '123';
# Get all notes from last week
SELECT * FROM note WHERE userId = '123' AND noteTime > CURDATE() - INTERVAL 1 WEEK;
# Add a note to user 123
INSERT INTO note (noteId, userId, noteData) VALUES (null, 123, 'This is a note');
Notice how much more you can do with normalized data, and how easy it is? It's trivial to locate, update, append, or delete any specific component.
Normalization is a philosophy. Some people think it fits their database approach, some don't. Many modern database solutions even focus on denormalization to improve speeds.
Normalization often doesn't improve speed. However, it greatly improves the simplicity of accessing and writing data. For example, if you wanted to add a request, you would have to write a completely new JSON field. If it was normalized, you could simply add a row to a table.
In normalization, "array of dictionaries in JSON string format" is always bad. Array of dictionaries can be translated as list of rows, which is a table.
If you're new to databases: NORMALIZE. Denormalization is something for professionals.
A main benefit of normalization is to eliminate redundant data, but since each user's data is unique to that user, there is no benefit to splitting this table and normalizing. Furthermore, since the front-end will employ the dictionaries as JSON objects anyway, undue complication and a decrease in performance would result from trying to decompose this data.
Okay, here is a normalized MySQL data model. Note: you can separate authors and titles into two tables to further reduce data redundancy. You can probably use similar techniques for the "requests" dictionaries:
CREATE TABLE USERS(
UID int NOT NULL AUTO_INCREMENT PRIMARY KEY,
userName varchar(255) UNIQUE,
password varchar(30),
userEmail varchar(255) UNIQUE,
deviceID varchar(255)
) ENGINE=InnoDB;
CREATE TABLE BOOKS(
BKID int NOT NULL AUTO_INCREMENT PRIMARY KEY,
FKUSERS int,
Title varchar(255),
Author varchar(50)
) ENGINE=InnoDB;
ALTER TABLE BOOKS
ADD FOREIGN KEY (FKUSERS)
REFERENCES USERS(UID);
CREATE TABLE NOTES(
ID int NOT NULL AUTO_INCREMENT PRIMARY KEY,
FKUSERS int,
FKBOOKS int,
Date date,
Notes text
) ENGINE=InnoDB;
ALTER TABLE NOTES
ADD FOREIGN KEY BKNO (FKUSERS)
REFERENCES USERS(UID);
ALTER TABLE NOTES
ADD FOREIGN KEY (FKBOOKS)
REFERENCES BOOKS(BKID);
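For example, to pull all notes for one user together with the book each note belongs to:

SELECT u.userName, b.Title, n.Date, n.Notes
FROM NOTES n
JOIN USERS u ON u.UID = n.FKUSERS
JOIN BOOKS b ON b.BKID = n.FKBOOKS
WHERE u.UID = 1;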
In your case, I would abstract out the class that handles this table and keep the data denormalized for now. If, in future, the data access patterns change and I need to normalize the data, I can just do so with less impact on the program: I only need to change the class that handles this set of data to query the normalized tables, but return the data as if the database structure never changed.
We are currently thinking about different ways to implement custom fields for our web application. Users should be able to define custom fields for certain entities and fill in/view this data (and possibly query the data later on).
I understand that there are different ways to implement custom fields (e.g. using a name/value table or using alter table etc.) and we are currently favoring using ALTER TABLE to dynamically add new user fields to the database.
After browsing through other related SO topics, I couldn't find any big drawbacks of this solution. In contrast, having the option to query the data in fast way (e.g. by directly using SQL's where statement) is a big advantage for us.
Are there any drawbacks you could think of by implementing custom fields this way? We are talking about a web application that is used by up to 100 users at the same time (not concurrent requests..) and can use both MySQL and MS SQL Server databases.
Just as an update, we decided to add new columns via ALTER TABLE to the existing database table to implement custom fields. After some research and tests, this looks like the best solution for most database engines. A separate table with meta information about the custom fields provides the needed information to manage, query and work with the custom fields.
The first drawback I see is that you need to grant your application's service account ALTER rights.
This implies that your security model needs careful attention as the application will be able to not only add fields but to drop and rename them as well and create some tables (at least for MySQL).
Secondly, how would you distinguish fields that are required per user? Or can the fields created by user A be accessed by user B?
Note that the number of columns may also grow significantly. If every one of your 100 users adds 2 fields, we are already talking about 200 fields.
Personally, I would use one of the two approaches or a mix of them:
Using a serialized field
I would add one text field to the table in which I would store a serialized dictionary or dictionaries:
{
  "user_1": {"key1": "val1", "key2": "val2", ...},
  "user_2": {"key1": "val1", "key2": "val2", ...},
  ...
}
The drawback is that the values are not easily searchable.
Using a multi-type name/value table
fields table:
user_id: int
field_name: varchar(100)
type: enum('INT', 'REAL', 'STRING')
values table:
field_id: int
row_id: int # the main table row id
int_value: int
float_value: float
text_value: text
Of course, it requires a join and is a bit more complicated to implement, but it is far more generic and, if indexed properly, quite efficient (see the sketch below).
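Reading a row's fields back could look like this sketch; it assumes the fields table has an id primary key that values.field_id references:

SELECT f.field_name,
       COALESCE(CAST(v.int_value AS CHAR),
                CAST(v.float_value AS CHAR),
                v.text_value) AS field_value
FROM `values` v
JOIN fields f ON f.id = v.field_id
WHERE v.row_id = 42;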
I see nothing wrong with adding new custom fields to the database table.
With this approach, the specific/most appropriate type can be used, i.e. if you need an int field, define it as int. With a name/value type table, you'd be storing multiple data types as one type (probably nvarchar), unless you give that name/value table multiple columns of different types and populate the appropriate one, but that is a bit horrible.
Also, adding new columns makes it easier to query/no need to involve a join to a new name/value table.
It may not feel as generic, but I feel that's better than having a "one-size fits all" name/value table.
From an SQL Server point of view (2005 onwards)....
An alternative would be to create one "custom data" field of type XML; this would be truly generic and require no field creation or a separate name/value table. It also has the benefit that not all records have to have the same custom data (i.e. the one field is common, but what it contains doesn't have to be). I'm not 100% sure on the performance impact, but XML data can be indexed.
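A sketch of querying such an XML column in SQL Server; the column and element names are illustrative:

-- Assumes Employee has an XML column CustomData like <fields><DOB>2020-09-06</DOB></fields>
SELECT id,
       CustomData.value('(/fields/DOB)[1]', 'date') AS DOB
FROM Employee
WHERE CustomData.exist('/fields/DOB') = 1;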