Is it possible to add extra columns to fields in MySQL?

We are looking to extend MySQL by adding extra metadata to each column definition. Right now, a column definition has the following properties.
Field, Type, Null, Key, Default, Extra
We want to be able to add one more property to the column definition, such as Attributes. Our system's design specifications require us to describe more data per column. How can we accomplish this in MySQL?
The query that returns all of the columns is as follows.
SHOW COLUMNS FROM MyDB.MyTable;
EDIT 1
I should have added this to begin with, and I apologize for not doing so. We are currently describing attributes in the Comments section for each column type, and we understand that this is a very dirty solution, but it was the only one we could think of at the time. We have built a code generator that revolves around the DB structure, and that generator is what this initiative really stems from. We want to describe code attributes for a column so the code generator can pick up the changes and refresh the code base on each change or run.

First, terminology: "field" and "column" are basically synonyms in this context. There is no distinction between fields and columns. Some MySQL commands even allow you to use these two words interchangeably (e.g. SHOW FIELDS FROM MyDB.MyTable).
We want to assign attributes to each column in a table. Adding "field_foo" for "field" would repeat the same data over and over again for each row.
Simple answer:
If you want more attributes that pertain to a given column foo, then you should create another table, where foo is its primary key, so each distinct value gets exactly one row. This is part of the process of database normalization. This allows you to have attributes to describe a given value of foo without repeating data, even when you use that value many times in your original table.
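A minimal sketch of that lookup-table idea, using Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere. The table names (my_table, foo_attributes), the description column, and the sample values are hypothetical and only illustrate the shape.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL; all names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One row per distinct value of foo, carrying the attributes that describe it.
    CREATE TABLE foo_attributes (
        foo         TEXT PRIMARY KEY,
        description TEXT NOT NULL
    );
    -- The original table may repeat foo freely, but never repeats its attributes.
    CREATE TABLE my_table (
        id  INTEGER PRIMARY KEY,
        foo TEXT NOT NULL REFERENCES foo_attributes(foo)
    );
    INSERT INTO foo_attributes VALUES ('red', 'warm colour'), ('blue', 'cool colour');
    INSERT INTO my_table (foo) VALUES ('red'), ('red'), ('blue');
""")

# A join brings the attributes back next to each row that uses the value.
rows = conn.execute("""
    SELECT t.id, t.foo, a.description
    FROM my_table AS t
    JOIN foo_attributes AS a ON a.foo = t.foo
    ORDER BY t.id
""").fetchall()
print(rows)
```

The description of 'red' is stored once, even though two rows of my_table use that value.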
It sounds like you might also need to allow for extensibility and you want to allow new columns at some future time, but you don't know which columns or how many right now. This is a pretty common project requirement.
You might be interested in my presentation Extensible Data Modeling, in which I give an overview of different solutions in SQL for this type of problem.
Extra Columns
Entity-Attribute-Value
Class Table Inheritance
Serialized LOB
Inverted Indexes
Online Schema Changes
Non-Relational Databases
None of these solutions is foolproof; each has its strengths and weaknesses. So it is worth learning about all of them and then deciding which ones have strengths that matter to your specific project and weaknesses that don't inconvenience you too much (that's the decision process for many software design choices).
We are currently describing attributes in the Comments section for each column type
So you're using something like the "Serialized LOB" solution.
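To illustrate what "Serialized LOB" in a column comment could look like on the code generator's side, here is a rough sketch. It assumes, purely for illustration, that each comment holds a JSON object; the column names, attribute names, and the column_comments dictionary are hypothetical stand-ins for what a query on COLUMN_NAME and COLUMN_COMMENT in information_schema.COLUMNS might return.

```python
import json

# Hypothetical data: column name -> raw column comment, assuming the
# comments were written as JSON objects (an assumption, not a MySQL feature).
column_comments = {
    "id": '{"readonly": true}',
    "email": '{"validator": "email", "label": "E-mail address"}',
    "notes": "",  # a column with no comment at all
}

def parse_attributes(comment):
    """Return the attribute dict stored in a column comment, or {} if empty."""
    return json.loads(comment) if comment.strip() else {}

# The code generator would work from this parsed structure.
attributes = {name: parse_attributes(c) for name, c in column_comments.items()}
print(attributes["email"]["validator"])
```

The trade-off is typical of Serialized LOB: the database treats the comment as opaque text, so nothing validates the attributes until the generator parses them.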

Related

First Normal Form Database Normalization

I'm new in database normalization and I want to know if I'm doing the right thing about the structure of the table that I created.
I have a table called "RoomRates" with the following columns:
RoomRateId(int), RoomType(int), Season(int), Monday(decimal), Tuesday(decimal), Wednesday(decimal), Thursday(decimal), Friday(decimal), Saturday(decimal), Sunday(decimal)
As far as I know, this table structure breaks First Normal Form.
I have a foreign key on RoomType and Season.
Should I turn table's structure into:
RoomRateId(int), RoomType(int), Season(int), DayOfTheWeek(int), Rate(decimal)
in order not to break First Normal Form?
Briefly, here are some tips. I recommend not walking step by step through the normal forms, because you will end up redoing your previous work (and your team's work), unless you are doing it for practice.
As a general rule when designing databases, each field should exist only once in the whole database. You should also group semantically related data into tables and link them with foreign keys.
Regarding your question, I would redesign the table. You probably don't need a column for each day of the week. Have you considered using a single column instead of several? The foreign key should be as light as possible (one column is recommended, unless there is no other way to express the link). Always consider whether your design will hold up for a huge amount of data, in terms of both querying and storage.
I recommend taking a look at this link: A beginner's guide to SQL database design
Hope this helps.
To me, your table is not breaking First Normal Form.
First Normal Form (1NF) has the following characteristics: there is no ordering, neither left to right for the columns nor top to bottom for the rows. Every row-and-column intersection holds exactly one value from the applicable domain and nothing else. But the most important characteristic of a relation in 1NF, I think, is simply that all of its attributes have only atomic values. Attributes with non-atomic values should be decomposed to reach 1NF.
I think your table's columns will have atomic values only.
That said, a single column for the day of the week is a better idea than several columns, which tend to cost more storage space and processing time.
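A small sketch of the one-row-per-day layout from the question, again with Python's built-in sqlite3 standing in for MySQL. The rates, the 1 to 7 numbering of days, and the UNIQUE constraint are assumptions added for illustration; the foreign keys on RoomType and Season are left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE RoomRates (
        RoomRateId   INTEGER PRIMARY KEY,
        RoomType     INTEGER NOT NULL,
        Season       INTEGER NOT NULL,
        DayOfTheWeek INTEGER NOT NULL CHECK (DayOfTheWeek BETWEEN 1 AND 7),
        Rate         DECIMAL(10, 2) NOT NULL,
        -- Assumed rule: at most one rate per room type, season and day.
        UNIQUE (RoomType, Season, DayOfTheWeek)
    );
""")

# Hypothetical rates: 80 on days 1-5, 100 on days 6-7.
conn.executemany(
    "INSERT INTO RoomRates (RoomType, Season, DayOfTheWeek, Rate) VALUES (1, 1, ?, ?)",
    [(day, 100.0 if day >= 6 else 80.0) for day in range(1, 8)],
)

# Looking up one day is a WHERE clause rather than a choice of column.
saturday_rate = conn.execute(
    "SELECT Rate FROM RoomRates WHERE RoomType = 1 AND Season = 1 AND DayOfTheWeek = 6"
).fetchone()[0]

# Aggregating across days needs no per-column arithmetic.
weekly_total = conn.execute(
    "SELECT SUM(Rate) FROM RoomRates WHERE RoomType = 1 AND Season = 1"
).fetchone()[0]
print(saturday_rate, weekly_total)
```

With seven day columns, the weekly total would have to be written as Monday + Tuesday + ... + Sunday, and the day to look up could not be passed as a query parameter.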

Which of these 2 MySQL DB Schema approaches would be most efficient for retrieval and sorting?

I'm confused as to which of the two db schema approaches I should adopt for the following situation.
I need to store multiple attributes for a website, e.g. page size, word count, category, etc., and the number of attributes may increase in the future. The purpose is to display this table to the user, who should be able to quickly filter/sort the data (so the table structure should support fast querying & sorting). I also want to keep a log of previous data to maintain a timeline of changes. So the two table structure options I've thought of are:
Option A
website_attributes
id, website_id, page_size, word_count, category_id, title_id, ...... (going up to 18 columns and have to keep in mind that there might be a few null values and may also need to add more columns in the future)
website_attributes_change_log
same table structure as above, with an added column for "change_update_time"
I feel the advantage of this schema is that the queries will be easy to write, even when some attributes are linked to other tables, and sorting will also be simple. The disadvantage, I guess, is that adding columns later can be problematic, with ALTER TABLE taking very long to run on large tables, and there could be many rows with many null columns.
Option B
website_attribute_fields
attribute_id, attribute_name (e.g. page_size), attribute_value_type (e.g. int)
website_attributes
id, website_id, attribute_id, attribute_value, last_update_time
The advantage here seems to be the flexibility of this approach, in that I can add columns whenever I need to, and I also save on storage space. However, as much as I'd like to adopt this approach, I feel that writing queries will be especially complex when I need to display the tables [since I will need to display records for multiple sites at a time and there will also be cross-referencing of values with other tables for certain attributes], and sorting the data might be difficult [given that this is not a column-based approach].
A sample output of what I'd be looking at would be:
Site-A.com, 232032 bytes, 232 words, PR 4, Real Estate [linked to category table], ..
Site-B.com, ..., ..., ... ,...
And the user needs to be able to sort by all the number based columns, in which case approach B might be difficult.
So I want to know if I'd be doing the right thing by going with Option A or whether there are other better options that I might have not even considered in the first place.
I would recommend using Option A.
You can mitigate the pain of long-running ALTER TABLE by using pt-online-schema-change.
The upcoming MySQL 5.6 supports non-blocking ALTER TABLE operations.
Option B is called Entity-Attribute-Value, or EAV. This breaks rules of relational database design, so it's bound to be awkward to write SQL queries against data in this format. You'll probably regret using it.
I have posted several times on Stack Overflow describing pitfalls of EAV.
Also in my blog: EAV FAIL.
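To make the awkwardness concrete, here is a rough sketch of what sorting sites by word count takes under Option B. It uses Python's built-in sqlite3 as a stand-in for MySQL; the table and column names follow the question, while the sample values and the choice to store every value as text are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE website_attribute_fields (
        attribute_id   INTEGER PRIMARY KEY,
        attribute_name TEXT NOT NULL
    );
    CREATE TABLE website_attributes (
        id              INTEGER PRIMARY KEY,
        website_id      INTEGER NOT NULL,
        attribute_id    INTEGER NOT NULL,
        attribute_value TEXT NOT NULL   -- every value is stored as text
    );
    INSERT INTO website_attribute_fields VALUES (1, 'page_size'), (2, 'word_count');
    INSERT INTO website_attributes (website_id, attribute_id, attribute_value) VALUES
        (1, 1, '232032'), (1, 2, '232'),
        (2, 1, '9000'),   (2, 2, '1500');
""")

# One conditional aggregate per attribute, plus a cast before numeric sorting.
pivot_sql = """
    SELECT a.website_id,
           MAX(CASE WHEN f.attribute_name = 'page_size'
                    THEN CAST(a.attribute_value AS INTEGER) END) AS page_size,
           MAX(CASE WHEN f.attribute_name = 'word_count'
                    THEN CAST(a.attribute_value AS INTEGER) END) AS word_count
    FROM website_attributes AS a
    JOIN website_attribute_fields AS f ON f.attribute_id = a.attribute_id
    GROUP BY a.website_id
    ORDER BY word_count DESC
"""
rows = conn.execute(pivot_sql).fetchall()
print(rows)
```

Under Option A the same result is a plain SELECT with ORDER BY word_count DESC, and the column can carry a proper integer type and an index. Under Option B the query grows by one CASE expression for each of the 18 attributes.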
Option A is the better way. Although ALTER TABLE may take a long time when adding an extra column, querying and sorting are quicker. I have used a design like Option A before, and ALTER TABLE did not take too long even with millions of records in the table.
You should go with Option B because it is more flexible and uses less RAM. With Option A you have to fetch a lot of content into RAM, which increases the chance of page faults. If you want to improve the query time of the database, you should definitely index your tables to get fast results.
I think Option A is not a good design. When you design a good data model, you should not have to change the tables in the future. If you have mastered SQL, writing queries against Option B will not be difficult. It is also the solution to your real problem: "you need to store some attributes (an open-ended number, not a final set) of some web pages; therefore, there should be an entity that represents those attributes".
Use Option A, as the attributes are fixed. It will be difficult to query and process data from the second model, as queries will be based on multiple attributes.

Custom fields / attributes model

I need to implement custom fields in a booking software. I need to extend some tables containing, for example, the user groups with dynamic attributes.
But also, a product table where each product can have custom fields (and ideally these fields could be nested).
I have already done some searching about EAV, but I read many negative comments, so I'm wondering which design to use for this kind of thing.
I understand that using EAV causes many joins when sorting a page of products, but I don't want to alter the groups/products tables each time an attribute is created.
Note: I use InnoDB
The only good solution is pretty much what you don't want to do: alter the groups/products tables each time an attribute is created. It's a pain, yes, but it will guarantee data integrity and better performance.
If you don't want to do that, you can create a table with TableName, FieldName, ID and Value columns, and hold, let's say:
TableName='Customer', FieldName='Address', ID=1 (customers ID), Value='customers address'
But as you said, it will need loads of joins. I don't think it is a good solution; I've seen it but wouldn't really recommend it. I'm showing it only because it is one possible solution.
Another solution would be to add several pre-defined columns to your tables, like column1, column2, column3 and so on, and use them as necessary. It's a solution as bad as the previous one, but I've seen major ERPs that use it.
Mate, based on experience, anything you find in this area will be a huge workaround and won't be worth implementing; the headache of maintaining it will be bigger than that of adding your fields to your table. Keep it simple and correct.
I am working on a project entirely based on EAV. I agree that EAV makes things complex and slow, but it has its own advantages: we don't need to change the database structure or code to add new attributes, and we can have hierarchies among the data in the database tables.
The system can get extremely slow if we use EAV everywhere.
But EAV is very helpful if used wisely. I would never design my entire DB around EAV. I would separate out the common and useful attributes and put them in flat tables, while using EAV for the additional attributes (which might need to change depending on clients or various requirements).
This way we get the advantages of EAV, including the flexibility you want, without much trouble.
This is just my suggestion; there might be a better solution.
You can do this by adding at least 2 more tables.
One table will contain a unique attribute key (attr_id) and the attribute's details, such as the attribute name and anything else needed by your business logic.
The second table will serve as a join between, say, your products table and the attributes table, and should have the following fields:
(id, product_id, attr_id)
This way, you can add as many dynamic attributes as you like, and your database schema will be future proof.
The only downside is that queries will now have to join 2 more tables.
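A minimal sketch of the two extra tables described above, run through Python's built-in sqlite3 as a stand-in for MySQL with InnoDB. The table names (attributes, product_attributes), the attr_name column, and the sample products are hypothetical; the join table has exactly the (id, product_id, attr_id) fields from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE attributes (attr_id INTEGER PRIMARY KEY, attr_name TEXT NOT NULL);
    -- Join table: one row per (product, attribute) pairing.
    CREATE TABLE product_attributes (
        id         INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL REFERENCES products(product_id),
        attr_id    INTEGER NOT NULL REFERENCES attributes(attr_id)
    );
    INSERT INTO products VALUES (1, 'Kayak tour'), (2, 'City walk');
    INSERT INTO attributes VALUES (1, 'outdoor'), (2, 'wheelchair accessible');
    INSERT INTO product_attributes (product_id, attr_id) VALUES (1, 1), (2, 1), (2, 2);
""")

# Listing products with their attributes needs the two extra joins.
rows = conn.execute("""
    SELECT p.name, a.attr_name
    FROM products AS p
    JOIN product_attributes AS pa ON pa.product_id = p.product_id
    JOIN attributes AS a ON a.attr_id = pa.attr_id
    ORDER BY p.product_id, a.attr_id
""").fetchall()
print(rows)
```

Adding a new attribute is then an INSERT into attributes rather than an ALTER TABLE. As written, the join table only records that a product has an attribute; storing a per-product value would need an extra column, which moves the design toward EAV.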

MySQL: Key-Value vs. Null Columns for creating tables with unknown number of columns

I am designing a database which holds a lot of information for a user. Currently I need to store 20 different values, but over time I could be adding more and more.
I have looked around StackOverflow for similar questions, but they usually end up with the asker just not designing his table correctly.
So, based on what I have seen around StackOverflow, should I:
1. Create a table with many null columns and use them when needed (this seems terrible to me)
2. Create a users table and a information table where information is a key-value pair: [user_id, key, value]
3. Anything else you can suggest?
Keep in mind this is for a MySQL database, so I understand the dislike of a key-value table in a relational database.
Thanks.
Hmm, I am a bit confused by the question, but it sounds like you want to have lots of attributes for one user, right? And in the future you want to add more?
Well, isn't that just a matter of having a customer_attribute_ref reference table of some sort? You can then easily add more attributes by inserting into the ref table, and in the customer table you have at least three columns: 1. customer ID, 2. customer attribute ID, 3. customer attribute value.
Maybe I misunderstood your question. Can you clarify?
I'd suggest 3. A hybrid of 1 and 2. That is, put your core fields, which are already known, and you know you'll be querying frequently, into the main table. Then add the key-value table for more obscure or expanded properties. I think this approach balances competing objectives of keeping your table width relatively narrow, and minimizing the number of joins needed for basic queries.
Another approach you could consider instead of or in combination with the above is an ETL process of some kind. Maybe you define a key-value table as a convenient way for your applications to add data; then set up replication, triggers, and/or a nightly/hourly stored procedure to transform the data into a form more suitable for querying and reporting purposes.
The exact best approach should be determined by careful planning and consideration of the entire architecture of your application.
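A rough sketch of the hybrid of options 1 and 2 suggested in this answer, with Python's built-in sqlite3 standing in for MySQL. The core columns (email, name), the user_info table, and the attr_key/attr_value column names are hypothetical; they are named that way because KEY is a reserved word in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Core fields that are known up front and queried often.
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY,
        email   TEXT NOT NULL,
        name    TEXT NOT NULL
    );
    -- Key-value table for the more obscure or later-added properties.
    CREATE TABLE user_info (
        user_id    INTEGER NOT NULL REFERENCES users(user_id),
        attr_key   TEXT NOT NULL,
        attr_value TEXT NOT NULL,
        PRIMARY KEY (user_id, attr_key)
    );
    INSERT INTO users VALUES (1, 'ann@example.com', 'Ann'), (2, 'bob@example.com', 'Bob');
    INSERT INTO user_info VALUES (1, 'favourite_colour', 'green');
""")

# Basic queries touch only the main table, with no joins.
core = conn.execute(
    "SELECT name FROM users WHERE email = 'bob@example.com'"
).fetchone()[0]

# An extended property costs one LEFT JOIN; users without it get NULL.
extras = conn.execute("""
    SELECT u.name, i.attr_value
    FROM users AS u
    LEFT JOIN user_info AS i
           ON i.user_id = u.user_id AND i.attr_key = 'favourite_colour'
    ORDER BY u.user_id
""").fetchall()
print(core, extras)
```

A property that turns out to be queried frequently can later be promoted from user_info to a real column on users.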

MySQL - When to have one to one relationships

When should one use one to one relationships? When should you add new fields and when should you separate them into a new table?
It seems to me that you'd use it whenever you're grouping fields and/or that group tends to be optional. Yes?
I'm trying to create the tables for an object, but grouping/separating everything would require about 20 joins, some of them even 4 levels deep.
Am I doing something wrong? How can I improve?
First, I highly recommend reading about Normal Forms.
A normalized relational database is extremely useful, and doing this properly is the reason tools such as Hibernate exist: to help manage the difference between objects-represented-as-relational-mappings and objects-as-programmatic-entities.
Anything that has a one-to-one mapping should probably be in the same table. A Person has only one first name, one last name. Those should logically be in the same table. Having a reference to a table of names isn't necessary - in particular because little additional data can be stored about a name. Obviously, this isn't always true (an etymology database might want to do exactly that), but for most uses, you don't care about where a name comes from - indeed all you want is the name.
Therefore, think of the objects being represented. A person has some singular data points, and some one-to-many relationships (addresses they have lived, for instance). One to many and many to many will almost always require a separate table (or two, to have many to many). Following those two guidelines, you can get a normalized database pretty fast.
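Those two guidelines can be sketched as follows, with Python's built-in sqlite3 standing in for MySQL. The person and address tables, their columns, and the sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Singular (one-to-one) data points live in the same table.
    CREATE TABLE person (
        person_id  INTEGER PRIMARY KEY,
        first_name TEXT NOT NULL,
        last_name  TEXT NOT NULL
    );
    -- A one-to-many relationship gets its own table with a reference back.
    CREATE TABLE address (
        address_id INTEGER PRIMARY KEY,
        person_id  INTEGER NOT NULL REFERENCES person(person_id),
        city       TEXT NOT NULL
    );
    INSERT INTO person VALUES (1, 'Ada', 'Lovelace');
    INSERT INTO address (person_id, city) VALUES (1, 'London'), (1, 'Ockham');
""")

# Reading the name needs no join; only the one-to-many data does.
cities = [row[0] for row in conn.execute(
    "SELECT city FROM address WHERE person_id = 1 ORDER BY address_id")]
print(cities)
```

A many-to-many relationship would add a third table holding pairs of keys, in the same spirit as the address table here.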
Note that optional fields should be avoided if at all possible. Usually this means having a separate table that holds the field, with a reference back to the original table. Try to keep your tables lean. If a field isn't likely to have a value, it probably should be a row in its own table. Many such properties suggest a 'Property' table that can hold arbitrary optional properties of a particular type (i.e., as applied to a 'Person').