Database design to assign specific tags to an item - mysql

I am trying to build a little system that would assign specific tags to an item, or a person to be precise. So, I have a list of persons and a list of tags. I need to assign 3 specific tags to each person (that correspond to 3 different skills this person might have).
In a nutshell, the output would look like this :
Person 1 | webdesign, ux, jquery
Person 2 | blogging, photography, wordpress
Person 3 | graphic-design, 3d, inventor
...
For now, those lists are stored in two different tables :
persons
-------
person_id
person_name
tags
-------
tag_id
tag_name
My main goal is to avoid repetition and to simply assign 3 existing tags to an existing person.
Could you give me a few hints on how to implement this? I know that a three-table design is common for a tagging system, but is it relevant in my situation?
Thanks for your help.

If you want to ensure that you don't have any duplicates and to be able to add N tags to a person, then to properly implement a normalized design you would need a third table to link the tags to each person
persons_2_tags
--------------
person_id
tag_id
To guarantee uniqueness, you can either use a composite primary key, or add a unique index to the table including both columns.
See an example of the above in this SQL Fiddle.
If you need to enforce the 3 tag limit at the database level, you can add a third column to the persons_2_tags table (tag_number for example) that is an enum with values of 1, 2, 3 and add that to your unique index. Insert logic would need to be handled at the application level, but would be enforced by the index.

Do your requirements specify "exactly" 3 tags?
The third table is recommended to stay normalized. It's a typical Many-to-many relationship. This offeres the greatest amount of flexibility since you can have an unlimited, yet unique list of user/tag pairs.
You could have 3 columns in the user table for each tag. Performance would be improved at the expense of flexibility. Queries like, "List all the users with tag = 'X'" are a little harder. There may be several null values if you allow fewer than 3 tags. Of course in this setup, you'll have to create a new column and a lot of code to expand beyond three columns.

I think that I would probably do the three table design which Jeff O mentioned, however, just to present an alternative view...
If you're just talking about tags, that is, a short string with no other meta data, I don't know that you'd need a tags table. Essentially, the tag itself could be the its id.
persons (person_id, person_name);
tags (person_id, tag);
Yeah, you'd get a bit of repetition there, but they're short strings anyway and it should really make a difference.

Related

MySQL multiple column relationships between 2 tables

I have this problem in a table where there are 4 columns which include terms describing the product. I want to make this terms editable (and you can add more) in my app and there are 4 groups of them obviously. I created a table who has all these terms altogether but the product table will have to create 4 relationships with the ID of the terms table.
Is this a good solution?
The main reason I don't want to make 4 different tables for the terms is because there aren't many of them and as the app progresses we might have even more different term groups, thus adding many small tables cluttering the database.
Any suggestion?
Update #1: Here is my current schema http://i.imgur.com/q2a1ldk.png
Have a product table and a terms (product_id, terms_name, terms_description) which will allow you to add as many or as little terms for each product as you want. You just need to retrieve all terms from the terms table with a particular product id.
You could try a mapping table:
apputamenti(id, ...)
term_map (apputamenti_id, term_id)
terms (id, text, type)
So you can add as many terms as you want.
Or if you want to specify the mapping with one more field, change:
term_map (apputamenti_id, term_id, map_type)
so you can use an enum for map_type like enum(tipologia, feedback, target) or whatever your original fields where

MySQL table with multiple values in one field [duplicate]

This question already has answers here:
Is storing a delimited list in a database column really that bad?
(10 answers)
Closed 4 years ago.
I'm building a database for a real estate company.. Those real estate properties are managed by the sales people using a CodeIgniter website.
How should I proceed with the database design for this table. The fields like address/location, price etc. are pretty straightforward but there are some sections like Kitchen Appliences, Property Usage etc... where the sales people can check multiple values for that field.
Furthermore, a property can have multiple people attached to it (from the table people), this can be an intermediaire, owner, seller, property lawyer etc... Should I use one field for this or just I create an extra table and normalize the bindings?
Is the best way to proceed just using one field and using serialized data or is there a better way for this?
In a relational database you should never, EVER, put more than one value in a single field - this violates first normal form.
With the sections like Kitchen Appliances, Property Usage etc... where the sales people can check multiple values for that field, it depends on whether it will always be the same set of multiple options that can be specified:
If there are always the same set of multiple options, you could include a boolean column on the property table for each of the options.
If there are likely to be a variety of different options applicable to different properties, it makes more sense to set up a separate table to hold the available options, and a link table to hold which options apply to which properties.
The first approach is simpler and can be expected to perform faster, while the latter approach is more flexible.
A similar consideration applies to the people associated with each house; however, given that a single property can have (for example) more than one owner, the second approach seems like the only one viable. I therefore suggest separate tables to hold people details (name, phone number, etc) and people roles (such as owner, seller, etc.) and a link table to link roles to properties and people.
You should create extra table for it....
For example...
Consider the scenario that 1 item may have many categories and 1 category may have many items...
Then you can create table like this...
Create three tables for that....
(1) tblItem :
fields:
itemId (PK)
itemName
etc..
(2) tblCategory :
fields:
categoryId (PK)
categoryName
etc..
(3) tblItemCategory :
fields:
itemId (PK/FK)
categoryId (PK/FK)
So according to your data you can create an extra table for it....
I think it would be optimize way....
If you want your data to be in third normal form, then you should think in terms of adding extra tables, rather than encoding multiple values into various fields. However it depends how far your brief goes.

database design for tagging multiple sources (MySQL)

I'm working on a project where I have the following (edited) table structures: (MySQL)
Blog
id
title
description
Episode
id
title
description
Tag
id
text
The idea is that that tags can be applied to any Blog or Episode (and to other types of sources), new tags can be created by the user if it doesn't exist already in the tag table.
The purpose of the tags is that a user will be able to search the site, and the results will search across all types of material on the site. Also, at the bottom of each blog article/episode description it would have a list of tags for that item.
I'd thought too much about the search mechanism, but I guess it'd be flexible between an OR and AND searches, if that has any impact on choices, and probably allow the user to filter the results for particular types of sources.
Originally I was planning to create multiple tag mapping tables:
BlogTag
id
tag_id
blog_id
EpisodeTag
id
episode_id
tag_id
But now I wonder if I would be better off with:
TaggedStuff
id
source_type
source_id
tag_id
Where source_type would be an integer related to whether it was an Episode, Blog, or some other type that I've not included in the structures above, and source_id would be the reference in that particular table.
I'm just wondering what the optimum structure would be for this, the first choice or the second?
In a clean (academic) design you would often see to have a supertype Resource (or something similar) for Blog and Episode with it's own table. Another table for the tags. And since it's a N:M relationship between Tag and Resource you have an extra mapping table between them.
So in such a design you would associate the Tag-Entities with your resources by having a relationship to their generalization.
After that you can put general attributes to the generalization. (i.e. title, description)
You can add attributes to the relationship between Tag and Resource like a counter how often a specific resource was tagged with a specific tag. Or how often a tag was used and and and (i.e. something like you see on stackoverflow in the upper right here)
The biggest loss in going with structure 2 is loss of referential integrity. If you can say "whatever" to that, it might be easier to go with this structure.
When I say structure 2 I mean:
TaggedStuff
id
source_type
source_id
tag_id
If I understand you correctly, the point is to optimize search mechanism...
So it has sense to make some kind of index_table and demoralize the data there...
I mean smth like this:
Url, Type, Title, Search_Field etc..
where Url is the path to the article or episode, Type (article|episode), Name (what users will see), Search_Field ( list of tags, other important data for search )
thats why both variants are quite good)))

How to store these field descriptions in mysql?

Apologize for the long topic, I didn't intend for it to be this long, but it's a pretty simple issue I've been having. :)
Let's say you have a simple table called tags that has columns tag_id and tag. The tag_id is simply an auto increment column and the tag is the title of the tag. If I need to add a description field, that would be around 1-2 paragraphs on average (max around 3-4 paragraphs probably), should I simply add a description field to the table or should I create a new table called tag_descriptions and store the descriptions with the tag_id?
I remember reading that it is better to do this because if you do a query that doesn't select the description, that description field will still slow down mysql. Is this true? I don't even remember where I read that from, but I've been kind of following it for a couple years now... Finally I question if I need to do this, I have a feeling I don't. You'd also need to inner join whenever you need the description field.
Another question I have is, is it generally bad to create new tables that will only hold very few rows at the max? What if this data doesn't fit anywhere else?
I have a simple case below which relates to these two questions.
I have three tables content, tags, and content_tags that make up a many to many relationship:
content
content_id
region (enum column with
about 6-7 different values and most
likely won't grow later on)
tags
tag_id
tag
content_tags
content_id
tag_id
I want to store a description around 1-2 paragraphs for each tag, but also for each region. I'm wondering what would be the best way to do this?
Option A:
Just add a description column to the
tags table
Create a new table for
region_descriptions
Option B:
Create a new table called
descriptions with fields: id,
description, and type
The id would be id of the content or
id of the enum field
The type would be whether it is a tag
description, or region description
(Would use the enum column for this)
Maybe have a primary key on the id and type?
Option C:
Create a new table for tag_descriptions
Create a new table for region_descriptions
Option A seems to be a good choice if adding the description column doesn't slow down mysql select queries that don't need the description.
Assuming the description column would slow down mysql, option B might be a good choice. It also removes the need for a small table with just 6-7 rows that would hold the region descriptions. Although now that I think of it, would it be slow to connect to this table if originally to get a region description you'd only need to go through very little rows.
Option C would be ideal if the description columns would slow down mysql and if a small table like region descriptions would not matter.
Maybe none of these options are the best, feel free to offer another option. Thanks.
P.S. What would be an ideal column type to use to hold data that usually 1-2 paragraphs, but might be a little more sometimes?
I don't think it really matters if you don't handle thousands of queries per minute. If you are going to have a zillion queries per minute, then I would implement the various options and perform benchmarks for all these options. Based on the results, you can make a decision.
In my (admittedly somewhat uninformed) opinion, it really depends on how much you'll be using both of them.
If properly indexed, that JOIN should not be very expensive. Also, a larger table will be slower. It inhibits caching, and takes longer to access stuff, although indexing seriously mitigates this problem.
If you'll be joining tag names to tag IDs a LOT, and only rarely will be using the descriptions, I'd say go with separate tables. If you'll be using the descriptions more often, go with one table.
For the first part of your question: if you have a tag with an id, a name and a description, you should save it in 1 table.
Now, this query
SELECT name FROM tags WHERE id = 1;
will NOT slow down if you have 1, 2 or 20 extra fields in there.

What Db structure should I use with 2 the same comment tables for 2 different parent tables

This is a tough design question for a application I'm working on. I have 2 different items in my app that both will use comments. What but I can't decide how to design my database.
There are 2 possibilities here. The first is a different comment table for every table that needs comments (normalized way):
movies -> movie_comments
articles -> article_comments
The second way I was thinking of was the use of a generic comments table and then have a many 2 many relationship for the comment and movie|article relations. Eg
comments
comments_movies (movie_id, comment_id)
comments_articles (article_id, comment_id)
What is your opinion on that the best method would be and can you give a good reason so I can decide.
i personally opt for 2nd solution
comments
comments_movies (movie_id, comment_id)
comments_articles (article_id, comment_id)
it is much more simple to maintain only on table model for logical Comment model e.g. when You wan't to add some feature to comments You just do it once or when You wan't count comments for specific user is much more easier because there are in one table
of course someone else could write his advantages of keeping that in multiple tables but You asked for opinions so here is mine :)
Keeping them separate has the benefit of supporting change without impacting the comments for the other entity (movie vs articles). Assuming there are differences in attributes for a comment against an article vs. a movie. Otherwise...
I suppose there could be a need for displaying a comment with an article and a movie. But the consolidation would also support if you want to provide comment functionality for other entities in the future.
The answer depends on what you need currently, and a best guess of what you want to do in the future. More details help us to know what to suggest.
There is no "best" method, because it is a straight-forward Normalisation question: the proposal is either correctly Normalised or it is not.
Actually, the first option is not Normalised, the Normalisation is not complete. You have identical repeating groups of columns in two tables which have not been identified and grouped into a single table.
The second option is Normalised. You have identified that, and placed them in a single table.
at the logical level then, you have a many-to-many relation (not a table) between Movie and Comment, and between Article and Comment. End of story at the logical level.
at the physical level, where n::n relations are implemented as Associative tables, you have CommentMovie and CommentArticle.
as the Db expands and grows, life is simple, because:
any new column that is 1::1 with Movie.PK is placed in Movie
any new column that is 1::1 with Article.PK is placed in Article
any new column that is 1::1 with Comment.PK is placed in Comment
any new column that is 1::1 with CommentArticle.PK (the relation; PK is as shown (ArticleId, CommentId) ) is placed in CommentArticle. This (adding attributes to an n::n relation) will now cause the table to show up on the Logical model.
any new column that is 1::1 with CommentMovie.PK (the relation; PK is as shown (MovieId, CommentId) ) is placed in CommentMovie. This (adding attributes to an n::n relation) will now cause the table to show up on the Logical model.
I would suggest your second choice:
movies -> movie_comments -> comments
articles -> article_comments -> comments
One comments table, two pivot tables(many to many).
This will keep all the same data in one table and just loosely linking them. If you can get away with joins I usually recommend that for things that don't need to scale because joining can be a performance hit and a nightmare in cases. But this would be best for your case.
comment_table
-------------
comment_id (int)
object_id (int)
comment (varchar(max))
type (int)
--------------
object_id refers to object such as movie ,i articles and so on.
type equals 1: comment was done to movie ,
type equals 2: comment was done to article
You can design your tables like this.