Relational Database Design (MySQL)

I have a table User that stores user information - such as name, date of birth, locations, etc.
I have also created a link table called User_Options - for the purpose of storing multi-value attributes - this basically stores the checkbox selections.
I have a front-end form for the user to fill in and create their user profile. Here are the tables I have created to generate the checkbox options:
Table User_Attributes
=====================
id  attribute_name
---------------------
1   Hobbies
2   Music

Table User_Attribute_Options
======================================
id  user_attribute_id  option_name
--------------------------------------
1   1                  Reading
2   1                  Sports
3   1                  Travelling
4   2                  Rock
5   2                  Pop
6   2                  Dance
So, on the front-end form there are two sets of checkbox options - one set for Hobbies and one set for Music.
And here are the User tables:
Table User
========================
id  name  age
------------------------
1   John  25
2   Mark  32

Table User_Options
==================================================
id  user_id  user_attribute_id  value
--------------------------------------------------
1   1        1                  1
2   1        1                  2
3   1        2                  4
4   1        2                  5
5   2        1                  2
6   2        2                  4
(in the above table 'user_attribute_id' is the ID of the parent attribute and 'value' is the ID of the attribute option).
So I'm not sure that I've done all this correctly, or efficiently. I know there is a method of storing hierarchical data in the same table but I prefer to keep things separate.
My main concern is with the User_Options table - the idea behind this is that there only needs to be one link table that stores multi-value attributes, rather than have a table for each and every multi-value attribute.

The only thing I can see that I'd change is that in the association table, User_Options, you have an id that doesn't seem to serve a purpose. The primary key for that table would be all three columns, and I don't think you'd be referring to the options a user has by an id--you'd be getting them by user_id/user_attribute_id. For example, give me all the user options where user is 1 and user attribute id is 2. Having those records uniquely keyed with an additional field seems extraneous.
I think otherwise the general shape of the tables and their relationships looks right to me.
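As a rough sketch of the composite-key suggestion above: the snippet below uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere, and it assumes the surrogate id column is simply dropped. The data is the User_Options sample from the question; the variable names are illustrative only.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the DDL is kept generic on purpose.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE User_Options (
    user_id           INTEGER NOT NULL,
    user_attribute_id INTEGER NOT NULL,
    value             INTEGER NOT NULL,  -- id of the chosen attribute option
    PRIMARY KEY (user_id, user_attribute_id, value)
);
""")

# Sample rows from the question, without the surrogate id column.
rows = [(1, 1, 1), (1, 1, 2), (1, 2, 4), (1, 2, 5), (2, 1, 2), (2, 2, 4)]
conn.executemany("INSERT INTO User_Options VALUES (?, ?, ?)", rows)

# "Give me all the user options where user is 1 and user attribute id is 2."
music_for_john = [r[0] for r in conn.execute(
    "SELECT value FROM User_Options "
    "WHERE user_id = ? AND user_attribute_id = ? ORDER BY value", (1, 2))]

# The composite key also rejects ticking the same checkbox twice.
try:
    conn.execute("INSERT INTO User_Options VALUES (1, 2, 4)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The lookup by user_id and user_attribute_id uses the leading columns of the key, so no separate id is needed to find or delete a selection.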

There's nothing wrong with how you've done it.
It's possible to make things more extensible at the price of more linked table references (and in the composition of your queries). It's also possible to make things flatter, and less extensible and flexible, but your queries will be faster.
But, as is usually the case, there's more than one way to do it.

Related

DB design little or too much data

I'm currently working on a little project that uses MySQL, but I'm struggling with the database design. So far I've come up with two designs. The first stores more data and is the way I want it to be, but it makes the data really hard to work with. The second is more basic and simplifies a lot of things, but it stores less data.
Design 1
Example data items table
id  description  time_created
1   Car          2021-04-17 17:30:00
2   Bike         2021-04-17 17:30:00

Example data user_items table
id  user_id  item_id  time_achieved
1   1        1        2021-04-17 17:30:04
2   1        1        2021-04-17 17:30:03
3   1        1        2021-04-17 17:30:17
4   1        1        2021-04-17 17:30:22
5   1        1        2021-04-17 17:30:34
6   1        2        2021-04-17 17:30:42
7   1        2        2021-04-17 17:30:54
Design 2
Example data items table
id  description  time_created
1   Car          2021-04-17 17:30:00
2   Bike         2021-04-17 17:30:00

Example data user_items table
id  user_id  item_id  count
1   1        1        5
2   1        2        2
Basically we have items that can be anything, they include a description to specify what they actually are. A user can collect items (a lot). These are stored in the user_items table which contains a FK user_id and item_id to the users and items table. The users table is left out for simplicity.
As you can see, design 1 stores a lot more rows in the user_items table, which allows us to add more information (time_achieved and more) per item that a user achieved. However, this results in more rows and probably a harder time querying. Design 2, on the other hand, simply adds a count column to record how many items the user has, but this is very limiting because we cannot add more data (achieved time, etc.) per user_item.
I'm not sure if design 1 is the right and only design for what we want to achieve. Basically we really want to store additional metadata per user_item but I just don't know if this is the right design since it quickly fills up the database. Does anyone have a suggestion/idea for an alternative design which stores less data than design 1 but still allows to add more info per user_item?
Thanks in advance.
Does anyone have a suggestion/idea for an alternative design which stores less data than design 1 but still allows to add more info per user_item?
Design 1 should work.
This design will also work but quickly fills up, more efficient.
id, item_id, Item_des, Item_qty, user_id, username, time_created, all in one table.
Some of the values will be repeated.
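For Design 1 from the question, the per-item count that Design 2 stores can be derived on demand with GROUP BY, so the extra rows do not have to make querying harder. The sketch below is only an illustration: it uses Python's sqlite3 module in place of MySQL, loads the sample rows from the question, and the index name and the item_count and last_achieved aliases are made up for this example.

```python
import sqlite3

# SQLite in memory as a stand-in for MySQL; schema follows Design 1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (
    id INTEGER PRIMARY KEY, description TEXT, time_created TEXT
);
CREATE TABLE user_items (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    item_id INTEGER NOT NULL REFERENCES items(id),
    time_achieved TEXT NOT NULL
);
-- Hypothetical index so per-user, per-item aggregation can stay cheap.
CREATE INDEX idx_user_items_user_item ON user_items (user_id, item_id);
""")
conn.executemany("INSERT INTO items VALUES (?, ?, ?)", [
    (1, "Car", "2021-04-17 17:30:00"),
    (2, "Bike", "2021-04-17 17:30:00"),
])
conn.executemany("INSERT INTO user_items VALUES (?, ?, ?, ?)", [
    (1, 1, 1, "2021-04-17 17:30:04"),
    (2, 1, 1, "2021-04-17 17:30:03"),
    (3, 1, 1, "2021-04-17 17:30:17"),
    (4, 1, 1, "2021-04-17 17:30:22"),
    (5, 1, 1, "2021-04-17 17:30:34"),
    (6, 1, 2, "2021-04-17 17:30:42"),
    (7, 1, 2, "2021-04-17 17:30:54"),
])

# One row per user and item: the count from Design 2, plus metadata
# (here the latest time_achieved) that Design 2 cannot provide.
counts = conn.execute("""
    SELECT ui.user_id, i.description,
           COUNT(*) AS item_count,
           MAX(ui.time_achieved) AS last_achieved
    FROM user_items ui
    JOIN items i ON i.id = ui.item_id
    GROUP BY ui.user_id, i.id, i.description
    ORDER BY i.id
""").fetchall()
```

If the aggregate ever becomes too slow, a count column could still be added later as a cache, whereas Design 2 cannot recover per-row metadata that was never stored.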

MYSQL DB Best method to store keywords and URL index

Which of these methods would be the most efficient way of storing, retrieving, processing and searching a large (millions of records) index of stored URLs along with their keywords?
Example 1: (Using one table)
TABLE_URLs-----------------------------------------------
ID DOMAIN KEYWORDS
1 mysite.com videos,photos,images
2 yoursite.com videos,games
3 hissite.com games,images
4 hersite.com photos,pictures
---------------------------------------------------------
Example 2: (one-to-one Relationship from one table to another)
TABLE_URLs-----------------------------------------------
ID DOMAIN KEYWORDS
1 mysite.com
2 yoursite.com
3 hissite.com
4 hersite.com
---------------------------------------------------------
TABLE_URL_KEYWORDS---------------------------------------------
ID DOMAIN_ID KEYWORDS
1 1 videos,photos,images
2 2 videos,games
3 3 games,images
4 4 photos,pictures
---------------------------------------------------------
Example 3: (one-to-one Relationship from one table to another (Using a reference table))
TABLE_URLs-----------------------------------------------
ID DOMAIN
1 mysite.com
2 yoursite.com
3 hissite.com
4 hersite.com
---------------------------------------------------------
TABLE_URL_TO_KEYWORDS------------------------------------
ID DOMAIN_ID KEYWORDS_ID
1 1 1
2 2 2
3 3 3
4 4 4
---------------------------------------------------------
TABLE_KEYWORDS-------------------------------------------
ID KEYWORDS
1 videos,photos,images
2 videos,games
3 games,images
4 photos,pictures
---------------------------------------------------------
Example 4: (many-to-many Relationship from url to keyword ID (using reference table))
TABLE_URLs-----------------------------------------------
ID DOMAIN
1 mysite.com
2 yoursite.com
3 hissite.com
4 hersite.com
---------------------------------------------------------
TABLE_URL_TO_KEYWORDS------------------------------------
ID DOMAIN_ID KEYWORDS_ID
1 1 1
2 1 2
3 1 3
4 2 1
5 2 4
6 3 4
7 3 3
8 4 2
9 4 5
---------------------------------------------------------
TABLE_KEYWORDS-------------------------------------------
ID KEYWORDS
1 videos
2 photos
3 images
4 games
5 pictures
---------------------------------------------------------
My understanding is that Example 1 would take the largest amount of storage space, but searching through the data would be quick (repeated keywords are saved multiple times, but the keywords sit next to the relevant domain).
Example 4, on the other hand, would save a ton of storage space, but searching would take longer (duplicate keywords are not stored, but referencing multiple keywords for each domain would take longer).
Could anyone give me any insight or thoughts on which method would be best when designing a database that can handle huge amounts of data? Bear in mind that you may want to display a URL with its associated keywords, or search for one or more keywords and bring up the most relevant URLs.
You do have a many-to-many relationship between url and keywords. The canonical way to represent this in a relational database is to use a bridge table, which corresponds to example 4 in your question.
Using the proper data structure, you will find out that the queries will be much easier to write, and as efficient as it gets.
I don't know what drives you to think that searching in a structure like the first one will be faster. It requires you to do pattern matching when searching for each single keyword, which is notably slow. On the other hand, using a junction table lets you search for exact matches, which can take advantage of indexes.
Finally, maintaining such a structure is also much easier: adding or removing keywords can be done with insert and delete statements, while the other structures require you to do string manipulation on a delimited list, which again is tedious, error-prone and inefficient.
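To make the junction-table argument concrete, here is a small sketch of Example 4 with an exact-match keyword search. It is an illustration under assumptions rather than a tuned schema: Python's sqlite3 module stands in for MySQL, the lower-case table names, the reverse index and the urls_for_keywords helper are invented for this example, and "most relevant" is approximated as "matches the most of the requested keywords".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE urls (id INTEGER PRIMARY KEY, domain TEXT NOT NULL UNIQUE);
CREATE TABLE keywords (id INTEGER PRIMARY KEY, keyword TEXT NOT NULL UNIQUE);
CREATE TABLE url_to_keywords (
    domain_id  INTEGER NOT NULL REFERENCES urls(id),
    keyword_id INTEGER NOT NULL REFERENCES keywords(id),
    PRIMARY KEY (domain_id, keyword_id)
);
-- Reverse index so "which URLs have keyword X" is an index lookup too.
CREATE INDEX idx_keyword_domain ON url_to_keywords (keyword_id, domain_id);
""")
conn.executemany("INSERT INTO urls VALUES (?, ?)", [
    (1, "mysite.com"), (2, "yoursite.com"),
    (3, "hissite.com"), (4, "hersite.com"),
])
conn.executemany("INSERT INTO keywords VALUES (?, ?)", [
    (1, "videos"), (2, "photos"), (3, "images"),
    (4, "games"), (5, "pictures"),
])
conn.executemany("INSERT INTO url_to_keywords VALUES (?, ?)", [
    (1, 1), (1, 2), (1, 3), (2, 1), (2, 4),
    (3, 4), (3, 3), (4, 2), (4, 5),
])


def urls_for_keywords(words):
    """Return (domain, matches) for URLs tagged with any of the given words,
    ranked by how many of the words each URL matches."""
    placeholders = ", ".join("?" for _ in words)
    sql = f"""
        SELECT u.domain, COUNT(*) AS matches
        FROM keywords k
        JOIN url_to_keywords uk ON uk.keyword_id = k.id
        JOIN urls u ON u.id = uk.domain_id
        WHERE k.keyword IN ({placeholders})   -- exact match, no LIKE needed
        GROUP BY u.id, u.domain
        ORDER BY matches DESC, u.domain
    """
    return conn.execute(sql, list(words)).fetchall()


games = urls_for_keywords(["games"])
ranked = urls_for_keywords(["games", "images"])
```

Adding or removing a keyword for a URL is then a single INSERT or DELETE on url_to_keywords, with no string surgery.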
None of the above.
Simply have a table with 2 string columns:
CREATE TABLE domain_keywords (
domain VARCHAR(..) NOT NULL,
keyword VARCHAR(..) NOT NULL,
PRIMARY KEY(domain, keyword),
INDEX(keyword, domain)
) ENGINE=InnoDB
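A quick way to try the two-column table above is sketched here. It is not the answerer's code: Python's sqlite3 module stands in for MySQL, TEXT replaces the VARCHAR columns because the lengths are left unspecified in the definition, the secondary index is created separately (SQLite has no inline INDEX clause), and the rows are the domain/keyword pairs from Example 1 of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE domain_keywords (
    domain  TEXT NOT NULL,
    keyword TEXT NOT NULL,
    PRIMARY KEY (domain, keyword)
);
-- Counterpart of INDEX(keyword, domain) in the MySQL definition.
CREATE INDEX idx_keyword_domain ON domain_keywords (keyword, domain);
""")

pairs = [
    ("mysite.com", "videos"), ("mysite.com", "photos"), ("mysite.com", "images"),
    ("yoursite.com", "videos"), ("yoursite.com", "games"),
    ("hissite.com", "games"), ("hissite.com", "images"),
    ("hersite.com", "photos"), ("hersite.com", "pictures"),
]
conn.executemany("INSERT INTO domain_keywords VALUES (?, ?)", pairs)

# Display a URL with its keywords: served by the primary key.
keywords_of_mysite = [r[0] for r in conn.execute(
    "SELECT keyword FROM domain_keywords WHERE domain = ? ORDER BY keyword",
    ("mysite.com",))]

# Search by keyword: served by the (keyword, domain) index.
domains_with_photos = [r[0] for r in conn.execute(
    "SELECT domain FROM domain_keywords WHERE keyword = ? ORDER BY domain",
    ("photos",))]
```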
Notes:
It will be faster.
It will be easier to write code.
Having a plain id is very much a waste.
Normalizing the domain and keyword buys little space savings, but at a big loss in efficiency.
"Huge database"? I predict that this table will be smaller than your Domains table. That is, this table is not your main concern for "huge".

Questions regarding phpMyAdmin Table Relations and DB Structure

I'm creating a website for use by a Youth Organisation to organize events, by providing a place for upcoming events to be listed and to be signed up to by members. In my database I have a table full of tags that can be assigned to events, quite like this website where you can tag your questions. I have another table to store the information about the events, for example title, description, requirements, date, etc.
I want to connect these tables up, so that when an event is created it is assigned a primary tag and any number of secondary tags from those in the tag table. Currently, I have a linking table that has a field for the event ID, the tag ID and whether the tag is the primary tag or not. However, as I have had to set the fields to unique to allow me to create a relation, I cannot enter multiple event or tag IDs.
My question is, what is the best way for me to structure my database for the functionality described above? Furthermore, if what I am doing is correct, then how can I link the tables without either field in the linking table being a primary or unique key?
tblEvents
tblTags
As you have what's referred to as a "many to many" relationship (an event can have many tags, and a tag can belong to many events), you need an intermediate table that handles the assignment. This is part of normalising the schema.
In this case, you need 3 columns: AssignmentID, EventID, TagID
Put all your tags in the same database table, but flag each one as either primary or secondary and handle the 1 primary tag per event within your code outside of the database.
For example your tag table could look like:
ETagID  ETagName  ETagColour  ETagPrimary  ETagDel
1       first     red         1            0
2       second    blue        0            0
3       third     green       0            0
4       forth     yellow      1            0
5       fifth     orange      0            0
6       sixth     white       0            0
and your assignment table:
AssignmentID  EventID  TagID
1             1        1
2             1        2
3             1        5
4             2        4
5             3        4
6             3        1
7             4        4
As your code outside of the sql handles the insertions in the first place, you can now query your tables using joins to pull out the event + tags for that event
SELECT ETagName, ETagColour FROM TagTable
JOIN AssignmentTable ON AssignmentTable.TagID = TagTable.ETagID
JOIN EventTable ON EventTable.EID = AssignmentTable.EventID
-- ETagDel = 0 keeps only the tags that are not flagged as deleted
WHERE EventTable.EID = <some value> AND TagTable.ETagDel = 0
This would select all the tag names and colours for that specific event that aren't deleted.
An important thing to note is not to overcomplicate things. If your primary and secondary tags store the same info except for being either primary or secondary, then it's pointless separating them into individual tables. Flagging them like I mentioned will be sufficient and reduces the number of tables required.
Hopefully this points you in the right direction moving forwards
Update:
As per the recent comment, you can handle the allocation of the primary tag within the assignment table. Create the same table as above, but include the primary flag column too
AssignmentID  EventID  TagID  PrimaryFlag
1             1        1      1
2             1        2      0
3             1        5      0
4             2        4      1
5             3        4      1
6             3        1      0
7             4        4      1
then within the query, you can also select the status of the tag using a slightly modified version of the one written before:
SELECT ETagName, ETagColour, AssignmentTable.PrimaryFlag FROM TagTable
JOIN AssignmentTable ON AssignmentTable.TagID = TagTable.ETagID
JOIN EventTable ON EventTable.EID = AssignmentTable.EventID
-- ETagDel = 0 keeps only the tags that are not flagged as deleted
WHERE EventTable.EID = <some value> AND TagTable.ETagDel = 0
If you want to make sure the primary tag appears at the top of that list, you can also bolt on a descending sort, so that PrimaryFlag = 1 comes first:
ORDER BY AssignmentTable.PrimaryFlag DESC
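The assignment table with a primary flag can be tried out with the sketch below. Treat it as an illustration under assumptions: Python's sqlite3 module stands in for MySQL, the event table is left out (the filter is applied to AssignmentTable.EventID directly), the UNIQUE (EventID, TagID) constraint is an addition that is not in the answer, and tags_for_event is a hypothetical helper. The rows are the sample data shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TagTable (
    ETagID     INTEGER PRIMARY KEY,
    ETagName   TEXT NOT NULL,
    ETagColour TEXT NOT NULL,
    ETagDel    INTEGER NOT NULL DEFAULT 0   -- 1 means soft-deleted
);
CREATE TABLE AssignmentTable (
    AssignmentID INTEGER PRIMARY KEY,
    EventID      INTEGER NOT NULL,
    TagID        INTEGER NOT NULL REFERENCES TagTable(ETagID),
    PrimaryFlag  INTEGER NOT NULL DEFAULT 0,
    -- Assumption: uniqueness applies to the pair, not to each column,
    -- so an event or a tag may appear in many rows.
    UNIQUE (EventID, TagID)
);
""")
conn.executemany("INSERT INTO TagTable VALUES (?, ?, ?, ?)", [
    (1, "first", "red", 0), (2, "second", "blue", 0), (3, "third", "green", 0),
    (4, "forth", "yellow", 0), (5, "fifth", "orange", 0), (6, "sixth", "white", 0),
])
conn.executemany("INSERT INTO AssignmentTable VALUES (?, ?, ?, ?)", [
    (1, 1, 1, 1), (2, 1, 2, 0), (3, 1, 5, 0), (4, 2, 4, 1),
    (5, 3, 4, 1), (6, 3, 1, 0), (7, 4, 4, 1),
])


def tags_for_event(event_id):
    """Non-deleted tags of one event, primary tag first."""
    return conn.execute("""
        SELECT t.ETagName, t.ETagColour, a.PrimaryFlag
        FROM TagTable t
        JOIN AssignmentTable a ON a.TagID = t.ETagID
        WHERE a.EventID = ? AND t.ETagDel = 0
        ORDER BY a.PrimaryFlag DESC, t.ETagID
    """, (event_id,)).fetchall()


event_1_tags = tags_for_event(1)
event_3_tags = tags_for_event(3)
```

Making only the (EventID, TagID) pair unique is what lets the same event or tag ID be entered many times, which is the restriction the question ran into when each column was unique on its own.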

processing MySQL data when there are field values inserted with commas

I have some columns in a MySQL table whose values are separated with commas: fields like IP address, running_port_ids, dns_range, subnet, etc. A cron job runs every hour to check whether the ports on each appliance are in use. If ports are in use on an appliance, their IDs are inserted into running_port_ids as comma-separated values (like 2,3,7).
How can I process the data to get a report of which ports are used least (I have a static list of port IDs), in ascending order as shown below, grouped by address, running_port_ids and insert date, for a date range of one month?
address port usage%
10.2.1.3 3 1
10.3.21.22 2 20
There are thousands of records in the table now with comma-separated running_port_ids. Are there any methods available in MySQL to do this?
Any help much appreciated.
If you can convert your data model to an n:m relation (or "link table"), i.e. normalize your data model, this is pretty easy using grouping (or "aggregate") functions. So I'd advise revising your data model and introducing a table containing one row for each of the ports, instead of storing this de-normalized in a text column.
A typical example would be: "student has many classes", and a property of this relation is "attendance":
Student
id  name
1   John
2   Jane

Course
id  name
1   Engineering
2   Databases

Class
id  courseid  date                 room
1   1         2015-08-05 10:00:00  301
2   1         2015-08-13 10:00:00  301
3   1         2015-09-03 10:00:00  301

StudentClass
studentid  classid  attendance
1          1        TRUE
1          2        FALSE
1          3        NULL
2          1        TRUE
2          2        TRUE
2          3        NULL
In this case, you can see the relation between student and class is normalized, i.e. each value is stored vertically (one row per student and class) instead of horizontally. This way, you can easily query things like "How many classes did John miss?" or "How many students did not miss any class?". NULL in the example shows that we cannot yet tell anything about the attendance (as the date is in the future), but we do know that they should attend.
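The two example questions can be answered with plain aggregate queries, roughly as in this sketch. It assumes Python's sqlite3 module as a stand-in for MySQL and stores attendance as 1, 0 or NULL; only the Student and StudentClass tables are needed for these two queries, and the variable names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE StudentClass (
    studentid  INTEGER NOT NULL REFERENCES Student(id),
    classid    INTEGER NOT NULL,
    attendance INTEGER,            -- 1 = TRUE, 0 = FALSE, NULL = not known yet
    PRIMARY KEY (studentid, classid)
);
""")
conn.executemany("INSERT INTO Student VALUES (?, ?)", [(1, "John"), (2, "Jane")])
conn.executemany("INSERT INTO StudentClass VALUES (?, ?, ?)", [
    (1, 1, 1), (1, 2, 0), (1, 3, None),
    (2, 1, 1), (2, 2, 1), (2, 3, None),
])

# "How many classes did John miss?"
missed_by_john = conn.execute("""
    SELECT COUNT(*)
    FROM StudentClass sc
    JOIN Student s ON s.id = sc.studentid
    WHERE s.name = ? AND sc.attendance = 0
""", ("John",)).fetchone()[0]

# "How many students did not miss any class?"
# NULL rows (classes still in the future) do not count as missed.
students_without_misses = conn.execute("""
    SELECT COUNT(*)
    FROM Student s
    WHERE NOT EXISTS (
        SELECT 1 FROM StudentClass sc
        WHERE sc.studentid = s.id AND sc.attendance = 0
    )
""").fetchone()[0]
```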
This is the way you should keep track of properties of a relation between two things. I can't really make out what you're trying to build, but I'm pretty sure you need a similar model.
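Applied to the ports question, the same idea might look like the sketch below. Everything in it is an assumption made for illustration: the table names (port, port_check, port_check_port), the sample rows, and the definition of usage as "share of checks in the date range in which the port was in use". Python's sqlite3 module stands in for MySQL, and the comma-separated lists are split in application code while migrating.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE port (port_id INTEGER PRIMARY KEY);          -- static port list
CREATE TABLE port_check (                                  -- one row per cron run
    id INTEGER PRIMARY KEY,
    address TEXT NOT NULL,
    checked_at TEXT NOT NULL
);
CREATE TABLE port_check_port (                             -- link table
    check_id INTEGER NOT NULL REFERENCES port_check(id),
    port_id  INTEGER NOT NULL REFERENCES port(port_id),
    PRIMARY KEY (check_id, port_id)
);
""")
conn.executemany("INSERT INTO port VALUES (?)", [(2,), (3,), (7,)])

# Made-up legacy rows: (address, running_port_ids, insert date).
legacy_rows = [
    ("10.2.1.3", "2,3,7", "2015-08-01 10:00:00"),
    ("10.2.1.3", "2,7", "2015-08-01 11:00:00"),
    ("10.2.1.3", "2", "2015-08-01 12:00:00"),
    ("10.2.1.3", "2,7", "2015-08-01 13:00:00"),
    ("10.3.21.22", "3", "2015-08-01 10:00:00"),
]
for address, port_ids, checked_at in legacy_rows:
    cur = conn.execute(
        "INSERT INTO port_check (address, checked_at) VALUES (?, ?)",
        (address, checked_at))
    # Split the comma-separated list into one link row per port.
    ports = sorted({int(p) for p in port_ids.split(",") if p.strip()})
    conn.executemany("INSERT INTO port_check_port VALUES (?, ?)",
                     [(cur.lastrowid, p) for p in ports])

# Usage per address and port over one month, least used first.
# The CROSS JOIN with the static list keeps ports that were never in use.
report = conn.execute("""
    SELECT pc.address, p.port_id,
           100.0 * COUNT(pcp.check_id) / COUNT(*) AS usage_pct
    FROM port_check pc
    CROSS JOIN port p
    LEFT JOIN port_check_port pcp
           ON pcp.check_id = pc.id AND pcp.port_id = p.port_id
    WHERE pc.checked_at >= ? AND pc.checked_at < ?
    GROUP BY pc.address, p.port_id
    ORDER BY usage_pct ASC, pc.address, p.port_id
""", ("2015-08-01", "2015-09-01")).fetchall()
```

Once the data is in this shape, the report is a single grouped query; no string functions are needed at query time.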
Hope this helps.

Advice on linking product codes to a product in a MySQL database

I need some advice on how to set up my tables. I currently have a product table and a product codes table.
In the codes table I have an id and a title such as:
1 567902
2 345789
3 345678
There can be many items in this table.
In my product table I have the usual product id, title, etc., but also a code id column in which I'm currently storing a comma-separated list of ids for any codes the product needs to reference.
In that column I could end up with ids like: 2,5,6,9
I'm going to need to be able to search the products table for a specific set of code ids. This is where I've run into problems: using id IN ($var) or FIND_IN_SET is proving problematic. I've been advised to restructure it, which I'm happy to do; I'm just wondering what the best method would be.
Sounds like you have two choices. If this is a 1 to many relationship, then you need to have the foreign key in the code table, not the product table.
i.e.
codeId  code    productId
1       567902  2
2       345789  6
3       345678  9
4       345690  9
The other option is to have another table which contains productId and codeId (both as foreign keys), this is a many-to-many relationship. This is what you should go for if a code can be assigned to multiple products (I assume not). It will look something like this:
codeId  productId
1       2
1       10
2       6
3       9
4       9
I think the first option is what you need.
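Whichever option fits, searching for a set of code ids becomes an ordinary IN query once each code reference is its own row. The sketch below illustrates this for the many-to-many variant, using the sample rows above; Python's sqlite3 module stands in for MySQL, and the product_code table name, the index and both helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_code (
    codeId    INTEGER NOT NULL,
    productId INTEGER NOT NULL,
    PRIMARY KEY (codeId, productId)
);
-- Hypothetical reverse index for "which codes does product X have".
CREATE INDEX idx_product_code_product ON product_code (productId, codeId);
""")
conn.executemany("INSERT INTO product_code VALUES (?, ?)",
                 [(1, 2), (1, 10), (2, 6), (3, 9), (4, 9)])


def products_with_any(code_ids):
    """Products that reference at least one of the given (non-empty) code ids."""
    marks = ", ".join("?" for _ in code_ids)
    sql = (f"SELECT DISTINCT productId FROM product_code "
           f"WHERE codeId IN ({marks}) ORDER BY productId")
    return [r[0] for r in conn.execute(sql, list(code_ids))]


def products_with_all(code_ids):
    """Products that reference every one of the given (non-empty) code ids."""
    wanted = sorted(set(code_ids))
    marks = ", ".join("?" for _ in wanted)
    sql = (f"SELECT productId FROM product_code "
           f"WHERE codeId IN ({marks}) "
           f"GROUP BY productId "
           f"HAVING COUNT(DISTINCT codeId) = ? ORDER BY productId")
    return [r[0] for r in conn.execute(sql, wanted + [len(wanted)])]


any_of_1_3 = products_with_any([1, 3])
all_of_3_4 = products_with_all([3, 4])
all_of_1_3 = products_with_all([1, 3])
```

The ids are passed as bound parameters rather than interpolated into the SQL string, which also avoids the quoting problems that tend to come with id IN ($var).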