Let's say we have a table with these records of tags:
Category ID
apples 1
orange 2
And then we have another table with a row
Data catID
... 1
With this setup we can retrieve this row only on the apples page. What is the proper way to assign both apples and orange to that row? Would I need to change the catID field from integer to varchar and append the second id, so the value becomes 1,2, and then change the query to something like:
select * from table where catID LIKE '%1%'
select * from table where catID LIKE '%2%'
instead of
select * from table where catID='1'
select * from table where catID='2'
I'm not sure this is the proper way. Could someone tell me how you do it? Basically, I don't want to duplicate the whole row, just add another id to it.
As others have already suggested, a many-to-many relationship is represented in the physical model by a junction table. I'll do the legwork and illustrate that for you:
The CATEGORY_ITEM is the junction table. It has a composite PK consisting of FKs migrated from the other two tables. Example data...
CATEGORY:
CATEGORY_ID  CATEGORY
-----------  --------
1            Apple
2            Orange
ITEM:
ITEM_ID  NAME
-------  ----
1        Foo
2        Bar
CATEGORY_ITEM:
CATEGORY_ID  ITEM_ID
-----------  -------
1            1
2            1
1            2
The above means: "Foo is both Apple and Orange, Bar is only Apple".
The PK ensures any given combination of category and item cannot exist more than once. The category is either connected to the item or it isn't; it cannot be connected multiple times.
Since you primarily want to search for items of a given category, the order of fields in the PK is {CATEGORY_ID, ITEM_ID} so the underlying index can satisfy that query. The exact explanation why is beyond the scope of this answer; if you are interested, I warmly recommend reading Use The Index, Luke!.
And since InnoDB uses clustering, this will also store items belonging to the same category physically close together, which may be rather beneficial for I/O of the query above.
(If you wanted to query for categories of the given item, you'd need to flip the order of fields in the index.)
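As a rough, runnable sketch of the junction-table design above: the snippet below uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL/InnoDB (an assumption made purely so it runs anywhere; clustering behaviour is not reproduced). The table and column names come from the answer; the helper name items_in_category is hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption for portability).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE CATEGORY (
    CATEGORY_ID INTEGER PRIMARY KEY,
    CATEGORY    TEXT NOT NULL
);
CREATE TABLE ITEM (
    ITEM_ID INTEGER PRIMARY KEY,
    NAME    TEXT NOT NULL
);
-- Junction table: composite PK built from the two foreign keys,
-- with CATEGORY_ID first so "items of a category" can use the PK index.
CREATE TABLE CATEGORY_ITEM (
    CATEGORY_ID INTEGER NOT NULL REFERENCES CATEGORY (CATEGORY_ID),
    ITEM_ID     INTEGER NOT NULL REFERENCES ITEM (ITEM_ID),
    PRIMARY KEY (CATEGORY_ID, ITEM_ID)
);
INSERT INTO CATEGORY VALUES (1, 'Apple'), (2, 'Orange');
INSERT INTO ITEM VALUES (1, 'Foo'), (2, 'Bar');
INSERT INTO CATEGORY_ITEM VALUES (1, 1), (2, 1), (1, 2);
""")


def items_in_category(category_id):
    """Return the names of all items linked to the given category."""
    rows = conn.execute(
        """
        SELECT i.NAME
        FROM CATEGORY_ITEM AS ci
        JOIN ITEM AS i ON i.ITEM_ID = ci.ITEM_ID
        WHERE ci.CATEGORY_ID = ?
        ORDER BY i.ITEM_ID
        """,
        (category_id,),
    ).fetchall()
    return [name for (name,) in rows]


print(items_in_category(1))  # ['Foo', 'Bar']  (Apple)
print(items_in_category(2))  # ['Foo']         (Orange)
```

Inserting the pair (1, 1) a second time would raise sqlite3.IntegrityError, which is the "cannot be connected multiple times" guarantee the composite PK gives you.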
Have you realized that two ids indexing one row is a typical application of bidirectional relationship management in a real project? We need a smarter solution in the DB than the two-rows or junction-table solution. In MongoDB, you could store "low_id:high_id" as the "_id" field, add a field "uids_low_high", and index "uids_low_high" for "$in:[$id]" searches.
A table has a category column that can hold either the category name directly, or a category ID that points to a separate table listing the categories.
Which one is faster?
CASE 1:
TABLE1
ID | CATEGORY_NAME
CASE 2 (here, to fetch the list, we have to make one JOIN statement):
TABLE1
ID | CATEGORY_ID
CATEGORY_TABLE
ID | CATEGORY_NAME
I suppose what you mean by first case is Item_id.
In that case it depends on code maintainability and storage. If a category name changes, you have to update every matching record in your large table, which will be a heavy query.
Also, storing category_id instead of the name saves storage space, because you are not replicating the category name for each record but just referring to category_id.
But clearly it depends on the size of category and size of items.
If TABLE1 is rather small (let's say fewer than 10k rows at most), an indexed VARCHAR CATEGORY_NAME should work OK; otherwise it is better to use a separate table for storing categories.
Personally, I would go with option 2. The performance improvement here will not be dramatic, but storing a dictionary field directly in the referencing table, rather than in a table of its own, is bad practice.
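To make the maintainability point concrete, here is a small sketch of CASE 2 using Python's sqlite3 module (an in-memory SQLite database stands in for whatever engine you actually use). The sample category names 'books' and 'music' and the helper rows_with_names are made up for illustration; only the table layout comes from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CATEGORY_TABLE (
    ID            INTEGER PRIMARY KEY,
    CATEGORY_NAME TEXT NOT NULL UNIQUE
);
CREATE TABLE TABLE1 (
    ID          INTEGER PRIMARY KEY,
    CATEGORY_ID INTEGER NOT NULL REFERENCES CATEGORY_TABLE (ID)
);
-- Index the foreign key so the JOIN and per-category lookups stay cheap.
CREATE INDEX idx_table1_category ON TABLE1 (CATEGORY_ID);

-- Illustrative sample data (not from the original question).
INSERT INTO CATEGORY_TABLE VALUES (1, 'books'), (2, 'music');
INSERT INTO TABLE1 VALUES (1, 1), (2, 1), (3, 2);
""")


def rows_with_names():
    """Fetch every TABLE1 row together with its category name (one JOIN)."""
    return conn.execute(
        """
        SELECT t.ID, c.CATEGORY_NAME
        FROM TABLE1 AS t
        JOIN CATEGORY_TABLE AS c ON c.ID = t.CATEGORY_ID
        ORDER BY t.ID
        """
    ).fetchall()


# Renaming a category touches exactly one row, however large TABLE1 is.
cur = conn.execute("UPDATE CATEGORY_TABLE SET CATEGORY_NAME = 'novels' WHERE ID = 1")
print(cur.rowcount)       # 1
print(rows_with_names())  # [(1, 'novels'), (2, 'novels'), (3, 'music')]
```

With CASE 1 the same rename would have to rewrite every TABLE1 row that carries the old name.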
I had a question about whether or not my implementation idea is easy to work with/write queries for.
I currently have a database with multiple columns. Most of the columns are the same thing (items, but split into item 1, item 2, item 3 etc).
So I have currently in my database ID, Name, Item 1, Item 2 ..... Item 10.
I want to condense this into ID, Name, Item.
But what I want item to have is to store multiple values as different rows. I.e.
ID = One   Name = Hello   Item = This
                                 That
                                 There
Kind of like the format it looks like. Is this a good idea and how exactly would I go about doing this? I will be using no numbers in the database and all of the information will be static and will never change.
Can I do this using 1 database table (and would it be easy to match items of one ID to another ID), or would I need to create 2 tables and link them?
If so how exactly would I create 2 tables and make them relational?
Any ideas on how to implement this? Thanks!
This is a classic type of denormalized database. Denormalization sometimes makes certain operations more efficient, but more often leads to inefficiencies. (For example, if one of your write queries was to change the name associated with an id, you would have to change many rows instead of a single one.) Denormalization should only be done for specific reasons after a fully normalized database has been designed. In your example, a normalized database design would be:
table_1: ID (key), Name
table_2: ID (foreign key mapped to table_1.ID), Item
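As a sketch of how the move from "Item 1 ... Item 10" columns to the two-table design might look, the snippet below uses Python's sqlite3 module with an in-memory database. The wide table is cut down to three item columns, the second sample row ('Two', 'World') is invented for illustration, and the helper names items_of and shared_items are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The existing wide layout, shortened to three item columns for the example.
CREATE TABLE wide (ID TEXT PRIMARY KEY, Name TEXT, Item1 TEXT, Item2 TEXT, Item3 TEXT);
INSERT INTO wide VALUES ('One', 'Hello', 'This', 'That', 'There');
INSERT INTO wide VALUES ('Two', 'World', 'This', NULL, NULL);

-- The normalized layout from the answer.
CREATE TABLE table_1 (ID TEXT PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE table_2 (
    ID   TEXT NOT NULL REFERENCES table_1 (ID),
    Item TEXT NOT NULL,
    PRIMARY KEY (ID, Item)
);
""")

# Copy the one-per-ID data, then turn each item column into rows.
conn.execute("INSERT INTO table_1 SELECT ID, Name FROM wide")
for col in ("Item1", "Item2", "Item3"):  # fixed column names, not user input
    conn.execute(
        f"INSERT INTO table_2 SELECT ID, {col} FROM wide WHERE {col} IS NOT NULL"
    )


def items_of(row_id):
    """All items stored for one ID, one row per item."""
    rows = conn.execute(
        "SELECT Item FROM table_2 WHERE ID = ? ORDER BY Item", (row_id,)
    ).fetchall()
    return [item for (item,) in rows]


def shared_items(id_a, id_b):
    """Items that two IDs have in common (a self-join on table_2)."""
    rows = conn.execute(
        """
        SELECT a.Item
        FROM table_2 AS a
        JOIN table_2 AS b ON b.Item = a.Item
        WHERE a.ID = ? AND b.ID = ?
        ORDER BY a.Item
        """,
        (id_a, id_b),
    ).fetchall()
    return [item for (item,) in rows]


print(items_of("One"))             # ['That', 'There', 'This']
print(shared_items("One", "Two"))  # ['This']
```

The self-join in shared_items is the kind of "match items of one ID to another ID" query that is awkward to write against ten separate item columns.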
You're talking about a denormalized table, which SQL databases have a difficult time dealing with. Your Item field is said to have a many-to-one relationship to the other fields. The correct thing to do is to make two tables. The typical example is an album and songs. Songs have a many-to-one relationship to albums, so you could structure your tables like this:
Table Album
album_id [Primary Key]
Title
Artist
Table Song
song_id [Primary Key]
album_id [Foreign Key album.album_id]
Title
Often this example is given with a third table Artist, and you could substitute the Artist field for an artist_id field which is a Foreign Key to an Artist table's artist_id.
Of course, in reality songs, albums, and artists are more complex. One song can be on multiple albums, multiple artists can be on one album, there are multiple versions of the same song, and there are even some songs which have no album release at all.
Example:
Album
album_id  Title  Artist
1         White  Beatles
2         Black  Metallica
Song
song_id  album_id  Title
1        2         Enter Sandman
2        1         Back in the USSR
3        2         Sad but True
4        2         Nothing Else Matters
5        1         Helter Skelter
To query this you just do a JOIN:
SELECT * FROM Album INNER JOIN Song ON Album.album_id = Song.album_id
I don't think one table really makes sense in this case. Instead you can do:
Main Table:
ID
Name
Item Table:
ID
Item #
Item Value
Main_ID = Main Table.ID
Then when you do queries you can do a simple join on Main_ID.
Let's say I have a categories table that stores categories. It is implemented in a nested set style (with left and right values).
category_id lft rgt
1 1 6
2 2 5
3 3 4
So category 1 is the parent of category 2, and category 2 is the parent of category 3. It's essentially one path from root to leaf.
The fields of category 1 should be inherited by category 2, whose fields in turn should be inherited by category 3.
Now what is the best way to store the fields for a specific category? My solution was to make another table which has the category id foreign key and the fieldname.
category_id fieldname
1 field1
1 field2
2 field3
3 field4
My problem with this approach is that when getting the fields of category 3, I need to get its parent, its parent's parent and so on until I get to the root node so that I can inherit their fields. It's not really a bad solution but I wonder if this would work when the category table is very large.
So the problem is basically an optimization problem. Is this an optimal solution?
You can do this using the schema that you have, but joining the two tables together. The beauty of the left/right nest structure is that in one query you can pull out lots of information about the whole hierarchy.
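In your instance, you want to pull out all the category IDs with a 'lft' equal to or less than the 'lft' value of your given category and a 'rgt' equal to or greater than its 'rgt' value (these are the category itself plus all of its ancestors), and join the results against the category ID fields in your fields table. Checking 'lft' alone is only enough when the tree is a single path, as in your sample; in a branching tree it would also pick up unrelated categories that sit to the left.
The query is something like:
select table2.fieldname from table2 inner join table1 on table1.category_id = table2.category_id where table1.lft <= [lft value for given category] and table1.rgt >= [rgt value for given category]
If you only have the category ID to go on, you can also extract the lft and rgt values using a subselect or by joining the table back on itself.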
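A minimal sketch of that lookup on the question's sample data, written with Python's sqlite3 module (in-memory SQLite as a stand-in for your real database). The table names categories and category_fields and the helper inherited_fields are assumptions; the question does not name its tables. An ancestor is taken to be any row whose lft is at or before the node's lft and whose rgt is at or after the node's rgt, and the categories table is joined back on itself so that only the category ID is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (
    category_id INTEGER PRIMARY KEY,
    lft INTEGER NOT NULL,
    rgt INTEGER NOT NULL
);
CREATE TABLE category_fields (
    category_id INTEGER NOT NULL REFERENCES categories (category_id),
    fieldname   TEXT NOT NULL,
    PRIMARY KEY (category_id, fieldname)
);
INSERT INTO categories VALUES (1, 1, 6), (2, 2, 5), (3, 3, 4);
INSERT INTO category_fields VALUES
    (1, 'field1'), (1, 'field2'), (2, 'field3'), (3, 'field4');
""")


def inherited_fields(category_id):
    """Fields of the category plus those of all its ancestors, root first."""
    rows = conn.execute(
        """
        SELECT f.fieldname
        FROM categories AS node
        JOIN categories AS anc
          ON anc.lft <= node.lft AND anc.rgt >= node.rgt
        JOIN category_fields AS f
          ON f.category_id = anc.category_id
        WHERE node.category_id = ?
        ORDER BY anc.lft, f.fieldname
        """,
        (category_id,),
    ).fetchall()
    return [name for (name,) in rows]


print(inherited_fields(3))  # ['field1', 'field2', 'field3', 'field4']
print(inherited_fields(2))  # ['field1', 'field2', 'field3']
print(inherited_fields(1))  # ['field1', 'field2']
```

This is a single query regardless of depth, so there is no need to walk up the tree parent by parent in application code.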
I simply save such data in a table with the following schema:
CategoryID
ParentCategoryID
Path
So you could have:
CategoryID  ParentCategoryID  Path
1           0                 1\
2           1                 1\2
3           2                 1\2\3
You can then fire a simple query to get either the immediate parent ID or the whole hierarchy from root to leaf using the Path field.
It's a different but simple approach that has been working for me for the last 4+ years without any issues :-)
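A rough sketch of how the Path column might be queried, using Python's sqlite3 module with an in-memory database. The table name category and the helper names are hypothetical, and the paths are stored without a trailing separator here (an assumption; the splitting code tolerates a trailing one as well).

```python
import sqlite3

SEP = "\\"  # the path separator used in the answer

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE category (
        CategoryID       INTEGER PRIMARY KEY,
        ParentCategoryID INTEGER NOT NULL,
        Path             TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO category VALUES (?, ?, ?)",
    [
        (1, 0, "1"),
        (2, 1, SEP.join(["1", "2"])),
        (3, 2, SEP.join(["1", "2", "3"])),
    ],
)


def ancestors_and_self(category_id):
    """Root-to-leaf chain of IDs, read straight out of the Path column."""
    (path,) = conn.execute(
        "SELECT Path FROM category WHERE CategoryID = ?", (category_id,)
    ).fetchone()
    return [int(part) for part in path.split(SEP) if part]


def descendants(category_id):
    """Every category whose path starts with this category's path."""
    (path,) = conn.execute(
        "SELECT Path FROM category WHERE CategoryID = ?", (category_id,)
    ).fetchone()
    rows = conn.execute(
        "SELECT CategoryID FROM category WHERE Path LIKE ? ORDER BY Path",
        (path.rstrip(SEP) + SEP + "%",),
    ).fetchall()
    return [cid for (cid,) in rows]


print(ancestors_and_self(3))  # [1, 2, 3]
print(descendants(1))         # [2, 3]
```

Appending the separator before the % wildcard keeps a path such as 1 from also matching an unrelated path that merely starts with the same digits, for example 10.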
I'm using a simple newsletter script where one user can have several categories. But I want to get the different categories in one row, like 1,2,3.
The tables:
newsletter_emails
id email category
1 test#test.com 1
2 test#test.com 2
newsletter_categories
id name
1 firstcategory
2 secondcategory
But what I'm looking for is something like this:
newsletter_emails
user_id email category
1 test#test.com 1,2
2 person#person.com 1
What's the best solution for this?
PS: The user can select their own categories on the profile page (maybe with a MySQL UPDATE?).
SQL and the relational data model aren't exactly made for this kind of thing. You can do any of the following:
-use a simple SELECT query on the first table, then in your consuming code, iterate over the result, fetching the corresponding rows from the second table and combining them into a string (how exactly you'd do this depends on the language you're using)
-use a JOIN on both tables, iterate over the result set and accumulate values from table 2 as long as the ID from table 1 remains the same. This is harder to code than the first solution, and the result set you're pulling from the DB is larger, but you'll get away with just one query.
-use DBMS-specific extensions to the SQL standard (e.g. GROUP_CONCAT) to achieve this. You'll get exactly what you asked for, but your SQL queries won't be as portable.
This is a many-to-many relationship case. Instead of storing comma-separated category ids, make an associative table between newsletter_emails and newsletter_categories, for example user_category, with the following schema:
user_id category
1 1
1 2
2 1
This way you won't have to do string processing if a user unsubscribes from a category. You will just have to remove the row from the user_category table.
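Here is a small sketch of that associative table, using Python's sqlite3 module (in-memory SQLite in place of MySQL). It assumes newsletter_emails is reduced to one row per user with a user_id key, as in the layout the question asks for; the helper names category_list and unsubscribe are hypothetical. The comma-separated string is built in application code, so nothing comma-separated is ever stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE newsletter_emails (
    user_id INTEGER PRIMARY KEY,
    email   TEXT NOT NULL UNIQUE
);
CREATE TABLE newsletter_categories (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE user_category (
    user_id  INTEGER NOT NULL REFERENCES newsletter_emails (user_id),
    category INTEGER NOT NULL REFERENCES newsletter_categories (id),
    PRIMARY KEY (user_id, category)
);
INSERT INTO newsletter_emails VALUES (1, 'test#test.com'), (2, 'person#person.com');
INSERT INTO newsletter_categories VALUES (1, 'firstcategory'), (2, 'secondcategory');
INSERT INTO user_category VALUES (1, 1), (1, 2), (2, 1);
""")


def category_list(user_id):
    """Category ids of one user as a display string such as '1,2'."""
    rows = conn.execute(
        "SELECT category FROM user_category WHERE user_id = ? ORDER BY category",
        (user_id,),
    ).fetchall()
    return ",".join(str(cat) for (cat,) in rows)


def unsubscribe(user_id, category):
    """Removing a subscription is a single-row DELETE, no string editing."""
    conn.execute(
        "DELETE FROM user_category WHERE user_id = ? AND category = ?",
        (user_id, category),
    )


print(category_list(1))  # 1,2
unsubscribe(1, 2)
print(category_list(1))  # 1
```

Subscribing again is the mirror image: one INSERT into user_category, with the composite key rejecting duplicates.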
Try this (completely untested):
SELECT MIN(id) AS user_id, email, GROUP_CONCAT(category ORDER BY category) AS category FROM newsletter_emails GROUP BY email ORDER BY user_id ASC;
I am restructuring a classifieds MySQL db where the different main sections are separated into separate tables. For example, sale items have their own table with unique IDs, jobs have their own table with unique IDs, and personals have their own table as well.
These sections all share a few common characteristics:
-id
-title
-body
-listing status
-poster
-reply email
-posting date
But they each have some separate information required as well:
-each have different sets and trees of categories to choose from (which affect the structure needed to store them)
-jobs need to store things like salary, start date, etc.
-sale items need to store things like prices, obo, etc.
Therefore, is it a better practice to refactor the db while I can to a universal table to store ALL the general listing info regardless of section, and then task out customized data storage to small tables, or is it better to leave the current structure alone and leave the sections separated?
Sounds like they are all separate entities that have nothing to do with each other (except for sharing some column definitions), right?
Do you ever want to do a SELECT like
SELECT *
FROM main_entity
WHERE entity_type IN ('SALE_ITEM', 'JOB', 'PERSONAL')?
Otherwise I don't think I would merge them into one table.
Don't use a single table. Go relational.
What I would recommend setting up is a so-called polymorphic relationship between your "main" table (the one with the common characteristics), and three tables containing specific information. The structure would look something like this:
Main table
id
title
...
category_name (VARCHAR or CHAR)
category_id (INTEGER)
Category table
id
(specific columns)
The category_name field should contain the table name of the specific category table, eg. 'job_category', while the category_id should point to ID in the category table. An example would look like this:
# MAIN TABLE
id | title | ... | category_name | category_id
-------------------------------------------------------
123 | Some title | ... | job_category | 345
321 | Another title | ... | sale_category | 543
# SPECIFIC TABLE (job_category)
id | ...
---------
345 | ...
# SPECIFIC TABLE (sale_category)
id | ...
---------
543 | ...
Now, whenever you query the main table, you will immediately know which table to fetch the additional data from, and you will know the ID in that table. The only downside to this approach is that you have to perform two separate queries to fetch information for one single item. It would probably be possible to do this in a transaction, however.
For fetching data the other way around (e.g. you search job_category for something), on the other hand, you can fetch the associated data from the main table with a JOIN. Remember to not only join on main.category_id = job_category.id, but also to use the category_name column as a join condition. Otherwise, you may fetch data that belongs to one of the other categories.
For optimal performance, you may want to index the category_name and category_id columns. This would mostly speed up any queries that join the two tables, as described in the previous paragraph.
Hope this helps!
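A minimal sketch of the polymorphic layout described above, using Python's sqlite3 module with an in-memory database in place of MySQL. The salary and price columns, the third sample listing (added so that the same id exists in both specific tables), and the helper names are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE main (
    id            INTEGER PRIMARY KEY,
    title         TEXT NOT NULL,
    category_name TEXT NOT NULL,    -- name of the specific table
    category_id   INTEGER NOT NULL  -- id of the row in that table
);
CREATE INDEX idx_main_category ON main (category_name, category_id);
CREATE TABLE job_category  (id INTEGER PRIMARY KEY, salary INTEGER);
CREATE TABLE sale_category (id INTEGER PRIMARY KEY, price  INTEGER);

INSERT INTO main VALUES
    (123, 'Some title',    'job_category',  345),
    (321, 'Another title', 'sale_category', 543),
    (322, 'Third title',   'sale_category', 345);  -- same id as the job row
INSERT INTO job_category  VALUES (345, 50000);
INSERT INTO sale_category VALUES (543, 20), (345, 35);
""")

# Whitelist of specific tables, so a stored name is never trusted blindly.
DETAIL_TABLES = {"job_category", "sale_category"}


def details(main_id):
    """Two-step fetch: read the pointer from main, then the specific row."""
    table, detail_id = conn.execute(
        "SELECT category_name, category_id FROM main WHERE id = ?", (main_id,)
    ).fetchone()
    if table not in DETAIL_TABLES:
        raise ValueError(f"unknown detail table: {table}")
    return conn.execute(
        f"SELECT * FROM {table} WHERE id = ?", (detail_id,)
    ).fetchone()


def job_listings(match_name=True):
    """Join from job_category back to main, optionally without the name check."""
    name_check = "AND m.category_name = 'job_category'" if match_name else ""
    return conn.execute(
        f"""
        SELECT m.id, m.title, j.salary
        FROM job_category AS j
        JOIN main AS m ON m.category_id = j.id {name_check}
        ORDER BY m.id
        """
    ).fetchall()


print(details(123))                   # (345, 50000)
print(job_listings())                 # [(123, 'Some title', 50000)]
print(job_listings(match_name=False)) # also returns listing 322, which is a sale item
```

The last call shows why category_name has to be part of the join condition: ids are only unique within each specific table, so joining on category_id alone attaches a sale listing to a job row.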