I'm wondering how to store a list of ID indexes for each id in a table. These will effectively form "relations" between the IDs. Here's the main table so far:
table main:
id, integer, primary key
name, varchar
text_info_1, varchar
text_info_2, varchar
And now for each row, there will be some list of other rows' IDs which will show me how one row relates to the next. For example, the row with id #5 might be related to IDs 6,7,9,25,...etc.
Here are the options I've considered:
Create a new column as a text field and store a serialized list of these integer values. Then just unserialize when I want them.
Create a new table called "relations" with columns relation_id (int auto increment primary key), name1, name2, [and optional other fields specifying the relation type, which would be nice].
I feel like option 1 is a bit of a hack. I've done it before and it works, but perhaps option 2 is better design?
I'm worried about speed though. With option 1 I can just do SELECT relations FROM main WHERE id = $id, and then unserialize the result and have my array with integer indices. But with option 2 I'll have to browse through a table that will be many times (10x or more) larger, and do "SELECT name1, name2 FROM relations".
Speed is my main priority here. I'm not sure which one is better for space, though I would be curious to find out.
So which option should I go with? Are there other good options I haven't considered? I'd also appreciate some good general pointers on database design!
Thanks a bunch.
Second option is better. If you want to get better in database design, you should read about database normalization. Sadly, all the materials I've used were not in English. This wikipedia link
could be a start.
Table relations:
relation_id (int auto increment primary key)
id_one (int)
id_two (int)
...
Instead of storing names in the relations table, you store ids.
Related
Let's assume there is a table, with theese rows:
-personID,
-personName,
-personInterests
There is also another table, which stores the interests:
-interestID
-interestName
One person can have multiple interests, so I put the serialize()-d or JSON representation of the interest array into the interest field. This is not a String, like "reading", buth rather an index of the interests table, which stores the possible interests. Something like multiple foreign keys in one field.
The best way would be to use foreign keys, but it is not possible to achieve multiple references in one field...
How do I run such a query, without REGEX or splitting the field's content by software? If putting indexes to one field is not the way to go, then how is it possible, to achieve a structure like this?
Storing multiple indexes or any references in one field is strictly not advised.
You have to create something that I call "rendezvous" table.
In your case it has:
- ID
- UserID (foreign key)
- InterestID (foreign key)
Every single person can have multiple interests, so when a person adds a new interest to himself, you just add a new row into this table, that will have a reference to the person and the desired interest with a foreign key NOT NULL.
On large-scale projects when there are too many variations available, it is advised, to not to give an ID row to this table, but rather set the two foreign keys also primary keys, so the duplication will be impossible and the table-index will be smaller, as well as in case of lookup, it will consume less from the expensive computing power.
So the best solution is this:
- UserID (foreign key AND primary key)
- InterestID (foreign key AND primary key)
I believe the only way you can implement this is to create a third table, which will actually get updated by a trigger (Similar to what Gabor Dani advised)
Table1
-personID,
-personName,
-personInterests
Table2
-interestID
-interestName
Table3
-personInterestID (AutoIncrement Field)
-personID
-interestID
Then you need to write a trigger which will do this a stored procedure may be needed because you will need to loop through all the values in the field.
I'm designing a db table that will save a list of user's favorited food items.
I created favorite table with the following schema
id, user_id, food_id
user_id and food_id will be foreign key linking to another table.
Im just wondering if this is efficient and scalable cause if user has multiple favorite things then it would need multiple rows of data.
i.e. user has 5 favorited food items, then it will consist of five rows to save the list for that user.
Is this efficient? and scalable? Whats the best way to optimize this schema?
thnx in advance!!!
tldr; This is called a "join table" and is the correct and scalable approach to model M-M relationships in a relational database. (Depending upon the constraints used it can also model 1-M/1-1 relationships in a "no NULL FK" schema.)
However, I contend that the id column should be omitted here so that the table is only user_id, food_id. The PK will be (user_id, food_id) in this case.
Unlike other tables, where surrogate (aka auto-increment) PKs are sometimes argued for, a surrogate PK generally only adds clutter in a join table as it has a very natural compound PK.
While the PK itself is compound in this case, each "joined" table only relates back by part of the PK. Depending upon queries performed it might also be beneficial to add covering indices on food_id or (food_id, user_id).
Eliminate Surrogate Key: Unless you have a specific reason for the surrogate key id, exclude it from the table.
Fine-tune Indexing: A this point, you just have a composite primary key that is the combination of the two foreign keys. In which order should the PK fields be?
If your application(s) predominantly execute queries such as: "for given user, give me foods", then PK should be {user_id, food_id}.
If the predominant query is "for given food, give me users", then the PK should be {food_id, user_id}.
If both query "directions" are common, add a UNIQUE INDEX that has the same fields as PK, but in opposite directions. So you'll have PK on {user_id, food_id} and index on {food_id, user_id}.
Note that InnoDB tables are clustered, which eliminates (in this case "unnecessary") table heap. Yet, the secondary index discussed above will not cause a double-lookup (since it fully covers the query), nor have a hidden overhead of PK fields (since it indexes the same fields as PK, just in opposite order).
For more on designing a junction table, take a look at this post.
To my opinion, you can optimize your table in the following ways:
As a relation table with 2 foreighkeys you don't have to use "id" field.
use "innodb" engine to your table
name your relation table "user_2_food", which will make it more clear.
try to use datatype as small as possible, i.e. "smallint" is better than "int", and don't forget "UNSIGNED" attribute.
Creating the below three Tables will result in an efficient design.
users : userId, username, userdesc
foods : foodId, foodname, fooddesc
userfoodmapping : ufid, userid, foodid, rowstate
The significance of rowstate is, if the user in future doesn't like that food, its state will become -1
You have 2 options in my opnion:
Get rid of the ID field, but in that case, make both your other keys (combined) your primary key
Keep your ID key as the primary key for your table.
In either case, I think this is a proper approach. Once you get into a problem of inefficiency, then you will look at probably how to load part of the table or any other technique. This would do for now.
Disclaimer: I didn't design these tables, and am aware they're not the best way to store this data. I need advice on the best way to handle the situation.
I have two MySQL tables:
`students`
id int; primary, unique
prof1_id varchar
prof2_id varchar
prof3_id varchar
`professors`
id int; primary, unique, auto-increment
username varchar
fname varchar
fname varchar
students_id int
comment text
A professor's username may be in any of the last three columns in the students table. Professors will need to provide one comment for each student who has them in their row in the students table.
The application that is the front end to this database expects two more columns in the professors table: student_id and comment. Since each student may have any 3 professors in their row, new rows will need to be added to professors whenever they are listed for multiple students. This means that the professors id field will need to auto increment for each new student comment.
What is the best way to accomplish this data transfer with a MySQL query? I've tried looking at UPDATE in combination with JOIN, but I'm not sure there is even a valid way to do this. Also, do I need to create another primary key for professors since the multiple rows will have the same id?
I suspect that a VIEW might be the best way to accomplish this, but the application on the front end expects the information to be stored in the professors table.
One other thing I was considering was that I could create a new table by joining the two tables and have a two-column primary key, students.id, professor.id.
Thanks!
Also, do I need to create another primary key for professors since the
multiple rows will have the same id?
Yes. This is good idea.
I wrote simple query that merges data from first table into 2 columns. This is not complete answer, but it can help you a lot:
SELECT id, prof1id as profid
UNION
SELECT id, prof2id
UNION
SELECT id, prof3id
UNION;
You may use this for view, inserts, but im not familiar with specyfic MySQL syntax and i dont want to misslead you. Please give feedback if it work.
"UNION" removes duplicate rows, you may need to use "UNION ALL" to keep duplicates (like duplicated values in 2 or 3 professors columns).
I have a column in my table called student_id, and I am storing the student IDs associated with a particular record in that column, delimited with a | character. Here are a couple sample entries of the data in that column:
243|244|245
245|1013|289|1012
549|1097|1098|245|1099
I need to write a SQL query that will return records that have a student_id of `245. Any help will be greatly appreciated.
Don't store multiple values in the student_id field, as having exactly one value for each row and column intersection is a requirement of First Normal Form. This is a Good Thing for many reasons, but an obvious one is that it resolves having to deal with cases like having a student_id of "1245".
Instead, it would be much better to have a separate table for storing the student IDs associated with the records in this table. For example (you'd want to add proper constraints to this table definition as well),
CREATE TABLE mytable_student_id (
mytable_id INTEGER,
student_id INTEGER
);
And then you could query using a join:
SELECT * FROM mytable JOIN mytable_student_id
ON (mytable.id=mytable_student_id.mytable_id) WHERE mytable_student_id.student_id = 245
Note that since you didn't post any schema details regarding your original table other than that it contains a student_id field, I'm calling it mytable for the purpose of this example (and assuming it has a primary key field called id -- having a primary key is another requirement of 1NF).
#Donut is totally right about First Normal Form: if you have a one-to-many relation you should use a separate table, other solutions lead to ad-hoccery and unmaintainable code.
But if you're faced with data that are in fact stored like that, one common way of doing it is this:
WHERE CONCAT('|',student_id,'|') LIKE '%|245|%'
Again, I agree with Donut, but this is the proper query to use if you can't do anything about the data for now.
WHERE student_id like '%|245|%' or student_id like '%|245' or student_id like '245|%'
This takes care of 245 being at the start, middle or end of the string. But if you aren't stuck with this design, please, please do what Donut recommends.
my website includes a mysql db with two tables.
one table has a field called 'id_array', and it contains a string look like this:
'1,3,7,78,89,102'. it represents an array of id's in the second table.
what is the best strategy for retrieving the id's in the second table?
i want the fastest way.
currently i am using 'SELECT * FROM first_table WHERE id IN (id_array)'.
i used to query one by one, but i assume it is a lot slower.
if someone has a better structure to offer me for the db, that would be great.
this structure is the best i came up with, but i'm pretty sure it is not so efficiant.
i need a fast and convenient way to find all the id's that belong to the id from the first table.
help would be appreciated.
A comma separated value means the data is denormalized -- the IN clause won't work for you, because it works on distinct values.
Short Term Solution
Use MySQL's FIND_IN_SET function, like this:
SELECT *
FROM first_table
WHERE FIND_IN_SET(id, id_array)
Long Term Solution
...is not to store values in this manner, which means having a table to store the distinct values and a table to join that table of distinct values to the original table. IE:
FIRST_TABLE
first_table_id (primary key)
FIRST_TABLE_TYPE_CODES
type_code_id (primary key)
FIRST_TABLE_TYPE_MAP
first_table_id (primary key)
type_code_id (primary key)
You have a field that holds the ids that the record is related to in another table? You might want to create a lookup table that links the two tables together to help you with a many to many relationship. Post a little bit more or your db setup