For storing friend relationships in a social network, is it better to have a separate table with columns relationship_id, user1_id, user2_id, time_created, pending, or should the confirmed friends' user_ids be serialized/imploded into a single long string and stored alongside the other user details (user_id, name, dateofbirth, address), limited to something like 5000 friends as Facebook does?
Are there any better methods? The first method will create a huge table! The second one has one column with really long string...
On each user's profile page, his friends need to be retrieved from the database to show about 30 of them, similar to Facebook, so I think the first method of using a separate table will cause a huge number of database queries?
The most proper way to do this would be to have the table of Members (obviously), and a second table of Friend relationships.
You should never ever store foreign keys in a string like that. What's the point? You can't join on them, sort on them, group on them, or any other things that justify having a relational database in the first place.
If we assume that the Member table looks like this:
MemberID int Primary Key
Name varchar(100) Not null
--etc
Then your Friendship table should look like this:
Member1ID int Foreign Key -> Member.MemberID
Member2ID int Foreign Key -> Member.MemberID
Created datetime Not Null
--etc
Then, you can join the tables together to pull a list of friends
SELECT m.*
FROM Member m
INNER JOIN Friendship f ON f.Member2ID = m.MemberID
WHERE f.Member1ID = @MemberID
(This is specifically SQL Server syntax, but I think it's pretty close to MySQL. @MemberID is a parameter.)
This is always going to be faster than splitting a string and making 30 extra SQL queries to pull the relevant data.
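To make the join concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server or MySQL. The sample rows, the composite primary key, and the helper name first_friends are assumptions added for illustration, and the query only follows friendships stored with the given member in Member1ID.

```python
import sqlite3

# Hypothetical in-memory schema mirroring the Member/Friendship tables above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Member (
    MemberID INTEGER PRIMARY KEY,
    Name     TEXT NOT NULL
);
CREATE TABLE Friendship (
    Member1ID INTEGER NOT NULL REFERENCES Member(MemberID),
    Member2ID INTEGER NOT NULL REFERENCES Member(MemberID),
    Created   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (Member1ID, Member2ID)  -- assumption: one row per ordered pair
);
""")
conn.executemany("INSERT INTO Member (MemberID, Name) VALUES (?, ?)",
                 [(1, "Ann"), (2, "Bob"), (3, "Cho")])
conn.executemany("INSERT INTO Friendship (Member1ID, Member2ID) VALUES (?, ?)",
                 [(1, 2), (1, 3)])


def first_friends(conn, member_id, limit=30):
    """Return up to `limit` friend names for one member using a single query."""
    rows = conn.execute(
        """SELECT m.Name
           FROM Friendship f
           JOIN Member m ON m.MemberID = f.Member2ID
           WHERE f.Member1ID = ?
           ORDER BY m.Name
           LIMIT ?""",
        (member_id, limit),
    ).fetchall()
    return [name for (name,) in rows]


friends_of_ann = first_friends(conn, 1)
```

The profile page then costs one query regardless of how many friends are shown, instead of one query per friend.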
Separate table as in method 1.
Method 2 is bad because you would have to unserialize the string each time and won't be able to do JOINs on it; plus UPDATEs will be a nightmare if a user changes his name, email or other properties.
Sure, the table will be huge, but you can index it on Member1_id, set the foreign key back to your user table, have static row sizes, and maybe even limit the number of friends a single user can have. I don't think it will be an issue with MySQL if you do it right, even if you hit a few million rows in your relationship table.
I'm making a "Like" button for a portfolio website I'm working on and I've gotten stumped by some of my own code!
I have two MySQL tables:
img_all: contains all images on the server (each image has a 6-digit INT id)
login: contains account information for all users of the website (each user has a 6-digit INT id as well as a VARCHAR column labeled "likes")
The way my system is laid out, when a user "Likes" a picture, I save that picture's id to a column on the login table.
UPDATE login SET likes = CONCAT(likes,':$img_id:') WHERE user_key = $user_id;
and when they unlike a picture:
UPDATE login SET likes = REPLACE(likes,':$img_id:','') WHERE user_key = $user_id;
This will output strings in the likes column similar to this:
:456093:475829:203944:789203:
My problem starts here. I'm making a page that allows users to view all the pictures that they've liked (let's call this file "Likes.php").
However, the list of liked pictures is saved in the login table, while the actual picture information is saved in img_all.
How then do I take the list from my login table and translate it to select those images from img_all? I was thinking of using a mixture of:
SELECT user_key FROM login WHERE likes LIKE '%:$img_id:%';
and
while();
I also thought of a SQL query. I know it won't work. However, hopefully, it will also help relay what I'm trying to accomplish!
SELECT * FROM
img_all WHERE id =
SELECT likes FROM login
WHERE likes LIKE '%:$img_id:%' AND user_key = '$user_id';
You are almost there. Instead of using a subquery, you can turn your query into a JOIN, like:
SELECT i.*
FROM img_all i
INNER JOIN login l ON l.likes LIKE CONCAT('%:', i.id, ':%')
WHERE l.user_key = ?
While this might solve your question, please be aware that storing list of values in a single column is almost always an indication of poor design.
Accessing and modifying the data requires manipulating strings, which is awkward to do in SQL, error-prone and quite inefficient. Also, as commented by Bill Karwin, the number of likes that you can store for a single user is limited by the maximum size of the string column.
As commented by tim, you should use a separate table to store the likes, with foreign keys to the login and img_all tables.
CREATE TABLE likes (
like_id INT NOT NULL AUTO_INCREMENT,
user_id INT NOT NULL,
img_id INT NOT NULL,
PRIMARY KEY (like_id),
FOREIGN KEY fk_likes_login(user_id) REFERENCES login(user_id),
FOREIGN KEY fk_likes_img(img_id) REFERENCES img_all(img_id)
);
NB : the auto-incremented primary key is not strictly necessary, and could be replaced by a composite unique index on the two foreign keys.
Then you can retrieve all images that a user liked with a simple JOINed query :
SELECT i.*
FROM likes l
INNER JOIN img_all i ON i.img_id = l.img_id
WHERE l.user_id = ?
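As a rough end-to-end sketch of the separate likes table, the following uses Python's built-in sqlite3 module instead of MySQL and takes the composite-key variant mentioned in the NB above. The column names user_id and img_id follow this answer's schema, while the sample rows and the helper names like, unlike and liked_images are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE login   (user_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE img_all (img_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE likes (
    user_id INTEGER NOT NULL REFERENCES login(user_id),
    img_id  INTEGER NOT NULL REFERENCES img_all(img_id),
    PRIMARY KEY (user_id, img_id)  -- composite key: one like per user per image
);
""")
conn.execute("INSERT INTO login (user_id, name) VALUES (100001, 'demo')")
conn.executemany("INSERT INTO img_all (img_id, title) VALUES (?, ?)",
                 [(456093, "sunset"), (475829, "harbor"), (203944, "forest")])


def like(conn, user_id, img_id):
    # INSERT OR IGNORE is SQLite syntax (MySQL spells it INSERT IGNORE);
    # liking the same image twice becomes a no-op.
    conn.execute("INSERT OR IGNORE INTO likes (user_id, img_id) VALUES (?, ?)",
                 (user_id, img_id))


def unlike(conn, user_id, img_id):
    conn.execute("DELETE FROM likes WHERE user_id = ? AND img_id = ?",
                 (user_id, img_id))


def liked_images(conn, user_id):
    """Return (img_id, title) for every image the user has liked."""
    return conn.execute(
        """SELECT i.img_id, i.title
           FROM likes l
           JOIN img_all i ON i.img_id = l.img_id
           WHERE l.user_id = ?
           ORDER BY i.img_id""",
        (user_id,),
    ).fetchall()


like(conn, 100001, 456093)
like(conn, 100001, 475829)
like(conn, 100001, 456093)    # duplicate like, ignored
unlike(conn, 100001, 475829)
remaining = liked_images(conn, 100001)
```

Liking and unliking become a plain INSERT and DELETE, with no string manipulation involved.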
First and foremost, I want to say I do not have any code to back this at this point. I am trying to conceptualize an idea. So, apologies in advance.
Basic rundown. I have a database full of shows that each can have multiple genres, such as show A can be an action, adventure, drama. Typical, right? Right now, as I have my database set up to have columns such as genre_1, genre_2, genre_3. This is terrible, I know, which is why I am redoing it.
I am wanting to create a table full of genres, then have a table with the show information, then have a table to relate those two. So, the primary keys in the genre and show tables would be foreign keys in the genre-show table.
I'm pretty sure this is the best way to go about this one-to-many relationship, but let me know if there is something I'm missing.
My problem is, I'm uncertain of how I would, for say, list all shows that are in the action OR adventure genres, or list all shows that are in action AND adventure genres.
I'm fairly familiar with joins, but with my current knowledge I can't figure out how I would query that.
Ultimately, what I am looking to do is be able to query my DB and say "Give me every show that has action and adventure genres" and then be on my way.
I hope this makes sense. Thank you in advance for your time / answers, I truly appreciate it.
One to many 101:
Main Table:
id (primary key, auto_increment),
name,
datecreated (datetimestamp),
dateupdated (datetimestamp)
Many-to-one A
id (primary key, auto_increment),
main_id (foreign key),
name,
datecreated (datetimestamp),
dateupdated (datetimestamp)
Many-to-one B
id (primary key, auto_increment),
main_id (foreign key),
name,
datecreated (datetimestamp),
dateupdated (datetimestamp)
Now you may join up as many, Many-to-one tables as you need thusly:
select
*
from
main left join
table_a on main.id = table_a.main_id left join
table_b on main.id = table_b.main_id
where
main.id = X
You will receive back many rows: each row repeats the columns of the main object alongside one combination of its many-to-one rows. The result set is effectively denormalized.
Or you may prefer sub-loops: run one query to get your main objects, and then, for each many-to-one table, run a query that uses main.id to find the rows whose main_id matches your object.
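For the genre question specifically (shows in action OR adventure versus action AND adventure), one possible approach with the three-table layout the asker describes is sketched below, using Python's built-in sqlite3 module. The table names shows, genres and show_genres, the sample data, and the helper names are all assumptions. The AND case relies on grouping per show and counting the distinct matched genres.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE shows  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE genres (id INTEGER PRIMARY KEY, name  TEXT NOT NULL UNIQUE);
CREATE TABLE show_genres (
    show_id  INTEGER NOT NULL REFERENCES shows(id),
    genre_id INTEGER NOT NULL REFERENCES genres(id),
    PRIMARY KEY (show_id, genre_id)
);
INSERT INTO shows  VALUES (1, 'Show A'), (2, 'Show B'), (3, 'Show C');
INSERT INTO genres VALUES (1, 'action'), (2, 'adventure'), (3, 'drama');
INSERT INTO show_genres VALUES (1, 1), (1, 2), (1, 3), (2, 1), (3, 3);
""")


def shows_with_any(conn, genre_names):
    """Shows tagged with at least one of the given genres (OR)."""
    names = sorted(set(genre_names))
    marks = ",".join("?" * len(names))
    rows = conn.execute(
        f"""SELECT DISTINCT s.title
            FROM shows s
            JOIN show_genres sg ON sg.show_id = s.id
            JOIN genres g ON g.id = sg.genre_id
            WHERE g.name IN ({marks})
            ORDER BY s.title""",
        names,
    ).fetchall()
    return [title for (title,) in rows]


def shows_with_all(conn, genre_names):
    """Shows tagged with every one of the given genres (AND)."""
    names = sorted(set(genre_names))
    marks = ",".join("?" * len(names))
    rows = conn.execute(
        f"""SELECT s.title
            FROM shows s
            JOIN show_genres sg ON sg.show_id = s.id
            JOIN genres g ON g.id = sg.genre_id
            WHERE g.name IN ({marks})
            GROUP BY s.id, s.title
            HAVING COUNT(DISTINCT g.id) = ?  -- every requested genre matched
            ORDER BY s.title""",
        names + [len(names)],
    ).fetchall()
    return [title for (title,) in rows]


either = shows_with_any(conn, ["action", "adventure"])
both = shows_with_all(conn, ["action", "adventure"])
```

Only the placeholder list is built with string formatting; the genre names themselves are still passed as bound parameters.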
I'm working on an application that previously had unique handles for users only--but now we want to have handles for events, groups, places... etc. Unique string identifiers for many different first class objects. I understand the thing to do is adopt something like the Party Model, where every entity has its own unique partyId and handle. That said, that means on pretty much every data-fetching query, we're adding a join to get that handle! Certainly for every user.
So just what is the performance loss here? For a table with just three or four columns, is a join like this negligible? Or is there a better way of going about this?
Example Table Structure:
Party
int id
int party_type_id
varchar(256) handle
Events
int id
int party_id
varchar(256) name
varchar(256) time
int place_id
Users
int id
int party_id
varchar(256) first_name
varchar(256) last_name
Places
int id
int party_id
varchar(256) name
-- EDIT --
I'm getting a bad rating on this question, and I'm not sure I understand why. In PLAIN TERMS, I'm asking,
If I have three first class objects that must all share a UNIQUE HANDLE property, unique across all three objects, does adding an additional table that must be joined with on almost any request incur a significant performance hit? Is there a better way of accomplishing this in a relational database like MySQL?
-- EDIT: Proposed Queries --
Getting one user
SELECT * FROM Users u LEFT JOIN Party p ON u.party_id = p.id WHERE p.handle='foo'
Searching users
SELECT * FROM Users u LEFT JOIN Party p ON u.party_id = p.id WHERE p.handle LIKE '%foo%'
Searching all parties... I guess I'm not sure how to do this in one query. Would you have to select all Parties matching the handle and then get the individual objects in separate queries? E.g.
db.makeQuery("SELECT * FROM Party p WHERE p.handle LIKE '%foo%'")
.then(function (results) {
// iterate through results and assemble lists of matching parties by type, then get those objects in separate queries
});
This last example is what I'm most concerned about I think. Is this a reasonable design?
The queries you show should be blazingly fast on any modern implementation, and should scale to tens or hundreds of millions of records without too much trouble.
Relational Database Management Systems (of which MySQL is one) are designed explicitly for this scenario.
In fact, the slow part of your second query:
SELECT * FROM Users u LEFT JOIN Party p ON u.party_id = p.id WHERE p.handle LIKE '%foo%'
is going to be WHERE p.handle LIKE '%foo%', because the leading wildcard prevents the use of an index. Once you have a large table, this part of the query will be many times slower than the join.
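On the "searching all parties" case from the question, one option is to LEFT JOIN each type table onto Party so a single query returns the matching handles together with their type-specific rows. This is only a sketch under assumptions: it uses Python's built-in sqlite3 module rather than MySQL, and the sample data, the kind and label columns, and the helper name search_parties are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Party  (id INTEGER PRIMARY KEY, party_type_id INTEGER NOT NULL,
                     handle TEXT NOT NULL UNIQUE);
CREATE TABLE Users  (id INTEGER PRIMARY KEY, party_id INTEGER NOT NULL REFERENCES Party(id),
                     first_name TEXT, last_name TEXT);
CREATE TABLE Events (id INTEGER PRIMARY KEY, party_id INTEGER NOT NULL REFERENCES Party(id),
                     name TEXT, time TEXT, place_id INTEGER);
CREATE TABLE Places (id INTEGER PRIMARY KEY, party_id INTEGER NOT NULL REFERENCES Party(id),
                     name TEXT);
INSERT INTO Party  VALUES (1, 1, 'foobar'), (2, 2, 'foofest'), (3, 3, 'barcafe');
INSERT INTO Users  VALUES (1, 1, 'Fay', 'Ng');
INSERT INTO Events VALUES (1, 2, 'Foo Fest', '2020-01-01', NULL);
INSERT INTO Places VALUES (1, 3, 'Bar Cafe');
""")


def search_parties(conn, term):
    """Return (handle, kind, label) for every party whose handle contains term."""
    return conn.execute(
        """SELECT p.handle,
                  CASE WHEN u.id  IS NOT NULL THEN 'user'
                       WHEN e.id  IS NOT NULL THEN 'event'
                       WHEN pl.id IS NOT NULL THEN 'place'
                  END AS kind,
                  COALESCE(u.first_name, e.name, pl.name) AS label
           FROM Party p
           LEFT JOIN Users  u  ON u.party_id  = p.id
           LEFT JOIN Events e  ON e.party_id  = p.id
           LEFT JOIN Places pl ON pl.party_id = p.id
           WHERE p.handle LIKE ?  -- leading wildcard: still a full scan of Party
           ORDER BY p.handle""",
        ("%" + term + "%",),
    ).fetchall()


matches = search_parties(conn, "foo")
```

The joins here are cheap lookups on party_id; the substring match on handle remains the expensive part, as noted above.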
I have three tables, namely profile, academic, and payment, and these tables share two columns: username and status.
My problem is how to select the usernames where status=1 in all the tables.
Typically it works like this:
SELECT profile.username FROM profile
INNER JOIN academic ON profile.username=academic.username
INNER JOIN payment ON profile.username=payment.username
WHERE profile.status=1 AND academic.status=1 AND payment.status=1
As a note, having username as a key is usually a bad thing, often super bad, since if someone's able to change their name you need to update N other tables. You may have a circumstance where you forget to update one or more tables, then subsequently someone registers with the former name and "inherits" this data.
It's also typically very inefficient to use a string as an index key when a user_id integer value would suffice.
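A small runnable sketch of the three-table join, using Python's built-in sqlite3 module with made-up sample rows (the data and the variable name active_everywhere are assumptions). It keeps username as the join column only because that is what the question has; a user with a row missing from any table, or with status 0 anywhere, drops out of the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE profile  (username TEXT PRIMARY KEY, status INTEGER NOT NULL);
CREATE TABLE academic (username TEXT PRIMARY KEY, status INTEGER NOT NULL);
CREATE TABLE payment  (username TEXT PRIMARY KEY, status INTEGER NOT NULL);
INSERT INTO profile  VALUES ('alice', 1), ('bob', 1), ('carol', 1);
INSERT INTO academic VALUES ('alice', 1), ('bob', 0), ('carol', 1);
INSERT INTO payment  VALUES ('alice', 1), ('bob', 1);  -- carol has no payment row
""")

rows = conn.execute(
    """SELECT p.username
       FROM profile p
       JOIN academic a ON a.username = p.username
       JOIN payment  y ON y.username = p.username
       WHERE p.status = 1 AND a.status = 1 AND y.status = 1
       ORDER BY p.username"""
).fetchall()
active_everywhere = [username for (username,) in rows]
```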
I want to create a table where my users can establish friendships with one another. At the same time, this table will work in conjunction with what I would like to be a one-to-many relation between various other tables I am working up.
Right now I am thinking of something like this
member_id, friend_id, active, date
member_id would be the column of the user making the call, friend_id would be the column of the friend they are attempting to tie to, active would be a toggle of sorts 0 = pending, 1 = active, date would just be a logged date of the last activity on that particular row.
Now my confusion is this: when I query, I would typically query for member_id and then base the rest of the query off the associated friend_ids to display data to the right people. With this logic in mind, I think I would have to have 2 rows per request: one with the requesting member's id as member_id and the friend's id as friend_id, and one that's the opposite, so I could query accordingly every time. So in essence it's double dipping: for every action requested on this table, I need to make 2 like actions to make it work.
Which, all in all, does not make sense to me as far as optimization goes. So my question is: what is the proper way to handle data for relations like this? Or am I actually thinking sanely about this approach?
If a friendship is always mutual, then you can choose between data redundancy (i.e. both directions having a row) for the sake of simpler queries, or learn to live with slightly more complex queries. I'd personally avoid data redundancy unless there is a compelling reason otherwise - you're not just wasting space and performance, but you'll need to be careful when enforcing it - a simple CHECK is incapable of referencing other rows and depending on your DBMS a trigger may be limited in what it can do with a mutating table.
An easy way to ensure only one row per friendship is to always insert the lower value in member_id and the higher value in friend_id (add a constraint CHECK (member_id < friend_id) to enforce it). Then, when you query, you'll have to search in both directions. For example, finding all friends of a given person (identified by person_id) would look something like this:
SELECT *
FROM
person
WHERE
id <> :person_id
AND (
id IN (
SELECT friend_id
FROM friendship
WHERE member_id = :person_id
)
OR
id IN (
SELECT member_id
FROM friendship
WHERE friend_id = :person_id
)
)
BTW, in this scheme, you'd probably want to rename member_id and friend_id to, say, friend1_id and friend2_id...
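The one-row-per-friendship scheme can be sketched as follows with Python's built-in sqlite3 module. The helper names add_friendship and friends_of and the sample people are assumptions; the lookup uses a UNION of the two directions as an alternative to the two IN subqueries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE friendship (
    member_id INTEGER NOT NULL REFERENCES person(id),
    friend_id INTEGER NOT NULL REFERENCES person(id),
    active    INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (member_id, friend_id),
    CHECK (member_id < friend_id)  -- lower id always goes in member_id
);
INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cho');
""")


def add_friendship(conn, a, b):
    """Store the pair once, in canonical (lower, higher) order."""
    lo, hi = min(a, b), max(a, b)
    conn.execute("INSERT INTO friendship (member_id, friend_id) VALUES (?, ?)",
                 (lo, hi))


def friends_of(conn, person_id):
    """Look in both columns, since the person may be on either side."""
    rows = conn.execute(
        """SELECT friend_id FROM friendship WHERE member_id = :pid
           UNION
           SELECT member_id FROM friendship WHERE friend_id = :pid""",
        {"pid": person_id},
    ).fetchall()
    return sorted(row[0] for row in rows)


add_friendship(conn, 3, 1)  # stored as (1, 3)
add_friendship(conn, 1, 2)  # stored as (1, 2)
friends_of_ann = friends_of(conn, 1)
friends_of_cho = friends_of(conn, 3)
```

An insert that bypasses the helper and puts the higher id first is rejected by the CHECK constraint, which is what keeps the table free of mirrored duplicates.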
Two ways to look at it:
WHERE ((friend_id = x AND member_id = y) OR (friend_id = y AND member_id = x))
would allow you to query by simply stating one side of the relationship. If both sides are added, this method would still work without causing duplicate rows to be returned.
Conversely, adding both sides of the relationship, so that your queries consist of
WHERE friend_id = x AND member_id = y
not only makes queries easier to write, but also easier to plan (meaning better DB performance).
My vote is for the latter option.
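If you do store both directions, the two rows should be written together so the table never holds half a friendship. A minimal sketch with Python's built-in sqlite3 module follows; the helper names add_mutual and friends_of and the CHECK against self-friendship are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE friendship (
    member_id INTEGER NOT NULL,
    friend_id INTEGER NOT NULL,
    PRIMARY KEY (member_id, friend_id),
    CHECK (member_id <> friend_id)  -- assumption: nobody befriends themselves
);
""")


def add_mutual(conn, a, b):
    """Insert both directions in one transaction: both rows or neither."""
    with conn:  # commits on success, rolls back if either insert fails
        conn.executemany(
            "INSERT INTO friendship (member_id, friend_id) VALUES (?, ?)",
            [(a, b), (b, a)],
        )


def friends_of(conn, member_id):
    """One-sided lookup, which is the payoff of storing both rows."""
    rows = conn.execute(
        "SELECT friend_id FROM friendship WHERE member_id = ? ORDER BY friend_id",
        (member_id,),
    ).fetchall()
    return [row[0] for row in rows]


add_mutual(conn, 1, 2)
friends_of_1 = friends_of(conn, 1)
friends_of_2 = friends_of(conn, 2)
```

The trade-off is twice the rows and the need to delete both directions together when a friendship ends.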
Beautiful - there's no problem with your table as-is.
ALSO:
I'm not sure if this cardinality is "one to many", or "many to many":
http://en.wikipedia.org/wiki/Cardinality_%28data_modeling%29
Q: I were to query I would typically query for member_id then base the
rest of the query off of associated friend_id's to display data
accordingly to the right people
A: Frankly, I don't see any problem querying "member to friend", or "friend to member" (or any other combinations - e.g. friends who share friends). Again, it looks good.
Introduce a helper table like:
users
user_id, name, ...
friendship
user_id, friend_id, ....
select u.name as user, u2.name as friend from users u
inner join friendship f on f.user_id = u.user_id
inner join users u2 on u2.user_id = f.friend_id
I think this is pretty similar to what you have, just putting a query as an example.