How does stackoverflow find users to give them their notifications? - mysql

I'm creating a website like SO. Now I want to know what happens when I write a comment under Jack's answer/question. SO sends a notification to Jack, right? So how does SO find Jack?
In other words, should I store the author's user id in the Votes/Comments tables? Here is my current Votes table structure:
// Votes
+----+---------+------------+---------+-------+------------+
| id | post_id | table_code | user_id | value | timestamp |
+----+---------+------------+---------+-------+------------+
// user_id: this column stores the id of the user who cast the vote
// table_code: needed because there are multiple Posts tables (see the Edit)
Now I want to send a notification to the post owner, but I don't know how to find them. Should I add a new column named owner to the Votes table and store the author id there?
Edit: I have to mention that I have four Posts tables (I know this structure is crazy, but in reality the structures of those Posts tables are really different and I can't create just one table instead). Something like this:
// Posts1 (table_code: 1)
+----+-------+-----------+
| id | title | content |
+----+-------+-----------+
// Posts2 (table_code: 2)
+----+-------+-----------+-----------+
| id | title | content | author_id |
+----+-------+-----------+-----------+
// Posts3 (table_code: 3)
+----+-------+-----------+-----------+
| id | title | content | author_id |
+----+-------+-----------+-----------+
// Posts4 (table_code: 4)
+----+-------+-----------+
| id | title | content |
+----+-------+-----------+
By the way, only some of those Posts tables have an author_id column (because two of the Posts tables are not created by users). So, as you can see, I can't create a foreign key on those Posts tables.
What I need: I want a TRIGGER AFTER INSERT on the Votes table which sends a notification to the author if there is an author_id column (or a query which returns author_id if there is one). Or any other good solution for my problem ...

Votes.post_id should be a foreign key into the Posts table. From there you can get Posts.author_id, and send the notification to that user.
With your multiple Posts# tables, you can't use a real foreign key. But you can write a UNION query that joins with the appropriate table depending on the table_code value.
SELECT p.author_id
FROM Votes AS v
JOIN Posts2 AS p ON p.id = v.post_id
WHERE v.table_code = 2
UNION
SELECT p.author_id
FROM Votes AS v
JOIN Posts3 AS p ON p.id = v.post_id
WHERE v.table_code = 3
Try to avoid storing data that you can get by following foreign keys, so that the information is stored in only one place. If you run into performance problems because of excessive joining, you may need to violate this normalization principle, but only as a last resort.
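To make the UNION approach concrete, here is a minimal sketch of looking up the author for one specific vote. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs as-is; the sample rows, the trimmed-down Votes table (no timestamp column), and the helper name author_of_vote are assumptions for illustration, not part of the original schema. UNION ALL is used because at most one branch can match a given vote.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Posts2 (id INTEGER PRIMARY KEY, title TEXT, content TEXT, author_id INTEGER);
    CREATE TABLE Posts3 (id INTEGER PRIMARY KEY, title TEXT, content TEXT, author_id INTEGER);
    CREATE TABLE Votes (
        id         INTEGER PRIMARY KEY,
        post_id    INTEGER NOT NULL,
        table_code INTEGER NOT NULL,
        user_id    INTEGER NOT NULL,
        value      INTEGER NOT NULL
    );
    -- Hypothetical sample data: the same post id exists in two Posts tables.
    INSERT INTO Posts2 VALUES (10, 't', 'c', 7);
    INSERT INTO Posts3 VALUES (10, 't', 'c', 8);
    INSERT INTO Votes VALUES (1, 10, 2, 99, 1);  -- vote on Posts2 #10
    INSERT INTO Votes VALUES (2, 10, 3, 99, 1);  -- vote on Posts3 #10
    INSERT INTO Votes VALUES (3, 10, 1, 99, 1);  -- vote on Posts1 (no author column)
""")

# One branch per Posts table that has an author_id column.
AUTHOR_OF_VOTE = """
    SELECT p.author_id FROM Votes AS v
    JOIN Posts2 AS p ON p.id = v.post_id
    WHERE v.table_code = 2 AND v.id = :vote_id
    UNION ALL
    SELECT p.author_id FROM Votes AS v
    JOIN Posts3 AS p ON p.id = v.post_id
    WHERE v.table_code = 3 AND v.id = :vote_id
"""

def author_of_vote(conn, vote_id):
    """Return the author id to notify, or None if the post has no author."""
    row = conn.execute(AUTHOR_OF_VOTE, {"vote_id": vote_id}).fetchone()
    return row[0] if row else None
```

Votes on Posts1 or Posts4 simply return nothing, so the caller can skip the notification in that case.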

Related

mySQL column to hold array

I'm a beginner concerning coding and especially SQL and PHP.
I deal with approx. 120 users.
The users can acquire approx. 300 different collectible items.
When a user acquires a specific item, I would like the ID number of that particular item to be stored in the row of the user who acquired it, so that there is some information about what items the user already has (and to avoid duplicate items in his possession).
Is there a good way to store such information?
Is it even possible to set a column type to array and store it there?
Please note: I'm not lazy and I've been digging around and searching for the answer for 2 hours. I couldn't find a solution. I know of the rule that one should insert only one piece of information into one cell.
MySQL has no array column type. However, you can use a second table to emulate an array by storing the relation between the users and items. Say you have the table users:
CREATE TABLE users (
user_id SERIAL PRIMARY KEY,
...
);
And you have a table defining items:
CREATE TABLE items (
item_id SERIAL PRIMARY KEY,
...
);
You can relate what items a user has using a table similar to user_items:
CREATE TABLE user_items (
id SERIAL PRIMARY KEY,
user_id BIGINT UNSIGNED NOT NULL,
item_id BIGINT UNSIGNED NOT NULL,
...,
FOREIGN KEY (user_id)
REFERENCES users (user_id),
FOREIGN KEY (item_id)
REFERENCES items (item_id)
);
Then, to determine what items user 123 has acquired, you could use JOINs similar to:
SELECT items.*
FROM users
INNER JOIN user_items
ON user_items.user_id = users.user_id
INNER JOIN items
ON items.item_id = user_items.item_id
WHERE users.user_id = 123; -- Or some other condition.
I assume you have two tables, for example users and items. To track which user already has a specific item, I would create an associative table containing the UserID from users and the ItemID from items. This way you can check in your user_items table whether the user already has this item.
Here is a small example:
users (UserID is PK):
+--------+----------+
| UserID | UserName |
+--------+----------+
| 1 | Fred |
| 2 | Joe |
+--------+----------+
items (ItemID is PK):
+---------+----------+
| ItemID | ItemName |
+---------+----------+
| 5 | Book |
| 6 | Computer |
+---------+----------+
user_items (ItemID referencing items.ItemID, UserID referencing users.UserID):
+---------+--------+
| ItemID | UserID |
+---------+--------+
| 5 | 1 |
| 6 | 2 |
+---------+--------+
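The question also asks how to avoid duplicate items in a user's possession. One way, sketched below under some assumptions, is to make (UserID, ItemID) the primary key of user_items so the database itself rejects a second copy. The sketch uses Python's sqlite3 module so it can be run directly; INSERT OR IGNORE is SQLite's spelling of what MySQL writes as INSERT IGNORE, and the helper names acquire and items_of are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce REFERENCES
conn.executescript("""
    CREATE TABLE users (UserID INTEGER PRIMARY KEY, UserName TEXT NOT NULL);
    CREATE TABLE items (ItemID INTEGER PRIMARY KEY, ItemName TEXT NOT NULL);
    CREATE TABLE user_items (
        UserID INTEGER NOT NULL REFERENCES users (UserID),
        ItemID INTEGER NOT NULL REFERENCES items (ItemID),
        PRIMARY KEY (UserID, ItemID)   -- one row per user/item pair
    );
    INSERT INTO users VALUES (1, 'Fred'), (2, 'Joe');
    INSERT INTO items VALUES (5, 'Book'), (6, 'Computer');
""")

def acquire(conn, user_id, item_id):
    """Record an acquisition; return False if the user already had the item."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO user_items (UserID, ItemID) VALUES (?, ?)",
        (user_id, item_id),
    )
    return cur.rowcount == 1

def items_of(conn, user_id):
    """Return the names of all items the user owns."""
    rows = conn.execute(
        "SELECT i.ItemName FROM user_items AS ui "
        "JOIN items AS i ON i.ItemID = ui.ItemID "
        "WHERE ui.UserID = ? ORDER BY i.ItemID",
        (user_id,),
    )
    return [r[0] for r in rows]
```

With only about 120 users and 300 items, this table stays tiny, so no further tuning should be needed.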

MySQL query get column value similar to given

Sorry if my question seems unclear, I'll try to explain.
I have a column in a row, for example /1/3/5/8/42/239/. I would like to find a similar row, one that has as many matching "ids" as possible.
Example:
| My Column |
#1 | /1/3/7/2/4/ |
#2 | /1/5/7/2/4/ |
#3 | /1/3/6/8/4/ |
Now, by running the query on #1 I would like to get row #2 as it's the most similar. Is there any way to do it or it's just my fantasy? Thanks for your time.
EDIT:
As suggested, I'm expanding my question. This column represents the favourite artists of a user on a music site. I search them like this: MyColumn LIKE '%/ID/%', and remove an artist by replacing /ID/ with /
Since you did not provide much info about your data, I have to fill the gaps with my guesses.
So you have a users table
users table
-----------
id
name
other_stuff
And you would like to store which artists are favorites of a user. So you must have an artists table
artists table
-------------
id
name
other_stuff
And to relate you can add another table called favorites
favorites table
---------------
user_id
artist_id
In that table you add a record for every artist that a user likes.
Example data
users
id | name
1 | tom
2 | john
artists
id | name
1 | michael jackson
2 | madonna
3 | deep purple
favorites
user_id | artist_id
1 | 1
1 | 3
2 | 2
To select the favorites of user tom for instance you can do
select a.name
from artists a
join favorites f on f.artist_id = a.id
join users u on f.user_id = u.id
where u.name = 'tom'
And if you add proper indexing to your table then this is really fast!
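With the favorites table in place, the original "most similar row" question becomes a self-join that counts shared artists. The following is a rough sketch, not a tuned implementation: it uses Python's sqlite3 module so it can be run as-is, the sample rows mirror rows #1 to #3 from the question, and the helper name most_similar is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE favorites (
        user_id   INTEGER NOT NULL,
        artist_id INTEGER NOT NULL,
        PRIMARY KEY (user_id, artist_id)
    );
    -- Users 1, 2, 3 hold the artist ids of rows #1, #2, #3 in the question.
    INSERT INTO favorites VALUES (1,1),(1,3),(1,7),(1,2),(1,4);
    INSERT INTO favorites VALUES (2,1),(2,5),(2,7),(2,2),(2,4);
    INSERT INTO favorites VALUES (3,1),(3,3),(3,6),(3,8),(3,4);
""")

# Pair each of my favorites with the same artist in other users' rows,
# then count the matches per other user.
MOST_SIMILAR = """
    SELECT other.user_id, COUNT(*) AS shared
    FROM favorites AS mine
    JOIN favorites AS other
      ON other.artist_id = mine.artist_id
     AND other.user_id <> mine.user_id
    WHERE mine.user_id = ?
    GROUP BY other.user_id
    ORDER BY shared DESC, other.user_id
"""

def most_similar(conn, user_id):
    """Return (user_id, shared_artist_count) pairs, most similar first."""
    return conn.execute(MOST_SIMILAR, (user_id,)).fetchall()
```

For user 1 this ranks user 2 first (four shared artists) ahead of user 3 (three shared), which matches the result the question expects for row #1.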
Problem is you're storing this in a really, really awkward way.
I'm guessing you have to deal with an arbitrary number of values. You have two options:
1. Store the multiple IDs in a blob in JSON format. Older MySQL versions have no built-in JSON functions (a native JSON type and functions were added in MySQL 5.7), but there are user-defined functions that will extract values for you.
See: http://blog.ulf-wendel.de/2013/mysql-5-7-sql-functions-for-json-udf/
Alternatively, switch to PostgreSQL.
2. Add as many columns to your table as the maximum number of IDs you expect to have. So if /1/3/7/2/4/8/ is the longest entry, have 6 columns in your table. The reason this is bad: you'll have sparse columns that unnecessarily slow down your tables.
I'm sure you could write some horrific regex to accomplish the task, but I caution against using complex regexes on enormous tables.

Which method is better way to store information in table?

I am going to store user Likes in a database, but I am not sure which of these two methods is better.
In my situation, users can like Posts, Comments and Groups, something like Facebook.
Assume there are 10 million likes across Posts, Comments and Groups.
Method A:
Create a Like table, and add a LikeType field in it:
+--------+----------+--------+
| likeID | LikeType | userID |
+--------+----------+--------+
| 1 | 1 | 1 | // User 1 liked a post
+--------+----------+--------+
| 2 | 2 | 1 | // User 1 liked a comment
+--------+----------+--------+
| 3 | 3 | 1 | // User 1 liked a group
where LikeType is one of 1, 2, 3:
1 = Posts, 2 = Comments, 3 = Groups
Method B:
Create three separated tables for each one of Posts, Comments and Groups.
In Method A, all the likes sit in one table, and an extra condition (WHERE LikeType = 1, 2 or 3) is needed to get the likes of a Post, Comment or Group. Given that, which method is better?
UPDATED POST:
users
uid // PK
---------------------------------------
itemTypes
typeID // PK
typeText // comments, groups, posts
---------------------------------------
--------------------------------------- +
posts |
id // PK |
typeID // 1 |
... |
--------------------------------------- +
comments |
id // PK |
typeID // 2 |
... |
--------------------------------------- + Items
groups |
id // PK |
typeID // 3 |
... |
--------------------------------------- +
photos |
id // PK |
typeID // 4 |
... |
--------------------------------------- +
---------------------------------------
likes
uid // FK to user id
itemid // FK to posts, groups, photos, comments id
itemType // FK to itemTypes.typeID
// select post #50 likes
SELECT count(*) FROM likes WHERE itemid = 50 and itemType = 1
// select comment #50 of user #2
SELECT * FROM likes WHERE itemid = 50 and uid = 2 and itemType = 2
Is this a good schema?
I don't like either of your methods. I would go more normalized. I would have a table for item types, such as comments, groups, posts, etc. Then I would have a table for items. It would have an ItemId as the PK and a FK reference to item types. There would also be a users table. Finally, the likes table would be a many to many relationship between items and users.
As Jan Doggen said, what you're doing with the information is an important consideration. In particular, if you want to be able to ask the question "what things does a given user like", then you will benefit from having all the data in one table -- otherwise, you'd have to have three separate queries to answer that question.
For the case of the question "which people like a given thing", the performance difference between the single-table model and the multiple-table model should be relatively small if your tables are properly indexed (with an index on likeID/likeType, in this case). The multiple-table model will make your application logic more complex, and will be harder to extend in the future when you want to add other things a user might be able to like.
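As a rough illustration of the single-table model, the sketch below keeps all likes in one table and answers both questions ("who likes this thing" and "what does this user like") with one query each. It borrows the uid/itemid/itemType column names from the updated schema in the question, runs on Python's sqlite3 module rather than MySQL, and the index choice and helper names are assumptions, not a benchmarked design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE likes (
        uid      INTEGER NOT NULL,
        itemid   INTEGER NOT NULL,
        itemType INTEGER NOT NULL,   -- 1 = post, 2 = comment, 3 = group
        -- Doubles as the index for "likes of a given item" and blocks double likes.
        PRIMARY KEY (itemType, itemid, uid)
    );
    -- Supports "what does this user like" without scanning the table.
    CREATE INDEX likes_by_user ON likes (uid);
    -- Hypothetical sample rows.
    INSERT INTO likes VALUES (1, 50, 1), (2, 50, 1), (2, 50, 2), (2, 7, 3);
""")

def like_count(conn, item_type, item_id):
    """Number of likes on one post, comment or group."""
    return conn.execute(
        "SELECT COUNT(*) FROM likes WHERE itemType = ? AND itemid = ?",
        (item_type, item_id),
    ).fetchone()[0]

def liked_by(conn, uid):
    """Everything a user likes, across all item types, in a single query."""
    return conn.execute(
        "SELECT itemType, itemid FROM likes WHERE uid = ? "
        "ORDER BY itemType, itemid",
        (uid,),
    ).fetchall()
```

With three separate tables, liked_by would need three queries or a UNION, which is the extra application logic the answer above warns about.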

How can I optimize this SQL query with a large IN clause?

I have a fairly complicated operation that I'm trying to perform with just one SQL query but I'm not sure if this would be more or less optimal than breaking it up into n queries. Basically, I have a table called "Users" full of user ids and their associated fb_ids (id is the pk and fb_id can be null).
+-----------------+
| id | .. | fb_id |
|====|====|=======|
| 0 | .. | 12345 |
| 1 | .. | 31415 |
| .. | .. | .. |
+-----------------+
I also have another table called "Friends" that represents a friend relationship between two users. This uses their ids (not their fb_ids) and should be a two-way relationship.
+----------------+
| id | friend_id |
|====|===========|
| 0 | 1 |
| 1 | 0 |
| .. | .. |
+----------------+
// user 0 and user 1 are friends
So here's the problem:
We are given a particular user's id ("my_id") and an array of that user's Facebook friends (an array of fb_ids called fb_array). We want to update the Friends table so that it honors a Facebook friendship as a valid friendship among our users. It's important to note that not all of their Facebook friends will have an account in our database, so those friends should be ignored. This query will be called every time the user logs in so it can update our data if they've added any new friends on Facebook. Here's the query I wrote:
INSERT INTO Friends (id, friend_id)
SELECT "my_id", id FROM Users WHERE id IN
(SELECT id FROM Users WHERE fb_id IN fb_array)
AND id NOT IN
(SELECT friend_id FROM Friends WHERE id = "my_id")
The point of the first IN clause is to get the subset of all Users who are also your Facebook friends, and this is the main part I'm worried about. Because the fb_ids are given as an array, I have to parse all of the ids into one giant string separated by commas which makes up "fb_array." I'm worried about the efficiency of having such a huge string for that IN clause (a user may have hundreds or thousands of friends on Facebook). Can you think of any better way to write a query like this?
It's also worth noting that this query doesn't maintain the dual nature of a friend relationship, but that's not what I'm worried about (extending it for this would be trivial).
If I am not mistaken, your query can be simplified, if you have a UNIQUE constraint on the combination (id, friend_id), to:
INSERT IGNORE INTO Friends
(id, friend_id)
SELECT "my_id", id
FROM Users
WHERE fb_id IN fb_array ;
You should have an index on Users (fb_id, id) and test for efficiency. If the number of items in the array is too big (more than a few thousand), you may have to split the array and run the query more than once. Profile with your data and settings.
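Splitting the array could look roughly like the sketch below. It is written against Python's sqlite3 module so it runs as-is: INSERT OR IGNORE is SQLite's equivalent of MySQL's INSERT IGNORE, the chunk size of 500 is an arbitrary assumption to tune by profiling, and the names chunks and sync_friends are hypothetical. Placeholders are generated per chunk instead of pasting the ids into the SQL string.

```python
import sqlite3

def chunks(values, size):
    """Yield consecutive slices of at most `size` elements."""
    for start in range(0, len(values), size):
        yield values[start:start + size]

def sync_friends(conn, my_id, fb_ids, chunk_size=500):
    """Add a Friends row for every known user in fb_ids; return rows added."""
    added = 0
    for chunk in chunks(list(fb_ids), chunk_size):
        placeholders = ",".join("?" * len(chunk))  # e.g. "?,?,?"
        cur = conn.execute(
            "INSERT OR IGNORE INTO Friends (id, friend_id) "
            f"SELECT ?, id FROM Users WHERE fb_id IN ({placeholders})",
            [my_id, *chunk],
        )
        added += cur.rowcount  # rows skipped by the UNIQUE constraint count as 0
    return added

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (id INTEGER PRIMARY KEY, fb_id INTEGER);
    CREATE INDEX users_fb ON Users (fb_id, id);
    CREATE TABLE Friends (
        id        INTEGER NOT NULL,
        friend_id INTEGER NOT NULL,
        UNIQUE (id, friend_id)
    );
    -- Hypothetical sample data: user 0 is already friends with user 1.
    INSERT INTO Users VALUES (0, 12345), (1, 31415), (2, 27182);
    INSERT INTO Friends VALUES (0, 1);
""")
```

Calling sync_friends(conn, 0, [31415, 27182, 99999]) would add only the row for user 2: user 1 is already a friend and 99999 has no account, so both are silently skipped.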
It depends on whether the following columns are nullable (value can be NULL):
USERS.id
FRIENDS.friend_id
Nullable:
SELECT DISTINCT
"my_id", u.id
FROM Users u
WHERE u.fb_id IN fb_array
AND u.id NOT IN (SELECT f.friend_id
FROM FRIENDS f
WHERE f.id = "my_id")
Not Nullable:
SELECT "my_id", u.id
FROM Users u
LEFT JOIN FRIENDS f ON f.friend_id = u.id
AND f.id = "my_id"
WHERE u.fb_id IN fb_array
AND f.friend_id IS NULL
For more info:
http://explainextended.com/2010/05/27/left-join-is-null-vs-not-in-vs-not-exists-nullable-columns/
http://explainextended.com/2009/09/18/not-in-vs-not-exists-vs-left-join-is-null-mysql/
Speaking to the number of values in your array
The tests run in the two articles mentioned above contain 1 million rows, with 10,000 distinct values.

Correct way of storing a list of related values in Database

Let's say I have the following scenario.
A database of LocalLibrary with two tables Books and Readers
| BookID| Title | Author |
-----------------------------
| 1 | "Title1" | "John" |
| 2 | "Title2" | "Adam" |
| 3 | "Title3" | "Adil" |
------------------------------
And the readers table looks like this.
| UserID| Name |
-----------------
| 1 | xy |
| 2 | yz |
| 3 | xz |
----------------
Now, let's say that users can create a list of books that they have read (a bookshelf that strictly contains books from the above authors only, i.e. authors in our DB). So, what is the best way to represent that bookshelf in the database?
My initial thought was a comma-separated list of BookIDs in the Readers table. But that clearly sounds awkward for a relational DB, and I'd also have to split it every time I display the list of a user's books. Also, when a user adds a new book to the shelf, there is no way of checking whether it already exists on their shelf except to split the comma-separated list and compare the IDs. Deleting is also not easy.
So, in one line, the question is: how does one appropriately model situations like these?
I have not done anything beyond simple SELECTs and INSERTs in MySQL. It would be very helpful if you could describe it in simple terms and provide links for further reading.
Please comment if you need more explanation.
Absolutely forget the idea of adding a comma-separated list of books to the Readers table. It will be unsearchable and very clumsy. You need a third table that joins the Books table and the Readers table. Each record in this table represents a reader reading a book.
Table ReaderList
--------------------
UserID | BookID |
--------------------
You get a list of books read by a particular user with
select l.UserID, r.Name, l.BookID, b.Title, b.Author
from ReaderList l left join Books b on l.BookID = b.BookID
left join Readers r on l.UserID = r.UserID
where l.UserID = 1
As you can see, this pattern requires the use of the keyword JOIN, which brings together data from two or more tables. You can read more about JOIN in this article
If you want, you could enhance this model by adding another field to ReaderList, such as ReadingDate
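The question's three pain points (checking for an existing book, adding, deleting) each become a one-line statement with the ReaderList table. Below is a minimal sketch, assuming a composite primary key on (UserID, BookID) and an optional ReadingDate column; it runs on Python's sqlite3 module rather than MySQL, and the helper names are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce REFERENCES
conn.executescript("""
    CREATE TABLE Books (BookID INTEGER PRIMARY KEY, Title TEXT, Author TEXT);
    CREATE TABLE Readers (UserID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE ReaderList (
        UserID      INTEGER NOT NULL REFERENCES Readers (UserID),
        BookID      INTEGER NOT NULL REFERENCES Books (BookID),
        ReadingDate TEXT,
        PRIMARY KEY (UserID, BookID)   -- a book appears once per shelf
    );
    INSERT INTO Books VALUES (1, 'Title1', 'John'), (2, 'Title2', 'Adam'), (3, 'Title3', 'Adil');
    INSERT INTO Readers VALUES (1, 'xy'), (2, 'yz'), (3, 'xz');
""")

def add_to_shelf(conn, user_id, book_id, reading_date=None):
    """Add a book; return False if it is a duplicate or not in Books."""
    try:
        conn.execute(
            "INSERT INTO ReaderList (UserID, BookID, ReadingDate) VALUES (?, ?, ?)",
            (user_id, book_id, reading_date),
        )
        return True
    except sqlite3.IntegrityError:
        return False

def remove_from_shelf(conn, user_id, book_id):
    """Delete one book from one reader's shelf."""
    conn.execute(
        "DELETE FROM ReaderList WHERE UserID = ? AND BookID = ?",
        (user_id, book_id),
    )

def shelf(conn, user_id):
    """Return the titles on a reader's shelf."""
    rows = conn.execute(
        "SELECT b.Title FROM ReaderList AS l "
        "JOIN Books AS b ON b.BookID = l.BookID "
        "WHERE l.UserID = ? ORDER BY b.BookID",
        (user_id,),
    )
    return [r[0] for r in rows]
```

The foreign key on BookID is also what enforces the "only books in our DB" rule from the question, so no string splitting or manual comparison is needed.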