I have a list of communities. Each community has a list of members. Currently I store each community in one row, with the member names separated by commas. This works for small, immutable communities. But as the communities grow large (say, 75,000 members), loading them is becoming slow. Partial loading of a community (say, 10 random members) is also not very elegant. What would be the best table structure for the communities table in this scenario? Using multiple tables is fine if there is a reason for it.
Use three tables
`community`
| id | name | other_column_1 | other_column_2 ...
`user`
| id | name | other_column_1 | other_column_2 ...
`community_user`
| id (autoincrement) | community_id | user_id |
Then to get user info for all users in a community you do something like this
SELECT cu.id AS entry_id, u.id, u.name FROM `community_user` AS cu
LEFT JOIN `user` AS u
ON cu.user_id = u.id
WHERE cu.community_id = <community id>
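Here is a minimal sketch of the three-table layout, including the "random 10 members" case from the question. It uses Python's built-in sqlite3 module as a stand-in for MySQL so it can be run as-is. The sample data, the UNIQUE constraint on (community_id, user_id), and the sample_members helper are my own assumptions, not part of the original answer. Note that SQLite spells the random function RANDOM() where MySQL uses RAND().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE community (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE community_user (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        community_id INTEGER NOT NULL REFERENCES community(id),
        user_id INTEGER NOT NULL REFERENCES user(id),
        UNIQUE (community_id, user_id)  -- assumption: one membership row per pair
    );
""")

# Hypothetical sample data: one community with 50 members.
conn.execute("INSERT INTO community (id, name) VALUES (1, 'chess')")
conn.executemany("INSERT INTO user (id, name) VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(1, 51)])
conn.executemany("INSERT INTO community_user (community_id, user_id) VALUES (1, ?)",
                 [(i,) for i in range(1, 51)])


def sample_members(conn, community_id, n):
    """Return up to n randomly chosen members of one community."""
    return conn.execute("""
        SELECT u.id, u.name
        FROM community_user AS cu
        JOIN user AS u ON cu.user_id = u.id
        WHERE cu.community_id = ?
        ORDER BY RANDOM()
        LIMIT ?
    """, (community_id, n)).fetchall()


members = sample_members(conn, 1, 10)
```

Ordering by a random value still visits every membership row of that community, so for very large communities you may want to profile it before relying on it.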
Would someone mind advising me regarding this table setup, please?
It's my first time designing a database. This will be a part of it.
It's a report-writing application. Multiple engineers can be assigned to attend any job/report, and multiple engineers can author the report as well as attend.
Is this the best way to do this? I would need to be able to search attendees and authors separately in the application.
Thanks very much for the assistance.
You have, I believe, two tables containing entities. The entities are employee and report.
These entities have two different many-to-many relationships: author and attendee.
So your tables are these
employee              report
--------              ------
employee_id (PK)      report_id (PK)
surname               title
givenname             releasedate
whatever              whatever
Then you have two many:many relationship tables with the same columns as each other. One is author and the other is attendee.
author / attendee
------
employee_id PK, FK to employee.employee_id
report_id PK, FK to report.report_id
Notice the compound (two-column) primary keys.
+----------+\   /+----------+\   /+----------+
|          +-----+  author  +-----+          |
|          |/   \+----------+/   \|          |
| employee |                      |  report  |
|          |\   /+----------+\   /|          |
|          +-----+ attendee +-----+          |
+----------+/   \+----------+/   \+----------+

\   /
----- means a many-to-many relationship
/   \
When you determine an employee is an attendee for a certain report, you insert a row into the attendee table with the correct employee and report.
If you want, for example, all authors for each report you can do this sort of thing:
SELECT r.title, r.releasedate,
GROUP_CONCAT(e.surname ORDER BY e.surname SEPARATOR ',') AS surnames
FROM report r
LEFT JOIN author a ON r.report_id = a.report_id
LEFT JOIN employee e ON a.employee_id = e.employee_id
GROUP BY r.title, r.releasedate
ORDER BY r.releasedate DESC
The LEFT JOIN operations allow your query to find reports that have no authors. Ordinary inner JOIN operations would suppress those rows from your result set.
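To make the LEFT JOIN point concrete, here is a small sketch that runs the same query twice, once with LEFT JOIN and once with an inner JOIN. It uses Python's sqlite3 module in place of MySQL, and the sample employees and reports are invented for illustration. It counts authors instead of concatenating names, and groups by report_id rather than by title and date, to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, surname TEXT);
    CREATE TABLE report (report_id INTEGER PRIMARY KEY, title TEXT, releasedate TEXT);
    CREATE TABLE author (
        employee_id INTEGER REFERENCES employee(employee_id),
        report_id INTEGER REFERENCES report(report_id),
        PRIMARY KEY (employee_id, report_id)  -- compound key, as described above
    );
    -- Hypothetical sample data: one report with two authors, one with none.
    INSERT INTO employee VALUES (1, 'Adams'), (2, 'Baker');
    INSERT INTO report VALUES (10, 'Pump inspection', '2020-01-05'),
                              (11, 'Unassigned draft', '2020-02-01');
    INSERT INTO author VALUES (1, 10), (2, 10);
""")

QUERY = """
    SELECT r.title, COUNT(e.employee_id) AS author_count
    FROM report r
    {join} author a ON r.report_id = a.report_id
    {join} employee e ON a.employee_id = e.employee_id
    GROUP BY r.report_id
    ORDER BY r.releasedate DESC
"""

# LEFT JOIN keeps the authorless report; an inner JOIN drops it.
left_rows = conn.execute(QUERY.format(join="LEFT JOIN")).fetchall()
inner_rows = conn.execute(QUERY.format(join="JOIN")).fetchall()
```

With this data, left_rows contains both reports (the draft with a count of 0), while inner_rows contains only the report that has authors.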
There is a limitation with this strict E:R design. For many kinds of reports, (scientific papers for example) the order of authors is critically important. (You want to start an academic food fight? List the authors of a paper in the wrong order.)
So your author table might also contain an ordinal value.
author
------
employee_id PK, FK to employee.employee_id
report_id PK, FK to report.report_id
ordinal INT
and your report query would contain this line.
GROUP_CONCAT(e.surname ORDER BY a.ordinal SEPARATOR ',') AS surnames
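As a rough illustration of the ordinal column, the sketch below stores an author order and builds each byline in that order. It uses Python's sqlite3 module rather than MySQL, and because ordered aggregates are not available in every SQLite version, it sorts in SQL and joins the names in Python instead of using GROUP_CONCAT. The surnames and the report are made up.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, surname TEXT);
    CREATE TABLE report (report_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE author (
        employee_id INTEGER REFERENCES employee(employee_id),
        report_id INTEGER REFERENCES report(report_id),
        ordinal INTEGER NOT NULL,  -- position of this author in the byline
        PRIMARY KEY (employee_id, report_id)
    );
    -- Hypothetical data: the byline order differs from alphabetical order.
    INSERT INTO employee VALUES (1, 'Zimmer'), (2, 'Adams'), (3, 'Moore');
    INSERT INTO report VALUES (10, 'Bridge survey');
    INSERT INTO author VALUES (1, 10, 1), (3, 10, 2), (2, 10, 3);
""")

rows = conn.execute("""
    SELECT r.report_id, e.surname
    FROM report r
    JOIN author a ON r.report_id = a.report_id
    JOIN employee e ON a.employee_id = e.employee_id
    ORDER BY r.report_id, a.ordinal
""").fetchall()

# Rows arrive sorted by report and ordinal, so groupby can build each byline.
bylines = {
    report_id: ", ".join(surname for _, surname in group)
    for report_id, group in groupby(rows, key=lambda row: row[0])
}
```

Here bylines[10] comes out as "Zimmer, Moore, Adams", which follows the ordinal rather than the alphabet.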
I'm creating a website like SO. Now I want to know what happens when I write a comment under Jack's answer/question. SO sends a notification to Jack, right? So how does SO find Jack?
In other words, should I store the author's user id in the Votes/Comments tables? Here is my current Votes table structure:
// Votes
+----+---------+------------+---------+-------+------------+
| id | post_id | table_code | user_id | value | timestamp |
+----+---------+------------+---------+-------+------------+
// user_id stores the id of the user who cast the vote
// table_code is needed because there are multiple Posts tables (see the Edit)
Now I want to send a notification to the post owner, but I don't know how to find him. Should I add a new column named owner to the Votes table and store the author id there?
Edit: I have to mention that I have four Posts tables (I know this structure is crazy, but in reality the structures of those Posts tables are really different and I can't create just one table instead). Something like this:
// Posts1 (table_code: 1)
+----+-------+-----------+
| id | title | content |
+----+-------+-----------+
// Posts2 (table_code: 2)
+----+-------+-----------+-----------+
| id | title | content | author_id |
+----+-------+-----------+-----------+
// Posts3 (table_code: 3)
+----+-------+-----------+-----------+
| id | title | content | author_id |
+----+-------+-----------+-----------+
// Posts4 (table_code: 4)
+----+-------+-----------+
| id | title | content |
+----+-------+-----------+
By the way, only some of those Posts tables have an author_id column (because I have two Posts tables which are not made by users). So, as you see, I can't create a foreign key on those Posts tables.
What I need: I want a TRIGGER AFTER INSERT on the Votes table which sends a notification to the author if there is an author_id column (or a query which returns author_id if there is one). Or any other good solution for my problem ...
Votes.post_id should be a foreign key into the Posts table. From there you can get Posts.author_id, and send the notification to that user.
With your multiple Posts# tables, you can't use a real foreign key. But you can write a UNION query that joins with the appropriate table depending on the table_code value.
SELECT p.author_id
FROM Votes AS v
JOIN Posts2 AS p ON p.id = v.post_id
WHERE v.table_code = 2
UNION
SELECT p.author_id
FROM Votes AS v
JOIN Posts3 AS p ON p.id = v.post_id
WHERE v.table_code = 3
Try to avoid storing data that you can get by following foreign keys, so that the information is stored in only one place. If you run into performance problems because of excessive joining, you may need to violate this normalization principle, but only as a last resort.
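For the notification case you usually want the author for one specific vote, so here is a sketch of the UNION idea restricted to a single vote id. It runs on Python's sqlite3 module instead of MySQL. The author_of_voted_post helper, the trimmed-down Votes columns, and the sample rows are assumptions for illustration. Posts tables without an author_id are simply left out of the UNION, so votes on them return nothing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Posts1 (id INTEGER PRIMARY KEY, title TEXT, content TEXT);
    CREATE TABLE Posts2 (id INTEGER PRIMARY KEY, title TEXT, content TEXT, author_id INTEGER);
    CREATE TABLE Posts3 (id INTEGER PRIMARY KEY, title TEXT, content TEXT, author_id INTEGER);
    CREATE TABLE Votes (id INTEGER PRIMARY KEY, post_id INTEGER, table_code INTEGER,
                        user_id INTEGER, value INTEGER);
    -- Hypothetical sample data.
    INSERT INTO Posts1 VALUES (1, 'system post', 'body');
    INSERT INTO Posts2 VALUES (1, 'question', 'body', 42);
    INSERT INTO Posts3 VALUES (1, 'answer', 'body', 77);
    INSERT INTO Votes VALUES (100, 1, 2, 5, 1), (101, 1, 3, 5, -1), (102, 1, 1, 5, 1);
""")


def author_of_voted_post(conn, vote_id):
    """Return the author_id of the post a vote refers to, or None if it has no author."""
    row = conn.execute("""
        SELECT p.author_id
        FROM Votes AS v
        JOIN Posts2 AS p ON p.id = v.post_id
        WHERE v.table_code = 2 AND v.id = :vote_id
        UNION ALL
        SELECT p.author_id
        FROM Votes AS v
        JOIN Posts3 AS p ON p.id = v.post_id
        WHERE v.table_code = 3 AND v.id = :vote_id
    """, {"vote_id": vote_id}).fetchone()
    return row[0] if row else None


author_for_vote_100 = author_of_voted_post(conn, 100)  # vote on a Posts2 row
author_for_vote_101 = author_of_voted_post(conn, 101)  # vote on a Posts3 row
author_for_vote_102 = author_of_voted_post(conn, 102)  # Posts1 has no author_id
```

The application (or a trigger, if you prefer) can call something like this right after inserting the vote and skip the notification when the result is empty.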
Sorry if my question seems unclear, I'll try to explain.
I have a column whose value looks like /1/3/5/8/42/239/. I would like to find the row with the most similar value, meaning the one that shares as many "ids" as possible.
Example:
| My Column |
#1 | /1/3/7/2/4/ |
#2 | /1/5/7/2/4/ |
#3 | /1/3/6/8/4/ |
Now, by running the query on #1 I would like to get row #2, as it's the most similar. Is there any way to do it, or is it just my fantasy? Thanks for your time.
EDIT:
As suggested, I'm expanding my question. This column represents the favourite artists of a user on a music site. I search them like this: MyColumn LIKE '%/ID/%', and I remove one by replacing /ID/ with /
Since you did not provide much info about your data, I have to fill the gaps with my guesses.
So you have a users table
users table
-----------
id
name
other_stuff
And you'd like to store which artists are favorites of a user. So you must have an artists table
artists table
-------------
id
name
other_stuff
And to relate the two you can add another table called favorites
favorites table
---------------
user_id
artist_id
In that table you add a record for every artist that a user likes.
Example data
users
id | name
1 | tom
2 | john
artists
id | name
1 | michael jackson
2 | madonna
3 | deep purple
favorites
user_id | artist_id
1 | 1
1 | 3
2 | 2
To select the favorites of user tom for instance you can do
select a.name
from artists a
join favorites f on f.artist_id = a.id
join users u on f.user_id = u.id
where u.name = 'tom'
And if you add proper indexing to your table then this is really fast!
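With a favorites table like this, the "most similar" question from the original post can be sketched as a self-join that counts shared artists. The example below uses Python's sqlite3 module rather than MySQL. The three users mirror rows #1 to #3 from the question, and the composite primary key is my assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE favorites (
        user_id INTEGER NOT NULL,
        artist_id INTEGER NOT NULL,
        PRIMARY KEY (user_id, artist_id)  -- assumption: also serves as the index
    )
""")

# The three example rows from the question, one record per (user, artist).
example = {1: [1, 3, 7, 2, 4], 2: [1, 5, 7, 2, 4], 3: [1, 3, 6, 8, 4]}
conn.executemany(
    "INSERT INTO favorites (user_id, artist_id) VALUES (?, ?)",
    [(user, artist) for user, artists in example.items() for artist in artists],
)

# For user 1, count how many favorite artists every other user shares.
similar = conn.execute("""
    SELECT other.user_id, COUNT(*) AS shared
    FROM favorites AS mine
    JOIN favorites AS other
      ON other.artist_id = mine.artist_id
     AND other.user_id <> mine.user_id
    WHERE mine.user_id = ?
    GROUP BY other.user_id
    ORDER BY shared DESC, other.user_id
""", (1,)).fetchall()
```

For this data, user 2 comes first with four shared artists and user 3 second with three, which matches the expectation in the question that #2 is the closest match to #1.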
Problem is you're storing this in a really, really awkward way.
I'm guessing you have to deal with an arbitrary number of values. You have two options:
1. Store the multiple IDs in a blob in JSON format. MySQL versions before 5.7.8 don't have JSON functions built in, but there are user-defined functions that will extract values for you, etc.
See: http://blog.ulf-wendel.de/2013/mysql-5-7-sql-functions-for-json-udf/
Alternatively, switch to PostgreSQL.
2. Add as many columns to your table as the maximum number of IDs you expect to have. So if /1/3/7/2/4/8/ is the longest entry, have 6 columns in your table. The reason this is bad: you'll have sparse columns that'll unnecessarily slow your tables.
I'm sure you could write some horrific regex to accomplish the task, but I caution on using complex regex's on enormous tables.
I have a fairly complicated operation that I'm trying to perform with just one SQL query but I'm not sure if this would be more or less optimal than breaking it up into n queries. Basically, I have a table called "Users" full of user ids and their associated fb_ids (id is the pk and fb_id can be null).
+-----------------+
| id | .. | fb_id |
|====|====|=======|
| 0 | .. | 12345 |
| 1 | .. | 31415 |
| .. | .. | .. |
+-----------------+
I also have another table called "Friends" that represents a friend relationship between two users. This uses their ids (not their fb_ids) and should be a two-way relationship.
+----------------+
| id | friend_id |
|====|===========|
| 0 | 1 |
| 1 | 0 |
| .. | .. |
+----------------+
// user 0 and user 1 are friends
So here's the problem:
We are given a particular user's id ("my_id") and an array of that user's Facebook friends (an array of fb_ids called fb_array). We want to update the Friends table so that it honors a Facebook friendship as a valid friendship among our users. It's important to note that not all of their Facebook friends will have an account in our database, so those friends should be ignored. This query will be called every time the user logs in so it can update our data if they've added any new friends on Facebook. Here's the query I wrote:
INSERT INTO Friends (id, friend_id)
SELECT "my_id", id FROM Users WHERE id IN
(SELECT id FROM Users WHERE fb_id IN fb_array)
AND id NOT IN
(SELECT friend_id FROM Friends WHERE id = "my_id")
The point of the first IN clause is to get the subset of all Users who are also your Facebook friends, and this is the main part I'm worried about. Because the fb_ids are given as an array, I have to parse all of the ids into one giant string separated by commas which makes up "fb_array." I'm worried about the efficiency of having such a huge string for that IN clause (a user may have hundreds or thousands of friends on Facebook). Can you think of any better way to write a query like this?
It's also worth noting that this query doesn't maintain the dual nature of a friend relationship, but that's not what I'm worried about (extending it for this would be trivial).
If I am not mistaken, your query can be simplified, if you have a UNIQUE constraint on the combination (id, friend_id), to:
INSERT IGNORE INTO Friends
(id, friend_id)
SELECT "my_id", id
FROM Users
WHERE fb_id IN fb_array ;
You should have an index on Users (fb_id, id) and test for efficiency. If the number of items in the array is too big (more than a few thousand), you may have to split the array and run the query more than once. Profile with your data and settings.
It depends on whether the following columns are nullable (value can be NULL):
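A rough sketch of this approach, including the "split the array" advice, is below. It uses Python's sqlite3 module, where the equivalent of MySQL's INSERT IGNORE is spelled INSERT OR IGNORE. The sync_fb_friends helper, the chunk size of 500, and the sample rows are assumptions for illustration. The fb_ids are passed as bound parameters rather than pasted into one big string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (id INTEGER PRIMARY KEY, fb_id INTEGER);
    CREATE INDEX idx_users_fb ON Users (fb_id, id);
    CREATE TABLE Friends (
        id INTEGER NOT NULL,
        friend_id INTEGER NOT NULL,
        UNIQUE (id, friend_id)  -- lets the database skip duplicates
    );
    -- Hypothetical sample data: users 0 and 1 are already friends.
    INSERT INTO Users VALUES (0, 12345), (1, 31415), (2, 27182), (3, NULL);
    INSERT INTO Friends VALUES (0, 1), (1, 0);
""")


def sync_fb_friends(conn, my_id, fb_array, chunk_size=500):
    """Insert a Friends row for every known user whose fb_id is in fb_array."""
    for start in range(0, len(fb_array), chunk_size):
        chunk = fb_array[start:start + chunk_size]
        placeholders = ", ".join("?" * len(chunk))
        conn.execute(
            f"""INSERT OR IGNORE INTO Friends (id, friend_id)
                SELECT ?, id FROM Users WHERE fb_id IN ({placeholders})""",
            [my_id, *chunk],
        )
    conn.commit()


# 99999 belongs to nobody in Users, so it is ignored; running twice adds nothing new.
sync_fb_friends(conn, 0, [31415, 27182, 99999])
sync_fb_friends(conn, 0, [31415, 27182, 99999])
friends_of_0 = [row[0] for row in conn.execute(
    "SELECT friend_id FROM Friends WHERE id = 0 ORDER BY friend_id")]
```

After both calls, user 0 has exactly two friends (ids 1 and 2), so the query is safe to run on every login.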
USERS.id
FRIENDS.friend_id
Nullable:
SELECT DISTINCT
"my_id", u.id
FROM Users u
WHERE u.fb_id IN fb_array
AND u.id NOT IN (SELECT f.friend_id
FROM FRIENDS f
WHERE f.id = "my_id")
Not Nullable:
SELECT "my_id", u.id
FROM Users u
LEFT JOIN FRIENDS f ON f.friend_id = u.id
AND f.id = "my_id"
WHERE u.fb_id IN fb_array
AND f.friend_id IS NULL
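Nullability matters for the result as well as for speed: the two forms only return the same rows when friend_id contains no NULLs. The sketch below shows this with Python's sqlite3 module standing in for MySQL. The tables are stripped down and the data is invented, with the user id and fb_ids written in as literals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (id INTEGER PRIMARY KEY, fb_id INTEGER);
    CREATE TABLE Friends (id INTEGER, friend_id INTEGER);  -- friend_id is nullable here
    -- Hypothetical data: user 0 is friends with user 1, plus one row with a NULL friend_id.
    INSERT INTO Users VALUES (0, 12345), (1, 31415), (2, 27182);
    INSERT INTO Friends VALUES (0, 1), (0, NULL);
""")

# NOT IN: the NULL in the subquery makes the test "unknown" for every candidate.
not_in_rows = conn.execute("""
    SELECT u.id
    FROM Users u
    WHERE u.fb_id IN (31415, 27182)
      AND u.id NOT IN (SELECT f.friend_id FROM Friends f WHERE f.id = 0)
""").fetchall()

# LEFT JOIN / IS NULL: the NULL row never matches the join, so user 2 is still found.
left_join_rows = conn.execute("""
    SELECT u.id
    FROM Users u
    LEFT JOIN Friends f ON f.friend_id = u.id
                       AND f.id = 0
    WHERE u.fb_id IN (31415, 27182)
      AND f.friend_id IS NULL
""").fetchall()
```

Here the NOT IN query returns no rows at all, while the LEFT JOIN query returns user 2. Declaring friend_id as NOT NULL removes the difference.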
For more info:
http://explainextended.com/2010/05/27/left-join-is-null-vs-not-in-vs-not-exists-nullable-columns/
http://explainextended.com/2009/09/18/not-in-vs-not-exists-vs-left-join-is-null-mysql/
Speaking to the number of values in your array
The tests run in the two articles mentioned above contain 1 million rows, with 10,000 distinct values.
Let's say I have the following scenario.
A database of LocalLibrary with two tables Books and Readers
| BookID| Title | Author |
-----------------------------
| 1 | "Title1" | "John" |
| 2 | "Title2" | "Adam" |
| 3 | "Title3" | "Adil" |
------------------------------
And the readers table looks like this.
| UserID| Name |
-----------------
| 1 | xy |
| 2 | yz |
| 3 | xz |
----------------
Now, let's say that a user can create a list of books that they have read (a bookshelf that strictly contains books from the above authors only, i.e. authors in our DB). What is the best way to represent that bookshelf in the database?
My initial thought was a comma-separated list of BookIDs in the Readers table. But it clearly sounds awkward for a relational DB, and I would also have to split it every time I display the list of a user's books. Also, when a user adds a new book to the shelf, there is no way of checking whether it already exists there except to split the comma-separated list and compare the IDs. Deleting is also not easy.
So, in one line, the question is how one appropriately models situations like these.
I have not done anything beyond simple SELECTs and INSERTs in MySQL. It would be very helpful if you could describe this in simple terms and provide links for further reading.
Please comment if you need more explanation.
Absolutely forget the idea of adding a comma-separated list of books to the Readers table. It will be unsearchable and very clumsy. You need a third table that joins the Books table and the Readers table. Each record in this table represents a reader reading a book.
Table ReaderList
--------------------
UserID | BookID |
--------------------
You get a list of books read by a particular user with
select l.UserID, r.Name, l.BookID, b.Title, b.Author
from ReaderList l left join Books b on l.BookID = b.BookID
left join Readers r on l.UserID = r.UserID
where l.UserID = 1
As you can see, this pattern requires the keyword JOIN, which brings together data from two or more tables. You can read more about JOIN in this article
If you want, you could enhance this model by adding another field to ReaderList, such as ReadingDate
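To show how the join table handles the concerns in the question (duplicate checks and deletes), here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL. The composite primary key on (UserID, BookID) and the add_to_shelf / remove_from_shelf helpers are my own assumptions, not something the answer prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Books (BookID INTEGER PRIMARY KEY, Title TEXT, Author TEXT);
    CREATE TABLE Readers (UserID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE ReaderList (
        UserID INTEGER NOT NULL REFERENCES Readers(UserID),
        BookID INTEGER NOT NULL REFERENCES Books(BookID),
        PRIMARY KEY (UserID, BookID)  -- a book can be on a given shelf only once
    );
    INSERT INTO Books VALUES (1, 'Title1', 'John'), (2, 'Title2', 'Adam'), (3, 'Title3', 'Adil');
    INSERT INTO Readers VALUES (1, 'xy'), (2, 'yz'), (3, 'xz');
""")


def add_to_shelf(conn, user_id, book_id):
    """Add a book to a shelf; return False if it was already there."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO ReaderList (UserID, BookID) VALUES (?, ?)",
                         (user_id, book_id))
        return True
    except sqlite3.IntegrityError:
        return False


def remove_from_shelf(conn, user_id, book_id):
    """Remove a book from a shelf with a single DELETE."""
    with conn:
        conn.execute("DELETE FROM ReaderList WHERE UserID = ? AND BookID = ?",
                     (user_id, book_id))


first_add = add_to_shelf(conn, 1, 1)
duplicate_add = add_to_shelf(conn, 1, 1)  # rejected by the primary key
add_to_shelf(conn, 1, 3)
remove_from_shelf(conn, 1, 1)

shelf = conn.execute("""
    SELECT b.Title
    FROM ReaderList l
    JOIN Books b ON l.BookID = b.BookID
    WHERE l.UserID = 1
    ORDER BY b.BookID
""").fetchall()
```

The duplicate check and the delete each become one statement, with no string splitting involved.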