Best table structure for users with different roles - mysql

We are working on a website which will feature about 5 different user roles, each with different properties. In the current version of the database schema we have a single users table which holds all the users, and all of their properties.
The problem is that the properties that we need differ per user role. All users have the same basis properties, like a name, e-mail address and password. But on top of that the properties differ per role. Some have social media links, others have invoice addresses, etc. In total there may be up to 60 columns (properties), of which only a portion are used by each user role.
In total we may have about 250,000 users in the table, of which the biggest portion (about 220,000) will be of a single user role (and use about 20 of the 60 columns). The other 30,000 users are divided over four other roles and use a subset of the other 40 columns.
What is the best database structure for this, both from a DB and a development perspective? My idea is to have a base users table, and then extend on that with tables like users_moderators, but this may lead to a lot of JOINed queries. A way to prevent this is by using VIEWs, but I've read some (outdated?) articles claiming that VIEWs may hurt performance, like: http://www.mysqlperformanceblog.com/2007/08/12/mysql-view-as-performance-troublemaker/.
Does the 'perfect' structure even exist? Any suggestions, or isn't this really a problem at all and should we just put all users in a single big table?

There are two different ways to go about this. One is called "Single Table Inheritance". This is basically the design you ask for comments on. It's pretty fast because there are no joins. However, NULLs can affect throughput to a small degree, because fat rows take a little longer to bring into memory than thinner rows.
An alternative design is called "Class Table Inheritance". In this design, there is one table for the super class and one table for each subclass. Non key attributes go into the table where they pertain. Often, a design called "Shared Primary Key" can be used with this design. In shared primary key, the key ids of the subclass tables are copies of the id from the corresponding row in the superclass table.
It's a little work at insert time, but it pays for itself when you go to join data.
You should look up all three of these in SO (they have their own tags) or out on the web. You'll get more details on the design, and an indication of how well each design fits your case.
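As a rough sketch of Class Table Inheritance with a shared primary key (the table and column names below are only illustrative, not taken from the question):

-- Superclass table: attributes common to every user.
CREATE TABLE users (
  user_id  INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name     VARCHAR(100) NOT NULL,
  email    VARCHAR(255) NOT NULL,
  password CHAR(60)     NOT NULL,
  role     ENUM('customer', 'moderator', 'merchant') NOT NULL  -- hypothetical role list
) ENGINE=InnoDB;

-- Subclass table: its primary key is a copy of the id in users (shared primary key).
CREATE TABLE users_merchants (
  user_id         INT UNSIGNED NOT NULL PRIMARY KEY,
  invoice_address VARCHAR(255),
  vat_number      VARCHAR(50),
  CONSTRAINT fk_merchant_user FOREIGN KEY (user_id) REFERENCES users (user_id)
) ENGINE=InnoDB;

-- Fetching one merchant's full profile is a single join on the shared key.
SELECT u.name, u.email, m.invoice_address
FROM users u
JOIN users_merchants m ON m.user_id = u.user_id
WHERE u.user_id = 42;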

The 'perfect' structure for such cases, in my opinion, is the party-role-relationship model. Search for Len Silverston's books about data models. It looks quite complicated at the beginning, but it gives great flexibility...
The biggest question is the practicability of adopting the perfect solution. Nobody except you can answer that. Refactoring is never an easy or fast task, so if your project's lifetime is 1 year, spending 9 months paying off 'technical debt' sounds more like a waste of time and effort.
As for the performance of joins, having proper indexes usually solves potential issues. If not, you can always implement a materialized view; even though MySQL doesn't have that option out of the box, you can build one yourself and refresh it in different ways (for instance, using triggers, or by launching a refresh procedure periodically or on demand).
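A minimal sketch of such a hand-rolled 'materialized view' in MySQL, refreshed on demand by a stored procedure (all table and column names here are assumptions for illustration):

-- Summary table that plays the role of the materialized view.
CREATE TABLE mv_user_overview (
  user_id         INT PRIMARY KEY,
  role_name       VARCHAR(50),
  invoice_address VARCHAR(255)
);

DELIMITER //
CREATE PROCEDURE refresh_mv_user_overview()
BEGIN
  -- Full rebuild; call it from a cron job, a MySQL EVENT, or after bulk changes.
  TRUNCATE TABLE mv_user_overview;
  INSERT INTO mv_user_overview (user_id, role_name, invoice_address)
  SELECT u.user_id, r.name, m.invoice_address
  FROM users u
  JOIN roles r                ON r.role_id = u.role_id
  LEFT JOIN users_merchants m ON m.user_id = u.user_id;
END //
DELIMITER ;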

table user
table roles
table permissions
table userRole
table userPermission
table RolesPermissions
Each role has its permissions in the RolesPermissions table.
Each user can also have a permission without the role (an extension...).
So in PHP you just have to merge the array of permissions coming from the user's roles with the array of extended permissions...
And in your "acl" class you check whether the user has the permission to view or run a web page or a system process...

I don't think you need to worry about speed here so much.
It will be a one-time thing only: on user login, store the ACL in the session and get it from there on subsequent requests.
JOINs are not so bad. If you have your indexes and foreign keys in the right places with the InnoDB engine, it will be really fast.
I would use one table for users with a role_id, a second table with roles, a third table for resources, and one to link it all together plus an enabled flag (see the sketch below).
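A hedged sketch of the kind of query you might run once at login and cache in the session (the link-table and column names are assumptions based on the layout described above):

-- Load the user's ACL once at login; cache the result in the session.
SELECT res.name AS resource, rr.enabled
FROM users u
JOIN roles r           ON r.id       = u.role_id
JOIN role_resources rr ON rr.role_id = r.id
JOIN resources res     ON res.id     = rr.resource_id
WHERE u.id = ?;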

Related

What Are Good Solutions for a Database Table that Gets Too Long?

I will describe a problem using a specific scenario:
Imagine that you create a website to which users can register,
and after they register, they can send Private Messages to each other.
This website enables every user to maintain his own Friends list,
and also maintain a Blocked Users list, from which he prefers not to get messages.
Now the problem:
Imagine this website getting to several million users,
and let's also assume that every user has about 10 Friends in the Friends table, and 10 Blocked Users in the Blocked Users table.
The Friends list Table, and the Blocked Users table, will become very long,
but worse than that, every time someone wants to send a message to another person "X",
we need to go over the whole Blocked Users table and look for the records that user "X" defined - the people he blocked.
This "scanning" of a long database table, each time a message is sent from one user to another, seems quite inefficient to me.
So I have 2 questions about it:
What are possible solutions for this problem?
I am not afraid of long database tables,
but I am afraid of database tables that contain data for so many users,
which means that the whole table needs to be scanned every time, just to pull out a few records from it for that specific user.
A specific solution that I have in my mind, and that I would like to ask about:
One solution that I have in mind for this problem is that every user who registers on the website will have his own "mini-database" dynamically (and programmatically) created for him;
that way the Friends table and the Blocked Users table will contain only records for him.
This makes scanning those tables very easy, because all the records are for him.
Does this idea exist in Databases like MS-SQL Server, or MySQL? And If yes, is it a good solution for the described problem?
(each user will have his own small database created for him, and of course there is also the main (common) database for all other data that is not user specific)
Thank you all
I would wait on the partitioning and on creating mini-database idea. Is your database installed with the data, log and temp files on different RAID drives? Do you have clustered indexes on the tables and indexes on the search and join columns?
Have you tried any kind of reading Query Plans to see how and where the slowdowns are occurring? Don't just add memory or try advanced features blindly before doing the basics.
Creating separate databases will become a maintenance nightmare and it will be challenging to do the type of queries (for all users....) that you will probably like to do in the future.
Partitioning is a wonderful feature of SQL Server and while in 2014 you can have thousands of partitions you probably (unless you put each partition on a separate drive) won't see the big performance bump you are looking for.
SQL Server has very fast response time for tables (especially for tables with 10s of millions of rows (in your case the user table)). Don't let the main table get too wide and the response time will be extremely fast.
Right off the bat my first thought is this:
https://msdn.microsoft.com/en-us/library/ms188730.aspx
Partitioning can allow you to break it up into more manageable pieces and in a way that can be scalable. There will be some choices you have to make about how you break it up, but I believe this is the right path for you.
In regards to table scanning if you have proper indexing you should be getting seeks in your queries. You will want to look at execution plans to know for sure on this though.
As for having mini-DB for each user that is sort of what you can accomplish with partitioning.
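For illustration only, a minimal T-SQL sketch of what such a partitioned table could look like (the boundary values, filegroup, and names are made up):

-- Range-partition the blocked-users table on UserId.
CREATE PARTITION FUNCTION pf_UserRange (INT)
AS RANGE RIGHT FOR VALUES (1000000, 2000000, 3000000);

CREATE PARTITION SCHEME ps_UserRange
AS PARTITION pf_UserRange ALL TO ([PRIMARY]);

CREATE TABLE dbo.BlockedUsers (
  UserId        INT NOT NULL,
  BlockedUserId INT NOT NULL,
  CONSTRAINT PK_BlockedUsers PRIMARY KEY CLUSTERED (UserId, BlockedUserId)
) ON ps_UserRange (UserId);  -- lookups by UserId touch a single partition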
Mini-Database for each user is a definite no-go zone.
Plus, on a side note: a separate table holding just two columns, UserID and BlockedUserID, both INT and with the correct indexes, cannot go wrong if you write your queries sensibly :)
Look into table partitioning; a well-normalized database with decent indexes will also help.
Also, if you can afford an Enterprise licence, table partitioning combined with the table schema described in the last point will make for a very good, query-friendly database schema.
I did it once for a social network system. Maybe you can take a look at your normalization. At the time I had a [Relationship] table and it just had
UserAId Int
UserBId Int
RelationshipFlag Smallint
With 1 million users, each one with 10 "friends", that table had 10 million rows. Not a problem, since we put indexes on the columns and it could retrieve a list of all users B "related" to a specific user A in no time.
Take a good look at your schema and your indexes; if they are OK, your DB will not have problems handling it.
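A rough sketch of that [Relationship] table and its indexes (the constraint and index names are illustrative):

CREATE TABLE Relationship (
  UserAId          INT      NOT NULL,
  UserBId          INT      NOT NULL,
  RelationshipFlag SMALLINT NOT NULL,
  CONSTRAINT PK_Relationship PRIMARY KEY (UserAId, UserBId)
);
CREATE INDEX IX_Relationship_UserB ON Relationship (UserBId);

-- All users B related to a given user A; the composite primary key makes this a cheap seek.
SELECT UserBId, RelationshipFlag
FROM Relationship
WHERE UserAId = 42;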
Edit
I agree with @M.Ali:
Mini-Database for each user is a definite no-go zone.
IMHO you are fine if you stick with the basics and implement them the right way

Is it good or bad practice to have multiple foreign keys in a single table, when the other tables can be connected using joins?

Let's say I wanted to make a database that could be used to keep track of bank accounts and transactions for a user. A database that can be used in a Checkbook application.
If I have a user table, with the following properties:
user_id
email
password
And then I create an account table, which can be linked to a certain user:
account_id
account_description
account_balance
user_id
And to go the next step, I create a transaction table:
transaction_id
transaction_description
is_withdrawal
account_id // The account to which this transaction belongs
user_id // The user to which this transaction belongs
Is having the user_id in the transaction table a good option? It would make the query cleaner if I wanted to get all the transactions for each user, such as:
SELECT * FROM transactions
JOIN users ON users.user_id = transactions.user_id
Or, I could just trace back to the users table from the account table
SELECT * FROM transactions
JOIN accounts ON accounts.account_id = transactions.account_id
JOIN users ON users.user_id = accounts.user_id
I know the first query is much cleaner, but is that the best way to go?
My concern is that by having this extra (redundant) column in the transaction table, I'm wasting space, when I can achieve the same result without said column.
Let's look at it from a different angle. From where will the query or series of queries start? If you have customer info, you can get account info and then transaction info or just transactions-per-customer. You need all three tables for meaningful information. If you have account info, you can get transaction info and a pointer to customer. But to get any customer info, you need to go to the customer table so you still need all three tables. If you have transaction info, you could get account info but that is meaningless without customer info or you could get customer info without account info but transactions-per-customer is useless noise without account data.
Either way you slice it, the information you need for any conceivable use is split up between three tables and you will have to access all three to get meaningful information instead of just a data dump.
Having the customer FK in the transaction table may provide you with a way to make a "clean" query, but the result of that query is of doubtful usefulness. So you've really gained nothing. I've worked writing Anti-Money Laundering (AML) scanners for an international credit card company, so I'm not being hypothetical. You're always going to need all three tables anyway.
Btw, the fact that there are FKs in the first place tells me the question concerns an OLTP environment. An OLAP environment (data warehouse) doesn't need FKs or any other data integrity checks, as warehouse data is static. The data originates from an OLTP environment where the data integrity checks have already been made. So there you can denormalize to your heart's content. So let's not give answers applicable to an OLAP environment to a question concerning an OLTP environment.
You should not have both foreign keys in the transaction table. This is not good database design.
A user makes transactions through an account. That is how it works logically; therefore, this is how the DB should be designed.
Using joins is how this should be done. You should not add the user_id key, as it is already in the account table.
The wasted space is unnecessary and it is bad database design.
In my opinion, if you have a simple many-to-many relation, just use the two foreign keys as a composite primary key, and that's all.
Otherwise, if you have a many-to-many relation with extra columns, use one surrogate primary key and two foreign keys. It's easier to manage that table as a single entity, just as Doctrine does. Generally speaking, simple many-to-many relations are rare, and they are useful just for linking two tables.
Denormalizing is usually a bad idea. In the first place, it is often not faster from a performance standpoint. What it does is put data integrity at risk, and it can create massive problems if you end up changing from a 1-1 relationship to a 1-many.
For instance, what is to say that each account will have only one user? In your table design that is all you would get, which is something I find suspicious right off the bat. Accounts in my system can have thousands of users. So that is the first place I question your model. Did you actually think in terms of whether the relationships would be 1-1 or 1-many? Or did you just make an assumption? Data models are NOT easy to adjust after you have millions of records; you need to do far more planning for the future in database design, and far more thinking about the data needs over time, than you do in application design.
But suppose you have a one-to-one relationship now. Three months after you go live, you get a new account where they need to have 3 users. Now you have to remember all the places you denormalized in order to properly fix the data. This can create much confusion, as inevitably you will forget some of them.
Further, even if you never need to move to a more robust model, how are you going to maintain this if the user_id changes, as they often do? Now, in order to keep data integrity, you need a trigger to maintain the data as it changes (a sketch of such a trigger follows below). Worse, if the data can be changed from either table, you could get conflicting changes. How do you handle those?
So you have created a maintenance mess and possibly put your data integrity at risk, all to write "cleaner" code and save yourself all of ten seconds writing a join? You gain nothing in terms of the things that are important in database development, such as performance, security, or data integrity, and you risk a lot. How short-sighted is that?
You need to stop thinking in terms of "cleaner code" when developing for databases. Often the best code for a query is the most complex-looking, because it is the most performant, and that is critical for databases. Don't project object-oriented coding techniques onto database development; they are two very different things with very different needs. You need to start thinking in terms of how this will play out as the data changes, which you are clearly not doing, or you would not even consider doing such a thing. You need to think more about the meaning of the data and less about the "principles of software development", which are taught as if they apply to everything but in reality do not apply well to databases.
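For illustration only, a minimal sketch (MySQL syntax, table names from the question, trigger name made up) of the kind of trigger you would be committing to maintain if you keep the redundant user_id:

-- Keep transactions.user_id in sync whenever an account is reassigned to another user.
DELIMITER //
CREATE TRIGGER trg_accounts_sync_user
AFTER UPDATE ON accounts
FOR EACH ROW
BEGIN
  IF NEW.user_id <> OLD.user_id THEN
    UPDATE transactions
    SET user_id = NEW.user_id
    WHERE account_id = NEW.account_id;
  END IF;
END //
DELIMITER ;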
It depends. If you can get the data fast enough, use the normalized version (where user_id is NOT in the transaction table). If you are worried about performance, go ahead and include user_id. It will use up more space in the database by storing redundant information, but you will be able to return the data faster.
EDIT
There are several factors to consider when deciding whether or not to denormalize a data structure. Each situation needs to be considered uniquely; no answer is sufficient without looking at the specific situation (hence the "It depends" that begins this answer). For the simple case above, denormalization would probably not be an optimal solution.

How to handle massive storage of records in database for user authorization purposes?

I am using Ruby on Rails 3.2.2 and MySQL. I would like to know if it is "advisable" / "desirable" to store in a database table related to a class all records related to two others classes for each "combination" of their instances.
That is, I have User and Article models. In order to store all user-article authorization objects, I would like to implement a ArticleUserAuthorization model so that
given N users and M articles there are N*M ArticleUserAuthorization records.
Making so, I can state and use ActiveRecord::Associations as the following:
class Article < ActiveRecord::Base
  has_many :user_authorizations, :class_name => 'ArticleUserAuthorization'
  has_many :users, :through => :user_authorizations
end

class User < ActiveRecord::Base
  has_many :article_authorizations, :class_name => 'ArticleUserAuthorization'
  has_many :articles, :through => :article_authorizations
end
However, the above approach of storing all combinations will result in a big database table containing billions billions billions of rows!!! Furthermore, ideally speaking, I am planning to create all authorization records when a User or an Article object is created (that is, I am planning to create all the previously mentioned "combinations" at once or, better, in "delayed" batches... either way, this process creates billions of other database table rows!!!) and to do the reverse when destroying (by deleting billions of database table rows!!!). Furthermore, I am planning to read and update those rows at once when a User or Article object is updated.
So, my doubts are:
Is this approach "advisable" / "desirable"? For example, what kind of performance problems may occur? Or is it a bad way to administer / manage databases with very large tables?
How may / could / should I proceed in my case (maybe, by "re-thinking" at all how to handle user authorizations in a better way)?
Note: I would use this approach because, in order to retrieve only "authorized objects" when retrieving User or Article objects, I think I need "atomic" user authorization rules (that is, one user authorization record for each user and article object), since the system is not based on user groups like "admin", "registered" and so on. So I thought that having an ArticleUserAuthorization table avoids having to run methods related to user authorizations (note: those methods involve some MySQL querying that could worsen performance - see my previous question for a sample "authorization" method implementation) on each retrieved object, by "simply" accessing / joining the ArticleUserAuthorization table so as to retrieve only "user authorized" objects.
The fact of the matter is that if you want article-level permissions per user, then you need a way to relate Users to the Articles they can access. This necessitates, at a minimum, N*A rows (where A is the number of uniquely permissioned articles).
The 3NF approach to this would be, as you suggested, to have a UsersArticles set... which would be a very large table (as you noted).
Consider that this table would be accessed a whole lot...
This seems to me like one of the situations in which a slightly denormalized approach (or even noSQL) is more appropriate.
Consider the model that Twitter uses for their user follower tables:
Jeff Atwood on the subject
And High Scalability Blog
A sample from those pieces is a lesson learned at Twitter: querying followers from a normalized table puts tremendous stress on a Users table. Their solution was to denormalize followers so that a user's followers are stored in their individual user settings.
Denormalize a lot. Single-handedly saved them. For example, they store all of a user's friend IDs together, which prevented a lot of costly joins.
- Avoid complex joins.
- Avoid scanning large sets of data.
I imagine a similar approach could be used to serve article permissions and avoid a tremendously stressed UsersArticles single table.
You don't have to re-invent the wheel. ACL (Access Control List) frameworks have dealt with this same kind of problem for ages now, and most efficiently if you ask me. You have resources (Article) or, even better, resource groups (Article Category/Tag/etc.). On the other hand you have users (User) and user groups. Then you would have a relatively small table which maps resource groups to user groups. And you would have another relatively small table which holds exceptions to this general mapping. Alternatively, you can have rule sets that must be satisfied for accessing an article. You can even have dynamic groups like authors_friends, depending on your user-user relations.
Just take a look at any decent ACL framework and you would have an idea how to handle this kind of problem.
If there really is the prospect of "a big database table containing billions billions billions of rows" then perhaps you should craft a solution for your specific needs around a (relatively) sparsely populated table.
Large database tables pose a significant performance challenge in how quickly the system can locate the relevant row or rows. Indexes and primary keys are really needed here; however, they add to the storage requirements and also require CPU cycles to be maintained as records are added, updated, and deleted. Even so, heavy-duty database systems also have partitioning features (see http://en.wikipedia.org/wiki/Partition_(database) ) that address such row-location performance issues.
A sparsely populated table can probably serve the purpose assuming some (computable or constant) default can be used whenever no rows are returned. Insert rows only wherever something other than the default is required. A sparsely populated table will require much less storage space and the system will be able to locate rows more quickly. (The use of user-defined functions or views may help keep the querying straightforward.)
If you really cannot make a sparsely populated table work for you, then you are quite stuck. Perhaps you can make that huge table into a collection of smaller tables, though I doubt that's of any help if your database system supports partitioning. Besides, a collection of smaller tables makes for messier querying.
So let's say you have millions or billions of Users who or may not have certain privileges regarding the millions or billions of Articles in your system. What, then, at the business level determines what a User is privileged to do with a given Article? Must the User be a (paying) subscriber? Or may he or she be a guest? Does the User apply (and pay) for a package of certain Articles? Might a User be accorded the privilege of editing certain Articles? And so on and so forth.
So let's say a certain User wants to do something with a certain Article. In the case of a sparsely populated table, a SELECT on that grand table UsersArticles will either return 1 row or none. If it returns a row, then one immediately knows the ArticleUserAuthorization, and can proceed with the rest of the operation.
If no row, then maybe it's enough to say the User cannot do anything with this Article. Or maybe the User is a member of some UserGroup that is entitled to certain privileges to any Article that has some ArticleAttribute (which this Article has or has not). Or maybe the Article has a default ArticleUserAuthorization (stored in some other table) for any User that does not have such a record already in UsersArticles. Or whatever...
The point is that many situations have a structure and a regularity that can be used to help reduce the resources needed by a system. Human beings, for instance, can add two numbers with up to 6 digits each without consulting a table of over half a trillion entries; that's taking advantage of structure. As for regularity, most folks have heard of the Pareto principle (the "80-20" rule - see http://en.wikipedia.org/wiki/Pareto_principle ). Do you really need to have "billions billions billions of rows"? Or would it be truer to say that about 80% of the Users will each only have (special) privileges for maybe hundreds or thousands of the Articles - in which case, why waste the other "billions billions billions" (rounded :-P).
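A hedged sketch of the "sparsely populated table plus default" lookup described above (every table and column name here is hypothetical):

-- Return the user's specific authorization if a row exists,
-- otherwise fall back to the article's default authorization.
SELECT COALESCE(ua.permission, a.default_permission) AS effective_permission
FROM articles a
LEFT JOIN users_articles ua
       ON ua.article_id = a.id
      AND ua.user_id    = 123
WHERE a.id = 456;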
You should look at hierarchical role-based access control (RBAC) solutions. You should also consider sensible defaults.
Are all users allowed to read an article by default? Then store the deny exceptions.
Are all users not allowed to read an article by default? Then store the allow exceptions.
Does it depend on the article whether the default is allow or deny? Then store that in the article, and store both allow and deny exceptions.
Are articles put into issues, and issues collected into journals, and journals collected into fields of knowledge? Then store authorizations between users and those objects.
What if a User is allowed to read a Journal but is denied a specific Article? Then store User-Journal:allow, User-Article:deny and the most specific instruction (in this case the article) takes precedence over the more general (in this case the default, and the journal).
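As a rough illustration of the "most specific wins" rule (the table layout and the specificity ranking are assumptions), the decision could be resolved in a single query:

-- One row per stored exception; the CASE ranks article-level rules as most
-- specific, then journal-level rules, then the global default.
SELECT decision
FROM authorizations
WHERE user_id = 123
  AND (article_id = 456 OR journal_id = 78 OR is_default = 1)
ORDER BY CASE
           WHEN article_id IS NOT NULL THEN 1
           WHEN journal_id IS NOT NULL THEN 2
           ELSE 3
         END
LIMIT 1;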
Shard the ArticleUserAuthorization table by user_id. The principle is to reduce the effective dataset size on the access path. Some data will be accessed more frequently than other data, and it will be accessed in a particular way; on that path the size of the result set should be small. Here we do that by having a shard. Also, optimize that path further by adding an index if it is a read workload, caching it, etc.
This particular shard is useful if you want all the articles authorized by a user.
If you want to query by article as well, then duplicate the table and shard it by article_id too. When we have this second sharding scheme, we have denormalized the data. The data is now duplicated, and the application needs to do extra work to maintain data consistency. Writes will also be slower; use a queue for writes.
The problem with sharding is that queries across shards are ineffective, so you will need a separate reporting database. Pick a sharding scheme and think about recomputing shards.
For truly massive databases, you would want to split it across physical machines, e.g. one or more machines per user's articles.
Some NoSQL suggestions are:
Relationships are graphs, so look at graph databases, particularly
https://github.com/twitter/flockdb
Redis, by storing the relationship in a list.
A column-oriented database like HBase; you can treat it like a sparse nested hash.
All of this depends on the size of your database and the types of queries.
EDIT: modified answer. The question previously had 'has_one' relationships. Also added NoSQL suggestions 1 & 2.
First of all, it is good to think about default values and behaviors and not store them in the database. For example, if by default, a user cannot read an article unless specified, then, it does not have to be stored as false in the database.
My second thought is that you could have a users_authorizations column in your articles table and an articles_authorizations column in your users table. Those 2 columns would store user ids and article ids in the form 3,7,65,78,29,78. For the articles table, for example, this would mean users with ids 3,7,65,78,29,78 can access that article. Then you would have to modify your queries to retrieve users that way:
@article = Article.find(34)
@users = User.find(@article.users_authorizations.split(','))
Each time an article and a user is saved or destroyed, you would have to create callbacks to update the authorization columns.
class User < ActiveRecord::Base
  after_save :update_articles_authorizations

  def update_articles_authorizations
    #...
  end
end
Do the same for Article model.
Last thing: if you have different types of authorizations, don't hesitate to create more columns, like user_edit_authorization.
With these combined techniques, the quantity of data and hits to the DB are minimal.
Reading through all the comments and the question, I still doubt the validity of storing all the combinations. Think about the question another way: who will populate that table? The author of the article, a moderator, or someone else? And based on what rule? You can imagine how difficult that is. It's impossible to populate all the combinations.
Facebook has a similar feature. When you write a post, you can choose whom you want to share it with. You can select 'Friends', 'Friends of Friends', 'Everyone' or a custom list. The custom list allows you to define who will be included and excluded. In the same way, you only need to store the special cases, like 'include' and 'exclude', and all the remaining combinations fall into the default case. By doing this, N*M can be reduced significantly.

Basic database design and complexity

I am designing a system which has a database for storing users and information related to the users. More specifically each user in the table has very little information. Something like Name, Password, uid.
Then each user has zero or more containers, and the way I've initially done this is to create a second table in the database which holds containers and have a field referencing the user owning it. So something like containerName, content, owner.
So a query on data from a container would look something like:
SELECT content
FROM containers
WHERE (containerName='someContainer' AND owner='someOwner');
My question is whether this is a good way. I am thinking about scalability: say we have thousands of users with, say, 5 containers each (each user could have a different number of containers, but 5 would probably be a typical case). My concern is that searching through the database will become slow when only 5 of the 5*1000 entries are the ones I want in a query. We may typically only want a specific container's content from our query, so we are looking into the database with basically an overhead of 4,995 entries, am I right? And what happens if I sign up a million users? It would become a huge table, which just intuitively feels like a bad idea.
A second take I had on it would be to have a table per user; however, that doesn't feel like a very good solution either, since that would give me 1000 tables in the database, which (also by intuition) seems like a bad way to do it.
Any help in understanding how to design this would be greatly appreciated, I hope it's all clear and easy to follow.
The accepted way of handling this is by creating an INDEX on the owner field. That way, MySQL can optimize queries with owner = 'some value' conditions.
See also: http://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html
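For example, a hedged sketch using the column names from the question (the index name is made up):

-- A composite index lets lookups by owner (or owner + containerName) use an
-- index seek instead of scanning the whole containers table.
CREATE INDEX idx_containers_owner ON containers (owner, containerName);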
You're right in saying that 1000 tables is not scalable. Once you start reaching a few million records you might want to consider sharding (splitting records across several locations based on user attributes)... but by that time you'd already be quite successful, I think ;-)
If it is an RDBMS (like Oracle / MySQL) database, you can create indexes on the columns that are frequently queried to optimize table traversal and querying. Indexes are automatically created for PRIMARY and (optionally for) FOREIGN keys.

What is the most efficient method of keeping track of each user's "blocked users" in a MySQL Database?

What is the most efficient method of managing blocked users for each user so they don't appear in search results on a PHP/MySQL-run site?
This is the way I am currently doing it and I have a feeling this is not the most efficient way:
Create a BLOB for each user on their main user table that gets updated with the unique user IDs of each user they block. So if user IDs 313, 563, and 732 are blocked by a user, their BLOB simply contains "313,563,732". Then, whenever a search result is queried for that user, I include the BLOB contents like so: "AND UserID NOT IN (313,563,732)", so that the blocked user IDs don't show up for that user. When a user "unblocks" someone, I remove that user ID from their BLOB.
Is there a better way of doing this (I'm sure there is!)? If so, why is it better and what are the pros and cons of your suggestion?
Thanks, I appreciate it!
You are saving relationships in a relational database in a way that it does not understand. You will not have the benefit of foreign keys etc.
My recommended way to do this would be to have a separate table for the blocked users:
create table user_blocked_users (user_id int, blocked_user_id int);
Then when you want to filter the search result, you can simply do it with a subquery:
select * from user u where ?searcherId not in (select b.blocked_user_id from user_blocked_users b where b.user_id = u.id)
You may want to start out that way, and then optimize it with queries, caches or other things if necessary - but do it last. First, build a consistent and correct data model that you can work with.
Some of the pros of this approach:
- You will have a correct data model of your block relations
- With foreign keys, you will keep your data model consistent
The cons of this approach:
- In your case, none that I can see
The cons of your approach:
- It will be slow and not scalable, as the BLOBs have to be parsed and scanned rather than indexed
- Your data model will be hard to maintain and you will not have the benefit of foreign keys
You are looking for a cross reference table.
You have a table containing user IDs and "Blocked" user IDs, then you SELECT blockid FROM blocked WHERE uid=$user and you have a list of user ids that are blocked, which you can filter through a where clause such as WHERE uid NOT IN(SELECT blockid FROM blocked WHERE uid=$user)
Now you can block multiple users per user, and the other way round, with all the speed of an actual database.
You are looking for a second table joined in a many-to-many relationship. Check this post:
Many-to-Many Relationships in MySQL
The "Pros" are numerous. You are handling your data with referential integrity, which has incalculable benefits down the road. The issue you described will be followed by others in your application, and some of those others will be more unmanageable than this one.
The "Cons" are that
You will have have to learn how referential data works (but that's ahead anyway, as I say)
You will have more tables to deal with (ditto)
You will have to learn more about CRUD, which is difficult ... but, just part of the package.
What you are currently using is not regarded as a good practice for relational database design, however, like with anything else, there are cases when that approach can be justified, albeit restrictive in terms of what you can accomplish.
What you could do is, like J V suggested, create a cross reference table that contains mappings of user relationships. This allows you to, among other things, skip unnecessary queries, make use of table indexes and possibly most importantly, it gives you far greater flexibility in the future.
For instance, you can add a field to the table that indicates the type/status of the relationship (ie. blocked, friend, pending approval etc.) which would allow a much more complex system to be developed easily.
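A hedged sketch of that kind of cross-reference table with a relationship-type field (the table name, the status values, and the assumption of a user table keyed by user_id are all illustrative):

CREATE TABLE user_relationships (
  user_id         INT NOT NULL,
  related_user_id INT NOT NULL,
  status          ENUM('blocked', 'friend', 'pending') NOT NULL DEFAULT 'pending',
  PRIMARY KEY (user_id, related_user_id),
  FOREIGN KEY (user_id)         REFERENCES user (user_id),
  FOREIGN KEY (related_user_id) REFERENCES user (user_id)
) ENGINE=InnoDB;

-- Filter search results against the searcher's blocked users:
SELECT u.*
FROM user u
WHERE u.user_id NOT IN (
  SELECT r.related_user_id
  FROM user_relationships r
  WHERE r.user_id = 123 AND r.status = 'blocked'
);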