Understanding Web App Permissions with MySQL

Assume I have a schema defined with the 4 following entities:
Users
-> Timeline (fk: userId)
-> Entries (fk: timelineId)
-> Tags (fk: entryId), where fk means foreign key.
Now, let's say I want to check in the web application whether a user has permission to delete a particular tag. Right now I use Basic Authentication: I check if the user's email/password combination exists in the database, and if so, grab the userId.
Because the userId only exists on the Timeline entity, I feel like I'd need to do the following:
DELETE t.* FROM `tags` AS t
INNER JOIN `entries` AS e ON t.entryId = e.id
INNER JOIN `timelines` AS tl ON e.timelineId = tl.id
WHERE
tl.userId = ? AND
t.id = ?
This approach works, but I feel like it would be inefficient. Instead, I could add a userId FK to every single table, such as tags, but that also seems like a maintenance nightmare.
I can't think of any other approaches other than implementing some other type of permission system, such as using an ACL. Any suggestions?

I think you can choose from a few options:
leave it as is until it actually becomes a problem and fix it then (it probably won't be a problem, there's a lot of optimization in MySQL)
add foreign keys to tables as you proposed and take the overhead on changes (your models / data access layer should hide the problem from higher layers anyway)
implement some kind of custom caching
you can either create something like a cache table, probably in a NoSQL database like Redis, which would be very fast (once a permission is retrieved, it can live in the cache for a while, but be aware of the consequences; for example, permission changes will not be immediate)
you can use memcache
you can do custom in-memory caching in your app (be careful with using the session for this, session-related vulnerabilities might allow an attacker more access than you intended)
Basically, and in general, it's a compute/storage tradeoff: you either compute permissions every time, or store them pre-computed somewhere, which means you need to re-compute them sometimes (but probably not all the time).
The right solution depends on your exact scenario. My experience is that in most cases it's not worth fixing something that is not broken yet (unless, of course, you know it will not work that way in the scenario you want to use it in).
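For what it's worth, if you keep the join-based DELETE, making sure the foreign key columns are indexed is usually all the optimization it needs. A minimal sketch, assuming the table and column names from the question (InnoDB creates such indexes automatically if the columns are declared as foreign keys):
-- Assumed names from the question; skip any index that already exists.
CREATE INDEX idx_tags_entryId ON tags (entryId);
CREATE INDEX idx_entries_timelineId ON entries (timelineId);
CREATE INDEX idx_timelines_userId ON timelines (userId);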

Check out MySQL's foreign key support. You can simply add relationships between the tables and cascade deletes when the parent rows get removed.
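A sketch of what that cascading setup could look like, assuming the question's table and column names; deleting a timeline then removes its entries and their tags automatically:
ALTER TABLE entries
    ADD CONSTRAINT fk_entries_timeline
    FOREIGN KEY (timelineId) REFERENCES timelines (id)
    ON DELETE CASCADE;

ALTER TABLE tags
    ADD CONSTRAINT fk_tags_entry
    FOREIGN KEY (entryId) REFERENCES entries (id)
    ON DELETE CASCADE;
Note that cascades only handle cleanup when a parent row is deleted; the permission check for deleting a single tag still needs a query like the one in the question.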

Translate SQL database schema to IndexedDB

I have three tables in my SQL schema: clients (with address and so on), orders (with order details), and files (which stores uploaded files). Both the files table and the orders table contain foreign keys referencing the clients table.
How would I do that in IndexedDB? I'm new to this whole key-index thinking and would just like to understand how the same thing would be done with IndexedDB.
Now I know there is a shim.js file, but I'm trying to understand the concept itself.
Help and tips highly appreciated!
EDIT:
So I would really have to think about which queries I want to allow and then optimize my IndexedDB implementation for those queries; is that the main point here? Basically, I want to store a customer once, then many orders for that customer, and then be able to upload small files (preferably PDFs) for that customer, not even necessarily for each order (although if that's easy to implement, I may do it)... I see every customer as a separate entity; I won't have things like "give me all customers who ordered xy". I only need to have each customer once, and then store all the orders and all the files for that customer. I want to be able to go: search for the customer with the name XY, which then gives me a list of all orders and their dates, and a list of the files uploaded for that customer (maybe associated to the order).
This question is a bit too broad to answer correctly. Nevertheless, the major concept to learn when transitioning from SQL to NoSQL (IndexedDB) is the concept of object stores. Most SQL databases are relational and perform much of the work of optimizing queries for you. IndexedDB does not. So the concepts of normalization and denormalization work a bit differently. The focal point is to explicitly plan your own queries. Unlike an app/system design that allows simple ad-hoc SQL queries to be written at a later point in time, and possibly even easily added or changed later, you really need to do a lot of the planning up front for IndexedDB.
So it is not quite safe to say that the transition is simply a matter of creating three object stores to correspond to your three relational tables. For one, there is no concept of joining in IndexedDB, so you cannot join on foreign keys.
It is not clear from your question, but your 3 tables are clients, orders, and files. I will go out on a limb here and make some guesses. I would bet you could use a single object store, clients. Then, for each client object, store the normal client properties, an orders array property, and a files array property. In the orders array, store order objects.
If your files are binary, this won't work; you will need to use Blobs, and may even encounter issues with Blob support in the various browsers' IndexedDB implementations (Chrome sort of supports it; it is unclear from version to version).
This assumes your typical query plan is that you need to do something like list the orders for a client, and that is the most frequently used type of query.
If you needed to do something across orders, independent of which client an order belongs to, this would not work so well and you would have to iterate over the entire store.
If the clients-orders relation is many to many, then this also would not work so well, because of the need to store the order info redundantly per client. However, one note here: this redundant storage is quite common in NoSQL-style databases like IndexedDB. The goal is not to perfectly model the data, but to store the data in such a way that your most frequently occurring queries complete quickly (while still maintaining correctness).
Edit:
Based on your edit, I would suggest a simple prototype that uses three object stores. On the client view page where you display client details, simply run three separate queries:
1. Get the one entity from the client object store based on client id.
2. Open a cursor over the orders store and get all orders for the client. In the orders store, use a client-id property, create an index on it, and open the cursor over the index for the specific client id.
3. Open a cursor over the files store using a similar tactic as #2.
In your bizlogic layer, enforce your data constraints. For example, when deleting a client, first delete all the files from the files store, then delete all the orders from the orders store, and then delete the single client entity from the client store.
What I am suggesting is to not overthink it. It is not that complicated. So far you have not described something that sounds like it will have performance issues so there is no need for something more elegant.
I will go with Josh's answer, but if you are still finding it hard to use IndexedDB and want to continue using SQL, you can use sqlweb. It will let you operate inside IndexedDB by using SQL queries.
e.g.:
var connection = new JsStore.Instance('jsstore worker path');
connection.runSql("select * from Customers").then(function(result) {
    console.log(result);
});
Here is the link - http://jsstore.net/tutorial/sqlweb/

MySQL: Best way to check if user has permission by role

I'm building a small permission system, but unfortunately I'm no SQL expert by any means.
In this system I've decided to give all users a role and then assign specific permissions to the roles. My current database tables look like this:
My question is: what's the best way to check if a given User.id has a permission, by providing a Permission.permission_name value? I've come up with the following query:
SELECT EXISTS (
    SELECT perm.id
    FROM `User` userr
    INNER JOIN `Role_Permission` connectionc
        ON userr.role_id = connectionc.role_id
    INNER JOIN `Permission` perm
        ON connectionc.permission_id = perm.id
    WHERE userr.id = 1
      AND perm.permission_name LIKE 'doStuff'
) AS userHasPermission
It works, but from my understanding joins are expensive, and this query joins the contents of 3 tables and then filters what it needs.
Link to sqlfiddle: http://sqlfiddle.com/#!2/6ed7b/1
Thank you.
I don't think there's much room to optimise the query. In a real-world scenario, no matter how big the user table is, the role and permission tables shouldn't exceed three digits' worth of rows (fewer than 1,000 each), and therefore role_permission would not exceed 998,001 records. If all the right columns are indexed properly, I believe the SQL will be quite fast (<0.1 sec). You can always use EXPLAIN to check whether there are any bottlenecks.
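For example, a composite index on the join table (names assumed from the question) lets MySQL resolve the lookup mostly from indexes, and prefixing the query with EXPLAIN shows whether any step falls back to a full table scan:
-- Composite index covering both join columns (assumed names).
CREATE INDEX idx_role_permission ON Role_Permission (role_id, permission_id);

-- Inspect the execution plan of the permission check:
EXPLAIN SELECT perm.id
FROM `User` userr
INNER JOIN `Role_Permission` connectionc ON userr.role_id = connectionc.role_id
INNER JOIN `Permission` perm ON connectionc.permission_id = perm.id
WHERE userr.id = 1 AND perm.permission_name = 'doStuff';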
(Off topic)
Alternatively, having worked on a similar project recently: there are a few choices out there to improve the speed of fetching from a 'finite' number of records.
Memory: You can choose to keep all these relevant tables/data in memory (as opposed to on disk) to minimise I/O-related latency.
NoSQL: You can either choose a NoSQL solution like MongoDB and/or implement a NoSQL-like structure in MySQL to eliminate joins.
Redis: Arguably, the best solution if you'd like to think outside the box. Fastest of all.
I don't think there is much room for optimization, not without compromising the normalization of the database. Just make sure that you have the appropriate indexes in place.
Some alternatives would be:
Store the permission name in the role-permission table, thus requiring one less join. It will not be normalized, but this may be acceptable if permissions rarely change and you really need maximum performance.
Do not use integer ids for the permissions; instead, use their name as the unique identifier. Then you don't need the Permission table at all, unless you need to add some attributes to permissions (but even that would still allow you to check for a permission with only one join).
You should also consider how often you need to run this query. Depending on your requirements, it may be acceptable to read all of a user's permissions only when the user enters the system and keep them in variables for the whole session; in that case you do not need such high performance from the query. Or you could initially load not the permissions but the role, which would mean one less join in the query.
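As an illustration of that last idea, here is a sketch of loading all of a user's permissions once at login (table names assumed from the question); the result set can then be cached in the session for the rest of the visit:
SELECT perm.permission_name
FROM `User` u
INNER JOIN `Role_Permission` rp ON u.role_id = rp.role_id
INNER JOIN `Permission` perm ON rp.permission_id = perm.id
WHERE u.id = ?;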

Database revisions for data and relations for moderating content changes

Short version: I'm looking for suggestions on how to implement a database versioning system that keeps track of relations as well as record data for moderating a social web app with user-editable content.
Long version: I'm developing a social app where all the users can edit content. To give you an example, let's say one type of an item to edit is a Book. A Book can have a title and a few authors (many-to-many relation to authors table). Of course this example is rather simple; real items will have many more fields as well as relations, but the rule should be the same.
Now, I need to implement a versioning system to moderate changes made by users. Let's say there are two groups of users: normal users and trusted users. Changes made by normal users are logged, but aren't committed until a moderator accepts that particular change. Changes made by trusted users are committed immediately, but still logged so that they can be reverted at any time.
If I were to keep revisions of a Book from the example, I would need to keep track of changing relations with authors. I need to be able to track adding and deleting relations, and be able to revert them.
So if the title gets modified, I need to be able to revert the change. If a relation to an author gets deleted, I need to be able to revert that, as well as if some relation gets added, I need to be able to revert that too.
I only need to keep track of an item and its relations, not anything that it relates to. If we had tables foos, bars, and foos_bars, I would be interested only in logging foos and foos_bars; bars would be pretty independent.
I'm familiar with this great question, as well as its kind-of-an-adversary solution, and a pretty comprehensive article on the second approach and its follow-up. However, none of those gives any special consideration to keeping track of relations as well as normal table data, which would be the obvious answer to my problem.
I like the one-table-for-all-history approach, as it allows for keeping only part of the changes easily and undoing the others. Like if one user submitted fields A and B, and then a second user submitted A and B, it would be easy to undo just the second user's B-change and keep the A. It's also nice to have one table for the whole functionality, as opposed to many tables with the other approach. It also makes it easy to see who did exactly what (e.g. modified only the foobar field) - it doesn't seem to be easy with the other approach. And it seems like it would be easier to automate the moderation process - we don't really even need to know table names, as everything needed is stored in a revision record.
If I were to use the one-revisions-table-for-each-revisioned-table approach, having limited experience writing triggers, I don't know if it would be possible, or relatively easy, to implement a system that automatically records an edit but doesn't commit it immediately unless some parameter is set (e.g. edit_from_trusted_user == true). And it makes me think of triggers firing when I wouldn't really want them to (as the moderation wouldn't apply to e.g. changes made by an admin, or some other "objects" that could try to modify the data).
No matter which solution I choose, it seems as if I'll have to add a rather artificial id to all many-to-many relation tables (instead of [book_id, author_id] I would have [id, book_id, author_id]).
I thought about implementing relations in the one-table approach like so:
if we have a standard revision table structure
[ID] [int]
[TableName] [varchar]
[RecordID] [int]
[FieldName] [varchar]
[OldValue] [varchar]
[NewValue] [varchar]
[EventType] [enum]
[EventDate] [datetime]
[UserID] [int]
we could store relations by simply setting RecordID and FieldName to NULL, EventType to either ADD or DELETE, and OldValue and NewValue to the relation's foreign keys. The only problem is that some of my relations have additional data (like a graph's edge weight), so I would have to store that somewhere too. Then again, the operation of adding a new relation could be split into a 2-event sequence: ADD and SET(weight), but then artificial relation IDs would be needed, and I'm not sure such a solution wouldn't have some bad implications in the future.
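To make the scheme concrete, here is a hypothetical example row (the revisions table name and the ids are made up for illustration): logging the deletion of the relation between book 42 and author 7, with the two foreign keys carried in OldValue and NewValue as described above:
INSERT INTO revisions
    (TableName, RecordID, FieldName, OldValue, NewValue, EventType, EventDate, UserID)
VALUES
    ('books_authors', NULL, NULL, '42', '7', 'DELETE', NOW(), 123);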
There will be around 5 to 10 versioned tables, each with, on average, 3 many-to-many relations to keep track of. I'm using MySQL with InnoDB; the app is written in PHP 5.3 and connects to the db using PDO.
Putting versioning in the app logic instead of db triggers is fine with me. I just need the whole thing to work and be reasonably efficient. I expect reverts to occur rather seldom compared to edits, and edits will be few compared to the number of views of the content. Only moderators will access revision data, to either accept or reject recent changes.
Do you have any experience implementing such system? What are suggested solutions to this problem? Any considerations that come to mind?
I searched SO and the net for quite some time, but didn't find anything to help me with the matter. However, if I missed something, I'll be grateful for any links / directions.
Thanks.

What is the most efficient method of keeping track of each user's "blocked users" in a MySQL Database?

What is the most efficient method of managing blocked users for each user so they don't appear in search results on a PHP/MySQL-run site?
This is the way I am currently doing it and I have a feeling this is not the most efficient way:
Create a BLOB for each user on the main user table that gets updated with the unique user IDs of each user they block. So if user IDs 313, 563, and 732 are blocked by a user, their BLOB simply contains "313,563,732". Then, whenever a search result is queried for that user, I include the BLOB contents like so: "AND UserID NOT IN (313,563,732)", so that the blocked user IDs don't show up for that user. When a user "unblocks" someone, I remove that user ID from their BLOB.
Is there a better way of doing this (I'm sure there is!)? If so, why is it better and what are the pros and cons of your suggestion?
Thanks, I appreciate it!
You are saving relationships in a relational database in a way that it does not understand. You will not have the benefit of foreign keys etc.
My recommended way to do this would be to have a separate table for the blocked users:
create table user_blocked_users (user_id int, blocked_user_id int);
Then when you want to filter the search result, you can simply do it with a subquery:
select * from user u where u.id not in (select b.blocked_user_id from user_blocked_users b where b.user_id = ?searcherId)
You may want to start out that way, and then optimize it with queries, caches or other things if necessary - but do that last. First, build a consistent and correct data model that you can work with.
Some of the pros of this approach:
You will have a correct data model of your block relations
With foreign keys, you will keep your data model consistent
The cons of this approach:
In your case, none that I can see
The cons of your approach:
It will be slow and not scalable, as BLOBs are scanned as raw strings and cannot be indexed
Your data model will be hard to maintain and you will not have the benefit of foreign keys
You are looking for a cross reference table.
You have a table containing user IDs and "blocked" user IDs; then you SELECT blockid FROM blocked WHERE uid=$user and you have a list of user IDs that are blocked, which you can filter with a WHERE clause such as WHERE uid NOT IN (SELECT blockid FROM blocked WHERE uid=$user)
Now you can block multiple users per user, and the other way round, with all the speed of an actual database.
You are looking for a second table joined in a many-to-many relationship. Check this post:
Many-to-Many Relationships in MySQL
The "Pros" are numerous. You are handling your data with referential integrity, which has incalculable benefits down the road. The issue you described will be followed by others in your application, and some of those others will be more unmanageable than this one.
The "Cons" are that
You will have to learn how referential data works (but that's ahead anyway, as I say)
You will have more tables to deal with (ditto)
You will have to learn more about CRUD, which is difficult ... but, just part of the package.
What you are currently using is not regarded as a good practice for relational database design, however, like with anything else, there are cases when that approach can be justified, albeit restrictive in terms of what you can accomplish.
What you could do is, as J V suggested, create a cross-reference table that contains mappings of user relationships. This allows you to, among other things, skip unnecessary queries, make use of table indexes and, possibly most importantly, gain far greater flexibility in the future.
For instance, you can add a field to the table that indicates the type/status of the relationship (i.e. blocked, friend, pending approval, etc.), which would allow a much more complex system to be developed easily, as shown in the sketch below.
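A sketch of such an extended cross-reference table; all names here are illustrative, not taken from your schema:
CREATE TABLE user_relationships (
    user_id         INT NOT NULL,
    related_user_id INT NOT NULL,
    -- blocking today; friends and pending approvals fit the same row format
    relationship    ENUM('blocked', 'friend', 'pending') NOT NULL,
    PRIMARY KEY (user_id, related_user_id),
    FOREIGN KEY (user_id) REFERENCES user (id),
    FOREIGN KEY (related_user_id) REFERENCES user (id)
);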

Social web application database design: how can I improve this schema?

Background
I am developing a social web app for poets and writers, allowing them to share their poetry, gather feedback, and communicate with other poets. I have very little formal training in database design, but I have been reading books, SO, and online DB design resources in an attempt to ensure performance and scalability without over-engineering.
The database is MySQL, and the application is written in PHP. I'm not sure yet whether we will be using an ORM library or writing SQL queries from scratch in the app. Other than the web application, Solr search server and maybe some messaging client will interact with the database.
Current Needs
The schema I have thrown together below represents the primary components of the first version of the website. Initially, users can register for the site and do any of the following:
Create and modify profile details and account settings
Post, tag and categorize their writing
Read, comment on and "favorite" other users' posts
"Follow" other users to get notifications of their activity
Search and browse content and get suggested posts/users (though we will be using the Solr search server to index DB data and run these type of queries)
Schema
Here is what I came up with on MySQL Workbench for the initial site. I'm still a little fuzzy on some relational databasey things, so go easy.
Questions
In general, is there anything I'm doing wrong or can improve upon?
Is there any reason why I shouldn't combine the ExternalAccounts table into the UserProfiles table?
Is there any reason why I shouldn't combine the PostStats table into the Posts table?
Should I expand the design to include the features we are doing in the second version just to ensure that the initial schema can support it?
Is there anything I can do to optimize the DB design for Solr indexing/performance/whatever?
Should I be using more natural primary keys, like Username instead of UserID, or zip/area code instead of a surrogate LocationID in the Locations table?
Thanks for the help!
In general, is there anything I'm doing wrong or can improve upon?
Overall, I don't see any big flaws in your current setup or schema.
What I'm wondering about is your split into 3 User* tables. I get what your intention was (keeping different user-related things separate), but I don't know if I would go with the exact same thing. If you plan on displaying only data from the User table on the site, this is fine, since the other info is not needed multiple times on the same page. But if users need to use and display their real name (like John Doe instead of doe55), then this will slow things down as the data grows, since you may require joins. Having the Preferences separate seems like a personal choice; I have no argument in favor of nor against it.
Your many-to-many tables would not need an additional PK (e.g. PostFavoriteID). A combined primary key of PostID and UserID would be enough, since PostFavoriteID is never used anywhere else. This goes for all join tables; see the sketch below.
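For example, a join table keyed on the pair directly. This assumes a PostFavorites table with PostID and UserID columns; the exact names in your schema may differ:
CREATE TABLE PostFavorites (
    PostID INT NOT NULL,
    UserID INT NOT NULL,
    -- the pair itself is the identity of the row; no surrogate key needed
    PRIMARY KEY (PostID, UserID)
);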
Is there any reason why I shouldn't combine the ExternalAccounts table into the UserProfiles table?
As with the previous answer, I don't see an advantage or disadvantage. I might put both in the same table, since the NULL (or maybe better, -1) values would not bother me.
Is there any reason why I shouldn't combine the PostStats table into the Posts table?
I would put them into the same table, using a trigger to handle the increment of the ViewCount column.
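A sketch of that trigger, assuming a merged Posts table with a ViewCount column and a PostViews log table that records each view (both PostViews and the column names are assumptions for illustration):
-- Assumed view log table (one row per page view):
CREATE TABLE PostViews (
    PostID INT NOT NULL,
    ViewedAt DATETIME NOT NULL
);

DELIMITER //
CREATE TRIGGER trg_increment_view_count
AFTER INSERT ON PostViews
FOR EACH ROW
BEGIN
    -- keep the denormalized counter on Posts in sync with the log
    UPDATE Posts SET ViewCount = ViewCount + 1 WHERE PostID = NEW.PostID;
END//
DELIMITER ;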
Should I expand the design to include the features we are doing in the second version just to ensure that the initial schema can support it?
You are using a normalised schema, so any additions can be done at any time.
Is there anything I can do to optimize the DB design for Solr indexing/performance/whatever?
Can't tell you; I haven't done it yet. But I know that Solr is very powerful and flexible, so I think you should be doing fine.
Should I be using more natural primary keys, like Username instead of UserID, or zip/area code instead of a surrogate LocationID in the Locations table?
There are many threads here on SO discussing this. Personally, I like a surrogate key better (or another unique numeric key if available), since it makes queries easier and faster: an int is looked up more cheaply. If you allow a change of username/email/whatever-your-PK-is, then massive updates are required. With a surrogate key, you don't need to bother.
What I would also do is add things like created_at and last_accessed_at (best done via triggers or procedures, IMO) to have some stats already available. This can really give you valuable stats.
Further strategies to increase performance would be things like memcache, counter caches, partitioned tables, ... Such things can be discussed when you are really overrun by users, because there may be things/technologies/techniques/... that are very specific to your problem.
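A minimal sketch of those columns on, say, the Users table (names assumed); the default handles created_at, while last_accessed_at would be updated by the app or a procedure on each login:
ALTER TABLE Users
    ADD COLUMN created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    ADD COLUMN last_accessed_at TIMESTAMP NULL;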
I'm not clear what's going on with your User* tables - they're set up as if they're 1:1 but the diagram reflects 1-to-many (the crow's foot symbol).
The ExternalAccounts and UserSettings could be normalised further (in which case they would then be 1-to-many!), which will give you a more maintainable design - you wouldn't need to add further columns to your schema for additional External Account or Notification Types (although this may be less scalable in terms of performance).
For example:
CREATE TABLE ExternalAccounts (
    UserId int,
    AccountType varchar(45),
    AccountIdentifier varchar(45)
);
will allow you to store LinkedIn, Google, etc. accounts in the same structure.
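For example, hypothetical rows in that structure:
INSERT INTO ExternalAccounts (UserId, AccountType, AccountIdentifier) VALUES
    (1, 'linkedin', 'john-doe-1234'),
    (1, 'google', 'john.doe@gmail.com');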
Similarly, further Notification Types can be readily added using a structure like:
CREATE TABLE UserSettings (
    UserId int,
    NotificationType varchar(45),
    NotificationFlag ENUM('on','off')
);
hth