Mysql deduce foreign key relationship for random queries - mysql

I am an MySQL novice and am looking for the solution to the following problem:
I would like to create a CMS with cppcms which shall be capable to have modules. Since I want to reduce the chance of (accidental) access to private data, I want a module which handles data access and rights. Since this module is supposed to be unaware of data structures created by other modules I would like it to deduce the data owner through foreign key relations. My idea would be to search for a path (over foreign keys) which links a row to a user id.
Sum up:
What I am trying to do
Taking a random query, determine the affected rows
for the affected rows determine a relationship/path (via foreign keys) to a user/userid (a column in an existing table)
return only the rows for which a relationship could be determined and a condition holds (e.g. the userid found in the related query matches a fixed user id, such as the user currently accessing the system)
(As far as I know foreign keys only enforce the existence of a key in another table, however the precondition I assume is, that every row is linked to a user over a path of foreign key relations)
My Problem/Question:
Is there an existing solution/Better approach to the problem? Prepared statements wont do the trick since I don't know all datastructures/queries in advance.
How do I get the foreign key relations? Is there another way besides "SHOW CREATE TABLE" and then parsing the result string?
How can I determine the rows that would be affected, without modifing them? I would like to filter this set afterwards by determining if I can link it to the current user (not the mysql user but system user).
Could I try executing the query, and then select the affect rows, and if I determine an access violation simply do a rollback? Problem with this: how to do the changes to the subset of rows for which it is legal (e.g. I attempt to change 5 rows, may only change 2, how to only change those 2). One idea was to search a way to create a temporary table with the result set; this solution has several drawbacks: foreign key relations are not possilbe for temporary tables, they are 'lost'.
P.S.: I am coding in c++, therfore I would prefer cpp-compatible library recommendations, however I am open to other suggestions. While googling I stumbled over doctrine and Iam currently researching it.
P.P.S.: Database engine is InnoDB (has to because of the foreign keys)
UPDATE: Explanation Attempt of Part 2:
I am trying to filter which collumns a user is allowed to see of tables. To do so I would like to find a connection in the database over foreign keys (By foreign keys I ensure that I can get to all data over joins, and they are a hint on which columns I have to join). Since I plan on a complexer system (e.g. forum) I don't want to join all data in a temporary table and run a user query on those. I would rather evaluate the userquery and check for the result if I can map it with a join to the users id. For example I could use this to enforce that an edit button is only enabled for the posts created by the user. (I know there are easier ways to do this, but I basically want to allow programmers to write their own queries without giving them the chance to edit or view data that they are not allowed to see. My assumption is that the programmer is not an evildoer but simply forgetting constraints, thus I want to enforce them in software).
Getting here would be pretty good, but I have a little more complex need.
First a basic example. Let's say its like facebook and all the friends of a person are allowed to see his pictures.
pictures = id **userid** file (bool)visibleForFriends album
friendship = **userid1** **userid2**
users = userid
What I want to happen is:
Programmer input "SELECT * FROM pictures WHERE album=2"
System gets all matching records (e.g. set of ids)
System sees foreign key userid, tries to match current userid against the pictures userid, adds all matching to the returned result part
System notices special column visibleForFriends
System tries to determin all Friends (SELECT userid1 FROM friendship WHERE userid2=currentUserID join (have to read up on joins) SELECT userid2 FROM friendship WHERE userid1 =currentUserID)
System adds all rows where visibleForFriends is true and pictures.userid=Result from 5.
While the Friendship part is some extra code (I think doable if igot started on the first bit), I still need to figure out how to automatically follow the foreign keys to see the connection. Ignoring the special Friendship case (special case), I would like the system to work on this as well:
pictures = id **albumid** file (bool)visibleForFriends album
albums = id **userid**
users = userid
Now the system should go pictures.albumid ==> albums.id -> albums.userid ==> users.userid.
I hope the examples clarified the question a bit. One problem is, that in point one from the example (programmer query input) I dont want to let "DELETE *" take effect on anything not owned by the user. So I have to filter which rows to actually delete.

In response to part of your answer (part 1), providing the Mysql user you access the database with has access rights to information_schema then you can use the following query to understand existing foreign key relations within a specific database:
SELECT
TABLE_NAME,
COLUMN_NAME,
REFERENCED_TABLE_NAME,
REFERENCED_COLUMN_NAME
FROM
information_schema.KEY_COLUMN_USAGE
WHERE
TABLE_SCHEMA = 'dbname' AND REFERENCED_COLUMN_NAME IS NOT NULL;
I am slightly confused by the part 2 and am unsure how to give an appropriate response to this section. I hope you find the above query helpful though in your project!

Is there an existing solution/Better approach to the problem?
Yes, I think so. You're describing a multi-tenant database. In a multi-tenant database in which the users share tables (also known as "shared everything"), each table should have a column for the user id. In effect, each row knows its owner.
This will vastly simplify your SQL, since you need no joins to determine who a row belongs to. it will probably speed up your SQL a lot, too.
This SO answer has a decent summary of the issues and alternatives.

Related

Is it correct to delete a tuple from my query result which has been populated from a sql statement?

I am learning sql queries. If my query result comes back like the picture given query result, is that a valid query result? If not what should I do to get back the result as solid table where it shouldn't let me edit the result. I am using phpmyadmin.
The query was to list all the student that doesn't provide the NOK details.
my sql statement is given below:
select CONCAT(s.fname,' ', s.lname) as Names,s.bannerNo
from Student s
where NOT EXISTS (select*from NOK n where n.studentID = s.bannerNo)
Student table:
Student (bannerNo,fName,lName, street,city,postcode,mobile,email,DOB,gender,category,nationality,special needs, comments,status,courseNo,staffNo)
Primary Key bannerNo
NOK Table:
NOK (StudentID,fName,lName,relationship,street,city,postcode,phoneNo)
Primary Key StudentID
Foreign key StudentID references Student(bannerNo)
If I understand correctly you want to delete a tuple on the visual tool (phpmyadmin) that shows you the result of the query.
I don't know the tool, but it does show the "Delete" button, as your image proves.
Let me translate this into SQL terms: you are seeing a "resultset" and you want to know if it's valid to manipulate it, so it will affect the underlying data. Did I get it right?
If so, then this question is similar to the manipulation of data using views. A view is a projection of one or more tables. Then... are views updatable? The answer is: some of them are.
Which ones? That depends on the database engine. Oracle is quite good at those; other database don't really allow it but in a small number of cases.
So, what's the main idea, then? Well in most databases the main concept is about "traceability". Does each resulting row correspond to a single row in one underlying table? If this is the case, then most likely you'll be able to update/delete it. If not, you're out of luck.
Why? Because, the database engine needs to have some way back to the source of the data you are viewing. If your query includes the primary key of the table, that's good news. If not, then it will be more difficult. Maybe some other column has a unique constraint or something that can be of use.
Bottom line, it depends on:
The database engine (MySQL in this case).
The "traceability" of the specific query. Can it be traced back to the rows? Seems so in your query, since the PK bannerNo is right there.

How do I setup a Symfony 3 entity that may occasionally need to break foreign key constraints?

For the purposes of my question lets assume that I have 'Users' table and a 'Referrals' table. The Referrals table has a user_id field which needs to be constrained to the Users.id so I can do something like:
$referral->getUser()->getId();
Here is where the issue has arisen before in the past. In some special cases like in the case of a Referral (a person who refers a customer to buy a product) there can be cases where no one referred a customer (user_id=0).
Trying to set up Symfony entities where a user_id in the referral table is 0 will cause doctrine to throw a fit and it wont allow the foreign key to be setup. Even if I bypass the fk constraints in MySQL in order to restore a backup later or whatever, it will always cause issues and besides, its not good database design.
However I have had this circumstance come up several times and was wondering if there is a way I could setup my Referral entity to basically say "Hey, give me User if there is one but otherwise return NULL?" I know this would mean not defining relationships in the entities but then how could/would you return a User?

MS Access 2013 query query criteria can't assess if value A is contained in value B string

Issue:
I am developing a simple issue tracking database and have hit a stumbling block that I’m not sure how to resolve. Have tried several approaches using queries, sql statement etc but still not working. I may have to rethink how I am doing this but hoping someone may be able to address the issue as it stands, though if a more elegant way of doing it happy to implement that.
Scenario:
A table called tblUsers has a field called Access that is a lookup to a table called tblCategory and allows for multiple values to be stored (one to many). In essence this is saying which category(s) of “issue” the user is allowed to
A simple msgbox test in code shows that this is correctly storing the values selected in the following format "1, 2, 3, 4"
In turn, each issue can only have a single category (one to one) which is stored in a field called Category in table tblGMPIssues and is also populated from a lookup to the tblCategory table.
So far so good ….
I then have a query called qryUserIssues that should show all issues from the table tblGMPIssues that are a) “Open” (status = 1) and that b) match any of the categories that the user is permitted to view.
I can get this to work with a single value i.e. as it stands query prompts for input and if you enter a single valid integer it returns expected results
But I can’t work out the syntax to get the criteria to accommodate multiple values. For example, in above scenario our user should be allowed to see 4 different category or calls “1, 2, 3, 4”
Tried using INNER joins, tried assigning to variables and using a LIKE criteria but can’t seem to get the syntax right.
If anyone could let me know if this can be done and if so how as it’s driving me nuts.
All help and suggestions gratefully received.
Updated relationship diagram --> 1
For precisely the reason that you've asked this question I would recommend never using the multi-select lookup option for columns in MS Access tables. Instead create an intersection table which tells you the combinations of values from the two main tables that are allowed. So instead of having the multi-select Access column in tblUsers, you should have a separate table called tblUserAccess with two columns (UserID and CategoryID). The two columns together will form a composite Primary Key for this table, and individually they will be Foreign Keys to tblUsers and tblCategory respectively. (You should do the same kind of thing with tblType - remove the Categories column and set up a separate table called tblTypeCategories).
Coming to your query, are you expecting this to show you all the relevant Issues for a particular user? At the moment, it is not doing this. The reason it is prompting you for input is because it doesn't understand ([tblUsers].[Access]) - tblUsers is not referenced in your query, and the query has no way of knowing which particular user you're interested in.
With your new table in place (and populated with the relevant data) you should add tblUserAccess to the query, joining tblGMPIssues.Category to tblUserAccess.CategoryID. Take the ([tblUsers].[Access]) condition off the Category column. Add the UserID column to the grid and set the criteria to [Input UserID]. Now when you run the query it will ask you for a user ID, and it should hopefully show you all the Issues that the given user can access.
Good luck!
First, I suggest you normalize your data a bit:
You have a number of tables that are reference data (e.g. tables tblStatus, tblSeverity, tblLocation). You have a s a primary key a (system generated) ID. That is wrong! The primary key of these should be their data, i.e. status, severity, location.
I can't see what the relationships are between the data. It should be one-to-many, mandatory (i.e. one Status can occur in many tblGMPIssues and a status is mandatory).
Your table tblType is unclear to me but it contains the categories. I am not familiar with the '-' before Categories followed by a Categories.Value but I assume an occurrence of tblType can contain exactly one Categories.Value. If not, then you must decompose this table.
If a User has access to a number of Categories, then there must be a many-to-many relationship betwen Users and Categories. From this relationship you do your select query, but I don't see this relationship.
Use following query to get any of the Category IDs 1, 2, 3 or 4
Select * from tblGMPIssues where tblGMPIssues.Category in (Select UserAccess from tblUserAccess)
I still have many problems with your relational design, or actually the lack of a proper relational design. As an example, below is a diagram from my Access 2007 showing a part of your database with a proper design. Access automatically shows that "one" and "many" symbols (which I don't see in your diagrams). I also show the relationship dialog with the proper fields checked. Note that none of the keys of any table, except tblIssue, has a system generated primary key. They are all plain text whch allows better understanding when inspecting the data and, as said, the database automaticlly updates child tables when the primary key value of a parent table changes.
Note table tblCategoryType: it implements a many-to-many relation between categories and types, meaning a category can be of zero or more types and a type can be in zero or more categories. In addition to "update cascades", this table has the "delete cascades" checkbox checked so if a category is deleted, all its relations with types are deleted (not the types).

ERD design pro and contra

I have one question regarding database design.
Here is the first example:
User may have a multiple Websites, and user can request specific resource for every of his websites. All requests are saved in RequestForResource table.
Now, if I want to see the name of an user who requested a resource, I have to join tables RequestForResource Website and table User.
To avoid this, I can make foreign key between RequestForResource and User table like it is demonstrated here:
Now, in order to get an user name, I have to join table RequestForResource and table User which is probably easier for SQL server, but at the other hand I have one foreign key more.
Which approach is better and (or) faster and why?
You can always duplicate information to gain execution speed. This is called: denormalisation. Yes, it will probably speed up the queries by lowering the required count of index seeks.
BUT
You have to write your code to make sure, that the data is consistent:
With the second design it is possible, to insert Website.User_idUser and a RequestForResource.User_idUser with different IDs for the same site! According to the design this is valid (but probably this will not satisfy your business rules).
Consider to update the foreign key constraint (or add a second one) which refers only to the Website table (User_idUser, Website_idWebsite) and remove the User-RequestForResource one.
Also consider to build a view to query your data with all the required info (probably with a clustered index).

Different database tables joining on single table

So imagine you have multiple tables in your database each with it's own structure and each with a PRIMARY KEY of it's own.
Now you want to have a Favorites table so that users can add items as favorites. Since there are multiple tables the first thing that comes in mind is to create one Favorites table per table:
Say you have a table called Posts with PRIMARY KEY (post_id) and you create a Post_Favorites with PRIMARY KEY (user_id, post_id)
This would probably be the simplest solution, but could it be possible to have one Favorites table joining across multiple tables?
I've though of the following as a possible solution:
Create a new table called Master with primary key (master_id). Add triggers on all tables in your database on insert, to generate a new master_id and write it along the row in your table. Also let's consider that we also write in the Master table, where the master_id has been used (on which table)
Now you can have one Favorites table with PRIMARY KEY (user_id, master_id)
You can select the Favorites table and join with each individual table on the master_id and get the the favorites per table. But would it be possible to get all the favorites with one query (maybe not a query, but a stored procedure?)
Do you think that this is a stupid approach? Since you will perform one query per table what are you gaining by having a single table?
What are your thoughts on the matter?
One way wold be to sub-type all possible tables to a generic super-type (Entity) and than link user preferences to that super-type. For example:
I think you're on the right track, but a table-based inheritance approach would be great here:
Create a table master_ids, with just one column: an int-identity primary key field called master_id.
On your other tables, (users as an example), change the user_id column from being an int-identity primary key to being just an int primary key. Next, make user_id a foreign key to master_ids.master_id.
This largely preserves data integrity. The only place you can trip up is if you have a master_id = 1, and with a user_id = 1 and a post_id = 1. For a given master_id, you should have only one entry across all tables. In this scenario you have no way of knowing whether master_id 1 refers to the user or to the post. A way to make sure this doesn't happen is to add a second column to the master_ids table, a type_id column. Type_id 1 can refer to users, type_id 2 can refer to posts, etc.. Then you are pretty much good.
Code "gymnastics" may be a bit necessary for inserts. If you're using a good ORM, it shouldn't be a problem. If not, stored procs for inserts are the way to go. But you're having your cake and eating it too.
I'm not sure I really understand the alternative you propose.
But in general, when given the choice of 1) "more tables" or 2) "a mega-table supported by a bunch of fancy code work" ..your interests are best served by more tables without the code gymnastics.
A Red Flag was "Add triggers on all tables in your database" each trigger fire is a performance hit of it's own.
The database designers have built in all kinds of technology to optimize tables/indexes, much of it behind the scenes without you knowing it. Just sit back and enjoy the ride.
Try these for inspiration Database Answers ..no affiliation to me.
An alternative to your approach might be to have the favorites table as user_id, object_id, object_type. When inserting in the favorites table just insert the type of the favorite. However i dont see a simple query being able to work with your approach or mine. One way to go about it might be to use UNION and get one combined resultset and then identify what type of record it is based on the type. Another thing you can do is, turn the UNION query into a MySQL VIEW and simply query that VIEW.
The benefit of using a single table for favorites is a simplicity, which some might consider as against the database normalization rules. But on the upside, you dont have to create so many favorites table and you can add anything to favorites easily by just coming up with a new object_type identifier.
It sounds like you have an is-a type relationship that needs to be modeled. All of the items that can be favourited are a type of "item". It sounds like you are on the right track, but I wouldn't use triggers. What could be the right answer if I have understood correctly, is to pull all the common fields into a single table called items (master is a poor name, master of what?), this should include all the common data that would be needed when you need a users favourite items, I'd expect this to include fields like item_id (primary key), item_type and human_readable_name and maybe some metadata about when the item was created, modified etc. Each of your specific item types would have its own table containing data specific to that item type with an item_id field that has a foreign key relationship to the item table. Then you'd wrap each item type in its own insertion, update and selection SPs (i.e. InsertItemCheese, UpdateItemMonkey, SelectItemCarKeys). The favourites table would then work as you describe, but you only need to select from the item table. If your app needs the specific data for each item type, it would have to be queried for each item (caching is your friend here).
If MySQL supports SPs with multiple result sets you could write one that outputs all the items as a result set, then a result set for each item type if you need all the specific item data in one go. For most cases I would not expect you to need all the data all the time.
Keep in mind that not EVERY use of a PK column needs a constraint. For example a logging table. Even though a logging table has a copy of the PK column from the table being logged, you can't build a constraint.
What would be the worst possible case. You insert a record for Oprah's TV show into the favorites table and then next year you delete the Oprah Show from the list of TV shows but don't delete that ID from the Favorites table? Will that break anything? Probably not. When you join favorites to TV shows that record will fall out of the result set.
There are a couple of ways to share values for PK's. Oracle has the advantage of sequences. If you don't have those you can add a "Step" to your Autonumber fields. There's always a risk though.
Say you think you'll never have more than 10 tables of "things which could be favored" Then start your PK's at 0 for the first table increment by 10, 1 for the second table increment by 10, 2 for the third... and so on. That will guarantee that all the values will be unique across those 10 tables. The risk is that a future requirement will add table 11. You can always 'pad' your guestimate