MySQL: how to do row-level security (like Oracle's Virtual Private Database)? - mysql

Say that I have vendors selling various products. So, at a basic level, I will have the following tables: vendor, product, vendor_product.
If vendor-1 adds Widget 1 to the product table, I want only vendor-1 to see that information (because that information is "owned" by vendor-1). Same goes for vendor-2. Say vendor-2 adds Widget 2, only vendor-2 should see that information.
If vendor-1 tries to add Widget 2, which was already entered by vendor-2, a duplicate entry for Widget 2 should not be made in the product table. This means that, somehow, I need to know that vendor-2 now also "owns" Widget 2.
A problem with having multiple "owners" of a piece of information is how to deal owners editing/deleting the data. Perhaps vendor-1 no longer wants Widget 2 to be available to him/her, but that doesn't necessarily apply for vendor-2.
Finally, I want the ability to flag(?) certain records as "yes, I have reviewed this data and it is correct" such that it then becomes available to all the vendors. Say I flag Widget 1 as good data, that product should now be seen by all vendors.
It seems that the solution is row level security. The problem is that I'm not too familiar with its concepts or how to implement it in MySQL. Any help is highly appreciated. Thanks.
NOTE: this problem is somewhat discussed here: Database Design: use composite key as FK, flag data for sharing?. When I asked the question, I wasn't sure how to phrase the question very well. Hopefully, I explained my problem better this time.

Mysql doesn't natively support row level security on tables. However, you can sort of implement it with views. So, just create a view on your table that exposes only the rows you want a given client to see. Then, only provide that client access to those views, and not the underlying tables.
See http://www.sqlmaestro.com/resources/all/row_level_security_mysql/

You already suggested a vendor, product and vendor_product mapping table. You want vendors to share the same product if they both want to use it, but you don't want duplicate products. Right?
If so, then define a unique index/constraint on the natural key that identifies a product (product name?).
If a vendor adds a product, and it doesn't exist, insert it into the product table, and map it to that vendor via the vendor_product table.
If the product already exists, but is mapped to another vendor, do not insert anything into the product table, and add another mapping row mapping the new vendor to the existing product (so that now the product is mapped to two vendors).
Finally, when a vendor removes a product, instead of actually removing it, just delete the vendor_product reference mapping the two. Finally, if no other vendors are still referencing a product, you can remove the product. Alternatively, you could run a script periodically that deletes all products that no longer have vendors referencing them.
Finally, have a flag on the product table that says that you've reviewed the product, and then use something like this to query for products viewable by a given vendor (we'll say vendor id 7):
select product.*
from product
left join vendor_map
on vendor_map.product_id = product.product_id
where vendor_map.vendor_id = 7
or product.reviewed = 1;
Finally, if a product is owned by multiple vendors, then you can either disallow edits or perhaps "split" the single product into a new unique product when one of the owning vendors tries to edit it, and allow them to edit their own copy of the product. They would likely need to modify the product name though, unless you come up with some other natural key to base your unique constraint on.

This sounds to me that you want to normalize your data. What you have is a 1 (product) to many (vendors) relationship. That the relationship is 1:1 for most cases and only 1:n for some doesn't really matter I would say - in general terms it's still 1:n and therefor you should design your database this way. The basic layout would probably be this:
Vendor Table
VendorId VendorName OtherVendorRelatedInformation
WidgetTable
WidgetId WidgetName WidgetFlag CreatorVendor OtherWidgetInformation
WidgetOwnerships
VendorId WidgetId OwnershipStatus OtherInformation
Update: The question of who is allowed to do what is a business problem so you need to have all the rules laid out. In the above structure you can flag which vendor created the widget. And in the ownership you can flag what the status of the ownership is, for example
CreatorFullOwnership
SharedOwnership
...
You would have to make up the flags based on your business rules and then design the business logic and data access part accordingly.

Related

ERD: how to store data for three different payment types

Let’s say a user has 3 different ways for payment. Cash, IBAN, and through providing certain_document. Each payment type requires its own different details.
How can I store this in my database?
Let’s say the user has chosen to pay using his IBAN, assuming this picture is the current database, do I fill the fields associated with the IBAN option and set the others to Null? Or is there a more professional way to store the data without having these Null values?
UPDATE
I found a solution to this problem in the answer to this question, however, the answer is still not sufficient. If anybody has a link to a more detailed documents please let me know.
As #philipxy noted, you're asking about representing inheritance in a RDBMS. There are a few different ways to do this:
Have all of your attributes in one table (which, based on your screenshot, is what you have now). With this approach, it would be best to store NULLs in non-applicable columns---if nothing else, the default settings for InnoDB tables uses the compact row format, which means NULL columns don't take up extra storage. Of course, your queries can get complex, and maintaining these tables can become cumbersome.
Have child tables to store your details:
Payments (PaymentID, PaymentDate, etc.)
CashPaymentDetails(PaymentID, Cash_Detail_1, Cash_Detail_2, etc.)
IBANPaymentDetails(PaymentID, IBAN_Detail_1, etc.)
You can get the information for each payment by joining the base payment table with one of the "subsidiary" tables:
SELECT *
FROM Payments P INNER JOIN CashPaymentDetails C ON
C.PaymentID = P.PaymentID
Your third option is to use the entity-attribute-value (EAV) model. Like with Option 2, you have a base Payment table. However, instead of having one table for each payment method, you have one subsidiary table that contains the payment details. For more information, here's the Wiki page, and here's a blog with some additional information.

Database Architecture Many-to-Many-to-Many

I have got an issue how to change a model of database:
For now we have predefined table Categories
and let's say tables Places and People which can be assigned to categories so it looks like this:
People <=> PeopleCategories <=> Categories <=> PlaceCategories <=> Places
(People can have many categories, categories can have many people, places can have many categories, categories can have many places)
But now there is a new requirement:
On person profile show all corresponding places based on categories (so far no problem) and add a tick box modeling some attribute (for example show on front-end as favorite place). The same from the other side on Place profile mark people assigned to at least one same category with a tick box.
I wonder whether there is some nice way to model this - the only thing which came to my mind is to add a new PeoplePlaces table but then I have to manually control whether people or places did not change their categories and they are still assigned and so on - There will be quite a problem with consistency of data which I will have to manage on application layer.
The second thing I could probably do is to delete categories totally and make it only on PeoplePlaces level but I will lose some simplicity for user: there are like 10 predefined categories which user can select so the linking between People and Places is quite automatic on front-end and only admin should see which places are assigned to which people and manage that tick box I was talking about
What would you suggest for this architecture? Thanks in advance! (It is a MySQL db if it is important for some kind of solution but this is more a general architecture thing)
If I understood your question correctly, you need to ensure that a person can only favor a place that is connected to the same category as the person herself?
If so, take a look at the following model:
We don't link the "endpoints" directly, and instead "link the links". This allows us to migrate PERSON_CATEGORY.CATEGORY_ID and PLACE_CATEGORY.CATEGORY_ID into the FAVORED_PLACE table, and "merge" them there, producing a single FAVORED_PLACE.CATEGORY_ID field (note FK1,FK2in the diagram above).
As a consequence, if a person is connected to a place, that must be done through a common category.
Furthermore, since CATEGORY_ID is outside PERSON_CATEGORY's PK, a particular combination of person and place can be used only once, even if they match through multiple categories. Effectively, you pick one common category as "special". If a place (or person) is removed from the special category, you'll need to pick another common category to serve as special. If there are no common categories left, the corresponding row in FAVORED_PLACE will not be allowed to exist anymore.
I don't think deleting Categories is a good idea.
What you are doing is introducing a new entity - PersonsFavouritePlaces - which relates People and Place directly rather than via a Category. It is sensible that a PersonsFavouritePlace be limited to a Person and a Place linked by Category, so it should probably reference PeopleCategories and PlaceCategories rather than the People and Category tables.
The table would look like:
create table PeopleFavourtiePlace
(
ID int not null, -- Primary key
PeopleCategoriesId int not null, -- FK to PK of PerpleCategories
PlaceCategoriesId int not null -- FK to PK of PlaceCategories
)
I don't know whether MySql supports cascading deletes, but if so the two FK's should have that turned on so when someone deselects a category (deleting the PeopleCategories row) if it linked to a favourite place in that category it too gets deleted.
However, if a person links to a place via multiple categories then it gets complicated....

Database Historization

We have a requirement in our application where we need to store references for later access.
Example: A user can commit an invoice at a time and all references(customer address, calculated amount of money, product descriptions) which this invoice contains and calculations should be stored over time.
We need to hold the references somehow but what if the e.g. the product name changes? So somehow we need to copy everything so its documented for later and not affected by changes in future. Even when products are deleted, they need to reviewed later when the invoice is stored.
What is the best practise here regarding database design? Even what is the most flexible approach e.g. when the user want to edit his invoice later and restore it from the db?
Thank you!
Here is one way to do it:
Essentially, we never modify or delete the existing data. We "modify" it by creating a new version. We "delete" it by setting the DELETED flag.
For example:
If product changes the price, we insert a new row into PRODUCT_VERSION while old orders are kept connected to the old PRODUCT_VERSION and the old price.
When buyer changes the address, we simply insert a new row in CUSTOMER_VERSION and link new orders to that, while keeping the old orders linked to the old version.
If product is deleted, we don't really delete it - we simply set the PRODUCT.DELETED flag, so all the orders historically made for that product stay in the database.
If customer is deleted (e.g. because (s)he requested to be unregistered), set the CUSTOMER.DELETED flag.
Caveats:
If product name needs to be unique, that can't be enforced declaratively in the model above. You'll either need to "promote" the NAME from PRODUCT_VERSION to PRODUCT, make it a key there and give-up ability to "evolve" product's name, or enforce uniqueness on only latest PRODUCT_VER (probably through triggers).
There is a potential problem with the customer's privacy. If a customer is deleted from the system, it may be desirable to physically remove its data from the database and just setting CUSTOMER.DELETED won't do that. If that's a concern, either blank-out the privacy-sensitive data in all the customer's versions, or alternatively disconnect existing orders from the real customer and reconnect them to a special "anonymous" customer, then physically delete all the customer versions.
This model uses a lot of identifying relationships. This leads to "fat" foreign keys and could be a bit of a storage problem since MySQL doesn't support leading-edge index compression (unlike, say, Oracle), but on the other hand InnoDB always clusters the data on PK and this clustering can be beneficial for performance. Also, JOINs are less necessary.
Equivalent model with non-identifying relationships and surrogate keys would look like this:
You could add a column in the product table indicating whether or not it is being sold. Then when the product is "deleted" you just set the flag so that it is no longer available as a new product, but you retain the data for future lookups.
To deal with name changes, you should be using ID's to refer to products rather than using the name directly.
You've opened up an eternal debate between the purist and practical approach.
From a normalization standpoint of your database, you "should" keep all the relevant data. In other words, say a product name changes, save the date of the change so that you could go back in time and rebuild your invoice with that product name, and all other data as it existed that day.
A "de"normalized approach is to view that invoice as a "moment in time", recording in the relevant tables data as it actually was that day. This approach lets you pull up that invoice without any dependancies at all, but you could never recreate that invoice from scratch.
The problem you're facing is, as I'm sure you know, a result of Database Normalization. One of the approaches to resolve this can be taken from Business Intelligence techniques - archiving the data ina de-normalized state in a Data Warehouse.
Normalized data:
Orders table
OrderId
CustomerId
Customers Table
CustomerId
Firstname
etc
Items table
ItemId
Itemname
ItemPrice
OrderDetails Table
ItemDetailId
OrderId
ItemId
ItemQty
etc
When queried and stored de-normalized, the data warehouse table looks like
OrderId
CustomerId
CustomerName
CustomerAddress
(other Customer Fields)
ItemDetailId
ItemId
ItemName
ItemPrice
(Other OrderDetail and Item Fields)
Typically, there is either some sort of scheduled job that pulls data from the normalized datas into the Data Warehouse on a scheduled basis, OR if your design allows, it could be done when an order reaches a certain status. (Such as shipped) It could be that the records are stored at each change of status (with a field called OrderStatus tacking the current status), so the fully de-normalized data is available for each step of the oprder/fulfillment process. When and how to archive the data into the warehouse will vary based on your needs.
There is a lot of overhead involved in the above, but the other common approach I'm aware of carries even MORE overhead.
The other approach would be to make the tables read-only. If a customer wants to change their address, you don't edit their existing address, you insert a new record.
So if my address is AddressId 12 when I first order on your site in Jamnuary, then I move on July 4, I get a new AddressId tied to my account. (Say AddressId 123123 because your site is very successful and has attracted a ton of customers.)
Orders I palced before July 4 would have AddressId 12 associated with them, and orders placed on or after July 4 have AddressId 123123.
Repeat that pattern with every table that needs to retain historical data.
I do have a third approach, but searching it is difficult. I use this in one app only, and it actually works out pretty well in this single instance, which had some pretty specific business needs for reconstructing the data exactly as it was at a specific point in time. I wouldn't use it unless I had similar business needs.
At a specific status, serialize the data into an Xml document, or some other document you can use to reconstruct the data. This allows you to save the data as it was at the time it was serialized, retaining original table structure and relaitons.
When you have time-sensitive data, you use things like the product and Customer tables as lookup tables and store the information directly in your Orders/orderdetails tables.
So the order table might contain the customer name and address, the details woudl contain all relevant information about the produtct including especially price(you never want to rely on the product table for price information beyond the intial lookup at teh time of the order).
This is NOT denormalizing, the data changes over time but you need the historical value, so you must store it at the time the record is created or you will lose data intergrity. You don't want your financial reports to suddenly indicate you sold 30% more last year because you have price updates. That's not what you sold.

Database design issue - trying to avoid circular reference

I have been at this for a day and a bit now, trying to figure out how to best model the database (MySQL) for an app I'm developing for a friend who owns a bakery. The assumptions are as follows:
many (external) Bakers produce many Products
BakersProducts is updated fortnightly by certain staff who either call bakers for their product prices, or the bakers fax through their pricelist themselves, which the staff then update via a front-end UI.
the manager should be able to generate an order based on the products that she anticipates having.
So the front-end UI must be able to allow the manager to purely choose the products she would like in the order, and then present her with a list of Bakers to choose from for each product in the order.
In other words, Orders_has_Products should also include a reference to BakersProducts.bpID. I'm sure though that if I do this, then I would create a circular reference (of sort) to Products.
Im sure I've gone about this the wrong way, and would really appreciate anyone's advice as to how I can restructure my design to acccommodate the chosen Product Price - ie. to include BakersProducts.bpID.
Thank you!
This is not a circular reference, since
Order_has_products references Products
Order_has_products references BakersProducts
BakersProducts references Products
a circular reference would be if, for example,
Order_has_products references Products
Products references BakersProducts
BakersProducts references Order_has_products
Aside from that, circular references are relatively normal in a database (i.e. Employees table with a manager field, where the manager is herself an employee is a one table circular reference)
What your design has is a simple redundancy, because one product is referenced twice in the Order_has_products table - once directly from the Products table, and once via the related BakersProducts record. There is posibility for getting out of sync, but, since you stated that the business rule is that the product is chosen before the baker, it's quite all right.
I would include the productID even if it was the other way around, because a little denormalization can go a long way when speeding up queries, because otherwise you would have to scan the BakersProducts table, even for simle questions, like, 'Did we have any bagels on wednesday?'
I think the mix-up is from a business process standpoint: you're getting requisitions mixed up with orders.
A requisition has a list of products needed without necessarily specifying the supplier of each, whereas an order is directed at a specific supplier, for specific price look-up codes (what bpID seems to represent). One requisition may spawn multiple orders if it is split across multiple suppliers, and even a single product may have its order split across multiple suppliers, perhaps due to vendor volume limits or locality of delivery.
You may want to provide a view of a requisition that shows the order line item(s) generated from each requisition line item, but that is a user interface concern.
one way of solving this is simply to eliminate the Products table, and move productName into the BakersProducts table.
This would essentially only work if you do not expect the bakers to carry the same product, if the products are unique to the bakers.
If you do expect the bakers to carry the same product, then you may want to leave the separate Products table, but instead of having Order_has_Products.Products_productID, I would change it to Order_has_Products.bpID. If/when you need to access the productName (or other product related metadata that may go in that table) you could just do a join between BakersProducts and Products.

Inventory Architecture in Database

This is not a question related to a specific language, rather on the correct methodology of architectural of handling inventory.
Consider the following structure for storing software:
platforms (platformID*, platformName)
titles (titleID*, titleName)
And the following is a joiner table for unique products
products (platformID*, titleID*, releaseDate, notes)
And the following is a table I would like to keep track of my inventory
inventory (platformID*, titleID*, quantityAvailable)
In examples I have seen, others have created a table where each unique copy of a software is stored in a separate line as such:
software(softwareID*, softwareTitle)
inventory(inventoryID*, softwareID*)
I would like to know which approach is preferable? Should I create an inventoryID and thus a row for each unique software copy:
inventory(inventoryID*, platformID*(fk), titleID*(fk))
Or use a table that stores quantity instead, as such:
inventory(platformID*(fk), titleID*(fk), quantityAvailable)
The advantage of having a unique row for each piece of inventory is that later if you want to keep track of things like inventory that's on layover, inventory that's on preorder, inventory that's been sold but could still be returned, etc.
I don't see any real disadvantage to this approach except that it's probably more work which might not pay off if these things aren't really needed.
I also would start with quantityAvailable instead of lines for all the items. But I would still opt for an inventoryId, since cases could occur, where you have to dissect the entries with the same platform/title combination -- with an inventoryId you are more enhanceable in the future.
I would also recommend to add a further column: versionNo --- the version number of a software product. Sometimes you might have differing versions of the same product. When you have this, it is not a good idea to drop the information into the title (for example you want to search for all "Microsoft Office" products regardless of version ...).
Should I create an inventoryID and
thus a row for each unique software
copy?
There is no reason to do this, unless you want to store some information on each unique software copy, such as the date each copy was purchased. This is rarely pracitical in the inventory of software.
Or use a table that stores quantity
instead?
You can also consider adding a quantityAvailable column in your products table, unless you think that eventually you'd want to have many inventories for each title, in order to be able to allocate a quantity of stock that is under special offer, that is soon going to expire, etc.