This is not a programming question at all. Let me explain: I am creating a game, for which I need a database that will store users' registration data (e.g. username, email, password). If a user wins, he/she will earn Cash Points. The user will be able to exchange his/her Cash Points for real money, so I consider the Cash Points to be very critical data. My question is: would you store the Cash Points data in the same "users" table? Or would you create a new table named "cash" (for instance) and store it there? (From a security point of view.)
Thanks
It's best if you implement a simple ledger system whereby transactions are recorded against the user's account as credit or debits, and the account itself has a total that can be audited.
You must keep a record of transactions performed if you're involving cash or cash-like currency. If someone complains about missing money you need to be able to verify every transaction that affected their balance and uncover any discrepancies.
This also presumes you're using database transactions so that a partial update is never committed. The balance adjustment and the transaction record should be part of the same database transaction.
As always, test this as ruthlessly as you can.
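A minimal sketch of the ledger idea, assuming SQLite through Python's standard sqlite3 module. The table names (accounts, ledger), the column names and the helper functions are all made up for illustration, and Cash Points are assumed to be whole numbers. Each credit or debit is inserted as a ledger row and applied to the account total inside one database transaction, so a failed balance update also undoes the ledger row.

```python
import sqlite3

# Hypothetical schema: one running total per user plus one row per credit/debit.
SCHEMA = """
CREATE TABLE accounts (
    user_id INTEGER PRIMARY KEY,
    balance INTEGER NOT NULL DEFAULT 0 CHECK (balance >= 0)
);
CREATE TABLE ledger (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id     INTEGER NOT NULL REFERENCES accounts(user_id),
    amount      INTEGER NOT NULL,   -- positive = credit, negative = debit
    description TEXT NOT NULL,
    created_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def create_account(conn, user_id):
    with conn:
        conn.execute("INSERT INTO accounts (user_id) VALUES (?)", (user_id,))


def post_entry(conn, user_id, amount, description):
    """Record one credit or debit and adjust the total in a single transaction."""
    with conn:  # commits on success, rolls back both statements on any error
        conn.execute(
            "INSERT INTO ledger (user_id, amount, description) VALUES (?, ?, ?)",
            (user_id, amount, description),
        )
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE user_id = ?",
            (amount, user_id),
        )


def get_balance(conn, user_id):
    (balance,) = conn.execute(
        "SELECT balance FROM accounts WHERE user_id = ?", (user_id,)
    ).fetchone()
    return balance


def audit(conn, user_id):
    """Return True if the stored total equals the sum of the ledger entries."""
    (summed,) = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM ledger WHERE user_id = ?", (user_id,)
    ).fetchone()
    return summed == get_balance(conn, user_id)
```

If someone disputes their balance, audit() plus a listing of their ledger rows gives you every transaction that affected it. The CHECK constraint is just one way to refuse overdrafts; a debit larger than the balance raises sqlite3.IntegrityError and leaves no ledger row behind.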
It is considered bad design if you store cash points in the users table. Tables should be normalized. You should store cash points in a separate table and use the userId as the foreign key in that table. You could look into encrypting Cash Points table data as well.
Cash Points definitely belong in a separate table, but not for security reasons. It's better from a design perspective and will allow you to keep a log of Cash Point changes for each user.
Well, you should create a database design that resembles a bank balance. That way you can keep track of all changes, for example:
create table balance
(id int,
debit numeric (10,2),
credit numeric (10,2),
balance_before numeric(10,2),
balance_after numeric(10,2),
timestamp datetime,
user_id int,
description varchar(32),
...
);
Related
I looked over some of the other answers here and they did not answer the question I have.
How would I go about this problem. If an invoice is created with the following database schema:
id
customer_id
invoice_date
status
The invoice is created, paid etc. If the customer information changes, the invoice information will change too, since it's a FK. How would I design the invoice table to record the customer's information at that given moment in time, so that if the customer changes their address in the future, for example, the past invoices keep the previous address?
Just wanted to see if it makes sense to copy the customer information (address etc.) into the invoice table, although then the database would no longer be normalized.
Please let me know your thoughts on this challenge. Invoices are one example; this also applies to purchase orders created in the past, which should keep the information that was recorded at the time no matter what has changed since.
There are different considerations for an order management system or a reporting system.
Within an application, data duplication may prove cumbersome at volume. Will prove cumbersome at high enough volume. So you'll want to be as normalized as possible in that environment.
For reporting and history, though, it would be preferable (maybe mandatory depending on your industry and locale) to have the address info for each invoice.
It would be better to have an address table that joins to your customer table. The addresses would each have their own id field, and then you could just reference that id in the invoice table, too. Probably both the billing address and the shipping address if they're both in play in your business model.
If you don't have the flexibility to introduce a standalone address table, then copying the address information over to the invoice table becomes a necessary evil, though it is dead useful information to have.
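Here is a rough sketch of the address-table idea, assuming SQLite via Python's sqlite3 module. All table, column and function names are hypothetical, and it adds one assumption the answer does not spell out: an address row is never edited, a change of address inserts a new row, so old invoices keep pointing at the old one.

```python
import sqlite3


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE customers (
        id                 INTEGER PRIMARY KEY,
        name               TEXT NOT NULL,
        current_address_id INTEGER
    );
    CREATE TABLE addresses (
        id          INTEGER PRIMARY KEY AUTOINCREMENT,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        street      TEXT NOT NULL,
        city        TEXT NOT NULL
    );
    CREATE TABLE invoices (
        id                 INTEGER PRIMARY KEY AUTOINCREMENT,
        customer_id        INTEGER NOT NULL REFERENCES customers(id),
        billing_address_id INTEGER NOT NULL REFERENCES addresses(id),
        invoice_date       TEXT NOT NULL,
        status             TEXT NOT NULL
    );
    """)
    return conn


def set_address(conn, customer_id, street, city):
    """Insert a new address row instead of overwriting the old one."""
    with conn:
        cur = conn.execute(
            "INSERT INTO addresses (customer_id, street, city) VALUES (?, ?, ?)",
            (customer_id, street, city),
        )
        conn.execute(
            "UPDATE customers SET current_address_id = ? WHERE id = ?",
            (cur.lastrowid, customer_id),
        )
    return cur.lastrowid


def create_invoice(conn, customer_id, invoice_date):
    """Pin the invoice to whichever address is current right now."""
    with conn:
        (address_id,) = conn.execute(
            "SELECT current_address_id FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        cur = conn.execute(
            "INSERT INTO invoices (customer_id, billing_address_id, invoice_date, status)"
            " VALUES (?, ?, ?, 'created')",
            (customer_id, address_id, invoice_date),
        )
    return cur.lastrowid


def invoice_address(conn, invoice_id):
    return conn.execute(
        "SELECT a.street, a.city FROM invoices i"
        " JOIN addresses a ON a.id = i.billing_address_id WHERE i.id = ?",
        (invoice_id,),
    ).fetchone()
```

A shipping address could be handled the same way with a second foreign key on the invoice.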
I'm a bit of a newbie with databases and database design, but I'm hoping someone can point me in the right direction. I currently have 14 monthly loan extracts, each of which contains all accounts, their status, balance and customer contact info as of month end. Not knowing what to do, I imported each of the monthly files into Access with each table acting more like a tab from an Excel workbook. Laugh away - I now know that's not how it's supposed to work.
I've done my homework and I understand how to split up part of my data into Customer and Account tables, but what do I do with the account balances? My thought is to create a Balances table, create a relationship to the Accounts table and create columns for each month. This seems logical, but is it the best way?
99% of my analysis involves trend reporting and other ad hoc tasks - tracking the total balances by product type over time given other criteria, such as credit score or age. My intended use is to create queries to select the data I need and connect to it via Get & Transform in Excel for final manipulation and report writing.
This also begs the question "how normalized should my new database be?" Each monthly extract is cumulative, so a good 75% of my data is redundant contact info already, but how normalized should I go?
Sorry for ranting, but if anyone has any experience in setting up their own historical database or could point me in a direction that will get me on track, I would appreciate it.
Best practice for transactional systems is close to what you expect:
1. Create a Customer table
2. Create an Account table
3. Create an Account Balance table
4. Create relationships from the Account to Customer, and from the Account Balance to the Account table.
You can create a column for each month, provided you have Year as part of the key of the Account Balance table. Even better would be to have the key for the Account Balance be Account ID and Date.
However, since you are performing analytics over the data, a de-normalized approach is not only acceptable -- it is preferable. So yes, you can (and perhaps should, based upon your use cases) put all the data into one big flat table and then compile your analytics.
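To make the "Account ID and Date" key concrete, here is a small sketch using Python's sqlite3 module in place of Access. The table and column names, the product_type column and the choice to store balances as integer cents are assumptions for illustration only. Each monthly extract adds one row per account, and the trend report becomes a single GROUP BY query.

```python
import sqlite3


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE account (
        account_id   INTEGER PRIMARY KEY,
        customer_id  INTEGER NOT NULL REFERENCES customer(customer_id),
        product_type TEXT NOT NULL
    );
    CREATE TABLE account_balance (
        account_id    INTEGER NOT NULL REFERENCES account(account_id),
        as_of_date    TEXT NOT NULL,       -- month end in ISO format
        balance_cents INTEGER NOT NULL,
        PRIMARY KEY (account_id, as_of_date)  -- one balance per account per month
    );
    """)
    return conn


def load_extract(conn, as_of_date, balances):
    """Load one monthly extract; balances is a list of (account_id, balance_cents)."""
    with conn:
        conn.executemany(
            "INSERT INTO account_balance (account_id, as_of_date, balance_cents)"
            " VALUES (?, ?, ?)",
            [(account_id, as_of_date, cents) for account_id, cents in balances],
        )


def totals_by_product(conn):
    """Total balance per product type for every month end, oldest first."""
    return conn.execute("""
        SELECT b.as_of_date, a.product_type, SUM(b.balance_cents)
        FROM account_balance b
        JOIN account a ON a.account_id = b.account_id
        GROUP BY b.as_of_date, a.product_type
        ORDER BY b.as_of_date, a.product_type
    """).fetchall()
```

A query like totals_by_product is the kind of thing you could save and then pull into Excel; the composite key also stops the same extract from being loaded twice.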
I want to keep track of each User's current balance and balance history using the Django ORM. I imagine 2 tables (User and History) with a one-to-many between User and History representing a user's entire history, and a one-to-one between User and History for easy access to the current balance:
History
ID | User (FK to User) | Delta | Balance | Timestamp
User
ID | Name | Employee | Year | Balance (FK to History)
1) Does this seem reasonable given that I'm using the Django ORM? I think with raw SQL or another ORM, I could give history a start and stop date, then easily get the latest with SELECT * FROM History WHERE user_id=[id] AND stop IS NULL;.
2) Should History have a balance column?
3) Should User have a balance column (I could always compute the balance on the fly)? If so, should it be a "cached" decimal value? Or should it be a foreign key to the latest balance?
A strictly normal approach would say that neither table should contain a balance column, and that users' balances should be calculated when required from the sum of all their history. However, you may find that such a schema results in unacceptable performance, in which case caching the balance would be sensible:
if you're mostly interested in the current balance, then there's little reason to cache balances in the History table (just cache the current balance in the User table alone);
on the other hand, if you might be interested in arbitrary historical balances, then storing historical balances in the History table would make sense (and then there'd be little point in also storing the current balance in the User table, since that could easily be discovered from the most recent History record).
But perhaps it's not worth worrying about caching right now? Bear in mind the mantra "normalise until it hurts; denormalise until it works" as well as Knuth's famous maxim "premature optimisation is the root of all evil".
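For what it's worth, the uncached variant is only a couple of queries. The sketch below uses Python's sqlite3 module rather than the Django ORM so that it runs on its own; the table and column names are assumptions loosely based on the question, and deltas are assumed to be integers.

```python
import sqlite3


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE users (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE history (
        id        INTEGER PRIMARY KEY AUTOINCREMENT,
        user_id   INTEGER NOT NULL REFERENCES users(id),
        delta     INTEGER NOT NULL,
        timestamp TEXT NOT NULL          -- ISO format, so strings sort by time
    );
    CREATE INDEX history_user_time ON history (user_id, timestamp);
    """)
    return conn


def add_delta(conn, user_id, delta, timestamp):
    with conn:
        conn.execute(
            "INSERT INTO history (user_id, delta, timestamp) VALUES (?, ?, ?)",
            (user_id, delta, timestamp),
        )


def current_balance(conn, user_id):
    """No balance column anywhere: the balance is the sum of all deltas."""
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(delta), 0) FROM history WHERE user_id = ?",
        (user_id,),
    ).fetchone()
    return total


def balance_as_of(conn, user_id, timestamp):
    """Historical balance: sum only the deltas up to the given moment."""
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(delta), 0) FROM history"
        " WHERE user_id = ? AND timestamp <= ?",
        (user_id, timestamp),
    ).fetchone()
    return total
```

If these sums ever become too slow, that is the point at which a cached balance column starts to earn its keep.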
We have a requirement in our application where we need to store references for later access.
Example: A user can commit an invoice at some point in time, and everything this invoice references (customer address, calculated amount of money, product descriptions), along with its calculations, should be preserved over time.
We need to hold the references somehow, but what if e.g. the product name changes? So somehow we need to copy everything so it's documented for later and not affected by future changes. Even when products are deleted, they need to be available for review later through the stored invoice.
What is the best practice here regarding database design? And what is the most flexible approach, e.g. when the user wants to edit their invoice later and restore it from the db?
Thank you!
Here is one way to do it:
Essentially, we never modify or delete the existing data. We "modify" it by creating a new version. We "delete" it by setting the DELETED flag.
For example:
If product changes the price, we insert a new row into PRODUCT_VERSION while old orders are kept connected to the old PRODUCT_VERSION and the old price.
When buyer changes the address, we simply insert a new row in CUSTOMER_VERSION and link new orders to that, while keeping the old orders linked to the old version.
If product is deleted, we don't really delete it - we simply set the PRODUCT.DELETED flag, so all the orders historically made for that product stay in the database.
If customer is deleted (e.g. because (s)he requested to be unregistered), set the CUSTOMER.DELETED flag.
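A hedged sketch of how the product half of this might look, using SQLite through Python's sqlite3 module. The table names follow the ones mentioned above, but the columns, the order_item table and the helper functions are assumptions, since only the table names appear in the text.

```python
import sqlite3


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
    CREATE TABLE product (
        product_id INTEGER PRIMARY KEY,
        deleted    INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE product_version (
        product_id  INTEGER NOT NULL REFERENCES product(product_id),
        version_no  INTEGER NOT NULL,
        name        TEXT NOT NULL,
        price_cents INTEGER NOT NULL,
        PRIMARY KEY (product_id, version_no)
    );
    CREATE TABLE order_item (
        order_item_id INTEGER PRIMARY KEY AUTOINCREMENT,
        product_id    INTEGER NOT NULL,
        version_no    INTEGER NOT NULL,
        FOREIGN KEY (product_id, version_no)
            REFERENCES product_version (product_id, version_no)
    );
    """)
    return conn


def add_version(conn, product_id, name, price_cents):
    """'Modify' a product by inserting a new version; old rows stay untouched."""
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO product (product_id) VALUES (?)", (product_id,)
        )
        (next_no,) = conn.execute(
            "SELECT COALESCE(MAX(version_no), 0) + 1 FROM product_version"
            " WHERE product_id = ?",
            (product_id,),
        ).fetchone()
        conn.execute(
            "INSERT INTO product_version VALUES (?, ?, ?, ?)",
            (product_id, next_no, name, price_cents),
        )
    return next_no


def soft_delete(conn, product_id):
    """'Delete' a product by setting the flag, keeping its history."""
    with conn:
        conn.execute(
            "UPDATE product SET deleted = 1 WHERE product_id = ?", (product_id,)
        )


def place_order(conn, product_id):
    """Link a new order item to the latest version of a non-deleted product."""
    with conn:
        (latest,) = conn.execute(
            "SELECT MAX(v.version_no) FROM product_version v"
            " JOIN product p ON p.product_id = v.product_id"
            " WHERE v.product_id = ? AND p.deleted = 0",
            (product_id,),
        ).fetchone()
        if latest is None:
            raise ValueError("product is unknown or deleted")
        cur = conn.execute(
            "INSERT INTO order_item (product_id, version_no) VALUES (?, ?)",
            (product_id, latest),
        )
    return cur.lastrowid


def ordered_price(conn, order_item_id):
    (price,) = conn.execute(
        "SELECT v.price_cents FROM order_item o"
        " JOIN product_version v"
        " ON v.product_id = o.product_id AND v.version_no = o.version_no"
        " WHERE o.order_item_id = ?",
        (order_item_id,),
    ).fetchone()
    return price
```

The customer side would follow the same pattern with CUSTOMER and CUSTOMER_VERSION.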
Caveats:
If the product name needs to be unique, that can't be enforced declaratively in the model above. You'll either need to "promote" the NAME from PRODUCT_VERSION to PRODUCT, make it a key there and give up the ability to "evolve" the product's name, or enforce uniqueness on only the latest PRODUCT_VERSION (probably through triggers).
There is a potential problem with the customer's privacy. If a customer is deleted from the system, it may be desirable to physically remove its data from the database and just setting CUSTOMER.DELETED won't do that. If that's a concern, either blank-out the privacy-sensitive data in all the customer's versions, or alternatively disconnect existing orders from the real customer and reconnect them to a special "anonymous" customer, then physically delete all the customer versions.
This model uses a lot of identifying relationships. This leads to "fat" foreign keys and could be a bit of a storage problem since MySQL doesn't support leading-edge index compression (unlike, say, Oracle), but on the other hand InnoDB always clusters the data on PK and this clustering can be beneficial for performance. Also, JOINs are less necessary.
Equivalent model with non-identifying relationships and surrogate keys would look like this:
You could add a column in the product table indicating whether or not it is being sold. Then when the product is "deleted" you just set the flag so that it is no longer available as a new product, but you retain the data for future lookups.
To deal with name changes, you should be using ID's to refer to products rather than using the name directly.
You've opened up an eternal debate between the purist and practical approach.
From a normalization standpoint of your database, you "should" keep all the relevant data. In other words, say a product name changes, save the date of the change so that you could go back in time and rebuild your invoice with that product name, and all other data as it existed that day.
A "de"normalized approach is to view that invoice as a "moment in time", recording in the relevant tables data as it actually was that day. This approach lets you pull up that invoice without any dependencies at all, but you could never recreate that invoice from scratch.
The problem you're facing is, as I'm sure you know, a result of Database Normalization. One of the approaches to resolve this can be taken from Business Intelligence techniques - archiving the data in a de-normalized state in a Data Warehouse.
Normalized data:
Orders table: OrderId, CustomerId
Customers table: CustomerId, Firstname, etc
Items table: ItemId, Itemname, ItemPrice
OrderDetails table: ItemDetailId, OrderId, ItemId, ItemQty, etc
When queried and stored de-normalized, the data warehouse table looks like
OrderId
CustomerId
CustomerName
CustomerAddress
(other Customer Fields)
ItemDetailId
ItemId
ItemName
ItemPrice
(Other OrderDetail and Item Fields)
Typically, there is either some sort of scheduled job that pulls data from the normalized tables into the Data Warehouse, OR if your design allows, it could be done when an order reaches a certain status (such as shipped). It could be that the records are stored at each change of status (with a field called OrderStatus tracking the current status), so the fully de-normalized data is available for each step of the order/fulfillment process. When and how to archive the data into the warehouse will vary based on your needs.
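As a rough illustration of the archiving step, here is a sketch using Python's sqlite3 module. The table and column names follow the lists above where they exist; the OrderArchive table name, the Address column and the archive_order function are made up for the example. The archive row is filled by one INSERT ... SELECT over the joined tables, so it holds the values as they were at the moment of archiving.

```python
import sqlite3


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE Customers (
        CustomerId INTEGER PRIMARY KEY, Firstname TEXT, Address TEXT
    );
    CREATE TABLE Items (
        ItemId INTEGER PRIMARY KEY, ItemName TEXT, ItemPrice INTEGER
    );
    CREATE TABLE Orders (
        OrderId INTEGER PRIMARY KEY, CustomerId INTEGER, OrderStatus TEXT
    );
    CREATE TABLE OrderDetails (
        ItemDetailId INTEGER PRIMARY KEY, OrderId INTEGER,
        ItemId INTEGER, ItemQty INTEGER
    );
    -- De-normalized copy: one row per order line, no foreign keys needed.
    CREATE TABLE OrderArchive (
        OrderId INTEGER, OrderStatus TEXT,
        CustomerId INTEGER, CustomerName TEXT, CustomerAddress TEXT,
        ItemDetailId INTEGER, ItemId INTEGER, ItemName TEXT,
        ItemPrice INTEGER, ItemQty INTEGER
    );
    """)
    return conn


def archive_order(conn, order_id):
    """Copy the order, with customer and item details, as they are right now."""
    with conn:
        conn.execute("""
            INSERT INTO OrderArchive
            SELECT o.OrderId, o.OrderStatus,
                   c.CustomerId, c.Firstname, c.Address,
                   d.ItemDetailId, i.ItemId, i.ItemName, i.ItemPrice, d.ItemQty
            FROM Orders o
            JOIN Customers c ON c.CustomerId = o.CustomerId
            JOIN OrderDetails d ON d.OrderId = o.OrderId
            JOIN Items i ON i.ItemId = d.ItemId
            WHERE o.OrderId = ?
        """, (order_id,))
```

Calling archive_order from the code path that sets the status to shipped, or from a scheduled job that looks for newly shipped orders, would both fit the description above.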
There is a lot of overhead involved in the above, but the other common approach I'm aware of carries even MORE overhead.
The other approach would be to make the tables read-only. If a customer wants to change their address, you don't edit their existing address, you insert a new record.
So if my address is AddressId 12 when I first order on your site in January, then I move on July 4, I get a new AddressId tied to my account. (Say AddressId 123123 because your site is very successful and has attracted a ton of customers.)
Orders I placed before July 4 would have AddressId 12 associated with them, and orders placed on or after July 4 have AddressId 123123.
Repeat that pattern with every table that needs to retain historical data.
I do have a third approach, but searching it is difficult. I use this in one app only, and it actually works out pretty well in this single instance, which had some pretty specific business needs for reconstructing the data exactly as it was at a specific point in time. I wouldn't use it unless I had similar business needs.
At a specific status, serialize the data into an XML document, or some other document you can use to reconstruct the data. This allows you to save the data as it was at the time it was serialized, retaining original table structure and relations.
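A tiny sketch of the serialization idea, using JSON from Python's standard library instead of XML (the "some other document" case). The function names and the dictionary layout are assumptions; in practice the resulting string would be stored in a text column next to the order.

```python
import json


def snapshot_order(order, customer, lines):
    """Serialize the order exactly as it looks right now."""
    document = {"order": order, "customer": customer, "lines": lines}
    # sort_keys makes the stored text stable, which helps when comparing snapshots
    return json.dumps(document, sort_keys=True)


def restore_order(serialized):
    """Rebuild the order data from a stored snapshot."""
    return json.loads(serialized)
```

Because the snapshot is plain text, later edits to the customer or product rows cannot change it, which is the whole point; the trade-off mentioned above also shows here, since you cannot easily search inside it with SQL.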
When you have time-sensitive data, you use things like the product and Customer tables as lookup tables and store the information directly in your Orders/orderdetails tables.
So the order table might contain the customer name and address, and the details would contain all relevant information about the product, especially the price (you never want to rely on the product table for price information beyond the initial lookup at the time of the order).
This is NOT denormalizing: the data changes over time but you need the historical value, so you must store it at the time the record is created or you will lose data integrity. You don't want your financial reports to suddenly indicate you sold 30% more last year because you have had price updates. That's not what you sold.
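To show the effect on reporting, here is a small sketch with Python's sqlite3 module. The table, column and function names are assumptions, and prices are stored as integer cents. The product table is only consulted once, when the order line is created; the total is always computed from the values stored on the line itself.

```python
import sqlite3


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE products (
        id          INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        price_cents INTEGER NOT NULL
    );
    CREATE TABLE order_details (
        id               INTEGER PRIMARY KEY AUTOINCREMENT,
        order_id         INTEGER NOT NULL,
        product_id       INTEGER NOT NULL REFERENCES products(id),
        product_name     TEXT NOT NULL,     -- copied at order time
        unit_price_cents INTEGER NOT NULL,  -- copied at order time
        quantity         INTEGER NOT NULL
    );
    """)
    return conn


def add_line(conn, order_id, product_id, quantity):
    """Look the product up once and store its name and price on the line."""
    with conn:
        name, price = conn.execute(
            "SELECT name, price_cents FROM products WHERE id = ?", (product_id,)
        ).fetchone()
        conn.execute(
            "INSERT INTO order_details"
            " (order_id, product_id, product_name, unit_price_cents, quantity)"
            " VALUES (?, ?, ?, ?, ?)",
            (order_id, product_id, name, price, quantity),
        )


def order_total(conn, order_id):
    """Total from the stored line prices, never from the products table."""
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(unit_price_cents * quantity), 0)"
        " FROM order_details WHERE order_id = ?",
        (order_id,),
    ).fetchone()
    return total
```

With this layout a later price increase affects new orders only, so last year's sales figures stay what they were.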
Good Day,
I'm currently designing the database structure for a website of mine. I need community assistance in one aspect only, as I have never done anything similar.
The website will include three types of payments:
Internal payments (Escrow kind payments). User can send payment to another user.
Deposits. Users add fund to their accounts.
Withdrawal. User can request a withdrawal. Money will be sent to their bank/PayPal account.
Basically, I need some tips to get the best design possible.
Here's what I'm thinking about:
deposits - this table will store info about deposits
deposits_data - this table will store info about deposit transaction (ex. data returned by PayPal IPN)
payments - table to store internal payments
withdrawals - table to store info about withdrawal request
transactions - table to store info about ALL transactions (with ENUM() field called type with following values possible: internal, deposit, withdrawal)
Please note that I have already a table logs to store every user action.
Unfortunately, I feel that my design approach is not the best possible in this aspect. Can you share some ideas/tips?
PS. Can I use a name "escrow" for internal payments or should I choose different name?
Edit
The DEPOSITS, PAYMENTS and WITHDRAWALS tables store specific transaction details. The TRANSACTIONS table stores only limited info - it's a kind of logs table - with a details field (which contains text to display in the user log section, ex: "User 1 sent you a payment for something").
Of course I have users tables, etc.
> Can I use a name "escrow" for internal payments or should I choose different name?
Escrow has a specific financial/legal meaning, which is different from how you seem to mean it: "a written agreement (or property or money) delivered to a third party or put in trust by one party to a contract to be returned after fulfillment of some condition" (source)
So choosing a different name seems like a good idea.
As for design, what data will DEPOSITS, PAYMENTS and WITHDRAWALS store which TRANSACTIONS won't? Also, you need an ACCOUNTS table. Or are you planning to just use your existing USERS table (I presume you have such a thing)? You probably ought to have something for external parties, even if you only intend to support PayPal for the time being.