Three tables -- Use cascade relationships or add extras? - mysql

I am a PHP developer with only basic knowledge of MySQL, but I am working on a personal project and need to dig into it.
Billing details will be stored mimicking Stripe objects. However, I am wondering whether I should rely on keys in cascade (each table pointing only to its parent) or put the key on each table.
Let me explain through (a simplified) example. A user has a subscription. Each subscription raises an invoice a month. Each invoice raises one charge (or several if some failed.)
table_user
- user_id
- username
- password
table_subscription
- sub_id
- start_date
- end_date
- amount
- user_id
table_invoice
- invoice_id
- period_start
- period_end
- amount
- paid
- subscription_id
table_charge
- charge_id
- amount
- status
- failure_code
- failure_reason
- invoice_id
My point is that I want to quickly list each user's charges and invoices. One charge/invoice belongs to one user. Should I put the user_id key on just the subscription table (as charges and invoices are linked to it), or should I still add user_id to both the invoice and charge tables?
Adding it makes the SELECT easier, and it would also help if in the future I make charges that are not related to an invoice (buying a one-off extra) but are still linked to that user.
Looking forward to receiving some suggestions.

The simple answer is to use 'user_id' as a key on each table. If you have MySQL Workbench you can also link the tables via keys. If 'user_id' is the unique identifier for a user, then it needs to be on every table where it is used. This way you do not have a table that is 'lost' and needs code or triggers to find the user. It is only one extra column per table and worth the effort. PHP will use it as an absolute point of reference back to the user, and so can your background routines. I would also archive the user's ID and IP address for a few years, just in case there is a payment issue and they close their account.
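To make the trade-off concrete, here is a minimal sketch, assuming Python's built-in sqlite3 as a stand-in for MySQL. The table and column names follow the question; the sample rows, the nullable invoice_id and the index name are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the sample data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_user (user_id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE table_subscription (
    sub_id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES table_user(user_id)
);
CREATE TABLE table_invoice (
    invoice_id INTEGER PRIMARY KEY,
    subscription_id INTEGER NOT NULL REFERENCES table_subscription(sub_id),
    user_id INTEGER NOT NULL REFERENCES table_user(user_id)
);
CREATE TABLE table_charge (
    charge_id INTEGER PRIMARY KEY,
    amount INTEGER NOT NULL,
    invoice_id INTEGER REFERENCES table_invoice(invoice_id),  -- NULL for one-off charges
    user_id INTEGER NOT NULL REFERENCES table_user(user_id)
);
CREATE INDEX idx_charge_user ON table_charge(user_id);

INSERT INTO table_user VALUES (1, 'alice');
INSERT INTO table_subscription VALUES (10, 1);
INSERT INTO table_invoice VALUES (100, 10, 1);
INSERT INTO table_charge VALUES (1000, 500, 100, 1);   -- charge for an invoice
INSERT INTO table_charge VALUES (1001, 250, NULL, 1);  -- one-off extra, no invoice
""")

# Cascade approach: walk charge -> invoice -> subscription to reach the user.
via_joins = conn.execute("""
    SELECT c.charge_id
    FROM table_charge c
    JOIN table_invoice i ON i.invoice_id = c.invoice_id
    JOIN table_subscription s ON s.sub_id = i.subscription_id
    WHERE s.user_id = ?
    ORDER BY c.charge_id
""", (1,)).fetchall()

# Direct approach: user_id is stored on the charge itself.
direct = conn.execute(
    "SELECT charge_id FROM table_charge WHERE user_id = ? ORDER BY charge_id",
    (1,),
).fetchall()

print(via_joins)  # the one-off charge is missing: it has no invoice to join through
print(direct)     # both charges
```

The join version silently drops the charge that has no invoice, which is the future one-off case the question mentions; the direct version finds it with a single indexed lookup.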

Related

Structure and logic for payouts in a marketplace application

I'm developing a marketplace-style application that allows users to upload purchasable digital items -> the public purchases these items -> and for my application to pay the users (owners of items) their owed funds via PayPal Payouts API on a daily basis.
I'm struggling with how best to calculate/store the owing balance, and how to map the individual purchase transaction records to the concept of a "payout" (when we send owed funds to the user).
Schema so far:
User
id
name
createdAt
etc.
Purchasable Item
id
user_id (owner)
price
createdAt
etc.
Transaction
id
type ("purchase" or "payout")
status (depending on PayPal response. COMPLETED, FAILED, REFUNDED etc.)
value (integer, in the lowest denomination of the currency. Positive integer for a purchase, negative for a payout).
purchasable_id (For "purchase" transactions, reference the ID of the purchasable item that was purchased)
transaction_fee
createdAt
payout_id (?) The ID of the payout (below) this purchase is included in. Not sure about this. This won't be known at the time of the transaction, so it would need to be updated to store it and I'm not sure how to know which transaction will belong in which payout?
Payout
Not sure about this. Feels like a duplicate of a payout transaction entry, but I want a way to store which purchase transactions were paid out in which payouts.
id
status (depending on PayPal response to Payout API webhook. COMPLETED, FAILED, REFUNDED etc.)
createdAt
Logic:
This is where I need the most help.
CRON job. Every 24hrs:
Calculate each user's balance by summing the payout_balance_change fields of the Transactions table, i.e. the balance isn't stored, it's always calculated. Is that a good idea?
Insert a row into "Transactions" of type "payout" with a negative "payout_balance_change". i.e. subtracting the amount we will send in the payout, zeroing their balance in the Transactions table.
Insert a row into "Payouts" table that stores the details of the payout attempt.
Problems:
How will I know which purchase transactions belong to each payout cycle (so I can then store the payout_id in those transaction records). I could use the date of the transaction, and each payout could be for the 24hr period prior to the CRON job? I'm flexible on this and not sure what the most robust logic would be.
Any advice on how best to structure this, or links to similar projects would be greatly appreciated.
Thank you!
Welcome to Stack Overflow.
This question may be too broad for this format - please do read "how to ask".
Firstly - I'm answering on the assumption this is MySQL. Again - please read "how to ask"; tagging with multiple technologies isn't helpful.
Before anything else - look into how to store money in MySQL.
Secondly - the common pattern for doing this is to have the transaction table only reflect complete transactions. That way, the current balance is always sum(transaction_value), with the transaction date showing you the balance at a given point in time. You typically store the interim state for each transaction in a dedicated table (e.g. "payout"), and only insert into the transaction table once that payout transaction is complete.
You should remove all the status and transaction_fee references from the transaction table, and store them in the dedicated tables. A transaction_fee can be represented as a transaction in its own right.
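As a rough illustration of that pattern, the sketch below keeps only completed movements in a ledger and derives the balance from it. It assumes Python's built-in sqlite3 in place of MySQL; the `ledger` table, its columns and the sample rows are hypothetical, with amounts held as integers in the smallest currency unit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ledger (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    value INTEGER NOT NULL,      -- smallest currency unit, e.g. cents
    kind TEXT NOT NULL,          -- 'purchase', 'fee' or 'payout'
    created_at TEXT NOT NULL
);
-- Only completed movements are ever inserted here; a fee is its own row.
INSERT INTO ledger (user_id, value, kind, created_at) VALUES
    (1, 1000, 'purchase', '2024-01-01'),
    (1,  -50, 'fee',      '2024-01-01'),
    (1, 2000, 'purchase', '2024-01-02'),
    (1, -950, 'payout',   '2024-01-02');
""")

def balance(user_id, as_of=None):
    """Sum the ledger rows for a user, optionally only up to a given date."""
    sql = "SELECT COALESCE(SUM(value), 0) FROM ledger WHERE user_id = ?"
    params = [user_id]
    if as_of is not None:
        sql += " AND created_at <= ?"
        params.append(as_of)
    return conn.execute(sql, params).fetchone()[0]

print(balance(1))                # current balance
print(balance(1, "2024-01-01"))  # balance at the end of the first day
```

Because pending and failed attempts never reach the ledger, the sum needs no status filter.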
If you want to store the relationship between purchase and payout, you might have something like:
Payout
Payout_id
Payout_amount
Payout_status
Payout_date
...
Purchase
Purchase_id
Customer_id
Item_id
Purchase_date
....
Payout_purchase
Purchase_id
Payout_id
Your logic then becomes:
cron job searches all purchases that haven't been paid out (where purchase_id not in (select purchase_id from payout_purchase))
for each vendor:
create new record in payout_purchase
find sum of new payout_purchase records
attempt payout
if (payout succeeded)
insert record into transaction table with vendor ID, payout ID and payout amount
else
handle error case. This could be deleting the record (and logging the failure somewhere else), or by adding a "status" column with the value "failed". The latter option makes it easier to provide your vendors with a statement - "we attempted to pay you, but the payment failed". Either way, you want to have a way of monitoring failures, and monitor them.
end if
next vendor
I've left out the various state and error management logic steps.
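The loop above could be sketched roughly as follows, assuming Python's built-in sqlite3 in place of MySQL. The simplified columns (vendor_id and amount directly on purchase), the `ledger` table and `send_payout` are hypothetical stand-ins, not part of any real payment API. The sketch takes the "status column" option for failures and lets purchases from a failed payout be picked up again on the next run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE purchase (purchase_id INTEGER PRIMARY KEY, vendor_id INTEGER, amount INTEGER);
CREATE TABLE payout (payout_id INTEGER PRIMARY KEY, vendor_id INTEGER,
                     payout_amount INTEGER, payout_status TEXT);
CREATE TABLE payout_purchase (purchase_id INTEGER, payout_id INTEGER);
CREATE TABLE ledger (vendor_id INTEGER, payout_id INTEGER, value INTEGER);
INSERT INTO purchase VALUES (1, 7, 500), (2, 7, 300), (3, 8, 900);
""")

def send_payout(vendor_id, amount):
    """Hypothetical stand-in for the real payment call; vendor 8 always fails."""
    return vendor_id != 8

def run_payouts():
    # Purchases not yet tied to a pending or completed payout.
    # Purchases whose payout failed are selected again, so they get retried.
    unpaid = conn.execute("""
        SELECT purchase_id, vendor_id, amount FROM purchase
        WHERE purchase_id NOT IN (
            SELECT pp.purchase_id FROM payout_purchase pp
            JOIN payout p ON p.payout_id = pp.payout_id
            WHERE p.payout_status <> 'failed')
        ORDER BY purchase_id
    """).fetchall()

    # Freeze the working set per vendor before attempting any payment.
    by_vendor = {}
    for purchase_id, vendor_id, amount in unpaid:
        by_vendor.setdefault(vendor_id, []).append((purchase_id, amount))

    for vendor_id, rows in by_vendor.items():
        total = sum(amount for _, amount in rows)
        cur = conn.execute(
            "INSERT INTO payout (vendor_id, payout_amount, payout_status) "
            "VALUES (?, ?, 'pending')", (vendor_id, total))
        payout_id = cur.lastrowid
        conn.executemany("INSERT INTO payout_purchase VALUES (?, ?)",
                         [(pid, payout_id) for pid, _ in rows])
        if send_payout(vendor_id, total):
            conn.execute("UPDATE payout SET payout_status = 'completed' "
                         "WHERE payout_id = ?", (payout_id,))
            # Only a completed payout reaches the ledger, as a negative value.
            conn.execute("INSERT INTO ledger VALUES (?, ?, ?)",
                         (vendor_id, payout_id, -total))
        else:
            conn.execute("UPDATE payout SET payout_status = 'failed' "
                         "WHERE payout_id = ?", (payout_id,))
        conn.commit()

run_payouts()
print(conn.execute("SELECT * FROM payout ORDER BY payout_id").fetchall())
```

A purchase inserted while the loop is running is not in `unpaid`, so it waits for the next run. A real implementation would still need locking or stricter transaction isolation around the selection step.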
Things you want to worry about:
What happens if a purchase occurs while the payout logic is running? You need to make sure you work on defined data sets in each step. For instance, you need to insert data into the "payout_purchase" table, and then work only on those records - new purchases should not be included until the next run.
What happens if a payout fails? You must ensure they are included in the next payment run.
How do you provide a statement to your buyers and sellers? What level of detail do you want?
Transaction management from MySQL may help, but you need to spend time learning the semantics and edge cases.

Grouping with associated variables

i have a table as below:
Account no. | Login Name | Numbering
1234 | rty234 | 1
1234 | bhoin1 | 1
3456 | rty234 | 2
3456 | 0hudp | 2
9876 | cfrdk | 3
From the table above, you can see that rty234 and bhoin1 registered the same account no. 1234, so I know that rty234 and bhoin1 are related and I numbered them as 1. The Numbering field is based on the account no.
Then I found that rty234 also registered another account no., 3456, and that the same account no. was registered by 0hudp as well. Thus, I concluded that rty234, bhoin1 and 0hudp are all related, and I want to renumber the third and fourth rows to 1. Rows that are not further related should keep their numbering. How can I achieve that using MySQL?
The expected output will be as follow:
Account no. | Login Name | Numbering | New_Numbering
1234 | rty234 | 1 | 1
1234 | bhoin1 | 1 | 1
3456 | rty234 | 2 | 1
3456 | 0hudp | 2 | 1
9876 | cfrdk | 3 | 3
You need to understand how to design a relational database.
These groupings that you want to make with the New_Numbering field should be done at the time the accounts are registered. I see two pieces of arbitrary information that need to be tracked: account number and login name. It seems the people registering the account can effectively type whatever they want here, although perhaps account numbers must be numerical. That detail doesn't matter.
What you want here is one account which can have multiple account numbers associated with it, and multiple logins. I would also assume that future development may add more to this, for example - why do people need multiple logins? Maybe different people are using them, or different applications. Presumably, we could collect additional information about the login names that stores additional details about each login. The same could be said about account numbers - certainly they contain more detail than just an account number.
First, you need one main login table.
You describe rty234 and bhoin1 as if they are unique people. So make this a login_name column with a unique index in a login table. This table should have an auto-increment login_id as the primary key. Probably this table also has a password field and additional information about that person.
Second, create an account table.
After creating their login, make them register an account with that login. Make this a two-step process. When they offer a new account number, create a record for it in the account table with additional identifying information that only the account-holder would know. Somehow you have to validate that this is actually their account in order to create this record, I would think. This table would also contain an auto-incremented primary key called account_id in addition to account_no and probably other details about the account.
Third, create a login_account table.
Once you validate that a login actually should have access to an account, create a record here. This should contain a login_id and an account_id which connects these two tables. Additionally, it might be good to include the information provided which shows that this login should have access to this account.
Now, when you want to query this data, you can find groups of data that have the same login_id or account_id, or even that share either a login or an account with a specific registration. Beyond that, it gets hairy to do in an SQL query. So if you really want to be able to go through the data and see who is in the same organization or something, because they share either a login or an account with the same group, you have to have some sort of script.
Create an organization table.
This table should contain an organization_id so you can track it, but probably once you identify the group you'll want to add a name or additional notes, or link it to additional functionality. You can then also add this organization_id field to the login or account tables, so you can fill them once you know the organization. You have to think about if it's possible for two organizations to share accounts, and maybe there's a more complicated design necessary. But I'm going to keep it simple here.
Your script should load up all of the login_id and account_id values and cache them somewhere. Then go through them all and if they have an organization_id, put their login_id or account_id in a hashmap with the value as the organization_id. Then load up all of the login_account records. If either the login_id or account_id has an organization_id in its hashmap, then add the other to its hashmap with the same organization_id. (if there's already one there, it would violate the simple organization uniqueness assumption I made, but this is where you would handle complexity - so I would just throw an exception and see if it happens when I run the script)
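One way such a script could find the groups is a union-find (disjoint-set) pass over the login/account pairs. This is a swapped-in technique, not the hashmap propagation described above, and the sketch below is only illustrative: the function name `renumber` and the choice to give every row the smallest original numbering found in its group are assumptions, made so the result lines up with the expected output in the question.

```python
def renumber(rows):
    """rows: list of (account_no, login_name, numbering) tuples.

    Returns one new number per row: the smallest original numbering
    found anywhere in that row's connected group.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # An account and a login that appear on the same row belong together.
    for account_no, login, _ in rows:
        union(("account", account_no), ("login", login))

    # Smallest original numbering per group.
    smallest = {}
    for account_no, _, numbering in rows:
        root = find(("account", account_no))
        smallest[root] = min(numbering, smallest.get(root, numbering))

    return [smallest[find(("account", account_no))] for account_no, _, _ in rows]


rows = [
    (1234, "rty234", 1),
    (1234, "bhoin1", 1),
    (3456, "rty234", 2),
    (3456, "0hudp", 2),
    (9876, "cfrdk", 3),
]
print(renumber(rows))  # [1, 1, 1, 1, 3]
```

With the normalized tables, the same pass would run over the login_account rows, and the resulting group could be written back as the organization_id.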
Hopefully this is enough example to get you started. When you properly design a database like this, you allow the information to connect naturally. This makes column additions and future updates much easier. Good luck!

Managing Historical and Current Records with SQL

I want to keep track of each User's current balance and balance history using the Django ORM. I imagine 2 tables (User and History) with a one-to-many between User and History representing a user's entire history, and a one-to-one between User and History for easy access to the current balance:
History
ID | User (FK to User) | Delta | Balance | Timestamp
User
ID | Name | Employee | Year | Balance (FK to History)
1) Does this seem reasonable given that I'm using the Django ORM? I think with raw SQL or another ORM, I could give history a start and stop date, then easily get the latest with SELECT * FROM History WHERE user_id=[id] AND stop IS NULL;.
2) Should History have a balance column?
3) Should User have a balance column (I could always compute the balance on the fly)? If so, should it be a "cached" decimal value? Or should it be a foreign key to the latest balance?
A strictly normal approach would say that neither table should contain a balance column, and that users' balances should be calculated when required from the sum of all their history. However, you may find that such a schema results in unacceptable performance, in which case caching the balance would be sensible:
if you're mostly interested in the current balance, then there's little reason to cache balances in the History table (just cache the current balance in the User table alone);
on the other hand, if you might be interested in arbitrary historical balances, then storing historical balances in the History table would make sense (and then there'd be little point in also storing the current balance in the User table, since that could easily be discovered from the most recent History record).
But perhaps it's not worth worrying about caching right now? Keep in mind the mantra "normalise until it hurts; denormalise until it works" as well as Knuth's famous maxim "premature optimisation is the root of all evil".
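If caching the current balance on the User table does turn out to be needed, the important part is updating it in the same transaction as the History insert. The sketch below shows that idea with plain SQL through Python's built-in sqlite3 rather than the Django ORM; the table layout and the helper names (`record`, `cached_balance`, `computed_balance`) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (
    id INTEGER PRIMARY KEY,
    name TEXT,
    balance INTEGER NOT NULL DEFAULT 0   -- cached value, derived from history
);
CREATE TABLE history (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES user(id),
    delta INTEGER NOT NULL,
    timestamp TEXT NOT NULL
);
INSERT INTO user (id, name) VALUES (1, 'alice');
""")

def record(user_id, delta, timestamp):
    """Append a history row and refresh the cached balance atomically."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "INSERT INTO history (user_id, delta, timestamp) VALUES (?, ?, ?)",
            (user_id, delta, timestamp))
        conn.execute(
            "UPDATE user SET balance = balance + ? WHERE id = ?",
            (delta, user_id))

def cached_balance(user_id):
    return conn.execute(
        "SELECT balance FROM user WHERE id = ?", (user_id,)).fetchone()[0]

def computed_balance(user_id):
    """The normalised answer: always derivable from history alone."""
    return conn.execute(
        "SELECT COALESCE(SUM(delta), 0) FROM history WHERE user_id = ?",
        (user_id,)).fetchone()[0]

record(1, 100, "2024-01-01")
record(1, -30, "2024-01-02")
print(cached_balance(1), computed_balance(1))  # the two should always agree
```

Comparing the cached and computed values periodically is a cheap way to detect drift.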

Database Schema for Registered Customers and Guest Checkout

For an ecommerce site that allows both Guest checkouts and registered user checkouts, how will you handle the 2 different customer groups?
Do you store both groups in the same table customers which has a foreign key customer_group_id pointing to another table customer_groups? In this case, will you worry about duplicate guest checkouts "polluting" the customers table?
How will the information captured for the 2 customer groups be different? I am thinking the only difference is that guest checkout customers will not have a password, and that's it.
I store customer information directly in the order rather than rely on the information in the customer record. That does a couple things:
I don't have to have a guest user in the customer database
If my customer information changes, or if a customer wants to ship one item to one place, and another item to another place the information about the order will still be correct.
I view order information as historical data. It should not change because something else in your database changes. For example, if I order something from you, and at some later time I move and update my billing and shipping information, you should still know that the previous order was billed and shipped to the previous address. If you rely on the relationship between the customer and the order to retain bill-to and ship-to information, then once I move and update my profile, you will think you shipped to my new address. You may not see that as an issue, but it will be.
You can still get the current information from the customer record to populate the fields on the order if the customer has an account. For a guest, he has to type it every time.
The common entity is a Checkout. The guest checkout would be a null FK link on the checkout entity to a registered user entity.
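Both suggestions can be combined in one small sketch: the order carries its own copy of the customer details, and a guest order simply has a NULL customer key. It assumes Python's built-in sqlite3 in place of the real database; the table names, columns and `place_order` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    email TEXT NOT NULL,
    ship_address TEXT NOT NULL
);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(customer_id),  -- NULL for a guest
    email TEXT NOT NULL,
    ship_address TEXT NOT NULL   -- copied at order time, never updated later
);
INSERT INTO customer VALUES (1, 'a@example.com', '1 Old Street');
""")

def place_order(email, ship_address, customer_id=None):
    """Store the order with its own snapshot of the customer details."""
    cur = conn.execute(
        "INSERT INTO orders (customer_id, email, ship_address) VALUES (?, ?, ?)",
        (customer_id, email, ship_address))
    conn.commit()
    return cur.lastrowid

registered = place_order("a@example.com", "1 Old Street", customer_id=1)
guest = place_order("g@example.com", "9 Guest Road")

# The registered customer moves; the profile changes, the old order does not.
conn.execute("UPDATE customer SET ship_address = '2 New Avenue' WHERE customer_id = 1")
conn.commit()

print(conn.execute(
    "SELECT ship_address FROM orders WHERE order_id = ?", (registered,)).fetchone())
print(conn.execute(
    "SELECT customer_id FROM orders WHERE order_id = ?", (guest,)).fetchone())
```

No guest rows ever reach the customer table, so repeated guest checkouts cannot pollute it.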

Website Transactions in MySQL Database

Good Day,
I'm currently designing database structure for a website of mine. I need community assistance in one aspect only as I never did something similar.
Website will include three types of the payments:
Internal payments (Escrow kind payments). User can send payment to another user.
Deposits. Users add fund to their accounts.
Withdrawal. User can request a withdrawal. Money will be sent to their bank/PayPal account.
Basically, I need some tips to get the best design possible.
Here's what I'm thinking about:
deposits - this table will store info about deposits
deposits_data - this table will store info about deposit transaction (ex. data returned by PayPal IPN)
payments - table to store internal payments
withdrawals - table to store info about withdrawal request
transactions - table to store info about ALL transactions (with ENUM() field called type with following values possible: internal, deposit, withdrawal)
Please note that I have already a table logs to store every user action.
Unfortunately, I feel that my design approach is not the best possible in this aspect. Can you share some ideas/tips?
PS. Can I use a name "escrow" for internal payments or should I choose different name?
Edit
DEPOSITS, PAYMENTS and WITHDRAWALS tables store specific transaction details. The TRANSACTIONS table stores only limited info - it's a kind of log table - with a details field (which contains text to display in the user log section, e.g. "User 1 sent you a payment for something").
Of course I have users tables, etc.
"Can I use a name "escrow" for internal payments or should I choose different name?"
Escrow has a specific financial/legal meaning, which is different from how you seem to mean it: "a written agreement (or property or money) delivered to a third party or put in trust by one party to a contract to be returned after fulfillment of some condition" (source)
So choosing a different name seems like a good idea.
As for design, what data will DEPOSITS, PAYMENTS and WITHDRAWALS store which TRANSACTIONS won't? Also, you need an ACCOUNTS table. Or are you planning to just use your existing USERS table (I presume you have such a thing)? You probably ought to have something for external parties, even if you only intend to support PayPal for the time being.