Group transactions in one stored procedure, or separate into two stored procedures? - mysql

I have a complex web site that handles online games with real money. I use a double-entry database design for my transactions. A simple example is as follows:
John deposits $5
John receives 5000 credits
John uses those 5000 credits to play a game.
The transaction in the database looks as follows:
trans_id | account_id   | trans_type | date | amount |
-----------------------------------------------------
1        | John(PayPal) | Debit      | date | -5.00  |
2        | System       | Credit     | date | 5.00   |
3        | SystemGame   | Debit      | date | -5000  |
4        | JohnGame     | Credit     | date | 5000   |
I wrote a stored procedure with a transaction in it that inserts transactions 1 and 2, the Debit from John's PayPal account, and the Credit to our System account.
My question is, should I also include the other transactions, where John has money transferred from our SystemGame account to his Game account? Or should I have a stored procedure for each group of transactions? All 4 transactions occur simultaneously; John is credited immediately after depositing the $5.
Also, should I separate the transaction tables for game credits from the real money transaction table?

I think this is what you want. But I recommend doing it in 2 different transactions, to separate in-game money from real money.
The only reason is that if you need to change in-game money management or real-money management in the future, you can do so separately.

A transaction must be atomic: all the steps must be completed. If one fails, everything is rolled back.
If those 4 steps you mention must be all completed at once, they must be in a single transaction.
So, for me, your current approach is OK.
However, if you want to split things into 2 stored procedures, you can manage the transaction from the application language you use (PHP, C#).
See "Transactions in MySQL - Unable to Roll Back" and "PHP + MySQL transactions examples".
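As a minimal sketch of the single-transaction approach, the example below inserts all four ledger rows atomically. It uses Python's standard-library sqlite3 module instead of a MySQL stored procedure so that it is self-contained; the trans table layout, the deposit function, and the CHECK constraint are assumptions for illustration, not the asker's actual schema.

```python
import sqlite3

def deposit(conn, user, dollars, credits):
    """Insert all four ledger rows atomically; roll back if any insert fails."""
    rows = [
        (f"{user}(PayPal)", "Debit", -dollars),
        ("System", "Credit", dollars),
        ("SystemGame", "Debit", -credits),
        (f"{user}Game", "Credit", credits),
    ]
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            "INSERT INTO trans (account_id, trans_type, amount) VALUES (?, ?, ?)",
            rows,
        )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trans ("
    "trans_id INTEGER PRIMARY KEY, account_id TEXT NOT NULL, "
    "trans_type TEXT NOT NULL, amount NUMERIC NOT NULL CHECK (amount <> 0))"
)

deposit(conn, "John", 5, 5000)

try:
    # The third row has amount 0 and violates the CHECK constraint,
    # so the two rows inserted before it are rolled back as well.
    deposit(conn, "Jane", 5, 0)
except sqlite3.IntegrityError:
    pass

row_count = conn.execute("SELECT COUNT(*) FROM trans").fetchone()[0]
```

After both calls only John's four rows remain, which is the all-or-nothing behaviour described above.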


Why Facebook Robyn Algorithm needs only unique dates?

My team and I have been exploring the Robyn algorithm to do Market Mix Modeling on my dataset. The dataset is monthly-level data of the promotional activity for each customer. The data looks like this:
|Id |Date |Revenue|Channels…|
|-----------|--------|-------|---------|
|Customer 1 |Jan-2021| | |
| ….. | | | |
|Customer 1 |Dec-2021| | |
|Customer 2 |Feb-2021| | |
| ….. | | | |
|Customer 2 |Dec-2021| | |
In this way we have over 1000 customers and their monthly channel activity data. We have been able to create models using linear regression to get the impact of each channel. When we tried to run this data on Robyn, we got a duplicate date error. Does this mean we have to run the Robyn algorithm for each customer separately? Then we would have only 12 data points per model, and getting daily or weekly data is not possible for us. Is there any way to run this kind of data on Robyn? Also, why does Robyn restrict us to unique dates even though it uses ridge regression internally, which would not be affected by dates? Isn't having more data points better?
About Robyn
You are receiving this duplicate date error because unfortunately, Robyn does not treat panel datasets properly at the moment. You can aggregate the different customer segments in your datasets and run the modeling at a total level which Robyn supports.
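A minimal sketch of the aggregation step suggested above, in plain Python: it collapses per-customer rows into one row per date so that every date is unique. The sample rows, the column order, and the aggregate_by_date helper are hypothetical; nothing here calls Robyn itself.

```python
from collections import defaultdict

# Hypothetical panel rows: (customer, month, revenue, channel_spend)
panel = [
    ("Customer 1", "2021-01", 100.0, 10.0),
    ("Customer 2", "2021-01", 50.0, 5.0),
    ("Customer 1", "2021-02", 120.0, 12.0),
    ("Customer 2", "2021-02", 80.0, 8.0),
]

def aggregate_by_date(rows):
    """Sum revenue and channel spend over all customers for each month."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for _customer, month, revenue, spend in rows:
        totals[month][0] += revenue
        totals[month][1] += spend
    # One row per month, sorted chronologically (ISO strings sort by date).
    return [(month, rev, spend) for month, (rev, spend) in sorted(totals.items())]

total_level = aggregate_by_date(panel)
```

The result has exactly one row per month, which is the total-level shape the answer describes.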

Housing Society management system database structure

I am designing a billing structure for a housing society. Yesterday I searched online, went through banking billing structures, and designed my database structure, but I am not sure whether it is correct, so I am posting it here.
Please tell me if I am wrong anywhere or if any changes are needed in my database structure.
One more question: where should I post the society balance (debit, credit), e.g. an expense like Bldg insurance and income like Adv board hoarding? I am also not sure about the balance audit trail (keeping it in the transaction table, or in a separate table with the transaction id as FK).
Please note that all tables will by default have created and modified time, by, and IP address columns.
Table billingstatement
id | description | amount | Bill Month | userId | societyId
1 | Maint Chrg 1000 sqft x 5 per sqft | 5000 | Aug-16 | 1001 | 101
2 | Water Charges | 200 | Aug-16 | 1001 | 101
3 | Construction Charges | 300 | Aug-16 | 1001 | 101
4 | Reserved Parking chrgs | 500 | Aug-16 | 1001 | 101
Table Accounts
id | balance(current bal) | societyId | modifiedTime |
1 | -6000 | 101 | 2016-01-01 21:01:01 |
2 | -5000 | 101 | 2016-01-01 21:01:01 |
3 | 1000 | 101 | 2016-01-01 21:01:01 |
Table transaction
id | amount | balance | trans_type | trans_time | account_id |
1 | 6000 | 0 | 1 | 2016-01-01 21:01:01 | 1 |
2 | 5500 | -6000 | 1 | 2016-02-01 21:01:01 | 2 |
trans_type: 1 = Payment by user, 2 = Income to society, 3 = Expense to society
Table map_account_user
map_id | account_id | user_id
1 | 2 | 1001
If account mapping is not present then it means it is a society account and not a user account.
References:
billing banking design
banking project sample
I have a hesitation with storing an aggregate with the entity. Unless the aggregate is very difficult to calculate, you should always account for these by examining the details.
I'm assuming you are designing a relational database. In a relational database, you normalize the data.
I'm having trouble following your database design because you have too many different fields called id. Each id field should get a unique name, so people can tell what the different id fields represent.
Let's start with the Transaction table. Generally table names are singular. I capitalize table names and column names. You don't have to follow that convention.
Transaction
-----------
Transaction ID
Transaction Type
User ID
Society Account
Transaction Amount
Transaction Time Stamp
...
Transaction ID is an auto-incrementing integer. It is also the primary (clustering) key to the Transaction table. Transaction Type is 1 = Payment by user, 2 = Income to society, 3 = Expense to society. I'm not sure what the difference is between Transaction Type 1 and Transaction Type 2.
Either the User ID or the Society Account column is filled in. The User ID column is filled in for Transaction Type 1 and the Society Account column is filled in for Transaction Types 2 and 3. The not filled in column is set to null.
Transaction Amount is always a positive value. Your code will subtract the Transaction Amount from the Society Account for Transaction Type 3.
You will create a unique index for (User ID, Transaction Time Stamp descending, Transaction ID) and a unique index for (Society Account, Transaction Time Stamp descending, Transaction ID). This allows you to quickly get all the transactions for a user or society account, for a given month.
Next, let's look at the UserAccountBalance table.
UserAccountBalance
------------------
User ID
Balance Year and Month
Balance Amount
...
The primary key to this table is (User ID, Balance Year and Month descending). You maintain the historical balances for each month for each User ID. This allows an auditor to verify the balances by running queries against the Transaction table.
Next, let's look at the SocietyAccountBalance table.
SocietyAccountBalance
---------------------
Society Account
Balance Year and Month
Balance Amount
...
This table is similar to the UserAccountBalance table, but for Society accounts.
Next, let's look at the Billing table
Billing
-------
User ID
Billing Year and Month
Billing Type
Square Feet
Charge per Square Foot
Total Charge
...
The primary key is (User ID, Billing Year and Month descending, Billing Type). I'm assuming that you only get one billing charge per billing type per month.
Billing Type is 1 = Maintenance Charge, 2 = Water Charge, 3 = Construction Charge, 4 = Reserved Parking Charge. You can generate the text on the bill from the values in this table, so there's no need to store the text in the database. The Square Feet and Charge per Square Foot columns are filled in for Billing Type 1, otherwise they are null.
You still have to match up the payments with the billings, but this should be enough to get you started on the right path.
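To illustrate the point above that the bill text can be generated from the Billing table's values instead of being stored, here is a small sketch. The bill_line function and its parameter names are hypothetical; the type codes and the wording follow the examples in this thread.

```python
BILLING_TYPE_LABELS = {
    1: "Maintenance Charge",
    2: "Water Charge",
    3: "Construction Charge",
    4: "Reserved Parking Charge",
}

def bill_line(billing_type, total_charge, square_feet=None, charge_per_sq_ft=None):
    """Build the printed description for one Billing row."""
    label = BILLING_TYPE_LABELS[billing_type]
    if billing_type == 1:
        # Only the maintenance charge carries the square-footage columns.
        return f"{label} {square_feet} sqft x {charge_per_sq_ft} per sqft: {total_charge}"
    return f"{label}: {total_charge}"

maintenance_text = bill_line(1, 5000, square_feet=1000, charge_per_sq_ft=5)
water_text = bill_line(2, 200)
```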
The structure works. I would change the id naming a bit: account_id is referenced in some places, but in the Accounts table it is id.
I would make it account_id in the Accounts table. This makes it easier for users of the database to see where they can join. Every place you call it account_id, it should be the same account_id, with a single master table that generates it.
I really appreciate your efforts. Basing your design on a banking system is a good approach. I would like to offer some ideas that may help, if you find them useful.
Let's start. If you turn the values in your description column:
{Water Charges
Construction Charges
Reserved Parking chrgs}
into new columns, the number of rows decreases and the data becomes easier to maintain. For each userId you can have a single row of data rather than maintaining four rows. See the example below.
The per-square-foot calculation can go in a separate Description column.
id | userId | societyId | BillMonth | MaintChrg | WaterCharges | ConstructionCharges | ReservedParkingCharges | Description
1 | 1001 | 101 | Aug-16 | 5000 | 200 | 300 | 500 | Maint Chrg 1000 sqft x 5 per sqft
2 | 1002 | 101 | Aug-16 | 4000 | 200 | 300 | 500 | Maint Chrg 900 sqft x 5 per sqft
3 | 1003 | 102 | Aug-16 | 5000 | 200 | 300 | 500 | Maint Chrg 900 sqft x 5 per sqft
For your doubt about debit and credit, use:
debit as expenseBldginsurance
credit as Adv board hoarding
You can also add many other columns, such as user name, number of people, electricity charges, etc.

Redundant data or two queries?

Take, for instance, a table that holds credit card charges, a table that holds credit cards, and a table that holds users.
Each charge is associated with exactly one card, and each card is associated with exactly one user. A user may have multiple cards on file.
If I were to hold this data in a MySQL database in three distinct tables, like so:
charges:
---------------------------------------------
id | card | amount | description | datestamp
---------------------------------------------
5 | 2 | 50.00 | Example | 1369429422
cards:
------------------------------------------------------------------
id | user | name | number | cvv2 | exp_mm | exp_yy
------------------------------------------------------------------
2 | 1 | Joe Schmoe | 4321432143214321 | 555 | 1 | 16
users:
-------------------------------------------
id | first_name | last_name | email
-------------------------------------------
1 | Joe | Schmoe | joe#schmoe.co
Now, let's say that I want to access a user, given a charge. In order to get to the user, I would first have to look up the card associated with the charge, and then look up the user associated with the card. Obviously, in an example like this, speed would be negligible. But in other scenarios, I see this as being two queries.
However, if I stored the data like so:
charges
----------------------------------------------------
id | card | user | amount | description | datestamp
----------------------------------------------------
5 | 2 | 1 | 50.00 | Example | 1369429422
then the charge would be associated directly with a user. That, however, is redundant information, since that same data is stored in the cards table.
Thoughts?
Your instinct not to include the user information in the charges table is correct; however, it's still only one query:
select first_name, last_name, email
from users, cards, charges
where users.id = cards.user
and cards.id = charges.card
and charges.id = 5;
That would give you the user info for the charge with id 5. This is the exact thing that relational databases are best at :) This kind of thing is called a "join" because it joins multiple tables together to give you the information you need. There are multiple ways to write this query.
As an aside, perhaps this is a contrived example, but if this is an application you are writing from scratch, there are lots of reasons to avoid storing credit cards in your own database. Often a payment processor can handle the details for you while still allowing you to charge credit cards.
You could denormalize by adding the user id to the charges table. You need to know if that is necessary, given the expected size of the tables. If this optimization is warranted, use it. If you don't know, then optimize in the future as necessary.
As it stands, you don't need two queries though
SELECT users.* FROM charges
JOIN cards ON charges.card = cards.id
JOIN users ON cards.user = users.id
WHERE charges.id = ?

Database Design for Debit / Credit / Payout service

I have a webservice that I am designing which pays users to complete certain tasks. For example, if a user clicks a link, they are paid, say, $0.10 to their account. A user could perform any one of these tasks up to 20 times per day. In order for the user to request the funds be paid to them, they must have an account balance of $5.
I'm trying to decide the best way to keep track of the transactions and accounts. My design currently looks as follows:
Accounts
---------
| account_id | member_id | balance |
-------------------------------------
| 1 | 1 | 497.8500 | -- System Account
| 2 | 5 | 2.1500 |
Transactions
------------
| transaction_id | account_id | type_id | task_id | date | amount |
-------------------------------------------------------------------
| 1 | 1 | Debit | 1 | date | -1.10 |
| 2 | 2 | Credit | 1 | date | 1.10 |
| 3 | 1 | Debit | 1 | date | -1.05 |
| 4 | 2 | Credit | 1 | date | 1.05 |
This design is based on the accounting principles of double-entry bookkeeping. Now my dilemma is: technically the user isn't paid this money until they have requested a "Payout". A "Payout" consists of the user submitting a request, the request being approved, and the money being deducted from their account balance and sent to them via PayPal. So my question is, is it a good idea to actually deduct the amounts from the system balance if the user has not requested a payout yet? The money in their account can be used to fund additional things on the site; it also expires after 30 days of inactivity.
My idea was to keep the transaction table as it is and design another table called payouts with the following structure
Payouts
-------
| payout_id | account_id | date | amount |
------------------------------------------
| 1 | 2 | date | $2.00 |
But then how do I reflect the payout in the transactions table? It seems incomplete.
Should I separate the tasks from the transactions table and only enter a transaction record if the user has requested the payout? I'm not sure if I would lose auditing abilities by doing it that way.
Does anyone have some insight?
Just use another account/ledger/balance sheet code.
So on earning the 10c:
Your account is -10c.
Their payout account is +10c.
When the balance on their payout account is > $5, they can request a payout.
When they do, debit the amount from the payout account and credit it to a payout-pending account.
When it is confirmed, debit payout pending and credit paid out.
If it is rejected, debit payout pending and credit the payout account.
Oh, and you really need to think twice about storing the balance, as opposed to calculating and reporting it; you'll get in a right mess doing that.
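The account flow above can be sketched as follows, again using Python's sqlite3 module as a stand-in for MySQL. The ledger table, the account names, and the transfer and balance helpers are assumptions made for illustration; amounts are stored as integer cents, and balances are derived from the rows instead of being stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ledger (id INTEGER PRIMARY KEY, account TEXT NOT NULL, "
    "amount_cents INTEGER NOT NULL)"
)

def transfer(conn, debit_account, credit_account, cents):
    """Record one double-entry movement: debit one account, credit another."""
    with conn:  # both rows are committed together or not at all
        conn.executemany(
            "INSERT INTO ledger (account, amount_cents) VALUES (?, ?)",
            [(debit_account, -cents), (credit_account, cents)],
        )

def balance(conn, account):
    """Derive the balance from the ledger rows instead of storing it."""
    return conn.execute(
        "SELECT COALESCE(SUM(amount_cents), 0) FROM ledger WHERE account = ?",
        (account,),
    ).fetchone()[0]

for _ in range(50):  # fifty 10c tasks add up to $5.00
    transfer(conn, "system", "user2:payout", 10)

transfer(conn, "user2:payout", "user2:pending", 500)  # payout requested
transfer(conn, "user2:pending", "user2:paid", 500)    # payout confirmed
```

A rejected payout would be the reverse of the request: transfer(conn, "user2:pending", "user2:payout", 500). Because every movement writes two rows that cancel out, the sum over the whole ledger stays at zero, which gives a simple audit check.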
What are the business requirements regarding the tracking of the payout?
Does your business need to track a payout as a separate event (double entry) with a date, amount, etc., or does it just need to know that a payout was requested?
If either of these is true, or is likely given other companies' practices (think merger), it might be good to attach a payout table, just as you did, to track payouts.
Could you also give some details on the requirement for double-entry? I would try to keep business rules as explicitly captured in your design as possible.
Hope this helps. We can keep the dialog going in the comments.
I have limited bookkeeping skills, so while I hope I can help, I feel like I'm not fully qualified to give you an answer.
So while I think you are reasonably close to a pure accounting model, I think the payouts should also be transactions in this case.
If a user needs an overview of when cash was added to their account, this list should also contain the payouts. If the user received $1 five times and one payout of $5, their account balance should end up at 0.
Since you are properly recording both sides of each transaction, the actual payment deducts the $5 from their account, but this needs to go somewhere too. So I feel that once this is paid, there should be a separate account, 'paid'.
But the programmer in me wonders: do you really want the double bookkeeping in your database in this case? You can deduce all the numbers without using the system account when you are actually making the payment.
So you are not actually missing any real information, provided the cash always comes from a system account and the user cannot transfer their balance to a different account.

Continue most recent value over a time range

I have this existing schema where a "schedule" table looks like this (very simplified).
CREATE TABLE schedule (
id int(11) NOT NULL AUTO_INCREMENT,
name varchar(45),
start_date date,
availability int(3),
PRIMARY KEY (id)
);
For each person it specifies a start date and the percentage of work time available to spend on this project. That availability percentage implicitly continues until a newer value is specified.
For example take a project that lasts from 2012-02-27 to 2012-03-02:
id | name | start_date | availability
-------------------------------------
1 | Tom | 2012-02-27 | 100
2 | Tom | 2012-02-29 | 50
3 | Ben | 2012-03-01 | 80
So Tom starts on Feb. 27th, full time, until Feb. 29th, from which point he'll be available for only 50% of his work time.
Ben only starts on March 1st, and only with 80% of his time.
Now the goal is to "normalize" this sparse data, so that there is a result row for each person for each day with the availability coming from the last specified day:
name | start_date | availability
--------------------------------
Tom | 2012-02-27 | 100
Tom | 2012-02-28 | 100
Tom | 2012-02-29 | 50
Tom | 2012-03-01 | 50
Tom | 2012-03-02 | 50
Ben | 2012-02-27 | 0
Ben | 2012-02-28 | 0
Ben | 2012-02-29 | 0
Ben | 2012-03-01 | 80
Ben | 2012-03-02 | 80
Think of a chart showing the availability of each person over time, or of calculating the "resource" values in a burndown diagram.
I can easily do this with procedural code in the app layer, but would prefer a nicer, faster solution.
To make this remotely effective, I recommend creating a calendar table, one that contains each and every date of interest. You then use that as a template on which to join your data.
Equally, things improve further if you have a person table to act as the template for the name dimension of your results.
You can then use a correlated sub-query in your join to pick which record in schedule matches the calendar and person template you have created.
SELECT
  *
FROM
  calendar
CROSS JOIN
  person
LEFT JOIN
  schedule
    ON  schedule.name = person.name
    AND schedule.start_date = (SELECT MAX(start_date)
                                 FROM schedule
                                WHERE name = person.name
                                  AND start_date <= calendar.date)
WHERE
      calendar.date >= <yourStartDate>
  AND calendar.date <= <yourEndDate>
etc
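As a runnable sketch of the query above, the following builds the calendar and person template tables and runs the join against the question's sample data. It uses Python's sqlite3 module in place of MySQL, so the column types are simplified; the contents of calendar and person, the s2 alias, and the COALESCE that turns "no row yet" into 0 are assumptions added for the example.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE schedule (id INTEGER PRIMARY KEY, name TEXT,
                       start_date TEXT, availability INTEGER);
CREATE TABLE calendar (date TEXT PRIMARY KEY);
CREATE TABLE person (name TEXT PRIMARY KEY);
INSERT INTO schedule (name, start_date, availability) VALUES
  ('Tom', '2012-02-27', 100), ('Tom', '2012-02-29', 50), ('Ben', '2012-03-01', 80);
INSERT INTO person (name) VALUES ('Tom'), ('Ben');
""")

# Populate the calendar template with every day of the project.
start, end = date(2012, 2, 27), date(2012, 3, 2)
days = [(start + timedelta(days=i)).isoformat() for i in range((end - start).days + 1)]
conn.executemany("INSERT INTO calendar (date) VALUES (?)", [(d,) for d in days])
conn.commit()

rows = conn.execute("""
SELECT person.name, calendar.date, COALESCE(schedule.availability, 0)
FROM calendar
CROSS JOIN person
LEFT JOIN schedule
  ON schedule.name = person.name
 AND schedule.start_date = (SELECT MAX(s2.start_date)
                            FROM schedule AS s2
                            WHERE s2.name = person.name
                              AND s2.start_date <= calendar.date)
WHERE calendar.date BETWEEN ? AND ?
ORDER BY person.name DESC, calendar.date
""", (days[0], days[-1])).fetchall()
```

Without the COALESCE, Ben's rows before 2012-03-01 would come back with a NULL availability, because the LEFT JOIN finds no matching schedule row for those days.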
Often, however, it is more efficient to deal with it in one of two other ways:
Don't allow gaps in the data in the first place. Have a nightly batch process, or some other business logic, that ensures all relevant data points are populated.
Or deal with it in your client. Return each dimension in your report (date and name) as separate data sets to act as your templates, and then return the data as your final data set. Your client can iterate over the data and fill in the blanks as appropriate. It's more code, but can actually use fewer resources overall than trying to fill the gaps with SQL.
(If your client-side code does this slowly, post another question examining that code. Provided that the data is sorted, this is actually quite quick to do in most languages.)
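A minimal sketch of the client-side option, in Python: it walks each person's sorted schedule rows once and carries the last availability forward over every day in the range. The fill_forward function and its argument layout are hypothetical, and a person with no row yet is assumed to have availability 0, as in the question's expected output.

```python
from datetime import date, timedelta

def fill_forward(schedule, names, start, end):
    """Expand sparse (name, start_date, availability) rows to one row per person per day."""
    changes_by_name = {name: [] for name in names}
    for name, day, availability in sorted(schedule, key=lambda row: row[1]):
        changes_by_name[name].append((day, availability))

    result = []
    for name in names:
        changes = changes_by_name[name]
        i, current = 0, 0  # availability is 0 until the first schedule row applies
        day = start
        while day <= end:
            # Apply every change that has taken effect on or before this day.
            while i < len(changes) and changes[i][0] <= day:
                current = changes[i][1]
                i += 1
            result.append((name, day, current))
            day += timedelta(days=1)
    return result

schedule = [
    ("Tom", date(2012, 2, 27), 100),
    ("Tom", date(2012, 2, 29), 50),
    ("Ben", date(2012, 3, 1), 80),
]
filled = fill_forward(schedule, ["Tom", "Ben"], date(2012, 2, 27), date(2012, 3, 2))
```

Each person's rows are visited once, so after the initial sort the work is linear in the number of days plus schedule rows.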