A person gets a 10% commission on purchases made by the friends he has referred.
There are two tables:
Reference table
Transaction table
Reference Table
Person_id  Referrer_id
3          1
4          1
5          1
6          2
Transaction Table
Person_id  Amount  Action     Date
3          100     Purchase   10-20-2011
4          200     Purchase   10-21-2011
6          400     Purchase   12-15-2011
3          200     Purchase   12-30-2011
1          50      Commision  01-01-2012
1          10      Cm_Bonus   01-01-2012
2          20      Commision  01-01-2012
How can I get the following result set for Referrer_Person_id=1?
Month    Ref_Pur  Earn_Comm  Todate_Earn_Comm  BonusRecvd  Paid  Due
10-2011  300      30         30                0           0     30
11-2011  0        0          30                0           0     30
12-2011  200      20         50                0           0     50
01-2012  0        0          50                10          50    0
Labels used above are:
Ref_Pur = Total purchases made by referred friends in that month
Earn_Comm = 10% commission earned for that month
Todate_Earn_Comm = Running total of commission earned up to that month
The MySQL code that I wrote:
SELECT dx1.month,
       dx1.ref_pur,
       dx1.earn_comm,
       ( @cum_earn := @cum_earn + dx1.earn_comm ) as todate_earn_comm
FROM
(
    select date_format(`date`,'%Y-%m') as month,
           sum(amount) as ref_pur,
           (sum(amount)*0.1) as earn_comm
    from transaction tr, reference rf
    where tr.person_id=rf.person_id and
          tr.action='Purchase' and
          rf.referrer_id=1
    group by date_format(`date`,'%Y-%m')
    order by date_format(`date`,'%Y-%m')
) as dx1
JOIN (select @cum_earn:=0) e;
How can I extend the query to also include the BonusRecvd, Paid and Due transactions, which do not depend on the reference table?
And how can I also generate a row for the month '11-2011', even though no transactions occurred in that month?
If you want to include commission payments and bonuses into the results, you'll probably need to include corresponding rows (Action IN ('Commision', 'Cm_Bonus')) into the initial dataset you are using to calculate the results on. Or, at least, that's what I would do, and it might be like this:
SELECT t.Amount, t.Action, t.Date
FROM Transaction t LEFT JOIN Reference r ON t.Person_id = r.Person_id
WHERE r.Referrer_id = 1 AND t.Action = 'Purchase'
OR t.Person_id = 1 AND t.Action IN ('Commision', 'Cm_Bonus')
And when calculating monthly SUMs, you can use CASE expressions to distinguish among Amounts related to different types of Action. This is what the corresponding part of the query might look like:
…
IFNULL(SUM(CASE Action WHEN 'Purchase' THEN Amount END) , 0) AS Ref_Pur,
IFNULL(SUM(CASE Action WHEN 'Purchase' THEN Amount END) * 0.1, 0) AS Earn_Comm,
IFNULL(SUM(CASE Action WHEN 'Cm_Bonus' THEN Amount END) , 0) AS BonusRecvd,
IFNULL(SUM(CASE Action WHEN 'Commision' THEN Amount END) , 0) AS Paid
…
When calculating the Due values, you can initialise another variable and use it much like @cum_earn, except that you also need to subtract Paid, something like this:
(@cum_due := @cum_due + Earn_Comm - Paid) AS Due
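To check that the CASE-based sums and the running Due calculation described above produce the figures from the question, here is a minimal sketch. It makes several assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the Transaction table is renamed tx (a hypothetical name, chosen because TRANSACTION is an SQL keyword), dates are stored as ISO strings, and the running totals are accumulated in Python rather than with user variables. The missing month is not handled here.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL tables (assumption, not the asker's setup).
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE reference (person_id INT, referrer_id INT);
INSERT INTO reference VALUES (3, 1), (4, 1), (5, 1), (6, 2);
CREATE TABLE tx (person_id INT, amount INT, action TEXT, date TEXT);
INSERT INTO tx VALUES
  (3, 100, 'Purchase',  '2011-10-20'),
  (4, 200, 'Purchase',  '2011-10-21'),
  (6, 400, 'Purchase',  '2011-12-15'),
  (3, 200, 'Purchase',  '2011-12-30'),
  (1,  50, 'Commision', '2012-01-01'),
  (1,  10, 'Cm_Bonus',  '2012-01-01'),
  (2,  20, 'Commision', '2012-01-01');
""")

# One pass over the combined subset; CASE picks the amounts per Action type.
monthly = con.execute("""
SELECT strftime('%Y-%m', t.date) AS month,
       COALESCE(SUM(CASE t.action WHEN 'Purchase'  THEN t.amount END), 0) AS ref_pur,
       COALESCE(SUM(CASE t.action WHEN 'Cm_Bonus'  THEN t.amount END), 0) AS bonus_recvd,
       COALESCE(SUM(CASE t.action WHEN 'Commision' THEN t.amount END), 0) AS paid
FROM tx t LEFT JOIN reference r ON t.person_id = r.person_id
WHERE (r.referrer_id = 1 AND t.action = 'Purchase')
   OR (t.person_id = 1 AND t.action IN ('Commision', 'Cm_Bonus'))
GROUP BY month
ORDER BY month
""").fetchall()

# Running totals, playing the role of @cum_earn and @cum_due.
report = []
todate = due = 0
for month, ref_pur, bonus_recvd, paid in monthly:
    earn = ref_pur // 10          # 10% commission; sample amounts are multiples of 10
    todate += earn
    due += earn - paid
    report.append((month, ref_pur, earn, todate, bonus_recvd, paid, due))

for row in report:
    print(row)
```

The three rows that come back match the 10-2011, 12-2011 and 01-2012 rows of the desired result set; only the empty month is still absent.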
One last problem seems to be missing months. To address it, I would do the following:
1. Get the first and the last date from the subset to be processed (as obtained by the query at the beginning of this post).
2. Get the corresponding month for each of the dates (i.e. another date which is merely the first of the same month).
3. Using a numbers table, generate a list of months covering the two calculated in the previous step.
4. Filter out the months that are present in the subset to be processed and use the remaining ones to add dummy transactions to the subset.
As you can see, the "subset to be processed" needs to be touched twice when performing these steps. So, for efficiency, I would insert that subset into a temporary table and use that table, instead of executing the same (sub)query several times.
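The month-filling steps can be sketched roughly as follows. This is only an illustration under assumptions: it runs on SQLite through Python's sqlite3 module rather than MySQL, the table name subset and the small hand-filled numbers table are hypothetical, and dates are ISO strings. In MySQL the date arithmetic would be written differently, but the shape of the query is the same.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- A small numbers table, filled by hand for this sketch.
CREATE TABLE numbers (n INT);
INSERT INTO numbers VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11);
-- The "subset to be processed", stored once so it can be read twice.
CREATE TABLE subset (amount INT, action TEXT, date TEXT);
INSERT INTO subset VALUES
  (100, 'Purchase',  '2011-10-20'),
  (200, 'Purchase',  '2011-10-21'),
  (200, 'Purchase',  '2011-12-30'),
  (50,  'Commision', '2012-01-01'),
  (10,  'Cm_Bonus',  '2012-01-01');
""")

# Steps 1-3: first/last month, then one row per month in between.
# Step 4 (first half): keep only months that have no transaction yet.
missing = con.execute("""
SELECT date(b.first_month, '+' || n.n || ' months') AS month_start
FROM (SELECT date(MIN(date), 'start of month') AS first_month,
             date(MAX(date), 'start of month') AS last_month
      FROM subset) b
JOIN numbers n
  ON date(b.first_month, '+' || n.n || ' months') <= b.last_month
WHERE strftime('%Y-%m', date(b.first_month, '+' || n.n || ' months'))
      NOT IN (SELECT strftime('%Y-%m', date) FROM subset)
ORDER BY month_start
""").fetchall()

# Step 4 (second half): add a zero-amount dummy transaction per missing month.
con.executemany("INSERT INTO subset VALUES (0, 'Purchase', ?)", missing)

months = [r[0] for r in con.execute(
    "SELECT DISTINCT strftime('%Y-%m', date) FROM subset ORDER BY 1")]
print(missing, months)
```

With the dummy rows in place, a monthly GROUP BY over the subset yields a row for every month, and the zero amounts leave the sums untouched.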
The numbers table mentioned in Step #3 is a tool that I would recommend always keeping handy. You only need to initialise it once, and its uses for you may turn out numerous, if you pardon the pun. Here's but one way to populate a numbers table:
CREATE TABLE numbers (n int);
INSERT INTO numbers (n) SELECT 0;
INSERT INTO numbers (n) SELECT cnt + n FROM numbers, (SELECT COUNT(*) AS cnt FROM numbers) s;
INSERT INTO numbers (n) SELECT cnt + n FROM numbers, (SELECT COUNT(*) AS cnt FROM numbers) s;
INSERT INTO numbers (n) SELECT cnt + n FROM numbers, (SELECT COUNT(*) AS cnt FROM numbers) s;
INSERT INTO numbers (n) SELECT cnt + n FROM numbers, (SELECT COUNT(*) AS cnt FROM numbers) s;
INSERT INTO numbers (n) SELECT cnt + n FROM numbers, (SELECT COUNT(*) AS cnt FROM numbers) s;
INSERT INTO numbers (n) SELECT cnt + n FROM numbers, (SELECT COUNT(*) AS cnt FROM numbers) s;
INSERT INTO numbers (n) SELECT cnt + n FROM numbers, (SELECT COUNT(*) AS cnt FROM numbers) s;
INSERT INTO numbers (n) SELECT cnt + n FROM numbers, (SELECT COUNT(*) AS cnt FROM numbers) s;
/* repeat as necessary; every repeated line doubles the number of rows */
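As a quick sanity check of the doubling trick, the sketch below replays the same statements against an in-memory SQLite database from Python (an assumption made for convenience; the original targets MySQL) and confirms that eight repetitions leave 256 consecutive values.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE numbers (n INT)")
con.execute("INSERT INTO numbers (n) SELECT 0")

# Each pass appends a shifted copy of the current rows, doubling the table.
doubling = """
INSERT INTO numbers (n)
SELECT cnt + n FROM numbers, (SELECT COUNT(*) AS cnt FROM numbers) s
"""
for _ in range(8):
    con.execute(doubling)

row_count, lowest, highest, distinct = con.execute(
    "SELECT COUNT(*), MIN(n), MAX(n), COUNT(DISTINCT n) FROM numbers"
).fetchone()
print(row_count, lowest, highest, distinct)
```

Starting from one row, k passes give 2^k rows numbered 0 to 2^k - 1, so a handful of extra passes is enough for most purposes.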
And that seems to be it. I will not post a complete solution here, so as not to spoil your chance to apply the above suggestions in your own way, in case you are keen to. But if you are struggling or just want to verify that they can be applied to the required effect, you can try this SQL Fiddle page for a complete solution "in action".
SQL Query I Am Working With
Result from the table
What I am trying to accomplish is that instead of just having values for places where num_opens is actually counted, I would want to have it show all potential num_opens values between the minimum and maximum value, and their total to be 0. For example, in the photo we see a jump between
num_opens: 7 Total: 1
num_opens: 10 Total: 1
But I would like it to be
num_opens: 7 Total: 1
num_opens: 8 Total: 0
num_opens: 9 Total: 0
num_opens: 10 Total: 1
and similarly for all potential num_opens values between the minimum and maximum (11 - 15, 15 - 31, 31 - 48). It is tricky because the maximum value could be different every day (today the max is 48, but tomorrow it could be 37), so I would need to pull the max value somehow.
Thank you!
You can use generate_array() and unnest():
select n as num_opens, count(t.num_opens) as total
from (select min(num_opens) as min_no, max(num_opens) as max_no
      from t
     ) x cross join
     unnest(generate_array(x.min_no, x.max_no)) n left join
     t
     on t.num_opens = n
group by n
order by n;
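For readers without BigQuery at hand, the same idea (build every value between the minimum and the maximum, then LEFT JOIN the real data onto it) can be tried out locally. The sketch below is an assumption-laden stand-in: it uses SQLite through Python's sqlite3 module, replaces generate_array() with a recursive CTE, and uses made-up sample values for the table t.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE t (num_opens INT);
INSERT INTO t VALUES (7), (10), (10), (12);   -- made-up sample data
""")

rows = con.execute("""
WITH RECURSIVE series(n) AS (
  SELECT MIN(num_opens) FROM t               -- start at the smallest value
  UNION ALL
  SELECT n + 1 FROM series
  WHERE n < (SELECT MAX(num_opens) FROM t)   -- stop at the largest value
)
SELECT s.n AS num_opens, COUNT(t.num_opens) AS total
FROM series s
LEFT JOIN t ON t.num_opens = s.n
GROUP BY s.n
ORDER BY s.n
""").fetchall()
print(rows)
```

Counting t.num_opens rather than * is what makes the gap rows come out as 0: the LEFT JOIN supplies NULL there, and COUNT ignores NULLs. Because the bounds are read from the data, the range follows the maximum as it changes from day to day.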
You need a reference table to start with. From your picture you have something called users, but really any (big enough) table will do.
So to start, you'll build the reference table using a rank() or row_number() function. Or, if your users.id has no gaps, it's even easier to use that directly.
SELECT *, rank() OVER (ORDER BY id) as reference_value FROM users
This will generate a table 1....n for users.
Now you join onto that, but count from the joined in table:
SELECT
a.reference_value, count(b.num_opens) as total
FROM
(SELECT rank() OVER (ORDER BY id) as reference_value from users) a
LEFT JOIN
[whatever table] b ON a.reference_value = b.num_opens
GROUP BY
a.reference_value
But this is too many rows! You definitely have more users than these event counts. So throw a quick filter in there.
SELECT
a.reference_value, count(b.num_opens) as total
FROM
(SELECT rank() OVER (ORDER BY id) as reference_value from users) a
LEFT JOIN
[whatever table] b ON a.reference_value = b.num_opens
WHERE
a.reference_value <= (SELECT max(num_opens) FROM [whatever table])
GROUP BY
a.reference_value
I want to get relative counts/frequency of values (can be many) in the column.
From this toy table numbers:
num
1
2
3
1
1
2
1
0
This one:
num | count
0 | 0.125
1 | 0.5
2 | 0.25
3 | 0.125
I can do this with a variable and two queries:
SET @total = (SELECT COUNT(*) FROM numbers);
SELECT num, ROUND(COUNT(*) / @total, 3) AS count
FROM numbers
GROUP BY num
ORDER BY num ASC
But how can I get the results in one query (without listing all the possible values of num)?
If I am querying joins of several tables, then even getting a total number of rows becomes quite long and ugly.
EDIT: This was tested in MS SQL Server; I misread the question!
You can try this:
--DROP TABLE numbers
CREATE TABLE numbers(num decimal(16,3))
INSERT INTO numbers VALUES(1)
INSERT INTO numbers VALUES(2)
INSERT INTO numbers VALUES(3)
INSERT INTO numbers VALUES(1)
INSERT INTO numbers VALUES(1)
INSERT INTO numbers VALUES(2)
INSERT INTO numbers VALUES(1)
INSERT INTO numbers VALUES(0)
SELECT
num,
CAST(numCount as DECIMAL(16,2)) / CAST(sum(numCount) over() AS decimal(16,2)) frequency
FROM (
SELECT
num,
count(num) numCount
FROM
numbers
GROUP BY
num
) numbers
num frequency
0.000 0.1250000000000000000
1.000 0.5000000000000000000
2.000 0.2500000000000000000
3.000 0.1250000000000000000
You can use windowing functions -
SELECT DISTINCT num,
ROUND(CAST(COUNT(1) OVER (Partition by num) AS DECIMAL) / CAST(COUNT(1)OVER() AS DECIMAL),3) AS [count]
FROM numbers
ORDER BY num ASC
COUNT(num) would give the same results. It's my personal preference to count a supplied value per row rather than the value in the rows; the partitioning determines which rows are included in the count.
Note the counts need to be cast as decimal, otherwise your division will be integer division, giving you wrong numbers.
Using DISTINCT instead of GROUP lets your windowing function apply to the whole table, not just each group within that table, and still only returns one result per num.
This is about the same number of keystrokes, and about the same performance, but it is only one statement:
SELECT n.num, ROUND(COUNT(*) / t.total, 3) AS count
FROM ( SELECT COUNT(*) AS total FROM numbers ) AS t
JOIN numbers AS n
GROUP BY n.num
ORDER BY n.num ASC
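Another single-statement variant, not taken from the answers above, is to put the total into a scalar subquery. The sketch below tries it on the toy table, assuming SQLite via Python's sqlite3 module as the test bed; the `* 1.0` is there because SQLite would otherwise perform integer division.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE numbers (num INT);
INSERT INTO numbers VALUES (1), (2), (3), (1), (1), (2), (1), (0);
""")

# The scalar subquery is evaluated once and reused for every group.
rows = con.execute("""
SELECT num,
       ROUND(COUNT(*) * 1.0 / (SELECT COUNT(*) FROM numbers), 3) AS freq
FROM numbers
GROUP BY num
ORDER BY num
""").fetchall()
print(rows)
```

The drawback mentioned in the question still applies: if numbers is really a join of several tables, that join has to be repeated inside the subquery, so a window function or a derived table may read better in that case.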
I have 2 different tables called observations and intervals.
observations:
id  type    date
1   recess  03.05.2011 17:00
2   recess  03.06.2011 12:00
intervals:
id  observation_id  value
1   1               5
2   1               8
3   2               4
4   2               4
I want a view that will display:
observation_id  percent_positive ((count where value = 5)/(total number of observations))
1               .5
2               0
I know
Select observation_id, Count(*) from intervals where value = 5 Group by observation_id
will give me:
1 1
1 0
and
Select observation_id, Count(*) from intervals Group by observation_id
will give me:
1 2
2 2
So how do I combine these to create a view with the percent_positive variable I'm looking for?
You can use joins to fetch data from two tables that share a common column.
For more detail, please read "Multiple values from multiple Tables".
This gave me your desired result. Not proficient enough in SQL to determine if this is the optimal way of solving the issue though.
SELECT
    observation_id as obs,
    (SELECT COUNT(*) FROM intervals WHERE observation_id = obs AND value = 5)/(SELECT COUNT(*) FROM intervals WHERE observation_id = obs) as percent
FROM observations
JOIN intervals ON observations.id = intervals.observation_id
GROUP BY observation_id;
SELECT
i.observation_id,
SUM(IF(i.value=5,1,0)) / counts.num as 'percent_positive'
FROM intervals i
inner join (
select observation_id, count(1) as num from intervals group by observation_id
) counts on counts.observation_id = i.observation_id
group by i.observation_id
order by i.observation_id
;
That oughta get you close, can't actually run to test at the moment. I'm not sure about the significance of the value 5 meaning positive, so the i.value=5 test might need to be modified to do what you want. This will only include observation IDs with intervals that refer to them; if you want ALL observations then you'll need to join that table (select from that table and left join the others, to be precise). Of course the percentage for those IDs will be 0 divided by 0 anyway...
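The "select from observations and left join the others" variant mentioned above can be sketched as a view. The assumptions here are that SQLite (via Python's sqlite3 module) stands in for the real database, that the view name percent_positive_view is hypothetical, and that observation 3 is an invented row with no intervals, added only to show the 0-divided-by-0 case. The comparison value = 5 evaluates to 1 or 0 in SQLite (and in MySQL), so its average is the fraction of matching rows.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE observations (id INT, type TEXT, date TEXT);
CREATE TABLE intervals (id INT, observation_id INT, value INT);
INSERT INTO observations VALUES
  (1, 'recess', '03.05.2011 17:00'),
  (2, 'recess', '03.06.2011 12:00'),
  (3, 'recess', '03.07.2011 09:00');  -- hypothetical: has no intervals
INSERT INTO intervals VALUES (1, 1, 5), (2, 1, 8), (3, 2, 4), (4, 2, 4);

CREATE VIEW percent_positive_view AS
SELECT o.id AS observation_id,
       AVG(i.value = 5) AS percent_positive  -- mean of 1/0 flags
FROM observations o
LEFT JOIN intervals i ON i.observation_id = o.id
GROUP BY o.id;
""")

rows = con.execute(
    "SELECT * FROM percent_positive_view ORDER BY observation_id"
).fetchall()
print(rows)
```

An observation without intervals comes back as NULL rather than 0, since AVG has nothing to average; wrap the expression in COALESCE if 0 is preferred.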
Consider a student table with 104 rows in it. I need to create groups with a minimum of 10 students in each group. With 104 students, I would end up with 10 groups of 10 students and 1 group of 4 students if I iterated over the students and assigned them to groups in order. There's a rule that a group of remaining students cannot have fewer than 5 students in it (in this case the last group consists of 4 students). There are two possible approaches I'm considering:
Roll up the last group that has less than 5 students and assign each of them to any groups, or
Spread the last group evenly to any groups.
How do I achieve any of these? Many thanks.
Eric
You can use ntile.
Distributes the rows in an ordered partition into a specified number
of groups. The groups are numbered, starting at one. For each row,
NTILE returns the number of the group to which the row belongs.
Some sample code:
declare @NumberOfStudents int
declare @StudentsPerGroup int
set @StudentsPerGroup = 10
set @NumberOfStudents = 104
select StudentID,
       ntile(@NumberOfStudents / @StudentsPerGroup) over(order by StudentID) as GroupID
from Students
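To see what group sizes this produces, here is a small Python sketch that mimics how NTILE distributes rows: when the row count is not divisible by the number of groups, the earlier groups each receive one extra row. The function name ntile is purely illustrative and is not a database call.

```python
from collections import Counter

def ntile(row_count, groups):
    """Return the 1-based group number of each row, in row order,
    distributing rows the way NTILE does (larger groups first)."""
    base, extra = divmod(row_count, groups)
    result = []
    for g in range(1, groups + 1):
        size = base + (1 if g <= extra else 0)  # first `extra` groups get one more
        result.extend([g] * size)
    return result

# 104 students, 104 / 10 = 10 groups (integer division, as in the T-SQL above)
assignment = ntile(104, 104 // 10)
sizes = Counter(assignment)
print(sorted(sizes.items()))
```

For 104 students this gives four groups of 11 and six groups of 10, so no undersized leftover group appears.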
Try it out on SE-Data.
Here is variant 2. The first part prepares counters. As I don't have any data on students, I resorted to creating a temporary table of @maxStudents rows with only one column, ID.
The first CTE (students) generates a list of @maxStudents students. The second (s) extracts the students and assigns each a row number (obviously not necessary here, but essential when you plug in your own query that retrieves students). It also returns the total number of students.
The third part places students into groups. Students in the last group are relocated to other groups if that last group has fewer than @minGroupSize members. Variant 1 can be achieved by replacing the THEN part of the CASE expression with, for example, 1 to place them all in group one.
declare @group_size int
set @group_size = 10
declare @maxStudents int
set @maxStudents = 104
declare @minGroupSize int
set @minGroupSize = 5
;with students as (
    select 1 id
    union all
    select 2 * id + b
    from students cross join (select 0 b union all select 1) b
    where 2 * id + b <= @maxStudents
),
s as (
    select students.id, row_number() over(order by students.id) - 1 rowNumber, count (*) over () TotalStudents
    from students
)
select s.id StudentID,
       case when TotalStudents % @group_size < @minGroupSize
             and rowNumber >= (TotalStudents / @group_size * @group_size)
            then rowNumber - (TotalStudents / @group_size * @group_size)
            else rowNumber / @group_size
       end + 1 Group_number
from s
order by 2, 1
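The CASE expression is easier to reason about outside SQL. The sketch below restates it as a plain Python function (the name group_number and its parameter names are illustrative only) and tries it with 104 students, where the leftover group is too small, and with 107, where it is large enough to stand on its own.

```python
from collections import Counter

def group_number(row_number, total, group_size=10, min_group_size=5):
    """Python restatement of the CASE expression; row_number is 0-based."""
    full = total // group_size * group_size   # rows that fit into full groups
    if total % group_size < min_group_size and row_number >= full:
        # Leftover row of an undersized last group: spread over groups 1, 2, ...
        return row_number - full + 1
    return row_number // group_size + 1

sizes_104 = Counter(group_number(r, 104) for r in range(104))
sizes_107 = Counter(group_number(r, 107) for r in range(107))
print(sorted(sizes_104.items()))
print(sorted(sizes_107.items()))
```

With 104 students the four leftovers join groups 1 to 4, giving four groups of 11 and six of 10. With 107 students the seven leftovers form an eleventh group, since 7 is not below the minimum of 5.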
I have three tables in a MySQL database:
RECHARGE with these columns: rid, uid, res_id, agent_id, batch_id, rcard_order, serialno, email, units, bankid, paydate, slipno, rpin, amtpd, bonus, description, crSender, crSenderId, transaction_ref, rechargeDate, processed
SENT with these columns: sendid, uid, res_id, recipients, volume, ffdaily, message, sender, msgtype, flash, mob_field, wapurl, date
BILL with these columns: bid, uid, email, unitBals, lastusedate
The question is this: I want a query that subtracts the sum of volume in the SENT table from the sum of units in the RECHARGE table and uses the result to update the unitBals column in the BILL table, where the key joining the three tables is uid.
I used the query below, but it does not give the same answer as when I compute sum(units) and sum(volume) separately and do the subtraction myself:
update bill set unitbals = (SELECT sum( recharge.units ) - sum( sent.volume )
                            FROM sent, recharge
                            WHERE sent.uid = recharge.uid)
where email = 'info@dunmininu.com'
There are two problems here. First, from the fact that you are using sum, I take it that there can be more than one Recharge record for a given Uid and more than one Sent record for a given Uid. If this is true, then when you do the join, you are not getting all the Recharges plus all the Sents, you are getting every combination of a Recharge and a Sent.
For example, suppose for a given Uid you have the following records:
Recharge:
Uid Units
42 2
42 3
42 4
Sent
Uid Volume
42 1
42 6
Then a query
select recharge.units, sent.volume
from recharge, sent
where recharge.uid=sent.uid
will give
Units Volume
2 1
2 6
3 1
3 6
4 1
4 6
So doing sum(units)-sum(volume) will give 18-21 = -3, whereas the figure you actually want is (2+3+4)-(1+6) = 9-7 = 2.
Also, you're doing nothing to connect the Uid of the Sent and Recharge to the Uid of the Bill. Thus, for any given Bill, you're processing records for ALL uids. The Uid of the Bill is never considered.
I think what you want is something more like:
update bill
set unitbals = (SELECT sum( recharge.units ) from recharge where recharge.uid=bill.uid)
             - (select sum(sent.volume) from sent where sent.uid=bill.uid)
where email='info@dunmininu.com';
That is, take the sum of all the recharges for this uid, minus the sum of all the sents.
Note that this replaces the old value of Unitbals. It's also possible that you meant to say "unitbals=unitbals +" etc.
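The difference between the two approaches can be reproduced with the sample rows for uid 42. The sketch below is only an illustration: it uses SQLite through Python's sqlite3 module instead of MySQL, cuts the tables down to the columns involved, and uses a placeholder email address. The COALESCE wrappers are an addition that is not part of the answer above; without them the balance would become NULL for a user who has recharges but no sent messages, or the other way round.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE recharge (uid INT, units INT);
CREATE TABLE sent (uid INT, volume INT);
CREATE TABLE bill (uid INT, email TEXT, unitBals INT);
INSERT INTO recharge VALUES (42, 2), (42, 3), (42, 4);
INSERT INTO sent VALUES (42, 1), (42, 6);
INSERT INTO bill VALUES (42, 'user@example.com', 0);  -- placeholder address
""")

# Joining first pairs every recharge with every sent row (3 x 2 = 6 rows).
joined = con.execute("""
SELECT SUM(r.units) - SUM(s.volume)
FROM recharge r JOIN sent s ON r.uid = s.uid
""").fetchone()[0]

# Summing each table on its own, correlated to the bill row being updated.
con.execute("""
UPDATE bill
SET unitBals =
    COALESCE((SELECT SUM(units)  FROM recharge WHERE recharge.uid = bill.uid), 0)
  - COALESCE((SELECT SUM(volume) FROM sent     WHERE sent.uid     = bill.uid), 0)
WHERE email = 'user@example.com'
""")
balance = con.execute("SELECT unitBals FROM bill WHERE uid = 42").fetchone()[0]
print(joined, balance)
```

The joined query reports -3, while the update with separate sums stores 2, which is the 9 - 7 expected from the sample data.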
I think you need to sum the two tables separately:
update bill
set unitbals =
    ( SELECT sum( recharge.units )
      FROM recharge
      WHERE bill.uid = recharge.uid
    ) -
    ( SELECT sum( sent.volume )
      FROM sent
      WHERE bill.uid = sent.uid
    )
where email = 'info@dunmininu.com'