Simple Percentage Column in SQL - mysql

I am relatively new to SQL and looking to pick up a few simple tricks. I have managed to create a query that selects each different type of car permit (chargeType), counts the number issued for each one (num), and adds a column that shows the total number of permits issued (total). The code is below.
SELECT chargeType,
COUNT(chargeType) AS num,
(SELECT COUNT(chargeType)
FROM permit) AS total
FROM permit
GROUP BY chargeType
I now want to add a final column which shows the percentage of each permit type issued. So the number of each permit type divided by the total multiplied by 100, but I am struggling to do it. Can anybody help?

Try something like this
SELECT chargeType,
num,
total,
num / NULLIF(total, 0) * 100 AS Percentage
FROM (SELECT chargeType,
Count(chargeType) AS num,
(SELECT Count(chargeType)
FROM permit) AS total
FROM permit
GROUP BY chargeType) a
NULLIF is used to avoid a divide-by-zero exception.

This will work. The advantage of this solution is that there is no subquery in the SELECT list.
SELECT *, (num * 100 / total) as percentage
FROM
(
SELECT
chargeType,
COUNT(chargeType) AS num,
total
FROM
(SELECT COUNT(chargeType) as total FROM permit) ttotal
CROSS JOIN
permit
GROUP BY
chargeType, total
) tsub
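On MySQL 8.0 or later, a window function can stand in for both the scalar subquery and the cross join. The following is only a sketch, assuming the same permit table and a server version that supports window functions; it is not taken from the answers above.

```sql
-- Assumes MySQL 8.0+ (window functions are not available in 5.x).
SELECT chargeType,
       COUNT(chargeType) AS num,
       -- SUM(...) OVER () adds the per-group counts up across all groups
       SUM(COUNT(chargeType)) OVER () AS total,
       COUNT(chargeType) * 100
           / NULLIF(SUM(COUNT(chargeType)) OVER (), 0) AS percentage
FROM permit
GROUP BY chargeType;
```

The empty OVER () makes the window span every group, so total is the same on each row and the table is only read once.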

Related

SQL - How To Add Rows Between a Range of Numbers and Make Their Value 0

SQL Query I Am Working With
Result from the table
What I am trying to accomplish is that instead of just having values for places where num_opens is actually counted, I would want to have it show all potential num_opens values between the minimum and maximum value, and their total to be 0. For example, in the photo we see a jump between
num_opens: 7 Total: 1
num_opens: 10 Total: 1
But I would like it to be
num_opens: 7 Total: 1
num_opens: 8 Total: 0
num_opens: 9 Total: 0
num_opens: 10 Total: 1
and similarly for all potential num_opens values between the minimum and maximum (11 - 15, 15 - 31, 31 - 48). It is tricky because everyday the maximum value could be different (today the max is 48, but tomorrow it could be 37), so I would need to pull the max value somehow.
Thank you!
You can use generate_array() and unnest():
select n as num_opens, count(t.num_opens) as total
from (select min(num_opens) as min_no, max(num_opens) as max_no
from t
) x cross join
unnest(generate_array(x.min_no, x.max_no)) n left join
t
on t.num_opens = n
group by n;
You need a reference table to start with. From your picture you have something called users, but really any (big enough) table will do.
So to start, you'll build the reference table using a rank() or row_number() function. Or, if your users.id has no gaps, it's even easier to use that directly.
SELECT *, rank() OVER (ORDER BY id) as reference_value FROM users
This will generate a table 1....n for users.
Now you join onto that, but count from the joined in table:
SELECT
a.reference_value, count(b.num_opens) as total
FROM
(SELECT rank() OVER (ORDER BY id) as reference_value from users) a
LEFT JOIN
[whatever table] b ON a.reference_value = b.num_opens
GROUP BY
a.reference_value
But this is too many rows! You definitely have more users than these event counts. So throw a quick filter in there.
SELECT
a.reference_value, count(b.num_opens) as total
FROM
(SELECT rank() OVER (ORDER BY id) as reference_value from users) a
LEFT JOIN
[whatever table] b ON a.reference_value = b.num_opens
WHERE
a.reference_value <= (SELECT max(num_opens) FROM [whatever table])
GROUP BY
a.reference_value
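If no suitable reference table is available and the database supports recursive common table expressions (for example MySQL 8.0+, PostgreSQL or SQLite), the range can also be generated on the fly. This is a sketch under that assumption, where t stands for the table holding num_opens, as in the first answer:

```sql
-- seq walks from the smallest to the largest num_opens, one row per value.
WITH RECURSIVE seq (n, max_n) AS (
    SELECT MIN(num_opens), MAX(num_opens) FROM t
    UNION ALL
    SELECT n + 1, max_n FROM seq
    WHERE n < max_n
)
SELECT seq.n AS num_opens,
       COUNT(t.num_opens) AS total   -- counts 0 where no row matches
FROM seq
LEFT JOIN t ON t.num_opens = seq.n
GROUP BY seq.n
ORDER BY seq.n;
```

Because the bounds are read from t each time the query runs, a different maximum tomorrow needs no change to the query. Note that MySQL limits recursion to 1000 levels by default (cte_max_recursion_depth), which is ample for a range like 1 to 48.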

Use max column value in order by

I'm trying to order a table by two columns, each with a different weighting. The first is uptime, which is a value between 0 and 1 and has a weighting of 0.3. The second is votes, which is a non-negative integer and has a weighting of 0.7.
The weighting needs to be multiplied by a value between 0-1, so I'm going to get this for votes by dividing the number of votes for each row by the maximum number of votes held by any row.
This is my query so far, and it almost works:
SELECT addr
FROM servers
ORDER BY (0.3 * uptime) +
(0.7 * (votes / 100)) DESC
The 100 is hard-coded and should be the maximum value of votes. Using MAX(votes) makes the query return only the record with highest number of votes. Can this be done in a single query?
You could use a subquery for selecting the maximum value of votes
SELECT addr
FROM servers
ORDER BY (0.3 * uptime) +
(0.7 * (votes / (SELECT MAX(votes) FROM servers))) DESC
Define a variable and use it:
DECLARE @maxVotes int
SELECT @maxVotes = MAX(votes) from servers
SELECT addr
FROM servers
ORDER BY (0.3 * uptime) +
(0.7 * (votes / @maxVotes)) DESC
or use a subquery in the order by:
SELECT addr
FROM servers
ORDER BY (0.3 * uptime) +
(0.7 * ( votes / (SELECT MAX(votes) FROM servers))) DESC
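One caveat with the queries above: if votes is an integer column, engines such as SQL Server, PostgreSQL and SQLite perform integer division, so votes / MAX(votes) evaluates to 0 for every row except the one holding the maximum. MySQL's / operator returns a decimal result, so it is not affected. A defensive sketch, assuming the same servers table:

```sql
SELECT addr
FROM servers
ORDER BY (0.3 * uptime) +
         -- * 1.0 forces decimal division;
         -- NULLIF guards against MAX(votes) being 0
         (0.7 * (votes * 1.0
                 / NULLIF((SELECT MAX(votes) FROM servers), 0))) DESC;
```

If no server has any votes yet, the votes term becomes NULL for every row, so the whole sort key is NULL and the order is effectively arbitrary until votes arrive.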

MySQL query to generate a commission report based on referred members

A person gets a 10% commission on purchases made by friends he has referred.
There are two tables: a Reference table and a Transaction table.
Reference Table
Person_id  Referrer_id
3          1
4          1
5          1
6          2
Transaction Table
Person_id  Amount  Action     Date
3          100     Purchase   10-20-2011
4          200     Purchase   10-21-2011
6          400     Purchase   12-15-2011
3          200     Purchase   12-30-2011
1          50      Commision  01-01-2012
1          10      Cm_Bonus   01-01-2012
2          20      Commision  01-01-2012
How can I get the following result set for Referrer_id = 1?
Month    Ref_Pur  Earn_Comm  Todate_Earn_Comm  BonusRecvd  Paid  Due
10-2011  300      30         30                0           0     30
11-2011  0        0          30                0           0     30
12-2011  200      20         50                0           0     50
01-2012  0        0          50                10          50    0
Labels used above are:
Ref_Pur = Total purchases by referred friends for that month
Earn_Comm = 10% commission earned for that month
Todate_Earn_Comm = Running total of commission earned up to that month
The MySQL code that I wrote:
SELECT dx1.month,
dx1.ref_pur,
dx1.earn_comm,
( @cum_earn := @cum_earn + dx1.earn_comm ) as todate_earn_comm
FROM
(
select date_format(`date`,'%Y-%m') as month,
sum(amount) as ref_pur ,
(sum(amount)*0.1) as earn_comm
from transaction tr, reference rf
where tr.person_id=rf.person_id and
tr.action='Purchase' and
rf.referrer_id=1
group by date_format(`date`,'%Y-%m')
order by date_format(`date`,'%Y-%m')
)as dx1
JOIN (select @cum_earn:=0)e;
How can I extend the query to also include the BonusRecvd, Paid and Due columns, which do not depend on the reference table?
And how can I also generate a row for the month '11-2011', even though no transactions occurred in that month?
If you want to include commission payments and bonuses into the results, you'll probably need to include corresponding rows (Action IN ('Commision', 'Cm_Bonus')) into the initial dataset you are using to calculate the results on. Or, at least, that's what I would do, and it might be like this:
SELECT t.Amount, t.Action, t.Date
FROM Transaction t LEFT JOIN Reference r ON t.Person_id = r.Person_id
WHERE r.Referrer_id = 1 AND t.Action = 'Purchase'
OR t.Person_id = 1 AND t.Action IN ('Commision', 'Cm_Bonus')
And when calculating monthly SUMs, you can use CASE expressions to distinguish among Amounts related to different types of Action. This is what the corresponding part of the query might look like:
…
IFNULL(SUM(CASE Action WHEN 'Purchase' THEN Amount END) , 0) AS Ref_Pur,
IFNULL(SUM(CASE Action WHEN 'Purchase' THEN Amount END) * 0.1, 0) AS Earn_Comm,
IFNULL(SUM(CASE Action WHEN 'Cm_Bonus' THEN Amount END) , 0) AS BonusRecvd,
IFNULL(SUM(CASE Action WHEN 'Commision' THEN Amount END) , 0) AS Paid
…
When calculating the Due values, you can initialise another variable and use it quite similarly to @cum_earn, except you'll also need to subtract Paid, something like this:
(@cum_due := @cum_due + Earn_Comm - Paid) AS Due
One last problem seems to be missing months. To address it, I would do the following:
1. Get the first and the last date from the subset to be processed (as obtained by the query at the beginning of this post).
2. Get the corresponding month for each of the dates (i.e. another date which is merely the first of the same month).
3. Using a numbers table, generate a list of months covering the two calculated in the previous step.
4. Filter out the months that are present in the subset to be processed and use the remaining ones to add dummy transactions to the subset.
As you can see, the "subset to be processed" needs to be touched twice when performing these steps. So, for efficiency, I would insert that subset into a temporary table and use that table, instead of executing the same (sub)query several times.
A numbers table, as mentioned in Step #3, is a tool that I would recommend always keeping handy. You only need to initialise it once, and its uses may turn out to be numerous, if you'll pardon the pun. Here's but one way to populate a numbers table:
CREATE TABLE numbers (n int);
INSERT INTO numbers (n) SELECT 0;
INSERT INTO numbers (n) SELECT cnt + n FROM numbers, (SELECT COUNT(*) AS cnt FROM numbers) s;
INSERT INTO numbers (n) SELECT cnt + n FROM numbers, (SELECT COUNT(*) AS cnt FROM numbers) s;
INSERT INTO numbers (n) SELECT cnt + n FROM numbers, (SELECT COUNT(*) AS cnt FROM numbers) s;
INSERT INTO numbers (n) SELECT cnt + n FROM numbers, (SELECT COUNT(*) AS cnt FROM numbers) s;
INSERT INTO numbers (n) SELECT cnt + n FROM numbers, (SELECT COUNT(*) AS cnt FROM numbers) s;
INSERT INTO numbers (n) SELECT cnt + n FROM numbers, (SELECT COUNT(*) AS cnt FROM numbers) s;
INSERT INTO numbers (n) SELECT cnt + n FROM numbers, (SELECT COUNT(*) AS cnt FROM numbers) s;
INSERT INTO numbers (n) SELECT cnt + n FROM numbers, (SELECT COUNT(*) AS cnt FROM numbers) s;
/* repeat as necessary; every repeated line doubles the number of rows */
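With such a numbers table in place, the list of months for Step #3 could be generated along these lines. This is only a sketch: the two dates are hard-coded from the sample data (first month October 2011, last month January 2012) and in practice would be read from the temporary table instead.

```sql
-- One row per month from the first to the last month, inclusive.
-- Relies on the numbers table starting at 0, as populated above.
SELECT DATE_ADD('2011-10-01', INTERVAL n MONTH) AS month_start
FROM numbers
WHERE n <= TIMESTAMPDIFF(MONTH, '2011-10-01', '2012-01-01')
ORDER BY n;
```

Any month in this list that has no matching transaction is a candidate for a dummy row, as described in Step #4.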
And that seems to be it. I will not post a complete solution here, so that you still have the chance to apply the above suggestions in your own way, in case you are keen to. But if you are struggling, or just want to verify that they can be applied to the required effect, you can try this SQL Fiddle page for a complete solution "in action".

can i use column alias in many positions in the query?

If I create the following view in MySQL:
select id,name,score,total,CALCIT(total - score) as x,(CALCIT(total - score) / total) as per from tblx;
the expression CALCIT(total - score) is calculated twice.
How can I do something like this instead:
select id,name,score,total,CALCIT(total - score) as `x`,`x`/total as per from tblx;
where CALCIT is a function?
MySQL will permit you to use a column alias inside the ORDER BY, GROUP BY clauses, but you won't be able to reuse the alias in the SELECT list. If you really needed to do this, having many instances of the calculated value, you can do a self JOIN which produces the calculation.
SELECT
tblx.id,
name,
score,
total,
x,
x / total AS per
FROM tblx JOIN (
/* Subquery JOIN which performs the calculation; it must expose id for the join */
SELECT id, CALCIT(total - score) AS x FROM tblx
) xcalc ON tblx.id = xcalc.id
This method may be more performant than redoing the calculation in one SELECT, but as with anything, benchmark to find out.
Try something like this:
select *, x/total from (
select id,name,score,total,CALCIT(total - score) as x from tblx
) as tblx
Better, you can use an inner query:
select id,
name,
score,
total,
X,
X/total as per
from (
select id,
name,
score,
total,
CALCIT(total - score) as X from tblx
) t

How can I get the percentage of total rows with mysql for a group?

Below I have a query that gets the most common user agents for a site from a table of user agents and a linked table of IP addresses:
SELECT count(*) as num, string FROM `useragent_ip`
left join useragents on useragent_id = useragents.id
group by useragent_id
having num > 2
order by num desc, string
Sometimes it will show me something like
25 Firefox
22 IE
11 Chrome
3 Safari
1 Spider 1
1 Spider 2
1 Spider 3
My question: since the counts on the left are shares of a whole and will grow over time, can part of the SQL statement show each group's percentage of the whole? Then, instead of using having num > 2, I could filter on the percentage of the total rows rather than on the raw row count.
Yes you can:
select num, string, 100 * num / total as percent
from (
select count(*) as num, string
from useragent_ip
left join useragents on useragent_id = useragents.id
group by useragent_id) x
cross join (
select count(*) as total
from useragent_ip
left join useragents on useragent_id = useragents.id) y
order by num desc, string;
I removed the having num > 2, because it didn't seem to make sense.
If you add with rollup after your group by clause, then you will get a row where string is NULL and num is the total of all the browsers. You can then use that number to generate percentages.
I can't really imagine a single query doing the calculation and being more efficient than using with rollup.
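A sketch of the WITH ROLLUP variant, assuming the grouping is done on string so that the summary row is the one where string is NULL:

```sql
SELECT string, COUNT(*) AS num
FROM useragent_ip
LEFT JOIN useragents ON useragent_id = useragents.id
GROUP BY string WITH ROLLUP;
-- The final row has string = NULL and num = the total across all user agents.
```

The percentages can then be derived from that last row, either in application code or by wrapping this query in an outer one. Be aware that rows left unmatched by the LEFT JOIN also have a NULL string, so they would need to be told apart from the summary row.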