SQL SUM and comparing - mysql

I have two database tables:
***aff_purchases***
id | affiliate_id | payout
1 | 12 | 50.00
2 | 12 | 10.00
3 | 12 | 50.00
4 | 12 | 10.00
***aff_payments***
id | affiliate_id | amount_paid
8 | 12 | 50.00
I would like to return an array of all affiliate IDs where the 'payout' total exceeds the 'amount_paid' total by 50 or more.
I think that I need to SUM together the columns and then compare, but I am struggling to understand how. Please see my efforts below:
SELECT
(SELECT SUM(amount_paid) FROM exp_cdwd_aff_payments AS pay WHERE pay.affiliate_id = 12) AS 'amount_paid'
(SELECT SUM(payout) FROM exp_cdwd_aff_purchases AS pur WHERE pur.affiliate_id = 12) AS 'payout'
FROM
exp_cdwd_aff_payments AS pay
WHERE
payout > amount_paid

One approach here is to join two separate subqueries which find the payout and payment totals. Then, compare the totals for each affiliate_id to see if it meets your requirement.
SELECT
t1.affiliate_id
FROM
(
SELECT affiliate_id, SUM(amount_paid) AS amount_paid_total
FROM aff_payments
GROUP BY affiliate_id
) t1
LEFT JOIN
(
SELECT affiliate_id, SUM(payout) AS payout_total
FROM aff_purchases
GROUP BY affiliate_id
) t2
ON t1.affiliate_id = t2.affiliate_id
WHERE COALESCE(t2.payout_total, 0) >= t1.amount_paid_total + 50
Note that affiliates who have a payout, but have not been paid anything yet, would not appear in the result set, because the query starts from the payments table.
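To also cover affiliates that have purchases but no payments yet, the join can be driven from the purchases side instead. The following is a minimal sketch, not a definitive implementation: it assumes an in-memory SQLite database (via Python's sqlite3 module) as a stand-in for MySQL, loaded with the sample rows from the question. The helper name affiliates_owed and its threshold parameter are made up for illustration.

```python
import sqlite3

def affiliates_owed(conn, threshold=50):
    """Return affiliate IDs whose payout total exceeds the amount paid by at least `threshold`."""
    sql = """
        SELECT pur.affiliate_id
        FROM (SELECT affiliate_id, SUM(payout) AS payout_total
              FROM aff_purchases
              GROUP BY affiliate_id) AS pur
        LEFT JOIN (SELECT affiliate_id, SUM(amount_paid) AS amount_paid_total
                   FROM aff_payments
                   GROUP BY affiliate_id) AS pay
               ON pay.affiliate_id = pur.affiliate_id
        -- An affiliate with no payments counts as having been paid 0.
        WHERE pur.payout_total - COALESCE(pay.amount_paid_total, 0) >= ?
        ORDER BY pur.affiliate_id
    """
    return [row[0] for row in conn.execute(sql, (threshold,))]

# Sample rows copied from the question (SQLite stands in for MySQL here).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE aff_purchases (id INTEGER, affiliate_id INTEGER, payout REAL);
    CREATE TABLE aff_payments (id INTEGER, affiliate_id INTEGER, amount_paid REAL);
    INSERT INTO aff_purchases VALUES (1, 12, 50.00), (2, 12, 10.00), (3, 12, 50.00), (4, 12, 10.00);
    INSERT INTO aff_payments VALUES (8, 12, 50.00);
""")
owed = affiliates_owed(conn)
print(owed)
```

With the sample data, affiliate 12 has a payout total of 120 against 50 paid, so it qualifies at a threshold of 50 but not at 100.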

The problem is that you cannot use aliases defined in the SELECT in the WHERE. In most databases, you would use a CTE or subquery. However, MySQL versions before 8.0 do not support CTEs, and older versions impose overhead on subqueries in the FROM clause (by materializing them).
So, MySQL has overloaded the HAVING clause, to allow it to be used in non-aggregation queries. You can do what you want using HAVING:
SELECT a.affiliate_id,
(SELECT SUM(cap.amount_paid) FROM exp_cdwd_aff_payments cap WHERE cap.affiliate_id = a.affiliate_id) AS amount_paid,
(SELECT SUM(pur.payout) FROM exp_cdwd_aff_purchases pur WHERE pur.affiliate_id = a.affiliate_id) AS payout
FROM affiliates a
HAVING payout > amount_paid;
The above assumes you have a table with one row per affiliate_id. It uses this via a correlated subquery. Also note the use of table aliases and qualified column names.

Related

How to get a list of students who have enrolled at least once and whose final status is not enrolled?

I have the following table. It holds a date-wise log of students enrolling and unenrolling.
student_id | is_enrolled | created_at
-------------------------------------
1 | 1 | 2020-01-01
2 | 0 | 2020-01-02
3 | 0 | 2020-01-01
1 | 0 | 2020-01-02
4 | 1 | 2020-01-02
1 | 0 | 2020-01-03
3 | 0 | 2020-01-03
4 | 1 | 2020-01-04
As you can see, student 1 enrolled on 2020-01-01 and then unenrolled on 2020-01-02. Students 2 and 3 have never enrolled. Student 4 enrolled multiple times but never unenrolled, and hence should not be in the output.
Basically, I want to write a query whose output is students like 1, who have enrolled at least once and whose final status is not enrolled. I was able to get all the enrolled students, but got stuck after that point.
My queries,
SELECT DISTINCT student_id
FROM student
WHERE is_enrolled = 1
ORDER
BY student_id; # gives me 1 and 4
SQL fiddle
Ideally, a single-query solution without a nested query would be awesome. I'm okay with a multiple-query solution as well.
Note: I was able to get the required output by using for-loops in my code, but I would like to learn whether I can do this with SQL queries alone. I'm not looking for any programming language code.
SELECT DISTINCT x.*
FROM student x
JOIN
( SELECT student_id
, MAX(created_at) created_at
FROM student
GROUP
BY student_id
) y
ON y.student_id = x.student_id
AND y.created_at = x.created_at
JOIN student z
ON z.student_id = x.student_id
AND z.is_enrolled = 1
WHERE x.is_enrolled = 0;
As an aside, never use SELECT *, and in the absence of any aggregating functions, a GROUP BY clause is NEVER appropriate.
I'm not a DBA ( database expert ), but I'll normally use something like this for my MSSQL database:
WITH summary AS (
SELECT
student_id,
is_enrolled,
created_at,
ROW_NUMBER() OVER(PARTITION BY s.student_id ORDER BY s.created_at DESC) AS rk
FROM student s)
SELECT s.*
FROM summary s
WHERE s.rk = 1
AND is_enrolled = 1
What I did was add an extra column that numbers each student's rows after ordering them by created_at, newest first; you then want to see if the latest row has an is_enrolled value of 1.
The WITH part is used to define a subquery, with some extra logic in there.
You can use aggregation:
select student_id
from student s
group by student_id
having sum(is_enrolled) >= 1 and
max(created_at) = max(case when is_enrolled = 0 then created_at end);
The first condition checks that the student is enrolled at least once.
The second checks that the latest created_at is the latest created_at for an unenrolled record. That checks that the last status is "unenrolled".
Here is the SQL Fiddle.
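As a quick check of the aggregation query above, here is a small sketch that runs it against the sample rows from the question. It assumes an in-memory SQLite database (via Python's sqlite3 module) as a stand-in for MySQL, and it stores created_at as ISO-formatted text so that MAX() compares dates correctly; the ORDER BY is added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER, is_enrolled INTEGER, created_at TEXT);
    INSERT INTO student VALUES
        (1, 1, '2020-01-01'), (2, 0, '2020-01-02'), (3, 0, '2020-01-01'),
        (1, 0, '2020-01-02'), (4, 1, '2020-01-02'), (1, 0, '2020-01-03'),
        (3, 0, '2020-01-03'), (4, 1, '2020-01-04');
""")

query = """
    SELECT student_id
    FROM student
    GROUP BY student_id
    -- enrolled at least once ...
    HAVING SUM(is_enrolled) >= 1
    -- ... and the newest row overall is an unenrolled row
       AND MAX(created_at) = MAX(CASE WHEN is_enrolled = 0 THEN created_at END)
    ORDER BY student_id
"""
dropped_out = [row[0] for row in conn.execute(query)]
print(dropped_out)
```

Student 4 drops out of the result because it has no unenrolled rows, so the CASE expression yields only NULLs and the comparison is never true.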

SQL Predicates per aggregate function

I am struggling with a SQL query,
Query:
I want to find a list of hospitals with a count of dentists (is_dentist=true) and a count of all doctors (including dentists) having monthly_income > 100 000.
I have 2 tables Hospitals and Doctors with the following schema,
-------------
| Hospital |
|-----------|
| id | name |
|-----------|
---------------------------------------------------------
| Doctor |
|--------------------------------------------------------
| id | name | monthly_income | is_dentist | hospital_id |
|--------------------------------------------------------
The query I came up with is,
select h.name, count(d.is_dentist), sum(d.monthly_income)
from Hospital h inner join Doctor d
on h.id = d.hospital_id
where d.monthly_income > 100000 and d.is_dentist=true
group by h.name;
If I am a dentist with an income below 100 000, the hospital should still count me as a dentist.
But the problem with the above query is that it keeps only doctors who both have a monthly_income above 100 000 and are dentists. I want an independent count for each of these conditions, like a predicate per count() column. How can we achieve this in a single query?
You can do conditional aggregation.
Since is_dentist (presumably) contains 0/1 values, you can just sum() this column to count how many dentists belong to the group.
On the other hand, you can use another conditional sum() to count how many doctors have an income above the threshold.
select
h.name,
sum(d.is_dentist) no_dentists,
sum(d.monthly_income > 100000) no_doctors_above_100000_income
from Hospital h
inner join Doctor d on h.id = d.hospital_id
group by h.name;
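The conditional aggregation above can be tried out with a small sketch like the following. It is only an illustration under stated assumptions: an in-memory SQLite database (via Python's sqlite3 module) stands in for MySQL, the sample rows are hypothetical and not from the question, and the conditions are spelled out with CASE expressions so they do not depend on booleans being stored as 0/1. It also swaps the inner join for a LEFT JOIN so that a hospital without doctors still shows up with zero counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Hospital (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Doctor (id INTEGER PRIMARY KEY, name TEXT, monthly_income INTEGER,
                         is_dentist INTEGER, hospital_id INTEGER);
    -- Hypothetical sample rows, not from the question.
    INSERT INTO Hospital VALUES (1, 'General'), (2, 'Central'), (3, 'Empty Clinic');
    INSERT INTO Doctor VALUES
        (1, 'A', 150000, 1, 1), (2, 'B', 80000, 1, 1),
        (3, 'C', 120000, 0, 1), (4, 'G', 130000, 0, 1),
        (5, 'D', 90000, 0, 2), (6, 'E', 200000, 1, 2), (7, 'F', 50000, 1, 2);
""")

query = """
    SELECT h.name,
           -- each SUM has its own predicate, independent of the other
           SUM(CASE WHEN d.is_dentist = 1 THEN 1 ELSE 0 END) AS no_dentists,
           SUM(CASE WHEN d.monthly_income > 100000 THEN 1 ELSE 0 END) AS no_doctors_above_100000_income
    FROM Hospital h
    LEFT JOIN Doctor d ON h.id = d.hospital_id
    GROUP BY h.id, h.name
    ORDER BY h.name
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In this made-up data, doctor B is a dentist earning below the threshold and is still counted as a dentist, which is the behaviour the question asks for.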
You have two independent conditions (monthly_income > 100000 and is_dentist=true), which means there are two different data sets. A single WHERE clause cannot filter for two different data sets in the same grouped query.
So you need to divide it into two subqueries. You can check whether the following query gives the result you want:
select temp3.name, temp1.dentist_count, temp2.income_count from
(select d1.hospital_id, count(*) as dentist_count from Doctor d1 where d1.is_dentist=true group by d1.hospital_id) as temp1
join
(select d2.hospital_id, count(*) as income_count from Doctor d2 where d2.monthly_income>100000 group by d2.hospital_id) as temp2
on temp1.hospital_id=temp2.hospital_id
join
(select h.id, h.name from Hospital h) as temp3
on temp2.hospital_id=temp3.id;

Counting all the wins in two columns using two columns?

Let's take this table for an example...
m_tid | m_tid2 | m_hteam_score | m_ateam_score
2 | 5 | 69 | 30
5 | 2 | 0 | 5
I'm bad at custom making tables, sorry...
So let's take this data, now m_tid and m_tid2 are columns for TID's that are in a separate table of their own.
Now what I want to do is collect the scores for a given team ID across all rows. How would I sum the two score columns depending on whether the team is in m_tid or m_tid2?
I don't have a query made, but I wouldn't know how I would go about making a query for this anyways. :(
The expected results would be something like this
m_tid | m_tid_score | m_tid2 | m_tidscore2
5 | 35 | 2 | 69
If you want to get the total score for each team, here is one method using correlated subqueries:
select t.*,
(coalesce((select sum(s.m_hteam_score) from scores s where s.m_tid = t.tid), 0) +
coalesce((select sum(s.m_ateam_score) from scores s where s.m_tid2 = t.tid), 0)
) as totalscore
from teams t;
Here's another option using conditional aggregation:
select o.id, sum(case when y.m_tid = o.id then y.m_hteam_score
when y.m_tid2 = o.id then y.m_ateam_score
else 0 end) score
from othertable o
join yourtable y on o.id in (y.m_tid, y.m_tid2)
group by o.id
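The conditional aggregation can be checked against the two sample rows from the question with a sketch like this one. The assumptions: an in-memory SQLite database (via Python's sqlite3 module) stands in for MySQL, the table names teams and scores follow the first answer rather than anything stated in the question, and m_tid is read as the home team with m_tid2 as the away team.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Table names are assumptions; the question does not name its tables.
    CREATE TABLE teams (tid INTEGER PRIMARY KEY);
    CREATE TABLE scores (m_tid INTEGER, m_tid2 INTEGER,
                         m_hteam_score INTEGER, m_ateam_score INTEGER);
    INSERT INTO teams VALUES (2), (5);
    INSERT INTO scores VALUES (2, 5, 69, 30), (5, 2, 0, 5);
""")

query = """
    SELECT t.tid,
           SUM(CASE WHEN s.m_tid = t.tid THEN s.m_hteam_score   -- team played at home
                    WHEN s.m_tid2 = t.tid THEN s.m_ateam_score  -- team played away
                    ELSE 0 END) AS score
    FROM teams t
    JOIN scores s ON t.tid IN (s.m_tid, s.m_tid2)
    GROUP BY t.tid
    ORDER BY t.tid
"""
totals = conn.execute(query).fetchall()
print(totals)
```

Under that home/away reading, team 2 totals 69 + 5 = 74 and team 5 totals 30 + 0 = 30.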

Mysql each row sum

How can I get result like below with mysql?
> +--------+------+------------+
> | code | qty | total |
> +--------+------+------------+
> | aaa | 30 | 75 |
> | bbb | 20 | 45 |
> | ccc | 25 | 25 |
> +--------+------+------------+
total is the qty of the row plus the qty of all the rows that come after it.
You can do this with a correlated subquery -- assuming that the ordering is alphabetical:
select code, qty,
(select sum(t2.qty)
from mytable t2
where t2.code >= t.code
) as total
from mytable t;
SQL tables represent unordered sets. So, a table, by itself, has no notion of rows coming after. In your example, the codes are alphabetical, so they provide one definition. In practice, there is usually an id or creation date that serves this purpose.
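The correlated subquery can be verified against the expected output from the question with a short sketch. It assumes an in-memory SQLite database (via Python's sqlite3 module) as a stand-in for MySQL and reuses the placeholder table name mytable from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (code TEXT, qty INTEGER);
    INSERT INTO mytable VALUES ('aaa', 30), ('bbb', 20), ('ccc', 25);
""")

query = """
    SELECT t.code, t.qty,
           -- sum this row and every row whose code sorts after it
           (SELECT SUM(t2.qty)
            FROM mytable t2
            WHERE t2.code >= t.code) AS total
    FROM mytable t
    ORDER BY t.code
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The ordering here comes entirely from the comparison on code; with an id or creation date column, the same pattern applies with that column in the WHERE clause.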
I would use a join; IMHO it usually fits better.
Data:
create table tab (
code varchar(10),
qty int
);
insert into tab (code, qty)
select * from (
select 'aaa' as code, 30 as qty union
select 'bbb', 20 union
select 'ccc', 25
) t
Query:
select t.code, t.qty, sum(t1.qty) as total
from tab t
join tab t1 on t.code <= t1.code
group by t.code, t.qty
order by t.code
The best way is to try both queries (mine and the subquery version that @Gordon mentioned) and choose the faster one.
Fiddle: http://sqlfiddle.com/#!2/24c0f/1
Consider using variables. It looks like:
select code, qty, (@total := ifnull(@total, 0) + qty) as total
from your_table
order by code desc
...and reverse query results list afterward.
If you need a pure SQL solution, you may compute the sum of all your qty values and store it in a variable.
Also, look at: Calculate a running total in MySQL

What to do with Full Outer Join

I need a Full Outer Join in MySQL. I found a solution here: Full Outer Join in MySQL. My problem is that t1 and t2 are subqueries themselves, so the resulting query looks like a monster.
What to do in this situation? Should I use views instead of subqueries?
Edit:
I'll try to explain a bit more. I have orders and payments. One payment can cover multiple orders, and one order can be covered by multiple payments. That is why I have tables orders, payments, and paymentitems. Each order has a company field (the company that made the order) and a manager field (the manager who accepted it). Now I need to group orders and payments by company and manager and total up the money. So I want to get something like this:
company1 | managerA | 200 | 200 | 0
company1 | managerB | Null | 100 | 100
company1 | managerC | 300 | Null | -300
company2 | managerA | 150 | Null | -150
company2 | managerB | 100 | 350 | 250
The query I managed to create:
SELECT coalesce(o.o_company, p.o_company)
, coalesce(o.o_manager, p.o_manager)
, o.orderstotal
, p.paymentstotal
, (coalesce(p.paymentstotal, 0) - coalesce(o.orderstotal, 0)) AS balance
FROM
(((/*Subquery A*/SELECT orders.o_company
, orders.o_manager
, sum(o_money) AS orderstotal
FROM
orders
WHERE
(o_date >= @startdate)
AND (o_date <= @enddate)
GROUP BY
o_company
, o_manager) AS o
LEFT JOIN (/*Subquery B*/SELECT orders.o_company
, orders.o_manager
, sum(paymentitems.p_money) AS paymentstotal
FROM
((payments
INNER JOIN paymentitems
ON payments.p_id = paymentitems.p_id)
INNER JOIN orders
ON paymentitems.p_oid = orders.o_id)
WHERE
(payments.p_date >= @startdate)
AND (payments.p_date <= @enddate)
GROUP BY
orders.o_company
, orders.o_manager) AS p
ON (o.o_company = p.o_company) and (o.o_manager = p.o_manager))
union
(/*Subquery A*/
right join /*Subquery B*/
ON (o.o_company = p.o_company) and (o.o_manager = p.o_manager)))
This is a simplified version of my query. The real query is much more complex, which is why I want to keep it as simple as it can be. Maybe I could even split it into views, or maybe there are other options I am not aware of.
I think the clue is in "group orders and payments by company". Break the outer join into a query on orders and another query on payments, then add up the type of money (orders or payments) for each company.
If you are trying to do a full outer join and the relationship is 1-1, then you can accomplish the same thing with a union and aggregation.
Here is an example, pulling one column from two different tables:
select id, max(col1) as col1, max(col2) as col2
from ((select t1.id, t1.col1, NULL as col2
from t1
) union all
(select t2.id, NULL as col1, t2.col2
from t2
)
) t
group by id
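Applied to the orders and payments case from the question, the same union-and-aggregate pattern might look like the sketch below. This is an illustration under assumptions, not the asker's real query: an in-memory SQLite database (via Python's sqlite3 module) stands in for MySQL, and the hypothetical tables order_totals and payment_totals play the role of the already-aggregated Subquery A and Subquery B, filled with values chosen to reproduce the desired output shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for Subquery A and Subquery B (one row per company/manager).
    CREATE TABLE order_totals (o_company TEXT, o_manager TEXT, orderstotal INTEGER);
    CREATE TABLE payment_totals (o_company TEXT, o_manager TEXT, paymentstotal INTEGER);
    INSERT INTO order_totals VALUES
        ('company1', 'managerA', 200), ('company1', 'managerC', 300),
        ('company2', 'managerA', 150), ('company2', 'managerB', 100);
    INSERT INTO payment_totals VALUES
        ('company1', 'managerA', 200), ('company1', 'managerB', 100),
        ('company2', 'managerB', 350);
""")

query = """
    SELECT o_company, o_manager,
           MAX(orderstotal) AS orderstotal,
           MAX(paymentstotal) AS paymentstotal,
           COALESCE(MAX(paymentstotal), 0) - COALESCE(MAX(orderstotal), 0) AS balance
    FROM (SELECT o_company, o_manager, orderstotal, NULL AS paymentstotal
          FROM order_totals
          UNION ALL
          SELECT o_company, o_manager, NULL AS orderstotal, paymentstotal
          FROM payment_totals) AS t
    GROUP BY o_company, o_manager
    ORDER BY o_company, o_manager
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each subquery appears only once in this form, whereas the LEFT JOIN / UNION / RIGHT JOIN emulation has to repeat both of them.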