Apply group by on result of window function - mysql

I used a window function to calculate each product's profit percentage:
SELECT
productCode, productProfit, paymentDate, productName,
productProfit/sum(productProfit) OVER (PARTITION BY productCode) AS percent
FROM
profit;
The output
As the next step, I want to calculate AVG(percent). How can I add it to the first statement?
The result will look like this

Your way of calculating percent is a bit unusual. It seems that you are measuring the contribution of each individual transaction to the product's overall profit.
Anyway, you can simply use your existing query's result set as a derived table and GROUP BY the result of the YEAR() function to calculate the AVG():
SELECT
YEAR(dt.paymentDate) AS payment_date_year,
AVG(dt.percent) AS average_profit_percent
FROM
(
SELECT
productCode,
productProfit,
paymentDate,
productName,
productProfit/sum(productProfit) OVER (PARTITION BY productCode) AS percent
FROM
profit
) AS dt
GROUP BY
payment_date_year
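
For anyone who wants to check the numbers, here is a minimal sketch of the same derived-table pattern, run against an in-memory SQLite database from Python. SQLite is only a stand-in for MySQL here: it has no YEAR(), so strftime('%Y', ...) is used instead, and it needs SQLite 3.25 or newer for window functions. The four sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE profit ("
    "productCode TEXT, productProfit REAL, paymentDate TEXT, productName TEXT)"
)
# Hypothetical sample data: two products, one payment per year each.
conn.executemany(
    "INSERT INTO profit VALUES (?, ?, ?, ?)",
    [
        ("P1", 30.0, "2003-05-01", "Widget"),
        ("P1", 10.0, "2004-02-10", "Widget"),
        ("P2", 20.0, "2003-07-15", "Gadget"),
        ("P2", 20.0, "2004-03-20", "Gadget"),
    ],
)

query = """
SELECT strftime('%Y', dt.paymentDate) AS payment_date_year,  -- YEAR() in MySQL
       AVG(dt.percent) AS average_profit_percent
FROM (
    -- inner query: each row's share of its product's total profit
    SELECT paymentDate,
           productProfit / SUM(productProfit) OVER (PARTITION BY productCode) AS percent
    FROM profit
) AS dt
GROUP BY payment_date_year
ORDER BY payment_date_year
"""
rows = conn.execute(query).fetchall()
for year, avg_percent in rows:
    print(year, avg_percent)
```

The window function runs in the inner query, before any grouping, so the outer GROUP BY only ever sees the finished percent column.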

Sum Distinct Duplicated Values

I have a dataset as below:
customer buy profit
a laptop 350
a mobile 350
b laptop case 50
c laptop 200
c mouse 200
It does not matter how many rows a customer has: the profit is already stated in each row (it is already a cumulative sum). For example, the profit of customer a is 350 and the profit of customer c is 200.
I would like to sum the profit once per customer, so the desired output is 350 + 50 + 200 = 600. However, I need to do this in a single statement (without a subquery, nested query, or separate CTE).
I tried PARTITION BY but could not combine MAX and SUM. I also tried SUM(DISTINCT), but that does not work either:
MAX(profit) OVER (PARTITION BY customer)
If anyone could give me a hint on an approach to tackle this, it would be highly appreciated.
You really should use a subquery here:
SELECT SUM(profit) AS total_profit
FROM (SELECT DISTINCT customer, profit FROM yourTable) t;
By the way, your table design should probably change such that you are not storing the redundant profit per customer across many different records.
You can combine SUM() window function with MAX() aggregate function:
SELECT DISTINCT SUM(MAX(profit)) OVER () total
FROM tablename
GROUP BY customer;
Select sum(distinct profit) as sum from Table
You can select max profit of each customer like this:
SELECT customer, MAX(profit) AS max_profit
FROM tablename
GROUP BY customer
Then you can sum the result in your application code, or in the same query by nesting it:
SELECT SUM(max_profit) FROM (
SELECT customer, MAX(profit) AS max_profit
FROM tablename
GROUP BY customer
) AS temptable
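
A quick sanity check on why SUM(DISTINCT profit) is risky here. This is a sketch using SQLite from Python as a stand-in for MySQL; the table name yourTable comes from the answer above, and customer d is a hypothetical extra row added so that two customers share the same profit value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (customer TEXT, buy TEXT, profit INTEGER)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [
        ("a", "laptop", 350),
        ("a", "mobile", 350),
        ("b", "laptop case", 50),
        ("c", "laptop", 200),
        ("c", "mouse", 200),
        ("d", "tablet", 350),  # hypothetical: same profit as customer a
    ],
)

# One profit per customer, then sum: counts a and d separately.
per_customer = conn.execute(
    "SELECT SUM(profit) FROM (SELECT DISTINCT customer, profit FROM yourTable) t"
).fetchone()[0]

# Distinct profit values only: a and d collapse into a single 350.
distinct_values = conn.execute(
    "SELECT SUM(DISTINCT profit) FROM yourTable"
).fetchone()[0]

print(per_customer, distinct_values)
```

With the original five rows both queries agree, which is why SUM(DISTINCT) can look correct on small test data.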

How to use a column in select statement which is not in aggregate function nor in group by clause? [duplicate]

Above is the table on the basis of which I had to answer the question below in a past interview.
Q. The most recent order value for each customer?
Answer which I have given in interview:
select customerID, ordervalue, max(orderdate)
from office
group by customerID;
I know this query will throw an error in SQL because ordervalue is neither aggregated nor listed in the GROUP BY, but I want to know how to answer this question.
Interviewers have often asked me questions where I need a column in the SELECT list that is neither in an aggregate function nor in the GROUP BY. So I want to know the general workaround, with an example, so that I can answer these types of questions.
The workaround depends on what is being asked. For the requirements you have above, I think it makes sense to first create (customerid, MAX(orderdate)) pairs:
SELECT customerid, MAX(orderdate)
FROM office
GROUP BY customerid;
Then you can use them to match the row you need from the table.
SELECT customerid, ordervalue, orderdate
FROM office
WHERE (customerid, orderdate) IN
(SELECT customerid, MAX(orderdate)
FROM office
GROUP BY customerid);
Note that this assumes there is only one order per customer per day. If there were more than one, you would get all of the most recent orders for that customer. You could also add a GROUP BY to the outer query if needed:
SELECT customerid, MAX(ordervalue), orderdate
FROM office AS tt
WHERE (customerid, orderdate) IN
(SELECT customerid, MAX(orderdate)
FROM office
GROUP BY customerid)
GROUP BY customerid, orderdate;
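
To make the tie behaviour concrete, here is a small sketch of both queries, run on SQLite from Python as a stand-in for MySQL (row-value IN needs SQLite 3.15 or newer). The sample rows are hypothetical; customer 1 has two orders on the most recent date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE office (customerid INTEGER, ordervalue INTEGER, orderdate TEXT)")
conn.executemany(
    "INSERT INTO office VALUES (?, ?, ?)",
    [
        (1, 100, "2021-01-05"),
        (1, 250, "2021-03-10"),
        (1, 300, "2021-03-10"),  # second order on the same latest date
        (2, 80, "2021-02-01"),
    ],
)

# Tuple IN: returns every row that sits on the customer's latest date.
latest = conn.execute("""
    SELECT customerid, ordervalue, orderdate
    FROM office
    WHERE (customerid, orderdate) IN
          (SELECT customerid, MAX(orderdate) FROM office GROUP BY customerid)
    ORDER BY customerid, ordervalue
""").fetchall()

# Outer GROUP BY with MAX(ordervalue): collapses ties to one row per customer.
collapsed = conn.execute("""
    SELECT customerid, MAX(ordervalue), orderdate
    FROM office
    WHERE (customerid, orderdate) IN
          (SELECT customerid, MAX(orderdate) FROM office GROUP BY customerid)
    GROUP BY customerid, orderdate
    ORDER BY customerid
""").fetchall()

print(latest)
print(collapsed)
```

Whether MAX(ordervalue) is the right way to break a tie depends on the requirement; it is just one possible choice.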
If the non-aggregate column you need in the SELECT is functionally dependent on the column in the GROUP BY, you can add a subquery in the SELECT.
We can extend your example by adding a name column, where the name of different customers could be the same. If you wanted name instead of ordervalue, just match the customerid of the outer query to get name.
SELECT customerid,
(SELECT name FROM office WHERE customerid=o.customerid LIMIT 1) AS name,
MAX(orderdate)
FROM office AS o
GROUP BY customerid;
You are approaching the task as follows: Aggregate all rows to get one result line per customer, showing the maximum order date and its order value. The problem with this: you'd need an aggregate function to get the value for the maximum order date. The only DBMS I know of featuring such a function is Oracle with KEEP FIRST/LAST.
So look at the task from a different angle. Don't think aggregation-wise where you could count and add up values for a group and get the minimum or maximum value over all the group's rows, because after all you just want to pick single rows. (That is, pick the top 1 row per customer.) In order to pick rows, you'll use a WHERE clause.
One option has been shown by Steve in his answer:
select *
from office
where (customerid, orderdate) in
(
select customerid, max(orderdate)
from office
group by customerid
);
This is a good, straightforward approach. (Some DBMS, though, don't support tuples in IN clauses.)
Another way to get the "best" row for a customer is to pick those rows for which no better row exists:
select *
from office
where not exists
(
select null
from office better
where better.customerid = office.customerid
and better.orderdate > office.orderdate
);
And then there is the option to use a window function (aka analytic function) in order to get those rows. One example is to get the maximum dates along with the rows' data:
select customerid, ordervalue, orderdate
from
(
select
customerid, ordervalue, orderdate,
max(orderdate) over (partition by customerid) as max_orderdate
from office
) t
where orderdate = max_orderdate;
And with ROW_NUMBER, RANK, and DENSE_RANK there are window functions to assign numbers to your rows in the order you want. You number them such that the best rows get number 1 and pick them. The big advantage here: you can apply any order, deal with ties and not only get the top 1, but the top n rows.
select customerid, ordervalue, orderdate
from
(
select
customerid, ordervalue, orderdate,
row_number() over (partition by customerid order by orderdate desc) as rn
from office
) t
where rn = 1;
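
As a rough comparison of how the approaches treat ties, here is a sketch run on SQLite from Python (a stand-in for whatever DBMS you use; window functions need SQLite 3.25 or newer). The data is hypothetical, and the extra "ordervalue desc" tie-breaker in the ROW_NUMBER query is my own assumption, not part of the original task.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE office (customerid INTEGER, ordervalue INTEGER, orderdate TEXT)")
conn.executemany(
    "INSERT INTO office VALUES (?, ?, ?)",
    [
        (1, 100, "2021-01-05"),
        (1, 250, "2021-03-10"),
        (1, 300, "2021-03-10"),  # tie on the latest date
        (2, 80, "2021-02-01"),
    ],
)

# NOT EXISTS keeps every row that has no later row for the same customer,
# so both tied rows of customer 1 come back.
not_exists_rows = conn.execute("""
    SELECT o.customerid, o.ordervalue, o.orderdate
    FROM office o
    WHERE NOT EXISTS (
        SELECT NULL
        FROM office better
        WHERE better.customerid = o.customerid
          AND better.orderdate > o.orderdate
    )
    ORDER BY o.customerid, o.ordervalue
""").fetchall()

# ROW_NUMBER hands out exactly one "1" per customer; the second ORDER BY
# key (assumed here) decides which tied row wins.
row_number_rows = conn.execute("""
    SELECT customerid, ordervalue, orderdate
    FROM (
        SELECT customerid, ordervalue, orderdate,
               ROW_NUMBER() OVER (PARTITION BY customerid
                                  ORDER BY orderdate DESC, ordervalue DESC) AS rn
        FROM office
    ) t
    WHERE rn = 1
    ORDER BY customerid
""").fetchall()

print(not_exists_rows)
print(row_number_rows)
```

Swapping ROW_NUMBER for RANK would return the tied rows again, since tied rows share rank 1 when the ORDER BY cannot tell them apart.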

What is the best way to select rows with maximum value?

I managed to complete a task, but my solution is not optimal and I need a better one. I used ordinary subqueries; maybe a correlated subquery can solve this better.
This is the table I made:
SELECT custid,
count(DISTINCT bid) AS Total
FROM loan
GROUP BY custid;
The output of this looks like:
What I want is the custid(s) with the maximum Total.
One way to do it is ORDER BY Total DESC LIMIT 1, but this returns only one row.
What I did is:
SELECT custid
FROM (SELECT custid,
count(DISTINCT bid) AS Total
FROM loan
GROUP BY custid) c1
WHERE total = (SELECT max(Total)
FROM (SELECT custid,
count(DISTINCT bid) AS Total
FROM loan
GROUP BY custid) c2)
This gives me the correct result.
What I want to do is reduce the code, because here I am writing the same subquery twice. I know there must be a simpler way to do it, maybe a correlated query.
I am looking for some good answers. This is mainly to clarify the concepts for myself.
Sorry if this is a noob question; I am new to SQL.
After understanding what the OP wants, thanks to @Ravinder's tip,
I guess MySQL's built-in GROUP_CONCAT function is what you need. The SQL is:
select custid_count.Total, GROUP_CONCAT(custid_count.custid order by custid_count.custid asc SEPARATOR ',') as custids from
(select custid, count(distinct bid) as Total from loan group by custid order by Total desc) as custid_count
group by custid_count.Total
order by custid_count.Total desc
limit 1;
The result column custids contains the custids with the maximum total, joined by ','. After the query, you need to split custids on ',' and convert each substring to the numeric type you need.
Here is another way:
select * from loan
where custid =
(
select custid_count.custid from
(select custid, count(distinct bid) as Total from loan group by custid order by Total desc) as custid_count
order by custid_count.Total desc
limit 1
);
First find the custid with the maximum count, then query all rows that match that custid.
I haven't tried this in MySQL, but in the SQL dialect I'm using it is fine to use an aggregate function without a GROUP BY, so something like this
select custid, total, max(total) as maxtotal
from (select custid, count(distinct bid) as total
from loan
group by custid) c1;
would tag every row with both the individual customer total and the table-wide max total, and you'd just have to filter on the rows where the total is equal to the max total. That would give you a final answer of something like this:
select custid
from (select custid, count(distinct bid) as total
from loan
group by custid) c1
where total = max(total);
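
In dialects that do not allow an aggregate in WHERE like that, the same "tag every row with the table-wide max" idea can be sketched with a window function, MAX(total) OVER (). The example below runs on SQLite from Python as a stand-in (MySQL would need version 8.0 or newer for window functions); the loan rows are hypothetical, with customers 1 and 2 tied on two distinct bid values each.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loan (custid INTEGER, bid INTEGER)")
conn.executemany(
    "INSERT INTO loan VALUES (?, ?)",
    [
        (1, 10), (1, 20),  # customer 1: 2 distinct bids
        (2, 10), (2, 30),  # customer 2: 2 distinct bids
        (3, 10), (3, 10),  # customer 3: 1 distinct bid (duplicate row)
    ],
)

query = """
SELECT custid
FROM (
    -- tag each per-customer row with the overall maximum
    SELECT custid, total, MAX(total) OVER () AS maxtotal
    FROM (
        SELECT custid, COUNT(DISTINCT bid) AS total
        FROM loan
        GROUP BY custid
    ) c1
) c2
WHERE total = maxtotal
ORDER BY custid
"""
top_customers = [row[0] for row in conn.execute(query)]
print(top_customers)
```

The grouped subquery appears only once, and all customers tied for the maximum are returned.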

Filling a field inside a select statement - MySQL

I'm trying to do calculations and fill a field inside a select statement. It looks like this:
CREATE VIEW SALES_REPORT AS(
SELECT
INVOICENO,
INVOICEDATE,
CLIENTID,
CONTACT,
INVOICEJOBNO,
ADDCHARGES,
CHARGESINFO,
EMPLOYEEID,
USUALPAY,
VAT,
SUBTOTAL (SELECT(USUALPAY * COUNT(*) AS SUBTOTAL FROM SALES_REPORT)),
TOTAL = (SUBTOTAL * VAT)
FROM SALES_REPORT_JOINS_CONFIG
GROUP BY INVOICENO ORDER BY INVOICEDATE DESC);
Any help would be great, thanks!
TOTAL = (SUBTOTAL * VAT)
should probably be
(SUBTOTAL * VAT) AS TOTAL
Right now it returns the boolean true/false result of an equality comparison. You're NOT assigning the multiplication result to a 'total' field; you're comparing whatever value is in total to the result of the multiplication.
And this is a flat-out syntax error:
SUBTOTAL (SELECT(USUALPAY * COUNT(*) AS SUBTOTAL FROM SALES_REPORT)),
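
Here is a minimal sketch of the difference between "=" and "AS" in a select list, plus one possible reading of what the view was meant to compute. It runs on SQLite from Python as a stand-in for MySQL. I am assuming that the subtotal is USUALPAY times the number of rows per invoice and that VAT is stored as a multiplier such as 1.25; neither is confirmed by the question, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# "=" in a select list is a comparison, not an assignment: 5 = 6 is false.
comparison = conn.execute("SELECT 5 = (2 * 3)").fetchone()[0]

conn.execute(
    "CREATE TABLE SALES_REPORT_JOINS_CONFIG (INVOICENO INTEGER, USUALPAY REAL, VAT REAL)"
)
conn.executemany(
    "INSERT INTO SALES_REPORT_JOINS_CONFIG VALUES (?, ?, ?)",
    [(1, 100.0, 1.25), (1, 100.0, 1.25), (2, 50.0, 1.25)],  # hypothetical rows
)

# A column alias cannot be reused in the same select list, so the
# subtotal expression is repeated inside the total.
report = conn.execute("""
    SELECT INVOICENO,
           USUALPAY * COUNT(*)       AS SUBTOTAL,
           USUALPAY * COUNT(*) * VAT AS TOTAL
    FROM SALES_REPORT_JOINS_CONFIG
    GROUP BY INVOICENO, USUALPAY, VAT
    ORDER BY INVOICENO
""").fetchall()

print(comparison)
print(report)
```

Grouping by USUALPAY and VAT as well keeps the query valid under strict GROUP BY rules; it assumes those values are constant within an invoice.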

SQL aggregation query for SUM, but only allow positive sums (otherwise 0)

I want to query an orders table and show the customer id and the total of all his orders, however the orders can have positive or negative totals.
select customer_id, SUM(order_total) from orders group by customer_id;
Now my question - how can I achieve the following in one sql query:
If the total sum is positive, I want to display it as is; if the total sum is negative, I just want to display 0 instead of the actual amount.
What I am looking for is a function that can handle this, similar to the IFNULL function (IFNULL(SUM(order_total),0)), but instead of checking for null, it should check for a negative result.
Pseudo code:
IFNEGATIVE(SUM(order_total),0)
Is there a simple way to do this in standard SQL (or specifically in MySQL 5.5, which would also be OK)?
SELECT customer_id,
CASE
WHEN SUM(order_total) < 0 THEN 0
ELSE SUM(order_total)
END
FROM orders
GROUP BY customer_id;
Check your execution plan, but the 2 SUMs will probably be optimized to a single SUM under the hood.
Try with:
select customer_id, GREATEST( SUM(order_total),0) from orders group by customer_id;
Not tested, but something like this should do it:
SELECT customer_id , IF( SUM(order_total) > 0, SUM(order_total), 0) AS sum FROM orders GROUP BY customer_id
Could you not use a CASE statement?
Something like:
CASE WHEN [Field] < 0 THEN 0 ELSE [Field] END
Or did I miss something?
If I understand correctly, you only need to wrap it with GREATEST:
SELECT customer_id, GREATEST(0,SUM(order_total))
FROM orders GROUP BY customer_id;
select Id,case when (sum(amount)<0) then 0 else sum(amount) end from tblsum group by Id
You can also try:
select
(sum(fld) + abs(sum(fld))) / 2
from tbl
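To display only the customers with a positive sum, use HAVING (note that this drops the other customers from the result instead of showing 0 for them):
select customer_id, SUM(order_total) from orders group by customer_id HAVING SUM(order_total) > 0;
Otherwise, use CASE as shown in the other answers here.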
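
To compare the variants side by side, here is a small sketch run on SQLite from Python as a stand-in for MySQL. SQLite has no GREATEST(), so only the CASE form, the (sum + abs(sum)) / 2 trick, and the HAVING filter are shown. The order rows are hypothetical: customer 1 nets 70 and customer 2 nets -30.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, order_total INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 100), (1, -30), (2, 20), (2, -50)],
)

# CASE: negative sums are shown as 0, every customer stays in the result.
case_rows = conn.execute("""
    SELECT customer_id,
           CASE WHEN SUM(order_total) < 0 THEN 0 ELSE SUM(order_total) END
    FROM orders
    GROUP BY customer_id
    ORDER BY customer_id
""").fetchall()

# Arithmetic trick: (s + |s|) / 2 equals s for s >= 0 and 0 for s < 0.
abs_rows = conn.execute("""
    SELECT customer_id,
           (SUM(order_total) + ABS(SUM(order_total))) / 2
    FROM orders
    GROUP BY customer_id
    ORDER BY customer_id
""").fetchall()

# HAVING: customers with a non-positive sum disappear entirely.
having_rows = conn.execute("""
    SELECT customer_id, SUM(order_total)
    FROM orders
    GROUP BY customer_id
    HAVING SUM(order_total) > 0
    ORDER BY customer_id
""").fetchall()

print(case_rows, abs_rows, having_rows)
```

The first two give the same values here; in MySQL the division in the second form returns a decimal rather than an integer.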