SUM DISTINCT based off a specific column? - mysql

I am attempting to sum the balance of each customer only once. Normally I would use a SUM DISTINCT expression, however, one column is throwing it off and the row is no longer "Distinct".
For example:
Customer Number - Customer Name - Exception Type - Balance
CIF13443 - Paul - 1 - 125
CIF13452 - Ryan - 2 - 85
CIF13443 - Paul - 3 - 125
CIF13765 - Linda - 1 - 90
In this case, if I use SUM DISTINCT, Paul's balance will be summed up twice simply because he has a different exception type, where in fact I only want SSRS to sum each customer once. One way would be to SUM the "Balance" of only DISTINCT customer numbers. Possibly by grouping the Customer Number? Is that possible on SSRS? I would rather not touch the SQL Dataset.
Thanks!
Any input is appreciated.

You can use UNION to eliminate the duplicate rows:
Query 1 (using UNION):
;WITH testdata(CustomerNumber,CustomerName,ExceptionType,Balance)AS(
SELECT 'CIF13443','Paul','1',125 UNION all
SELECT 'CIF13452','Ryan ','2',85 UNION all
SELECT 'CIF13443','Paul','3',125 UNION all
SELECT 'CIF13765','Linda','1',90
)
SELECT CustomerNumber,CustomerName,Balance FROM testdata UNION SELECT NULL,NULL,NULL WHERE 1!=1
/*
SELECT CustomerNumber,CustomerName,SUM(t.Balance) AS total_balance
FROM (
SELECT CustomerNumber,CustomerName,Balance FROM testdata UNION SELECT NULL,NULL,NULL
) AS t WHERE CustomerNumber IS NOT null
GROUP BY CustomerNumber,CustomerName
*/
CustomerNumber CustomerName Balance
-------------- ------------ -----------
CIF13443 Paul 125
CIF13452 Ryan 85
CIF13765 Linda 90
Query 2 (using a window function to pick one row from each set of duplicates):
;WITH testdata(CustomerNumber,CustomerName,ExceptionType,Balance)AS(
SELECT 'CIF13443','Paul','1',125 UNION all
SELECT 'CIF13452','Ryan ','2',85 UNION all
SELECT 'CIF13443','Paul','3',125 UNION all
SELECT 'CIF13765','Linda','1',90
)
SELECT CustomerNumber,CustomerName,SUM(t.Balance) AS total_balance
FROM (
SELECT CustomerNumber,CustomerName,Balance,ROW_NUMBER()OVER(PARTITION BY CustomerNumber,CustomerName,Balance ORDER BY testdata.ExceptionType) seq FROM testdata
) AS t WHERE t.seq=1
GROUP BY CustomerNumber,CustomerName
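A minimal sketch of the window-function approach, run through Python's built-in sqlite3 module as a stand-in for the real database (an assumption; it needs SQLite 3.25 or later for ROW_NUMBER). Unlike the query above, it partitions by CustomerNumber only and returns a single grand total; the function name total_distinct_balance is hypothetical.

```python
import sqlite3

def total_distinct_balance(rows):
    """Sum each customer's balance once, ignoring repeated exception rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE testdata (CustomerNumber TEXT, CustomerName TEXT,"
        " ExceptionType TEXT, Balance INTEGER)"
    )
    conn.executemany("INSERT INTO testdata VALUES (?, ?, ?, ?)", rows)
    # Number the rows of each customer, keep only the first one (seq = 1),
    # then sum the balances that survive.
    query = """
        SELECT SUM(Balance)
        FROM (
            SELECT Balance,
                   ROW_NUMBER() OVER (
                       PARTITION BY CustomerNumber
                       ORDER BY ExceptionType
                   ) AS seq
            FROM testdata
        ) AS t
        WHERE t.seq = 1
    """
    (total,) = conn.execute(query).fetchone()
    conn.close()
    return total

rows = [
    ("CIF13443", "Paul", "1", 125),
    ("CIF13452", "Ryan", "2", 85),
    ("CIF13443", "Paul", "3", 125),
    ("CIF13765", "Linda", "1", 90),
]
print(total_distinct_balance(rows))  # 125 + 85 + 90 = 300
```

A plain SUM over the same rows would give 425, because Paul's 125 appears twice.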

Related

SQL Query Sequential Month Logins

I have the following SQL table
username Month
292 10
123 12
123 1
123 2
123 4
345 6
345 7
I want to query it, to get each username's login streak in Count of sequential Month. meaning the end result I am looking for looks like this :
username Streak
292 1
123 3
345 2
How can I achieve this, taking into account the Month 12 --> Month 1 wraparound?
I appreciate your help.
This would give you the result you want:
select username, count(*)
from (
    select
        username
      , month_1
      , coalesce(nullif(lead(month_1)
                            over (partition by username
                                  order by coalesce(nullif(month_1,12),0))
                        - coalesce(nullif(month_1,12),0), -1), 1) as MonthsTillNext
    from login_tab
) Step1
where MonthsTillNext=1
group by username
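It works by calculating the difference between each row and the next one, where the next row is the next month in ascending order, treating 12 as 0 (refer to the ambiguity I mentioned in my comment). The last row for each user has no next row, so the outer coalesce turns its NULL into 1 and it is counted too. The query then keeps only the rows whose difference is 1 and counts them.
Beware though: in addition to the anomaly around month 12, there is another case not considered. If the months for a user are 1,2,3 and 6,7,8, the two runs are not separated; every row except month 3 passes the filter, so this is reported as a single streak of 5 rather than a longest streak of 3. Is that what you wanted?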
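The behaviour described above can be checked with a small sketch that runs the same query through Python's sqlite3 module (an assumption standing in for the real database; SQLite 3.25 or later is needed for LEAD). The names login_tab and month_1 come from the answer; user 999 and the helper streaks are hypothetical. With two separate runs of three months, the query reports a combined count larger than either run.

```python
import sqlite3

STREAK_QUERY = """
    SELECT username, COUNT(*) AS streak
    FROM (
        SELECT username,
               COALESCE(NULLIF(
                   LEAD(month_1) OVER (
                       PARTITION BY username
                       ORDER BY COALESCE(NULLIF(month_1, 12), 0))
                   - COALESCE(NULLIF(month_1, 12), 0), -1), 1) AS MonthsTillNext
        FROM login_tab
    ) AS Step1
    WHERE MonthsTillNext = 1
    GROUP BY username
    ORDER BY username
"""

def streaks(logins):
    """Load (username, month) pairs into login_tab and run the streak query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE login_tab (username INTEGER, month_1 INTEGER)")
    conn.executemany("INSERT INTO login_tab VALUES (?, ?)", logins)
    result = conn.execute(STREAK_QUERY).fetchall()
    conn.close()
    return result

# Sample data from the question.
sample = [(292, 10), (123, 12), (123, 1), (123, 2), (123, 4), (345, 6), (345, 7)]
# Hypothetical user 999 with two separate runs: 1,2,3 and 6,7,8.
two_runs = [(999, m) for m in (1, 2, 3, 6, 7, 8)]

print(streaks(sample))
print(streaks(two_runs))
```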
One way would be with a recursive CTE, like
WITH RECURSIVE cte (username, month, cnt) AS
(
    SELECT username, month, 1
    FROM test
    UNION ALL
    SELECT test.username, test.month, cte.cnt+1
    FROM cte INNER JOIN test
      ON cte.username = test.username
     AND CASE WHEN cte.month = 12 THEN 1 ELSE cte.month + 1 END = test.month
)
SELECT username, MAX(cnt)
FROM cte
GROUP BY username
ORDER BY username
The idea is that the CTE (named cte in my example) recursively joins back to the table on a condition where the user is the same and the month is the next one. So for user 345, you have:
Username Month Cnt
345 6 1
345 7 1
345 7 2
The rows with cnt=1 are from the original table (with the extra cnt column hardcoded to 1), the row with cnt=2 is from the recursive part of the query (which found a match and used cnt+1 for its cnt). The query then selects the maximum for each user.
The join uses a CASE expression to handle 12 being followed by 1.
You can see it working with your sample data in this fiddle.
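For readers without access to the fiddle, here is a minimal sketch that runs the recursive CTE through Python's sqlite3 module (an assumption standing in for MySQL 8; the helper longest_streaks is hypothetical). One caveat that follows from the join condition: a user who logged in during all twelve months would form a cycle (12 is followed by 1 again), so the recursion would not terminate for that user without an extra stop condition.

```python
import sqlite3

def longest_streaks(logins):
    """Return (username, longest run of consecutive months) per user."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test (username INTEGER, month INTEGER)")
    conn.executemany("INSERT INTO test VALUES (?, ?)", logins)
    query = """
        WITH RECURSIVE cte (username, month, cnt) AS (
            -- Anchor: every login starts a streak of length 1.
            SELECT username, month, 1
            FROM test
            UNION ALL
            -- Recursive step: extend a streak when the next month exists.
            SELECT test.username, test.month, cte.cnt + 1
            FROM cte INNER JOIN test
              ON cte.username = test.username
             AND CASE WHEN cte.month = 12 THEN 1 ELSE cte.month + 1 END = test.month
        )
        SELECT username, MAX(cnt)
        FROM cte
        GROUP BY username
        ORDER BY username
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Sample data from the question.
logins = [(292, 10), (123, 12), (123, 1), (123, 2), (123, 4), (345, 6), (345, 7)]
print(longest_streaks(logins))
```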
The one shared by @EdmCoff is quite elegant.
Here is another one that avoids recursion and uses only conditional logic:
with data_cte as
(
    select username, month_1,
        case when (count(month_1) over (partition by username) = 1) then 1
             -- order by month_1 so that lead/lag really return the adjacent months
             when (lead(month_1) over (partition by username order by month_1) - month_1) = 1
               OR (month_1 - lag(month_1) over (partition by username order by month_1)) = 1 then 1
             when (month_1 = 12 and min(month_1) over (partition by username) = 1) then 1
        end cnt
    from login_tab
)
select username, count(cnt) from data_cte group by username

Select Distinct whilst adding tuples together in SQL

I have a table that contains random data against a key with duplicate entries. I'm looking to remove the duplicates (a projection as it is called in relational algebra), but rather than discarding the attached data, sum it together. For example:
orderID cost
1 5
1 2
1 10
2 3
2 3
3 15
Should remove duplicates from orderID whilst summing each orderID's values:
orderID cost
1 17 (5 + 2 + 10)
2 6
3 15
My assumption is I'd use SELECT DISTINCT somehow, but I don't know how I'd go about doing so. I understand GROUP BY might be able to do something but I am unsure.
This is a very basic aggregation:
SELECT orderId, SUM(cost) AS cost
FROM MyTable
GROUP BY orderId
This says, for each "orderId" grouping, sum the "cost" field and return one value per group.
You can use the group by clause to get one row per distinct value of the column(s) you're grouping by - orderId in this case. You can then apply an aggregate function to get a result for the columns you aren't grouping by - sum, in this case:
SELECT orderId, SUM(cost)
FROM mytable
GROUP BY orderId
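As a quick check of the aggregation against the question's data, here is a small sketch using Python's sqlite3 module (an assumption; any SQL engine behaves the same way for this query). The table name mytable follows the answers; the helper cost_per_order is hypothetical.

```python
import sqlite3

def cost_per_order(rows):
    """Collapse duplicate orderIDs into one row each, summing their costs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (orderId INTEGER, cost INTEGER)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT orderId, SUM(cost) AS cost "
        "FROM mytable GROUP BY orderId ORDER BY orderId"
    ).fetchall()
    conn.close()
    return result

orders = [(1, 5), (1, 2), (1, 10), (2, 3), (2, 3), (3, 15)]
print(cost_per_order(orders))  # [(1, 17), (2, 6), (3, 15)]
```

Note that order 2 keeps both of its identical rows (3 + 3 = 6); GROUP BY merges keys, not values.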

Guidance required for sql query

I have a database with one table, shown below. I'm trying to write a query that displays the names of the medications manufactured by the company that manufactures the largest number of medications.
Looking at the table, the answer would be the medication names that belong to company IDs 1 and 2, because those companies manufacture the most medications according to this table, but I'm not sure how to write a query that selects them.
ID | COMPANY_ID | MEDICATION_NAME
1  | 1          | ASPIRIN
2  | 1          | GLUCERNA
3  | 2          | SIBUTRAMINE
4  | 1          | IBUPROFEN
5  | 2          | VENOFER
6  | 2          | AVONEN
7  | 4          | ACETAMINOPHEN
8  | 3          | ACETAMINO
9  | 3          | GLIPIZIDE
Please share your suggestions. Thanks!
Several ways to do this. Here's one which first uses a subquery to get the maximum count, then another subquery to get the companies with that count, and finally the outer query to return the results:
select *
from yourtable
where company_id in (
    select company_id
    from yourtable
    group by company_id
    having count(1) = (
        select count(1) cnt
        from yourtable
        group by company_id
        order by 1 desc
        limit 1
    )
)
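This query might also work. I have not tested it, but the logic is to count the medications per company, take the maximum count, and return the medications of every company whose count equals it (replace yourtable with your table name):
SELECT MEDICATION_NAME
FROM yourtable
WHERE COMPANY_ID IN (
    SELECT COMPANY_ID FROM yourtable
    GROUP BY COMPANY_ID
    HAVING COUNT(*) = (SELECT MAX(counted)
                       FROM (SELECT COUNT(*) AS counted FROM yourtable
                             GROUP BY COMPANY_ID) AS counts));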

Lead() and LAG() functionality in SQL Server 2008

Hope all the SQL GURUS out there are doing great :)
I am trying to simulate LEAD() and LAG() functionality in SQL Server 2008.
This is my scenario: I have a temp table which is populated using the base query with the business logic for mileage. I want to calculate accumulated mileage for each user per day.
The temp table is setup using ROW_NUMBER(), so I have all the data needed in the temp table except the accumulated mileage.
I have tried using a CTE with the base query and self joining with itself and couldn't get it working. I am attaching the screen shot for the same.
Any help/suggestion would be appreciated.
You are on the right track by joining the table to itself. I included 2 methods of doing this below that should work fine here. The first trick is in your ROW_NUMBER, be sure to partition by the user id and sort by the date. Then you can use either an INNER JOIN with aggregation or CROSS APPLY to build your running totals.
Setting up the data with the partitioned ROW_NUMBER():
DECLARE @Data TABLE (
    RowNum INT,
    UserId INT,
    Date DATE,
    Miles INT
)
INSERT @Data
SELECT
    ROW_NUMBER() OVER (PARTITION BY UserId
                       ORDER BY Date) AS RowNum,
    *
FROM (
    SELECT 1, '2015-01-01', 5
    UNION ALL SELECT 1, '2015-01-02', 6
    UNION ALL SELECT 2, '2015-01-01', 7
    UNION ALL SELECT 2, '2015-01-02', 3
    UNION ALL SELECT 2, '2015-01-03', 2
) T (UserId, Date, Miles)
Use INNER JOIN with aggregation:
SELECT
    D1.UserId,
    D1.Date,
    D1.Miles,
    SUM(D2.Miles) AS [Total]
FROM @Data D1
INNER JOIN @Data D2
    ON D1.UserId = D2.UserId
    AND D2.RowNum <= D1.RowNum
GROUP BY
    D1.UserId,
    D1.Date,
    D1.Miles
Use CROSS APPLY for the running total:
SELECT
    UserId,
    Date,
    Miles,
    Total
FROM @Data D1
CROSS APPLY (
    SELECT SUM(Miles) AS Total
    FROM @Data
    WHERE UserId = D1.UserId
    AND RowNum <= D1.RowNum
) RunningTotal
Output is the same for each method:
UserId Date Miles Total
----------- ---------- ----------- -----------
1 2015-01-01 5 5
1 2015-01-02 6 11
2 2015-01-01 7 7
2 2015-01-02 3 10
2 2015-01-03 2 12
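The self-join running total can be reproduced outside SQL Server with a minimal sketch using Python's sqlite3 module (an assumption; SQLite 3.25 or later is needed for ROW_NUMBER, and CROSS APPLY is not available there, so only the INNER JOIN variant is shown). The table name mileage and the helper running_totals are hypothetical.

```python
import sqlite3

def running_totals(rows):
    """Accumulated miles per user and day, via ROW_NUMBER plus a self-join."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mileage (UserId INTEGER, Date TEXT, Miles INTEGER)")
    conn.executemany("INSERT INTO mileage VALUES (?, ?, ?)", rows)
    query = """
        WITH Data AS (
            SELECT ROW_NUMBER() OVER (PARTITION BY UserId ORDER BY Date) AS RowNum,
                   UserId, Date, Miles
            FROM mileage
        )
        SELECT D1.UserId, D1.Date, D1.Miles, SUM(D2.Miles) AS Total
        FROM Data D1
        INNER JOIN Data D2
            ON D1.UserId = D2.UserId
           AND D2.RowNum <= D1.RowNum   -- every earlier row of the same user
        GROUP BY D1.UserId, D1.Date, D1.Miles
        ORDER BY D1.UserId, D1.Date
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

trips = [
    (1, "2015-01-01", 5), (1, "2015-01-02", 6),
    (2, "2015-01-01", 7), (2, "2015-01-02", 3), (2, "2015-01-03", 2),
]
for row in running_totals(trips):
    print(row)
```

The self-join compares every row with all earlier rows of the same user, so its cost grows quadratically with the number of rows per user; that is usually acceptable for per-day mileage but worth keeping in mind for large partitions.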

How to add different columns from different tables?

I just wanted to add different columns from different tables... Has anyone any idea on how to do that?
Consider I have 3 tables as below
tv sales
AC sales
cooler sales
And the tables data as follows
1)Tv Sales
Id Date NoOfSales Totalamount
1 03/05/2014 10 10000
2 04/05/2014 20 20000
3 05/05/2014 30 30000
2)Ac Sales
Id Date NoOfSales Totalamount
1 03/05/2014 10 50000
2 04/05/2014 20 60000
3 05/05/2014 30 70000
3)cooler Sales
Id Date NoOfSales Totalamount
1 03/05/2014 10 30000
2 04/05/2014 20 60000
3 05/05/2014 30 70000
Now I want to add up the "Totalamount" from all the tables for a particular "date".
For example, I need the Totalamount on 03/05/2014 to come out as 90000.
In MySQL, the easiest way to do this is with union all and aggregation:
select date, sum(totalamount) as TotalSales
from ((select date, totalamount from TvSales)
      union all
      (select date, totalamount from AcSales)
      union all
      (select date, totalamount from CoolerSales)
     ) t
group by date;
The reason you want to use union all is in case the dates are different in the various tables. A join makes it possible to lose rows.
Second, having three tables with the same format is an indication of poor database design. You should really have one table with the sales and a column indicating which type of product it refers to.
You could solve your problem by making a union of the information you want to aggregate from the different tables and then summing the amounts. This would look like:
SELECT t.Date,SUM(t.Totalamount)
FROM
(
SELECT Date,Totalamount
FROM tvSales
UNION ALL
SELECT Date,Totalamount
FROM acSales
UNION ALL
SELECT Date,Totalamount
FROM coolerSales
) t
WHERE t.Date='03/05/2014'
GROUP BY t.Date
The SELECTs in a UNION must return the same number of columns with compatible types. The result takes its column names from the first SELECT, so if the names differ between the tables, give the 2 columns common aliases in the 3 select queries and work with those aliases in the main query. Also, the UNION should include the ALL keyword; otherwise identical (Date, Totalamount) rows coming from different tables are eliminated as duplicates and the sum comes out too low.
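To illustrate why the ALL keyword matters with the question's data, here is a minimal sketch using Python's sqlite3 module (an assumption standing in for MySQL). The table names follow the answers; the helper totals_by_date and its union_op parameter are hypothetical. The AC and cooler tables share identical (Date, Totalamount) rows on 04/05/2014 and 05/05/2014, so a plain UNION drops one of each pair.

```python
import sqlite3

def totals_by_date(union_op="UNION ALL"):
    """Sum Totalamount per date across the three sales tables."""
    conn = sqlite3.connect(":memory:")
    data = {
        "tvSales": [("03/05/2014", 10000), ("04/05/2014", 20000), ("05/05/2014", 30000)],
        "acSales": [("03/05/2014", 50000), ("04/05/2014", 60000), ("05/05/2014", 70000)],
        "coolerSales": [("03/05/2014", 30000), ("04/05/2014", 60000), ("05/05/2014", 70000)],
    }
    for table, rows in data.items():
        conn.execute(f"CREATE TABLE {table} (Date TEXT, Totalamount INTEGER)")
        conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
    # union_op is either "UNION ALL" (keeps duplicates) or "UNION" (removes them).
    query = f"""
        SELECT t.Date, SUM(t.Totalamount)
        FROM (
            SELECT Date, Totalamount FROM tvSales
            {union_op}
            SELECT Date, Totalamount FROM acSales
            {union_op}
            SELECT Date, Totalamount FROM coolerSales
        ) t
        GROUP BY t.Date
        ORDER BY t.Date
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

print(totals_by_date("UNION ALL"))  # correct totals
print(totals_by_date("UNION"))      # too low wherever two tables hold identical rows
```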