Issue using the count function in SQL - sql-server-2008

I am running this query to count, for each BookedBy user, the total number of sales with insurance and the total number of sales without insurance. However, every user gets the same counts for some reason. How can I change my query to show each user's totals instead?
Basically, I want to figure out how many bookings each user had with and without insurance sales.
BookedBy is the agent.
t0 is the derived table containing the bookings that do not include insurance,
and t1 is the derived table containing the bookings with insurance.
Both derived tables expose the same columns. How can I get a total per BookedBy agent from both of them?
SELECT t0.BookedBy, count(t0.resnumber) as NonInsurance, COUNT(t1.resnumber) as Insurance
FROM (SELECT BookedBy, ResNumber, DATEPART(year, BookDate) AS Year, DATEPART(month, BookDate) AS month
FROM dbo.ResGeneral
WHERE ResNumber NOT IN (SELECT ResNumber FROM dbo.ResItinerary_insurance)
and ResStatus = 'a'
GROUP BY BookedBy, ResNumber, BookDate) t0
left JOIN (SELECT BookedBy, ResNumber, DATEPART(year, BookDate) AS Year, DATEPART(month, BookDate) AS month
FROM dbo.ResGeneral
WHERE ResNumber IN (SELECT ResNumber FROM dbo.ResItinerary_insurance)
and ResStatus = 'a') t1
ON t1.year = t0.year
group by t0.bookedby

I think this query is equivalent:
SELECT g.BookedBy,
SUM(CASE WHEN i.ResNumber IS NULL THEN 1 ELSE 0 END) AS NonInsurance,
SUM(CASE WHEN i.ResNumber IS NOT NULL THEN 1 ELSE 0 END) AS Insurance
FROM dbo.ResGeneral g
LEFT JOIN dbo.ResItinerary_insurance i
ON g.ResNumber = i.ResNumber
WHERE g.ResStatus = 'a'
GROUP BY g.BookedBy;

Your join condition looks incorrect:
ON t1.year = t0.year
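This matches every t0 row with every t1 row from the same year, regardless of agent, so it behaves like a cross join within each year and inflates both counts. You probably want a more specific condition, for example t1.BookedBy = t0.BookedBy.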
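A minimal sketch of the conditional-aggregation approach, run against an in-memory SQLite database through Python's sqlite3 module rather than SQL Server. The table and column names follow the question; the sample rows are invented for illustration. As an extra assumption, the insurance table is de-duplicated before the join so that a reservation with several insurance rows is still counted once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ResGeneral (ResNumber INTEGER PRIMARY KEY, BookedBy TEXT, ResStatus TEXT);
CREATE TABLE ResItinerary_insurance (ResNumber INTEGER);
-- Invented sample data: reservation 1 has two insurance rows, reservation 5 is not active.
INSERT INTO ResGeneral VALUES
  (1, 'alice', 'a'), (2, 'alice', 'a'), (3, 'alice', 'a'),
  (4, 'bob', 'a'), (5, 'bob', 'x');
INSERT INTO ResItinerary_insurance VALUES (1), (1), (4);
""")

rows = conn.execute("""
SELECT g.BookedBy,
       SUM(CASE WHEN i.ResNumber IS NULL THEN 1 ELSE 0 END) AS NonInsurance,
       SUM(CASE WHEN i.ResNumber IS NOT NULL THEN 1 ELSE 0 END) AS Insurance
FROM ResGeneral g
LEFT JOIN (SELECT DISTINCT ResNumber FROM ResItinerary_insurance) i
       ON i.ResNumber = g.ResNumber
WHERE g.ResStatus = 'a'
GROUP BY g.BookedBy
ORDER BY g.BookedBy
""").fetchall()

# One row per agent: (BookedBy, NonInsurance, Insurance)
for booked_by, non_insurance, insurance in rows:
    print(booked_by, non_insurance, insurance)
```

Each reservation is visited once and lands in exactly one of the two sums, so no agent's rows can leak into another agent's totals.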

Related

Querying Customers who have rented a movie at least once every week or in the Weekend

I have a DB for movie_rental. The Tables I have are for :
Customer Level:
Primary key: Customer_id(INT)
first_name(VARCHAR)
last_name(VARCHAR)
Movie Level:
Primary key: Film_id(INT)
title(VARCHAR)
category(VARCHAR)
Rental Level:
Primary key: Rental_id(INT).
The other columns in this table are:
Rental_date(DATETIME)
customer_id(INT)
film_id(INT)
payment_date(DATETIME)
amount(DECIMAL(5,2))
Now the task is to create a master list of customers categorized as follows:
Regulars, who rent at least once a week
Weekenders, for whom most of their rentals come on Saturday and Sundays
I am not looking for the code here but for the logic to approach this problem. I have tried quite a number of ways but could not work out how to look up a customer id in each week. The code I tried is as follows:
select
r.customer_id
, concat(c.first_name, ' ', c.last_name) as Customer_Name
, dayname(r.rental_date) as day_of_rental
, case
when dayname(r.rental_date) in ('Monday','Tuesday','Wednesday','Thursday','Friday')
then 'Regulars'
else 'Weekenders'
end as Customer_Category
from rental r
inner join customer c on r.customer_id = c.customer_id;
I know it is not correct but I am not able to think beyond this.
First, you don't need the customer table for this. You can add that in after you have the classification.
To solve the problem, you need the following information:
The total number of rentals.
The total number of weeks with a rental.
The total number of weeks overall or with no rental.
The total number of rentals on weekend days.
You can obtain this information using aggregation:
select r.customer_id,
count(*) as num_rentals,
count(distinct yearweek(rental_date)) as num_weeks,
(to_days(max(rental_date)) - to_days(min(rental_date)) ) / 7 as num_weeks_overall,
sum(dayname(r.rental_date) in ('Saturday', 'Sunday')) as weekend_rentals
from rental r
group by r.customer_id;
Now, your question is a bit vague on thresholds and what to do if someone only rents on weekends but does so every week. So, I'll just make arbitrary assumptions for the final categorization:
select r.customer_id,
(case when num_weeks >= 10 and
num_weeks >= num_weeks_overall * 0.9
then 'Regular' -- at least 10 weeks and rents in 90% of the weeks
when weekend_rentals >= 0.8 * num_rentals
then 'Weekender' -- 80% of rentals are on the weekend
else 'Hoi Polloi'
end) as category
from (select r.customer_id,
count(*) as num_rentals,
count(distinct yearweek(rental_date)) as num_weeks,
(to_days(max(rental_date)) - to_days(min(rental_date)) ) / 7 as num_weeks_overall,
sum(dayname(r.rental_date) in ('Saturday', 'Sunday')) as weekend_rentals
from rental r
group by r.customer_id
) r;
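A small sketch of the same aggregate-then-classify idea, run in SQLite through Python's sqlite3 module instead of MySQL. The SQLite functions strftime and julianday stand in for yearweek, dayname and to_days. The sample rentals are invented, the minimum-weeks threshold is lowered to 3 so the tiny data set can qualify, and the 'Other' label is a placeholder; all of these are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, customer_id INTEGER, rental_date TEXT)"
)
sample = [
    # customer 1: one weekday rental in each of four consecutive weeks
    (1, "2024-01-03"), (1, "2024-01-10"), (1, "2024-01-17"), (1, "2024-01-24"),
    # customer 2: only Saturday/Sunday rentals, in two different weeks
    (2, "2024-01-06"), (2, "2024-01-07"), (2, "2024-01-27"),
    # customer 3: one weekday and one weekend rental
    (3, "2024-01-09"), (3, "2024-01-20"),
]
conn.executemany("INSERT INTO rental (customer_id, rental_date) VALUES (?, ?)", sample)

rows = conn.execute("""
SELECT customer_id,
       CASE WHEN num_weeks >= 3 AND num_weeks >= num_weeks_overall * 0.9 THEN 'Regular'
            WHEN weekend_rentals >= 0.8 * num_rentals THEN 'Weekender'
            ELSE 'Other'
       END AS category
FROM (SELECT customer_id,
             COUNT(*) AS num_rentals,
             COUNT(DISTINCT strftime('%Y-%W', rental_date)) AS num_weeks,
             (julianday(MAX(rental_date)) - julianday(MIN(rental_date))) / 7 AS num_weeks_overall,
             SUM(strftime('%w', rental_date) IN ('0', '6')) AS weekend_rentals  -- 0 = Sunday, 6 = Saturday
      FROM rental
      GROUP BY customer_id) AS r
ORDER BY customer_id
""").fetchall()

for customer_id, category in rows:
    print(customer_id, category)
```

Because the CASE branches are evaluated in order, a customer who meets both rules is labelled 'Regular'; swap the branches if weekend-only renters should win instead.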
The problem with the current approach is that every rental of every customer will be treated separately. I am assuming a customer might rent more than once and so, we will need to aggregate all rental data for a customer to calculate the category.
For the master table, you describe weekenders as customers "for whom most of their rentals come on Saturday and Sundays", and regulars as customers who rent at least once a week.
Two questions:
What is the logic for "most" for weekenders?
Are these two categories mutually exclusive? From the statement it does not seem so, because a customer might rent only on a Saturday or a Sunday.
I have tried a solution in Oracle SQL dialect (working but performance can be improved) with the logic being thus: If the customer has rented more on weekdays than on weekends, the customer is a Regular, else a Weekender. This query can be modified based on the answers to the above questions.
select
c.customer_id,
c.first_name || ' ' || c.last_name as Customer_Name,
case
when r.reg_count>r.we_count then 'Regulars'
else 'Weekenders'
end as Customer_Category
from customer c
inner join
(select customer_id, count(case when trim(to_char(rental_date, 'DAY')) in ('MONDAY','TUESDAY','WEDNESDAY','THURSDAY','FRIDAY') then 1 end) as reg_count,
count(case when trim(to_char(rental_date, 'DAY')) in ('SATURDAY','SUNDAY') then 1 end) as we_count
from rental group by customer_id) r on r.customer_id=c.customer_id;
Updated query based on the clarification given in the comments:
select
c.customer_id,
c.first_name || ' ' || c.last_name as Customer_Name,
case when rg.cnt>0 then 1 else 0 end as REGULAR,
case when we.cnt>0 then 1 else 0 end as WEEKENDER
from customer c
left outer join
(select customer_id, count(rental_id) cnt from rental where trim(to_char(rental_date, 'DAY')) in ('MONDAY','TUESDAY','WEDNESDAY','THURSDAY','FRIDAY') group by customer_id) rg on rg.customer_id=c.customer_id
left outer join
(select customer_id, count(rental_id) cnt from rental where trim(to_char(rental_date, 'DAY')) in ('SATURDAY','SUNDAY') group by customer_id) we on we.customer_id=c.customer_id;
Test Data :
insert into customer values (1, 'nonsensical', 'coder');
insert into rental values(1, 1, sysdate, 1, sysdate, 500);
insert into customer values (2, 'foo', 'bar');
insert into rental values(2, 2, sysdate-5, 2, sysdate-5, 800); -- current day is Friday, so sysdate-5 is a Sunday
Query Output (first query):
CUSTOMER_ID CUSTOMER_NAME CUSTOMER_CATEGORY
1 nonsensical coder Regulars
2 foo bar Weekenders
Query Output (second query):
CUSTOMER_ID CUSTOMER_NAME REGULAR WEEKENDER
1 nonsensical coder 1 0
2 foo bar 0 1
This is a study of cohorts. First find the minimal expression of each group:
# Weekday regulars
SELECT
customer_id
FROM rental
WHERE WEEKDAY(`date`) < 5 # 0-4 are weekdays
# Weekend warriors
SELECT
customer_id
FROM rental
WHERE WEEKDAY(`date`) > 4 # 5 and 6 are weekends
Now we know how to get a listing of customers who have rented on weekdays and weekends, inclusive. These queries only actually tell us that these were customers who visited on a day in the given series, hence we need to make some judgements.
Let's introduce a periodicity, which lets us define thresholds. We'll need aggregation too, so we count the distinct weeks in which each customer rented, grouping by rental.customer_id.
# Weekday regulars
SELECT
customer_id
, COUNT(DISTINCT YEARWEEK(`date`)) AS weeks_as_customer
FROM rental
WHERE WEEKDAY(`date`) < 5
GROUP BY customer_id
# Weekend warriors
SELECT
customer_id
, COUNT(DISTINCT YEARWEEK(`date`)) AS weeks_as_customer
FROM rental
WHERE WEEKDAY(`date`) > 4
GROUP BY customer_id
We also need a determinant period:
FLOOR(DATEDIFF(DATE(NOW()), '2019-01-01') / 7) AS weeks_in_period
Put those together:
# Weekday regulars
SELECT
customer_id
, period.total_weeks
, COUNT(DISTINCT YEARWEEK(`date`)) AS weeks_as_customer
FROM rental
CROSS JOIN (
SELECT FLOOR(DATEDIFF(DATE(NOW()), '2019-01-01') / 7) AS total_weeks
) AS period
WHERE WEEKDAY(`date`) < 5
GROUP BY customer_id
# Weekend warriors
SELECT
customer_id
, period.total_weeks
, COUNT(DISTINCT YEARWEEK(`date`)) AS weeks_as_customer
FROM rental
CROSS JOIN (
SELECT FLOOR(DATEDIFF(DATE(NOW()), '2019-01-01') / 7) AS total_weeks
) AS period
WHERE WEEKDAY(`date`) > 4
GROUP BY customer_id
So now we can introduce our threshold accumulator per cohort.
# Weekday regulars
SELECT
customer_id
, period.total_weeks
, COUNT(DISTINCT YEARWEEK(`date`)) AS weeks_as_customer
FROM rental
CROSS JOIN (
SELECT FLOOR(DATEDIFF(DATE(NOW()), '2019-01-01') / 7) AS total_weeks
) AS period
WHERE WEEKDAY(`date`) < 5
GROUP BY customer_id
HAVING total_weeks = weeks_as_customer
# Weekend warriors
SELECT
customer_id
, period.total_weeks
, COUNT(DISTINCT YEARWEEK(`date`)) AS weeks_as_customer
FROM rental
CROSS JOIN (
SELECT FLOOR(DATEDIFF(DATE(NOW()), '2019-01-01') / 7) AS total_weeks
) AS period
WHERE WEEKDAY(`date`) > 4
GROUP BY customer_id
HAVING total_weeks = weeks_as_customer
Then we can use these to subquery our master list.
SELECT
customer.customer_id
, CONCAT(customer.first_name, ' ', customer.last_name) AS customer_name
, CASE
WHEN regulars.customer_id IS NOT NULL THEN 'regular'
WHEN weekenders.customer_id IS NOT NULL THEN 'weekender'
ELSE NULL
END AS category
FROM customer
LEFT JOIN (
SELECT
rental.customer_id
, period.total_weeks
, COUNT(DISTINCT YEARWEEK(rental.`date`)) AS weeks_as_customer
FROM rental
# the period must be joined inside each derived table to be visible there
CROSS JOIN (
SELECT FLOOR(DATEDIFF(DATE(NOW()), '2019-01-01') / 7) AS total_weeks
) AS period
WHERE WEEKDAY(rental.`date`) < 5
GROUP BY rental.customer_id
HAVING total_weeks = weeks_as_customer
) AS regulars ON customer.customer_id = regulars.customer_id
LEFT JOIN (
SELECT
rental.customer_id
, period.total_weeks
, COUNT(DISTINCT YEARWEEK(rental.`date`)) AS weeks_as_customer
FROM rental
CROSS JOIN (
SELECT FLOOR(DATEDIFF(DATE(NOW()), '2019-01-01') / 7) AS total_weeks
) AS period
WHERE WEEKDAY(rental.`date`) > 4
GROUP BY rental.customer_id
HAVING total_weeks = weeks_as_customer
) AS weekenders ON customer.customer_id = weekenders.customer_id
HAVING category IS NOT NULL
There is some ambiguity about whether cross-cohort customers should be left out (for instance, regulars who missed a weekday week because they rented only on the weekend at least once). You would need to settle this inclusivity/exclusivity question yourself.
That means going back to the cohort-specific queries and tuning them to capture those cases, and/or adding other cross-cutting cohort subqueries that can be combined at the top level.
However, I think what I've provided reasonably matches what you asked for, given this caveat.

Convert SumIfs Excel Function to MySQL

The formula in cell G2 "ReplenQty" is:
=SUMIFS(D:D,A:A,A2,B:B,B2,C:C,">=" & E2,C:C,"<=" &F2)
The formula in cell H2 "RpInVar" is:
=IF($A2<>$A1,ROUND(VAR(IF($A:$A=$A2,$G:$G)),2),0)
I attempted this in MySQL:
SELECT DISTINCT
Part,
Customer,
OrdDt,
OrdQty,
StartDate,
ReplenDate,
SUM(CASE WHEN Part = Part AND Customer = Customer AND OrdDt >= StartDate AND OrdDt <= ReplenDate THEN OrdQty ELSE 0 END) AS ReplenQty,
VARIANCE(CASE WHEN Part = Part AND Customer = Customer AND OrdDt >= StartDate AND OrdDt <= ReplenDate THEN OrdQty ELSE 0 END) AS RpInVar
FROM
BeforeReplenQty
GROUP BY
Part,
Customer,
OrdDt,
OrdQty,
StartDate,
ReplenDate;
The problem is that OrdQty and ReplenQty come out the same, and RpInVar is 0 for every row.
This query is quite long and complicated, but it works in this demo: http://sqlfiddle.com/#!9/3b3334/70
One task is to do a sum where order date is between start date and replenish date.
Then get the row where part is new compared to previous row.
The first part of the query is to get the variance, the second subquery is to get the sum of Ordered qty and the sub-query at the bottom is to get the row where part column has changed.
select tab.Part,tab.Customer,tab.OrdDt,tab.OrdQty,tab.StartDate,tab.ReplenDate,tab.ReplenQty,
case when sumtab.Rnk=1 then
(select variance(ReplenQty)
from (select sum(t1.OrdQty) as ReplenQty
from BeforeReplenQty t2
inner join BeforeReplenQty t1
where t2.part=t1.part and t2.customer=t1.customer
and t2.OrdDt between t1.StartDate and t1.ReplenDate
group by t1.Part,t1.Customer,t1.OrdDt,t1.OrdQty,t1.StartDate,t1.ReplenDate) t3) else 0 end as ReplenVar
from (
select t1.*,sum(t1.OrdQty) as ReplenQty
from BeforeReplenQty t2
inner join BeforeReplenQty t1
where t2.part=t1.part and t2.customer=t1.customer
and t2.OrdDt between t1.StartDate and t1.ReplenDate
group by t1.Part,t1.Customer,t1.OrdDt,t1.OrdQty,t1.StartDate,t1.ReplenDate) tab
left join (select part,customer,orddt,rnk
from (
select t.part,t.customer,t.OrdDt,
@s:=CASE WHEN @c <> t.part THEN 1 ELSE @s+1 END AS rnk,
@c:=t.part AS partSet
from (SELECT @s:= 0) s
inner join (SELECT @c:= 'A') c
inner join (SELECT * from BeforeReplenQty
order by Part, Customer, OrdDt) t
) tab
where rnk = 1
) sumtab
on tab.part=sumtab.part and tab.customer=sumtab.customer and tab.orddt=sumtab.orddt;
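As a smaller illustration of just the SUMIFS part (ReplenQty), here is a sketch that uses a correlated subquery, run in SQLite through Python's sqlite3 module rather than MySQL. The column names follow the question; the sample rows are invented. The subquery sums the OrdQty of the matching rows (t2 here), which mirrors how SUMIFS adds up column D for every row that satisfies the criteria. The variance column is left out because SQLite has no built-in VARIANCE function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE BeforeReplenQty (
    Part TEXT, Customer TEXT, OrdDt TEXT, OrdQty INTEGER,
    StartDate TEXT, ReplenDate TEXT)
""")
sample = [
    ("A", "X", "2020-01-01", 5, "2020-01-01", "2020-01-10"),
    ("A", "X", "2020-01-05", 3, "2020-01-05", "2020-01-14"),
    ("A", "X", "2020-01-12", 4, "2020-01-12", "2020-01-21"),
    ("B", "X", "2020-01-03", 7, "2020-01-03", "2020-01-12"),
]
conn.executemany("INSERT INTO BeforeReplenQty VALUES (?, ?, ?, ?, ?, ?)", sample)

rows = conn.execute("""
SELECT t1.Part, t1.OrdDt, t1.OrdQty,
       (SELECT SUM(t2.OrdQty)
        FROM BeforeReplenQty t2
        WHERE t2.Part = t1.Part                                  -- A:A = A2
          AND t2.Customer = t1.Customer                          -- B:B = B2
          AND t2.OrdDt BETWEEN t1.StartDate AND t1.ReplenDate    -- E2 <= C:C <= F2
       ) AS ReplenQty
FROM BeforeReplenQty t1
ORDER BY t1.Part, t1.OrdDt
""").fetchall()

for part, ord_dt, ord_qty, replen_qty in rows:
    print(part, ord_dt, ord_qty, replen_qty)
```

The first row's window (2020-01-01 to 2020-01-10) covers the orders of 5 and 3, so its ReplenQty is 8 while its own OrdQty stays 5.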

mysql Sub queries in COUNT along with GROUP BY YEAR

Having some trouble figuring out the best way to do this.
Here is what I'm trying to do:
SELECT
YEAR(t.voucher_date) as period,
COUNT(t.id) as total_count,
(SELECT COUNT(t2.id) FROM booking_global as t2 where t2.booking_status = 'CONFIRMED') as confirmed,
(SELECT COUNT(t3.id) FROM booking_global as t3 where t3.booking_status = 'PENDING') as pending
FROM booking_global t
GROUP BY YEAR(t.voucher_date)
This produces the below result.
period total_count CONFIRMED PENDING
2014 4 5 3
2015 4 5 3
Expected Result
period total_count CONFIRMED PENDING
2014 4 3 1
2015 4 2 2
Here I want to get the CONFIRMED / PENDING counts for the respective years, rather than the counts across all years.
I am not sure how to use my query as a sub query and run another query on the results.
The following should give you the right result:
SELECT
YEAR(t.voucher_date) as period,
COUNT(t.id) as total_count,
(SELECT COUNT(t2.id) FROM booking_global as t2 where t2.booking_status = 'CONFIRMED' and YEAR(t2.voucher_date) = YEAR(t.voucher_date)) as confirmed,
(SELECT COUNT(t3.id) FROM booking_global as t3 where t3.booking_status = 'PENDING' and YEAR(t3.voucher_date) = YEAR(t.voucher_date)) as pending
FROM booking_global t
GROUP BY YEAR(t.voucher_date)
You can use a subquery that counts each booking_status per year and then join its result to booking_global. For example:
SELECT YEAR(t.voucher_date) voucher_date_year,
COUNT(t.id) total_count,
IFNULL(calc.confirmed_count, 0) confirmed_count,
IFNULL(calc.pending_count, 0) pending_count
FROM booking_global t
LEFT JOIN
(
SELECT YEAR(voucher_date) voucher_date_year,
SUM(booking_status = 'CONFIRMED') confirmed_count,
SUM(booking_status = 'PENDING') pending_count
FROM booking_global
GROUP BY YEAR(voucher_date)
) calc ON calc.voucher_date_year = YEAR(t.voucher_date)
GROUP BY YEAR(t.voucher_date)
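Since every count comes from the same table and the same grouping, the per-status counts can also be computed in a single pass with conditional aggregation, without any subquery. Below is a minimal sketch in SQLite through Python's sqlite3 module, not MySQL; strftime('%Y', ...) stands in for YEAR(), and the sample rows are invented so that they reproduce the expected result from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE booking_global (id INTEGER PRIMARY KEY, voucher_date TEXT, booking_status TEXT)"
)
sample = [
    ("2014-02-01", "CONFIRMED"), ("2014-05-10", "CONFIRMED"),
    ("2014-08-21", "CONFIRMED"), ("2014-11-03", "PENDING"),
    ("2015-01-15", "CONFIRMED"), ("2015-03-09", "CONFIRMED"),
    ("2015-06-30", "PENDING"), ("2015-09-12", "PENDING"),
]
conn.executemany(
    "INSERT INTO booking_global (voucher_date, booking_status) VALUES (?, ?)", sample
)

rows = conn.execute("""
SELECT CAST(strftime('%Y', voucher_date) AS INTEGER) AS period,
       COUNT(id) AS total_count,
       SUM(CASE WHEN booking_status = 'CONFIRMED' THEN 1 ELSE 0 END) AS confirmed,
       SUM(CASE WHEN booking_status = 'PENDING' THEN 1 ELSE 0 END) AS pending
FROM booking_global
GROUP BY period
ORDER BY period
""").fetchall()

for period, total_count, confirmed, pending in rows:
    print(period, total_count, confirmed, pending)
```

In MySQL the same shape would use YEAR(voucher_date) in place of the strftime expression.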

Need a query to find consecutive orders by month

I am having trouble writing this query.
I need to get the current number of orders that were shipped in consecutive months.
Example: if the current month is November and they placed orders in July, August, September, October, November, it would return 5 for that user. If they didn't place an order in November, it would return 0 because their streak is broken.
The tables I'm concerned with are customer, order, and date.
Use a cross join between the date table and the customer table to get a row for every customer / month combination and then left join that against the order table to get the details, using group by to get the counts.
Something like this, although you will need to modify it to cope with the column names being reserved words.
SELECT customer.name, month.name, COUNT(order.id)
FROM customer
CROSS JOIN date
LEFT OUTER JOIN order
ON customer.id = order.customer_id
AND MONTH(date.date) = MONTH(order.date)
WHERE date.date BETWEEN startofdaterange AND endofdaterange
GROUP BY customer.name, month.name
Or if I have misread the question, and you need a count of the orders if they order every month in the range, or 0 if they skipped a month then something like this (not tested so expect a typo or 2, would need the table def to test):-
SELECT CustName, CASE WHEN MonthCount = MonthsWithOrders THEN OrderCount ELSE 0 END AS ContinuousOrderMonths
FROM (
SELECT CustName, COUNT(MonthName) AS MonthCount, SUM(MonthOrderCount) AS OrderCount, SUM(CASE WHEN MonthOrderCount > 0 THEN 1 ELSE 0 END) AS MonthsWithOrders
FROM (
SELECT customer.name AS CustName, month.name AS MonthName, COUNT(order.id) AS MonthOrderCount
FROM customer
CROSS JOIN date
LEFT OUTER JOIN order
ON customer.id = order.customer_id
AND MONTH(date.date) = MONTH(order.date)
WHERE date.date BETWEEN startofdaterange AND endofdaterange
GROUP BY customer.name, month.name ) Sub1
GROUP BY CustName ) Sub2
If you want a list of customers and a comma separated list of orders per month:-
SELECT CustName, GROUP_CONCAT(CAST(MonthsOrder AS CHAR))
FROM (
SELECT customer.name AS CustName, month.name, COUNT(order.id) AS MonthsOrder
FROM customer
CROSS JOIN date
LEFT OUTER JOIN order
ON customer.id = order.customer_id
AND MONTH(date.date) = MONTH(order.date)
WHERE date.date BETWEEN startofdaterange AND endofdaterange
GROUP BY customer.name, month.name) Sub1
GROUP BY CustName
You might have to expand this to get the month name with each one and force the order.
Here, replace NOW() and the static date with your own column names:
select (case
when (month(now())=11 and
(month('2012-02-02')>=7 and month('2012-02-02')<=11))
then 5
else
0 end) as 'month'
from tablename
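The streak rule from the question (count back from the current month and stop at the first month without an order) can be sketched outside SQL as follows. This is only an illustration of the logic: the function name current_streak and the (year, month) set representation are hypothetical, and it assumes the months with shipped orders have already been fetched per customer.

```python
from datetime import date


def current_streak(order_months, today):
    """Count consecutive months with an order, walking back from today's month.

    order_months is a set of (year, month) tuples; returns 0 if the current
    month has no order, because the streak is then broken.
    """
    year, month = today.year, today.month
    streak = 0
    while (year, month) in order_months:
        streak += 1
        month -= 1
        if month == 0:  # step from January back to December of the previous year
            year, month = year - 1, 12
    return streak


# Orders in July through November, evaluated in November.
shipped = {(2023, m) for m in (7, 8, 9, 10, 11)}
print(current_streak(shipped, date(2023, 11, 15)))                 # 5
print(current_streak(shipped - {(2023, 11)}, date(2023, 11, 15)))  # 0
```

Keying on (year, month) rather than the month number alone keeps November 2022 from being mistaken for November 2023, which a bare MONTH() comparison would not distinguish.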

MySql - Selecting First And Last Row (Not Min/Max)

I have a MySql table with daily stock market data in the following order:
_date, _opening_price, _high_price, _low_price, _closing_price
I'm trying to transform this data into weekly data by using:
SELECT
MAX(_date) AS _date,
WEEK(_date) AS weeknum,
_opening_price,
MAX(_high_price) AS _high_price,
MIN(_low_price) AS _low_price,
_closing_price
FROM myTable
GROUP BY weeknum ORDER BY _date;
How do I select _opening_price so that it is the first _opening_price from within that week's daily data? Likewise, how do I select _closing_price so that it is the last _closing_price within the week's daily data?
Here's an example:
For week ending 2007-01-05, the opening price should be taken from 2007-01-03 and the closing price from 2007-01-05. Similarly, for week ending 2007-01-12, the opening price should be from 2007-01-08 and the closing price from 2007-01-12.
Try this solution:
SELECT
MAX(a._date) weekending,
MAX(CASE WHEN a._date = b.mindate THEN a._opening_price END) openingprice,
MAX(CASE WHEN a._date = b.maxdate THEN a._closing_price END) closingprice
FROM myTable a
INNER JOIN
(
SELECT
CONCAT(YEAR(_date), '-', WEEK(_date)) weeknum,
MIN(_date) mindate,
MAX(_date) maxdate
FROM myTable
GROUP BY weeknum
) b ON a._date IN (b.mindate, b.maxdate)
GROUP BY b.weeknum
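A runnable sketch of the same idea that also keeps the weekly high and low: find each week's first and last trading date, then join back to the daily rows to pick up the opening and closing prices. It uses SQLite through Python's sqlite3 module rather than MySQL, with strftime('%Y-%W', ...) standing in for the YEAR/WEEK key. The dates follow the example in the question; the prices are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE myTable (
    _date TEXT PRIMARY KEY, _opening_price REAL, _high_price REAL,
    _low_price REAL, _closing_price REAL)
""")
sample = [
    ("2007-01-03", 10.0, 12.0, 9.0, 11.0),
    ("2007-01-04", 11.0, 13.0, 10.5, 12.5),
    ("2007-01-05", 12.5, 12.8, 11.0, 11.5),
    ("2007-01-08", 11.5, 14.0, 11.2, 13.0),
    ("2007-01-12", 13.0, 13.5, 12.0, 12.2),
]
conn.executemany("INSERT INTO myTable VALUES (?, ?, ?, ?, ?)", sample)

rows = conn.execute("""
SELECT w.last_day AS week_ending,
       o._opening_price,      -- from the first trading day of the week
       w.high,
       w.low,
       c._closing_price       -- from the last trading day of the week
FROM (SELECT strftime('%Y-%W', _date) AS wk,
             MIN(_date) AS first_day,
             MAX(_date) AS last_day,
             MAX(_high_price) AS high,
             MIN(_low_price) AS low
      FROM myTable
      GROUP BY wk) AS w
JOIN myTable o ON o._date = w.first_day
JOIN myTable c ON c._date = w.last_day
ORDER BY w.last_day
""").fetchall()

for week_ending, open_, high, low, close in rows:
    print(week_ending, open_, high, low, close)
```

This relies on _date being unique per row, so each of the two joins matches exactly one daily record per week.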