2 different AVG columns in 1 table with select - mysql

I am trying to make a table that shows the AVG of pickpockets in districts with markets and the AVG of pickpockets in districts without markets.
I would like the output to look like this:
district with market | district without market
----------------------------------------------
269 | 34
But instead I get this:
district with market | district without market
----------------------------------------------
269 | 269
34 | 34
This is the query I used:
select round(avg(average),0) as districts_with_markets, round(avg(average),0) as districts_without_markets
from zakkenrollerij
where wijk in (select district
from market)
union
select round(avg(average),0) as districts_with_markets, round(avg(average),0) as districts_without_markets
from zakkenrollerij
where wijk not in (select district
from market)
I hope someone can help me :D

Assuming that district is unique in market, you can do this with a left join and conditional aggregation:
select round(avg(case when m.district is not null then average end), 0) as districts_with_markets,
round(avg(case when m.district is null then average end), 0) as districts_without_markets
from zakkenrollerij z left join
market m
on m.district = z.wijk;
If this is not the case, then use a subquery and a flag:
select round(avg(case when hasMarketFlag then average end), 0) as districts_with_markets,
round(avg(case when not hasMarketFlag then average end), 0) as districts_without_markets
from (select z.*,
(exists (select 1
from market m
where m.district = z.wijk
)
) as hasMarketFlag
from zakkenrollerij z
) z;
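As a rough check of the left join with conditional aggregation, here is a minimal sketch on made-up data. The table and column names follow the question, but the district names and values are hypothetical, chosen so the averages come out to the numbers the question asks for.

```sql
-- Hypothetical sample data; only the table and column names come from the question.
CREATE TABLE market (district VARCHAR(20) PRIMARY KEY);
CREATE TABLE zakkenrollerij (wijk VARCHAR(20), average DECIMAL(6,1));

INSERT INTO market VALUES ('Centrum'), ('Noord');
INSERT INTO zakkenrollerij VALUES
  ('Centrum', 300), ('Noord', 238),   -- districts with a market
  ('Zuid', 30), ('Oost', 38);         -- districts without a market

-- CASE without ELSE yields NULL, and AVG ignores NULLs,
-- so each column averages only its own group of districts.
SELECT ROUND(AVG(CASE WHEN m.district IS NOT NULL THEN z.average END), 0) AS districts_with_markets,
       ROUND(AVG(CASE WHEN m.district IS NULL THEN z.average END), 0) AS districts_without_markets
FROM zakkenrollerij z
LEFT JOIN market m ON m.district = z.wijk;
-- one row: 269 | 34
```

Because both averages are computed in a single pass, the result is one row with two columns instead of two stacked rows.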

Try this:-
Select sum(dist_with_markets) as district_with_markets,
sum(dist_without_markets) as district_without_markets
from
(
select round(avg(average),0) as dist_with_markets, 0 as dist_without_markets
from zakkenrollerij
where wijk in (select district
from market)
union
select 0 as dist_with_markets, round(avg(average),0) as dist_without_markets
from zakkenrollerij
where wijk not in (select district
from market) ) a;
Hope this helps:-)

Please try the following...
SELECT ROUND( AVG( with_markets ) ) AS districts_with_markets,
ROUND( AVG( without_markets ) ) AS districts_without_markets
FROM ( SELECT average AS with_markets,
NULL AS without_markets
FROM zakkenrollerij
WHERE wijk IN ( SELECT district
FROM market )
UNION ALL
SELECT NULL,
average
FROM zakkenrollerij
WHERE wijk NOT IN ( SELECT district
FROM market )
) AS tempTable;
This starts by forming a list of all the values of average within zakkenrollerij that belong to districts with a market. No calculations are performed at this stage. The second column is reserved for the values from districts without a market; all of its values are set to NULL at this stage.
This list is then joined vertically with its without-market counterpart using UNION ALL. A plain UNION would remove duplicate rows, so repeated average values would be counted only once and skew the result.
The joined list then has the ROUND( AVG() ) operations performed upon its columns.
If you have any questions or comments, then please feel free to post a Comment accordingly.
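One thing to watch when feeding a union into AVG: a plain UNION removes duplicate rows, while UNION ALL keeps them. The sketch below uses a hypothetical demo_values table (not part of the question) to show how the two differ.

```sql
-- Hypothetical table, used only to illustrate duplicate handling.
CREATE TABLE demo_values (val INT);
INSERT INTO demo_values VALUES (10), (10), (40);

-- UNION removes duplicate rows: the two 10s collapse into one,
-- so AVG sees 10 and 40 (the NULL row is ignored by AVG).
SELECT AVG(val) AS avg_union
FROM (SELECT val FROM demo_values
      UNION
      SELECT NULL) u;        -- 25

-- UNION ALL keeps every row, so AVG sees 10, 10 and 40.
SELECT AVG(val) AS avg_union_all
FROM (SELECT val FROM demo_values
      UNION ALL
      SELECT NULL) u;        -- 20
```

If two districts happen to share the same average, only UNION ALL counts both of them.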

You are now going through the table twice, selecting the same expression twice under different names:
round(avg(average),0) as districts_with_markets, round(avg(average),0) as districts_without_market
Instead of uniting the tables afterwards you can use CASE to select a variable with specific conditions. This should give the wanted result:
select round(avg(case when wijk in (select district from market) then average else null end),0) as districts_with_markets,
round(avg(case when wijk not in (select district from market) then average else null end),0) as districts_without_markets
from zakkenrollerij

Related

Querying Customers who have rented a movie at least once every week or in the Weekend

I have a DB for movie_rental. The Tables I have are for :
Customer Level:
Primary key: Customer_id(INT)
first_name(VARCHAR)
last_name(VARCHAR)
Movie Level:
Primary key: Film_id(INT)
title(VARCHAR)
category(VARCHAR)
Rental Level:
Primary key: Rental_id(INT).
The other columns in this table are:
Rental_date(DATETIME)
customer_id(INT)
film_id(INT)
payment_date(DATETIME)
amount(DECIMAL(5,2))
Now the question is to Create a master list of customers categorized by the following:
Regulars, who rent at least once a week
Weekenders, for whom most of their rentals come on Saturday and Sundays
I am not looking for the code here but the logic to approach this problem. Have tried quite a number of ways but was not able to form the logic as to how I can look up for a customer id in each week. The code I tried is as follows:
select
r.customer_id
, concat(c.first_name, ' ', c.last_name) as Customer_Name
, dayname(r.rental_date) as day_of_rental
, case
when dayname(r.rental_date) in ('Monday','Tuesday','Wednesday','Thursday','Friday')
then 'Regulars'
else 'Weekenders'
end as Customer_Category
from rental r
inner join customer c on r.customer_id = c.customer_id;
I know it is not correct but I am not able to think beyond this.
First, you don't need the customer table for this. You can add that in after you have the classification.
To solve the problem, you need the following information:
The total number of rentals.
The total number of weeks with a rental.
The total number of weeks overall or with no rental.
The total number of rentals on weekend days.
You can obtain this information using aggregation:
select r.customer_id,
count(*) as num_rentals,
count(distinct yearweek(rental_date)) as num_weeks,
(to_days(max(rental_date)) - to_days(min(rental_date)) ) / 7 as num_weeks_overall,
sum(dayname(r.rental_date) in ('Saturday', 'Sunday')) as weekend_rentals
from rental r
group by r.customer_id;
Now, your question is a bit vague on thresholds and what to do if someone only rents on weekends but does so every week. So, I'll just make arbitrary assumptions for the final categorization:
select r.customer_id,
(case when num_weeks >= 10 and
num_weeks >= num_weeks_overall * 0.9
then 'Regular' -- at least 10 weeks and rents in 90% of the weeks
when weekend_rentals >= 0.8 * num_rentals
then 'Weekender' -- 80% of rentals are on the weekend
else 'Hoi Polloi'
end) as category
from (select r.customer_id,
count(*) as num_rentals,
count(distinct yearweek(rental_date)) as num_weeks,
(to_days(max(rental_date)) - to_days(min(rental_date)) ) / 7 as num_weeks_overall,
sum(dayname(r.rental_date) in ('Saturday', 'Sunday')) as weekend_rentals
from rental r
group by r.customer_id
) r;
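To see the weekend share computed per customer, here is a small sketch on hypothetical rental rows (the dates and ids are invented). It uses WEEKDAY() instead of DAYNAME(), which is an assumption on my part rather than the answer's choice: WEEKDAY() returns 0 for Monday through 6 for Sunday regardless of the lc_time_names locale, whereas DAYNAME() returns names in the configured language.

```sql
-- Hypothetical sample data; only the columns the aggregation needs.
CREATE TABLE rental (
  rental_id   INT PRIMARY KEY,
  rental_date DATETIME,
  customer_id INT
);
INSERT INTO rental VALUES
  (1, '2024-01-06 10:00:00', 1),  -- Saturday
  (2, '2024-01-13 10:00:00', 1),  -- Saturday
  (3, '2024-01-14 10:00:00', 1),  -- Sunday
  (4, '2024-01-02 10:00:00', 2),  -- Tuesday
  (5, '2024-01-09 10:00:00', 2),  -- Tuesday
  (6, '2024-01-16 10:00:00', 2),  -- Tuesday
  (7, '2024-01-20 10:00:00', 2);  -- Saturday

-- A comparison evaluates to 1 or 0 in MySQL, so SUM() counts the matching rows.
SELECT customer_id,
       COUNT(*) AS num_rentals,
       SUM(WEEKDAY(rental_date) >= 5) AS weekend_rentals,            -- 5 = Saturday, 6 = Sunday
       SUM(WEEKDAY(rental_date) >= 5) / COUNT(*) AS weekend_share
FROM rental
GROUP BY customer_id;
-- customer 1: 3 rentals, 3 on weekend days
-- customer 2: 4 rentals, 1 on a weekend day
```

With an 80% threshold, customer 1 would count as a Weekender here and customer 2 would not.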
The problem with the current approach is that every rental of every customer will be treated separately. I am assuming a customer might rent more than once and so, we will need to aggregate all rental data for a customer to calculate the category.
So to create the master table, you have mentioned in the logic that weekenders are customers "for whom most of their rentals come on Saturday and Sundays", whereas regulars are customers who rent at least once a week.
2 questions:-
What is the logic for "most" for weekenders?
Are these two categories mutually exclusive? From the statement it does not seem so, because a customer might rent only on a Saturday or a Sunday.
I have tried a solution in Oracle SQL dialect (working but performance can be improved) with the logic being thus: If the customer has rented more on weekdays than on weekends, the customer is a Regular, else a Weekender. This query can be modified based on the answers to the above questions.
select
c.customer_id,
c.first_name || ' ' || c.last_name as Customer_Name,
case
when r.reg_count>r.we_count then 'Regulars'
else 'Weekenders'
end as Customer_Category
from customer c
inner join
(select customer_id, count(case when trim(to_char(rental_date, 'DAY')) in ('MONDAY','TUESDAY','WEDNESDAY','THURSDAY','FRIDAY') then 1 end) as reg_count,
count(case when trim(to_char(rental_date, 'DAY')) in ('SATURDAY','SUNDAY') then 1 end) as we_count
from rental group by customer_id) r on r.customer_id=c.customer_id;
Updated query based on clarity given in comment:-
select
c.customer_id,
c.first_name || ' ' || c.last_name as Customer_Name,
case when rg.cnt>0 then 1 else 0 end as REGULAR,
case when we.cnt>0 then 1 else 0 end as WEEKENDER
from customer c
left outer join
(select customer_id, count(rental_id) cnt from rental where trim(to_char(rental_date, 'DAY')) in ('MONDAY','TUESDAY','WEDNESDAY','THURSDAY','FRIDAY') group by customer_id) rg on rg.customer_id=c.customer_id
left outer join
(select customer_id, count(rental_id) cnt from rental where trim(to_char(rental_date, 'DAY')) in ('SATURDAY','SUNDAY') group by customer_id) we on we.customer_id=c.customer_id;
Test Data :
insert into customer values (1, 'nonsensical', 'coder');
insert into rental values(1, 1, sysdate, 1, sysdate, 500);
insert into customer values (2, 'foo', 'bar');
insert into rental values(2, 2, sysdate-5, 2, sysdate-5, 800); [Current day is Friday]
Query Output (first query):
CUSTOMER_ID CUSTOMER_NAME CUSTOMER_CATEGORY
1 nonsensical coder Regulars
2 foo bar Weekenders
Query Output (second query):
CUSTOMER_ID CUSTOMER_NAME REGULAR WEEKENDER
1 nonsensical coder 1 0
2 foo bar 0 1
This is a study of cohorts. First find the minimal expression of each group:
# Weekday regulars
SELECT
customer_id
FROM rental
WHERE WEEKDAY(`date`) < 5 # 0-4 are weekdays
# Weekend warriors
SELECT
customer_id
FROM rental
WHERE WEEKDAY(`date`) > 4 # 5 and 6 are weekends
Now we know how to get a listing of customers who have rented on weekdays and weekends, inclusive. These queries only actually tell us that these were customers who visited on a day in the given series, hence we need to make some judgements.
Let's introduce a periodicity, which then allows us to gain thresholds. We'll need aggregation too, so we're going to count the weeks that are distinctly knowable by grouping to the rental.customer_id.
# Weekday regulars
SELECT
customer_id
, COUNT(DISTINCT YEARWEEK(`date`)) AS weeks_as_customer
FROM rental
WHERE WEEKDAY(`date`) < 5
GROUP BY customer_id
# Weekend warriors
SELECT
customer_id
, COUNT(DISTINCT YEARWEEK(`date`)) AS weeks_as_customer
FROM rental
WHERE WEEKDAY(`date`) > 4
GROUP BY customer_id
We also need a determinant period:
FLOOR(DATEDIFF(DATE(NOW()), '2019-01-01') / 7) AS total_weeks
Put those together:
# Weekday regulars
SELECT
customer_id
, period.total_weeks
, COUNT(DISTINCT YEARWEEK(`date`)) AS weeks_as_customer
FROM rental
CROSS JOIN (
SELECT FLOOR(DATEDIFF(DATE(NOW()), '2019-01-01') / 7) AS total_weeks
) AS period
WHERE WEEKDAY(`date`) < 5
GROUP BY customer_id
# Weekend warriors
SELECT
customer_id
, period.total_weeks
, COUNT(DISTINCT YEARWEEK(`date`)) AS weeks_as_customer
FROM rental
CROSS JOIN (
SELECT FLOOR(DATEDIFF(DATE(NOW()), '2019-01-01') / 7) AS total_weeks
) AS period
WHERE WEEKDAY(`date`) > 4
GROUP BY customer_id
So now we can introduce our threshold accumulator per cohort.
# Weekday regulars
SELECT
customer_id
, period.total_weeks
, COUNT(DISTINCT YEARWEEK(`date`)) AS weeks_as_customer
FROM rental
CROSS JOIN (
SELECT FLOOR(DATEDIFF(DATE(NOW()), '2019-01-01') / 7) AS total_weeks
) AS period
WHERE WEEKDAY(`date`) < 5
GROUP BY customer_id
HAVING total_weeks = weeks_as_customer
# Weekend warriors
SELECT
customer_id
, period.total_weeks
, COUNT(DISTINCT YEARWEEK(`date`)) AS weeks_as_customer
FROM rental
CROSS JOIN (
SELECT FLOOR(DATEDIFF(DATE(NOW()), '2019-01-01') / 7) AS total_weeks
) AS period
WHERE WEEKDAY(`date`) > 4
GROUP BY customer_id
HAVING total_weeks = weeks_as_customer
Then we can use these to subquery our master list.
SELECT
customer.customer_id
, CONCAT(customer.first_name, ' ', customer.last_name) AS customer_name
, CASE
WHEN regulars.customer_id IS NOT NULL THEN 'regular'
WHEN weekenders.customer_id IS NOT NULL THEN 'weekender'
ELSE NULL
END AS category
FROM customer
LEFT JOIN (
SELECT
rental.customer_id
, period.total_weeks
, COUNT(DISTINCT YEARWEEK(rental.`date`)) AS weeks_as_customer
FROM rental
CROSS JOIN (
SELECT FLOOR(DATEDIFF(DATE(NOW()), '2019-01-01') / 7) AS total_weeks
) AS period # joined inside, because a derived table cannot see the outer query
WHERE WEEKDAY(rental.`date`) < 5
GROUP BY rental.customer_id
HAVING total_weeks = weeks_as_customer
) AS regulars ON customer.customer_id = regulars.customer_id
LEFT JOIN (
SELECT
rental.customer_id
, period.total_weeks
, COUNT(DISTINCT YEARWEEK(rental.`date`)) AS weeks_as_customer
FROM rental
CROSS JOIN (
SELECT FLOOR(DATEDIFF(DATE(NOW()), '2019-01-01') / 7) AS total_weeks
) AS period
WHERE WEEKDAY(rental.`date`) > 4
GROUP BY rental.customer_id
HAVING total_weeks = weeks_as_customer
) AS weekenders ON customer.customer_id = weekenders.customer_id
HAVING category IS NOT NULL
There is some ambiguity about whether cross-cohort customers should be left out (for instance, regulars who missed a weekday week because they rented only on the weekend that week). You would need to decide how inclusive or exclusive the categories are.
Doing so means going back to the cohort-specific queries and tuning their conditions, and/or adding further subqueries that cut across cohorts and can be combined at the top level.
However, I think what I've provided matches reasonably well with what you've described, given this caveat.

SQL Query for sorting and getting unique count

I have a table which consists of the following details
Customer | Deal | DealStage
----------------------------
A | D1 | Lost
A | D2 | Won
A | D3 | Contacted
B | D4 | Contacted
B | D5 | Lost
C | D6 | Lost
D | D7 | Lost
I have to develop a query where I should get the unique highest stage for each customer. The Stage priority is Won > Contacted > Lost. For Example, A is having three deals which are Won, Lost, and Contacted. So I should be considering Won. Similarly Contacted for B and Lost for C and D
Is it possible to get an Output like
Customer | Highest Stage
------------------------
A | Won
B | Contacted
C | Lost
D | Lost
By this, I can generate a pivot table that looks like
Stage | CustomerCount
---------------------
Won | 1
Contacted | 1
Lost | 2
Thanks in Advance
One option uses aggregation and field():
select customer,
case min(field(deal_stage, 'Won', 'Contacted', 'Lost'))
when 1 then 'Won'
when 2 then 'Contacted'
when 3 then 'Lost'
end as highest_stage
from mytable
group by customer
Actually we could combine this with elt():
select customer,
elt(
min(field(deal_stage, 'Won', 'Contacted', 'Lost')),
'Won', 'Contacted', 'Lost'
) as highest_stage
from mytable
group by customer
You can then generate the final result with another level of aggregation:
select highest_stage, count(*)
from (
select customer,
elt(
min(field(deal_stage, 'Won', 'Contacted', 'Lost')),
'Won', 'Contacted', 'Lost'
) as highest_stage
from mytable
group by customer
) t
group by highest_stage
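For a quick check of the field()/elt() approach, here is a sketch that loads the question's sample rows into a table. The column name deal_stage follows this answer rather than the question's DealStage header, and the column types and the final ORDER BY are my additions.

```sql
-- Sample rows from the question; column types are assumptions.
CREATE TABLE mytable (customer CHAR(1), deal CHAR(2), deal_stage VARCHAR(10));
INSERT INTO mytable VALUES
  ('A', 'D1', 'Lost'), ('A', 'D2', 'Won'), ('A', 'D3', 'Contacted'),
  ('B', 'D4', 'Contacted'), ('B', 'D5', 'Lost'),
  ('C', 'D6', 'Lost'),
  ('D', 'D7', 'Lost');

-- FIELD() maps each stage to its priority (1 = highest), MIN() picks the best
-- one per customer, and ELT() turns that number back into the stage name.
SELECT highest_stage, COUNT(*) AS customer_count
FROM (
  SELECT customer,
         ELT(MIN(FIELD(deal_stage, 'Won', 'Contacted', 'Lost')),
             'Won', 'Contacted', 'Lost') AS highest_stage
  FROM mytable
  GROUP BY customer
) t
GROUP BY highest_stage
ORDER BY FIELD(highest_stage, 'Won', 'Contacted', 'Lost');
-- Won | 1
-- Contacted | 1
-- Lost | 2
```

One caveat: FIELD() returns 0 for a value that is not in the list (a misspelled stage, for example), and ELT() returns NULL for an index below 1, so such a row would turn its customer's highest_stage into NULL.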
Use a window function as follows:
select * from
(select t.*,
row_number() over (partition by customer
order by case when dealstage = 'Won' then 1
when dealstage = 'Contacted' then 2
when dealstage = 'Lost' then 3
end
) as rn
from your_table t) ranked
where rn = 1;
These are really two different problems. I would, in fact, recommend different approaches to the two. For the first, conditional aggregation:
select customer,
coalesce(max(case when state = 'Won' then state end),
max(case when state = 'Contacted' then state end),
max(case when state = 'Lost' then state end)
) as biggest_state
from t
group by customer;
However, for your final result, I would recommend a correlated subquery:
select t.state, count(*)
from t
where t.state = (select t2.state
from t t2
where t2.customer = t.customer
order by field(t2.state, 'Won', 'Contacted', 'Lost')
limit 1
)
group by t.state;
Note: This assumes that the original data does not have duplicate rows. If it does, then count(distinct) is one adjustment.

Join 2 Select Queries in a Same Table in MySql

I have a table called tbl_events with the following columns:
Id, Event_Name, District, Branch_Type, Points
I need the sum of the Points column where Branch_Type=2, plus the sum of Points divided by 10 where Branch_Type=3. I then have to add those two values, group the result by District, and order it descending. I tried this query, but something seems to be wrong. Can anyone help, please?
Select (t1.B_Points + t2.D_Points) as T_Points,District From
(Select Sum(Points)*.1 as B_Points ,District From tblstudents Where Branch_Type=3 group by District)t1
Left Join(Select Sum(Points) as D_points, District From tblstudents Where Branch_Type=2 group by District)t2 on
(t1.District=t2.District) Order by Desc
You need to add column alias T_Points in order by
Select (t1.B_Points + t2.D_Points) as T_Points,District
From
(
Select Sum(Points)*.1 as B_Points ,District From tblstudents
Where Branch_Type=3 group by District
)t1
Left Join
(
Select Sum(Points) as D_points, District From tblstudents
Where Branch_Type=2 group by District
)t2 on t1.District=t2.District
Order by T_Points Desc
You seem to just want conditional aggregation:
select district,
sum(case when branch_type = 3 then 0.1 * points
when branch_type = 2 then points
else 0
end) as t_points
from tblstudents
group by district;
If you want to order by the points descending, then add:
order by t_points desc
Note: This assumes that you want districts that have neither branch type. If that is not an issue, move the logic to the where clause:
select district,
sum(points * (case when branch_type = 3 then 0.1 else 1.0 end)) as t_points
from tblstudents
where branch_type in (2, 3)
group by district
order by t_points desc;
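The difference between the two variants shows up for districts that have neither branch type. A minimal sketch on hypothetical rows (the district names and point values are invented; the table and column names follow the question's query):

```sql
-- Hypothetical sample data.
CREATE TABLE tblstudents (District VARCHAR(20), Branch_Type INT, Points INT);
INSERT INTO tblstudents VALUES
  ('North', 2, 100),   -- counted in full
  ('North', 3, 50),    -- counted at one tenth
  ('South', 3, 200),
  ('East',  1, 70);    -- neither branch type 2 nor 3

-- Variant 1: every district appears; East contributes 0.
SELECT District,
       SUM(CASE WHEN Branch_Type = 3 THEN 0.1 * Points
                WHEN Branch_Type = 2 THEN Points
                ELSE 0
           END) AS t_points
FROM tblstudents
GROUP BY District
ORDER BY t_points DESC;
-- North 105, South 20, East 0

-- Variant 2: filtering in WHERE drops East entirely.
SELECT District,
       SUM(Points * (CASE WHEN Branch_Type = 3 THEN 0.1 ELSE 1.0 END)) AS t_points
FROM tblstudents
WHERE Branch_Type IN (2, 3)
GROUP BY District
ORDER BY t_points DESC;
-- North 105, South 20
```

Unlike the two-subquery join, neither variant returns NULL for a district that has only one of the two branch types.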

MYSQL GROUP BY multiple derived tables?

I have this query which does some calculations based on some derived tables that are linked with an INNER JOIN.
At the moment I have a WHERE clause which pulls out one id at a time. But how can I make it list all the ids?
I have tried GROUP BY in various places but can't figure it out.
My query so far is as follows:
SELECT
equipment_id,
service_duration,
available_duration,
(available_duration / service_duration)*100 AS availability
FROM (
SELECT
SUM(service_end_time - service_start_time) AS service_duration
FROM(
SELECT equipment_id,
(CASE
END) AS service_start_time,
(CASE
END) AS service_end_time
FROM t1
WHERE equipment_id = 'EX123'
)AS A
) AS B
JOIN(
SELECT equipment_id,
SUM(available_end_time - available_start_time) AS available_duration
FROM (
SELECT equipment_id,
(CASE
END) AS available_start_time,
(CASE
END) AS available_end_time
FROM t2
WHERE equipment_id = 'EX123'
) AS C
) AS D
ON equipment_id=D.equipment_id
What I want to do is replace the WHERE clause with a GROUP BY to list all the ids, or similar, but getting that to work is beyond my skill level... Any help greatly appreciated :)
Try below:
SELECT
B.equipment_id, service_duration, available_duration,
(available_duration / service_duration)*100 AS availability
FROM
(
SELECT equipment_id,
SUM(service_end_time - service_start_time) AS service_duration
FROM
(
SELECT equipment_id,
(CASE ... END) AS service_start_time,
(CASE ... END) AS service_end_time
FROM t1
) AS A
GROUP BY equipment_id
) AS B
JOIN
(
SELECT equipment_id,
SUM(available_end_time - available_start_time) AS available_duration
FROM
(
SELECT equipment_id,
(CASE ... END) AS available_start_time,
(CASE ... END) AS available_end_time
FROM t2
) AS C
GROUP BY equipment_id
) AS D
ON B.equipment_id=D.equipment_id
Try this (replace my field names with your field names):
SELECT
a.emp_id,
service_duration,
available_duration
FROM
(
SELECT
emp_id,
SUM(service_end_time - service_start_time) AS service_duration
FROM
data
GROUP BY
emp_id
) a
JOIN
(
SELECT
emp_id,
SUM(available_end_time - available_start_time) AS available_duration
FROM
data
GROUP BY
emp_id
) b
ON a.emp_id = b.emp_id
GROUP BY
a.emp_id

MySQL Join Two Queries Horizontally

I have a query that works correctly to pull a series of targets and total hours worked for company A. I would like to run the exact same query for company B and join them on a common date, which happens to be grouped by week. My current query:
SELECT * FROM (
SELECT org, date,
( SELECT SUM( target ) FROM target WHERE org = "companyA" ) AS companyA_target,
SUM( hours ) AS companyA_actual
FROM time_management_system
WHERE org = "companyA"
GROUP BY WEEK( date )
ORDER BY DATE
) q1
LEFT JOIN (
SELECT org, date,
( SELECT SUM( target ) FROM target WHERE org = "companyB" ) AS companyB_target,
SUM( hours ) AS companyB_actual
FROM time_management_system
WHERE org = "companyB"
GROUP BY WEEK( date )
ORDER BY DATE
) q2
ON q1.date = q2.date
The results show all of the dates / information of companyA, however companyB only shows sporadic data. Separately, the two queries will show the exact same set of dates, just with different information in the 'target' and 'actual' columns.
companyA 2012-01-28 105.00 39.00 NULL NULL NULL NULL
companyA 2012-02-05 105.00 15.00 NULL NULL NULL NULL
companyA 2012-02-13 105.00 60.50 companyB 2012-02-13 97.50 117.50
Any idea why I'm not getting all the information for companyB?
As a side note, would anybody be able to point in the direction of converting each row's week value into a column? With companyA and companyB as the only two rows?
I appreciate all the help! Thanks.
With no date apparent in the target table, the summation will be constant across all weeks. So I run a pre-query, grouped by org, for only the "org" values of company A and B. This ensures exactly one record per "org", so you don't get a Cartesian result.
Then, I am querying the time_management_system ONCE for BOTH companies. Within the field computations, I am applying an IF() to test the company value and apply when correct. The WEEK activity is the same for both in the final result, so I don't have to do separately and join. This also prevents the need of having the date column appear twice. I also don't need to explicitly add the org column names as the final column names reflect that.
SELECT
WEEK( tms.date ) as GrpWeek,
MAX( IF( tms.org = "companyA", TargetSums.CompTarget, 00000.00 )) as CompanyATarget,
SUM( IF( tms.org = "companyA", tms.hours, 0000.00 )) as CompanyAHours,
MAX( IF( tms.org = "companyB", TargetSums.CompTarget, 00000.00 )) as CompanyBTarget,
SUM( IF( tms.org = "companyB", tms.hours, 000.00 )) as CompanyBHours
from
time_management_system tms
JOIN ( select
t.org,
SUM( t.target ) as CompTarget
from
target t
where
t.org in ( "companyA", "companyB" )
group by
t.org ) as TargetSums
ON tms.org = TargetSums.org
where
tms.org in ( "companyA", "companyB" )
group by
WEEK( tms.date )
order by
WEEK( tms.date )
Both of your subqueries are wrong.
Either you want this:
SELECT
org,
WEEK(date),
( SELECT SUM( target ) FROM target WHERE org = "companyB" ) AS companyB_target,
SUM( hours ) AS companyB_actual
FROM time_management_system
WHERE org = "companyB"
GROUP BY WEEK( date )
Or else you want this:
SELECT
org,
date,
( SELECT SUM( target ) FROM target WHERE org = "companyB" ) AS companyB_target,
SUM( hours ) AS companyB_actual
FROM time_management_system
WHERE org = "companyB"
GROUP BY date
The way you are doing it now is not correctly formed SQL. In pretty much any other database your query would fail immediately with an error. MySQL is more lax and runs the query but gives indeterminate results.
GROUP BY and HAVING with Hidden Columns