I will preface with Table Structures:
revshare r : contains info for a purchase including orderNo, sales, commission, itemid, EventDate
Products p: contains information around a product including a PID (Product ID) and is used to join to the Merchants table to get Merchant information.
Merchants m: contains information about the merchant the product was purchased from, including MerchantName
Question
I am trying to create a MySQL query to pull the top 10 itemids ordered by sum of commission for a given month. The entire data set I would like to get covers 2011-2013, so each year would produce 120 records (10 per month).
I created a query to pull one month's worth of data and planned on using UNION ALL to build a record list with 10 records from each query (each individual query returning one month's top 10 itemids).
Query1
This query accurately returns the top 10 itemids based on the total commission of those items in the given month.
SELECT
m.MerchantName,
Count(r.OrderNo),
sum(r.commission)
FROM revshare r
LEFT JOIN Products p ON r.itemid = p.PID
LEFT JOIN Merchants m ON p.MID = m.MID
WHERE r.EventDate between '2011-01-01' and '2011-01-31'
GROUP by r.itemid
ORDER by 3 DESC LIMIT 10
When I try to UNION this query with another so that I can get records for the next month (between '2011-02-01' and '2011-02-31'), I get the error "ERROR: Incorrect usage of UNION and ORDER BY". I know this is because, apparently, you cannot use ORDER BY on any of the UNION'd queries except the last. I could pull the entire data set and then use Excel or Pentaho BI to show only the top 10, but that is not efficient given the huge data sets in the revshare table.
Below is the query with the UNION ALL that doesn't work. Does anyone have any better method of pulling this data?
Any help is greatly appreciated.
Regards,
-Chris
Query 2 (doesn't work because of the ORDER BY statement)
SELECT
m.MerchantName,
Count(r.OrderNo),
sum(r.commission)
FROM revshare r
LEFT JOIN Products p ON r.itemid = p.PID
LEFT JOIN Merchants m ON p.MID = m.MID
WHERE r.EventDate between '2011-01-01' and '2011-01-31'
GROUP by r.itemid
ORDER by 3 DESC LIMIT 10
UNION ALL
SELECT
m.MerchantName,
Count(r.OrderNo),
sum(r.commission)
FROM revshare r
LEFT JOIN Products p ON r.itemid = p.PID
LEFT JOIN Merchants m ON p.MID = m.MID
WHERE r.EventDate between '2011-02-01' and '2011-02-31'
GROUP by r.itemid
ORDER by 3 DESC LIMIT 10
Ok, try this....
SELECT * FROM (
SELECT
m.MerchantName,
Count(r.OrderNo),
sum(r.commission)
FROM
revshare r
LEFT JOIN Products p ON r.itemid = p.PID
LEFT JOIN Merchants m ON p.MID = m.MID
WHERE
r.EventDate between '2011-01-01' and '2011-01-31'
GROUP by
r.itemid
ORDER by
3 DESC LIMIT 10
) AS RESULT1
UNION ALL
SELECT * FROM (
SELECT
m.MerchantName,
Count(r.OrderNo),
sum(r.commission)
FROM
revshare r
LEFT JOIN Products p ON r.itemid = p.PID
LEFT JOIN Merchants m ON p.MID = m.MID
WHERE
r.EventDate between '2011-02-01' and '2011-02-28'
GROUP by
r.itemid
ORDER by
3 DESC LIMIT 10
) AS RESULT2
Since you already started down the path of unioning queries together, here is the right approach:
select t.*
from ((SELECT '2011-01' as yyyymm, m.MerchantName, Count(r.OrderNo) as cnt, sum(r.commission) as comm
FROM revshare r LEFT JOIN
Products p
ON r.itemid = p.PID LEFT JOIN
Merchants m
ON p.MID = m.MID
WHERE r.EventDate between '2011-01-01' and '2011-01-31'
GROUP by r.itemid
ORDER by comm DESC
LIMIT 10
) union all
(SELECT '2011-02' as yyyymm, m.MerchantName, Count(r.OrderNo) as cnt, sum(r.commission) as comm
FROM revshare r LEFT JOIN
Products p
ON r.itemid = p.PID LEFT JOIN
Merchants m
ON p.MID = m.MID
WHERE r.EventDate between '2011-02-01' and '2011-02-28'
GROUP by r.itemid
ORDER by comm DESC LIMIT 10
) union all
. . .
) t
order by 1, comm desc
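In other words, each SELECT in the union all needs to be wrapped in parentheses so that its ORDER BY and LIMIT apply to that SELECT alone; the whole union is then used as a subquery (t) that can be ordered as a unit. Note that I also added yyyymm to identify the month.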
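If the server supports window functions (MySQL 8.0 or later, which is an assumption since the question does not state a version), the 36 hand-written UNION ALL branches can be avoided by ranking items within each month with ROW_NUMBER(). The sketch below runs that idea against an in-memory SQLite database (3.25 or later) from Python as a stand-in for MySQL. The sample rows and the helper name top_items_per_month are made up for illustration, and strftime('%Y-%m', ...) plays the role that DATE_FORMAT(EventDate, '%Y-%m') would play in MySQL.

```python
import sqlite3


def top_items_per_month(conn, n):
    """Return (yyyymm, itemid, orders, commission) for the top n items of each month."""
    sql = """
        SELECT yyyymm, itemid, cnt, comm
        FROM (
            SELECT agg.*,
                   -- rank items inside each month by total commission
                   ROW_NUMBER() OVER (PARTITION BY yyyymm
                                      ORDER BY comm DESC, itemid) AS rn
            FROM (
                SELECT strftime('%Y-%m', EventDate) AS yyyymm,
                       itemid,
                       COUNT(OrderNo)  AS cnt,
                       SUM(commission) AS comm
                FROM revshare
                GROUP BY yyyymm, itemid
            ) AS agg
        ) AS ranked
        WHERE rn <= ?
        ORDER BY yyyymm, comm DESC, itemid
    """
    return conn.execute(sql, (n,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE revshare (OrderNo INTEGER, itemid INTEGER, commission REAL, EventDate TEXT)"
)
# Hypothetical sample data: three items in January, two in February.
conn.executemany(
    "INSERT INTO revshare VALUES (?, ?, ?, ?)",
    [
        (1, 100, 5.0, "2011-01-03"),
        (2, 100, 7.0, "2011-01-20"),
        (3, 200, 9.0, "2011-01-11"),
        (4, 300, 1.0, "2011-01-15"),
        (5, 200, 4.0, "2011-02-02"),
        (6, 300, 6.0, "2011-02-27"),
    ],
)

result = top_items_per_month(conn, 2)
for row in result:
    print(row)
```

With n set to 10 and the joins to Products and Merchants added back in the innermost query, this should give the same 10 rows per month as the union, without having to spell out each month's date range.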
I am trying to answer the question below, but my query keeps giving me an error message.
The question is:
Provide the name of the sales_rep in each region with the largest amount of total_amt_usd sales.
The error is:
aggregate function calls cannot be nested
Could you please help me with this?
WITH
account_info AS (Select * from accounts),
orders_info AS (select * from orders),
region_info AS (select * from region),
sales_reps_info AS (select * from sales_reps)
SELECT s.name as rep_name, r.name as region_name, MAX (SUM (o.total_amt_usd)) as total
FROM orders_info o
JOIN account_info a
ON o.account_id = a.id
JOIN sales_reps_info s
ON a.sales_rep_id = s.id
JOIN region_info r
ON r.id = s.region_id
GROUP BY TOTAL, REP_NAME, R.NAME
ORDER BY 3 DESC
When you are selecting from the whole table, there is no need for WITH. The nested MAX(SUM(...)) also has to go, since aggregate function calls cannot be nested:
SELECT s.name as rep_name, r.name as region_name, SUM(o.total_amt_usd) as total
FROM orders o
JOIN accounts a
ON o.account_id = a.id
JOIN sales_reps s
ON a.sales_rep_id = s.id
JOIN region r
ON r.id = s.region_id
GROUP BY s.name, r.name
ORDER BY 3 DESC
LIMIT 100;
I'm not sure what you are attempting with WITH, since each of your Common Table Expressions just selects every row of a table and adds nothing.
That aside, your query is invalid: you cannot nest aggregate functions. Ordering by the total and limiting the rows already puts the largest totals first, so I think you just want
SELECT s.name as rep_name, r.name as region_name, SUM (o.total_amt_usd) as Total
FROM orders o
JOIN accounts a ON o.account_id = a.id
JOIN sales_reps s ON a.sales_rep_id = s.id
JOIN region r ON r.id = s.region_id
GROUP BY REP_NAME, R.NAME
ORDER BY Total DESC
LIMIT 100;
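The query above lists every rep with their total, so picking the single top rep per region still has to be done by eye. One possible way to do it in SQL is to aggregate first and then rank the totals within each region. The sketch below is run against an in-memory SQLite database (3.25 or later) from Python. The error message in the question suggests PostgreSQL, where the same RANK() OVER syntax is available, but that is an assumption. The sample rows are made up; the table and column names are taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE region     (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales_reps (id INTEGER PRIMARY KEY, name TEXT, region_id INTEGER);
    CREATE TABLE accounts   (id INTEGER PRIMARY KEY, sales_rep_id INTEGER);
    CREATE TABLE orders     (id INTEGER PRIMARY KEY, account_id INTEGER, total_amt_usd REAL);

    -- Hypothetical sample data: two reps in West, one in East.
    INSERT INTO region     VALUES (1, 'West'), (2, 'East');
    INSERT INTO sales_reps VALUES (1, 'Ann', 1), (2, 'Bob', 1), (3, 'Cy', 2);
    INSERT INTO accounts   VALUES (10, 1), (11, 2), (12, 3);
    INSERT INTO orders     VALUES (100, 10, 50.0), (101, 10, 25.0),
                                  (102, 11, 60.0), (103, 12, 10.0);
""")

top_reps = conn.execute("""
    WITH rep_totals AS (
        -- step 1: one row per rep with the summed sales
        SELECT r.name AS region_name,
               s.name AS rep_name,
               SUM(o.total_amt_usd) AS total
        FROM orders o
        JOIN accounts a   ON o.account_id = a.id
        JOIN sales_reps s ON a.sales_rep_id = s.id
        JOIN region r     ON r.id = s.region_id
        GROUP BY r.name, s.id, s.name
    )
    -- step 2: rank the totals inside each region and keep rank 1
    SELECT region_name, rep_name, total
    FROM (
        SELECT rep_totals.*,
               RANK() OVER (PARTITION BY region_name ORDER BY total DESC) AS rnk
        FROM rep_totals
    ) AS ranked
    WHERE rnk = 1
    ORDER BY region_name
""").fetchall()

print(top_reps)
```

Here the WITH clause does real work: the SUM happens in the CTE and the ranking happens on top of it, which is how the "max of a sum" can be expressed without nesting aggregates. RANK() keeps ties, so a region with two equally good reps would return both.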
Given the above dataset, I need to report, for each year, the percentage of movies in that year with only female actors and the total number of movies made that year. For example, one answer will be: 1990 31.81 13522, meaning that in 1990 there were 13,522 movies, and 31.81% of them had only female actors.
In order to get the movies with only female actors, I wrote the following code:
SELECT a.year as Year, COUNT(a.title) AS Female_Movies, a.title
FROM Movie a
WHERE a.title NOT IN (
SELECT b.title from Movie b
Inner Join M_cast c
on TRIM(c.MID) = b.MID
Inner Join Person d
on TRIM(c.PID) = d.PID
WHERE d.Gender='Male'
GROUP BY b.title
)
GROUP BY a.year,a.title
Order By a.year asc
The total number of movies in each year can be found using the following:
SELECT a.year, count(a.title) AS Total_Movies
FROM Movie a
GROUP BY a.year
ORDER BY COUNT(a.title) DESC
Combining both, I wrote the following code:
SELECT z.year as Year, count(z.title) AS Total_Movies, count(x.title) as Female_movies, count(z.title)/ count(x.title) As percentage
FROM Movie z
Inner Join (
SELECT a.year as Year, COUNT(a.title) AS Female_Movies, a.title
FROM Movie a
WHERE a.title NOT IN (
SELECT b.title from Movie b
Inner Join M_cast c
on TRIM(c.MID) = b.MID
Inner Join Person d
on TRIM(c.PID) = d.PID
WHERE d.Gender='Male'
GROUP BY b.title
)
GROUP BY a.year,a.title
Order By a.year asc
)x
on x.year = z.year
GROUP BY z.year
ORDER BY COUNT(z.title) DESC
However, in the output I'm seeing the years with only-female movies correctly, but the count of total movies is equal to female_movies, so I'm getting 1%. I tried debugging the code, but I'm not sure where this is going wrong. Any insights would be appreciated.
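You assume that your 'z' contains all movies, but after the inner join every row carries both a z.title and an x.title, so count(z.title) and count(x.title) count the same joined rows and are always equal. Years without any female-only movie also drop out entirely. A 'left join' keeps those years, but the counts only come out right if each side is aggregated per year before joining.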
Assuming your two queries are correct, you can join on them with a 'WITH' like this:
WITH allmovies (year, cnt) as
(SELECT a.year, count(a.title) AS Total_Movies
FROM Movie a
GROUP BY a.year
ORDER BY COUNT(a.title) DESC)
,
femalemovies (year, cnt, title) as
(SELECT a.year as Year, COUNT(a.title) AS Female_Movies, a.title
FROM Movie a
WHERE a.title NOT IN (
SELECT b.title from Movie b
Inner Join M_cast c
on TRIM(c.MID) = b.MID
Inner Join Person d
on TRIM(c.PID) = d.PID
WHERE d.Gender='Male'
GROUP BY b.title
)
GROUP BY a.year,a.title
Order By a.year asc)
select * from allmovies left join femalemovies on allmovies.year = femalemovies.year
You can use conditional aggregation. In a CASE expression check if no cast member that isn't female exists with a correlated subquery. If the check is successful, return something not NULL and count() that to get the number of movies with only female cast members (or none at all).
SELECT m.year,
count(*) count_all,
count(CASE
WHEN NOT EXISTS (SELECT *
FROM m_cast c
INNER JOIN person p
ON p.pid = c.pid
WHERE c.mid = m.mid
AND p.gender <> 'Female') THEN
1
END)
/
count(*)
*
100 percentage_only_female
FROM movie m
GROUP BY m.year;
Since in MySQL Boolean expressions in numerical context evaluate to 1 if true and to 0 otherwise, you could also use a sum() over the NOT EXISTS.
SELECT m.year,
count(*) count_all,
sum(NOT EXISTS (SELECT *
FROM m_cast c
INNER JOIN person p
ON p.pid = c.pid
WHERE c.mid = m.mid
AND p.gender <> 'Female'))
/
count(*)
*
100 percentage_only_female
FROM movie m
GROUP BY m.year;
That however isn't compatible with most other DBMS in contrast to the first one.
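To make the "(or none at all)" caveat concrete, here is a small run of the CASE version on made-up data, using an in-memory SQLite database from Python rather than MySQL. The sample rows and the column aliases female_or_no_cast and female_only are my own. The second counter adds an EXISTS check, as one possible way to exclude movies that have no cast rows, if that is what is wanted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie  (mid TEXT PRIMARY KEY, title TEXT, year INTEGER);
    CREATE TABLE person (pid TEXT PRIMARY KEY, gender TEXT);
    CREATE TABLE m_cast (mid TEXT, pid TEXT);

    INSERT INTO movie  VALUES ('m1', 'A', 1990), ('m2', 'B', 1990),
                              ('m3', 'C', 1990), ('m4', 'D', 1991);
    INSERT INTO person VALUES ('p1', 'Female'), ('p2', 'Male'), ('p3', 'Female');
    -- m1: all-female cast, m2: mixed cast, m3: no cast rows, m4: male cast
    INSERT INTO m_cast VALUES ('m1', 'p1'), ('m1', 'p3'),
                              ('m2', 'p1'), ('m2', 'p2'),
                              ('m4', 'p2');
""")

rows = conn.execute("""
    SELECT m.year,
           COUNT(*) AS count_all,
           -- NOT EXISTS alone is also true for a movie without any cast rows
           COUNT(CASE WHEN NOT EXISTS (SELECT 1
                                       FROM m_cast c
                                       JOIN person p ON p.pid = c.pid
                                       WHERE c.mid = m.mid
                                         AND p.gender <> 'Female')
                      THEN 1 END) AS female_or_no_cast,
           -- the extra EXISTS requires at least one cast row
           COUNT(CASE WHEN EXISTS (SELECT 1 FROM m_cast c WHERE c.mid = m.mid)
                       AND NOT EXISTS (SELECT 1
                                       FROM m_cast c
                                       JOIN person p ON p.pid = c.pid
                                       WHERE c.mid = m.mid
                                         AND p.gender <> 'Female')
                      THEN 1 END) AS female_only
    FROM movie m
    GROUP BY m.year
    ORDER BY m.year
""").fetchall()

for row in rows:
    print(row)
```

For 1990 the two counters differ by one because of movie m3, which has no cast at all. When turning either counter into a percentage, note that SQLite and several other DBMS truncate integer division, so multiplying by 100.0 before dividing is the safer order there.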
I would use two levels of aggregation:
SELECT m.MID, m.title, m.year,
COUNT(*) as num_actors,
SUM(gender = 'Female') as num_female_actors
FROM Movie m JOIN
M_cast c
ON c.MID = m.MID JOIN
Person p
ON p.PID = c.PID
GROUP BY m.MID, m.title, m.year;
Then a simple outer aggregation:
SELECT year,
COUNT(*) as num_movies,
SUM( num_actors = num_female_actors ) as num_female_only,
AVG( num_actors = num_female_actors ) as female_only_ratio
FROM (SELECT m.MID, m.title, m.year,
COUNT(*) as num_actors,
SUM(gender = 'Female') as num_female_actors
FROM Movie m JOIN
M_cast c
ON c.MID = m.MID JOIN
Person p
ON p.PID = c.PID
GROUP BY m.MID, m.title, m.year
) m
GROUP BY year;
Notes:
Use meaningful table aliases, rather than arbitrary letters. You'll note that the table aliases are abbreviations for the table names.
Do not use functions when filtering or JOINing unless necessary. I removed the TRIM(). If you need it use it. Or better yet, fix the data.
SELECT m.Year,COUNT(m.Year),x.t,
(COUNT(m.Year)*1.0/x.t*1.0)*100
FROM Movie m LEFT JOIN
(SELECT Year,COUNT(Year) AS t FROM Movie GROUP BY year) AS x
ON m.Year=x.Year
WHERE m.MID IN
(SELECT MID FROM M_Cast WHERE PID in
(SELECT PID FROM Person WHERE Gender='Female')
AND m.MID NOT IN
(SELECT MID FROM M_Cast WHERE PID in
(SELECT PID FROM Person WHERE Gender='Male'))) GROUP BY m.year
Check if this is what you're looking for.
select movie.year, count(movie.mid) as Year_Wise_Movie_Count,cast(x.Female_Cast_Only as real) / count(movie.mid) As Percentage_of_Female_Cast from movie
inner join
(
SELECT Movie.year as Year, COUNT(Movie.mid) AS Female_Cast_Only
FROM Movie
WHERE Movie.MID NOT IN (
SELECT Movie.MID from Movie
Inner Join M_cast
on TRIM(M_cast.MID) = Movie.MID
Inner Join Person
on TRIM(M_cast.PID) = Person.PID
WHERE Person.Gender!='Female'
GROUP BY Movie.MID
)
GROUP BY Movie.year
Order By Movie.year asc
) x
on x.year = movie.year
GROUP BY movie.year
ORDER BY movie.year
Output:
year Year_Wise_Movie_Count Percentage_of_Female_Cast
---- --------------------- -------------------------
1939 2 0.5
1999 66 0.0151515151515152
2000 64 0.015625
2018 104 0.00961538461538462
Note:
This was executed in SQLite3.
I've got a database set up as follows:
http://sqlfiddle.com/#!9/2c5fc
What I'm looking to do is get a list of each store, and their last budget amount (Each schedule's total in the budget * their apportionment for that schedule / 100).
Some stores might not have apportionments set for the last budget, so I need the last budget where they have an apportionment set, or NULL if no apportionment has been set or no budget exists.
I've got the following SQL query:
SELECT s.StoreID,s.CentreID, budgetcalc.amount, budgetcalc.BudgetDate FROM store as s
LEFT JOIN centre on s.CentreID = centre.CentreID
LEFT JOIN (SELECT BudgetDate, SUM(sch.Amount*appt.Percentage/100) as amount, appt.StoreID from budget b
INNER JOIN schedule as sch on b.BudgetID = sch.BudgetID
INNER JOIN apportionment as appt on sch.ScheduleID = appt.ScheduleID
GROUP BY appt.StoreID, b.BudgetID
ORDER BY STR_TO_DATE(b.BudgetDate,'%d-%m-%y') DESC
) as budgetcalc on s.StoreID = budgetcalc.StoreID
GROUP BY s.StoreID
ORDER BY s.StoreID, STR_TO_DATE(budgetcalc.BudgetDate,'%d-%m-%y') DESC;
However, this does not return the latest year: it returns a previous year, seemingly at random, regardless of how I order the rows in the subquery.
It is difficult to work out from just the table declarations, but you need to get the latest budget date for each store, then join that against your sub query that gets the budget details for each store / budget.
Untested but something like this:-
SELECT s.StoreID,
s.CentreID,
budgetcalc.amount,
budgetcalc.BudgetDate
FROM store as s
LEFT OUTER JOIN centre ON s.CentreID = centre.CentreID
LEFT OUTER JOIN
(
SELECT appt.StoreID,
MAX(BudgetDate) AS latestBudgetDate
FROM budget b
INNER JOIN schedule as sch ON b.BudgetID = sch.BudgetID
INNER JOIN apportionment as appt ON sch.ScheduleID = appt.ScheduleID
GROUP BY appt.StoreID
) latest_budget
ON s.StoreID = latest_budget.StoreID
LEFT OUTER JOIN
(
SELECT BudgetDate,
appt.StoreID,
SUM(sch.Amount*appt.Percentage/100) as amount
FROM budget b
INNER JOIN schedule as sch ON b.BudgetID = sch.BudgetID
INNER JOIN apportionment as appt ON sch.ScheduleID = appt.ScheduleID
GROUP BY appt.StoreID, b.BudgetID
) as budgetcalc
ON latest_budget.StoreID = budgetcalc.StoreID
AND latest_budget.latestBudgetDate = budgetcalc.BudgetDate
ORDER BY s.StoreID,
budgetcalc.BudgetDate DESC;
With help from Kickstart, this is the answer that works:
SELECT s.StoreID,s.CentreID, SUM(sch.Amount*appt.Percentage/100) as amount, b.BudgetDate FROM store as s
INNER JOIN centre on s.CentreID = centre.CentreID
LEFT JOIN budget as b on b.BudgetID =
(
SELECT budget.BudgetID from budget
INNER JOIN schedule on budget.BudgetID = schedule.BudgetID
INNER JOIN apportionment on schedule.ScheduleID = apportionment.ScheduleID
where CentreID = s.CentreID AND apportionment.StoreID = s.StoreID
ORDER BY BudgetDate DESC limit 1
)
LEFT JOIN schedule as sch on b.BudgetID = sch.BudgetID
LEFT JOIN apportionment as appt on sch.ScheduleID = appt.ScheduleID AND appt.StoreID = s.StoreID
GROUP BY s.StoreID
ORDER BY s.StoreID ASC;
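For anyone who wants to try the pattern without the fiddle, here is a cut-down sketch of the same "latest budget that has an apportionment for this store" lookup, run against an in-memory SQLite database from Python. The centre table is left out, the sample rows are made up, and BudgetDate is stored as an ISO 'YYYY-MM-DD' string so that plain ORDER BY sorts it chronologically. That last point is an assumption: with dates stored as 'dd-mm-yy' text, as the STR_TO_DATE calls in the question suggest, the subquery would need to order by the converted date instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE store         (StoreID INTEGER PRIMARY KEY);
    CREATE TABLE budget        (BudgetID INTEGER PRIMARY KEY, BudgetDate TEXT);
    CREATE TABLE schedule      (ScheduleID INTEGER PRIMARY KEY, BudgetID INTEGER, Amount REAL);
    CREATE TABLE apportionment (ScheduleID INTEGER, StoreID INTEGER, Percentage REAL);

    INSERT INTO store    VALUES (1), (2), (3);
    INSERT INTO budget   VALUES (1, '2013-06-30'), (2, '2014-06-30');
    INSERT INTO schedule VALUES (1, 1, 1000), (2, 1, 500), (3, 2, 2000);
    -- store 1 appears in both budgets, store 2 only in the older one, store 3 in none
    INSERT INTO apportionment VALUES (1, 1, 10), (2, 1, 20), (3, 1, 50), (1, 2, 30);
""")

last_budgets = conn.execute("""
    SELECT s.StoreID,
           b.BudgetDate,
           SUM(sch.Amount * appt.Percentage / 100.0) AS amount
    FROM store s
    -- correlated subquery: newest budget in which this store has an apportionment
    LEFT JOIN budget b ON b.BudgetID = (
        SELECT b2.BudgetID
        FROM budget b2
        JOIN schedule sch2    ON sch2.BudgetID = b2.BudgetID
        JOIN apportionment a2 ON a2.ScheduleID = sch2.ScheduleID
        WHERE a2.StoreID = s.StoreID
        ORDER BY b2.BudgetDate DESC
        LIMIT 1
    )
    LEFT JOIN schedule sch       ON sch.BudgetID = b.BudgetID
    LEFT JOIN apportionment appt ON appt.ScheduleID = sch.ScheduleID
                                AND appt.StoreID = s.StoreID
    GROUP BY s.StoreID, b.BudgetDate
    ORDER BY s.StoreID
""").fetchall()

for row in last_budgets:
    print(row)
```

Store 2 falls back to the 2013 budget because it has no apportionment in the 2014 one, and store 3 comes back with NULL for both date and amount, which matches the behaviour asked for in the question.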
I spent so much time googling today, but I don't even know which keywords to use. So …
The project is an evaluation of a betting game (Football). I have 2 SQL Queries:
SELECT players.username, players.userid, matchdays.userid, matchdays.points, SUM(points) AS gesamt
FROM players INNER JOIN matchdays ON players.userid = matchdays.userid AND matchdays.season_id=5
GROUP BY players.username
ORDER BY gesamt DESC
And my second query:
SELECT max(matchday) as lastmd, points, players.username from players INNER JOIN matchdays ON players.userid = matchdays.userid WHERE matchdays.season_id=5 AND matchday=
(select max(matchday) from matchdays)group by players.username ORDER BY points DESC
The first one adds up the points of every matchday and shows the sum.
The second shows the points of the last gameday.
My Goal is to merge those 2 queries/tables so that the output is a table like
Rank | Username | Points last gameday | Overall points |
I don't even know where to start or what to look for. Any help would be appreciated ;)
Use both queries as derived tables and join them. Use an inner join if every userid also has a row in the second query. Also add userid to the second query's select list so that it can be used for the join.
SET @rank = 0;
SELECT @rank := @rank + 1,
t1.username,
t2.points,
t1.gesamt
FROM (
SELECT players.username, players.userid puserid, matchdays.userid muserid, matchdays.points, SUM(points) AS gesamt
FROM players INNER JOIN matchdays ON players.userid = matchdays.userid AND matchdays.season_id=5
GROUP BY players.username
)t1
INNER JOIN
(
SELECT players.userid, max(matchday) as lastmd, points, players.username
from players INNER JOIN matchdays ON players.userid = matchdays.userid
WHERE matchdays.season_id=5 AND matchday=
(select max(matchday) from matchdays)group by players.username
)t2
ON t1.puserid = t2.userid
ORDER BY t1.gesamt DESC
You can use conditional aggregation, i.e. sum the points only when the day is the last day:
SELECT
p.username,
SUM(case when m.matchday = (select max(matchday) from matchdays) then m.points end)
AS last_day_points,
SUM(m.points) AS total_points
FROM players p
INNER JOIN matchdays m ON p.userid = m.userid AND m.season_id = 5
GROUP BY p.userid
ORDER BY total_points DESC;
Or with a join instead of a non-correlated subquery (MySQL should come to the same execution plan):
SELECT
p.username,
SUM(case when m.matchday = last_day.matchday then m.points end) AS last_day_points,
SUM(m.points) AS total_points
FROM players p
INNER JOIN matchdays m ON p.userid = m.userid AND m.season_id = 5
CROSS JOIN
(
select max(matchday) as matchday
from matchdays
) last_day
GROUP BY p.userid
ORDER BY total_points DESC;
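The Rank column from the question is still missing from both queries. A possible way to add it, assuming window functions are available (MySQL 8.0 or later, SQLite 3.25 or later), is to wrap the conditional aggregation in a derived table and apply RANK() on top. The sketch below uses an in-memory SQLite database from Python with made-up players and points. One deliberate change from the queries above, which is my own adjustment: the MAX(matchday) subquery is restricted to season 5, because otherwise a later matchday in another season would be picked as the "last day".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE players   (userid INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE matchdays (userid INTEGER, season_id INTEGER, matchday INTEGER, points INTEGER);

    INSERT INTO players VALUES (1, 'anna'), (2, 'ben'), (3, 'cleo');
    INSERT INTO matchdays VALUES
        (1, 5, 1, 3), (1, 5, 2, 5),   -- anna: total 8, last day 5
        (2, 5, 1, 6), (2, 5, 2, 4),   -- ben:  total 10, last day 4
        (3, 5, 1, 2),                 -- cleo: skipped the last matchday
        (1, 4, 3, 9);                 -- older season, must be ignored
""")

table = conn.execute("""
    SELECT RANK() OVER (ORDER BY total_points DESC) AS rnk,
           username,
           last_day_points,
           total_points
    FROM (
        SELECT p.username,
               SUM(CASE WHEN m.matchday = (SELECT MAX(matchday)
                                           FROM matchdays
                                           WHERE season_id = 5)
                        THEN m.points END) AS last_day_points,
               SUM(m.points) AS total_points
        FROM players p
        JOIN matchdays m ON p.userid = m.userid AND m.season_id = 5
        GROUP BY p.userid, p.username
    ) AS totals
    ORDER BY rnk, username
""").fetchall()

for row in table:
    print(row)
```

A player without points on the last matchday gets NULL in that column; wrapping the SUM in COALESCE(..., 0) would turn it into 0 if that reads better in the final table.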
This query groups tickets by who they are assigned to and works out the average rounded number of days the ticket has taken to be closed.
SELECT a.id as theuser, round(avg(DATEDIFF( t.dateClosed, t.dateAded ) * 1.0), 2) as avg
FROM tickets t join
mdl_user a
on find_in_set(a.id, t.assignedto) > 0
GROUP BY a.id ORDER BY avg ASC
I would now like to JOIN the ticketanswer table to find out the average time for first response.
The ticket could have multiple answers, so I just want to get the first one.
Therefore I have tried to change the query to include this, to no avail. Could anyone shed light on what I'm doing wrong?
SELECT a.id as theuser, round(avg(DATEDIFF( ta.dateAded , t.dateAded ) * 1.0), 2) as avg
FROM tickets t join
mdl_user a
on find_in_set(a.id, t.assignedto) > 0
INNER JOIN (SELECT MIN(ta.dateAded) as started FROM ticketanswer GROUP BY ta.ticketId) ta ON t.id = ta.ticketId
GROUP BY a.id ORDER BY avg ASC
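Made some slight modifications to your query: the subquery now also selects ticketId so that it can be joined on, its columns are no longer prefixed with the outer alias ta, and the outer query reads the aliased column ta.started.
SELECT a.id as theuser, round(avg(DATEDIFF( ta.started , t.dateAded ) * 1.0), 2) as avg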
FROM tickets t join
mdl_user a
on find_in_set(a.id, t.assignedto) > 0
INNER JOIN (SELECT ticketid, MIN(dateAded) as started FROM ticketanswer GROUP BY ticketId) ta ON t.id = ta.ticketId
GROUP BY a.id ORDER BY avg ASC
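As a quick check of the "first answer per ticket" subquery, here is a simplified version run against an in-memory SQLite database from Python. Several things differ from the real setup and are assumptions for the sake of a runnable sketch: assignedto holds a single user id (so no find_in_set and no mdl_user join), the dates are ISO strings, and the difference of two julianday() values stands in for MySQL's DATEDIFF. The sample rows are made up; the misspelled column name dateAded is kept as in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tickets      (id INTEGER PRIMARY KEY, assignedto INTEGER, dateAded TEXT);
    CREATE TABLE ticketanswer (ticketId INTEGER, dateAded TEXT);

    INSERT INTO tickets VALUES (1, 7, '2020-01-01'),
                               (2, 7, '2020-01-10'),
                               (3, 8, '2020-02-01');
    -- ticket 1 has two answers; only the earlier one should count
    INSERT INTO ticketanswer VALUES (1, '2020-01-03'), (1, '2020-01-09'),
                                    (2, '2020-01-14'), (3, '2020-02-02');
""")

avg_first_response = conn.execute("""
    SELECT t.assignedto AS theuser,
           ROUND(AVG(julianday(ta.started) - julianday(t.dateAded)), 2) AS avg_days
    FROM tickets t
    JOIN (
        -- one row per ticket with the date of its first answer
        SELECT ticketId, MIN(dateAded) AS started
        FROM ticketanswer
        GROUP BY ticketId
    ) ta ON t.id = ta.ticketId
    GROUP BY t.assignedto
    ORDER BY avg_days ASC
""").fetchall()

print(avg_first_response)
```

User 7 waited 2 and 4 days for a first answer on their two tickets, so their average is 3 days; the second answer on ticket 1 does not affect it. Because of the inner join, tickets without any answer are left out of the average.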