Minimizing redundancy of MySQL query - mysql

I'm having a bit of trouble trying to reduce the redundancy of a query in MySQL. I currently have it working, but it feels like I have too much overhead because it repeats the same subquery twice. What I am trying to do is use a DVD rental database to find which store location rented out the most DVDs in each month of 2005.
Here is the working query:
SELECT b.month, c.store_id, b.maxRentals
FROM
(SELECT a.month, MAX(a.rentalCount) as maxRentals
FROM
(SELECT MONTH(rental.rental_date) as month, inventory.store_id, count(1) as rentalCount
FROM rental
INNER JOIN inventory
ON rental.inventory_id = inventory.inventory_id
WHERE YEAR(rental.rental_date) = 2005
GROUP BY MONTH(rental.rental_date), inventory.store_id
) a
GROUP BY a.month
) b
INNER JOIN
(SELECT MONTH(rental.rental_date) as month, inventory.store_id, count(1) as rentalCount
FROM rental
INNER JOIN inventory
ON rental.inventory_id = inventory.inventory_id
WHERE YEAR(rental.rental_date) = 2005
GROUP BY MONTH(rental.rental_date), inventory.store_id
) c
ON b.maxRentals = c.rentalCount
GROUP BY b.month;
Notice how the subquery with the alias of "c" is the exact same subquery of alias "a". I'm not sure if there's a way to get rid of this, as I can't inner join on an alias. Am I just stuck with a giant query, or is there something else I can do?

I am 90% certain this query will achieve your intentions:
SELECT MONTH(r.rental_date), i.store_id, COUNT(*)
FROM rental r
LEFT JOIN inventory i ON r.inventory_id = i.inventory_id
WHERE YEAR(r.rental_date) = 2005
GROUP BY MONTH(r.rental_date), i.store_id
Let me know how it goes!
Edit: to answer the question of which store location rented out the most DVDs in each month of 2005:
SELECT x.rental_month, x.store_id, MAX(x.rental_count) FROM (
SELECT MONTH(r.rental_date) AS rental_month, i.store_id AS store_id, COUNT(*) AS rental_count
FROM rental r LEFT JOIN inventory i ON r.inventory_id = i.inventory_id
WHERE YEAR(r.rental_date) = 2005
GROUP BY MONTH(r.rental_date), i.store_id) x
GROUP BY x.rental_month, x.store_id
I was explicit by using aliases everywhere, you could probably omit some. Hopefully this helps...
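Edit: Dirty hack (it relies on MySQL picking the first row of each group for the non-aggregated store_id column, which is not guaranteed and is rejected when ONLY_FULL_GROUP_BY is enabled):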
SELECT x.rental_month, x.store_id, MAX(x.rental_count) FROM (
SELECT MONTH(r.rental_date) AS rental_month, i.store_id AS store_id, COUNT(*) AS rental_count
FROM rental r LEFT JOIN inventory i ON r.inventory_id = i.inventory_id
WHERE YEAR(r.rental_date) = 2005
GROUP BY MONTH(r.rental_date), i.store_id
ORDER BY MONTH(r.rental_date) ASC, COUNT(*) DESC) x
GROUP BY x.rental_month
Ref:
http://kristiannielsen.livejournal.com/6745.html
But then does this satisfy you, seeing as you do already have a working query...
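If you are on MySQL 8.0 or later, a common table expression lets you name the per-month counts once and reference them twice, which removes the duplicated subquery. Below is a minimal sketch of that idea, not a tested MySQL query: it runs against an in-memory SQLite database from Python so it is self-contained, the sample rows are made up, and strftime stands in for MySQL's MONTH() and YEAR().

```python
import sqlite3

# Made-up sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventory (inventory_id INTEGER PRIMARY KEY, store_id INTEGER);
CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, rental_date TEXT, inventory_id INTEGER);
INSERT INTO inventory VALUES (1, 1), (2, 2);
INSERT INTO rental (rental_date, inventory_id) VALUES
  ('2005-05-01', 1), ('2005-05-02', 1), ('2005-05-03', 2),
  ('2005-06-01', 2), ('2005-06-02', 2), ('2005-06-03', 1),
  ('2006-05-01', 2);
""")

query = """
WITH monthly AS (                       -- counted once, referenced twice
    SELECT CAST(strftime('%m', r.rental_date) AS INTEGER) AS rental_month,
           i.store_id,
           COUNT(*) AS rental_count
    FROM rental r
    JOIN inventory i ON r.inventory_id = i.inventory_id
    WHERE strftime('%Y', r.rental_date) = '2005'
    GROUP BY rental_month, i.store_id
)
SELECT m.rental_month, m.store_id, m.rental_count
FROM monthly m
JOIN (SELECT rental_month, MAX(rental_count) AS max_rentals
      FROM monthly
      GROUP BY rental_month) best
  ON best.rental_month = m.rental_month      -- match on month as well as count
 AND best.max_rentals = m.rental_count
ORDER BY m.rental_month
"""
top_stores = conn.execute(query).fetchall()
print(top_stores)
```

Joining on both the month and the count means a store is only matched against the maximum for its own month. If two stores tie in a month, both rows come back.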

Related

How to fix this code? I tried a WITH statement but it gave me an error

I tried to answer the question below with the following code, but it keeps giving me an error message.
The question is:
Provide the name of the sales_rep in each region with the largest amount of total_amt_usd sales.
The query gave me this error:
aggregate function calls cannot be nested
Could you please help me with this?
WITH
account_info AS (Select * from accounts),
orders_info AS (select * from orders),
region_info AS (select * from region),
sales_reps_info AS (select * from sales_reps)
SELECT s.name as rep_name, r.name as region_name, MAX (SUM (o.total_amt_usd)) as total
FROM orders_info o
JOIN account_info a
ON o.account_id = a.id
JOIN sales_reps_info s
ON a.sales_rep_id = s.id
JOIN region_info r
ON r.id = s.region_id
GROUP BY TOTAL, REP_NAME, R.NAME
ORDER BY 3 DESC
When you are using the whole table, there is no need for WITH; you can reference the tables directly (the nested aggregate still has to be fixed separately):
SELECT s.name as rep_name, r.name as region_name, MAX (SUM (o.total_amt_usd)) as total
FROM orders o
JOIN accounts a
ON o.account_id = a.id
JOIN sales_reps s
ON a.sales_rep_id = s.id
JOIN region r
ON r.id = s.region_id
GROUP BY TOTAL, REP_NAME, R.NAME
ORDER BY 3 DESC
LIMIT 100;
I'm not sure what you are attempting with WITH, since your common table expressions do nothing but select every row from each table. You can query the tables directly.
That aside, your query is invalid: you cannot nest aggregate functions, and you are already getting the 100 largest totals by ordering and limiting rows, so I think you just want
SELECT s.name as rep_name, r.name as region_name, SUM(o.total_amt_usd) as total
FROM orders o
JOIN accounts a ON o.account_id = a.id
JOIN sales_reps s ON a.sales_rep_id = s.id
JOIN region r ON r.id = s.region_id
GROUP BY s.name, r.name
ORDER BY total DESC
LIMIT 100;
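If what you are after is literally the single top rep in each region, one approach is to compute each rep's total in a CTE and then keep only the rows that equal their region's maximum. This is a rough sketch under assumptions: the table layout is guessed from the joins in the question, the sample rows are invented, and it runs on an in-memory SQLite database from Python rather than on your actual database.

```python
import sqlite3

# Invented sample data; schema inferred from the joins in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE region (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sales_reps (id INTEGER PRIMARY KEY, name TEXT, region_id INTEGER);
CREATE TABLE accounts (id INTEGER PRIMARY KEY, sales_rep_id INTEGER);
CREATE TABLE orders (id INTEGER PRIMARY KEY, account_id INTEGER, total_amt_usd REAL);
INSERT INTO region VALUES (1, 'East'), (2, 'West');
INSERT INTO sales_reps VALUES (1, 'Ann', 1), (2, 'Bob', 1), (3, 'Cy', 2);
INSERT INTO accounts VALUES (10, 1), (20, 2), (30, 3);
INSERT INTO orders (account_id, total_amt_usd) VALUES
  (10, 100.0), (10, 50.0), (20, 120.0), (30, 70.0);
""")

query = """
WITH rep_totals AS (                    -- step 1: one SUM per rep
    SELECT r.name AS region_name, s.name AS rep_name,
           SUM(o.total_amt_usd) AS total
    FROM orders o
    JOIN accounts a ON o.account_id = a.id
    JOIN sales_reps s ON a.sales_rep_id = s.id
    JOIN region r ON r.id = s.region_id
    GROUP BY r.name, s.name
)
SELECT t.rep_name, t.region_name, t.total
FROM rep_totals t
JOIN (SELECT region_name, MAX(total) AS max_total   -- step 2: MAX per region
      FROM rep_totals
      GROUP BY region_name) m
  ON m.region_name = t.region_name AND m.max_total = t.total
ORDER BY t.total DESC
"""
top_reps = conn.execute(query).fetchall()
print(top_reps)
```

The SUM and the MAX live in separate query levels, which is what avoids the "aggregate function calls cannot be nested" error.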

Subquery left join refer to parent ID

I am trying to make a query to fetch the newest car for each user:
select * from users
left join
(select cars.* from cars
where cars.userid=users.userid
order by cars.year desc limit 1) as cars
on cars.userid=users.userid
MySQL rejects it with: Unknown column "users.userid" in where clause
I tried to remove cars.userid=users.userid part, but then it only fetches 1 newest car, and sticks it on to each user.
Is there any way to accomplish what I'm after? thanks!!
For this purpose, I usually use row_number() (available in MySQL 8.0 and later):
select *
from users u left join
(select c.* , row_number() over (partition by c.userid order by c.year desc) as seqnum
from cars c
) c
on c.userid = u.userid and c.seqnum = 1;
One option is to filter the left join with a subquery:
select * -- better enumerate the columns here
from users u
left join cars c
on c.userid = u.userid
and c.year = (select max(c1.year) from cars c1 where c1.userid = c.userid)
For performance, consider an index on cars(userid, year).
Note that this might return multiple cars per user if you have duplicate (userid, year) in cars. It would be better to have a real date rather than just the year.
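To make the duplicate caveat concrete, here is a small sketch of the correlated-subquery join run on an in-memory SQLite database from Python. The model column and all the rows are made up for illustration; only userid and year come from the question.

```python
import sqlite3

# Made-up rows: user 1 owns two cars from 2020, user 3 owns none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (userid INTEGER PRIMARY KEY);
CREATE TABLE cars (carid INTEGER PRIMARY KEY, userid INTEGER, year INTEGER, model TEXT);
INSERT INTO users VALUES (1), (2), (3);
INSERT INTO cars (userid, year, model) VALUES
  (1, 2019, 'A'), (1, 2020, 'B'), (1, 2020, 'C'), (2, 2018, 'D');
""")

query = """
SELECT u.userid, c.model
FROM users u
LEFT JOIN cars c
  ON c.userid = u.userid
 AND c.year = (SELECT MAX(c1.year) FROM cars c1 WHERE c1.userid = c.userid)
ORDER BY u.userid, c.model
"""
newest = conn.execute(query).fetchall()
print(newest)
```

User 1 comes back twice because both 2020 cars tie for newest, and user 3 still appears (with NULL) thanks to the LEFT JOIN.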
Maybe there are better and more efficient ways to query this. Here is my solution:
select users.userid, cars.*
from users
left join (SELECT userid, MAX(year) AS maxYear
FROM cars
GROUP BY userid) as sub on sub.userid = users.userid
left join cars on cars.userid = sub.userid and cars.year = sub.maxYear;

MySQL gives syntax error on subquery with valid syntax

I'm trying to find the film that has been rented the most without using limit. I'm trying to use the following query:
SELECT f.title, f.film_id
FROM film f
JOIN inventory i ON f.film_id = i.film_id
JOIN rental r ON r.inventory_id = i.inventory_id
GROUP BY f.film_id
HAVING COUNT(r.rental_id) = MAX(
SELECT COUNT(r2.rental_id)
FROM rental r2, inventory i2
WHERE i2.inventory_id = r2.inventory_id
GROUP BY i2.film_id);
but MySQL tells me that I have a syntax error somewhere near SELECT COUNT(r2.rental_id) FROM rental r2, inventory. However, when I run the subquery independently it returns the expected table. Am I doing something massively wrong?
Relevant database schema:
film(film_id, title, description, release_year, language_id, original_language_id, rental_duration, rental_rate, length, replacement_cost, rating, special_features, last_update)
inventory(inventory_id, film_id, store_id, last_update)
rental(rental_id, rental_date, inventory_id, customer_id, return_date, staff_id, last_update)
You can't use MAX() over a result set, but you can use
someValue >= ALL (subquery)
to achieve what you're attempting, because ALL requires that the preceding operator be true for all values in the set.
Try this:
SELECT f.title, f.film_id
FROM film f
JOIN inventory i ON f.film_id = i.film_id
JOIN rental r ON r.inventory_id = i.inventory_id
GROUP BY f.film_id
HAVING COUNT(r.rental_id) >= ALL (
SELECT COUNT(r2.rental_id)
FROM rental r2, inventory i2
WHERE i2.inventory_id = r2.inventory_id
GROUP BY i2.film_id);
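A related option, if you would rather keep MAX(): it cannot wrap a subquery directly, but it can aggregate the rows of a derived table inside a scalar subquery. The sketch below shows that variant. It is only an illustration with invented rows, run on an in-memory SQLite database from Python (SQLite has no ALL operator, so the ALL form above cannot be demonstrated this way).

```python
import sqlite3

# Invented rows: 'Alpha' is rented three times, 'Beta' once.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE inventory (inventory_id INTEGER PRIMARY KEY, film_id INTEGER);
CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, inventory_id INTEGER);
INSERT INTO film VALUES (1, 'Alpha'), (2, 'Beta');
INSERT INTO inventory VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO rental (inventory_id) VALUES (1), (2), (1), (3);
""")

query = """
SELECT f.title, f.film_id
FROM film f
JOIN inventory i ON f.film_id = i.film_id
JOIN rental r ON r.inventory_id = i.inventory_id
GROUP BY f.film_id, f.title
HAVING COUNT(r.rental_id) = (
    SELECT MAX(cnt)                     -- MAX over the derived table's rows
    FROM (SELECT COUNT(r2.rental_id) AS cnt
          FROM rental r2
          JOIN inventory i2 ON i2.inventory_id = r2.inventory_id
          GROUP BY i2.film_id) per_film
)
"""
most_rented = conn.execute(query).fetchall()
print(most_rented)
```

Like the ALL version, this returns every film that ties for the highest rental count.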
I don't have a database to test in, but this should work:
(Edited to use LIMIT 1 instead of SELECT TOP 1 for MySQL.)
SELECT f.title, f.film_id
FROM film f
JOIN inventory i ON f.film_id = i.film_id
JOIN rental r ON r.inventory_id = i.inventory_id
GROUP BY f.film_id
HAVING COUNT(r.rental_id) = (SELECT COUNT(r2.rental_id)
FROM rental r2, inventory i2
WHERE i2.inventory_id = r2.inventory_id
GROUP BY i2.film_id
ORDER BY COUNT(r2.rental_id) desc
LIMIT 1);

MySQL Query not displaying correctly

I am having to set up a query that retrieves the last comment made on a customer, if no one has commented on them for more than 4 weeks. I can make it work using the query below, but for some reason the comment column won't display the latest record. Instead it displays the oldest, however the date shows the newest. It may just be because I'm a noob at SQL, but what exactly am I doing wrong here?
SELECT DISTINCT
customerid, id, customername, user, MAX(date) AS 'maxdate', comment
FROM comments
WHERE customerid IN
(SELECT DISTINCT id FROM customers WHERE pastdue='1' AND hubarea='1')
AND customerid NOT IN
(SELECT DISTINCT customerid FROM comments WHERE DATEDIFF(NOW(), date) <= 27)
GROUP BY customerid
ORDER BY maxdate
The first "WHERE" clause is just ensuring that it shows only customers from a specific area, and that they are "past due enabled". The second makes sure that the customer has not been commented on within the last 27 days. It's grouped by customerid, because that is the number that is associated with each individual customer. When I get the results, everything is right except for the comment column...any ideas?
A join is usually better than a nested query, so use a join instead of the nested queries.
A join can also improve speed.
This query should get you closer:
SELECT DISTINCT
comments.customerid, comments.id, comments.customername, comments.user, MAX(comments.date) AS 'maxdate', comments.comment
FROM comments INNER JOIN customers ON comments.customerid = customers.id
WHERE customers.pastdue='1' AND customers.hubarea='1'
GROUP BY comments.customerid
HAVING DATEDIFF(NOW(), MAX(comments.date)) > 27
ORDER BY maxdate
I think this might do what you are trying to achieve. If you can run it and report back whether it works, I can tweak it if needed. Logically it should work, if I have understood your problem correctly :)
SELECT X.customerid, X.maxdate, co.id, c.customername, co.user, co.comment
FROM
(SELECT cm.customerid, MAX(cm.date) AS maxdate
FROM comments cm
INNER JOIN customers cu ON cu.id = cm.customerid
WHERE cu.pastdue='1'
AND cu.hubarea='1'
GROUP BY cm.customerid
HAVING DATEDIFF(NOW(), MAX(cm.date)) > 27) X
INNER JOIN comments co ON X.customerid = co.customerid AND X.maxdate = co.date
INNER JOIN customers c ON X.customerid = c.id
ORDER BY X.maxdate
You need a subquery for each case.
SELECT a.*
FROM comments a
INNER JOIN
(
SELECT customerID, max(`date`) maxDate
FROM comments
GROUP BY customerID
) b ON a.customerID = b.customerID AND
a.`date` = b.maxDate
INNER JOIN
(
SELECT DISTINCT ID
FROM customers
WHERE pastdue = 1 AND hubarea = 1
) c ON c.ID = a.customerID
LEFT JOIN
(
SELECT DISTINCT customerid
FROM comments
WHERE DATEDIFF(NOW(), date) <= 27
) d ON a.customerID = d.customerID
WHERE d.customerID IS NULL
The first join gets the latest record for each customer.
The second join keeps only customers from a specific area who are "past due enabled".
The third join, which uses LEFT JOIN, finds the customers that have been commented on within the last 27 days. Only customers with no match in that list are kept, because of the condition d.customerID IS NULL.
To make your query shorter: if the customers table already has one unique record per customer, you don't need a subquery on it. Join the table directly and put the conditions in the WHERE clause.
SELECT a.*
FROM comments a
INNER JOIN
(
SELECT customerID, max(`date`) maxDate
FROM comments
GROUP BY customerID
) b ON a.customerID = b.customerID AND
a.`date` = b.maxDate
INNER JOIN customers c
ON c.ID = a.customerID
LEFT JOIN
(
SELECT DISTINCT customerid
FROM comments
WHERE DATEDIFF(NOW(), date) <= 27
) d ON a.customerID = d.customerID
WHERE d.customerID IS NULL AND
c.pastdue = 1 AND
c.hubarea = 1
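Here is a small sketch of how that shorter query behaves, run on an in-memory SQLite database from Python. The rows are invented, and because SQLite has no DATEDIFF() or NOW(), the sketch assumes a fixed reference date passed in as the :today parameter and uses julianday() differences instead.

```python
import sqlite3

# Invented rows. Customer 1: last comment 41 days before the reference date.
# Customer 2: commented 10 days before it. Customer 3: not past due.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, pastdue INTEGER, hubarea INTEGER);
CREATE TABLE comments (id INTEGER PRIMARY KEY, customerid INTEGER, date TEXT, comment TEXT);
INSERT INTO customers VALUES (1, 1, 1), (2, 1, 1), (3, 0, 1);
INSERT INTO comments (customerid, date, comment) VALUES
  (1, '2024-01-01', 'old'), (1, '2024-01-20', 'latest'),
  (2, '2024-01-01', 'old'), (2, '2024-02-20', 'recent'),
  (3, '2024-01-01', 'old');
""")

query = """
SELECT a.customerid, a.date, a.comment
FROM comments a
JOIN (SELECT customerid, MAX(date) AS maxdate      -- latest comment per customer
      FROM comments
      GROUP BY customerid) b
  ON a.customerid = b.customerid AND a.date = b.maxdate
JOIN customers c ON c.id = a.customerid
LEFT JOIN (SELECT DISTINCT customerid              -- customers with a recent comment
           FROM comments
           WHERE julianday(:today) - julianday(date) <= 27) d
  ON a.customerid = d.customerid
WHERE d.customerid IS NULL                         -- keep those with no recent comment
  AND c.pastdue = 1
  AND c.hubarea = 1
"""
stale = conn.execute(query, {"today": "2024-03-01"}).fetchall()
print(stale)
```

Because the comment text is read from the row that matches the maximum date, it is the newest comment rather than an arbitrary one from the group.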
Several of the selected columns are not contained in either an aggregate function or the GROUP BY clause. For example, suppose you have two rows with the same customer id but different comment data. How should SQL aggregate these two rows? With ONLY_FULL_GROUP_BY enabled MySQL raises an error; otherwise it returns an arbitrary value for those columns, which is why you see the wrong comment.
Try this:
select customerid, id, customername, user, date, comment from (
select customerid, id, customername, user, date, comment,
@rank := IF(@current_customer = customerid, @rank + 1, 1) as rn,
@current_customer := customerid
from comments
where customerid IN
(SELECT DISTINCT id FROM customers WHERE pastdue='1' AND hubarea='1')
AND customerid NOT IN
(SELECT DISTINCT customerid FROM comments WHERE DATEDIFF(NOW(), date) <= 27)
order by customerid, date desc
) ranked where rn <= 1

Joining 2 sql queries in MySQL

I'm trying to join 2 sql queries in one query.
The first one gets the count of rooms per hotel.
The second one gets the count of checked guests in hotel.
I'm trying to get occupancy rate per hotel.
SELECT hotel_id, count(room_id)
FROM room
group by room.hotel_id
SELECT h.hotel_id, count(k.room_id)
FROM room_reservation as kr , room as k , hotel as h
where kr.room_id = k.room_id and k.hotel_id = h.hotel_id
group by k.hotel_id
How can I do this?
select aux.hotel_id, ((coalesce(aux2.total, 0)*1.0)/aux.total)*100 as 'occupancy rate'
from (SELECT hotel_id, count(room_id) as 'total'
FROM room
group by room.hotel_id) aux
LEFT OUTER JOIN (SELECT h.hotel_id, COUNT(k.room_id) as 'total'
FROM room_reservation as kr
INNER JOIN room as k ON (kr.room_id = k.room_id)
INNER JOIN hotel as h ON (k.hotel_id = h.hotel_id)
GROUP BY k.hotel_id) aux2 on aux.hotel_id = aux2.hotel_id
You can definitely do this with one query. One approach is just to union together your queries.
However, I think the following does what you want in one stroke:
SELECT k.hotel_id, count(distinct k.room_id) as numrooms,
count(distinct kr.room_id) as numreserved
FROM room k left outer join
room_reservation kr
on kr.room_id = k.room_id
group by k.hotel_id
I'm not positive, without knowing more about the tables. In particular, reservations have a time component which rooms and hotels don't have. How is this incorporated into your queries?
Join all your queries, aggregate to get the number of rooms/reservations per hotel, and divide:
SELECT h.hotel_id,
COUNT(DISTINCT rr.room_id) / COUNT(DISTINCT r.room_id) * 100.0 AS occupancy_rate
FROM hotel h
LEFT OUTER JOIN room r ON h.hotel_id = r.hotel_id
LEFT OUTER JOIN room_reservation rr ON r.room_id = rr.room_id
GROUP BY h.hotel_id
I hope this is self-explanatory:
select hotel_id, sum(guests)/count(room_id) occupancy_level
from (
select r.hotel_id, r.room_id, count(rr.room_id) guests
from room r
left join room_reservation rr on rr.room_id = r.room_id
group by r.hotel_id, r.room_id
) temp
group by hotel_id
UPDATE - inspired by @Gordon Linoff to include unreserved rooms:
select r.hotel_id, count(rr.room_id) / count(distinct r.room_id) occupancy_level
from room r
left join room_reservation rr on rr.room_id = r.room_id
group by r.hotel_id
This can be done very simply, assuming that the room_reservation table never holds more reservations than there are hotel rooms at any given time, and that each room has either 0 or 1 corresponding rows in room_reservation because previous reservations for a room are deleted. (It seems that way: your second query does no filtering such as selecting only the most recent reservation per room.)
SELECT
a.hotel_id,
(COUNT(b.room_id) / COUNT(*))*100 AS occupancy_rate
FROM
room a
LEFT JOIN
room_reservation b ON a.room_id = b.room_id
GROUP BY
a.hotel_id
If you need more details about the hotel beyond just the hotel_id, an additional INNER JOIN will be required.
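For a quick sanity check of the occupancy calculation, here is a small sketch run on an in-memory SQLite database from Python. The rows are invented, and it counts distinct reserved rooms so that it does not depend on the "0 or 1 reservations per room" assumption. Note that SQLite does integer division on integers, hence the leading 100.0; in MySQL the / operator already returns a decimal.

```python
import sqlite3

# Invented rows: hotel 1 has 4 rooms (rooms 1 and 2 reserved, room 1 twice),
# hotel 2 has 2 rooms and no reservations.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE room (room_id INTEGER PRIMARY KEY, hotel_id INTEGER);
CREATE TABLE room_reservation (reservation_id INTEGER PRIMARY KEY, room_id INTEGER);
INSERT INTO room VALUES (1, 1), (2, 1), (3, 1), (4, 1), (5, 2), (6, 2);
INSERT INTO room_reservation (room_id) VALUES (1), (1), (2);
""")

query = """
SELECT a.hotel_id,
       100.0 * COUNT(DISTINCT b.room_id) / COUNT(DISTINCT a.room_id) AS occupancy_rate
FROM room a
LEFT JOIN room_reservation b ON a.room_id = b.room_id
GROUP BY a.hotel_id
ORDER BY a.hotel_id
"""
occupancy = conn.execute(query).fetchall()
print(occupancy)
```

The LEFT JOIN keeps hotels whose rooms have no reservations, so they show up with a rate of 0 instead of disappearing from the result.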