Subquery returns more than one row with GROUP BY - MySQL

I am trying to get the data for the best 5 customers in a railway reservation system. To get that, I tried getting the max value by summing up their fare every time they make a reservation. Here is the code.
SELECT c.firstName, c.lastName, MAX(r.totalFare) AS Fare
FROM customer c, Reservation r, books b
WHERE r.resID = b.resID
AND c.username = b.username
AND r.totalfare < (SELECT sum(r1.totalfare) Revenue
from Reservation r1, for_res f1, customer c1,books b1
where r1.resID = f1.resID
and c1.username = b1.username
and r1.resID = b1.resID
group by c1.username
)
GROUP BY c.firstName, c.lastName, r.totalfare
ORDER BY r.totalfare desc
LIMIT 5;
This throws the error: [21000][1242] Subquery returns more than 1 row
If I remove the GROUP BY from the subquery, the result is (in tabular form):
Jade,Smith,1450
Jade,Smith,725
Jade,Smith,25.5
Monica,Geller,20.1
Rach,Jones,10.53
But that's not what I want. As you can see, 'Jade' appears three times; I want one row per customer, with the name next to the total of all their fares.

I just don't see the point of the subquery. It seems like you can get the result you want with a sum():
select c.firstname, c.lastname, sum(totalfare) as totalfare
from customer c
inner join books b on b.username = c.username
inner join reservation r on r.resid = b.resid
group by c.username
order by totalfare desc
limit 5
This sums all reservations of each customer and uses that total to sort the result set. This guarantees one row per customer.
The query assumes that username is the primary key of table customer. If that's not the case, you need to add columns firstname and lastname to the group by clause.
Note that this uses standard joins (with the inner join ... on keywords) rather than old-school, implicit joins (with commas in the from clause): these are legacy syntax that should not be used in new code.
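For anyone who wants to try the idea, here is a minimal sketch that runs the same kind of query through Python's sqlite3 module. SQLite stands in for MySQL, and the usernames and reservation ids are made up; only the names and fares come from the question. The alias is called total_fare here simply to avoid any clash with the column name.

```python
import sqlite3

# Hypothetical sample data; SQLite is used as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (username TEXT PRIMARY KEY, firstName TEXT, lastName TEXT);
CREATE TABLE reservation (resID INTEGER PRIMARY KEY, totalFare REAL);
CREATE TABLE books (username TEXT, resID INTEGER);
INSERT INTO customer VALUES
  ('jade', 'Jade', 'Smith'), ('mon', 'Monica', 'Geller'), ('rach', 'Rach', 'Jones');
INSERT INTO reservation VALUES (1, 1450), (2, 725), (3, 25.5), (4, 20.1), (5, 10.53);
INSERT INTO books VALUES ('jade', 1), ('jade', 2), ('jade', 3), ('mon', 4), ('rach', 5);
""")

# One row per customer: sum their fares, sort by the sum, keep the top 5.
top_customers = conn.execute("""
SELECT c.firstName, c.lastName, SUM(r.totalFare) AS total_fare
FROM customer c
INNER JOIN books b ON b.username = c.username
INNER JOIN reservation r ON r.resID = b.resID
GROUP BY c.username, c.firstName, c.lastName
ORDER BY total_fare DESC
LIMIT 5
""").fetchall()

for row in top_customers:
    print(row)
```

With this data, Jade's three reservations collapse into a single row instead of three.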

Related

SQL Query for getting maximum value from a column Joining from Another Table

This is a slight variant of the question I asked here
SQL Query for getting maximum value from a column
I have a Person Table and an Activity Table with the following data
-- PERSON-----
------ACTIVITY------------
I have got this data in the database about users spending time on a particular activity.
For every user, I want the row on which they spent the maximum number of hours.
My Query is
Select p.Id as 'PersonId',
p.Name as 'Name',
act.HoursSpent as 'Hours Spent',
act.Date as 'Date'
From Person p
Left JOIN (Select MAX(HoursSpent), Date from Activity
Group By HoursSpent, Date) act
on act.personId = p.Id
But it gives me all the rows for Person, not just the ones with the maximum number of hours spent.
This should be my result.
You have several issues with your query:
The subquery to get hours is aggregated by date, not person.
You don't have a way to bring in other columns from activity.
You can take this approach -- joins and group by, but it requires two joins:
select p.*, a.* -- the columns you want
from Person p left join
(select personid, max(HoursSpent) as max_hoursspent
from activity
group by personid
) ma
on ma.personId = p.id left join
activity a
on a.personId = ma.personId and
a.hoursspent = ma.max_hoursspent; -- keep only the activity rows at the per-person maximum
Note that this can return duplicates for a given person -- if there are ties for the maximum.
This is written more colloquially using row_number():
select p.*, a.* -- the columns you want
from Person p left join
(select a.*,
row_number() over (partition by a.personid order by a.hoursspent desc) as seqnum
from activity a
) a
on a.personId = p.id and a.seqnum = 1;
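To see the tie behaviour of the join-based approach, here is a small sketch using Python's sqlite3 module as a stand-in for the real database. The sample rows are invented (the original tables are not shown), and the derived table with the per-person maximum is joined first so that only activity rows at that maximum are kept.

```python
import sqlite3

# Hypothetical data: Ann has a clear maximum, Bob has a tie, Cy has no activity.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Activity (PersonId INTEGER, HoursSpent INTEGER, Date TEXT);
INSERT INTO Person VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO Activity VALUES
  (1, 2, '2020-01-01'), (1, 5, '2020-01-02'),
  (2, 4, '2020-01-01'), (2, 4, '2020-01-03');
""")

# Per-person maximum first, then the activity rows that match it.
max_rows = conn.execute("""
SELECT p.Id, p.Name, a.HoursSpent, a.Date
FROM Person p
LEFT JOIN (SELECT PersonId, MAX(HoursSpent) AS max_hoursspent
           FROM Activity
           GROUP BY PersonId) ma
  ON ma.PersonId = p.Id
LEFT JOIN Activity a
  ON a.PersonId = ma.PersonId AND a.HoursSpent = ma.max_hoursspent
ORDER BY p.Id, a.Date
""").fetchall()

for row in max_rows:
    print(row)
```

Bob comes back twice because two of his rows share the maximum, and Cy still appears (with NULLs) thanks to the left joins. A row_number() version would return exactly one row for Bob.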

get data based on MAX date and customer id

I have two tables: customers and contracts. The common key between them is customer_id. I need to link these two tables to represent if my fictitious business is on contract with a customer.
The customer -> contract table has a one to many relationship, so a customer can have an old contract on record. I want the latest. This is currently handled by contract_id which is auto-incremented.
My query is supposed to grab the contract data based on customer_id and the max contract_id for that customer_id.
My query currently looks like this:
SELECT * FROM(
SELECT co.*
FROM contracts co
LEFT JOIN customers c ON co.customer_id = c.customer_id
WHERE co.customer_id ='135') a
where a.contract_id = MAX(a.contract_id);
The answer is probably ridiculously obvious and I'm just not seeing it.
Since the most recent contract will be the one with the highest a.contract_id, simply ORDER BY it in descending order and LIMIT 1:
SELECT * FROM(
SELECT co.*
FROM contracts co
LEFT JOIN customers c ON co.customer_id = c.customer_id
WHERE co.customer_id ='135') a
ORDER BY a.contract_id DESC
LIMIT 1
You can use NOT EXISTS():
SELECT * FROM contracts co
LEFT JOIN customers c
ON (c.customer_id = co.customer_id)
WHERE co.customer_id = '135'
AND NOT EXISTS (SELECT 1 FROM contracts co2
WHERE co2.customer_id = co.customer_id
AND co2.contract_id > co.contract_id)
This makes sure only the latest contract is returned. It also works for all customers at once: just remove the co.customer_id = '135' condition and you will get the latest contract of every customer.
In general, you can't use an aggregate function in the WHERE clause, only in the HAVING clause, which is usually combined with a GROUP BY clause.
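As a rough illustration of the NOT EXISTS pattern across all customers, here is a sketch run through Python's sqlite3 module (standing in for MySQL). The table contents are made up, and customer_id is stored as an integer here for simplicity. A GROUP BY with MAX() is included only as a cross-check.

```python
import sqlite3

# Hypothetical contracts: customer 135 has an old and a new one, customer 200 has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contracts (contract_id INTEGER PRIMARY KEY, customer_id INTEGER);
INSERT INTO contracts VALUES (1, 135), (2, 200), (3, 135);
""")

# Keep a contract only if no newer contract exists for the same customer.
latest_not_exists = conn.execute("""
SELECT co.customer_id, co.contract_id
FROM contracts co
WHERE NOT EXISTS (SELECT 1 FROM contracts co2
                  WHERE co2.customer_id = co.customer_id
                    AND co2.contract_id > co.contract_id)
ORDER BY co.customer_id
""").fetchall()

# Cross-check: the aggregate belongs in the select list with GROUP BY, not in WHERE.
latest_group_by = conn.execute("""
SELECT customer_id, MAX(contract_id)
FROM contracts
GROUP BY customer_id
ORDER BY customer_id
""").fetchall()

print(latest_not_exists)
print(latest_group_by)
```

Both queries agree here; the NOT EXISTS form has the advantage that it returns the whole contract row, not just the id.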

SQL retrieving filtered value in subquery

In this query, cust_id is a foreign key and ords returns the number of orders for every customer:
SELECT cust_name, (
SELECT COUNT(*)
FROM Orders
WHERE Orders.cust_id = Customers.cust_id
) AS ords
FROM Customers
The output is correct, but I want to filter it to retrieve only the customers with fewer than a given number of orders, and I don't know how to filter on the subquery ords. I tried WHERE ords < 2 at the end of the query, but it doesn't work, and I've tried adding AND COUNT(*)<2 after the cust_id comparison, but that doesn't work either. I am using MySQL.
Use the HAVING clause (and use a join instead of a subquery):
SELECT Customers.cust_id, Customers.cust_name, COUNT(*) ords
FROM Orders, Customers
WHERE Orders.cust_id = Customers.cust_id
GROUP BY 1,2
HAVING COUNT(*)<2
If you want to include people with zero orders you change the join to an outer join.
There is no need for a correlated subquery here: it is evaluated once for each row, which tends to perform poorly. A better approach would be to use a regular query with joins, group by and a having clause to apply your condition to groups.
Since your condition is to return only customers that have less than 2 orders, left join instead of inner join would be appropriate. It would return customers that have no orders as well (with 0 count).
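select
cust_name, count(o.cust_id) as ords
from
customers c
left join orders o on c.cust_id = o.cust_id
group by cust_name
-- count(o.cust_id) skips the NULL row that the left join produces for customers without orders
having count(o.cust_id) < 2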
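One detail worth checking with a LEFT JOIN is which expression gets counted. The sketch below (Python's sqlite3 as a stand-in for MySQL, with invented customers and orders) compares COUNT(*) with COUNT(o.cust_id): the former counts the NULL-extended row of a customer without orders as 1, the latter reports 0.

```python
import sqlite3

# Hypothetical data: Ann has 2 orders, Bob has 1, Cy has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (cust_id INTEGER PRIMARY KEY, cust_name TEXT);
CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, cust_id INTEGER);
INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO Orders VALUES (10, 1), (11, 1), (12, 2);
""")

query = """
SELECT c.cust_name, {count_expr} AS ords
FROM Customers c
LEFT JOIN Orders o ON o.cust_id = c.cust_id
GROUP BY c.cust_id, c.cust_name
HAVING {count_expr} < 2
ORDER BY c.cust_name
"""

# COUNT(*) counts rows, including the all-NULL order row for Cy.
star_counts = conn.execute(query.format(count_expr="COUNT(*)")).fetchall()
# COUNT(o.cust_id) counts only non-NULL values, so Cy gets 0.
col_counts = conn.execute(query.format(count_expr="COUNT(o.cust_id)")).fetchall()

print(star_counts)
print(col_counts)
```

With a threshold of 2 both variants happen to return the same customers, but only the second shows the right count; with a condition such as "< 1" the difference would change which rows come back.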

Query the name of the top X people on transaction count from a different table

I'm messing around with the Sakila sample database in MySQL and I would like to get the top two people who rented the most movies. I've tried a few things and my most recent attempt is:
SELECT c.last_name, Count(r.rental_id)as NumberOfRentals FROM customer c
INNER JOIN rental r ON c.customer_id = r.customer_id
ORDER BY NumberOfRentals DESC
LIMIT 2
It only returns the first name in the database though...
You need a GROUP BY clause. Without one, MySQL aggregates all the rows that match the query into a single row, instead of aggregating them per group (per last_name in this case).
SELECT c.last_name, Count(r.rental_id)as NumberOfRentals FROM customer c
INNER JOIN rental r ON c.customer_id = r.customer_id
GROUP BY c.last_name
ORDER BY NumberOfRentals DESC
LIMIT 2
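Here is a small sketch of the difference, using Python's sqlite3 module rather than MySQL and a few invented rows instead of the real Sakila data. It groups by customer_id as well as last_name, which is an assumption on my part to keep two customers who share a last name apart.

```python
import sqlite3

# Hypothetical miniature of the Sakila customer/rental tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, last_name TEXT);
CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, customer_id INTEGER);
INSERT INTO customer VALUES (1, 'SMITH'), (2, 'JONES'), (3, 'LEE');
INSERT INTO rental VALUES (1, 1), (2, 1), (3, 1), (4, 2), (5, 2), (6, 3);
""")

# Without GROUP BY, the aggregate collapses every joined row into one result row.
ungrouped = conn.execute("""
SELECT COUNT(r.rental_id)
FROM customer c
INNER JOIN rental r ON c.customer_id = r.customer_id
""").fetchall()

# With GROUP BY, there is one count per customer, so ORDER BY and LIMIT make sense.
top_two = conn.execute("""
SELECT c.last_name, COUNT(r.rental_id) AS NumberOfRentals
FROM customer c
INNER JOIN rental r ON c.customer_id = r.customer_id
GROUP BY c.customer_id, c.last_name
ORDER BY NumberOfRentals DESC
LIMIT 2
""").fetchall()

print(ungrouped)
print(top_two)
```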

Rewrite IN subquery as JOIN

I've never had good performance with IN in MySQL and I've hit a performance issue with it again.
I'm trying to create a view. The relevant part of it is:
SELECT
c.customer_id,
....
IF (c.customer_id IN (
SELECT cn.customer_id FROM customer_notes cn
), 1, 0) AS has_notes
FROM customers c;
Basically, I just want to know if the customer has a note attached to it or not. It doesn't matter how many notes. How can I rewrite this using JOIN to speed it up?
The customers table currently has 1.5 million rows so performance is an issue.
Don't you need the customer ID selected? As it stands, aren't you running the subquery once per customer, and getting a stream of true or false values with no idea which one applies to which customer?
If that is what you need, you don't need to reference the customers table (unless you keep your database in a state of semantic disintegrity and there could be entries in customer_notes for which there is no corresponding customer - but then you have bigger problems than the performance of this query); you can simply use:
SELECT DISTINCT Customer_ID
FROM Customer_Notes
ORDER BY Customer_ID;
to obtain the list of customer ID values with at least one entry in the Customer_Notes table.
If you want a list of Customer ID values and an associated true/false value, then you need to do a join:
SELECT C.Customer_ID,
CASE WHEN N.Have_Notes IS NULL THEN 0 ELSE 1 END AS Has_Notes
FROM Customers AS C
LEFT JOIN (SELECT Customer_ID, COUNT(*) AS Have_Notes
FROM Customer_Notes
GROUP BY Customer_ID) AS N
ON C.Customer_ID = N.Customer_ID
ORDER BY C.Customer_ID;
If this gives poor performance, check that you have an index on Customer_Notes.Customer_ID. If that isn't the issue, study the query plan.
Can't do ... in a view
The petty restrictions on what is allowed in a view are always a nuisance in any DBMS (MySQL is not alone in having restrictions). However, we can do it with a single regular join. I just remembered: COUNT(column) only counts non-null values, returning 0 if all values are null, so - if you don't mind getting a count rather than just 0 or 1 - you can use:
SELECT C.Customer_ID,
COUNT(N.Customer_ID) AS Num_Notes
FROM Customers AS C
LEFT JOIN Customer_Notes AS N
ON C.Customer_ID = N.Customer_ID
GROUP BY C.Customer_ID
ORDER BY C.Customer_ID;
And if you absolutely must have 0 or 1:
SELECT C.Customer_ID,
CASE WHEN COUNT(N.Customer_ID) = 0 THEN 0 ELSE 1 END AS Has_Notes
FROM Customers AS C
LEFT JOIN Customer_Notes AS N
ON C.Customer_ID = N.Customer_ID
GROUP BY C.Customer_ID
ORDER BY C.Customer_ID;
Note that the use of 'N.Customer_ID' is crucial - though any column in the table would do (but you've not divulged the names of any other columns, AFAICR) and I'd normally use something other than the joining column for clarity.
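A quick way to convince yourself of the COUNT(N.Customer_ID) behaviour is the sketch below, run through Python's sqlite3 module as a stand-in for MySQL. The rows and the Note_Text column are invented, since the real column names are unknown.

```python
import sqlite3

# Hypothetical data: customer 1 has two notes, customer 2 has none, customer 3 has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (Customer_ID INTEGER PRIMARY KEY);
CREATE TABLE Customer_Notes (Customer_ID INTEGER, Note_Text TEXT);
CREATE INDEX idx_notes_customer ON Customer_Notes (Customer_ID);
INSERT INTO Customers VALUES (1), (2), (3);
INSERT INTO Customer_Notes VALUES (1, 'first'), (1, 'second'), (3, 'only');
""")

# COUNT on a column from the outer-joined table ignores the NULLs of unmatched customers.
note_flags = conn.execute("""
SELECT C.Customer_ID,
       COUNT(N.Customer_ID) AS Num_Notes,
       CASE WHEN COUNT(N.Customer_ID) = 0 THEN 0 ELSE 1 END AS Has_Notes
FROM Customers AS C
LEFT JOIN Customer_Notes AS N
  ON C.Customer_ID = N.Customer_ID
GROUP BY C.Customer_ID
ORDER BY C.Customer_ID
""").fetchall()

for row in note_flags:
    print(row)
```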
I think EXISTS suits your situation better than JOIN or IN.
SELECT
IF (EXISTS (
SELECT *
FROM customer_notes cn
WHERE c.customer_id = cn.customer_id),
1, 0) AS filter_notes
FROM customers c
Try this
SELECT
CASE WHEN cn.customer_id IS NOT NULL THEN 1
ELSE 0
END AS filter_notes
FROM customers c LEFT JOIN customer_notes cn
ON c.customer_id= cn.customer_id
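When comparing the EXISTS and plain LEFT JOIN variants, it may help to look at the row counts they produce. This sketch uses Python's sqlite3 module in place of MySQL with made-up rows, and adds customer_id to the select list so the rows can be told apart.

```python
import sqlite3

# Hypothetical data: customer 1 has two notes, customer 2 has none, customer 3 has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY);
CREATE TABLE customer_notes (customer_id INTEGER, note TEXT);
INSERT INTO customers VALUES (1), (2), (3);
INSERT INTO customer_notes VALUES (1, 'a'), (1, 'b'), (3, 'c');
""")

# EXISTS yields exactly one row per customer, however many notes there are.
exists_rows = conn.execute("""
SELECT c.customer_id,
       EXISTS (SELECT 1 FROM customer_notes cn
               WHERE cn.customer_id = c.customer_id) AS has_notes
FROM customers c
ORDER BY c.customer_id
""").fetchall()

# A bare LEFT JOIN yields one row per matching note, so customer 1 shows up twice.
join_rows = conn.execute("""
SELECT c.customer_id,
       CASE WHEN cn.customer_id IS NOT NULL THEN 1 ELSE 0 END AS has_notes
FROM customers c
LEFT JOIN customer_notes cn ON c.customer_id = cn.customer_id
ORDER BY c.customer_id
""").fetchall()

print(exists_rows)
print(join_rows)
```

So the LEFT JOIN form needs a GROUP BY or DISTINCT on top if one row per customer is required.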