Group by does not provide accurate grouping

Attempting to create outputs that match the following screenshot:
When I attempt the following query:
SELECT t.amount, d.DOMAIN_NAME,td.month_number
FROM transaction t
JOIN transaction_date td ON t.trans_date_key = td.trans_date_key
JOIN domain d ON t.domain_key = d.domain_key
WHERE td.month_number =7
ORDER BY amount DESC;
I get the output of:
When I implement this query:
SELECT t.amount, d.DOMAIN_NAME,td.month_number
FROM transaction t
JOIN transaction_date td ON t.trans_date_key = td.trans_date_key
JOIN domain d ON t.domain_key = d.domain_key
WHERE td.month_number =7
GROUP BY domain_name
ORDER BY amount DESC;
I get the output of:
Why is my grouping only performing accurately on a few of the domain names, but not others?

You are not using GROUP BY correctly. You need an aggregate function to sum the amounts, and every non-aggregated column in the SELECT list should also be listed in the GROUP BY clause.
Consider:
SELECT SUM(t.amount) total_amount, d.domain_name, td.month_number
FROM transaction t
INNER JOIN transaction_date td ON t.trans_date_key = td.trans_date_key
INNER JOIN domain d ON t.domain_key = d.domain_key
WHERE td.month_number = 7
GROUP BY d.domain_name, td.month_number
ORDER BY total_amount DESC;
With the way you used GROUP BY, MySQL picks an arbitrary record from those that share the same domain_name, so the amount shown for each domain is not meaningful. Most other RDBMS would raise an error here, as does MySQL 5.7.5 and later, where ONLY_FULL_GROUP_BY is enabled by default.
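As a rough illustration of the rule above (aggregate the amount, group by everything else), here is a minimal sketch that runs the same shape of query against an in-memory SQLite database from Python. SQLite only stands in for MySQL here, and the single `sales` table with its rows is invented for the example; it is not the asker's schema.

```python
import sqlite3

# Hypothetical flattened version of the joined data; table name and rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (domain_name TEXT, month_number INTEGER, amount REAL);
    INSERT INTO sales VALUES
        ('a.com', 7, 10.0),
        ('a.com', 7, 5.0),
        ('b.com', 7, 20.0),
        ('b.com', 8, 99.0);
""")

# Every selected column is either aggregated (SUM) or listed in GROUP BY.
totals = conn.execute("""
    SELECT domain_name, month_number, SUM(amount) AS total_amount
    FROM sales
    WHERE month_number = 7
    GROUP BY domain_name, month_number
    ORDER BY total_amount DESC
""").fetchall()

for row in totals:
    print(row)
```

Each domain appears once, with the sum of its July amounts, rather than with one arbitrarily chosen amount.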

Related

WITH ROLLUP not showing sum

SELECT a.Store,
t.CM
FROM sales_data a
INNER JOIN (
SELECT Store, SUM(no_of_bill) AS CM
FROM cash_memo c
WHERE c.Bill_Date < '2018-02-28'
GROUP BY Store
) t USING (Store)
GROUP BY a.Store WITH ROLLUP
gives me the below result:
[query result][1]
I am not sure what's wrong; I am not getting the sum of 'CM' when using ROLLUP.

The rollup needs to be with the group by. Try this:
SELECT Store, SUM(no_of_bill) AS CM
FROM cash_memo c
WHERE c.Bill_Date < '2018-02-28'
GROUP BY Store WITH ROLLUP;
Your query should fail with a syntax error because CM is not aggregated in the outer query and is not in the GROUP BY.
If you phrased the logic correctly, it would be:
SELECT a.Store, SUM(t.CM)
This would work and the ROLLUP would do what you expect.
Note: The JOIN seems unnecessary for the query. But if you really want it, you can just include it in the above query with no subquery.

You are using GROUP BY in the outer query without doing any aggregation.
Join the 2 tables, aggregate on the resultset returned by the join and use ROLLUP:
SELECT s.Store, SUM(c.no_of_bill) AS CM
FROM sales_data s INNER JOIN cash_memo c
ON c.Store = s.Store
WHERE c.Bill_Date < '2018-02-28'
GROUP BY s.Store WITH ROLLUP
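To make the expected shape of a ROLLUP result concrete, here is a small sketch in Python with SQLite. SQLite does not support WITH ROLLUP, so the grand-total row is emulated with UNION ALL; that is a swapped-in technique for illustration only, not what MySQL does internally. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cash_memo (Store TEXT, no_of_bill INTEGER, Bill_Date TEXT);
    INSERT INTO cash_memo VALUES
        ('S1', 3, '2018-01-10'),
        ('S1', 2, '2018-02-01'),
        ('S2', 4, '2018-02-15'),
        ('S2', 9, '2018-03-01');
""")

# Per-store sums plus one extra row with Store = NULL holding the grand total,
# which is the shape GROUP BY Store WITH ROLLUP produces in MySQL.
rollup_rows = conn.execute("""
    SELECT Store, SUM(no_of_bill) AS CM
    FROM cash_memo
    WHERE Bill_Date < '2018-02-28'
    GROUP BY Store
    UNION ALL
    SELECT NULL, SUM(no_of_bill)
    FROM cash_memo
    WHERE Bill_Date < '2018-02-28'
""").fetchall()

per_store = sorted(r for r in rollup_rows if r[0] is not None)
grand_total = [r for r in rollup_rows if r[0] is None]
print(per_store, grand_total)
```

The row whose Store is NULL is the super-aggregate; it only contains a meaningful number when the column is wrapped in SUM().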

subquery shows more than one row group by

I am trying to get the data for the best 5 customers in a railway reservation system. To get that, I tried getting the max value by summing up their fare every time they make a reservation. Here is the code.
SELECT c. firstName, c.lastName,MAX(r.totalFare) as Fare
FROM customer c, Reservation r, books b
WHERE r.resID = b.resID
AND c.username = b.username
AND r.totalfare < (SELECT sum(r1.totalfare) Revenue
from Reservation r1, for_res f1, customer c1,books b1
where r1.resID = f1.resID
and c1.username = b1.username
and r1.resID = b1.resID
group by c1.username
)
GROUP BY c.firstName, c.lastName, r.totalfare
ORDER BY r.totalfare desc
LIMIT 5;
This throws the error: [21000][1242] Subquery returns more than 1 row
If I remove the GROUP BY from the subquery, the result (shown in tabular form) is:
Jade,Smith,1450
Jade,Smith,725
Jade,Smith,25.5
Monica,Geller,20.1
Rach,Jones,10.53
But that's not what I want, as you can see, I want to add the name 'Jade' with the total fare.

I just don't see the point of the subquery. It seems like you can get the result you want with a sum():
select c.firstname, c.lastname, sum(totalfare) as totalfare
from customer c
inner join books b on b.username = c.username
inner join reservation r on r.resid = b.resid
group by c.username
order by totalfare desc
limit 5
This sums all reservations of each customer and uses that total to sort the result set. This guarantees one row per customer.
The query assumes that username is the primary key of table customer. If that's not the case, you need to add columns firstname and lastname to the group by clause.
Note that this uses standard joins (with the inner join ... on keywords) rather than old-school, implicit joins (with commas in the from clause): these are legacy syntax that should not be used in new code.
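The point about grouping on the key can be sketched as follows, using Python with an in-memory SQLite database as a stand-in for MySQL. The rows are invented, and two customers deliberately share the name Jade Smith to show that grouping by username keeps them apart. The alias `total_fare` is chosen here only to avoid clashing with the column name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (username TEXT PRIMARY KEY, firstname TEXT, lastname TEXT);
    CREATE TABLE reservation (resid INTEGER PRIMARY KEY, totalfare REAL);
    CREATE TABLE books (username TEXT, resid INTEGER);
    INSERT INTO customer VALUES
        ('jade1', 'Jade', 'Smith'),
        ('jade2', 'Jade', 'Smith'),
        ('mon', 'Monica', 'Geller');
    INSERT INTO reservation VALUES (1, 100.0), (2, 50.0), (3, 30.0), (4, 20.0);
    INSERT INTO books VALUES ('jade1', 1), ('jade1', 2), ('jade2', 3), ('mon', 4);
""")

# One row per username; the names are listed in GROUP BY too, which is always safe.
top_customers = conn.execute("""
    SELECT c.username, c.firstname, c.lastname, SUM(r.totalfare) AS total_fare
    FROM customer c
    INNER JOIN books b ON b.username = c.username
    INNER JOIN reservation r ON r.resid = b.resid
    GROUP BY c.username, c.firstname, c.lastname
    ORDER BY total_fare DESC
    LIMIT 5
""").fetchall()

for row in top_customers:
    print(row)
```

Had the query grouped only by firstname and lastname, the two Jade Smith accounts would have been merged into a single row.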

SQL beginner practice problems

Given two tables, orders (order_id, date, $, customer_id) and customers (ID, name)
Here's my method but I'm not sure if it's working & I'd like to know if there's faster/better way of solving these problems:
1) find out number of customers who made at least one order on date 7/9/2018
Select count (distinct customer_id)
From
(
Select customer_id from orders a
Left join customer b
On a.customer_id = b.ID
Group by customer_id,date
Having date = 7/9/2018
) a
2) find out number of customers who did not make an order on 7/9/2018
Select count (customer_id) from customer where customer_id not in
(
Select customer_id from orders a
Left join customer b
On a.customer_id = b.ID
Group by customer_id,date
Having date = 7/9/2018
)
3) find the date with most sales between 7/1 and 7/30
select date, max($)
from (
Select sum($),date from orders a
Left join customer b
On a.customer_id = b.ID
Group by date
Having date between 7/1 and 7/30
)
Thanks,

For problem 1, a valid solution might look like this:
SELECT COUNT(DISTINCT customer_id) x
FROM orders
WHERE date = '2018-09-07'; -- or is that '2018-07-09' ??
For problem 2, a valid solution might look like this:
SELECT COUNT(*) x
FROM customer c
LEFT
JOIN orders o
ON o.customer_id = c.ID
AND o.date = '2018-07-09'
WHERE o.order_id IS NULL;
Assuming there are no ties, a valid solution to problem 3 might look like this:
SELECT date
, COUNT(*) sales
FROM orders
WHERE date BETWEEN '2018-07-01' AND '2018-07-30'
GROUP
BY date
ORDER
BY sales DESC
LIMIT 1;

The default format for a date in MySQL is YYYY-MM-DD, although this can be customized. You have to put quotes around it, otherwise it's treated as an arithmetic expression.
And none of your queries need to join with the customer table. The customer ID is already in the orders table, and you're not returning any info about the customers (like the name or address), you're just counting them.
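The effect of leaving the quotes off a date can be seen in a small sketch, again using Python with SQLite as a stand-in (the rows are invented). In SQLite the unquoted `7/9/2018` is integer division and evaluates to 0; in MySQL the same expression yields a tiny fraction instead. In neither case is it a date, so the comparison matches nothing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, date TEXT, customer_id INTEGER);
    INSERT INTO orders (date, customer_id) VALUES
        ('2018-07-09', 1), ('2018-07-09', 1), ('2018-07-09', 2), ('2018-07-10', 3);
""")

# Unquoted: parsed as 7 divided by 9 divided by 2018, not as a date.
unquoted_value = conn.execute("SELECT 7/9/2018").fetchone()[0]
unquoted_count = conn.execute(
    "SELECT COUNT(DISTINCT customer_id) FROM orders WHERE date = 7/9/2018"
).fetchone()[0]

# Quoted: compared against the stored date value.
quoted_count = conn.execute(
    "SELECT COUNT(DISTINCT customer_id) FROM orders WHERE date = '2018-07-09'"
).fetchone()[0]

print(unquoted_value, unquoted_count, quoted_count)
```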
1) You don't need the subquery or grouping.
SELECT COUNT(DISTINCT customer_id)
FROM orders
WHERE date = '2018-07-09'
2) Again, you don't need GROUP BY in the subquery. There's also a better pattern than NOT IN to get the count of non-matching rows.
SELECT COUNT(*)
FROM customer AS c
LEFT JOIN orders AS o ON c.id = o.customer_id AND o.date = '2018-07-09'
WHERE o.order_id IS NULL
See "Return row only if value doesn't exist" for various patterns to do this.
3) You can't use MAX($) in the outer query because the inner query doesn't return a column with that name. But even if you fix that, it still won't work, because the date column won't necessarily come from the same row that has the maximum. See "SQL select only rows with max value on a column" for more explanation of this.
You don't need a subquery at all. Use a query that returns the total sales for each day, then use ORDER BY to get the highest one.
SELECT date, SUM($) AS total_sales
FROM orders
WHERE date BETWEEN '2018-07-01' AND '2018-07-30'
GROUP BY date
ORDER BY total_sales DESC
LIMIT 1
If "most sales" is supposed to mean "most number of sales", replace SUM($) with COUNT(*).
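The anti-join pattern from 2) and the "order by the total, take the first row" pattern from 3) can be tried out with a sketch like this one, using Python and SQLite in place of MySQL. The column `amount` is a hypothetical stand-in for the `$` column, and all rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, date TEXT, amount REAL, customer_id INTEGER);
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders (date, amount, customer_id) VALUES
        ('2018-07-09', 10.0, 1),
        ('2018-07-09', 15.0, 1),
        ('2018-07-10', 40.0, 2);
""")

# Anti-join: customers with no matching order on the date keep a NULL order_id.
no_order_count = conn.execute("""
    SELECT COUNT(*)
    FROM customer AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id AND o.date = '2018-07-09'
    WHERE o.order_id IS NULL
""").fetchone()[0]

# Daily totals, highest first; LIMIT 1 keeps the date and its total together.
best_day = conn.execute("""
    SELECT date, SUM(amount) AS total_sales
    FROM orders
    WHERE date BETWEEN '2018-07-01' AND '2018-07-30'
    GROUP BY date
    ORDER BY total_sales DESC
    LIMIT 1
""").fetchone()

print(no_order_count, best_day)
```

Note that the date condition sits in the ON clause, not in WHERE; moving it to WHERE would discard the NULL rows that the anti-join relies on.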

mysql avg function not returning all records

I am using following query
select
*,
dealer.id As dealerID,
services.id as serviceID
from services
LEFT JOIN dealer
on services.dealer=dealer.id
LEFT JOIN reviews
ON reviews.dealer_id=dealer.id
where services.brand_id = '9' and
services.model_id='107' and
services.petrol > 0
ORDER BY services.total asc ,
AVG(reviews.rating) desc
I have 6 records, so it should display 6 records, but instead it displays only 1. When I remove AVG(reviews.rating) desc, it displays all records.
MySQL tables are:
services: dealer, brand_id, model_id, petrol, id, total
dealer: id, name
reviews: id, dealer_id, rating
I am not sure where I am making a mistake. Any help would be appreciated.

avg() is an aggregation function. That is, it takes data from multiple rows and summarizes it.
Without a group by, the query is an aggregation query over all the data. Such a query always returns exactly one row.
Most databases would return an error when you use select *, use an aggregation function, and have no group by. MySQL has a (mis)feature where this syntax is allowed (although on the newest versions, the default settings disallow this).
I'm not sure what you are trying to do, but avg() doesn't make sense in this context. Perhaps this does what you want:
ORDER BY services.total asc, reviews.rating desc
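The one-row behaviour described above is easy to reproduce. This sketch uses Python with SQLite as a stand-in for MySQL and a few invented review rows: an aggregate with no GROUP BY collapses the whole table to one row, while adding GROUP BY yields one row per group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reviews (id INTEGER PRIMARY KEY, dealer_id INTEGER, rating REAL);
    INSERT INTO reviews (dealer_id, rating) VALUES (1, 4.0), (1, 2.0), (2, 6.0);
""")

# No GROUP BY: the aggregate runs over all three rows and returns a single row.
overall = conn.execute("SELECT AVG(rating) FROM reviews").fetchall()

# With GROUP BY: one row per dealer.
per_dealer = conn.execute("""
    SELECT dealer_id, AVG(rating)
    FROM reviews
    GROUP BY dealer_id
    ORDER BY dealer_id
""").fetchall()

print(overall, per_dealer)
```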

As already mentioned, AVG() is an aggregate function, so I have changed the last term of your ORDER BY to a subquery that selects the average rating for each dealer.
For future reference:
Providing snippets of raw data also helps. Creating an sql fiddle helps even more
select
*,
dealer.id As dealerID,
services.id as serviceID
from services
LEFT JOIN dealer
on services.dealer=dealer.id
LEFT JOIN reviews
ON reviews.dealer_id=dealer.id
where services.brand_id = '9' and
services.model_id='107' and
services.petrol > 0
ORDER BY services.total asc ,
(SELECT AVG(r2.rating) FROM reviews r2 WHERE r2.dealer_id = dealer.id) desc

You might try:
SELECT
*,
AVG(c.rating) AS `avg__rating`,
b.id AS dealerID,
a.id AS serviceID
FROM services a
LEFT JOIN dealer b
on a.dealer = b.id
LEFT JOIN reviews c
ON c.dealer_id = b.id
WHERE a.brand_id = '9' and
a.model_id='107' and
a.petrol > 0
GROUP BY a.dealer, a.brand_id, a.model_id, a.petrol, a.id, a.total
ORDER BY a.total asc,
AVG(c.rating) desc
This adds a GROUP BY on the columns in your services table so you will get one row per services/dealer.

Is it possible to find COUNT function and then to show only certain values that are higher than 10

for example:
SELECT doctor.name
, doctor.surname
, COUNT(checkup.doctor)
FROM doctor
, checkup
WHERE doctor.id = checkup.doctor
GROUP
BY doctor.name
ORDER
BY checkup.doctor
This gives me a list of all doctors who had checkups with patients, but I want to show only doctors with more than 10 checkups. What should I add to my SQL?

You want to limit your result rows according to an aggregation result. You'd do that in the HAVING clause.
As to your query: What is name? A unique name for every doctor? Otherwise better group by the ID - only then is this query valid according to the SQL standard. Then you are using a join syntax that we stopped using in the 1990s. Please use proper ANSI joins instead.
I prefer aggregating before joining, but that's just personal preference:
select d.name, d.surname, c.checkups
from doctor d
join
(
select doctor as doctor_id, count(*) as checkups
from checkup
group by doctor
having count(*) > 10
) c on c.doctor_id = d.id
order by d.id;
You could just as well use
select d.name, d.surname, count(*) as checkups
from doctor d
join checkup c on c.doctor = d.id
group by d.id
having count(*) > 10
order by d.id;
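A quick way to check the HAVING behaviour is a sketch like the following, using Python and SQLite in place of MySQL. The doctor names and checkup counts are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE doctor (id INTEGER PRIMARY KEY, name TEXT, surname TEXT);
    CREATE TABLE checkup (id INTEGER PRIMARY KEY, doctor INTEGER);
    INSERT INTO doctor VALUES (1, 'Ana', 'Kos'), (2, 'Ivo', 'Ban');
""")
# Doctor 1 gets 12 checkups, doctor 2 gets 3.
conn.executemany("INSERT INTO checkup (doctor) VALUES (?)", [(1,)] * 12 + [(2,)] * 3)

# HAVING filters groups after aggregation; WHERE could not see COUNT(*).
busy_doctors = conn.execute("""
    SELECT d.name, d.surname, COUNT(*) AS checkups
    FROM doctor d
    JOIN checkup c ON c.doctor = d.id
    GROUP BY d.id, d.name, d.surname
    HAVING COUNT(*) > 10
    ORDER BY d.id
""").fetchall()

print(busy_doctors)
```

Only the doctor with 12 checkups survives the filter; the one with 3 is grouped and counted but then dropped by HAVING.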

Try this: use a HAVING clause to filter on the aggregation result, and avoid the outdated comma join. Surname adds nothing to the grouping, so you can either wrap it in MAX() in the SELECT list and drop it from the GROUP BY, or leave it in the GROUP BY as it is here:
SELECT doctor.name
, doctor.surname
, COUNT(checkup.doctor)
FROM doctor
INNER JOIN checkup ON doctor.id = checkup.doctor
GROUP BY doctor.name, doctor.surname
HAVING COUNT(checkup.doctor) > 10

Because you are using an aggregate function (COUNT) and want only the results where COUNT(checkup.doctor) is more than 10,
you can filter the result using HAVING:
SELECT doctor.name,doctor.surname,COUNT(checkup.doctor)
FROM doctor
INNER JOIN checkup ON doctor.id=checkup.doctor
GROUP BY doctor.name, doctor.surname
Having COUNT(checkup.doctor) > 10
ORDER BY COUNT(checkup.doctor)
The WHERE clause works on individual rows as they are selected, so it cannot see the result of an aggregate. For that, SQL provides the HAVING clause, which filters the grouped result.

I think this will work for you:
SELECT
doctor.name, doctor.surname, checkup_stat.checkup_count
FROM
(SELECT doctor, COUNT(1) AS checkup_count FROM checkup GROUP BY doctor ORDER BY doctor) AS checkup_stat
RIGHT JOIN
doctor
ON
doctor.id = checkup_stat.doctor
WHERE
checkup_stat.checkup_count > 10;
Firstly, use a subquery to get the checkup count for every doctor.
Secondly, select the records from the subquery result where checkup count > 10.
Then, just join them together.
