Prevent duplicates in mysql group by with join statement

I've got a problem that can't be new, but I can't figure out how to get the answer I want. It is probably something simple that I'm missing.
Using MySQL 5.5, I have 2 tables, 'referrals' and 'status'. I want to count referrals that have been cancelled, grouped by appt_date:
SELECT SUM(1) AS count, appt_date FROM referrals
JOIN status ON referrals.id=status.referral_id
WHERE status.status_name='cancelled'
GROUP BY appt_date
This works fine until I have a referral that gets cancelled twice. In other words, a referral with a given id that has 2 rows in the status table with matching referral_id will get counted twice.
How can I count each referral record only once when doing the join operation here?
EDIT:
My real question should have been this:
SELECT SUM(quantity) AS count, appt_date FROM referrals
JOIN status ON referrals.id=status.referral_id
WHERE status.status_name='cancelled'
GROUP BY appt_date
since each referral can have its own quantity.

Try changing count to count(DISTINCT id)

So count was only a label, I should have seen that :)
The temptation is to do SUM(DISTINCT quantity), but that will just remove duplicate quantity values (two different referrals with the same quantity would be counted once), which isn't what you want. It looks like you will need to join on a derived table. Have a look at this...
http://www.sqlteam.com/article/how-to-use-group-by-with-distinct-aggregates-and-derived-tables
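A minimal sketch of the derived-table idea, run through Python's sqlite3 module so it can be tried without a MySQL server. The sample rows are made up for illustration, and SQLite stands in for MySQL here (the SELECT itself uses only standard SQL). The derived table reduces status to one row per cancelled referral before the join, so each quantity is summed once.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE referrals (id INTEGER PRIMARY KEY, appt_date TEXT, quantity INTEGER);
CREATE TABLE status (referral_id INTEGER, status_name TEXT);
INSERT INTO referrals VALUES (1, '2013-01-01', 5), (2, '2013-01-01', 5), (3, '2013-01-02', 2);
-- referral 1 was cancelled twice, referral 2 once, referral 3 never
INSERT INTO status VALUES (1, 'cancelled'), (1, 'cancelled'), (2, 'cancelled'), (3, 'booked');
""")

# Join against a derived table holding each cancelled referral_id exactly once.
query = """
SELECT r.appt_date, SUM(r.quantity) AS total_quantity
FROM referrals r
JOIN (SELECT DISTINCT referral_id
      FROM status
      WHERE status_name = 'cancelled') c ON c.referral_id = r.id
GROUP BY r.appt_date
ORDER BY r.appt_date
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('2013-01-01', 10)]
```

With this data a plain join would give 15 for 2013-01-01 (referral 1 counted twice), and SUM(DISTINCT quantity) would give 5 (the two equal quantities collapsed into one).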

Related

Use SELECT through three table

I tried to write a query, but unfortunately I didn't succeed.
I want to know how many packages were delivered by a person over a given period.
For example, I want to know how many packages were delivered by John (user_id = 1) between 01-02-18 and 28-02-18. John drives a different car (a different plate_id) every day.
(orders_drivers.user_id, plates.plate_name, orders.delivery_date, orders.package_amount)
I have 3 tables:
orders with plate_id, delivery_date, package_amount
plates with plate_id, plate_name
orders_drivers with plate_id, plate_date, user_id
I tried some solutions but didn't get the expected result. Thanks!

Try using JOINS as shown below:
SELECT SUM(o.package_amount)
FROM orders o INNER JOIN orders_drivers od
ON o.plate_id=od.plate_id
WHERE od.user_id=<the_user_id>;
See MySQL Join Made Easy for insight.
You can also use a subquery:
SELECT SUM(o.package_amount)
FROM orders o
WHERE EXISTS (SELECT 1
FROM orders_drivers od
WHERE user_id=<user_id> AND o.plate_id=od.plate_id);

SELECT SUM(orders.package_amount) AS amount
FROM orders
LEFT JOIN plates ON orders.plate_id = plates.plate_id
LEFT JOIN orders_drivers ON orders.plate_id = orders_drivers.plate_id
WHERE orders.delivery_date > date1 AND orders.delivery_date < date2 AND orders_drivers.user_id = userid
GROUP BY orders_drivers.user_id
But seriously, you need to ask questions that make more sense.
SUM is a function that adds up all the values in each group formed by GROUP BY.
LEFT JOIN connects the tables on matching plate_id values. Any other join would do here, as long as all ids are unique (at least I hope).
WHERE is where you give the dates and the user.
GROUP BY user_id means that if there are several records for the same id, they are returned as one row (with their package amounts summed).
With the AS, your result is returned under the name 'amount'.

If you want the total package_amount by user in a period, you can use this query:
UPDATE: added a WHERE clause on user_id to retrieve the data for John
SELECT od.user_id
, p.plate_name
, SUM(o.package_amount) AS TotalPackageAmount
FROM orders_drivers od
JOIN plates p
ON p.plate_id = od.plate_id
JOIN orders o
ON o.plate_id = od.plate_id
WHERE o.delivery_date BETWEEN convert(datetime,'01/02/2018',103) AND convert(datetime,'28/02/2018',103)
AND od.user_id = 1
GROUP BY od.user_id
, p.plate_name
It joins the three tables, filters on the delivery_date period, groups the rows by user_id and plate_name, and then calculates the sum of package_amount for each group. Note that convert(datetime, ..., 103) is SQL Server syntax; in MySQL the equivalent would be STR_TO_DATE('01/02/2018', '%d/%m/%Y').
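One thing the queries above do not handle: since John drives a different plate every day, joining on plate_id alone also picks up orders delivered with that plate on days when someone else drove it. A possible sketch, assuming (this is not stated in the question) that orders_drivers.plate_date is the day on which the driver used the plate, is to join on both the plate and the date. It is run through Python's sqlite3 with made-up rows and ISO date strings; SQLite stands in for MySQL.

```python
import sqlite3

# Hypothetical sample data; dates are ISO strings so BETWEEN compares correctly.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (plate_id INTEGER, delivery_date TEXT, package_amount INTEGER);
CREATE TABLE orders_drivers (plate_id INTEGER, plate_date TEXT, user_id INTEGER);
INSERT INTO orders VALUES
  (10, '2018-02-01', 4),
  (10, '2018-02-02', 7),
  (11, '2018-02-02', 3),
  (11, '2018-03-01', 9);
INSERT INTO orders_drivers VALUES
  (10, '2018-02-01', 1),  -- John (user 1) drives plate 10 on 1 Feb
  (11, '2018-02-02', 1),  -- John drives plate 11 on 2 Feb
  (10, '2018-02-02', 2),  -- another driver has plate 10 on 2 Feb
  (11, '2018-03-01', 1);  -- John again, but outside the period
""")

params = (1, '2018-02-01', '2018-02-28')

# Match the order to the driver who had that plate on the delivery day.
by_plate_and_date = """
SELECT SUM(o.package_amount)
FROM orders o
JOIN orders_drivers od
  ON od.plate_id = o.plate_id AND od.plate_date = o.delivery_date
WHERE od.user_id = ? AND o.delivery_date BETWEEN ? AND ?
"""

# For comparison: the plate-only join used in the answers above.
by_plate_only = """
SELECT SUM(o.package_amount)
FROM orders o
JOIN orders_drivers od ON od.plate_id = o.plate_id
WHERE od.user_id = ? AND o.delivery_date BETWEEN ? AND ?
"""

total = conn.execute(by_plate_and_date, params).fetchone()[0]
naive_total = conn.execute(by_plate_only, params).fetchone()[0]
print(total, naive_total)
```

On this data the date-aware join returns 7 for John, while the plate-only join returns 17, because it includes the other driver's delivery and counts one order twice.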

SQL Query doesnt get correct count

I am trying to write a query that returns, for each user, how many orders he made and at how many different stores:
SELECT user_id,display_name,count(store_id) as stores,count(order_id) as orders
FROM `orders` natural join users
where user_id NOT IN (7,766,79)
group by user_id
order by orders desc
It seems to me that this query should do it, but for some reason I get the same values in the orders and stores columns (and I have double-checked that there should be a difference). Can anyone help with that?
BTW, order_id is the primary key; store_id and user_id are foreign keys on orders.
EDIT:
SELECT user_id,display_name,count(distinct store_id) as stores,count(order_id) as orders
FROM `orders` natural join users
where user_id NOT IN (7,766,79)
group by user_id
order by orders desc
worked. Can anyone explain why I have to add the DISTINCT keyword in this case?
RESOLVED:
Using the comments by @McAdam331, I understood that the original query counts the rows with the same user id in both counts. Since order_ids are unique, the count of unique order ids is the same as the row count, while store_id, which can repeat, does not have a distinct count equal to the row count. Therefore using DISTINCT solved the problem.

I would just use COUNT(*) to get number of orders, and COUNT(DISTINCT store_id) to get the number of stores:
SELECT u.user_id, u.display_name, COUNT(*) AS numOrders, COUNT(DISTINCT store_id) AS numStores
FROM orders o
JOIN users u
...
The reason your query fails is because you're omitting DISTINCT on store_id. Because for every row you have a store_id and an order_id, you have the same number of store_ids and order_ids in each group. However, that's not what you're really looking for, you're looking for the number of DISTINCT store_ids.
If it's possible for the user to have the same order_id twice (though it doesn't make sense to me) you can add the DISTINCT keyword in that count function too.
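To see the difference side by side, here is a small sketch run through Python's sqlite3 with made-up rows (SQLite standing in for MySQL; the users table is left out to keep it short). COUNT(store_id) counts every row with a non-NULL store_id, so it always matches COUNT(*) here, while COUNT(DISTINCT store_id) counts each store once.

```python
import sqlite3

# Hypothetical sample data: user 1 has three orders at two stores.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, store_id INTEGER);
INSERT INTO orders VALUES (1, 1, 100), (2, 1, 100), (3, 1, 200), (4, 2, 100);
""")

rows = conn.execute("""
SELECT user_id,
       COUNT(store_id)          AS store_rows,  -- one per row, same as COUNT(*)
       COUNT(DISTINCT store_id) AS num_stores,  -- one per different store
       COUNT(*)                 AS num_orders
FROM orders
GROUP BY user_id
ORDER BY user_id
""").fetchall()
print(rows)  # [(1, 3, 2, 3), (2, 1, 1, 1)]
```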

You've got it backwards. Start with users and join to orders. Something like this:
SELECT users.user_id, display_name,
count(DISTINCT store_id) as store_count,
count(order_id) as order_count
FROM users
LEFT OUTER JOIN orders ON orders.user_id=users.user_id
where users.user_id NOT IN (7,766,79)
group by users.user_id
order by order_count desc;
Note the use of DISTINCT within the store_count because you may have multiple orders from the same store.

Count specific occurence using SQL

I have a problem with a SQL SELECT query. I need to count the orders that belong to accounts which have one or more orders with a cost equal to 1.
Here is the structure:
Could anyone help with the SELECT query? The result should be 2. Many thanks for your help.

You have to make two nested queries against the table: an outer one that counts the number of orders for each account, and an inner one that finds the accounts that have at least one order with a cost equal to 1.
SELECT Account_ID, COUNT(*)
FROM Orders
WHERE Account_ID IN (SELECT Account_ID FROM Orders WHERE PRODUCT_Cost = 1)
GROUP BY Account_ID

MySQL Left Joins

EDIT: OK, think I need to be clearer - I'd like the result to show all the 'names' that appear in the table acme, against the counts (if any) from the results table. Hope that makes sense?
Having a huge issue and my brain isn't working as it should.
All I want to do is, in a single statement via a join, count the number of rows for a common field.
SELECT name, COUNT(name) as Count FROM acme
SELECT name, COUNT(name) as Total FROM results
I'm sure it should be something like this...
SELECT acme.name, COUNT(acme.name) As Count,
COUNT(results.name) as Total
FROM acme
LEFT JOIN results ON acme.name = results.name
GROUP BY acme.name
ORDER BY acme.name
But it doesn't bring back the correct counts.
Thoughts, where am I going wrong...this, I know, will be very very obvious.
H.

From your feedback, this will get what you want. You need to get the unique names and counts from the ACME table first, THEN join that to the results table to count its records; otherwise you end up with a Cartesian result of counts. If ACME had name "X" 5 times and Results had "X" 20 times, your total would be 100. The query below instead returns a single row showing "X", 5, 20, which is what it appears you are looking for (one row for each name that exists in ACME).
I've changed to a LEFT join so that names in the ACME table that DO NOT exist in the RESULTS table are not dropped from your final answer. COUNT( r.Name ) counts only matched rows, so those names show 0.
select
JustACME.Name,
JustACME.NameCount,
COUNT( r.Name ) as CountFromResultsTable
from
( select a.Name,
count(*) as NameCount
from
acme a
group by
a.Name ) JustACME
LEFT JOIN results r
on JustACME.Name = r.Name
group by
JustACME.Name,
JustACME.NameCount

It looks like it's because of the join, it's screwing with your counts. Try running the join with SELECT * FROM... and look at the resulting table. The problem should be obvious from there. =D

Yes, your join (inner or outer, doesn't matter) is messing with your results.
In fact, it is likely returning the product of rows with the same name, rather than the sum.
What you want to do is sum the rows from the first table, sum the rows from the second table, and join that.
Like this:
Select name, a.cnt as Count, r.cnt as Total
From (select name, count(*) as cnt from acme group by name) a
Left join (select name, count(*) as cnt from results group by name) r using (name)
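The product-versus-sum effect is easy to reproduce. The sketch below uses Python's sqlite3 with made-up rows (SQLite standing in for MySQL): "X" appears 2 times in acme and 3 times in results, "Y" appears only in acme. The cnt alias and the COALESCE for unmatched names are additions for this illustration.

```python
import sqlite3

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE acme (name TEXT);
CREATE TABLE results (name TEXT);
INSERT INTO acme VALUES ('X'), ('X'), ('Y');
INSERT INTO results VALUES ('X'), ('X'), ('X');
""")

# Joining the raw tables pairs every acme row with every matching results row.
naive = conn.execute("""
SELECT acme.name, COUNT(acme.name), COUNT(results.name)
FROM acme
LEFT JOIN results ON acme.name = results.name
GROUP BY acme.name
ORDER BY acme.name
""").fetchall()

# Counting each table on its own first, then joining one row per name.
fixed = conn.execute("""
SELECT a.name, a.cnt, COALESCE(r.cnt, 0)
FROM (SELECT name, COUNT(*) AS cnt FROM acme GROUP BY name) a
LEFT JOIN (SELECT name, COUNT(*) AS cnt FROM results GROUP BY name) r
  ON r.name = a.name
ORDER BY a.name
""").fetchall()

print(naive)  # [('X', 6, 6), ('Y', 1, 0)]
print(fixed)  # [('X', 2, 3), ('Y', 1, 0)]
```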

I do not see why you forbid using two statements; this just complicates everything.
The only reason I see for this is to get the two results into one answer.
I do not know if the latter would work, but I would try this:
SET @acount = (SELECT count(DISTINCT name) FROM acme);
SELECT count(DISTINCT name) as Total, @acount as Count FROM results;
I would post this as one query and (hopefully) get back the correct results. Let me note that it is not clear from your question whether you want to know how often every name repeats or whether you want to count unique names.

MYSQL Count group by rows ignoring effect of JOIN and SUM fields on Joined tables

I have 3 tables:
Orders
- id
- customer_id
Details
- id
- order_id
- product_id
- ordered_qty
Parcels
- id
- detail_id
- batch_code
- picked_qty
Orders have multiple Details rows, a detail row per product.
A detail row has multiple parcels, as 10'000 ordered qty may come from 6 different batches, so goods from batches are packed and shipped separately. The picked quantity put in each parcel for a detail row should then be the same as the ordered_qty.
... hope that makes sense.
I'm struggling to write a query that provides summary information for all of this.
I need to Group By customer_id to provide a row of data per customer.
That row should contain
Their total number of orders
Their total ordered_qty of goods across all orders
Their total picked_qty of goods across all orders
I can get the first one with:
SELECT customer_id, COUNT(*) as number_of_orders
FROM Orders
GROUP BY Orders.customer_id
But when I LEFT JOIN the other two tables and add the
SELECT ..... SUM(Details.ordered_qty) AS total_qty_ordered,
SUM(Parcels.picked_qty) AS total_qty_picked
.. then I get results that don't seem to add up for the quantities, and the COUNT(*) seems to include the additional rows from the JOIN, which obviously then isn't giving me the number of Orders anymore.
Not sure what to try next.
===== EDIT =======
Here's the query I tried:
SELECT
customer_id,
COUNT(*) as number_of_orders,
SUM(Details.ordered_qty) AS total_qty_ordered,
SUM(Parcels.picked_qty) AS total_qty_picked
FROM Orders
LEFT JOIN Details ON Details.order_id=Orders.id
LEFT JOIN Parcels ON Parcels.detail_id=Details.id
GROUP BY Orders.customer_id

Try COUNT(DISTINCT Orders.id) AS number_of_orders, and move the parcel total into a subselect so that the Parcels rows no longer multiply the Details rows, as in
SELECT
Orders.customer_id,
COUNT(DISTINCT Orders.id) AS number_of_orders,
SUM(Details.ordered_qty) AS total_qty_ordered,
SUM((SELECT SUM(Parcels.picked_qty)
FROM Parcels WHERE Parcels.detail_id=Details.id)) AS total_qty_picked
FROM Orders
LEFT JOIN Details ON Details.order_id=Orders.id
GROUP BY Orders.customer_id
EDIT: added another select with a subselect
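An alternative sketch that avoids both DISTINCT and the correlated subselect: aggregate each child table down to one row per parent before joining, so no join ever multiplies rows. It is run through Python's sqlite3 with made-up rows (SQLite standing in for MySQL); the aliases and the COALESCE defaults for orders without details or parcels are choices made for this illustration.

```python
import sqlite3

# Hypothetical sample data: customer 1 has two orders, customer 2 has one
# order with no detail rows; detail 3 has no parcels yet.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
CREATE TABLE Details (id INTEGER PRIMARY KEY, order_id INTEGER,
                      product_id INTEGER, ordered_qty INTEGER);
CREATE TABLE Parcels (id INTEGER PRIMARY KEY, detail_id INTEGER,
                      batch_code TEXT, picked_qty INTEGER);
INSERT INTO Orders VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO Details VALUES (1, 1, 500, 10), (2, 1, 501, 20), (3, 2, 500, 5);
INSERT INTO Parcels VALUES (1, 1, 'A', 6), (2, 1, 'B', 4), (3, 2, 'A', 20);
""")

query = """
SELECT o.customer_id,
       COUNT(*)                        AS number_of_orders,
       COALESCE(SUM(d.ordered_qty), 0) AS total_qty_ordered,
       COALESCE(SUM(d.picked_qty), 0)  AS total_qty_picked
FROM Orders o
LEFT JOIN (
    -- one row per order: its ordered and picked totals
    SELECT dt.order_id,
           SUM(dt.ordered_qty)            AS ordered_qty,
           SUM(COALESCE(p.picked_qty, 0)) AS picked_qty
    FROM Details dt
    LEFT JOIN (
        -- one row per detail: its picked total across parcels
        SELECT detail_id, SUM(picked_qty) AS picked_qty
        FROM Parcels
        GROUP BY detail_id
    ) p ON p.detail_id = dt.id
    GROUP BY dt.order_id
) d ON d.order_id = o.id
GROUP BY o.customer_id
ORDER BY o.customer_id
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(1, 2, 35, 30), (2, 1, 0, 0)]
```

Because the outer join sees exactly one row per order, COUNT(*) is again the number of orders and the quantities are each summed once.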

Is there any particular reason you feel the need to combine all these in one query? Simplify by breaking it up in to separate queries, and if you want a single call to get the results, put the queries in a stored procedure, using temp tables.
