SQL query doesn't return the correct count - MySQL

I am trying to write a query that returns, for each user, how many orders they made and at how many different stores:
SELECT user_id,display_name,count(store_id) as stores,count(order_id) as orders
FROM `orders` natural join users
where user_id NOT IN (7,766,79)
group by user_id
order by orders desc
It seems to me that this query should do it, but for some reason I get the same values in the orders and stores columns (and I have double-checked that there should be a difference). Can anyone help with that?
BTW, order_id is primary key, store_id and user_id are foreign keys on orders.
EDIT:
SELECT user_id,display_name,count(distinct store_id) as stores,count(order_id) as orders
FROM `orders` natural join users
where user_id NOT IN (7,766,79)
group by user_id
order by orders desc
worked. Can anyone explain why I have to add the DISTINCT keyword in this case?
RESOLVED:
Thanks to the comments by @McAdam331, I understood that the original query counts the rows with the same user_id in both counts. Since order_ids are unique, the count of order_ids equals the row count, and COUNT(store_id) also counts every row even though store_id can repeat. Using DISTINCT makes the store count cover only distinct stores, which solved the problem.

I would just use COUNT(*) to get number of orders, and COUNT(DISTINCT store_id) to get the number of stores:
SELECT u.user_id, u.display_name, COUNT(*) AS numOrders, COUNT(DISTINCT store_id) AS numStores
FROM orders o
JOIN users u
...
The reason your query fails is that you're omitting DISTINCT on store_id. Every row has both a store_id and an order_id, so each group contains the same number of store_ids and order_ids. That's not what you're really looking for, though: you want the number of DISTINCT store_ids.
If it's possible for the user to have the same order_id twice (though it doesn't make sense to me) you can add the DISTINCT keyword in that count function too.
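To see the difference between COUNT(store_id) and COUNT(DISTINCT store_id) on concrete rows, here is a minimal sketch. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for MySQL, and the sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical sample data; an in-memory SQLite database stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, store_id INTEGER);
    INSERT INTO orders (order_id, user_id, store_id) VALUES
        (1, 10, 100),  -- user 10: three orders at two stores
        (2, 10, 100),
        (3, 10, 200),
        (4, 20, 300);  -- user 20: one order at one store
""")

# COUNT(col) counts every non-NULL value in the group, repeats included;
# COUNT(DISTINCT col) counts each different value once.
count_rows = conn.execute("""
    SELECT user_id,
           COUNT(store_id)          AS stores_plain,
           COUNT(DISTINCT store_id) AS stores_distinct,
           COUNT(order_id)          AS orders
    FROM orders
    GROUP BY user_id
    ORDER BY user_id
""").fetchall()

for row in count_rows:
    print(row)
```

For user 10, stores_plain comes out as 3 (the same as orders), while stores_distinct is 2.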

You've got it backwards. Start with users and join to orders. Something like this:
SELECT users.user_id, display_name,
count(DISTINCT store_id) as store_count,
count(order_id) as order_count
FROM users
LEFT OUTER JOIN orders ON orders.user_id=users.user_id
where users.user_id NOT IN (7,766,79)
group by users.user_id
order by order_count desc;
Note the use of DISTINCT within the store_count because you may have multiple orders from the same store.
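Starting from users with a LEFT OUTER JOIN also keeps users who have never ordered, with both counts at 0. A small sketch of that behaviour, again assuming SQLite via Python's sqlite3 as a stand-in for MySQL and using invented sample rows:

```python
import sqlite3

# Hypothetical sample data; user 30 has no orders at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, display_name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, store_id INTEGER);
    INSERT INTO users VALUES (10, 'ann'), (20, 'bob'), (30, 'cy');
    INSERT INTO orders VALUES (1, 10, 100), (2, 10, 100), (3, 10, 200), (4, 20, 300);
""")

# LEFT JOIN keeps every user; for user 30 the order columns are NULL,
# and COUNT over NULLs yields 0.
left_join_rows = conn.execute("""
    SELECT u.user_id, u.display_name,
           COUNT(DISTINCT o.store_id) AS store_count,
           COUNT(o.order_id)          AS order_count
    FROM users u
    LEFT OUTER JOIN orders o ON o.user_id = u.user_id
    GROUP BY u.user_id, u.display_name
    ORDER BY order_count DESC, u.user_id
""").fetchall()

# The same query with an inner join silently drops user 30.
inner_join_user_ids = [row[0] for row in conn.execute("""
    SELECT u.user_id
    FROM users u
    JOIN orders o ON o.user_id = u.user_id
    GROUP BY u.user_id
    ORDER BY u.user_id
""")]
```

Note that COUNT(o.order_id) is used rather than COUNT(*): COUNT(*) would report 1 for a user without orders, because the outer join still produces one row for them.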

Related

count rows in one table based on the id's of the other table

I have two tables:
Orders
Orders product
I currently have the query that counts the amounts of orders:
SELECT COUNT(*) as total FROM orders WHERE DATE(`created_on`) = CURDATE()
This gives me the amount of orders from today
Now I want to change this query so that, instead of counting the orders, I count the number of products that were ordered.
The order_products table is like this:
-id
-order_id
-product_name
-product_weight
etc etc
I am only interested in the order_id because that links this table to the orders table. For every product that is ordered, 1 row is added in this table linked to the order with the order_id.
Is it possible to have one single query to select the count of that, or is that not possible? I think I have to use a LEFT JOIN for this operation, but I cannot seem to find it.
You just need an inner join between those tables on the foreign key. Don't group by op.id: that would return one row per product, each with a count of 1, instead of a single total.
SELECT COUNT(*) AS total FROM orders o INNER JOIN order_products op ON op.order_id = o.order_id WHERE DATE(o.`created_on`) = CURDATE()
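A minimal sketch of the join-based count, assuming SQLite via Python's sqlite3 as a stand-in for MySQL. The rows are invented, and a fixed date parameter replaces CURDATE() so the result is reproducible. It also runs the variant grouped by op.id for comparison, which returns one row per product rather than a single total.

```python
import sqlite3

# Hypothetical sample data: two orders on 2024-05-01 (3 products in total)
# and one order on the previous day (3 products).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, created_on TEXT);
    CREATE TABLE order_products (id INTEGER PRIMARY KEY, order_id INTEGER, product_name TEXT);
    INSERT INTO orders VALUES
        (1, '2024-05-01 09:00:00'),
        (2, '2024-05-01 17:30:00'),
        (3, '2024-04-30 12:00:00');
    INSERT INTO order_products VALUES
        (1, 1, 'apples'), (2, 1, 'pears'),
        (3, 2, 'plums'),
        (4, 3, 'figs'), (5, 3, 'dates'), (6, 3, 'limes');
""")
today = "2024-05-01"  # stands in for CURDATE()

join_sql = """
    SELECT COUNT(*) AS total
    FROM orders o
    INNER JOIN order_products op ON op.order_id = o.order_id
    WHERE DATE(o.created_on) = ?
"""
# Without GROUP BY: a single row holding the number of products ordered that day.
total_products = conn.execute(join_sql, (today,)).fetchone()[0]

# With GROUP BY op.id: one row per product, each with a count of 1.
per_product_rows = conn.execute(join_sql + " GROUP BY op.id", (today,)).fetchall()
```

Here total_products is 3, while the grouped variant returns three separate rows.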

Optimising MySql Query with LEFT JOINS

I am trying to get a list of customers who haven't ordered for 6 months or more. I have 4 tables which I have used in the query:
accounts (account_id)
stores (store_id, account_id)
customers (store_id, customer_id)
orders (order_id, customer_id, store_id)
The customers and orders tables are very big, 3M and 26M rows respectively, so using left joins in my query makes the query time extremely long. I believe I have indexed my tables correctly.
Here is the query I have used:
SELECT cus.customer_id, MAX(o.order_date), cus.store_id, s.account_id, store_name
FROM customers cus
LEFT JOIN stores s ON s.store_id=cus.store_id
LEFT JOIN orders o ON o.customer_id=cus.customer_id AND o.store_id=cus.store_id
WHERE account_id=26 AND
(SELECT order_id
FROM orders o
WHERE o.customer_id=cus.customer_id
AND o.store_id=cus.store_id
AND o.order_date < CURRENT_DATE() - INTERVAL 6 MONTH
ORDER BY order_id DESC LIMIT 0,1) IS NOT NULL
GROUP BY cus.customer_id, cus.client_id;
I need to get the last order date and this is the reason why I have joined the orders table, however since the customers can have multiple orders it is returning multiple rows of the customer and that is why I have used the group by clause.
Any help with my query would be appreciated.
Start with this:
SELECT customer_id, MAX(order_date) AS last_order_date
FROM orders
GROUP BY customer_id
HAVING last_order_date < NOW() - INTERVAL 6 MONTH;
Assuming that gives you the relevant customer_ids, then move on to
SELECT ...
FROM ( that-select-as-a-subquery ) AS old
JOIN other-tables-as-needed USING (customer_id)
If necessary, JOIN back to orders to get more info. Do not try to get other columns in that subquery. (That's a "groupwise max" problem.)
Your strategy of using an ordered and limited subquery on your orders table is probably responsible for your poor performance.
This subquery will generate a virtual table showing the date of the most recent order for each distinct customer. (I guess a distinct customer is distinguished by the pair customer_id, store_id).
SELECT MAX(order_date) recent_order_date,
customer_id, store_id
FROM orders
GROUP BY customer_id, store_id
Then, you can use that subquery as if it were a table in your query.
SELECT cus.customer_id, summary.recent_order_date,
cus.store_id, s.account_id, store_name
FROM customers cus
JOIN stores s ON s.store_id=cus.store_id
JOIN (
SELECT MAX(order_date) recent_order_date,
customer_id, store_id
FROM orders
GROUP BY customer_id, store_id
) summary ON summary.customer_id = cus.customer_id
AND summary.store_id = s.store_id
WHERE summary.recent_order_date < CURRENT_DATE - INTERVAL 6 MONTH
AND s.account_id = 26
This approach moves the GROUP BY to an inner query, and eliminates the wasteful ORDER BY ... LIMIT query pattern. The inner query doesn't have to be remade for every row in the outer query.
I don't understand why you used LEFT JOIN operations in your query.
And, by the way, most people, when they're new to SQL, don't have great intuition about which indexes are useful and which aren't. So, when asking for help, it's always good to show your indexes. In the meantime, read this:
http://use-the-index-luke.com/
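The derived-table approach described above can be tried on a toy dataset. This is only a sketch: it assumes SQLite via Python's sqlite3 as a stand-in for MySQL, the rows are invented, and a fixed cutoff date is passed as a parameter in place of CURRENT_DATE - INTERVAL 6 MONTH. It says nothing about performance on 26M rows, only about the shape of the result.

```python
import sqlite3

# Hypothetical sample data for the four-table schema in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stores (store_id INTEGER PRIMARY KEY, account_id INTEGER, store_name TEXT);
    CREATE TABLE customers (customer_id INTEGER, store_id INTEGER);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                         store_id INTEGER, order_date TEXT);
    INSERT INTO stores VALUES (1, 26, 'North'), (2, 99, 'South');
    INSERT INTO customers VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO orders VALUES
        (1, 1, 1, '2023-01-10'),
        (2, 1, 1, '2023-03-05'),  -- customer 1: last order is old
        (3, 2, 1, '2023-02-01'),
        (4, 2, 1, '2024-04-20'),  -- customer 2: ordered recently
        (5, 3, 2, '2022-12-01');  -- customer 3: old, but other account
""")
cutoff = "2023-11-01"  # stands in for CURRENT_DATE - INTERVAL 6 MONTH

# The GROUP BY runs once in the derived table; the outer query only filters.
lapsed_customers = conn.execute("""
    SELECT cus.customer_id, summary.recent_order_date,
           cus.store_id, s.account_id, s.store_name
    FROM customers cus
    JOIN stores s ON s.store_id = cus.store_id
    JOIN (
        SELECT MAX(order_date) AS recent_order_date, customer_id, store_id
        FROM orders
        GROUP BY customer_id, store_id
    ) summary ON summary.customer_id = cus.customer_id
             AND summary.store_id = s.store_id
    WHERE summary.recent_order_date < ?
      AND s.account_id = 26
""", (cutoff,)).fetchall()
```

Only customer 1 is returned, with a single row carrying the last order date, so no outer GROUP BY is needed.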

SQL query is not retrieving all the fields

I have two tables in my database. The first one (participants) looks like this:
And I have another called votes in which I can vote for any participants.
So my problem is that I'm trying to get all the votes of each participant, but when I execute my query it only retrieves four rows sorted by the COUNT of votes, and the remaining participants do not appear in the result:
SELECT COUNT(DISTINCT `votes`.`id`) AS count_id, participants.name
AS participant_name FROM `participants` LEFT OUTER JOIN `votes` ON
`votes`.`participant_id` = `participants`.`id` GROUP BY votes.participant_id ORDER BY
votes.participant_id DESC;
Retrieves:
I think the problem is that you're grouping by votes.participant_id, rather than participants.id, which limits you to participants with votes, the outer join notwithstanding. Check out http://sqlfiddle.com/#!2/c5d3d/5/0
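A small sketch of the effect, assuming SQLite via Python's sqlite3 as a stand-in for MySQL and made-up rows. Participants without votes all have a NULL votes.participant_id after the outer join, so grouping on that column folds them into a single group.

```python
import sqlite3

# Hypothetical sample data: four participants, two of them without votes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE participants (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE votes (id INTEGER PRIMARY KEY, participant_id INTEGER);
    INSERT INTO participants VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cleo'), (4, 'Dov');
    INSERT INTO votes VALUES (1, 1), (2, 1), (3, 2);
""")

select_from = """
    SELECT COUNT(DISTINCT v.id) AS count_id, p.name AS participant_name
    FROM participants p
    LEFT OUTER JOIN votes v ON v.participant_id = p.id
"""

# Grouping on the votes side: Cleo and Dov share the NULL group.
by_vote_fk = conn.execute(select_from + " GROUP BY v.participant_id").fetchall()

# Grouping on the participants side: one row per participant.
by_participant = conn.execute(
    select_from + " GROUP BY p.id, p.name ORDER BY count_id DESC, p.name"
).fetchall()
```

The first query returns three rows for four participants, and the second returns all four, with a count of 0 for those without votes.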
As I understand your query, you are counting the unique ids from the votes table, so I assume your id column is not an identity column (it would be better if it were). If so, here is my answer: replace your SELECT with this.
SELECT COUNT(votes.participant_id) AS count_id, participants.name AS participant_name
FROM participants
-- LEFT JOIN keeps participants that have no votes yet
LEFT JOIN votes
ON participants.id = votes.participant_id
GROUP BY participants.name
ORDER BY count_id
just let me know if it works
cheers

Prevent duplicates in mysql group by with join statement

I've got a problem that can't be new, but I can't figure out how to get the answer I want. It is probably something simple that I'm missing.
Using MySQL 5.5, I have 2 tables, 'referrals' and 'status'. I want to count referrals that have been cancelled, grouped by appt_date:
SELECT SUM(1) AS count, appt_date FROM referrals
JOIN status ON referrals.id=status.referral_id
WHERE status.status_name="cancelled"
GROUP BY appt_date
This works fine until I have a referral that gets cancelled twice. In other words, a referral with a given id that has 2 rows in the status table with matching referral_id will get counted twice.
How to count each referral record only once when doing the join operation here?
EDIT:
My real question should have been this:
SELECT SUM(quantity) AS count, appt_date FROM referrals
JOIN status ON referrals.id=status.referral_id
WHERE status.status_name="cancelled"
GROUP BY appt_date
since each referral can have its own quantity.
Try changing count to count(DISTINCT id)
So count was only a label, I should have seen that :)
The temptation is to use SUM(DISTINCT quantity), but that would just remove duplicate quantity values (two different referrals with the same quantity would be counted once), which isn't what you want. It looks like you will need to join on a derived table. Have a look at this:
http://www.sqlteam.com/article/how-to-use-group-by-with-distinct-aggregates-and-derived-tables
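One way the derived-table idea could look, as a sketch only: it assumes SQLite via Python's sqlite3 as a stand-in for MySQL, and the rows and the alias `c` are invented. The derived table reduces status to one row per cancelled referral before the join, so each quantity is summed once.

```python
import sqlite3

# Hypothetical sample data: referral 1 was cancelled twice.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE referrals (id INTEGER PRIMARY KEY, appt_date TEXT, quantity INTEGER);
    CREATE TABLE status (id INTEGER PRIMARY KEY, referral_id INTEGER, status_name TEXT);
    INSERT INTO referrals VALUES
        (1, '2024-05-01', 5),
        (2, '2024-05-01', 2),
        (3, '2024-05-02', 4);
    INSERT INTO status VALUES
        (1, 1, 'cancelled'),
        (2, 1, 'cancelled'),
        (3, 2, 'cancelled'),
        (4, 3, 'booked');
""")

# Direct join: referral 1 matches two status rows, so its quantity is added twice.
naive_sums = conn.execute("""
    SELECT r.appt_date, SUM(r.quantity) AS total
    FROM referrals r
    JOIN status s ON r.id = s.referral_id
    WHERE s.status_name = 'cancelled'
    GROUP BY r.appt_date
""").fetchall()

# Derived table: at most one row per cancelled referral before joining.
deduped_sums = conn.execute("""
    SELECT r.appt_date, SUM(r.quantity) AS total
    FROM referrals r
    JOIN (
        SELECT DISTINCT referral_id
        FROM status
        WHERE status_name = 'cancelled'
    ) c ON c.referral_id = r.id
    GROUP BY r.appt_date
""").fetchall()
```

With this data the direct join reports 12 for 2024-05-01, while the derived-table version reports 7 (5 + 2).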

MySQL multiple row joining

I have an issue with joining of tables that I have not managed to solve. Somehow I have the impression that it is more simple than I think. Anyhow:
I have three tables:
orders
orderlines
payments
In "orders" every line corresponds to one order made by a customer. Every order has an order_id which is the primary key for that table. In "orderlines" I keep the content of the order, that is references to the products and services on the order. One order can, and typically has, many orderlines. Finally, in payments I store one row for every transaction made to pay for an order.
One order ideally never has more than one corresponding row in payments. But since customers are customers it is not unusual that someone pays the same invoice twice, hinting that the payments table can have two or more payments for one order.
Therefore it would be useful to create a query that joins all three tables in a relevant way, but I have not managed to do so. For instance:
SELECT orders.order_id, SUM(orderlines.amount), SUM(payments.amount)
FROM orders
LEFT JOIN orderlines
ON orders.order_id = orderlines.order_id
LEFT JOIN payments
ON orders.order_id = payments.order_id
GROUP BY orders.order_id
The purpose of this join is to find out if the SUM of the products on the order equals the SUM in payments. The problem here is that the two tables payments and orderlines "distract" each other by both causing multiple rows while joining.
Is there a simple solution to this problem?
Maybe I'm overcomplicating things, but joining both tables and then summing would always lead to wrong results. For example, if one order has 10 orderline rows and 1 payment row, the payment amount is going to be added 10 times. I guess you have to use subselects like the ones below (you didn't use anything from your table "orders" but the id, so I left it out, because all orders have orderlines):
SELECT t1.order_id, t1.OrderAmount, t2.PaymentAmount
FROM (SELECT SUM(amount) AS OrderAmount, order_id
FROM orderlines
GROUP BY order_id) AS t1
LEFT JOIN (SELECT SUM(amount) AS PaymentAmount, order_id
FROM payments
GROUP BY order_id) AS t2
ON t1.order_id=t2.order_id
I think what you want to do is get the sum of all the items, and the sum of all the payments, and then link them together. A sub-select is able to do this.
Something like: (ps I have no database on hand so it might not be valid sql)
SELECT * FROM orders
LEFT JOIN (SELECT order_id, SUM(amount) FROM orderlines GROUP BY order_id) AS ordersums
ON orders.order_id = ordersums.order_id
LEFT JOIN (SELECT order_id, SUM(amount) FROM payments GROUP BY order_id) AS paymentsums
ON orders.order_id = paymentsums.order_id;
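To make the row multiplication concrete, here is a minimal sketch comparing the direct three-table join with the pre-aggregated subselects. It assumes SQLite via Python's sqlite3 as a stand-in for MySQL; the amounts and the column aliases order_total and payment_total are invented for illustration.

```python
import sqlite3

# Hypothetical sample data: order 1 has three lines and was paid twice,
# order 2 has one line and no payment.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY);
    CREATE TABLE orderlines (order_id INTEGER, amount INTEGER);
    CREATE TABLE payments (order_id INTEGER, amount INTEGER);
    INSERT INTO orders VALUES (1), (2);
    INSERT INTO orderlines VALUES (1, 10), (1, 20), (1, 30), (2, 15);
    INSERT INTO payments VALUES (1, 60), (1, 60);
""")

# Direct join: 3 orderlines x 2 payments = 6 rows for order 1,
# so every amount is summed more than once.
naive_totals = conn.execute("""
    SELECT o.order_id, SUM(ol.amount), SUM(p.amount)
    FROM orders o
    LEFT JOIN orderlines ol ON o.order_id = ol.order_id
    LEFT JOIN payments p ON o.order_id = p.order_id
    GROUP BY o.order_id
    ORDER BY o.order_id
""").fetchall()

# Pre-aggregated: each subselect yields one row per order before the join.
correct_totals = conn.execute("""
    SELECT o.order_id, ordersums.order_total, paymentsums.payment_total
    FROM orders o
    LEFT JOIN (SELECT order_id, SUM(amount) AS order_total
               FROM orderlines GROUP BY order_id) AS ordersums
           ON o.order_id = ordersums.order_id
    LEFT JOIN (SELECT order_id, SUM(amount) AS payment_total
               FROM payments GROUP BY order_id) AS paymentsums
           ON o.order_id = paymentsums.order_id
    ORDER BY o.order_id
""").fetchall()
```

For order 1 the direct join gives 120 and 360, while the pre-aggregated version gives the real totals of 60 ordered and 120 paid, which exposes the double payment. Order 2 shows NULL (None in Python) for the payment total because nothing has been paid.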