MySQL Query - listing data from 2 tables where NOT EXISTS - mysql

I am trying to calculate point usage by customers of an online store. Over time, customers acquire points. As points are redeemed, the value of the customer.points is modified to reflect points remaining to redeem. Any additional points acquired are also added to customer.points. Because of this, the only true mechanism for determining the number of points a customer has had over the lifetime of an account is to SUM total usage with remaining points (order.total + customer.points).
The query below returns the desired results, BUT ONLY FOR THOSE CUSTOMERS WHO HAVE REDEEMED POINTS. What I would like, since ALL customers have points, is to also be able to return the points balance for those WHO HAVE NOT REDEEMED points.
SELECT customer.store_id, customer.customer_id, `order`.total + customer.points
AS allpoints, customer.firstname, customer.lastname
FROM `order`
INNER JOIN customer ON `order`.customer_id = customer.customer_id
WHERE (customer.store_id =3)
GROUP BY customer.customer_id

It sounds like you need a left outer join, which returns all rows from the table on the left side of the join and returns rows from the table on the right side only where matching records exist. This means you'll need to handle NULL values from the order table when a customer has no orders.
SELECT
customer.store_id,
customer.customer_id,
-- SUM adds up all of a customer's orders; IFNULL turns the NULL for customers without orders into 0
IFNULL(SUM(`order`.total), 0) + customer.points AS allpoints,
customer.firstname,
customer.lastname
FROM customer
LEFT OUTER JOIN `order` ON `order`.customer_id = customer.customer_id
WHERE customer.store_id = 3
GROUP BY customer.customer_id
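
To see the difference between the two joins on a small data set, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The two customers, their points and the two orders are made-up values, and it assumes a customer can have several orders whose totals should be summed.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, store_id INTEGER, points INTEGER);
CREATE TABLE "order" (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER);
INSERT INTO customer VALUES (1, 3, 50), (2, 3, 20);
INSERT INTO "order" VALUES (10, 1, 30), (11, 1, 5);
""")

# INNER JOIN: customer 2 has no orders and drops out of the result.
inner_rows = conn.execute("""
SELECT c.customer_id, SUM(o.total) + c.points
FROM "order" o
INNER JOIN customer c ON o.customer_id = c.customer_id
WHERE c.store_id = 3
GROUP BY c.customer_id
ORDER BY c.customer_id
""").fetchall()

# LEFT JOIN from customer: every customer is kept, and COALESCE
# replaces the NULL sum for customers without orders with 0.
left_rows = conn.execute("""
SELECT c.customer_id, COALESCE(SUM(o.total), 0) + c.points
FROM customer c
LEFT JOIN "order" o ON o.customer_id = c.customer_id
WHERE c.store_id = 3
GROUP BY c.customer_id
ORDER BY c.customer_id
""").fetchall()

print(inner_rows)  # [(1, 85)]
print(left_rows)   # [(1, 85), (2, 20)]
```

Customer 2 only shows up in the second result, with the points balance alone as the lifetime total.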

Related

Error Code 1111. Invalid use of group function in MySQL

The following image is the ER diagram of the database:
My task is to create a report that includes the following:
the storeID
the store name
the number of unique players that have purchased a badge from the store
the number of unique players that have not purchased a badge from the store
the total money spent at the store
the most expensive badge a player has purchased at the store
the cheapest badge a player has purchased at the store
the average price of the items that have been purchased at the store.
But when I am trying to execute the following SQL command, I am getting an error saying: Error Code 1111. Invalid use of group function.
use treasurehunters;
select rpt_category.storeId,
rpt_category.storeName,
total_purchased_user,
non_purchased_player,
total_spent,
expensive_badge,
cheapest_badge
avg_spent
from
(select badgename as expensive_badge
from badge
inner join purchase
where cost = max(cost)) rpt_data_2
inner join
(select badgename as cheapest_badge
from badge
inner join purchase
where cost = min(cost)) rpt_data_3
inner join
(select distinct count(username) as total_purchased_user,
storeid,
storename,
sum(cost) as total_spent,
average(cost) as avg_spent
from player
inner join purchase
inner join store
inner join badge
on store.storeID = purchase.storeID and
purchase.username= player.username and
purchase.badgeID = badge.badgeId) rpt_category
inner join
(select count (username) as non_purchased_player,
storeid
from player
inner join purchase
on purchase.storeid != store.storeid and
player.userername= purchase.uername ) rpt_data_1;
Now, what can I do to get rid of that error?
The immediate trigger for error 1111 is the use of aggregate functions in a WHERE clause (where cost = max(cost) and where cost = min(cost)), which MySQL does not allow. More broadly, you're implying a store-level grouping without explicitly grouping on that column with a GROUP BY clause, so the aggregates are computed across the whole joined table rather than per store.
Adding GROUP BY store.storeID to your subqueries would be a start. However, there's a lot more wrong with this query (average(), for example, is not a MySQL function; the function is AVG()), which makes it impractical to diagnose and patch piece by piece.
This is all doable in a single query / grouping. Here's what your query should look like:
SELECT
store.storeID,
MAX(store.storeName) AS storeName,
COUNT(DISTINCT purchase.username) AS total_purchased_user,
MAX(player_count.players) - COUNT(DISTINCT purchase.username) AS non_purchased_user,
SUM(purchase.cost) AS total_spent,
AVG(purchase.cost) AS avg_spent,
SUBSTRING(MIN(CONCAT(LPAD(purchase.cost, 11, '0'), badge.badgeName)), 12) AS cheapest_badge,
SUBSTRING(MAX(CONCAT(LPAD(purchase.cost, 11, '0'), badge.badgeName)), 12) AS expensive_badge
FROM store
LEFT JOIN purchase ON store.storeID = purchase.storeID
LEFT JOIN badge ON purchase.badgeID = badge.badgeId
CROSS JOIN (SELECT COUNT(*) AS players FROM player) AS player_count
GROUP BY store.storeID;
What's happening here (working bottom-up):
GROUP BY store.storeID ensures the results are aggregated per store, so every other metric is calculated per store
FROM store / LEFT JOIN all other tables ensures we get metrics for every store, whether or not there are purchases for it
CROSS JOIN (SELECT COUNT(*) FROM player) is a hack that puts the total number of players on every row, so we can subtract each store's purchasing-player count from it to get the "didn't purchase" count simply and quickly, without any additional joins
COUNT(DISTINCT purchase.username) ensures that user counts are taken from purchases. This also means we don't have to join on the player table in the main portion of the query to get purchase counts.
SUM / AVG work like you had them (note that the MySQL function is AVG, not AVERAGE)
SUBSTRING(MIN(CONCAT... these calculations are using Scalar-Aggregate Reduction, a technique I invented to prevent the need for self-joining a query to get associated min/max values. There's more on this technique here: SQL Query to get column values that correspond with MAX value of another column?
Cheers!
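
For anyone curious how the SUBSTRING(MIN(CONCAT(LPAD(... expressions behave, here is a small sketch using Python's built-in sqlite3 module rather than MySQL. SQLite's printf('%011d', ...) and || operator stand in for LPAD and CONCAT, the costs are assumed to be whole numbers, and the rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE purchase (storeID INTEGER, badgeName TEXT, cost INTEGER);
INSERT INTO purchase VALUES
  (1, 'Bronze', 5), (1, 'Gold', 120), (1, 'Silver', 40), (2, 'Gold', 120);
""")

# Zero-pad the cost to 11 characters and append the badge name, so that
# comparing the combined strings orders them by cost. MIN/MAX then pick the
# whole string, and substr(..., 12) strips the 11-character cost prefix.
rows = conn.execute("""
SELECT storeID,
       substr(MIN(printf('%011d', cost) || badgeName), 12) AS cheapest_badge,
       substr(MAX(printf('%011d', cost) || badgeName), 12) AS expensive_badge
FROM purchase
GROUP BY storeID
ORDER BY storeID
""").fetchall()

# Without padding the comparison is alphabetical, so '5Bronze' beats '120Gold'.
unpadded = conn.execute(
    "SELECT MAX(cost || badgeName) FROM purchase WHERE storeID = 1"
).fetchone()[0]

print(rows)      # [(1, 'Bronze', 'Gold'), (2, 'Gold', 'Gold')]
print(unpadded)  # 5Bronze
```

The padding is what makes the string comparison agree with the numeric one. Costs with decimals or more than 11 digits would need a different padding scheme.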

Is there a SQL function to return rows only if row value is not found based on another condition

Tried doing this a few times and not getting success. I have 2 tables orders and shipments. Order number is unique in the orders table but the value in the shipments table references order.number which can occur more than 1 time as an order can have multiple shipments.
So table orders has order.number and shipments has shipments.number, shipments.order_number and shipments.stock_location
I would like to return only the order.number values where no shipment within that order was shipped from a particular stock location. If I apply a WHERE condition, it just removes the rows tied to the individual shipments.number and doesn't take into account that the same order.number could have another shipment where shipments.stock_location did ship from my excluded warehouse.
If it makes more sense, here is my full code, which is not working. What I'm attempting to do is create a mailing list of every order that was fully drop shipped on all of its shipments.
select orders.number, addresses.firstname,
addresses.lastname, addresses.address1, addresses.address2,
addresses.city, states.abbr,
addresses.zipcode
from orders
join addresses on orders.ship_address_id = addresses.id
join shipments on shipments.order_id = orders.id
join states on states.id = addresses.state_id
where orders.id NOT IN (
select shipments.order_id from shipments
where shipments.stock_location_id !=1 and orders.shipment_state='shipped')
AND orders.completed_at>= {{daterangepicker1.startFormattedString}} and
orders.completed_at<= {{daterangepicker1.endFormattedString}}
group by orders.number
order by orders.completed_at DESC;
I do not want any order numbers to show if any shipments.number within that order number shipped via stock location 1
Your request is a little unclear, but I believe that, at a high level, you want to return all the orders that have no shipment record from the excluded stock location. The simplest pattern I can see for this is as follows:
SELECT orders.number
FROM orders
JOIN addresses ON orders.ship_address_id = addresses.id
JOIN states ON states.id = addresses.state_id
WHERE orders.id NOT IN (
/* Every order id with at least one shipment record from stock location 1 */
SELECT shipments.order_id
FROM shipments
WHERE shipments.stock_location_id = 1
)
AND orders.shipment_state = 'shipped'
AND orders.completed_at >= '{{daterangepicker1.startFormattedString}}'
AND orders.completed_at <= '{{daterangepicker1.endFormattedString}}'
ORDER BY orders.completed_at DESC;
This returns every order in the date range except those with at least one shipment from stock location 1. Since that shipment record appears to be the only limiting condition, this should be a clean way to address the problem.
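
One caveat with NOT IN: if the subquery can ever return a NULL order_id, the outer query returns no rows at all. A NOT EXISTS version avoids that. Below is a rough sketch using Python's built-in sqlite3 module in place of MySQL. The three orders, their shipments and the orphan shipment with a NULL order_id are all hypothetical, and whether order_id is nullable in your schema is an assumption.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, number TEXT);
CREATE TABLE shipments (id INTEGER PRIMARY KEY, order_id INTEGER, stock_location_id INTEGER);
INSERT INTO orders VALUES (1, 'R100'), (2, 'R200'), (3, 'R300');
INSERT INTO shipments VALUES
  (1, 1, 2), (2, 1, 3),   -- R100: nothing shipped from location 1
  (3, 2, 1), (4, 2, 2),   -- R200: one shipment from location 1
  (5, 3, 1),              -- R300: only location 1
  (6, NULL, 1);           -- orphan shipment with a NULL order_id
""")

# NOT IN: the NULL in the subquery result makes every comparison
# unknown, so no order qualifies.
not_in = conn.execute("""
SELECT o.number
FROM orders o
WHERE o.id NOT IN (SELECT s.order_id FROM shipments s WHERE s.stock_location_id = 1)
ORDER BY o.number
""").fetchall()

# NOT EXISTS: checks each order on its own and is unaffected by the NULL row.
not_exists = conn.execute("""
SELECT o.number
FROM orders o
WHERE NOT EXISTS (
    SELECT 1 FROM shipments s
    WHERE s.order_id = o.id AND s.stock_location_id = 1
)
ORDER BY o.number
""").fetchall()

print(not_in)      # []
print(not_exists)  # [('R100',)]
```

If order_id is declared NOT NULL, both forms return the same orders and the choice is mostly a matter of taste.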

Join With Where Clause - Nested Join/Where?

I've been trying to look for examples that better match my specific needs but I can't seem to find any.
I've got the following SQL statement, which works like a charm:
SELECT
customers.id,
customers.customer_name,
SUM(shipments.balance) AS shipmentBalance
FROM customers
LEFT JOIN shipments ON customers.id = shipments.bill_to
GROUP BY customers.id, customers.customer_name
ORDER BY shipmentBalance DESC;
But, I would like to be able to add a where condition to the JOIN, as I don't want ALL of the shipments balances being SUMMED up, rather only the ones that have balances greater than their related payment distribution amounts.
At this point, in a separate query, I can pull the shipments with balances that are greater than their payment distribution amounts using the following query:
SELECT
shipments.id,
shipments.pro_number,
shipments.balance,
SUM(payments_distributions.amount) AS Sum
FROM
shipments
LEFT JOIN payments_distributions ON shipments.pro_number = payments_distributions.shipment_id
WHERE balance > (SELECT IFNULL(SUM(payments_distributions.amount),0) FROM payments_distributions WHERE payments_distributions.shipment_id = pro_number)
GROUP BY shipments.id,shipments.pro_number;
But I'm not sure how to combine them.
Place the filter of the Shipment table in the ON clause:
SELECT
customers.id,
customers.customer_name,
SUM(shipments.balance) AS shipmentBalance
FROM customers
LEFT JOIN shipments ON customers.id = shipments.bill_to
AND shipments.balance > (SELECT IFNULL(SUM(payments_distributions.amount),0)
FROM payments_distributions
WHERE payments_distributions.shipment_id = shipments.pro_number)
GROUP BY customers.id, customers.customer_name
ORDER BY shipmentBalance DESC;
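
To illustrate why the condition belongs in the ON clause rather than in WHERE, here is a small sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The customers, shipments and payment rows are invented, and the same query text is run twice with only the keyword in front of the filter changed.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE shipments (id INTEGER PRIMARY KEY, bill_to INTEGER, pro_number TEXT, balance INTEGER);
CREATE TABLE payments_distributions (shipment_id TEXT, amount INTEGER);
INSERT INTO customers VALUES (1, 'Acme'), (2, 'Bolt');
INSERT INTO shipments VALUES (1, 1, 'P1', 100), (2, 1, 'P2', 50), (3, 2, 'P3', 80);
INSERT INTO payments_distributions VALUES ('P2', 50), ('P3', 80);
""")

# {keyword} is replaced with AND (filter inside ON) or WHERE (filter after the join).
query = """
SELECT c.customer_name, SUM(s.balance) AS shipmentBalance
FROM customers c
LEFT JOIN shipments s ON c.id = s.bill_to {keyword}
    s.balance > (SELECT COALESCE(SUM(pd.amount), 0)
                 FROM payments_distributions pd
                 WHERE pd.shipment_id = s.pro_number)
GROUP BY c.id, c.customer_name
ORDER BY c.id
"""

# Filter in ON: customers whose shipments are all paid off still appear, with NULL.
on_rows = conn.execute(query.format(keyword="AND")).fetchall()
# Filter in WHERE: those customers are removed after the join.
where_rows = conn.execute(query.format(keyword="WHERE")).fetchall()

print(on_rows)     # [('Acme', 100), ('Bolt', None)]
print(where_rows)  # [('Acme', 100)]
```

Wrapping the SUM in COALESCE(..., 0) would turn the NULL for Bolt into 0 if that reads better in the report.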

Nested SQL Statement not figuring SUM as expected

I am trying to do a quick accounting sql statement and I am running into some problems.
I have 3 tables registrations, events, and a payments table. Registrations are individual transactions, events are information about what they signed up for, and payments are payments made to events.
I would like to total the amounts paid by the registrations, put the event name and event startdate into a column, then total the amount of payments made so far. If possible I would also like to find a total not paid. I believe the bottom figures out everything except the payment amount total. The payment amount total is much larger than it should be, more than likely by using the SUM it is counting payments multiple times because of the nesting.
select
sum(`reg_amount`) as total,
event_name,
event_startdate,
(
select sum(payment_amount) as paid
from registrations
group by events.event_id
) pay
FROM registrations
left join events
on events.event_id = registrations.event_id
left join payments
on payments.event_id = events.event_id
group by registrations.event_id
First, you should use aliases so we know where all the fields come from. I'm guessing that payment_amount comes from the payments table and not from registrations.
If so, your subquery is adding up the payments from the outer table for every row in registrations. Probably not what you want.
I think you want something like this:
select sum(`reg_amount`) as total,
e.event_name,
e.event_startdate,
p.TotPayments
FROM registrations r left join
events e
on e.event_id = r.event_id left join
(select event_id, sum(payment_amount) as TotPayments
from payments
group by event_id
) p
on p.event_id = e.event_id
group by r.event_id;
The idea is to aggregate the payments at the lowest possible level, to avoid duplications caused by joining. That is, aggregate before joining.
This is a guess as to the right SQL, but it should put you on the right path.
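
Here is a quick sketch of the duplication and of the aggregate-before-joining fix, using Python's built-in sqlite3 module instead of MySQL. The single event, three registrations and two payments are made-up numbers, and the tot_payments and unpaid column names are only for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (event_id INTEGER PRIMARY KEY, event_name TEXT);
CREATE TABLE registrations (event_id INTEGER, reg_amount INTEGER);
CREATE TABLE payments (event_id INTEGER, payment_amount INTEGER);
INSERT INTO events VALUES (1, 'Gala');
INSERT INTO registrations VALUES (1, 100), (1, 100), (1, 100);
INSERT INTO payments VALUES (1, 50), (1, 25);
""")

# Joining both detail tables first pairs every registration with every
# payment (3 x 2 = 6 rows), so both sums are inflated.
naive = conn.execute("""
SELECT SUM(r.reg_amount), SUM(p.payment_amount)
FROM registrations r
LEFT JOIN events e ON e.event_id = r.event_id
LEFT JOIN payments p ON p.event_id = e.event_id
GROUP BY r.event_id
""").fetchall()

# Aggregating payments per event before the join leaves one payment
# row per event, so nothing is counted twice.
fixed = conn.execute("""
SELECT SUM(r.reg_amount) AS total,
       p.tot_payments,
       SUM(r.reg_amount) - COALESCE(p.tot_payments, 0) AS unpaid
FROM registrations r
LEFT JOIN events e ON e.event_id = r.event_id
LEFT JOIN (SELECT event_id, SUM(payment_amount) AS tot_payments
           FROM payments
           GROUP BY event_id) p ON p.event_id = e.event_id
GROUP BY r.event_id
""").fetchall()

print(naive)  # [(600, 225)]
print(fixed)  # [(300, 75, 225)]
```

The unpaid column covers the "total not paid" figure from the question, under the assumption that it means registration total minus payments received.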

mysql - How to fix this query?

I have been struggling for a couple of days to fix this MySQL statement so that it returns my desired result. I should also mention that I am new to MySQL and programming.
I have 4 tables CUSTOMER, CUSTOMER_ACCT_SETTING, DEBT, and PAYMENT.
Here are the 4 tables with their record so you can relate.
CUSTOMER
CUSTOMER_ACCT_SETTING
DEBT
PAYMENT
When I run this mysql statement:
SELECT C.CUSTOMER_ID, C.NAME, C.ADDRESS, C.CONTACT_NUMBER,
SUM(((CAS.INTEREST_RATE / 100) * D.AMOUNT) + D.AMOUNT) - COALESCE(SUM(P.AMOUNT), 0) AS CURRENT_BALANCE
FROM CUSTOMER C
INNER JOIN CUSTOMER_ACCT_SETTING CAS ON (C.CUSTOMER_ID = CAS.CUSTOMER_ID)
LEFT JOIN DEBT D ON (C.CUSTOMER_ID = D.CUSTOMER_ID)
LEFT JOIN PAYMENT P ON C.CUSTOMER_ID = P.CUSTOMER_ID
GROUP BY (C.CUSTOMER_ID)
ORDER BY C.NAME
The result is below:
PS: The result is ordered by name.
My question is:
1.) Why did I get a negative result on the CURRENT_BALANCE column in the first row? I am expecting the result to be around 16374.528.
My desired result is like this:
You are projecting your payments through all your debts by doing a join with both tables at the same time. So you essentially get 5 applications of your payment on customer 4 and zero applications on all the other customers. (so NULL on P.AMOUNT yields X - NULL = NULL). To see this, remove the "GROUP BY" and the "SUM" and just return your amounts paid and debited. Then if you group/sum these results manually by customer, you'll see what's going on.
To get the results you expect, you will need to use subqueries or some other mechanism like temporary tables. Something like this:
SELECT C.CUSTOMER_ID,
(SELECT SUM(P.AMOUNT) FROM PAYMENT P
WHERE P.CUSTOMER_ID = C.CUSTOMER_ID) AS TOTAL_PAID_BY_CUSTOMER
FROM CUSTOMER C
The answer to #1 is that each row in your result set has the payment attached to it. That is, for customer #1, you're getting three instances of the 8132.
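
To make the repeated payment visible, here is a small sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The customers, debts and the single payment are invented values, and the interest calculation is left out to keep the example short.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE debt (customer_id INTEGER, amount INTEGER);
CREATE TABLE payment (customer_id INTEGER, amount INTEGER);
INSERT INTO customer VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO debt VALUES (1, 1000), (1, 2000), (1, 3000), (2, 400);
INSERT INTO payment VALUES (1, 500);
""")

# The joins without GROUP BY or SUM: the single payment of 500 is
# attached to each of customer 1's three debt rows.
raw_rows = conn.execute("""
SELECT c.customer_id, d.amount, p.amount
FROM customer c
LEFT JOIN debt d ON c.customer_id = d.customer_id
LEFT JOIN payment p ON c.customer_id = p.customer_id
ORDER BY c.customer_id, d.amount
""").fetchall()

# Correlated subqueries total each table separately, so the payment
# is subtracted exactly once per customer.
balances = conn.execute("""
SELECT c.customer_id,
       COALESCE((SELECT SUM(d.amount) FROM debt d
                 WHERE d.customer_id = c.customer_id), 0)
     - COALESCE((SELECT SUM(p.amount) FROM payment p
                 WHERE p.customer_id = c.customer_id), 0) AS current_balance
FROM customer c
ORDER BY c.customer_id
""").fetchall()

print(raw_rows)  # [(1, 1000, 500), (1, 2000, 500), (1, 3000, 500), (2, 400, None)]
print(balances)  # [(1, 5500), (2, 400)]
```

Summing the joined rows directly would subtract 1500 for customer 1 instead of 500, which is the kind of over-subtraction that can push a balance negative.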