Implementing Count Function In SQL Query With Inner Joins - mysql

I have a query which is the following :
select person.ID, person.personName, round(avg(TIMESTAMPDIFF(DAY,orderDate,shippedDate)),2) as 'Average' from orders inner join person on person.personID = orders.personID where shippedDate is not null group by orders.personID;
The query above outputs 10 rows. I want to add a field which counts how many rows the query above returns in total.
I have tried to implement the SQL COUNT function but am struggling with the syntax as it has an INNER JOIN.

If you are running MySQL 8.0, you can do a window count:
select
p.personID,
p.personName,
round(avg(timestampdiff(day, o.orderDate, o.shippedDate)), 2) average,
count(*) over() total_no_rows
from orders o
inner join person p on p.personID = o.personID
where o.shippedDate is not null
group by p.personID, p.personName
Note that I made a few fixes to your query:
table aliases make the query easier to read and write
it is a good practice to qualify all column names with the table they belong to - I made a few assumptions that you might need to review
every non-aggregated column should belong to the group by clause (this is a good practice, and a requirement in most databases)

If you are not using MySQL 8.0, you can use a subquery (a derived table needs its own SELECT and an alias):
select COUNT(*) FROM (
select person.ID,
person.personName,
round(avg(TIMESTAMPDIFF(DAY,orderDate,shippedDate)),2) as 'Average' from
orders inner join person on person.personID = orders.personID where shippedDate
is not null group by orders.personID
) t;
And if you are using MySQL 8.0, use a window function like below:
select
p.personID,
p.personName,
round(avg(timestampdiff(day, o.orderDate, o.shippedDate)), 2) average,
count(*) over() total_no_rows
from orders o
inner join person p on p.personID = o.personID
where o.shippedDate is not null
group by p.personID, p.personName
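To see that the two approaches agree, here is a minimal sketch that runs both against SQLite through Python's sqlite3 module. SQLite is only a stand-in for MySQL, the sample rows are invented, and julianday() replaces TIMESTAMPDIFF, which SQLite does not have. It also assumes the bundled SQLite is 3.25 or newer, which is when window functions arrived.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (personID INTEGER PRIMARY KEY, personName TEXT);
    CREATE TABLE orders (orderID INTEGER PRIMARY KEY, personID INTEGER,
                         orderDate TEXT, shippedDate TEXT);
    INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES
        (1, 1, '2024-01-01', '2024-01-03'),
        (2, 1, '2024-01-05', '2024-01-09'),
        (3, 2, '2024-01-02', '2024-01-03'),
        (4, 3, '2024-01-04', NULL);  -- unshipped, so filtered out
""")

# The grouped query; julianday() differences replace MySQL's TIMESTAMPDIFF(DAY, ...).
grouped = """
    SELECT p.personID, p.personName,
           ROUND(AVG(julianday(o.shippedDate) - julianday(o.orderDate)), 2) AS average
    FROM orders o
    INNER JOIN person p ON p.personID = o.personID
    WHERE o.shippedDate IS NOT NULL
    GROUP BY p.personID, p.personName
"""

# Window count: evaluated after GROUP BY, so it counts groups, not order rows.
windowed = conn.execute("""
    SELECT p.personID, p.personName,
           ROUND(AVG(julianday(o.shippedDate) - julianday(o.orderDate)), 2) AS average,
           COUNT(*) OVER () AS total_no_rows
    FROM orders o
    INNER JOIN person p ON p.personID = o.personID
    WHERE o.shippedDate IS NOT NULL
    GROUP BY p.personID, p.personName
    ORDER BY p.personID
""").fetchall()

# Subquery count: wrap the grouped query as a derived table with an alias.
(total,) = conn.execute(f"SELECT COUNT(*) FROM ({grouped}) AS t").fetchone()

print(windowed)  # every row carries the same total_no_rows
print(total)
```

The window version repeats the total on every row, while the subquery version returns a single number, so which one fits depends on whether you still need the per-person rows.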

Related

Can I use sub queries and a join in the same statement in MYSQL

This problem has left me clueless. I am trying to use a Join in mysql and a subquery and I keep getting a syntax error.
The statement in question is
SELECT Customer.customer_id, Customer.name, Order.address FROM Customer
WHERE customer_id = (SELECT customer_id FROM Order WHERE customer_id = "625060836f7496e9fce3bbc6")
INNER JOIN Order ON Customer.customer_id=Order.customer_id;
I have tried to just use the query without the Subquery and it works fine.
SELECT Customer.customer_id, Customer.name, Order.address FROM Customer
INNER JOIN Order ON Customer.customer_id=Order.customer_id;
Removing the join but keeping the subquery also works.
SELECT Customer.customer_id, Customer.name, Order.address FROM Customer
WHERE customer_id = (SELECT customer_id FROM Order WHERE customer_id = "625060836f7496e9fce3bbc6")
Only using both the subquery and the join results in a syntax error.
I cannot seem to find the error.
What have I done wrong here?
Thanks in advance
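The secret is to get the clause order right: every JOIN is part of the FROM clause, so the WHERE clause must come after the last JOIN, not before it.
When querying more than one table, it's good practice to give each table an alias and use it to qualify column names, especially where tables share column names. String literals should also be delimited with single 'quotes'.
In this specific example, however, the subquery is superfluous; you could compare against the string literal directly in the WHERE clause. The corrected query, keeping your subquery, is: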
SELECT c.customer_id, c.name, o.address
FROM Customer c
JOIN Order o ON c.customer_id = o.customer_id
WHERE c.customer_id = (
SELECT customer_id
FROM Order
WHERE customer_id = '625060836f7496e9fce3bbc6'
);
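A small sketch of the clause-order rule, run against SQLite from Python as a stand-in for MySQL (the two sample customers and their addresses are made up). The table name is double-quoted because ORDER is a reserved word; in MySQL you would use backticks instead.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (customer_id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE "Order" (order_id INTEGER PRIMARY KEY, customer_id TEXT, address TEXT);
    INSERT INTO Customer VALUES ('625060836f7496e9fce3bbc6', 'Ann'), ('other', 'Bob');
    INSERT INTO "Order" VALUES
        (1, '625060836f7496e9fce3bbc6', '1 Main St'),
        (2, 'other', '2 Side St');
""")

# JOIN first, WHERE afterwards, comparing against the literal directly.
rows = conn.execute("""
    SELECT c.customer_id, c.name, o.address
    FROM Customer c
    JOIN "Order" o ON c.customer_id = o.customer_id
    WHERE c.customer_id = '625060836f7496e9fce3bbc6'
""").fetchall()

# WHERE placed before the JOIN, as in the question: the parser rejects it.
try:
    conn.execute("""
        SELECT c.customer_id, c.name, o.address
        FROM Customer c
        WHERE c.customer_id = '625060836f7496e9fce3bbc6'
        INNER JOIN "Order" o ON c.customer_id = o.customer_id
    """)
    misplaced_where_parses = True
except sqlite3.OperationalError:
    misplaced_where_parses = False

print(rows)
print(misplaced_where_parses)
```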

Mysql select query count & Distinct are not working properly

I am developing an eCommerce website using Laravel 8. I wrote the following query to find the total price and total quantity for a single order number, but it returns an error. Where is the problem? Please help.
*I am writing raw MySQL first and will convert it to the Laravel query builder afterwards.
SELECT COUNT (total_price) as totaPrice, COUNT (productqty) as proQnty
FROM (SELECT DISTINCT order_id FROM orderDetails)
LEFT JOIN ordertbl
ON ordertbl.id = orderDetails.order_id;
I guess you want to sum the prices and quantities, so use SUM() aggregate function.
Also you should do a LEFT join of ordertbl to orderDetails and not the other way around:
SELECT ot.id,
SUM(od.total_price) AS totaPrice,
SUM(od.productqty) AS proQnty
FROM ordertbl ot LEFT JOIN orderDetails od
ON ot.id = od.order_id
WHERE ot.id = ?
GROUP BY ot.id;
Or, without a join:
SELECT SUM(total_price) AS totaPrice,
SUM(productqty) AS proQnty
FROM orderDetails
WHERE order_id = ?;
Replace ? with the id of the order that you want.
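The difference between COUNT() and SUM() is easy to check on a few rows. This is a rough sketch using SQLite through Python as a stand-in for MySQL; the order details below are invented for illustration.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orderDetails (order_id INTEGER, total_price REAL, productqty INTEGER);
    INSERT INTO orderDetails VALUES (1, 10.0, 2), (1, 5.5, 1), (2, 7.0, 3);
""")

order_id = 1  # bound to the ? placeholder

# COUNT() returns how many non-NULL values there are, not their total.
counts = conn.execute(
    "SELECT COUNT(total_price), COUNT(productqty) FROM orderDetails WHERE order_id = ?",
    (order_id,),
).fetchone()

# SUM() adds the values up, which is what an order total needs.
sums = conn.execute(
    "SELECT SUM(total_price), SUM(productqty) FROM orderDetails WHERE order_id = ?",
    (order_id,),
).fetchone()

print(counts)  # number of detail rows for the order
print(sums)    # total price and total quantity for the order
```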
Your raw query is missing the table alias for the subquery.
Your raw query should be:
SELECT COUNT(total_price) as totaPrice, COUNT(productqty) as proQnty
FROM (
SELECT DISTINCT order_id FROM orderDetails
) T
LEFT JOIN ordertbl ON ordertbl.id = T.order_id;

MySQL aliasing not working when subjoining

I have two main tables: orders and PayPal transactions. I'm trying to get only the distinct values from my PayPal transactions table. Since there is no unique identifier in my transactions table I have tried to use a subquery to retrieve them.
The problem with my query is that MySQL doesn't recognize my aliases. Therefore, it gives me an Unknown column error.
/* SQL Error (1054): Unknown column 'pp.Date' in 'field list' */
SELECT
pp.Date
FROM hub.orders o
LEFT JOIN
(SELECT p.transaction_of_interest AS ppID
FROM financial.paypal AS p
GROUP BY p.transaction_of_interest
) AS pp ON pp.ppID = o.ex_trans_id
You are not selecting the Date column in the pp subquery. If you include that column, it will work as you expect. If your result set is multiplying because transaction_of_interest values are not distinct, you should apply an aggregate function such as MAX to p.Date so that each transaction_of_interest value yields a single row.
Which pp.Date values do you need? Is there a condition, such as the latest date?
SELECT PP.DATE
FROM HUB.ORDERS O
LEFT JOIN (SELECT P.TRANSACTION_OF_INTEREST AS PPID,P.DATE
FROM FINANCIAL.PAYPAL AS P
GROUP BY P.TRANSACTION_OF_INTEREST,P.DATE
) AS PP ON PP.PPID = O.EX_TRANS_ID
You can only refer to those fields via the derived table's alias that you included in the select list for the derived table. Since you did not include the date field in the select list, you cannot reference it.
You need to add the Date field to the select list in the subquery and to the group by clause as well.
SELECT
pp.Date
FROM hub.orders o
LEFT JOIN
(SELECT p.transaction_of_interest AS ppID, p.Date
FROM financial.paypal AS p
GROUP BY p.transaction_of_interest, p.Date
) AS pp ON pp.ppID = o.ex_trans_id
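Here is a minimal sketch of that rule, run against SQLite from Python as a stand-in for MySQL. The schema prefixes (hub, financial) are dropped and the rows are invented. The second query uses MAX(p.Date), which is one possible choice if you want a single row per transaction; whether the latest date is the right one is an assumption.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, ex_trans_id TEXT);
    CREATE TABLE paypal (transaction_of_interest TEXT, Date TEXT);
    INSERT INTO orders VALUES (1, 'T1'), (2, 'T2');
    INSERT INTO paypal VALUES
        ('T1', '2020-01-01'), ('T1', '2020-01-05'), ('T2', '2020-02-01');
""")

# The derived table does not select Date, so pp.Date cannot be resolved.
try:
    conn.execute("""
        SELECT pp.Date
        FROM orders o
        LEFT JOIN (SELECT p.transaction_of_interest AS ppID
                   FROM paypal AS p
                   GROUP BY p.transaction_of_interest) AS pp
               ON pp.ppID = o.ex_trans_id
    """)
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

# Exposing an aggregated date keeps one row per transaction_of_interest.
rows = conn.execute("""
    SELECT o.id, pp.last_date
    FROM orders o
    LEFT JOIN (SELECT p.transaction_of_interest AS ppID,
                      MAX(p.Date) AS last_date
               FROM paypal AS p
               GROUP BY p.transaction_of_interest) AS pp
           ON pp.ppID = o.ex_trans_id
    ORDER BY o.id
""").fetchall()

print(error)
print(rows)
```

Grouping by both transaction_of_interest and Date, as in the answers above, would instead return one row per distinct date, so order 1 would appear twice in this sample.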
My best interpretation of your question is that you want distinct dates of PayPal transactions.
If you only want dates from the paypal table, doesn't this do what you want?
SELECT DISTINCT p.DATE
FROM financial.paypal p;
If the dates come from the orders table, but you only want them for PayPal transactions, then LEFT JOIN is not appropriate:
SELECT DISTINCT o.Date
FROM hub.orders o JOIN
financial.paypal p
ON p.transaction_of_interest = o.ex_trans_id

Converting Multiple subqueries with GROUP BY to JOIN

I'm working on a simple ordering system in MySQL and I came across this snag that I'm hoping some SQL genius can help me out with.
I have a table for Orders, Payments (with a foreign key reference to the Order table), and OrderItems (also, with a foreign key reference to the Order table) and what I would like to do is get the total outstanding balance (Total and Paid) for the Order with a single query. My initial thought was to do something simple like this:
SELECT Order.*, SUM(OrderItem.Amount) AS Total, SUM(Payment.Amount) AS Paid
FROM Order
JOIN OrderItem ON OrderItem.OrderId = Order.OrderId
JOIN Payment ON Payment.OrderId = Order.OrderId
GROUP BY Order.OrderId
However, if there are multiple Payments or multiple OrderItems, it messes up Total or Paid, respectively (eg. One OrderItem record with an amount of 100 along with two Payment Records will produce a Total of 200).
In order to overcome this, I can use some subqueries in the following way:
SELECT Order.OrderId, OrderItemGrouped.Total, PaymentGrouped.Paid
FROM Order
JOIN (
SELECT OrderItem.OrderId, SUM(OrderItem.Amount) AS Total
FROM OrderItem
GROUP BY OrderItem.OrderId
) OrderItemGrouped ON OrderItemGrouped.OrderId = Order.OrderId
JOIN (
SELECT Payment.OrderId, SUM(Payment.Amount) AS Paid
FROM Payment
GROUP BY Payment.OrderId
) PaymentGrouped ON PaymentGrouped.OrderId = Order.OrderId
As you can imagine (and as an EXPLAIN on this query will show), this is not exactly an optimal query so, I'm wondering, is there any way to convert these two subqueries with GROUP BY statements into JOINs?
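The double counting described above can be reproduced with the exact numbers from the example (one OrderItem of 100, two Payment rows). This is a sketch run against SQLite from Python as a stand-in for MySQL; the payment amounts 30 and 20 are made up, and the Order table is double-quoted because ORDER is a reserved word.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "Order" (OrderId INTEGER PRIMARY KEY);
    CREATE TABLE OrderItem (OrderId INTEGER, Amount INTEGER);
    CREATE TABLE Payment (OrderId INTEGER, Amount INTEGER);
    INSERT INTO "Order" VALUES (1);
    INSERT INTO OrderItem VALUES (1, 100);
    INSERT INTO Payment VALUES (1, 30), (1, 20);
""")

# Joining both child tables first pairs the single item with each payment,
# so the item amount is summed once per payment row.
naive = conn.execute("""
    SELECT o.OrderId, SUM(oi.Amount) AS Total, SUM(p.Amount) AS Paid
    FROM "Order" o
    JOIN OrderItem oi ON oi.OrderId = o.OrderId
    JOIN Payment p ON p.OrderId = o.OrderId
    GROUP BY o.OrderId
""").fetchone()

# Aggregating each child table before joining leaves one row per order on each side.
pre_aggregated = conn.execute("""
    SELECT o.OrderId, oi.Total, p.Paid
    FROM "Order" o
    JOIN (SELECT OrderId, SUM(Amount) AS Total
          FROM OrderItem GROUP BY OrderId) oi ON oi.OrderId = o.OrderId
    JOIN (SELECT OrderId, SUM(Amount) AS Paid
          FROM Payment GROUP BY OrderId) p ON p.OrderId = o.OrderId
""").fetchone()

print(naive)           # Total is inflated by the number of payments
print(pre_aggregated)  # Total and Paid are each computed independently
```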
The following is likely to be faster with the right indexes:
select o.OrderId,
(select sum(oi.Amount)
from OrderItem oi
where oi.OrderId = o.OrderId
) as Total,
(select sum(p.Amount)
from Payment p
where p.OrderId = o.OrderId
) as Paid
from Order o;
The right indexes are OrderItem(OrderId, Amount) and Payment(OrderId, Amount).
I don't like writing aggregation queries this way, but it can sometimes help performance in MySQL.
Some answers have already suggested using a correlated subquery, but have not really offered an explanation as to why. MySQL does not materialise correlated subqueries, but it will materialise a derived table. That is to say with a simplified version of your query as it is now:
SELECT Order.OrderId, OrderItemGrouped.Total
FROM Order
JOIN (
SELECT OrderItem.OrderId, SUM(OrderItem.Amount) AS Total
FROM OrderItem
GROUP BY OrderItem.OrderId
) OrderItemGrouped ON OrderItemGrouped.OrderId = Order.OrderId;
At the start of execution MySQL will put the results of your subquery into a temporary table, and hash this table on OrderId for faster lookups, whereas if you run:
SELECT Order.OrderId,
( SELECT SUM(OrderItem.Amount)
FROM OrderItem
WHERE OrderItem.OrderId = Order.OrderId
) AS Total
FROM Order;
The subquery will be executed once for each row in Order. If you add something like WHERE Order.OrderId = 1, it is obviously not efficient to aggregate the entire OrderItem table and hash the result only to look up one value. If you are returning all orders, however, the initial cost of creating the hash table will pay for itself by not having to execute the subquery for every row in the Order table.
If you are selecting a lot of rows and feel the materialisation will be of benefit, you can simplify your JOIN query as follows:
SELECT Order.OrderId, SUM(OrderItem.Amount) AS Total, PaymentGrouped.Paid
FROM Order
INNER JOIN OrderItem
ON OrderItem.OrderID = Order.OrderID
INNER JOIN
( SELECT Payment.OrderId, SUM(Payment.Amount) AS Paid
FROM Payment
GROUP BY Payment.OrderId
) PaymentGrouped
ON PaymentGrouped.OrderId = Order.OrderId
GROUP BY Order.OrderId, PaymentGrouped.Paid;
Then you only have one derived table.
What about something like this:
SELECT Order.OrderId, (
SELECT SUM(OrderItemGrouped.Amount)
FROM OrderItem as OrderItemGrouped
where
OrderItemGrouped.OrderId = Order.OrderId
) AS Total,
(
SELECT SUM(PaymentGrouped.Amount)
FROM Payment as PaymentGrouped
where
PaymentGrouped.OrderId = Order.OrderId
) as Paid
FROM Order
PS: You win again #Gordon xD
Select o.orderid, i.total, s.paid
From orders o
Left join (select orderid, sum(amount) as total
From orderitem
Group by orderid) i
On i.orderid = o.orderid
Left join (select orderid, sum(amount) as paid
From payments
Group by orderid) s
On s.orderid = o.orderid

Optimize SQL: Customers that haven't ordered for x days

I have created this SQL in order to find customers that haven't ordered for X days.
It is returning a result set, so this post is mainly just to get a second opinion on it, and possible optimizations.
SELECT o.order_id,
o.order_status,
o.order_created,
o.user_id,
i.identity_firstname,
i.identity_email,
(SELECT COUNT(*)
FROM orders o2
WHERE o2.user_id=o.user_id
AND o2.order_status=1) AS order_count,
(SELECT o4.order_created
FROM orders o4
WHERE o4.user_id=o.user_id
AND o4.order_status=1
ORDER BY o4.order_created DESC LIMIT 1) AS last_order
FROM orders o
INNER JOIN user_identities ui ON o.user_id=ui.user_id
INNER JOIN identities i ON ui.identity_id=i.identity_id
AND i.identity_email!=''
INNER JOIN subscribers s ON i.identity_id=s.identity_id
AND s.subscriber_status=1
AND s.subsriber_type=e
AND s.subscription_id=1
WHERE DATE(o.order_created) = "2013-12-14"
AND o.order_status=1
AND o.user_id NOT IN
(SELECT o3.user_id
FROM orders o3
WHERE o3.user_id=o.user_id
AND o3.order_status=1
AND DATE(o3.order_created) > "2013-12-14")
Can you guys find any potential problems with this SQL? Dates are dynamically inserted.
The final SQL that I put in production, will basically only include o.order_id, i.identity_id and o.order_count - this order_count will need to be correct. The other selected fields and 'last_order' subquery will not be included, it's only for testing.
This should give me a list of users who placed their last order on that particular day and are newsletter subscribers. I am particularly in doubt about the correctness of the NOT IN part in the WHERE clause and the order_count subquery.
There are several problems:
A. Using functions on indexable columns
You are searching for orders by comparing DATE(order_created) with some constant. This is a terrible idea, because a) the DATE() function is executed for every row (CPU) and b) the database can't use an index on the column (assuming one exists).
B. Using WHERE ID NOT IN (...)
Using NOT IN (...) is almost always a bad idea, because optimizers usually have trouble with this construct and often get the plan wrong. You can almost always express it as an outer join with a WHERE condition that filters for misses using an IS NULL condition on a joined column (with the side benefit of not needing DISTINCT, because there's only ever one miss returned).
C. Leaving joins that filter out large portions of rows too late
The earlier you can eliminate rows, the better. You can do this by joining the tables that are less likely to match earlier in the joined table list, and by putting non-key conditions into the join rather than the WHERE clause so that rows are excluded as early as possible. Some optimizers do this anyway, but I've often found they don't.
D. Avoid correlated subqueries like the plague!
You have several correlated subqueries, which are executed for every row of the main table. That's really an incredibly bad idea. Again, sometimes the optimizer can turn them into a join, but why rely (hope) on that? Most correlated subqueries can be expressed as a join; your examples are no exception.
With the above in mind, there are some specific changes:
o2 and o4 are the same join, so o4 may be dispensed with entirely - just use o2 after conversion to a join
DATE(order_created) = "2013-12-14" should be written as order_created between "2013-12-14 00:00:00" and "2013-12-14 23:59:59"
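One caveat with the 23:59:59 upper bound is that it can miss rows stamped in the final second of the day when the column keeps fractional seconds. A half-open range (>= start of day, < start of next day) avoids that. The sketch below illustrates the idea with SQLite from Python as a stand-in; SQLite compares these timestamps as text, and the rows are invented. In MySQL this only matters if the column is declared with fractional precision, such as DATETIME(3), which is an assumption about the schema.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_created TEXT);
    INSERT INTO orders VALUES
        (1, '2013-12-13 23:59:59'),
        (2, '2013-12-14 00:00:00'),
        (3, '2013-12-14 12:30:00'),
        (4, '2013-12-14 23:59:59.500'),  -- last second of the day, with a fraction
        (5, '2013-12-15 00:00:00');
""")

# Closed range ending at 23:59:59, as suggested above.
between_ids = [row[0] for row in conn.execute("""
    SELECT order_id FROM orders
    WHERE order_created BETWEEN '2013-12-14 00:00:00' AND '2013-12-14 23:59:59'
    ORDER BY order_id
""")]

# Half-open range: from midnight up to, but excluding, the next midnight.
half_open_ids = [row[0] for row in conn.execute("""
    SELECT order_id FROM orders
    WHERE order_created >= '2013-12-14 00:00:00'
      AND order_created <  '2013-12-15 00:00:00'
    ORDER BY order_id
""")]

print(between_ids)
print(half_open_ids)
```

Both forms leave order_created bare on the left-hand side, so an index on that column remains usable, unlike DATE(order_created).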
This query should be what you want:
SELECT
o.order_id,
o.order_status,
o.order_created,
o.user_id,
i.identity_firstname,
i.identity_email,
count(o2.user_id) AS order_count,
max(o2.order_created) AS last_order
FROM orders o
LEFT JOIN orders o2 ON o2.user_id = o.user_id AND o2.order_status=1
LEFT JOIN orders o3 ON o3.user_id = o.user_id
AND o3.order_status=1
AND o3.order_created >= "2013-12-15 00:00:00"
JOIN user_identities ui ON o.user_id=ui.user_id
JOIN identities i ON ui.identity_id=i.identity_id AND i.identity_email != ''
JOIN subscribers s ON i.identity_id=s.identity_id
AND s.subscriber_status=1
AND s.subsriber_type=e
AND s.subscription_id=1
WHERE o.order_created between "2013-12-14 00:00:00" and "2013-12-14 23:59:59"
AND o.order_status=1
AND o3.order_created IS NULL -- This gets only missed joins on o3
GROUP BY
o.order_id,
o.order_status,
o.order_created,
o.user_id,
i.identity_firstname,
i.identity_email;
The o3.order_created IS NULL condition at the end of the WHERE clause is how you achieve the same as NOT IN (...) using a LEFT JOIN.
Disclaimer: Not tested.
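Since the query above is untested, here is a reduced sketch of just the anti-join part, run against SQLite from Python as a stand-in for MySQL. The three users and their orders are invented, and the subscriber joins are left out. It compares the LEFT JOIN ... IS NULL form with an equivalent NOT EXISTS form.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER,
                         order_status INTEGER, order_created TEXT);
    INSERT INTO orders VALUES
        (1, 10, 1, '2013-12-01 09:00:00'),
        (2, 10, 1, '2013-12-14 10:00:00'),  -- user 10: last order is on the 14th
        (3, 20, 1, '2013-12-14 11:00:00'),
        (4, 20, 1, '2013-12-20 12:00:00'),  -- user 20 ordered again later
        (5, 30, 1, '2013-12-10 08:00:00');  -- user 30: nothing on the 14th
""")

# Anti-join: keep orders from the 14th that have no later order for the same user.
anti_join = conn.execute("""
    SELECT o.user_id
    FROM orders o
    LEFT JOIN orders o3 ON o3.user_id = o.user_id
                       AND o3.order_status = 1
                       AND o3.order_created >= '2013-12-15 00:00:00'
    WHERE o.order_created >= '2013-12-14 00:00:00'
      AND o.order_created <  '2013-12-15 00:00:00'
      AND o.order_status = 1
      AND o3.order_id IS NULL  -- only rows where the join to o3 found nothing
""").fetchall()

# The same filter written with NOT EXISTS, for comparison.
not_exists = conn.execute("""
    SELECT o.user_id
    FROM orders o
    WHERE o.order_created >= '2013-12-14 00:00:00'
      AND o.order_created <  '2013-12-15 00:00:00'
      AND o.order_status = 1
      AND NOT EXISTS (SELECT 1 FROM orders o3
                      WHERE o3.user_id = o.user_id
                        AND o3.order_status = 1
                        AND o3.order_created >= '2013-12-15 00:00:00')
""").fetchall()

print(anti_join)
print(not_exists)
```

The later-order conditions have to sit in the ON clause of the LEFT JOIN; moving them into WHERE would discard the unmatched rows that the IS NULL test is looking for.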
I can't really comment on the results, as you have not posted any table definitions or example data, but your query has 3 correlated subqueries, which is likely to make it perform poorly (OK, one of those is for last_order and is only for testing).
Eliminating the correlated sub queries and replacing them with joins would give something like this:-
SELECT o.order_id,
o.order_status,
o.order_created,
o.user_id,
i.identity_firstname,
i.identity_email,
Sub1.order_count,
Sub2.last_order
FROM orders o
INNER JOIN user_identities ui ON o.user_id=ui.user_id
INNER JOIN identities i ON ui.identity_id=i.identity_id
AND i.identity_email!=''
INNER JOIN subscribers s ON i.identity_id=s.identity_id
AND s.subscriber_status=1
AND s.subsriber_type=e
AND s.subscription_id=1
LEFT OUTER JOIN
(
SELECT user_id, COUNT(*) AS order_count
FROM orders
WHERE order_status=1
GROUP BY user_id
) Sub1
ON o.user_id = Sub1.user_id
LEFT OUTER JOIN
(
SELECT user_id, MAX(order_created) as last_order
FROM orders
WHERE order_status=1
GROUP BY user_id
) AS Sub2
ON o.user_id = Sub2.user_id
LEFT OUTER JOIN
(
SELECT DISTINCT user_id
FROM orders
WHERE order_status=1
AND DATE(order_created) > "2013-12-14"
) Sub3
ON o.user_id = Sub3.user_id
WHERE DATE(o.order_created) = "2013-12-14"
AND o.order_status=1
AND Sub3.user_id IS NULL