SQL beginner practice problems

Given two tables, orders (order_id, date, $, customer_id) and customers (ID, name):
Here's my approach, but I'm not sure it works, and I'd like to know whether there's a faster or better way to solve these problems.
1) Find the number of customers who made at least one order on 7/9/2018
Select count (distinct customer_id)
From
(
Select customer_id from orders a
Left join customer b
On a.customer_id = b.ID
Group by customer_id,date
Having date = 7/9/2018
) a
2) Find the number of customers who did not make an order on 7/9/2018
Select count (customer_id) from customer where customer_id not in
(
Select customer_id from orders a
Left join customer b
On a.customer_id = b.ID
Group by customer_id,date
Having date = 7/9/2018
)
3) Find the date with the most sales between 7/1 and 7/30
select date, max($)
from (
Select sum($),date from orders a
Left join customer b
On a.customer_id = b.ID
Group by date
Having date between 7/1 and 7/30
)
Thanks,

For problem 1, a valid solution might look like this:
SELECT COUNT(DISTINCT customer_id) x
FROM orders
WHERE date = '2018-09-07'; -- or is that '2018-07-09' ??
For problem 2, a valid solution might look like this:
SELECT COUNT(*) x
FROM customer c
LEFT
JOIN orders o
ON o.customer_id = c.ID
AND o.date = '2018-07-09'
WHERE o.order_id IS NULL;
Assuming there are no ties, a valid solution to problem 3 might look like this:
SELECT date
, COUNT(*) sales
FROM orders
WHERE date BETWEEN '2018-07-01' AND '2018-07-30'
GROUP
BY date
ORDER
BY sales DESC
LIMIT 1;

The default format for a date in MySQL is YYYY-MM-DD, although this can be customized. You have to put quotes around it; otherwise it's treated as an arithmetic expression (7/9/2018 is evaluated as 7 divided by 9 divided by 2018).
And none of your queries need to join with the customer table. The customer ID is already in the orders table, and you're not returning any info about the customers (like the name or address), you're just counting them.
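The quoting point can be checked directly. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL, with a hypothetical order_date column in place of date. SQLite does integer division here, while MySQL would return a small fraction; in neither case is the unquoted value a date.

```python
import sqlite3

# SQLite stands in for MySQL; `order_date` is an assumed column name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_date TEXT, customer_id INTEGER);
INSERT INTO orders VALUES (1, '2018-07-09', 1), (2, '2018-07-09', 2);
""")

# Unquoted, 7/9/2018 is two divisions, not a date (SQLite's integer division yields 0).
unquoted_value = conn.execute("SELECT 7/9/2018").fetchone()[0]

# Comparing the date column against that number matches no rows.
unquoted_matches = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE order_date = 7/9/2018"
).fetchone()[0]

# Quoted in YYYY-MM-DD form, the literal matches both orders.
quoted_matches = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE order_date = '2018-07-09'"
).fetchone()[0]

print(unquoted_value, unquoted_matches, quoted_matches)
```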
1) You don't need the subquery or grouping.
SELECT COUNT(DISTINCT customer_id)
FROM orders
WHERE date = '2018-07-09'
2) Again, you don't need GROUP BY in the subquery. There's also a better pattern than NOT IN to get the count of non-matching rows.
SELECT COUNT(*)
FROM customer AS c
LEFT JOIN orders AS o ON c.id = o.customer_id AND o.date = '2018-07-09'
WHERE o.order_id IS NULL
See "Return row only if value doesn't exist" for various patterns to do this.
3) You can't use MAX($) in the outer query because the inner query doesn't return a column with that name. But even if you fix that, it still won't work, because the date column won't necessarily come from the same row that has the maximum. See "SQL select only rows with max value on a column" for more explanation of this.
You don't need a subquery at all. Use a query that returns the total sales for each day, then use ORDER BY to get the highest one.
SELECT date, SUM($) AS total_sales
FROM orders
WHERE date BETWEEN '2018-07-01' AND '2018-07-30'
GROUP BY date
ORDER BY total_sales DESC
LIMIT 1
If "most sales" is supposed to mean "the largest number of sales", replace SUM($) with COUNT(*).
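To try the three queries on a small data set, here is a sketch using Python's sqlite3 module rather than MySQL. The sample rows are made up, and the columns amount and order_date are assumed stand-ins for $ and date.

```python
import sqlite3

# SQLite stands in for MySQL; `amount` and `order_date` replace `$` and `date`.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_date TEXT,
                     amount REAL, customer_id INTEGER);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO orders VALUES
  (1, '2018-07-09', 10.0, 1),
  (2, '2018-07-09',  5.0, 1),
  (3, '2018-07-09', 20.0, 2),
  (4, '2018-07-10', 50.0, 3);
""")

# 1) Customers with at least one order on the date: no join, no subquery.
bought = conn.execute("""
SELECT COUNT(DISTINCT customer_id) FROM orders
WHERE order_date = '2018-07-09'
""").fetchone()[0]

# 2) Customers with no order on the date: LEFT JOIN, keep the unmatched rows.
not_bought = conn.execute("""
SELECT COUNT(*)
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id AND o.order_date = '2018-07-09'
WHERE o.order_id IS NULL
""").fetchone()[0]

# 3) Day with the highest total: aggregate per day, sort, take the first row.
best_day = conn.execute("""
SELECT order_date, SUM(amount) AS total_sales
FROM orders
WHERE order_date BETWEEN '2018-07-01' AND '2018-07-30'
GROUP BY order_date
ORDER BY total_sales DESC
LIMIT 1
""").fetchone()

print(bought, not_bought, best_day)
```

Ann placed two orders on 2018-07-09 but is counted once in query 1 because of DISTINCT.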

Related

Optimising MySql Query with LEFT JOINS

I am trying to get a list of customers who haven't ordered for 6 months or more. I have used 4 tables in the query:
accounts (account_id)
stores (store_id, account_id)
customers (store_id, customer_id)
orders (order_id, customer_id, store_id)
The customers and orders tables are very big, 3M and 26M rows respectively, so using left joins makes the query extremely slow. I believe I have indexed my tables correctly.
Here is the query I have used:
SELECT cus.customer_id, MAX(o.order_date), cus.store_id, s.account_id, store_name
FROM customers cus
LEFT JOIN stores s ON s.store_id=cus.store_id
LEFT JOIN orders o ON o.customer_id=cus.customer_id AND o.store_id=cus.store_id
WHERE account_id=26 AND
(SELECT order_id
FROM orders o
WHERE o.customer_id=cus.customer_id
AND o.store_id=cus.store_id
AND o.order_date < CURRENT_DATE() - INTERVAL 6 MONTH
ORDER BY order_id DESC LIMIT 0,1) IS NOT NULL
GROUP BY cus.customer_id, cus.client_id;
I need the last order date, which is why I joined the orders table. However, since a customer can have multiple orders, the join returns multiple rows per customer, and that is why I used the GROUP BY clause.
Can anyone help me with my query?

Start with this:
SELECT customer_id, MAX(order_date) AS last_order_date
FROM orders
GROUP BY customer_id
HAVING last_order_date < NOW() - INTERVAL 6 MONTH;
Assuming that gives you the relevant customer_ids, then move on to
SELECT ...
FROM ( that-select-as-a-subquery ) AS old
JOIN other-tables-as-needed USING (customer_id)
If necessary, JOIN back to orders to get more info. Do not try to get other columns in that subquery. (That's a "groupwise max" problem.)
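The two steps above can be sketched as follows, using Python's sqlite3 module in place of MySQL. The sample rows are invented, and a fixed cutoff string is an assumed stand-in for NOW() - INTERVAL 6 MONTH so that the result does not depend on the day it is run.

```python
import sqlite3

# SQLite stands in for MySQL; table contents are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, store_id INTEGER);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                     store_id INTEGER, order_date TEXT);
INSERT INTO customers VALUES (1, 10), (2, 10), (3, 11);
INSERT INTO orders VALUES
  (1, 1, 10, '2023-01-15'),
  (2, 1, 10, '2024-05-01'),
  (3, 2, 10, '2023-02-20'),
  (4, 3, 11, '2023-03-05');
""")

cutoff = "2023-12-01"  # assumed stand-in for NOW() - INTERVAL 6 MONTH

# Step 1: one row per customer, keeping only those whose latest order is old.
old_sql = """
SELECT customer_id, MAX(order_date) AS last_order_date
FROM orders
GROUP BY customer_id
HAVING MAX(order_date) < ?
"""
lapsed = conn.execute(old_sql + " ORDER BY customer_id", (cutoff,)).fetchall()

# Step 2: use step 1 as a derived table and join the other tables to it.
joined = conn.execute(f"""
SELECT c.customer_id, c.store_id, old.last_order_date
FROM ({old_sql}) AS old
JOIN customers c ON c.customer_id = old.customer_id
ORDER BY c.customer_id
""", (cutoff,)).fetchall()

print(lapsed)
print(joined)
```

Customer 1 is excluded because the most recent order (2024-05-01) is after the cutoff, even though an older order exists.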

Your strategy of using an ordered and limited subquery on your orders table is probably responsible for your poor performance.
This subquery will generate a virtual table showing the date of the most recent order for each distinct customer. (I guess a distinct customer is distinguished by the pair customer_id, store_id).
SELECT MAX(order_date) recent_order_date,
customer_id, store_id
FROM orders
GROUP BY customer_id, store_id
Then, you can use that subquery as if it were a table in your query.
SELECT cus.customer_id, summary.recent_order_date,
cus.store_id, s.account_id, store_name
FROM customers cus
JOIN stores s ON s.store_id=cus.store_id
JOIN (
SELECT MAX(order_date) recent_order_date,
customer_id, store_id
FROM orders
GROUP BY customer_id, store_id
) summary ON summary.customer_id = cus.customer_id
AND summary.store_id = s.store_id
WHERE summary.recent_order_date < CURRENT_DATE - INTERVAL 6 MONTH
AND s.account_id = 26
This approach moves the GROUP BY to an inner query, and eliminates the wasteful ORDER BY ... LIMIT query pattern. The inner query doesn't have to be remade for every row in the outer query.
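As a small check of this pattern, the sketch below runs the same shape of query through Python's sqlite3 module instead of MySQL. The rows are invented, the store_name column on stores is assumed from the question's SELECT list, and a fixed cutoff replaces CURRENT_DATE - INTERVAL 6 MONTH. The same customer_id appears in two stores to show why the summary groups on the pair.

```python
import sqlite3

# SQLite stands in for MySQL; all rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stores (store_id INTEGER PRIMARY KEY, account_id INTEGER, store_name TEXT);
CREATE TABLE customers (store_id INTEGER, customer_id INTEGER);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                     store_id INTEGER, order_date TEXT);
INSERT INTO stores VALUES (10, 26, 'North'), (11, 26, 'South'), (12, 99, 'Other');
INSERT INTO customers VALUES (10, 1), (11, 1), (12, 2);
INSERT INTO orders VALUES
  (1, 1, 10, '2023-01-10'),
  (2, 1, 11, '2024-06-01'),
  (3, 2, 12, '2023-01-01');
""")

cutoff = "2023-12-01"  # assumed stand-in for CURRENT_DATE - INTERVAL 6 MONTH

inactive = conn.execute("""
SELECT cus.customer_id, summary.recent_order_date,
       cus.store_id, s.account_id, s.store_name
FROM customers cus
JOIN stores s ON s.store_id = cus.store_id
JOIN (
    -- latest order per (customer, store) pair, computed once
    SELECT MAX(order_date) AS recent_order_date, customer_id, store_id
    FROM orders
    GROUP BY customer_id, store_id
) summary ON summary.customer_id = cus.customer_id
         AND summary.store_id = s.store_id
WHERE summary.recent_order_date < ?
  AND s.account_id = 26
ORDER BY cus.customer_id, cus.store_id
""", (cutoff,)).fetchall()

print(inactive)
```

Only customer 1 at store 10 is returned: the same customer is still active at store 11, and customer 2 belongs to a different account.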
I don't understand why you used LEFT JOIN operations in your query.
And, by the way, most people, when they're new to SQL, don't have great intuition about which indexes are useful and which aren't. So, when asking for help, it's always good to show your indexes. In the meantime, read this:
http://use-the-index-luke.com/

SQL retrieving filtered value in subquery

Here cust_id is a foreign key, and ords returns the number of orders for each customer:
SELECT cust_name, (
SELECT COUNT(*)
FROM Orders
WHERE Orders.cust_id = Customers.cust_id
) AS ords
FROM Customers
The output is correct, but I want to filter it to retrieve only the customers with fewer than a given number of orders. I don't know how to filter on the subquery column ords: I tried WHERE ords < 2 at the end of the query, and I tried adding AND COUNT(*) < 2 after the cust_id comparison, but neither works. I am using MySQL.

Use the HAVING clause (and use a join instead of a subquery):
SELECT Customers.cust_id, Customers.cust_name, COUNT(*) ords
FROM Orders, Customers
WHERE Orders.cust_id = Customers.cust_id
GROUP BY 1,2
HAVING COUNT(*)<2
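If you want to include people with zero orders, change the join to an outer join and count a column from Orders (for example COUNT(Orders.cust_id)) instead of COUNT(*), so that customers without a match are counted as 0.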

There is no need for a correlated subquery here: it is conceptually evaluated once per customer row, which tends to perform poorly. A better approach is a regular query with a join, GROUP BY, and a HAVING clause that applies your condition to each group.
Since your condition is to return only customers that have fewer than 2 orders, a left join is more appropriate than an inner join, because it also returns customers that have no orders. Count a column from orders rather than COUNT(*) so that those customers get a count of 0; COUNT(*) would count their single NULL-extended row as 1.
select
c.cust_name, count(o.cust_id) as ords
from
customers c
left join orders o on c.cust_id = o.cust_id
group by c.cust_name
having count(o.cust_id) < 2
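With a LEFT JOIN, the choice between COUNT(*) and counting a column from the joined table changes the result for customers without orders. A minimal sketch, using Python's sqlite3 module rather than MySQL and invented sample rows:

```python
import sqlite3

# SQLite stands in for MySQL; rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, cust_name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, cust_id INTEGER);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO orders VALUES (1, 2), (2, 3), (3, 3), (4, 3);  -- Ann has no orders
""")

join_sql = """
SELECT c.cust_name, {expr} AS ords
FROM customers c
LEFT JOIN orders o ON c.cust_id = o.cust_id
GROUP BY c.cust_id, c.cust_name
{having}
ORDER BY c.cust_name
"""

# COUNT(*) counts the NULL-extended row, so Ann shows 1 instead of 0.
star_counts = conn.execute(join_sql.format(expr="COUNT(*)", having="")).fetchall()

# COUNT(o.cust_id) ignores NULLs, so Ann correctly shows 0.
col_counts = conn.execute(join_sql.format(expr="COUNT(o.cust_id)", having="")).fetchall()

# The filter from the question: customers with fewer than 2 orders.
few_orders = conn.execute(
    join_sql.format(expr="COUNT(o.cust_id)", having="HAVING COUNT(o.cust_id) < 2")
).fetchall()

print(star_counts, col_counts, few_orders)
```

Grouping on cust_id as well as cust_name is a choice made for this sketch; it keeps two customers who share a name from being merged into one group.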

Calculate Sum() but excluding JOIN

In the sales table, there is a point field. I want to sum(point) when grouping by sales.submit_date, but that won't add up correctly because the JOIN duplicates the sales records.
SELECT
COUNT(DISTINCT sales.sales_id) as TotalSales,
COUNT(DISTINCT sales_lines.id) as TotalLines
FROM `sales`
JOIN sales_lines ON sales_lines.sales_id = sales.sales_id
GROUP BY sales.submit_date
The SQL above counts the number of sales in the sales table and the number of lines in sales_lines (the lines matched by sales_lines.sales_id = sales.sales_id). This seems to work fine.
How do I sum(sales.point) over the sales table only?

You could aggregate sales_lines up to the sales grain.
SELECT
S.submit_date
,SUM(S.point)
,COUNT(S.sales_id) as TotalSales
,SUM(SL.SalesLines) as TotalLines
FROM
sales S
INNER JOIN
(Select
sales_id
,count(distinct id) as SalesLines
FROM
sales_lines
GROUP BY
sales_id) SL
ON S.sales_id = SL.sales_id
GROUP BY
S.submit_date
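To see why aggregating sales_lines first matters, the sketch below compares a direct join with the pre-aggregated join. It uses Python's sqlite3 module in place of MySQL, and the rows are invented for illustration.

```python
import sqlite3

# SQLite stands in for MySQL; rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (sales_id INTEGER PRIMARY KEY, submit_date TEXT, point INTEGER);
CREATE TABLE sales_lines (id INTEGER PRIMARY KEY, sales_id INTEGER);
INSERT INTO sales VALUES (1, '2020-01-01', 10), (2, '2020-01-01', 5), (3, '2020-01-02', 7);
INSERT INTO sales_lines (sales_id) VALUES (1), (1), (2), (2), (2), (3);
""")

# Direct join: each sale repeats once per line, so its points are added repeatedly.
naive = conn.execute("""
SELECT s.submit_date, SUM(s.point)
FROM sales s
JOIN sales_lines sl ON sl.sales_id = s.sales_id
GROUP BY s.submit_date
ORDER BY s.submit_date
""").fetchall()

# Pre-aggregated join: one row per sale, so SUM(point) is no longer inflated.
per_sale = conn.execute("""
SELECT s.submit_date, SUM(s.point), COUNT(s.sales_id), SUM(sl.line_count)
FROM sales s
JOIN (
    SELECT sales_id, COUNT(*) AS line_count
    FROM sales_lines
    GROUP BY sales_id
) sl ON sl.sales_id = s.sales_id
GROUP BY s.submit_date
ORDER BY s.submit_date
""").fetchall()

print(naive)
print(per_sale)
```

On 2020-01-01 the direct join reports 10*2 + 5*3 = 35 points, while the real total is 15. Note that the inner join still drops any sale that has no lines at all.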

How to get all order ID which not paid in SQL Server 2008?

I want to get all the order IDs for a selected customer that have not been paid yet.
What I want is a SELECT statement that answers this question:
select orderID
from order
where customer id = #custID
and Total cashmovementValue
for current order id
is less than total (sold quantity * salePrice )
for current order id
How to do it?
Thanks.

You need to compare the total of the order lines for each order with the total of the payments for that order. GROUP BY and a few sub-queries are what you need to get the job done.
Something like this should work:
SELECT
O.OrderID
FROM [Order] O
INNER JOIN (
-- Add up cost per order
SELECT
OrderID,
SUM(SoldQuantity * P.SalePrice) AS Total
FROM OrderLine
INNER JOIN Product P ON P.ProductID = OrderLine.ProductID
GROUP BY OrderID
) OL ON OL.OrderID = O.OrderID
LEFT JOIN (
-- Add up total amount paid per order
SELECT
OrderID,
SUM(CashMovementValue) AS Total
FROM CashMovement
GROUP BY OrderID
) C ON C.OrderID = O.OrderID
WHERE
O.CustomerID = @custID -- the customer parameter, written as #custID in the question
AND ( C.OrderID IS NULL OR C.Total < OL.Total )
EDIT
I've just noticed you're not storing the sale price on each order line. I've updated my answer accordingly, but leaving the price off the order line is a very bad idea. What will happen to your old orders if the price of an item changes? It is okay (and actually best practice) to denormalise the data by storing the price at the time of sale on each order line.
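The query can be exercised on a small data set. This is a sketch only: it runs on SQLite through Python's sqlite3 module rather than SQL Server 2008, the table layouts are assumed from the column names used in the answer, and the rows are invented.

```python
import sqlite3

# SQLite stands in for SQL Server; schemas are assumed, rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, SalePrice INTEGER);
CREATE TABLE [Order] (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
CREATE TABLE OrderLine (OrderID INTEGER, ProductID INTEGER, SoldQuantity INTEGER);
CREATE TABLE CashMovement (OrderID INTEGER, CashMovementValue INTEGER);
INSERT INTO Product VALUES (1, 10), (2, 5);
INSERT INTO [Order] VALUES (1, 7), (2, 7), (3, 7), (4, 8);
INSERT INTO OrderLine VALUES (1, 1, 2), (2, 1, 1), (2, 2, 2), (3, 2, 1), (4, 1, 1);
INSERT INTO CashMovement VALUES (1, 20), (2, 5), (2, 5);
""")

def unpaid_orders(customer_id):
    """Return the order IDs of one customer whose payments are below the order cost."""
    rows = conn.execute("""
    SELECT O.OrderID
    FROM [Order] O
    INNER JOIN (
        -- cost per order
        SELECT OL.OrderID, SUM(OL.SoldQuantity * P.SalePrice) AS Total
        FROM OrderLine OL
        INNER JOIN Product P ON P.ProductID = OL.ProductID
        GROUP BY OL.OrderID
    ) Cost ON Cost.OrderID = O.OrderID
    LEFT JOIN (
        -- amount paid per order; no row at all means nothing was paid
        SELECT OrderID, SUM(CashMovementValue) AS Total
        FROM CashMovement
        GROUP BY OrderID
    ) Paid ON Paid.OrderID = O.OrderID
    WHERE O.CustomerID = ?
      AND (Paid.OrderID IS NULL OR Paid.Total < Cost.Total)
    ORDER BY O.OrderID
    """, (customer_id,)).fetchall()
    return [order_id for (order_id,) in rows]

print(unpaid_orders(7))
```

For customer 7, order 1 is fully paid (20 of 20), order 2 is partly paid (10 of 20), and order 3 has no payments, so orders 2 and 3 are returned.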

Returning min and max order dates per Client

I have a table of clients and another table of their (many) orders. I want to return each client with min(order_date) and max(order_date), i.e. the dates of the first and last order. I've started with the following, but it returns the date of the very first order in the table (rather than the first order per client).
Thanks in advance.
SELECT dbo.job.job_no,
wo_begin_dt = ( SELECT MIN(dbo.work_order.wo_begin_dt)
FROM dbo.job LEFT OUTER JOIN dbo.work_order
ON dbo.job.job_no = dbo.work_order.job_no)
FROM dbo.job
ORDER BY dbo.job.job_no

Without knowing your table structure, you need something like:
SELECT ClientField, MIN(OrderDate), MAX(OrderDate)
FROM ClientTable C
INNER JOIN OrderTable O
ON O.ClientID = C.ClientID
GROUP BY ClientField
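A quick way to confirm the per-client grouping is to run it on sample data. The sketch below uses Python's sqlite3 module rather than SQL Server, keeps the placeholder names ClientTable, OrderTable, and ClientField from the answer, and uses invented rows.

```python
import sqlite3

# SQLite stands in for SQL Server; names are the answer's placeholders, rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ClientTable (ClientID INTEGER PRIMARY KEY, ClientField TEXT);
CREATE TABLE OrderTable (OrderID INTEGER PRIMARY KEY, ClientID INTEGER, OrderDate TEXT);
INSERT INTO ClientTable VALUES (1, 'Acme'), (2, 'Bolt'), (3, 'Crest');
INSERT INTO OrderTable (ClientID, OrderDate) VALUES
  (1, '2020-01-05'), (1, '2021-03-01'), (1, '2019-11-20'),
  (2, '2022-02-02');
""")

# GROUP BY makes MIN and MAX run once per client instead of once for the whole table.
first_last = conn.execute("""
SELECT C.ClientField, MIN(O.OrderDate), MAX(O.OrderDate)
FROM ClientTable C
INNER JOIN OrderTable O ON O.ClientID = C.ClientID
GROUP BY C.ClientField
ORDER BY C.ClientField
""").fetchall()

print(first_last)
```

Crest has no orders and is dropped by the INNER JOIN; a LEFT JOIN would keep that client with NULL dates.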
