MySql query code to fetch unique data from single ID - mysql

How do I list the CUSTNUMs and NAMEs of any customer who has only ordered chemical [NUMBER]?
ORDERS TABLE
+---------+--------+------------+------+
| CUSTNUM | CHEMNO | DATE | QTY |
+---------+--------+------------+------+
| 123456 | 1234 | 2000-00-00 | 35 |
+---------+--------+------------+------+
CUSTOMER TABLE
+---------+-----------+-----------+
| CUSTNUM | NAME | LOCATION |
+---------+-----------+-----------+
| 123456 | AmChem | New York |
+---------+-----------+-----------+

You could join CUSTOMER to the ORDERS rows for a particular <chemno>, and then to a subquery that keeps only the customers who have ordered exactly one distinct chemical:
SELECT DISTINCT
c.CUSTNUM, c.NAME
FROM
CUSTOMER c
INNER JOIN
ORDERS o ON o.CUSTNUM = c.CUSTNUM and o.CHEMNO = <chemno>
INNER JOIN
( SELECT
CUSTNUM
FROM
ORDERS
GROUP BY
CUSTNUM
HAVING
COUNT(DISTINCT CHEMNO) = 1 ) t ON t.CUSTNUM = o.CUSTNUM

I would approach this with a single join between the two tables, grouping by the CUSTNUM column of the ORDERS table and putting the required conditions in the HAVING clause, like this:
SELECT
o.CUSTNUM,
c.NAME
FROM
ORDERS AS o
INNER JOIN
CUSTOMER AS c ON c.CUSTNUM = o.CUSTNUM
GROUP BY
o.CUSTNUM, c.NAME
HAVING
( COUNT(DISTINCT o.CHEMNO) = 1 AND MIN(o.CHEMNO) = <some_chemno> )

OK, slow day...
SELECT DISTINCT x.custnum
FROM orders x
LEFT
JOIN orders y
ON y.custnum = x.custnum
AND y.chemno <> x.chemno
WHERE x.chemno = 9377
AND y.custnum IS NULL;
The rest of this task has been left as an exercise for the reader.
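To compare the HAVING approach and the anti-join approach side by side, here is a minimal sketch that runs both against an in-memory SQLite database as a stand-in for MySQL. The sample rows, the customers BetaLabs and GammaCo, and the function names are assumptions made up for illustration; only the table and column names come from the question.

```python
import sqlite3

# Hypothetical sample data; only the column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (custnum INTEGER PRIMARY KEY, name TEXT, location TEXT);
    CREATE TABLE orders (custnum INTEGER, chemno INTEGER, date TEXT, qty INTEGER);
    INSERT INTO customer VALUES
        (1, 'AmChem', 'New York'),
        (2, 'BetaLabs', 'Boston'),
        (3, 'GammaCo', 'Chicago');
    INSERT INTO orders VALUES
        (1, 1234, '2000-01-01', 35),
        (1, 1234, '2000-02-01', 10),  -- repeat order of the same chemical
        (2, 1234, '2000-01-05', 5),
        (2, 5678, '2000-01-06', 7),   -- this customer also ordered a second chemical
        (3, 5678, '2000-01-07', 2);
""")

def only_ordered_having(chemno):
    """Customers whose orders cover exactly one chemical, and it is chemno."""
    sql = """
        SELECT o.custnum, c.name
        FROM orders AS o
        INNER JOIN customer AS c ON c.custnum = o.custnum
        GROUP BY o.custnum, c.name
        HAVING COUNT(DISTINCT o.chemno) = 1 AND MIN(o.chemno) = ?
        ORDER BY o.custnum
    """
    return conn.execute(sql, (chemno,)).fetchall()

def only_ordered_antijoin(chemno):
    """Same question as an anti-join: no order row exists for another chemical."""
    sql = """
        SELECT DISTINCT x.custnum, c.name
        FROM orders AS x
        INNER JOIN customer AS c ON c.custnum = x.custnum
        LEFT JOIN orders AS y
               ON y.custnum = x.custnum AND y.chemno <> x.chemno
        WHERE x.chemno = ? AND y.custnum IS NULL
        ORDER BY x.custnum
    """
    return conn.execute(sql, (chemno,)).fetchall()

print(only_ordered_having(1234))
print(only_ordered_antijoin(1234))
```

Both functions should agree on any data set; the anti-join variant needs DISTINCT because a customer with repeat orders of the same chemical would otherwise appear once per order row.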

Related

How do I join multiple (four) tables using sql with conditions?

I am trying to create an SQL query that conditionally pulls data from multiple tables.
I have four tables:
orders
+------+------------+------------+
| id | date_added | currency |
+------+------------+------------+
| 1 | 2018-07-23 | 1 |
+------+------------+------------+
order_items
+------+------------+------------+---------------+---------------+
| id | order_id | price | product_id | product_type |
+------+------------+------------+---------------+---------------+
| 1 | 1 | 100.00 | 1 | ticket |
+------+------------+------------+---------------+---------------+
order_data
+------+--------------+---------------+
| id | order_id | ext_order_ref |
+------+--------------+---------------+
| 1 | 1 | ABC |
+------+--------------+---------------+
products
+------+------------+------------+
| id | date | product_id |
+------+------------+------------+
| 1 | 2020-03-12 | 1 |
+------+------------+------------+
| 2 | 2020-03-18 | 2 |
+------+------------+------------+
| 3 | 2020-03-20 | 3 |
+------+------------+------------+
I need to output orders with the following conditions:
Each order in a row with total (calculated from order items with matching order id)
The 'ext_order_ref' from the order_data table that matches that order
Only include order items that have a specific product type
Only include orders with products from a particular date range
Preferred output would look like this:
+------------+------------+--------------+
| order_id | total | ext_order_ref|
+------------+------------+--------------+
| 1 | 100 | ABC |
+------------+------------+--------------+
My current query is basically like this; please advise
SELECT
orders.id as order_id,
SUM(order_items.price) as total,
order_data.ext_order_ref
FROM orders
INNER JOIN order_data
ON orders.id = order_data.ext_order_ref
RIGHT JOIN order_items
ON orders.id = order_items.order_id
LEFT JOIN products
ON order_items.product_id = products.product_id
WHERE order_items.product_type = 'ticket' AND products.date BETWEEN '2020-03-12' AND '2020-03-18'
GROUP BY orders.id
It almost works, but not quite. The date filter in particular is causing issues.
Thanks in advance!
First decide on the driving table for the query and build the query around it.
The driving table is the one primary table that the other tables join to.
More information on driving tables is available from AskTom.
In your case, the driving table is the orders table. You are switching between RIGHT OUTER JOIN and LEFT OUTER JOIN, which makes the result set hard to reason about.
I have modified the query. See whether it works.
SELECT
orders.id as order_id,
SUM(order_items.price) as total,
order_data.ext_order_ref
FROM orders
INNER JOIN order_data
ON orders.id = order_data.order_id
LEFT OUTER JOIN order_items
ON orders.id = order_items.order_id
LEFT OUTER JOIN products
ON order_items.product_id = products.product_id
WHERE order_items.product_type = 'ticket' AND products.date BETWEEN '2020-03-12' AND '2020-03-18'
GROUP BY orders.id, order_data.ext_order_ref
I don't understand why you would want outer joins at all. If I follow the conditions correctly:
SELECT o.id as order_id,
SUM(oi.price) as total,
od.ext_order_ref
FROM orders o INNER JOIN
order_data od
ON o.id = od.order_id INNER JOIN
order_items oi
ON o.id = oi.order_id INNER JOIN
products p
ON oi.product_id = p.product_id
WHERE oi.product_type = 'ticket' AND
p.date >= '2020-03-12' AND
p.date < '2020-03-19'
GROUP BY o.id, od.ext_order_ref;
Note the use of table aliases so the query is easier to write and read.
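As a sanity check of the all-inner-join version, the following sketch loads the four tables into an in-memory SQLite database (a stand-in for MySQL) and runs the query with a half-open date range. The extra rows (order 2, the 'merch' item, reference 'DEF') and the function name `order_totals` are assumptions added for illustration; dates are stored as ISO-8601 text, which is what makes the string comparison valid here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, date_added TEXT, currency INTEGER);
    CREATE TABLE order_items (id INTEGER PRIMARY KEY, order_id INTEGER, price REAL,
                              product_id INTEGER, product_type TEXT);
    CREATE TABLE order_data (id INTEGER PRIMARY KEY, order_id INTEGER, ext_order_ref TEXT);
    CREATE TABLE products (id INTEGER PRIMARY KEY, date TEXT, product_id INTEGER);

    INSERT INTO orders VALUES (1, '2018-07-23', 1), (2, '2018-07-24', 1);
    INSERT INTO order_data VALUES (1, 1, 'ABC'), (2, 2, 'DEF');
    INSERT INTO products VALUES
        (1, '2020-03-12', 1), (2, '2020-03-18', 2), (3, '2020-03-20', 3);
    INSERT INTO order_items VALUES
        (1, 1, 100.0, 1, 'ticket'),
        (2, 1, 50.0, 2, 'ticket'),
        (3, 1, 20.0, 2, 'merch'),    -- wrong product type, excluded
        (4, 2, 70.0, 3, 'ticket');   -- product date outside the range, excluded
""")

def order_totals(product_type, date_from, date_to_exclusive):
    """One row per order: total of matching items plus the external reference."""
    sql = """
        SELECT o.id AS order_id, SUM(oi.price) AS total, od.ext_order_ref
        FROM orders AS o
        INNER JOIN order_data AS od ON od.order_id = o.id
        INNER JOIN order_items AS oi ON oi.order_id = o.id
        INNER JOIN products AS p ON p.product_id = oi.product_id
        WHERE oi.product_type = ?
          AND p.date >= ? AND p.date < ?
        GROUP BY o.id, od.ext_order_ref
        ORDER BY o.id
    """
    return conn.execute(sql, (product_type, date_from, date_to_exclusive)).fetchall()

print(order_totals("ticket", "2020-03-12", "2020-03-19"))
```

Order 2 drops out entirely because its only item refers to a product dated outside the range, which is the behaviour an inner join gives and the question appears to ask for.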

Aggregate values under multiple conditions

Given the following tables
+-----------+-------------+----------+
| tours | tour_user | tag_tour |
+-----------+-------------+----------+
| id | tour_id | tag_id |
| startdate | user_id | tour_id |
+-----------+-------------+----------+
I want to achieve a result set like this:
+-----------------+----------------+----------------+
| DATE(startdate) | COUNT(user_id) | COUNT(tour_id) |
+-----------------+----------------+----------------+
| 2017-12-01 | 55 | 32 |
+-----------------+----------------+----------------+
In words: the number of users who participated in tours and the number of tours should be aggregated by day.
Additionally, the tour and user counts should be filterable via tags, which are attached to tours through the tag_tour table (a many-to-many relation).
E.g. I want only the tour and user counts for tours that have both tag_id 1 AND 2 attached.
Currently I use this query:
SELECT DATE(tours.start) AS acum_date,
COUNT(tour_user.user_id) AS guide_assignments,
A.tour_count
FROM `tour_user`
LEFT JOIN `tours` ON `tours`.`id` = `tour_user`.`tour_id`
LEFT JOIN
(SELECT DATE(tours.start) AS tour_date,
COUNT(DISTINCT tours.id) AS tour_count
FROM tours
GROUP BY DATE(tours.start)) AS A ON `A`.`tour_date` = DATE(tours.start)
GROUP BY `acum_date`
ORDER BY `acum_date` ASC
The problem with this is that only the total tour/user count is returned, not the filtered one.
The base query is:
select date(t.startdate) as tour_date, count(tu.user_id) as num_users, count(distinct t.id) as num_tours
from tours t left join
tour_user tu
on tu.tour_id = t.id
group by date(t.startdate);
In this case, I would recommend using exists for filtering the tags:
select date(t.startdate) as tour_date, count(tu.user_id) as num_users, count(distinct t.id) as num_tours
from tours t left join
tour_user tu
on tu.tour_id = t.id
where exists (select 1 from tag_tour tt where tt.tour_id = t.id and tt.tag_id = <tag1>) and
exists (select 1 from tag_tour tt where tt.tour_id = t.id and tt.tag_id = <tag2>)
group by date(t.startdate);
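The EXISTS filter generalises to any number of required tags by emitting one EXISTS clause per tag. A rough sketch, run against in-memory SQLite rather than MySQL, with made-up sample rows and a hypothetical helper name `daily_counts`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tours (id INTEGER PRIMARY KEY, startdate TEXT);
    CREATE TABLE tour_user (tour_id INTEGER, user_id INTEGER);
    CREATE TABLE tag_tour (tag_id INTEGER, tour_id INTEGER);

    INSERT INTO tours VALUES
        (1, '2017-12-01 09:00:00'),
        (2, '2017-12-01 14:00:00'),
        (3, '2017-12-02 10:00:00');
    INSERT INTO tour_user VALUES (1, 10), (1, 11), (2, 12), (3, 13);
    -- (tag_id, tour_id): tours 1 and 3 carry tags 1 and 2, tour 2 only tag 1
    INSERT INTO tag_tour VALUES (1, 1), (2, 1), (1, 2), (1, 3), (2, 3);
""")

def daily_counts(required_tags):
    """Per-day user and tour counts for tours carrying ALL of required_tags."""
    # One EXISTS clause per tag; tag ids are bound as parameters, not interpolated.
    exists = " AND ".join(
        "EXISTS (SELECT 1 FROM tag_tour AS tt"
        " WHERE tt.tour_id = t.id AND tt.tag_id = ?)"
        for _ in required_tags
    )
    where = f"WHERE {exists}" if required_tags else ""
    sql = f"""
        SELECT DATE(t.startdate) AS day,
               COUNT(tu.user_id) AS num_users,
               COUNT(DISTINCT t.id) AS num_tours
        FROM tours AS t
        LEFT JOIN tour_user AS tu ON tu.tour_id = t.id
        {where}
        GROUP BY DATE(t.startdate)
        ORDER BY day
    """
    return conn.execute(sql, list(required_tags)).fetchall()

print(daily_counts([1, 2]))
print(daily_counts([]))
```

Because the tag check lives in EXISTS rather than in a join, it does not multiply rows, so COUNT(tu.user_id) stays an honest count of participations.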

Get all rows from table - with the latest row from another table, with another table based on the latest row

I need to get all the details from the orders table, with the latest status ID in the orders statuses table, and then the name of that status from the states table.
orders
id | customer | product
-----------------------
1 | David | Cardboard Box
order_to_statuses
id | order | status | updated_at
--------------------------------
1 | 1 | 1 | 2017-05-30 00:00:00
2 | 1 | 3 | 2017-05-28 00:00:00
3 | 1 | 4 | 2017-05-29 00:00:00
4 | 1 | 2 | 2017-05-26 00:00:00
5 | 1 | 5 | 2017-05-05 00:00:00
order_states
id | name
---------
1 | Pending
2 | Paid
3 | Shipped
4 | Refunded
In this instance, I would need to get the customer and product, with the latest status ID from the order statuses table, and then the name of that state.
How can I do this?
I'd break this down by first getting the max(updated_at) for each order, then working outward to everything else you need. You can get the max date for each order by using a subquery:
select
s.`order`,
s.`status`,
s.updated_at
from order_to_statuses s
inner join
(
select
`order`,
max(updated_at) as updated_at
from order_to_statuses
group by `order`
) m
on s.`order` = m.`order`
and s.updated_at = m.updated_at
Once you get this you now have the order, the status id, and the most recent date. Using this you can then JOIN to the other tables, making your full query:
select
o.customer,
o.product,
ots.updated_at,
os.name
from orders o
inner join
(
select
s.`order`,
s.`status`,
s.updated_at
from order_to_statuses s
inner join
(
select
`order`,
max(updated_at) as updated_at
from order_to_statuses
group by `order`
) m
on s.`order` = m.`order`
and s.updated_at = m.updated_at
) ots
on o.Id = ots.`order`
inner join order_states os
on ots.`status` = os.id;
It may contain a typo, but the idea of the query is something like this:
select orders.id, orders.customer, orders.product,
order_to_statuses.status, order_states.name
from orders, order_to_statuses, order_states
where orders.id = order_to_statuses.`order`
and order_to_statuses.status = order_states.id
and order_to_statuses.updated_at in (
SELECT MAX(order_to_statuses.updated_at)
FROM order_to_statuses
where order_to_statuses.`order` = orders.id
group by order_to_statuses.`order`
);
I usually don't use joins, but with joins it should look like this:
select orders.id, orders.customer, orders.product,
order_to_statuses.status, order_states.name
from orders
JOIN order_to_statuses ON orders.id = order_to_statuses.`order`
JOIN order_states ON order_to_statuses.status = order_states.id
where
order_to_statuses.updated_at in (
SELECT MAX(order_to_statuses.updated_at)
FROM order_to_statuses
where order_to_statuses.`order` = orders.id
group by order_to_statuses.`order`
);
Note I added a group by I had missed.
EDIT 2
I had an error in the subquery condition.
changed to where order_to_status.order = orders.id
also moved the group by after the where clause.
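The "latest row per group" pattern from the answers above can be tried out with the sketch below. It uses in-memory SQLite in place of MySQL (SQLite also accepts the backtick quoting needed for the reserved word `order`). The rows for order 1 are taken from the question; order 2 ('Erin', 'Bubble Wrap') and the function name `latest_status_per_order` are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, product TEXT);
    CREATE TABLE order_to_statuses (id INTEGER PRIMARY KEY, `order` INTEGER,
                                    status INTEGER, updated_at TEXT);
    CREATE TABLE order_states (id INTEGER PRIMARY KEY, name TEXT);

    INSERT INTO orders VALUES (1, 'David', 'Cardboard Box'), (2, 'Erin', 'Bubble Wrap');
    INSERT INTO order_states VALUES
        (1, 'Pending'), (2, 'Paid'), (3, 'Shipped'), (4, 'Refunded');
    INSERT INTO order_to_statuses VALUES
        (1, 1, 1, '2017-05-30 00:00:00'),
        (2, 1, 3, '2017-05-28 00:00:00'),
        (3, 1, 4, '2017-05-29 00:00:00'),
        (4, 1, 2, '2017-05-26 00:00:00'),
        (5, 1, 5, '2017-05-05 00:00:00'),
        (6, 2, 2, '2017-06-01 00:00:00'),
        (7, 2, 3, '2017-06-03 00:00:00');
""")

def latest_status_per_order():
    """Each order with the name and timestamp of its most recent status."""
    sql = """
        SELECT o.id, o.customer, o.product, os.name, s.updated_at
        FROM orders AS o
        INNER JOIN order_to_statuses AS s ON s.`order` = o.id
        INNER JOIN (
            -- newest timestamp per order
            SELECT `order`, MAX(updated_at) AS updated_at
            FROM order_to_statuses
            GROUP BY `order`
        ) AS m ON m.`order` = s.`order` AND m.updated_at = s.updated_at
        INNER JOIN order_states AS os ON os.id = s.status
        ORDER BY o.id
    """
    return conn.execute(sql).fetchall()

for row in latest_status_per_order():
    print(row)
```

One caveat of this pattern: if two status rows for the same order share the same maximum updated_at, both are returned, so a tie-breaker (for example the highest status-row id) would be needed in that case.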

How to select the SUM of the multiplication of two different table fields specifying the value of other two fields?

Based on this table schema:
products
+----+------+-------+--------+--------------+-------+-------+------+-------+
| Id | Name | Price | Detail | Product_type | Image | Color | Size | Stock |
+----+------+-------+--------+--------------+-------+-------+------+-------+
order_details
+----+------------+--------+------+-------+----------+
| Id | Product_id | Amount | Size | Color | Order_id |
+----+------------+--------+------+-------+----------+
orders
+----+-----------+------------+----------+
| Id | Client_id | Date_start | Date_end |
+----+-----------+------------+----------+
How can I select the SUM() (if this function is even necessary) of products.Price * order_details.Amount, specifying the client and the order id?
I've tried with this query, among others:
SELECT SUM((SELECT pr.Price FROM products pr WHERE pr.Id = od.Product_id) * od.Amount) AS Total
FROM order_details od
WHERE (SELECT o.Client_id FROM orders o WHERE o.Id = $order) = $client
But it's returning a wrong result and I can't figure out how to do it. Also please note I want to use subqueries.
Thanks.
Don't use a subselect, use a join:
SELECT orders.Id, SUM(products.Price * order_details.Amount) AS Total
FROM orders
LEFT JOIN order_details ON orders.Id = order_details.Order_id
LEFT JOIN products ON products.Id = order_details.Product_id
WHERE orders.Id = $order AND orders.Client_id = $client
GROUP BY orders.Id
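For comparison, here is a small sketch that computes the order total both with joins and with subqueries, since the question asked for the latter. It runs on in-memory SQLite instead of MySQL, trims the tables to the columns involved, and binds the order and client as parameters rather than interpolating `$order` and `$client`. The sample rows and function names are assumptions. The likely problem with the query in the question is that its WHERE clause never ties `order_details` rows to the requested order, so every detail row is summed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Only the columns needed for the total are modelled here.
    CREATE TABLE products (Id INTEGER PRIMARY KEY, Name TEXT, Price REAL);
    CREATE TABLE orders (Id INTEGER PRIMARY KEY, Client_id INTEGER);
    CREATE TABLE order_details (Id INTEGER PRIMARY KEY, Product_id INTEGER,
                                Amount INTEGER, Order_id INTEGER);

    INSERT INTO products VALUES (1, 'A', 10.0), (2, 'B', 2.5);
    INSERT INTO orders VALUES (1, 7), (2, 8);
    INSERT INTO order_details VALUES (1, 1, 2, 1), (2, 2, 4, 1), (3, 1, 1, 2);
""")

def total_with_joins(order_id, client_id):
    """SUM(Price * Amount) for one order of one client, using joins."""
    sql = """
        SELECT SUM(p.Price * od.Amount)
        FROM orders AS o
        INNER JOIN order_details AS od ON od.Order_id = o.Id
        INNER JOIN products AS p ON p.Id = od.Product_id
        WHERE o.Id = ? AND o.Client_id = ?
    """
    return conn.execute(sql, (order_id, client_id)).fetchone()[0]

def total_with_subqueries(order_id, client_id):
    """Same total with scalar subqueries; note the explicit od.Order_id filter."""
    sql = """
        SELECT SUM((SELECT pr.Price FROM products AS pr
                    WHERE pr.Id = od.Product_id) * od.Amount)
        FROM order_details AS od
        WHERE od.Order_id = ?
          AND (SELECT o.Client_id FROM orders AS o WHERE o.Id = od.Order_id) = ?
    """
    return conn.execute(sql, (order_id, client_id)).fetchone()[0]

print(total_with_joins(1, 7), total_with_subqueries(1, 7))
```

Both functions return None (SQL NULL) when the order does not belong to the given client, because SUM over zero rows is NULL.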

Making large SQL query efficient

I'm stuck on a rather complex query.
I'm looking to write a query that shows the "top five customers" as well as some key metrics (counts with conditions) about each of those customers. Each of the different metrics uses a totally different join structure.
customer              metricn                 metricn_lineitem
+----+-----------+    +----+-------------+    +----+------------+
| id | Name      |    | id | customer_id |    | id | metricn_id |
+----+-----------+    +----+-------------+    +----+------------+
| 1  | Customer1 |    | 1  | 1           |    | 1  | 1          |
| 2  | Customer2 |    | 2  | 2           |    | 2  | 1          |
+----+-----------+    +----+-------------+    +----+------------+
The issue is that I always want to group by this customer table.
I first tried to put all of my joins into the original query, but its performance was abysmal. I then tried using subqueries, but I couldn't get them to group by the original customer id.
Here's a sample query
SELECT
customer.name,
(SELECT COUNT(metric1_lineitem.id)
FROM metric1 INNER JOIN metric1_lineitem
ON metric1_lineitem.metric1_id = metric1.id
WHERE metric1.customer_id = customer_id
) as metric_1,
(SELECT COUNT(metric2_lineitem.id)
FROM metric2 INNER JOIN metric2_lineitem
ON metric2_lineitem.metric2_id = metric2.id
WHERE metric2.customer_id = customer_id
) as metric_2
FROM customer
GROUP BY customer.name
SORT BY COUNT(metric1.id) DESC
LIMIT 5
Any advice? Thanks!
SELECT name, metric_1, metric_2
FROM customer AS c
LEFT JOIN (SELECT customer_id, COUNT(*) AS metric_1
FROM metric1 AS m
INNER JOIN metric1_lineitem AS l ON m.id = l.metric1_id
GROUP BY customer_id) m1
ON m1.customer_id = c.id
LEFT JOIN (SELECT customer_id, COUNT(*) AS metric_2
FROM metric2 AS m
INNER JOIN metric2_lineitem AS l ON m.id = l.metric2_id
GROUP BY customer_id) m2
ON m2.customer_id = c.id
ORDER BY metric_1 DESC
LIMIT 5
You should also avoid using COUNT(columnname) when you can use COUNT(*) instead. The former has to test every value to see if it's null.
Although your data structure may be lousy, your query may not be so bad, with two exceptions. I don't think you need the aggregation on the outer level. Also, the "correlation"s in the where clause (such as metric1.customer_id = customer_id) are not doing anything, because customer_id is coming from the local tables. You need metric1.customer_id = c.id:
SELECT c.name,
(SELECT COUNT(metric1_lineitem.id)
FROM metric1 INNER JOIN
metric1_lineitem
ON metric1_lineitem.metric1_id = metric1.id
WHERE metric1.customer_id = c.id
) as metric_1,
(SELECT COUNT(metric2_lineitem.id)
FROM metric2 INNER JOIN
metric2_lineitem
ON metric2_lineitem.metric2_id = metric2.id
WHERE metric2.customer_id = c.id
) as metric_2
FROM customer c
ORDER BY metric_1 DESC
LIMIT 5;
How can you make this run faster? One way is to introduce indexes. I would recommend metric1(customer_id), metric2(customer_id), metric1_lineitem(metric1_id) and metric2_lineitem(metric2_id).
This may be faster than the aggregation method (proposed by Barmar) because aggregating over joined derived tables can be expensive in MySQL. With these indexes, the counts can be computed from the indexes alone instead of the base tables.
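To check that the two approaches (correlated subqueries versus pre-aggregated derived tables) return the same ranking, the sketch below builds the schema, the recommended indexes, and a few made-up rows in in-memory SQLite. It says nothing about MySQL performance; the index names, the sample data, the COALESCE for customers without line items, and the tie-break on name are all assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE metric1 (id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE metric1_lineitem (id INTEGER PRIMARY KEY, metric1_id INTEGER);
    CREATE TABLE metric2 (id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE metric2_lineitem (id INTEGER PRIMARY KEY, metric2_id INTEGER);

    -- The indexes recommended above (index names are hypothetical).
    CREATE INDEX idx_metric1_customer ON metric1 (customer_id);
    CREATE INDEX idx_metric2_customer ON metric2 (customer_id);
    CREATE INDEX idx_m1_line_metric ON metric1_lineitem (metric1_id);
    CREATE INDEX idx_m2_line_metric ON metric2_lineitem (metric2_id);

    INSERT INTO customer VALUES (1, 'Customer1'), (2, 'Customer2'), (3, 'Customer3');
    INSERT INTO metric1 VALUES (1, 1), (2, 2);
    INSERT INTO metric1_lineitem VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO metric2 VALUES (1, 2);
    INSERT INTO metric2_lineitem VALUES (1, 1), (2, 1), (3, 1);
""")

# Correlated scalar subqueries, one per metric, evaluated per customer row.
CORRELATED = """
    SELECT c.name,
           (SELECT COUNT(*) FROM metric1 AS m
              INNER JOIN metric1_lineitem AS l ON l.metric1_id = m.id
             WHERE m.customer_id = c.id) AS metric_1,
           (SELECT COUNT(*) FROM metric2 AS m
              INNER JOIN metric2_lineitem AS l ON l.metric2_id = m.id
             WHERE m.customer_id = c.id) AS metric_2
    FROM customer AS c
    ORDER BY metric_1 DESC, c.name
    LIMIT ?
"""

# Each metric aggregated once in a derived table, then joined to customer.
DERIVED = """
    SELECT c.name, COALESCE(m1.n, 0) AS metric_1, COALESCE(m2.n, 0) AS metric_2
    FROM customer AS c
    LEFT JOIN (SELECT m.customer_id, COUNT(*) AS n
               FROM metric1 AS m
               INNER JOIN metric1_lineitem AS l ON l.metric1_id = m.id
               GROUP BY m.customer_id) AS m1 ON m1.customer_id = c.id
    LEFT JOIN (SELECT m.customer_id, COUNT(*) AS n
               FROM metric2 AS m
               INNER JOIN metric2_lineitem AS l ON l.metric2_id = m.id
               GROUP BY m.customer_id) AS m2 ON m2.customer_id = c.id
    ORDER BY metric_1 DESC, c.name
    LIMIT ?
"""

def top_customers(sql, limit=5):
    return conn.execute(sql, (limit,)).fetchall()

print(top_customers(CORRELATED))
print(top_customers(DERIVED))
```

Which variant is faster on a real MySQL server depends on table sizes and the optimizer, so comparing both with EXPLAIN on production-like data is the safer way to decide.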