sql SELECT query for 3 tables - mysql

I have 3 tables:
1. products(product_id,name)
2. orders(id,order_id,product_id)
3. factors(id,order_id,date)
I want to retrieve the product names (products.name) whose orders share an order_id with a factor on a given date, using the last two tables.
I use this query for this purpose:
select products.name
from products
WHERE products.product_id IN
(
SELECT distinct orders.product_id FROM orders WHERE
order_id IN (select order_id FROM factors WHERE
factors.datex ='2017-04-29') GROUP BY product_id
)
But I get no result. Where is my mistake, and how can I resolve it? Thanks.

Your query should be fine. I am rewriting it to make a few changes to the structure, but not the logic (this makes it easier for me to understand the query):
select p.name
from products p
where p.product_id in (select o.product_id
from orders o
where o.order_id in (select f.order_id
from factors f
where f.datex = '2017-04-29'
)
) ;
Notes on the changes:
When using multiple tables in a query, always qualify the column names.
Use table aliases. They make queries easier to write and to read.
SELECT DISTINCT and GROUP BY are unnecessary in IN subqueries. The logic of IN already handles (i.e. ignores) duplicates. And by explicitly including the operations, you run the risk of a less efficient query plan.
Why might your query not work?
factors.datex has a time component. If so, then this condition will work: date(f.datex) = '2017-04-29'.
There are no factors on that date.
There are no orders that match factors on that date.
There are no products in the orders that match the factors on that date.
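The time-component case can be reproduced with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example runs anywhere), the column is named date as in the question's schema, and the sample rows are invented for illustration.

```python
import sqlite3

# SQLite stands in for MySQL; table and column names follow the question,
# the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders   (id INTEGER PRIMARY KEY, order_id INTEGER, product_id INTEGER);
CREATE TABLE factors  (id INTEGER PRIMARY KEY, order_id INTEGER, date TEXT);
INSERT INTO products VALUES (1, 'pen'), (2, 'ink'), (3, 'pad');
INSERT INTO orders   VALUES (1, 100, 1), (2, 100, 2), (3, 200, 3);
INSERT INTO factors  VALUES (1, 100, '2017-04-29 10:15:00'),
                            (2, 200, '2017-04-30 09:00:00');
""")

QUERY = """
SELECT p.name
FROM products p
WHERE p.product_id IN (SELECT o.product_id
                       FROM orders o
                       WHERE o.order_id IN (SELECT f.order_id
                                            FROM factors f
                                            WHERE {predicate}))
ORDER BY p.name
"""

def product_names(predicate):
    """Run the nested IN query with the given filter on factors."""
    return [row[0] for row in conn.execute(QUERY.format(predicate=predicate))]

# Comparing the full timestamp with a bare date finds nothing.
plain_match = product_names("f.date = '2017-04-29'")
# Stripping the time component first finds the products of order 100.
date_match = product_names("date(f.date) = '2017-04-29'")

print(plain_match)
print(date_match)
```

If the first list is empty and the second is not, the stored values carry a time component and the date() wrapper (or a range condition) is needed.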

In factors table column name is date so it should be -
factors.date ='2017-04-29'
You have written -
factors.datex ='2017-04-29'

Related

Using join and group with sum

I am fairly new to MySQL and have this theoretical problem given to me. I am given these tables
customers
---------------
id
name
country
order_date
orders
---------------
id
order_number
order_type
customers_order_details
---------------
id
customer_id
order_id
price
A customer can have multiple different orders. I need to retrieve the customers with the largest total price spent, where the total price must be at least 100. Is my approach correct?
SELECT c.id, c.name AS customer_name, c.country , SUM(d.price) AS total_price
FROM customers c
JOIN customers_order_details d
ON c.id = d.customer_id
GROUP BY customer_name
HAVING total_price >= 100
ORDER BY total_price DESC;
I ask because I am not sure: I was told that GROUP BY needs to list all the selected columns, but I feel that using the name alone is more than adequate.
It looks almost correct.
Grouping by only customers.name isn't right, though. Apart from the fact that this will throw an error on more strictly configured MySQL servers, on newer versions, and on DBMS from other vendors, what happens if there are two or more different customers with the same name, say some "John Smith"s? They are all aggregated into the same group, giving false figures!
The safest bet is to group by all columns that are not an argument to an aggregation function. That would be customers.id, customers.name and customers.country in this case. In some DBMS you can also group by just a set of columns on which all the other non-aggregated columns depend. If customers.id is declared as the primary key, it fulfills that rule and you could group by it alone. But I'm not really sure whether MySQL implements that shortcut, or in which versions or configurations, so you had better go with all the columns here.
Side note: The schema design is a little odd. Why are the order details linked directly to customers, rather than the orders themselves being linked to the customers? As it is now, an order can have multiple details belonging to different customers. That may be right in your use case, but it's not what one would usually expect. Maybe you should revise that.
Your code looks quite fine. I would just recommend aggregating by the primary key of the customer table rather than by the name:
SELECT c.id, c.name AS customer_name, c.country , SUM(d.price) AS total_price
FROM customers c
JOIN customers_order_details d ON c.id = d.customer_id
GROUP BY c.id
HAVING SUM(d.price) >= 100
ORDER BY total_price DESC;
This makes the code a valid aggregation query; all non-aggregated columns in the select clause are functionally dependent on the column in the group by clause.
As a side note: using column aliases in the HAVING clause is a MySQL extension to the SQL standard. You can use that feature, or phrase the HAVING clause in pure ANSI SQL, repeating the aggregate expression.
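The same-name problem can be seen in a small sketch. It runs the two groupings through Python's sqlite3 module as a stand-in for MySQL; the rows are invented, the order_date column is left out for brevity, and prices are integers to keep the output simple.

```python
import sqlite3

# SQLite stands in for MySQL; the two "John Smith" rows are made-up data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
CREATE TABLE customers_order_details (
    id INTEGER PRIMARY KEY, customer_id INTEGER, order_id INTEGER, price INTEGER);
INSERT INTO customers VALUES
    (1, 'John Smith', 'US'), (2, 'John Smith', 'UK'), (3, 'Ann Lee', 'CA');
INSERT INTO customers_order_details VALUES
    (1, 1, 10, 70), (2, 1, 11, 50), (3, 2, 12, 30), (4, 3, 13, 200);
""")

# Grouping by name merges both John Smiths into one total.
by_name = conn.execute("""
SELECT c.name, SUM(d.price) AS total_price
FROM customers c
JOIN customers_order_details d ON c.id = d.customer_id
GROUP BY c.name
HAVING SUM(d.price) >= 100
ORDER BY total_price DESC
""").fetchall()

# Grouping by every non-aggregated column keeps the customers apart.
by_id = conn.execute("""
SELECT c.id, c.name, c.country, SUM(d.price) AS total_price
FROM customers c
JOIN customers_order_details d ON c.id = d.customer_id
GROUP BY c.id, c.name, c.country
HAVING SUM(d.price) >= 100
ORDER BY total_price DESC
""").fetchall()

print(by_name)
print(by_id)
```

With this data the name grouping reports a single John Smith with 150, while the id grouping reports only customer 1 with 120 and drops customer 2, whose total of 30 is below the threshold.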

How can I improve this inner join query?

My database has 3 tables. One is called Customer, one is called Orders, and one is called RMA. The RMA table has the info regarding returns. I'll include a screen shot of all 3 so you can see the appropriate attributes. This is the code of the query I'm working on:
SELECT State, SKU, count(*)
from Orders INNER JOIN Customer ON Orders.Customer_ID = Customer.CustomerID
INNER JOIN RMA ON Orders.Order_ID = RMA.Reason
Group by SKU
Order by SKU
LIMIT 10;
I'm trying to get how much of each product(SKU) is returned in each state(State). Any help would really be appreciated. I'm not sure why, but anytime I include a JOIN statement, my query takes anywhere from 5 minutes to 20 minutes to process.
[Screenshots of the Orders, Customer, and RMA tables]
Your query should look like this:
SELECT c.State, o.SKU, COUNT(*)
FROM Orders o INNER JOIN
Customer c
ON o.Customer_ID = c.CustomerID JOIN
RMA
ON o.Order_ID = RMA.Order_Id
GROUP BY c.State, o.SKU
ORDER BY SKU;
Your issue is probably the incorrect JOIN condition between Orders and RMA.
If you have primary keys properly declared on the tables, then this query should have good-enough performance.
Given that you are joining with an Orders table, I'm going to assume this table contains all the orders the company has ever taken. It can be quite large, which would likely cause the slowness you are seeing.
You can likely improve this query by placing some constraint on the Orders you select; restricting the date range is a common way to do this. If you provide more information about what the query is for and how large the dataset is, everyone will be able to give better guidance on which filters would work best.
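As a rough illustration of the corrected join, the sketch below counts returns per state and SKU using Python's sqlite3 module in place of MySQL. The column lists are assumptions, since the screenshots are not available: only the columns named in the queries are modeled, RMA_ID and the index name rma_order_idx are hypothetical, and the rows are invented.

```python
import sqlite3

# SQLite stands in for MySQL; schema details beyond the columns used in the
# queries (RMA_ID, the index name) are assumptions for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, State TEXT);
CREATE TABLE Orders   (Order_ID INTEGER PRIMARY KEY, Customer_ID INTEGER, SKU TEXT);
CREATE TABLE RMA      (RMA_ID INTEGER PRIMARY KEY, Order_ID INTEGER, Reason TEXT);
-- An index on the join column lets the engine look up returns by order.
CREATE INDEX rma_order_idx ON RMA (Order_ID);
INSERT INTO Customer VALUES (1, 'NY'), (2, 'CA');
INSERT INTO Orders VALUES (1, 1, 'A'), (2, 1, 'A'), (3, 2, 'A'), (4, 2, 'B');
INSERT INTO RMA VALUES (1, 1, 'damaged'), (2, 2, 'wrong item'), (3, 4, 'damaged');
""")

# Join RMA on the order id (not on Reason) and group by both output columns.
returns_by_state = conn.execute("""
SELECT c.State, o.SKU, COUNT(*) AS returned
FROM Orders o
JOIN Customer c ON o.Customer_ID = c.CustomerID
JOIN RMA r      ON o.Order_ID = r.Order_ID
GROUP BY c.State, o.SKU
ORDER BY o.SKU, c.State
""").fetchall()

print(returns_by_state)
```

Order 3 has no RMA row, so it does not contribute to any count.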

Join query result with itself in MySQL

Let's say I have a query:
select product_id, price, price_day
from products
where price>10
and I want to join the result of this query with itself (if for example I want to get in the same row product's price and the price in previous day)
I can do this:
select * from
(
select product_id, price, price_day
from products
where price>10
) as r1
join
(
select product_id, price, price_day
from products
where price>10
) as r2
on r1.product_id=r2.product_id and r1.price_day=r2.price_day-1
but as you can see I am copying the original query, naming it a different name just to join its result with itself.
Another option is to create a temp table but then I have to remember to remove it.
Is there a more elegant way to join the result of a query with itself?
A self join will help:
select a.product_ID,a.price
,a.price_day
,b.price as prevdayprice
,b.price_day as prevday
from products a
inner join products b
on a.product_ID=b.product_ID and a.price_day = b.price_day+1
where a.price > 10 and b.price > 10
You could do a number of things. A few options:
Just let MySQL handle the optimization. This will likely work fine until you hit many rows.
Make a view for your base query and use that. This could increase performance, but mostly it increases readability (if done right).
Use a (non-temporary) table and insert your initial rows there; unfortunately, you cannot refer to a temporary table more than once in a query. This will likely be more expensive performance-wise until a certain number of rows is reached.
Which choice is "best" depends on how important performance is in your situation and how many rows you need to work with.
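The view option might look like the following sketch, run here through Python's sqlite3 module as a stand-in for MySQL. The view name expensive_products is hypothetical, price_day is assumed to be an integer day number (as the question's price_day-1 suggests), and the rows are invented.

```python
import sqlite3

# SQLite stands in for MySQL; expensive_products is a made-up view name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INTEGER, price INTEGER, price_day INTEGER);
INSERT INTO products VALUES
    (1, 12, 1), (1, 15, 2), (1, 8, 3),
    (2, 20, 1), (2, 25, 3);
-- The base query is written once, as a view.
CREATE VIEW expensive_products AS
    SELECT product_id, price, price_day
    FROM products
    WHERE price > 10;
""")

# The view is joined with itself: each row is paired with the previous day.
day_pairs = conn.execute("""
SELECT a.product_id, a.price_day, a.price, b.price AS prev_day_price
FROM expensive_products a
JOIN expensive_products b
  ON a.product_id = b.product_id
 AND a.price_day = b.price_day + 1
ORDER BY a.product_id, a.price_day
""").fetchall()

print(day_pairs)
```

Product 1 on day 3 is filtered out by the view (price 8), and product 2 has no row for day 2, so only one pair remains.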
Just to get duplicates in the same row?
select product_id as id1, price as price1, price_day as priceday1, product_id as id2, price as price2, price_day as priceday2
from products
where price>10

sql nested query with group by

I was reading some tutorials about the GROUP BY clause and came across the following problem. I don't know why it was solved this way. The table is as follows:
The requirement is to select the most expensive product in each category, and the following query was the answer:
SELECT
categoryID, productID, productName, MAX(unitprice)
FROM
products A
WHERE
unitprice = (
SELECT
MAX(unitprice)
FROM
products B
WHERE
B.categoryId = A.categoryID)
GROUP BY categoryID;
I don't know why the above query was the answer. Why wasn't it just:
SELECT
categoryID, productID, productName, MAX(unitprice)
FROM
products
GROUP BY categoryID;
Also, if the first query is the right one, why does the MAX function appear in both the outer and the inner query? Isn't it enough to have it in the inner query?
Thanks.
The second query will produce an error because it is not possible to have columns in the SELECT clause without grouping by them in the GROUP BY clause (unless they are subject to the aggregation).
Therefore you need to first find the highest unit price in each category and then find which product has that unit price. You can actually accomplish this in many ways. The first query is one of them.
From your picture it looks, as others have mentioned, like you are using MySQL. The MySQL optimiser doesn't like subqueries very much, and it would be horrible to run this over lots of data. The best habit is to use joins where possible (if you look at query plans in Postgres, Oracle or MSSQL, they rewrite subqueries as joins 90% of the time).
The second query will run on MySQL with its older default settings (ONLY_FULL_GROUP_BY disabled), but the columns missing from the GROUP BY then take their values from an arbitrary row of each group.
Below is an example:
SELECT
A.categoryID, A.productID, A.productName, B.max_unitprice
FROM products A
JOIN (
SELECT
MAX(unitprice) AS max_unitprice,
categoryId
FROM products
GROUP BY categoryId) B
ON B.categoryId = A.categoryID
AND B.max_unitprice = A.unitprice
SELECT p.*
FROM products p
WHERE NOT EXISTS ( SELECT 'p2'
FROM products p2
WHERE p2.categoryId = p.categoryId
AND p2.unitPrice > p.unitPrice
)
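The two approaches above (a join against the per-category maximum, and NOT EXISTS) can be compared side by side. The sketch below is an illustration under assumptions: it runs on Python's sqlite3 module instead of MySQL, the table has only the four columns used in the queries, and the rows are invented.

```python
import sqlite3

# SQLite stands in for MySQL; sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (
    productID INTEGER PRIMARY KEY, productName TEXT,
    categoryID INTEGER, unitprice INTEGER);
INSERT INTO products VALUES
    (1, 'Chai', 1, 18), (2, 'Chang', 1, 19),
    (3, 'Syrup', 2, 10), (4, 'Sauce', 2, 40);
""")

# Approach 1: join each product to the maximum price of its category.
via_join = conn.execute("""
SELECT A.categoryID, A.productID, A.productName, B.max_unitprice
FROM products A
JOIN (SELECT categoryID, MAX(unitprice) AS max_unitprice
      FROM products
      GROUP BY categoryID) B
  ON B.categoryID = A.categoryID
 AND B.max_unitprice = A.unitprice
ORDER BY A.categoryID
""").fetchall()

# Approach 2: keep a product only if nothing in its category costs more.
via_not_exists = conn.execute("""
SELECT p.categoryID, p.productID, p.productName, p.unitprice
FROM products p
WHERE NOT EXISTS (SELECT 1
                  FROM products p2
                  WHERE p2.categoryID = p.categoryID
                    AND p2.unitprice > p.unitprice)
ORDER BY p.categoryID
""").fetchall()

print(via_join)
print(via_not_exists)
```

Both forms return every product tied for the top price in a category, so a category with two equally priced leaders yields two rows.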

Get all the products, and putting a specific order's product first

I have 3 tables: products, orders and orderLines(order_id, product_id).
I have an sql query to figure out which seems nearly impossible to do in only one query.
Is there a way to have in only one query:
All the products, but showing a specific order's products first;
which means that, for an order A, product1, product2, ... present in order A's orderLines are shown first, then the remaining products (in no particular order) are shown next.
PS:
I know it's possible to achieve this with a union of two queries, but it would be better to have it done in only one query.
You can put a subquery in the order by clause. In this case, an exists subquery is what you need:
select p.*
from products p
order by (exists (select 1
from orderlines ol
where p.productid = ol.productid and ol.orderid = ORDERA
)
) desc;
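A runnable sketch of this ordering, using Python's sqlite3 module as a stand-in for MySQL, is shown below. The column names product_id and order_id follow the question's orderLines(order_id, product_id), the order id 7 and the rows are invented, and a secondary sort on product_id is added only to make the output deterministic.

```python
import sqlite3

# SQLite stands in for MySQL; the data and the chosen order id are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products   (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orderlines (order_id INTEGER, product_id INTEGER);
INSERT INTO products VALUES (1, 'pen'), (2, 'ink'), (3, 'pad'), (4, 'cap');
INSERT INTO orderlines VALUES (7, 3), (7, 1), (8, 2);
""")

def products_with_order_first(order_id):
    """Return all product ids, those on the given order first."""
    rows = conn.execute("""
        SELECT p.product_id
        FROM products p
        ORDER BY (EXISTS (SELECT 1
                          FROM orderlines ol
                          WHERE ol.product_id = p.product_id
                            AND ol.order_id = ?)) DESC,
                 p.product_id
    """, (order_id,))
    return [row[0] for row in rows]

ordered_ids = products_with_order_first(7)
print(ordered_ids)
```

EXISTS evaluates to 1 for products on the order and 0 otherwise, so sorting it in descending order puts the order's products at the top.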