SQL Inner Join 2 Tables - mysql

Hoping to get some help with this. I have made a few attempts at an inner join that shows all 'Product' information from the product table for any product that has sold more than 10 units.
PRODUCT TABLE (Columns)
P_CODE, P_DESCRIPT, P_INDATE, P_QOH, P_MIN, P_PRICE, P_DISCOUNT, V_CODE
LINE TABLE (Columns) this table shows the lines/information for each invoice
INV_NUMBER, LINE_NUMBER, P_CODE, LINE_UNITS, LINE_PRICE, LINE_TOTAL
I understand that I have to make the join using the common key attribute (p_code) but I cannot figure out how to do the sum within the inner join.
Here is my most recent attempt:
SELECT * PRODUCT FROM PRODUCT
INNER JOIN line
ON product.p_code = line.p_code
WHERE sum(line_units) >=10
AND line.p_code = product.p_code;
Error: near "product"; syntax error
Any help would be appreciated,
Thank you.

It looks like you have the table name PRODUCT inside the SELECT list, which causes the syntax error. An aggregate such as SUM() cannot be used in WHERE: compute it in the SELECT list, group by product, and filter on it with a HAVING clause at the end. The WHERE condition is also redundant because it repeats the join condition.
SELECT product.*, SUM(line.line_units) AS line_units_sum
FROM product
INNER JOIN line ON product.p_code = line.p_code
GROUP BY product.p_code
HAVING line_units_sum >= 10;

The requirement
Show all product information from the product table for any product that has sold more than 10 units.
The solution
Because you only want to build the projection from the product table, and you don't need any column from the line table, you can also use a correlated subquery like the following one:
SELECT *
FROM product
WHERE 10 < (
SELECT SUM(line.line_units)
FROM line
WHERE line.p_code = product.p_code
)
The database optimizer might choose to use a JOIN internally if the cost of the JOIN is lower than other alternatives. So, it does not mean that the query will do row-by-row processing for the outer table records. Only the execution plan can tell how the query is executed by the database engine.
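To make the two approaches concrete, here is a minimal sketch on made-up data. The table definitions keep only the columns the example needs, and the column types and sample rows are assumptions, not the asker's real schema. The subquery sums line_units because the requirement is about units sold, not about the number of invoice lines.

```sql
-- Hypothetical, trimmed-down versions of the PRODUCT and LINE tables.
CREATE TABLE product (
    p_code     VARCHAR(10) PRIMARY KEY,
    p_descript VARCHAR(50)
);

CREATE TABLE line (
    inv_number  INT,
    line_number INT,
    p_code      VARCHAR(10),
    line_units  INT,
    PRIMARY KEY (inv_number, line_number)
);

INSERT INTO product (p_code, p_descript) VALUES
    ('A1', 'Hammer'),
    ('B2', 'Saw'),
    ('C3', 'Drill');   -- C3 never appears on an invoice

INSERT INTO line (inv_number, line_number, p_code, line_units) VALUES
    (1001, 1, 'A1', 6),
    (1002, 1, 'A1', 7),   -- A1 sells 13 units in total
    (1002, 2, 'B2', 4),
    (1003, 1, 'B2', 3);   -- B2 sells 7 units in total

-- Approach 1: join, aggregate per product, filter on the aggregate.
SELECT product.*, SUM(line.line_units) AS units_sold
FROM product
INNER JOIN line ON line.p_code = product.p_code
GROUP BY product.p_code
HAVING SUM(line.line_units) > 10;

-- Approach 2: correlated subquery, returning only product columns.
SELECT *
FROM product
WHERE 10 < (
    SELECT SUM(line.line_units)
    FROM line
    WHERE line.p_code = product.p_code
);
```

With this sample data both queries return only product A1. A product with no invoice lines yields a NULL sum in the subquery, so it is filtered out just as the inner join would drop it.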

Related

MySQL: Optimizing Sub-queries

I need to optimize this query further because it uses too much CPU time, and I can't find a more efficient way to write it. Is there another way to write this without altering the tables?
SELECT category, b.fruit_name, u.name
, r.count_vote, r.text_c
FROM Fruits b, Customers u
, Categories c
, (SELECT * FROM
(SELECT *
FROM Reviews
ORDER BY fruit_id, count_vote DESC, r_id
) a
GROUP BY fruit_id
) r
WHERE b.fruit_id = r.fruit_id
AND u.customer_id = r.customer_id
AND category = "Fruits";
This is your query re-written with explicit joins:
SELECT
category, b.fruit_name, u.name, r.count_vote, r.text_c
FROM Fruits b
JOIN
(
SELECT * FROM
(
SELECT *
FROM Reviews
ORDER BY fruit_id, count_vote DESC, r_id
) a
GROUP BY fruit_id
) r on r.fruit_id = b.fruit_id
JOIN Customers u ON u.customer_id = r.customer_id
CROSS JOIN Categories c
WHERE c.category = 'Fruits';
(I am guessing here that the category column belongs to the categories table.)
There are some parts that look suspicious:
Why do you cross join the Categories table, when you don't even display a column of the table?
What is ORDER BY fruit_id, count_vote DESC, r_id supposed to do? Subquery results are considered unordered sets, so an ORDER BY is superfluous and can be ignored by the DBMS. What do you want to achieve here?
SELECT * FROM [ reviews ] GROUP BY fruit_id is invalid. If you group by fruit_id, which count_vote and which r.text_c do you expect to get for the ID? You don't tell the DBMS (which would be something like MAX(count_vote) and MIN(r.text_c), for instance). MySQL should throw an error, but with ONLY_FULL_GROUP_BY disabled it silently treats count_vote, r.text_c as ANY_VALUE(count_vote), ANY_VALUE(r.text_c) instead. This means you get arbitrarily picked values for a fruit.
The answer hence to your question is: Don't try to speed it up, but fix it instead. (Maybe you want to place a new request showing the query and explaining what it is supposed to do, so people can help you with that.)
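If the intent of the ORDER BY plus GROUP BY combination was to pick, for each fruit, the review with the most votes (breaking ties by the lowest r_id), one deterministic way to write that is a NOT EXISTS filter. This is only a guess at the intent. The schema below is a hypothetical reduction of the asker's tables, with assumed column types and made-up rows, and it leaves out the Categories table because its relationship to the others is unknown.

```sql
-- Hypothetical schema and data, for illustration only.
CREATE TABLE Fruits    (fruit_id INT PRIMARY KEY, fruit_name VARCHAR(50));
CREATE TABLE Customers (customer_id INT PRIMARY KEY, name VARCHAR(50));
CREATE TABLE Reviews (
    r_id        INT PRIMARY KEY,
    fruit_id    INT,
    customer_id INT,
    count_vote  INT,
    text_c      VARCHAR(200)
);

INSERT INTO Fruits VALUES (1, 'Apple'), (2, 'Pear');
INSERT INTO Customers VALUES (10, 'Ann'), (11, 'Bob');
INSERT INTO Reviews VALUES
    (100, 1, 10, 5, 'Crisp'),
    (101, 1, 11, 9, 'Sweet'),
    (102, 2, 10, 3, 'Soft'),
    (103, 2, 11, 3, 'Juicy');   -- same votes as 102, higher r_id

-- Keep a review only if no other review of the same fruit ranks ahead of it
-- (more votes, or equal votes and a lower r_id).
SELECT b.fruit_name, u.name, r.count_vote, r.text_c
FROM Fruits b
JOIN Reviews r   ON r.fruit_id = b.fruit_id
JOIN Customers u ON u.customer_id = r.customer_id
WHERE NOT EXISTS (
    SELECT 1
    FROM Reviews better
    WHERE better.fruit_id = r.fruit_id
      AND (better.count_vote > r.count_vote
           OR (better.count_vote = r.count_vote AND better.r_id < r.r_id))
);
```

On this sample data the query returns review 101 for Apple and review 102 for Pear. An index on Reviews (fruit_id, count_vote, r_id) would probably help the NOT EXISTS lookup, but only the execution plan can confirm that.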
Your Categories table does not seem to be joined/related to the others, which produces a Cartesian product between all the rows.
If you want distinct results, don't use GROUP BY but DISTINCT, so you can avoid an unnecessary subquery,
and you don't need an ORDER BY in a subquery.
SELECT category
, b.fruit_name
, u.name
, r.count_vote
, r.text_c
FROM Fruits b
INNER JOIN (
SELECT DISTINCT fruit_id, count_vote, text_c, customer_id
FROM Reviews
) r ON b.fruit_id = r.fruit_id
INNER JOIN Customers u ON u.customer_id = r.customer_id
INNER JOIN Categories c ON ?????? /* Your Categories table seems not joined/related to the others */
WHERE category = 'Fruits';
For better readability, you should use explicit JOIN syntax and avoid the old join syntax based on comma-separated table names and WHERE conditions.
The next time you want help optimizing a query, please include the table/index structure, an indication of the cardinality of the indexes and the EXPLAIN plan for the query.
There appears to be absolutely no reason for a single sub-query here, let alone 2. Using sub-queries mostly prevents the DBMS optimizer from doing its job. So your biggest win will come from eliminating these sub-queries.
The CROSS JOIN creates a deliberate Cartesian join. It is also unclear whether any attributes from this table are actually required for the result, whether it is there to produce multiples of the same row in the output, or whether it is just an error.
The attribute category in the last line of your query is not attributed to any of the tables (but I suspect it comes from the categories table).
Further, your code uses a GROUP BY clause with no aggregation function. This will produce non-deterministic results and is a bug. Assuming that you are not exploiting a side-effect of that, the query can be re-written as:
SELECT
'Fruits' AS category, b.fruit_name, u.name, r.count_vote, r.text_c
FROM Fruits b
JOIN Reviews r
ON r.fruit_id = b.fruit_id
JOIN Customers u ON u.customer_id = r.customer_id
ORDER BY r.fruit_id, count_vote DESC, r_id;
Since there are no predicates other than joins in your query, there is no scope for further optimization beyond ensuring there are indexes on the join predicates.
As all too frequently, the biggest benefit may come from simply asking the question of why you need to retrieve every single row in the tables in a single query.

SQL Query to Filter a Table using two tables

I currently have 4 SQL tables that look like this:
CustomersTable, RegistrationTable, OrdersTable and OffersTable
I need to write a SELECT statement that retrieves all customers from the CustomersTable (all the fields) that have a matching row in the RegistrationTable or a matching row in the OrdersTable with status "closed". The result should not contain duplicate customers.
As you can see, CustomersTable and RegistrationTable have the field "customerId" in common, but CustomersTable and OrdersTable have no field in common. However, there is another table (OffersTable) which has the fields "customerId" and "ID", which link it to the Customers and Orders tables respectively. Remember that a customer who appears in OffersTable will not necessarily appear in OrdersTable, or the order's status may not be "closed".
So based on my example tables above, if I were to run the query, it would return the following result:
The result table should not display duplicate customers.
I really appreciate your help.
Thanks for your time !!
Note - I am using MySQL
Try using UNION and INNER JOIN with every table, like below:
Select Customers.* from Customers inner join Registration on Customers.customerId = Registration.customerId
union
Select Customers.* from Customers inner join offers on Customers.customerId = offers.customerId
inner join Orders on Orders.Id = offers.Id and Orders.Status = 'closed'
I would think exists or in, given what you want. Your description of the table is a bit cumbersome -- which is why sample data in the question is so helpful.
The resulting query would look something like this:
select c.*
from customers c
where exists (select 1 from registrations r where r.customerid = c.customerid) or
exists (select 1
from offers o join
orders oo
on o.id = oo.orderid
where o.customerid = c.customerid and
oo.status = 'closed'
);
The column names may not be quite right.
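A small self-contained sketch of the EXISTS approach follows. The table and column names mirror the query above and are assumptions (the real tables were only shown as images), and the sample rows are invented to cover each case.

```sql
-- Hypothetical tables and data; real names and types may differ.
CREATE TABLE customers     (customerid INT PRIMARY KEY, name VARCHAR(50));
CREATE TABLE registrations (customerid INT);
CREATE TABLE offers        (id INT, customerid INT);
CREATE TABLE orders        (orderid INT, status VARCHAR(20));

INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy'), (4, 'Dee');
INSERT INTO registrations VALUES (1), (1);   -- Ann is registered twice
INSERT INTO offers VALUES (500, 1), (501, 2), (502, 3), (503, 4);
INSERT INTO orders VALUES
    (500, 'closed'),   -- Ann: registered and has a closed order
    (501, 'closed'),   -- Bob: closed order only
    (502, 'open');     -- Cy: order exists but is not closed
-- Offer 503 (Dee) has no order at all.

SELECT c.*
FROM customers c
WHERE EXISTS (SELECT 1
              FROM registrations r
              WHERE r.customerid = c.customerid)
   OR EXISTS (SELECT 1
              FROM offers o
              JOIN orders oo ON oo.orderid = o.id
              WHERE o.customerid = c.customerid
                AND oo.status = 'closed')
ORDER BY c.customerid;
```

On this data the query returns Ann and Bob, each exactly once. Because EXISTS only tests whether a matching row is present, a customer who matches several times (Ann here) is not duplicated, so no DISTINCT or UNION is needed.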

SQL subquery going haywire

I am trying to fetch some combined result from two separate individual tables.
The transaction_fact table has around 3.6 million rows and translation_table has around 300,000 rows.
Now I want the sum of amount for all transactions, grouped by location and by product within that location. Because the fact table has only the location ID and product ID and I would like the names in the result, I am using subqueries.
My query is as follows:
SELECT
( SELECT translation
FROM translation_table
WHERE dim_name LIKE 'location_dim'
AND lang_id LIKE 'es'
AND dim_id LIKE CAST(o.loc_id AS CHAR(50))
AND field_name LIKE 'city') AS Location
, ( SELECT product_name
FROM prod_dim
WHERE prod_id = o.prod_id) AS Product
, SUM(amount)
FROM transaction_fact o
GROUP
BY loc_id
, prod_id
ORDER
BY loc_id
, prod_id;
But this query does not return anything; it just keeps on processing.
I waited for about an hour and a half but still got no result.
Please tell me what might be going wrong.
Joining the tables should eliminate the need for subqueries and give some performance boost. If not you may need to provide more details on the table structure before we can help. Something like this should get you started:
SELECT t.translation AS Location, p.product_name AS Product, SUM(o.amount) AS Total
FROM transaction_fact o
INNER JOIN translation_table t ON CAST(o.loc_id AS char(50)) = t.dim_id
INNER JOIN prod_dim p ON p.prod_id = o.prod_id
WHERE t.dim_name = 'location_dim'
AND t.lang_id = 'es'
AND t.field_name = 'city'
GROUP BY o.loc_id, o.prod_id, t.translation, p.product_name
ORDER BY o.loc_id, o.prod_id;
Notes: I've changed the LIKEs to =, as LIKE is for when you want to match on a pattern that includes wildcards.
The CAST that is used in the join to translation_table is not ideal. If you could do away with that you'd get better performance.
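Beyond rewriting the query, an index that covers the lookup columns of translation_table is one thing worth trying. The sketch below is an assumption-heavy illustration: the column types are guessed (dim_id is taken to be a string because of the CAST), the index name is hypothetical, and whether the optimizer uses the index depends on the real data and character sets, so check the EXPLAIN output.

```sql
-- Hypothetical table definitions; real types and keys may differ.
CREATE TABLE translation_table (
    dim_name    VARCHAR(50),
    lang_id     VARCHAR(5),
    dim_id      VARCHAR(50),
    field_name  VARCHAR(50),
    translation VARCHAR(100)
);
CREATE TABLE prod_dim (prod_id INT PRIMARY KEY, product_name VARCHAR(100));
CREATE TABLE transaction_fact (loc_id INT, prod_id INT, amount DECIMAL(12,2));

-- Equality-filtered columns first, then the join column.
CREATE INDEX idx_translation_lookup
    ON translation_table (dim_name, lang_id, field_name, dim_id);

-- Inspect the plan instead of waiting for the full query to finish.
EXPLAIN
SELECT t.translation AS Location, p.product_name AS Product, SUM(o.amount) AS Total
FROM transaction_fact o
INNER JOIN translation_table t ON t.dim_id = CAST(o.loc_id AS CHAR(50))
INNER JOIN prod_dim p ON p.prod_id = o.prod_id
WHERE t.dim_name = 'location_dim'
  AND t.lang_id = 'es'
  AND t.field_name = 'city'
GROUP BY o.loc_id, o.prod_id, t.translation, p.product_name;
```

The CAST is applied to the fact-table side, so the indexed dim_id column is compared without being wrapped in a function. If dim_id could instead be stored as an integer, the CAST could be dropped entirely.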

MySQL Outer Join Giving Max Join Size Error

I thought I knew how to do a simple outer join, but it appears that I am wrong. I am new to MySQL, but I do have Oracle experience.
I have two tables that I want to query. The first table is a members table. The second table is called purchases. Purchases contains a row for each item a member purchases.
The members table contains a little more than 2700 rows. The purchases table contains a little less than 130,000 rows.
I eventually want to get a list of all members with a count of their unique item purchases. Here is my query:
select mem.member_id
,mem.name
,count(distinct pur.item_id)
from members mem
left outer join purchases pur on mem.member_id = pur.member_id
I get the following error when I execute the query:
1104 - The SELECT would examine more than MAX_JOIN_SIZE rows; check your WHERE and use SET SQL_BIG_SELECTS=1 or SET SQL_MAX_JOIN_SIZE=# if the SELECT is okay
The MAX Join Size is currently set to 7 million.
What am I not understanding here?
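Your join looks fine, although the query needs a GROUP BY mem.member_id, mem.name to return one row per member. If it still fails, you might try pre-aggregating the purchases in a derived table: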
select
m.member_id,
m.`name`,
coalesce( cnts.UniqItems, 0 ) as UniqItems
from
members m
left join ( select p.member_id, count( distinct p.item_id ) as UniqItems
from purchases p
group by p.member_id ) cnts
on m.member_id = cnts.member_id
After writing this, I wondered whether the column name "name" could be part of the problem. NAME is a keyword in MySQL, although not a reserved one, so wrapping it in backticks is optional but makes clear that the column is meant.
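A likely explanation for error 1104 is that MySQL estimates the join without a usable index on purchases.member_id: roughly 2,700 x 130,000 = 351,000,000 row combinations, well above the 7 million limit. The sketch below shows two possible remedies, an index on the join column and the session setting named in the error message. The table definitions, the purchase_id column, and the index name are assumptions made for illustration.

```sql
-- Hypothetical table definitions; only the columns used by the query.
CREATE TABLE members (
    member_id INT PRIMARY KEY,
    name      VARCHAR(100)
);
CREATE TABLE purchases (
    purchase_id INT PRIMARY KEY,   -- assumed surrogate key
    member_id   INT,
    item_id     INT
);

-- Remedy 1: let the join look up purchases per member instead of scanning.
CREATE INDEX idx_purchases_member_item ON purchases (member_id, item_id);

-- Remedy 2: lift the MAX_JOIN_SIZE check for this session only,
-- as the error message itself suggests.
SET SESSION sql_big_selects = 1;

-- The original query, with the GROUP BY needed for one row per member.
SELECT mem.member_id,
       mem.name,
       COUNT(DISTINCT pur.item_id) AS unique_items
FROM members mem
LEFT OUTER JOIN purchases pur ON pur.member_id = mem.member_id
GROUP BY mem.member_id, mem.name;
```

Members without purchases still appear, with a count of 0, because COUNT ignores the NULL item_id produced by the outer join.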

MySQL query multiple table inner join [duplicate]

I have this MySQL query, but I get an error when I try to run it. Since I'm a newbie, I'd like some advice on what I should do to correct it. I just want to show the name, quantity and order date of the orders that have 1 or more pending products. Thanks a lot!
select product.name, order_details.quantity, order.date from product,order_details,order
inner join order on order_details.order_id=order.id
inner join product on order_details.product_id=product.id
inner join customer on order.cust_id=costumer.id WHERE order.pending=>1
You have a table called order. This word has special significance in SQL. Your options are to rename the table, or quote it whenever you want to query from it.
Easiest solution is to change.
inner join order ....
to
inner join `order`
Be sure to use back-quotes around the table name.
You have a table named 'order', which is a reserved word in SQL.
One solution is to prefix the table name with the database name as explained in Craic Computing blog
Another one is to wrap the table name with the ` character as you can read in this StackOverflow question
You can try something like :
SELECT product.name, order_details.quantity, `order`.date
FROM product
INNER JOIN order_details ON product.id = order_details.product_id
INNER JOIN `order` ON `order`.id = order_details.order_id
WHERE `order`.pending >= 1
As said in other answers, order is a reserved keyword in SQL, so surround it with backquotes.
Maybe you should store the pending information in the order_details table (1 if pending, 0 if not), in order to keep track of which product is still pending, instead of incrementing/decrementing the order.pending field.
In this case, you could make the following query :
SELECT product.name, order_details.quantity, `order`.date
FROM product
INNER JOIN order_details ON product.id = order_details.product_id
INNER JOIN `order` ON `order`.id = order_details.order_id
WHERE order_details.pending = 1
Which would return all the products still pending in your orders instead of every product from orders in which maybe only one is pending.