I have two tables
Prodline
Prodline_nbr
Prodline_name
prodline_color_code
prodline_discount
product
prod_nbr
prod_name
prod_price
prod_cost
i want to list name for each product, sale price, product line name, and color code for products whose price is no more than $200 and sort them descending order by sale price.
Here is my code
select PRODLINE_NAME,PRODLINE_COLOR_CODE,PROD_NAME, PROD_PRICE
from prodline, product
where PRODLINE_NBR in (select PRODLINE_NBR from prodline)
and prod_price<=200
order by PROD_PRICE desc;
Is this not correct? I've been told there is a better way by using some sorta join?
second attempt
select PRODLINE_NAME,PRODLINE_COLOR_CODE,PROD_NAME, PROD_PRICE
from prodline, product
where PRODLINE_NBR = PROD_NBR
and prod_price<=200
order by PROD_PRICE desc;
Could anyone comment please?
Thank you.
The second one is correct. The first one is wrong, because it just multiply two tables, i guess this is not enough to achieve your goal.
Comparing your two scripts, the only difference is a condition in the where clause:
the first uses PRODLINE_NBR in (select PRODLINE_NBR from prodline), which makes little sense, since PRODLINE_NBR is a field of prodline, of course every PRODLINE_NBR should in it. With this condition or not makes nothing different.
the second uses PRODLINE_NBR = PROD_NBR, which is a typical method to join two tables from multiplying them. If you want to connect o two tables through an equation among two fields, you need it.
Second style is the best! There are many ways to produce same outputs. but efficiency matters.
this links are good to learn about HOW JOIN and INDEX work.
http://dev.mysql.com/doc/refman/5.5/en/optimization-indexes.html
http://dev.mysql.com/doc/refman/5.5/en/glossary.html#glos_selectivity
http://www.keithjbrown.co.uk/vworks/mysql/mysql_p5.php
Related
I am working on some practice interview Questions and am struggling with this:
You are working with a company that sells goods to customers, and they'd like to keep track
of the unique items each customer has bought. The database is composed of two tables:
Customers and Orders. The two table schemas are given below. We want to know what
unique items were purchased by a specific customer, Wilbur, and when they were
purchased. What is the correct query that returns the customer first name, item
purchased, and purchase date with recent purchases first?
Tables: https://imgur.com/a/D47R1KU
My answer so far is
However I am getting an incorrect message as its Printing wilbur,oranges,2019-06-10
and wilbur,oranges,2018-06-10 instead of just the one with the more recent date. Please see the picture for the two tables referenced by the question. Thanks!
Between the where clause and ORDER BY, try:
GROUP BY FirstName, Item
And to get the most recent date, select MAX(PurchaseDate).
The query you are looking for is as follows.
This uses group by to indicate which columns should be grouped together, and for the column that's not grouped, how to choose which value of many to use, in this case the max value.
Note also the use of explicit, clear, SQL-92 modern join syntax and meaningful column aliases to show which table each column originates from. Distinct is not needed since each group is already unique.
Select c.FirstName, o.Item, Max(o.PurchaseDate) PurchaseDate
from Customers c
join Orders o on o.PersonId=p.PersonId
where c.FirstName = 'Wilbur'
group by c.firstName, o.Item
order by Max(o.PurchaseDate) desc;
I am working on a database called classicmodels, which I found at: https://www.mysqltutorial.org/mysql-sample-database.aspx/
I realized that when I executed an Inner Join between 'payments' and 'orders' tables, a 'cartesian explosion' occurred. I understand that these two tables are not meant to be joined. However, I would like to know if it is possible to identify this just by looking at the relational schema or if I should check the tables one by one.
For instance, the customer number '141' appears 26 times in the 'orders table', which I found by using the following code:
SELECT
customerNumber,
COUNT(customerNumber)
FROM
orders
WHERE customerNumber=141
GROUP BY customerNumber;
And the same customer number (141) appears 13 times in the payments table:
SELECT
customerNumber,
COUNT(customerNumber)
FROM
payments
WHERE customerNumber=141
GROUP BY customerNumber;
Finally, I executed an Inner Join between 'payments' and 'orders' tables, and selected only the rows with customer number '141'. MySQL returned 338 rows, which is the result of 26*13. So, my query is multiplying the number of times this 'customer n°' appears in 'orders' table by the number of times it appears in 'payments'.
SELECT
o.customernumber,
py.amount
FROM
customers c
JOIN
orders o ON c.customerNumber=o.customerNumber
JOIN
payments py ON c.customerNumber=py.customerNumber
WHERE o.customernumber=141;
My questions is the following:
1 ) Is there a way to look at the relational schema and identify if a Join can be executed (without generating a combinatorial explosion)? Or should I check table by table to understand how the relationship between them is?
Important Note: I realized that there are two asterisks in the payments table's representation in the relational schema below. Maybe this means that this table has a composite primary key (customerNumber+checkNumber). The problem is that 'checkNumber' does not appear in any other table.
This is the database's relational schema provided by the 'MySQL Tutorial' website:
Thank you for your attention!
This is called "combinatorial explosion" and it happens when rows in one table each join to multiple rows in other tables.
(It's not "overestimation" or any sort of estimation. It's counting data items multiple times when it should only count them once.)
It's a notorious pitfall of summarizing data in one-to-many relationships. In your example each customer may have no orders, one order, or more than one. Independently, they may have no payments, one, or many.
The trick is this: Use subqueries so your toplevel query with GROUP BY avoids joining one-to-many relationships serially. In the query you showed us, that's happening.
You can this subquery to get a resultset with just one row per customer. (try it.)
SELECT customernumber,
SUM(amount) amount
FROM payments
GROUP BY customernumber
Likewise you can get the value of all orders for each customer with this
SELECT c.customernumber,
SUM(od.qytOrdered * od.priceEach) amount
FROM orders o
JOIN orderdetails od ON o.orderNumber = od.orderNumber
GROUP BY c.customernumber
This JOIN won't explode in your face because customer can have multiple orders, and each order can have multiple details. So it's a strict hierarchical rollup.
Now, we can use these subqueries in the main query.
SELECT c.customernumber, p.payments, o.orders
FROM customers c
LEFT JOIN (
SELECT c.customernumber,
SUM(od.qytOrdered * od.priceEach) orders
FROM orders o
JOIN orderdetails od ON o.orderNumber = od.orderNumber
GROUP BY c.customernumber
) o ON c.customernumber = o.customernumber
LEFT JOIN (
SELECT customernumber,
SUM() payment
FROM payments
GROUP BY customernumber
) p on c.customernumber = p.customernumber
Takehome tricks:
A subquery IS a table (a virtual table) that can be used whereever you might mention a table or a view.
The GROUP BY stuff in this query happens separately in two subqueries, so no combinatorial explosions.
All three participants in the toplevel JOIN have either one or zero rows per customernumber.
The LEFT JOINs are there so we can still see customers with (importantly for a business) no orders or no payments. With the ordinary inner JOIN, rows have to match both sides of the ON conditions or they're omitted from the resultset.
Pro tip Format your SQL queries fanatically carefully: They are really verbose. Adm. Grace Hopper would be proud. That means they get quite long and nested, putting the Structured in Structured Query Language. If you, or anybody, is going to reason about them in future, we must be able to grasp the structure easily.
Pro tip 2 The data engineer who designed this database did a really good job thinking it through and documenting it. Aspire to this level of quality. (Rarely reached in the real world.)
In this particular case, your behavior should depend on the accounting style being supported by the database, and this does not appear to be "open item" style accounting ie when an order is raised for 1000 there does not need to be a payment against it for 1000.. This is perhaps unusual in most consumer experience because you will be quite familiar with open item style ordering from Amazon - you buy a 500 dollar tv and a 500 dollar games console, the order is a thousand dollars and you pay for it, the payment going against the order. However, you're also familiar with "balance forward" accounting if you paid for that order using your credit card because you make similar purchases every day for a month and hen you get a statement from your bank saying you owe 31000 and you pay a lump of money, doesn't even have to be 31k. You aren't expected to make 31 payments of 1000 to your bank at the end of the month. Your bank allocate it to the oldest items on the account (if they're nice, or the newest items if they're not) and may eventually charge you interest on unpaid transactions
1 ) Is there a way to look at the relational schema and identify if a Join can be executed
Yes, you can tell looking at the schema- customer has many orders, customer makes many payments, but there is no relation between the order and payment tables at all so we can see there is no attempt to directly attach a payment to an order. You can see that customer is a parent table of payment and order, and therefore enjoys a relationship with each of them but they do not relate to each other. If you had Person, Car and Address tables, a person has many addresses during their life, and many cars but it doesn't mean there is a relationship between cars and addresses
In such a case it simply doesn't make sense to join payments to customers to orders because they do not relate that way. If you want to make such a join and not suffer a Cartesian explosion then you absolutely have to sum one side or the other (or both) to ensure that your joins are 1:1 and 1:M (or 1:1 and 1:1). You cannot arrange a join that is a pair of 1:M.
Going back to the car/person/address example to make any meaningful joins, you have to build more information into the question and arrange the join to create the answer. Perhaps the question is "what cars did they own while they lived at" - this flattens the Person:Address relationship to 1:1 but leaves Person:Car as 1:M so they might have owned many cars during their time in that house. "What was the newest car they owned while living at..." might be 1:1 on both sides if there is a clear winner for "newest" (though if they bought two cars manufactured at identical times...)
Which side you sum in your orders case will depend on what you want to know, but in this case I'd say you usually want to know "which orders haven't been paid for" and that's summing all payments and rolling summing all orders then looking at what point the rolling sum exceeds the sum of payments.. those are the unpaid orders
Take a look again at your database graph (the one that was present in the first iteration of your question). See the lines between tables have 3 angled legs on one end - that's the many end. You can start at any table in the graph and join to other tables by walking along the relationship. If you're going from the many end to the one end, and assuming you've picked out a single row in the start table (a single order) you can always walk to any other table in the many->one direction and not increase your row count. If you walk the other way you potentially increase your row count. If you split and walk two ways that both increase row count you get a Cartesian explosion. Of course, also you don't have to only join on relation lines, but that's out of scope for the question
ps: this is easier to see on the db diagram than the ERD in the question because the database purely concerns itself with the columns that are foreign keyed. The ERD is saying a customer has zero or one payments with a particular check number but the database will only be concerned with "the customer ID appears once in the customer table and multiple times in the payment table" because only part of the compound primary key of payment is keyed to the customer table. In other words, the ERD is concerned with business logic relations too, but the db diagram is purely how tables relate and they aren't necessarily aligned. For this reason the db diagrams are probably easier to read when walking round for join strategies
After seeing the answers of Caius Jard and O.Jones (please, check their replies), which kindly helped me to clarify this doubt, I decided to create a table to identify which customers paid for all orders they made and which ones did not. This creates a pertinent reason to join 'orders', 'orderdetails', 'payments' and 'customers' tables, because some orders may have been cancelled or still may be 'On Hold', as we can see in their corresponding 'status' in the 'orders' table. Also, this enables us to execute this join without generating a 'combinatorial explosion'.
I did this by using the CASE statement, which registers when py.amount and amount_in_orders match, don't match or when they are NULL (customers which did not make orders or payments):
SELECT
c.customerNumber,
py.amount,
amount_in_orders,
CASE
WHEN py.amount=amount_in_orders THEN 'Match'
WHEN py.amount IS NULL AND amount_in_orders IS NULL THEN 'NULL'
ELSE 'Don''t Match'
END AS Match
FROM
customers c
LEFT JOIN(
SELECT
o.customerNumber, SUM(od.quantityOrdered*od.priceEach) AS amount_in_orders
FROM
orders o
JOIN orderdetails od ON o.orderNumber=od.orderNumber
GROUP BY o.customerNumber
) o ON c.customerNumber=o.customerNumber
LEFT JOIN(
SELECT customernumber, SUM(amount) AS amount
FROM payments
GROUP BY customerNumber
) py ON c.customerNumber=py.customerNumber
ORDER BY py.amount DESC;
The query returned 122 rows. The images below are fractions of the generated output, so you can visualize what happened:
For instance, we can see that the customers identified by the numbers '141', '124', '119' and '496' did not pay for all the orders they made. Maybe some of them where cancelled or maybe they simply did not pay for them yet.
And this image shows some of the columns (not all of them) that are NULL:
I have a table of tripadvisor data. There are columns (restaurant, rank, score, user_name,review_stars,review_date,user_reviews....) and other columns that are not useful for my question...
I am trying to return each restaurant with how many 3-star reviews they have and list them using the rank column from high to low.
I know i can use count, i was thinking of count if ( review_stars=3) and then order by rank to return it... i am stuck and any help would be appreciated. Thank you.
What you are wanting to accomplish is counting how many 3 star reviews each restaurant has ..
You really are almost there -- Your original question really does contains all the answers, just written out in long-hand. Think about what you are asking:
"I need to SELECT the restaurants that have 3 star reviews and count how many there are per restaurant" -- The basic syntax you are missing is GROUP BY -- Which groups all of your results per restaurant to a single row.
SELECT restaurant, count(*) as 3_star_count from table where review_stars = '3' GROUP BY restaurant
This is a basic example. But from what you are asking .. This syntax should get you the number of 3 star reviews for each restaurant.
I would recommend that you look into SQL clauses and what they mean as #Alexis stated in an earlier comment. The WHERE clause and the GROUP BY clause (especially this one) are what you want to understand here.
I have 2 Tables in MySQL
First is Sale and second is Purchase I want to see a Stock
this is My sale table that contains
Date Quantity ProductName
this is My Purchase Table that Contains sane attributes as sale have
I perform a Query but i Can not get my desire output
SELECT sale.Date, sale.ProductName, SUM(sale.StockQuantityIn) as StockIN,
SUM(purchase.StockQuantityout) as Stockout,
(SUM(sale.StockQuantityIn)-SUM(purchase.StockQuantityout)) as stock
from sale
join purchase on purchase.ProductName=sale.ProductName
GROUP BY sale.Date,purchase.Date ,purchase.ProductName,sale.ProductName
this is the Query and Result is noy my desired output
You cannot join the two tables as you did.
When you join two tables, you can imagine (it is not what really happens, but it helps to understand the result) that all the possible combinations of their rows are evaluated and later filtered based on the "where" constraints.
Try to simulate this process and you will see that the result is what you would expect.
It's not very good to JOIN, since each line in each table will appear as many times as matched in second table. This is why you get all these extra stockOut.
I would UNION 2 queries, one on sale, one on purchase, with ProductName, Date, and StockIn (which would be negative for the query on purchase)
Then I would GROUP BY ProductName and Date and SUM StockIn
I am fairly new to Databases and I am just beginning to understand the DML/queries, I have two tables, one named customer this contain customer data and one named requested_games, this contains games requested by the customers, I would like to write a query that will return the customers that have requested more than two games, so far when I run the query, I don't get the desired result, not sure if I'm doing it right.
Can anyone assist with this thanks,
Below is a snippet of the query
select customers.customer_name, wants_list.requested_game, wants_list.wantslists_id,count(wants_list.customers_ID)
from customers, wants_list
where customers.customers_ID = wants_list.customers_id
and wants_list.wantslists_id = wants_list.wantslists_id
and wants_list.requested_game > '2';
just include a HAVING clause
GROUP BY customers_ID
HAVING COUNT(*) > 2
depending on how you have your data setup you may need to do
HAVING COUNT(wants_list.requested_game) > 2
This is how I like to describe how a query works maybe itll help you visualize how the query executes :)
SELECT is making an order at a restaurant....
FROM is the menu you want to order from....
JOIN is what sections of the menu you want to include
WHERE is any customization you want to make to your order (aka no mushrooms)....
GROUP BY (and anything after) is after the order has been completed and is at your table...
GROUP BY tells your server to bring your types of food together in groups
ORDER BY is saying what dishes you want first (aka i want my entree then dessert then appetizer ).
HAVING can be used to pick out any mushrooms that were accidentally left on the plate....
etc..
I would like to write a query that will return the customers that
have requested more than two games
For this to happen you need to do the following
First you need to use GROUP BY to group the games based on customers (customers_id)
Then you need to use HAVING clause to get customers who requested more than two games
Then make this a SUBQUERY if you need more information on the customer like name
Finally you use a JOIN between customers and the sub query (temp) to display more information on the customer
Like the following query
SELECT customers.customer_id, customers.customer_name, game_count
FROM (SELECT customer_id, count(wantslists_id) AS game_count
FROM wants_list
GROUP BY customer_id
HAVING count(requested_game) > '2') temp
JOIN customers ON customers.customer_id = temp.customer_id