I apologise in advance if this seems simple. My assignment is due in 2 hours and I don't have time for further research, as I have another assignment to submit tonight. I only know the basic MySQL commands, not these kinds of queries. This is one of the final questions left unanswered, and it is driving me nuts even though I have already read the JOIN documentation. Please help.
Say I have 4 tables
 _______________   _______________   _______________   _______________
|   customers   | |    orders     | | ordered_items | |   products    |
|_______________| |_______________| |_______________| |_______________|
|(pk)customer_id| | (pk)order_id  | |    (pk)id     | |(pk)product_id |
|  first_name   | |(fk)customer_id| | (fk)order_id  | |     name      |
|   last_name   | |     date      | |(fk)product_id | |  description  |
|  contact_no   | |               | |   quantity    | |     price     |
|_______________| |_______________| |_______________| |_______________|
How would I be able to query all the products ordered by a given customer (e.g. customer_id = '5')?
I only know basic SQL, like straightforward queries on 1 table and joins across 2 tables. Since these are 4 different tables with different relations to one another, how would I get all the products ordered by a particular customer id?
In effect it is: get all the products from ordered_items where order_id is in (the orders placed by customer_id = 5).
But what would be an optimised, best-practice way of doing this type of query?
You only need to join 3 tables - orders, ordered_items, and products:
SELECT DISTINCT products.*
FROM products
JOIN ordered_items USING (product_id)
JOIN orders USING (order_id)
WHERE orders.customer_id = 5
As many have mentioned, you would do yourself a big favor by learning about table JOINs. There isn't much difference in syntax between joining 2 tables and joining 4 or more.
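To make the "2 tables vs 4 tables" point concrete, here is a minimal sketch that chains all four tables from the question. It runs against an in-memory SQLite database through Python's sqlite3 module as a stand-in for MySQL. The sample rows and the function name products_for_customer are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; all sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, first_name TEXT,
                        last_name TEXT, contact_no TEXT);
CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT,
                       description TEXT, price REAL);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                     customer_id INTEGER REFERENCES customers(customer_id),
                     date TEXT);
CREATE TABLE ordered_items (id INTEGER PRIMARY KEY,
                            order_id INTEGER REFERENCES orders(order_id),
                            product_id INTEGER REFERENCES products(product_id),
                            quantity INTEGER);

INSERT INTO customers VALUES (5, 'Ann', 'Lee', '555-0100'), (6, 'Bob', 'Ray', '555-0101');
INSERT INTO products VALUES (1, 'pen', 'blue pen', 1.5), (2, 'ink', 'black ink', 4.0),
                            (3, 'pad', 'note pad', 2.0);
INSERT INTO orders VALUES (10, 5, '2020-01-01'), (11, 5, '2020-01-02'), (12, 6, '2020-01-03');
INSERT INTO ordered_items VALUES (1, 10, 1, 2), (2, 11, 1, 1), (3, 11, 2, 1), (4, 12, 3, 5);
""")


def products_for_customer(customer_id):
    """Return (first_name, product name) pairs for one customer, without duplicates."""
    return conn.execute(
        """
        SELECT DISTINCT c.first_name, p.name
        FROM customers c
        JOIN orders o         ON o.customer_id = c.customer_id
        JOIN ordered_items oi ON oi.order_id   = o.order_id
        JOIN products p       ON p.product_id  = oi.product_id
        WHERE c.customer_id = ?
        ORDER BY p.name
        """,
        (customer_id,),
    ).fetchall()


print(products_for_customer(5))
```

Each extra table adds one more JOIN ... ON line. The customers table is only needed here because the sketch also returns the customer's name.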
SQLFiddle is a highly recommended resource for practicing and sharing your queries.
This is a comment because you appear to be new to SQL. You need to learn basic syntax for queries (which is why you are getting downvoted).
But you also ask about form. The data structure is actually pretty well laid out. I do have two comments. First, you should be consistent about how you name the id columns. For Ordered_Items, the id should be ordered_item_id.
Second, you should avoid using SQL reserved words for column names and table names. Instead of date, use OrderDate.
I have a problem which I think might be solved with proper use of left outer join, but I'm unable to construct a suitable query. OTOH, there may also be some other, more clever solution with SQL. In addition, this could easily be solved with some programming, but I want to avoid that and find as "clean" a solution as possible.
Background: let's say I'm creating a website that lists some car brands and the user can select which ones he owns/has owned (I'm not really doing that, but this example illustrates the point). In addition, for the selected ones he can optionally enter some additional info about them, e.g. year and some free text like specific model, comments or whatever. In addition, the information entered is stored in a relational database (MySQL in my case) and the user can retrieve and change his answers later.
Let's say there are two database tables:
BRAND
------------
ID INT
NAME VARCHAR(50)
OWNED
------------
ID INT
BRAND_ID INT
OWNER_ID INT
YEAR INT
COMMENT VARCHAR(100)
(here BRAND_ID + OWNER_ID is a unique index, so there can be only one row, and thus one year & comment for each BRAND/OWNER combination)
The data in these tables may look something like this:
BRAND
--------------
ID | NAME
--------------
1 | Cadillac
2 | Chevrolet
3 | Dodge
4 | Ford
OWNED
-----------------------------------------
ID | BRAND_ID | OWNER_ID | YEAR | COMMENT
-----------------------------------------
1 | 1 | 1 | null | 70's Fleetwood
2 | 2 | 1 | 2000 | Crappy car
3 | 2 | 2 | null | I really liked it
4 | 4 | 2 | 1999 | null
Now, to facilitate easy creation of the web page, what I would like to do is display, with one SELECT, all brands in table BRAND and, for each BRAND, know whether the current user has owned it or not; if he has, also list his year and comment (if any). In other words, something like this (assuming current user is 2):
NAME | OWNER_ID | YEAR | COMMENT
-------------------------------------
Cadillac | null | null | null
Chevrolet | 2 | null | I really liked it
Dodge | null | null | null
Ford | 2 | 1999 | null
I tried doing something like:
select NAME, OWNER_ID, YEAR, COMMENT from BRAND left join OWNED on
BRAND.ID = OWNED.BRAND_ID where OWNER_ID = 2 or OWNER_ID = null
but that fails because owner 1 owns a Cadillac and thus Cadillac is left out of the result. OTOH if I omit the where clause, I will get two rows for Chevrolet, which is also not desirable.
So, if there is a clean solution with SQL (either with or without left outer join), I'd like to know how to do it?
I am guessing you want this:
select b.NAME, o.OWNER_ID, o.YEAR, o.COMMENT
from BRAND b left join
OWNED o
on b.ID = o.BRAND_ID and o.OWNER_ID = 2 ;
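To see why it matters whether the owner filter sits in the ON clause or in the WHERE clause, here is a small sketch using the sample data from the question. It uses an in-memory SQLite database via Python's sqlite3 as a stand-in for MySQL. The lowercase table names and the two function names are choices made for this illustration.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL tables; rows copied from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE brand (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE owned (id INTEGER PRIMARY KEY, brand_id INTEGER, owner_id INTEGER,
                    year INTEGER, comment TEXT, UNIQUE (brand_id, owner_id));
INSERT INTO brand VALUES (1, 'Cadillac'), (2, 'Chevrolet'), (3, 'Dodge'), (4, 'Ford');
INSERT INTO owned VALUES (1, 1, 1, NULL, '70''s Fleetwood'),
                         (2, 2, 1, 2000, 'Crappy car'),
                         (3, 2, 2, NULL, 'I really liked it'),
                         (4, 4, 2, 1999, NULL);
""")


def brands_filter_in_on(owner_id):
    """Owner filter inside ON: every brand is kept, unmatched ones get NULLs."""
    return conn.execute(
        """
        SELECT b.name, o.owner_id, o.year, o.comment
        FROM brand b
        LEFT JOIN owned o ON b.id = o.brand_id AND o.owner_id = ?
        ORDER BY b.id
        """,
        (owner_id,),
    ).fetchall()


def brands_filter_in_where(owner_id):
    """Owner filter in WHERE: brands owned only by someone else are dropped."""
    return conn.execute(
        """
        SELECT b.name, o.owner_id, o.year, o.comment
        FROM brand b
        LEFT JOIN owned o ON b.id = o.brand_id
        WHERE o.owner_id = ? OR o.owner_id IS NULL
        ORDER BY b.id
        """,
        (owner_id,),
    ).fetchall()


for row in brands_filter_in_on(2):
    print(row)
```

With the filter in ON, the join first restricts OWNED to the current user's rows and then pads every unmatched brand with NULLs. With the filter in WHERE, the Cadillac row is matched to owner 1 by the join and then discarded by the filter.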
Seems like what you might actually want is a list of the owners, followed by what they owned and the details. You can do that by adjusting the owner id at the bottom of this one:
SELECT owned.owner_id, brand.id, brand.name, owned.year, owned.comment
FROM owned
INNER JOIN brand
ON owned.brand_id = brand.id
WHERE owned.owner_id = 2
Tested here: http://sqlfiddle.com/#!9/5b52ba
I've searched the other related threads, but I don't think I'm looking for a UNION or an OUTER JOIN. What I'm trying to do is pretty simple in theory. I have two tables in two different databases, both with roughly the same data. I'm trying to present them together so that we can compare them. The field names are different, but the data is very similar.
Imagine something like this:
table 'foo':
id first_name last_name dept_name
+---+----------+---------------+-------------+
| 1|Bob |Boberson | Accounting |
| 2|Steven |McStevens | Sales |
| 3|Jane |Janeston | Support |
+---+----------+---------------+-------------+
table 'bar':
person_id first last department_id
+----------+----------+---------------+--------------+
| 1|Bob |Boberson | 2|
| 2|Doug |Dugger | 5|
| 3|Jane |Janeston | 3|
+----------+----------+---------------+--------------+
and I'm trying to end up with something like this:
person_id first last department
+----------+----------+---------------+--------------+
| foo_1|Bob |Boberson | Accounting |
| foo_2|Steven |McStevens | Sales |
| foo_3|Jane |Janeston | Support |
| bar_1|Bob |Boberson | Accounting |
| bar_2|Doug |Dugger | IT |
| bar_3|Jane |Janeston | Support |
+----------+----------+---------------+--------------+
It's easy enough to get the two tables to resemble each other with two separate selects using 'as' to change the column names, concat's for various fields, and doing the appropriate join to fill in the 'department' fields. But, I can't do a 'join' and keep that logic in place. I really need to do a select statement for each table. There's probably a simple solution here, but I'm not seeing it.
EDIT: You guys are correct, this is a pretty standard case for a UNION. I was thinking that UNIONS always add columns for some reason. Thanks.
I think you can use UNION ALL here: it gets the data from both tables in one query without trimming out duplicate names (plain UNION would remove duplicate rows). You can also select the table name as part of the query if you want to prefix the person_id.
This needs testing/improving but:
(SELECT 'foo' AS source_table, f.id AS person_id, f.first_name AS first, f.last_name AS last, f.dept_name AS department FROM foo AS f)
UNION ALL
(SELECT 'bar' AS source_table, b.person_id, b.first, b.last, b.department_id FROM bar AS b)
This looks exactly like a UNION
Have a look at the SQL Fiddle. I didn't bother doing a lookup for dept ID, but this should give you the basic idea.
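For reference, here is a rough sketch of the whole thing, including the prefixed ids and the department lookup. It runs on an in-memory SQLite database through Python's sqlite3 rather than on MySQL. The departments table and its contents are an assumption, since the question never shows where department_id is resolved. Note that SQLite concatenates with ||, whereas in MySQL you would normally write CONCAT('foo_', f.id).

```python
import sqlite3

# In-memory SQLite stand-in; the departments lookup table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE foo (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, dept_name TEXT);
CREATE TABLE bar (person_id INTEGER PRIMARY KEY, "first" TEXT, "last" TEXT,
                  department_id INTEGER);
CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);

INSERT INTO foo VALUES (1, 'Bob', 'Boberson', 'Accounting'),
                       (2, 'Steven', 'McStevens', 'Sales'),
                       (3, 'Jane', 'Janeston', 'Support');
INSERT INTO bar VALUES (1, 'Bob', 'Boberson', 2),
                       (2, 'Doug', 'Dugger', 5),
                       (3, 'Jane', 'Janeston', 3);
INSERT INTO departments VALUES (2, 'Accounting'), (3, 'Support'), (5, 'IT');
""")


def combined_people():
    """Stack both tables into one result set with a source prefix on the id."""
    return conn.execute(
        """
        SELECT 'foo_' || f.id AS person_id, f.first_name AS "first",
               f.last_name AS "last", f.dept_name AS department
        FROM foo f
        UNION ALL
        SELECT 'bar_' || b.person_id, b."first", b."last", d.name
        FROM bar b
        LEFT JOIN departments d ON d.id = b.department_id
        ORDER BY person_id
        """
    ).fetchall()


for row in combined_people():
    print(row)
```

UNION ALL stacks rows and never adds columns. The only requirement is that both SELECTs return the same number of columns in the same order, with the column names taken from the first SELECT.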
While trying to learn SQL I came across "Learn SQL The Hard Way" and started reading it.
Everything was going fine, then I thought, as a way to practice, I would make something like the example given in the book (the example consists of 3 tables, pet, person and person_pet, where the person_pet table 'links' pets to their owners).
I made this:
report table
+----+-------------+
| id | content |
+----+-------------+
| 1 | bank robbery|
| 2 | invalid |
| 3 | cat on tree |
+----+-------------+
note table
+-----------+--------------------+
| report_id | content |
+-----------+--------------------+
| 1 | they had guns |
| 3 | cat was saved |
+-----------+--------------------+
wanted result
+-----------+--------------------+---------------+
| report_id | report_content | report_notes |
+-----------+--------------------+---------------+
| 1 | bank robbery | they had guns |
| 2 | invalid | null or '' |
| 3 | cat on tree | cat was saved |
+-----------+--------------------+---------------+
I tried a few combinations but no success.
My first thought was
SELECT report.id,report.content AS report_content,note.content AS note_content
FROM report,note
WHERE report.id = note.report_id
but this only returns the ones that have a match (it would not return the invalid report).
After this I tried adding IF conditions, but I just made it worse.
My question is: is this something I will figure out after getting past basic SQL,
or can this be done in a simple way?
Anyway, I would appreciate any help; I'm pretty much lost with this.
Thank you.
EDIT: I have looked into related questions but haven't yet found one that solves my problem.
I probably need to look into other statements such as join or something to sort this out.
You need to get to the chapter on OUTER JOINS, specifically, a LEFT JOIN
SELECT report.id,report.content AS report_content,note.content AS note_content
FROM report
LEFT JOIN note ON report.id = note.report_id
Note the ANSI-92 JOIN syntax as opposed to using WHERE x=y
(You can probably do it using the older syntax you were using WHERE report.id *= note.report_id, if I recall the old syntax correctly, but I'd recommend the above syntax instead)
You are doing a join. The kind of join you have is an inner join, but you want an outer join:
SELECT report.id,report.content AS report_content,note.content AS note_content
FROM report
LEFT JOIN note on report.id = note.report_id
Note that a LEFT JOIN keeps every row from the left table (report); where the right table (note) has no matching row, its columns come back as NULL.
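A quick way to compare the two kinds of join side by side is the sketch below. It loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3. The function names are made up, and the COALESCE call is just one possible way to get '' instead of NULL.

```python
import sqlite3

# In-memory SQLite database with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE report (id INTEGER PRIMARY KEY, content TEXT);
CREATE TABLE note (report_id INTEGER, content TEXT);
INSERT INTO report VALUES (1, 'bank robbery'), (2, 'invalid'), (3, 'cat on tree');
INSERT INTO note VALUES (1, 'they had guns'), (3, 'cat was saved');
""")


def reports_inner():
    """Inner join: only reports that have a matching note."""
    return conn.execute(
        """
        SELECT report.id, report.content, note.content
        FROM report
        JOIN note ON report.id = note.report_id
        ORDER BY report.id
        """
    ).fetchall()


def reports_left(empty_as_blank=False):
    """Left join: every report, with NULL (or '' on request) when there is no note."""
    note_expr = "COALESCE(note.content, '')" if empty_as_blank else "note.content"
    return conn.execute(
        f"""
        SELECT report.id, report.content AS report_content, {note_expr} AS note_content
        FROM report
        LEFT JOIN note ON report.id = note.report_id
        ORDER BY report.id
        """
    ).fetchall()


print(reports_inner())
print(reports_left())
```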
I am a beginner at using MySQL and I am trying to learn the best practices. I have set up a structure similar to the one seen below.
(main table that contains all unique entries) TABLE = 'main_content'
+------------+---------------+------------------------------+-----------+
| content_id | (deleted) | title | member_id |
+------------+---------------+------------------------------+-----------+
| 6 | | This is a very spe?cal t|_st | 1 |
+------------+---------------+------------------------------+-----------+
(Provides the total of each difficulty and joins id --> actual name) TABLE = 'difficulty'
+---------------+-------------------+------------------+
| difficulty_id | difficulty_name | difficulty_total |
+---------------+-------------------+------------------+
| 1 | Absolute Beginner | 1 |
| 2 | Beginner | 1 |
| 3 | Intermediate | 0 |
| 4 | Advanced | 0 |
| 5 | Expert | 0 |
+---------------+-------------------+------------------+
(This table ensures that multiple values can be inserted for each entry. For example,
this specific entry indicates that there are 2 difficulties associated with the submission)
TABLE = 'lookup_difficulty'
+------------+---------------+
| content_id | difficulty_id |
+------------+---------------+
| 6 | 1 |
| 6 | 2 |
+------------+---------------+
I am joining all of this into a readable query:
SELECT group_concat(difficulty.difficulty_name) as difficulty, member.member_name
FROM main_content
INNER JOIN difficulty ON difficulty.difficulty_id
IN (SELECT difficulty_id FROM main_content, lookup_difficulty WHERE lookup_difficulty.content_id = main_content.content_id )
INNER JOIN member ON member.member_id = main_content.member_id
The above works fine, but I am wondering if this is good practice. I practically followed the structure laid out in Wikipedia's Database Normalization example.
When I run the above query using EXPLAIN, it says: 'Using where; Using join buffer' and also that I am using 2 DEPENDENT SUBQUERY(s). I don't see any way to NOT use sub-queries to achieve the same effect, but then again I'm a noob so perhaps there is a better way....
The DB design looks fine - regarding your query, you could rewrite it exclusively with joins like:
SELECT group_concat(difficulty.difficulty_name) as difficulty, member.member_name
FROM main_content
INNER JOIN lookup_difficulty ON main_content.content_id = lookup_difficulty.content_id
INNER JOIN difficulty ON difficulty.difficulty_id = lookup_difficulty.difficulty_id
INNER JOIN member ON member.member_id = main_content.member_id
GROUP BY main_content.content_id, member.member_name
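As a sanity check of the join-only form, here is a small sketch on an in-memory SQLite database via Python's sqlite3, which also has group_concat. The member table is not shown in the question, so its layout and the sample member name are assumptions. The column names content_id and difficulty_id follow the tables as posted.

```python
import sqlite3

# In-memory SQLite stand-in; the member table and its row are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE member (member_id INTEGER PRIMARY KEY, member_name TEXT);
CREATE TABLE main_content (content_id INTEGER PRIMARY KEY, title TEXT, member_id INTEGER);
CREATE TABLE difficulty (difficulty_id INTEGER PRIMARY KEY, difficulty_name TEXT,
                         difficulty_total INTEGER);
CREATE TABLE lookup_difficulty (content_id INTEGER, difficulty_id INTEGER);

INSERT INTO member VALUES (1, 'alice');
INSERT INTO main_content VALUES (6, 'a test title', 1);
INSERT INTO difficulty VALUES (1, 'Absolute Beginner', 1), (2, 'Beginner', 1),
                              (3, 'Intermediate', 0), (4, 'Advanced', 0), (5, 'Expert', 0);
INSERT INTO lookup_difficulty VALUES (6, 1), (6, 2);
""")


def difficulties_per_content():
    """One row per content entry: its difficulty names joined by commas, plus the author."""
    return conn.execute(
        """
        SELECT group_concat(d.difficulty_name) AS difficulty, m.member_name
        FROM main_content mc
        INNER JOIN lookup_difficulty ld ON mc.content_id = ld.content_id
        INNER JOIN difficulty d ON d.difficulty_id = ld.difficulty_id
        INNER JOIN member m ON m.member_id = mc.member_id
        GROUP BY mc.content_id, m.member_name
        """
    ).fetchall()


print(difficulties_per_content())
```

The lookup table is joined twice over, once to the content and once to the difficulty names, so no subquery is needed. The order of names inside the concatenated string is not guaranteed unless you ask for one.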
If the lookup_difficulty provides a link between content and difficulty I would suggest you take out the difficulty_id column from your main_content table. Since you can have multiple lookups for each content_id, you would need some extra business logic to determine which difficulty_id to put in your main_content table (or multiple entries in the main_content table for each difficulty_id, but that goes against normalization practices). For ex. the biggest value / smallest value / random value. In either case, it does not make much sense.
Other than that the table looks fine.
Update
Saw you updated the table :)
Just as a side-note. Using IN can slow down your query (IN can cause a table-scan). In any case, it used to be that way, but I'm sure that these days the SQL compiler optimizes it pretty well.
Just after some opinions on the best way to achieve the following outcome:
I would like to store in my MySQL database products which can be voted on by users (each vote is worth +1). I also want to be able to see how many times in total a user has voted.
To my simple mind, the following table structure would be ideal:
table: product table: user table: user_product_vote
+----+-------------+ +----+-------------+ +----+------------+---------+
| id | product | | id | username | | id | product_id | user_id |
+----+-------------+ +----+-------------+ +----+------------+---------+
| 1 | bananas | | 1 | matthew | | 1 | 1 | 2 |
| 2 | apples | | 2 | mark | | 2 | 2 | 2 |
| .. | .. | | .. | .. | | .. | .. | .. |
This way I can do a COUNT of the user_product_vote table for each product or user.
For example, when I want to look up bananas and the number of votes to show on a web page I could perform the following query:
SELECT p.product AS product, COUNT( v.id ) as votes
FROM product p
LEFT JOIN user_product_vote v ON p.id = v.product_id
WHERE p.id =1
If my site became hugely successful (we can all dream) and I had thousands of users voting on thousands of products, I fear that performing such a COUNT with every page view would be highly inefficient in terms of server resources.
A simpler approach would be to have a 'votes' column in the product table that is incremented each time a vote is added.
table: product
+----+-------------+-------+
| id | product | votes |
+----+-------------+-------+
| 1 | bananas | 2 |
| 2 | apples | 5 |
| .. | .. | .. |
While this is more resource friendly - I lose data (eg. I can no longer prevent a person from voting twice as there is no record of their voting activity).
My questions are:
i) am I being overly worried about server resources and should just stick with the three table option? (ie. do I need to have more faith in the ability of the database to handle large queries)
ii) is there a more efficient way of achieving the outcome without losing information?
You can never be too worried about resources. When you first start building an application you should always have resources, space, speed etc. in mind; if your site's traffic grows dramatically and you never built with resources in mind, you start running into problems.
As for the vote system, personally I would keep the votes like so:
table: product table: user table: user_product_vote
+----+-------------+ +----+-------------+ +----+------------+---------+
| id | product | | id | username | | id | product_id | user_id |
+----+-------------+ +----+-------------+ +----+------------+---------+
| 1 | bananas | | 1 | matthew | | 1 | 1 | 2 |
| 2 | apples | | 2 | mark | | 2 | 2 | 2 |
| .. | .. | | .. | .. | | .. | .. | .. |
Reasons:
Firstly user_product_vote does not contain text, blobs etc., it's purely integer so it takes up less resources anyways.
Secondly, you have more of a doorway to new entities within your application such as Total votes last 24 hr, Highest rated product over the past 24 hour etc.
Take this example for instance:
table: user_product_vote
+----+------------+---------+-----------+------+
| id | product_id | user_id | vote_type | time |
+----+------------+---------+-----------+------+
| 1 | 1 | 2 | product |224.. |
| 2 | 2 | 2 | page |218.. |
| .. | .. | .. | .. | .. |
And a simple query:
SELECT COUNT(id) as total FROM user_product_vote WHERE vote_type = 'product' AND time BETWEEN(....) ORDER BY time DESC LIMIT 20
Another thing is if a user voted at 1AM and then tried to vote again at 2PM, you can easily check when the last time they voted and if they should be allowed to vote again.
There are so many opportunities that you will be missing if you stick with your incremental example.
In regards to your count(), no matter how much you optimize your queries it would not really make a difference on a large scale.
With an extremely large user-base your resource usage will be looked at from a different perspective, such as load balancers, server settings, Apache, caching etc.; there's only so much you can do with your queries.
If my site became hugely successful (we can all dream) and I had thousands of users voting on thousands of products, I fear that performing such a COUNT with every page view would be highly inefficient in terms of server resources.
Don't waste your time solving imaginary problems. MySQL is perfectly able to process thousands of records in fractions of a second - this is what databases are for. Clean and simple database and code structure is far more important than the mythical "optimization" that no one needs.
Why not mix and match both? Simply have the final counts in the product and users tables, so that you don't have to count every time and have the votes table , so that there is no double posting.
Edit:
To explain it a bit further, the product and user tables will each have a column called "votes". Every time the insert into user_product_vote is successful, increment the relevant user and product records. This would avoid dupe votes, and you won't have to run the complex count query every time either.
Edit:
Also, I am assuming that you have created a unique index on product_id and user_id. In that case any duplication attempt will automatically fail and you won't have to check the table before inserting. You will just need to make sure the insert query ran and that you got a valid value for the "id" in the form of insert_id.
You have to balance the desire for your site to perform quickly (in which the second schema would be best) and the ability to count votes for specific users and prevent double voting (for which I would choose the first schema). Because you are only using integer columns for the user_product_vote table, I don't see how performance could suffer too much. Many-to-many relationships are common, as you have implemented with user_product_vote. If you do want to count votes for specific users and prevent double voting, a user_product_vote is the only clean way I can think of implementing it, as any other could result in sparse records, duplicate records, and all kinds of bad things.
You don't want to update the product table directly with an aggregate every time someone votes - this will lock product rows which will then affect other queries which are using products.
Assuming that not all product queries need to include the votes column, you could keep a separate productvotes table which would retain the running totals, and keep your userproductvote table as a means to enforce your user voting per product business rules / and auditing.