Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
I am just setting up a website which has an order and a chat correspondence for each order. I have an orders table with each order specifying a single user.
If I add a relationship to the table, how does that affect it? Would I still query the database using the LEFT JOIN method?
Why should I use it?
I would like to confirm that my ORDERS table has user_id field and my users table does NOT have order_id field, is that correct thinking?
I have yet to do the same to chat feature on each order since I am still trying to learn what fields I need in that table to correctly query with php.
You are talking about two different concepts here:
Data input: Relationships (foreign key constraints) are one of the methods used to ensure data integrity. They restrict what data can be inserted into the database.
Data retrieval: The LEFT JOIN that you mentioned is one of the methods used to retrieve data from the database. It combines matching rows from several tables so you get what you want instead of returning the entire table(s).
You don't have to use either of the two. They are there to help you, but they are not required and you can achieve similar results by other means. More importantly, they are independent of each other. If you use relationships, you are not required to use LEFT JOIN or any other join. And if you don't use relationships, you can still use joins and get the same results.
If you don't use relationships, your app has to dictate how data is entered into the database. If you don't use joins, you can use sub-queries, for example. Which way to go greatly depends on your requirements, but for most scenarios, using relationships and joins is the way to go.
For example, the two queries below are equivalent. The first one uses the JOIN keyword, and the second one does not (it is an implicit join written with a comma and a WHERE clause):
SELECT users.name, orders.id
FROM users
INNER JOIN orders ON users.id = orders.user_id;
SELECT users.name, orders.id
FROM users, orders
WHERE users.id = orders.user_id;
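To see the two concepts side by side, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for your database (the sample rows are made up, and SQLite only enforces foreign keys when the PRAGMA is switched on). It shows the relationship rejecting an order for a user that does not exist, and both query styles returning the same rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: SQLite stand-in; it enforces foreign keys only when this is on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id)  -- the relationship
    );
    INSERT INTO users (id, name) VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO orders (id, user_id) VALUES (10, 1), (11, 1), (12, 2);
""")

# Data input: the relationship rejects an order for a user that does not exist.
try:
    conn.execute("INSERT INTO orders (id, user_id) VALUES (13, 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Data retrieval: explicit JOIN keyword versus comma-and-WHERE form.
explicit_join = conn.execute("""
    SELECT users.name, orders.id
    FROM users
    INNER JOIN orders ON users.id = orders.user_id
    ORDER BY orders.id
""").fetchall()

implicit_join = conn.execute("""
    SELECT users.name, orders.id
    FROM users, orders
    WHERE users.id = orders.user_id
    ORDER BY orders.id
""").fetchall()

print(rejected)
print(explicit_join == implicit_join)
```

Dropping the REFERENCES clause would not change either query result; it would only let the bad insert through.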
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
I will use this database to build a website using Laravel Framework.
From my experience, when you are developing an online store, the information about orders should be stored separately, not with relations.
Let me give you an example:
I order product A and my order is being processed. Meanwhile, you delete product A from your database (for whatever reason). If my order only holds the product_id, what will happen?
Also, you should make an intersection table for users and payment details, since a user may have more than one credit card. An intersection table for users and delivery addresses would also be easier to manage than a text column.
As others mentioned this is too broad to answer.
But I can give you some pointers to remember.
Normalization
This is the primary purpose of relationships. Basically it means not duplicating data, so you wouldn't put a state name in every address; you would have a state table and put its id in the address.
You have some of this, but you can over-normalize too, for example with the payment status. The list of statuses likely won't change much over time, so you could use an ENUM field, which is basically a text field with a list of acceptable values.
Pretty much, if you have a One to One relationship, you don't really need another table for it. The only reason I can think of dates back to when InnoDB didn't have full text support: you could make a MyISAM table and an InnoDB table to kind of use the benefits of both. Otherwise it just makes things harder, because you have an extra join and you have to make sure that your tables don't turn into a Many to One relationship.
An example in yours is Orders and Delivery Addresses. This would probably be a one to one; it's not like you can deliver the same item to more than one address. You're probably thinking, "I can reuse those addresses for different orders", but if you read below you might see why that is not always a good idea. That is not to say you can't use them again; you just probably shouldn't allow edits to them, because that can cause a whole cascade of issues.
Consistency
When dealing with things like orders, you should bake as much data into the table as you can. This is de-normalization. The reason is that products can be deleted, addresses can change, etc. You don't want these things affecting your orders later. By baking that data in, you don't have to worry about not being able to change those things. Obviously you still want some links, like the one to the User, but you may want to bake in the email they used for that order. That way, if they later say "I didn't get the email", you know which email was used for the order, not just the one they currently have; maybe they changed it.
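As a rough illustration of baking data into an order, here is a small sketch using Python's sqlite3 module. The table layout, the column names (email_used, product_name, price_cents) and the place_order helper are all made up for the example, not taken from your schema. The order copies the product name, price and email at purchase time, so deleting the product later does not change the order:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema for illustration only.
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price_cents INTEGER);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,       -- still a link to the user
        email_used TEXT NOT NULL,       -- snapshot of the email at order time
        product_id INTEGER,             -- informational; the product may vanish
        product_name TEXT NOT NULL,     -- snapshot of the product name
        price_cents INTEGER NOT NULL    -- snapshot of the price
    );
    INSERT INTO products VALUES (1, 'Product A', 1999);
""")

def place_order(conn, user_id, email, product_id):
    """Copy the product details into the order row instead of only linking to them."""
    name, price = conn.execute(
        "SELECT name, price_cents FROM products WHERE id = ?", (product_id,)
    ).fetchone()
    cur = conn.execute(
        "INSERT INTO orders (user_id, email_used, product_id, product_name, price_cents) "
        "VALUES (?, ?, ?, ?, ?)",
        (user_id, email, product_id, name, price),
    )
    return cur.lastrowid

order_id = place_order(conn, 7, "old@example.com", 1)

# Later the product is deleted; the order still knows what was bought.
conn.execute("DELETE FROM products WHERE id = 1")
row = conn.execute(
    "SELECT email_used, product_name, price_cents FROM orders WHERE id = ?",
    (order_id,),
).fetchone()
print(row)
```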
Hierarchies
This is specifically for the category table: you may want to look at some of the hierarchy models, like nested set or adjacency list. This will eliminate tables and allow more nesting levels. It's quite a bit harder to set up, but it's way more flexible.
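For the adjacency list variant, a minimal sketch could look like the following. It uses Python's sqlite3 module and a recursive common table expression; the categories table, its sample rows and the category_path helper are assumptions made for the example. Each row points at its parent, so one table handles any nesting depth:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single category table: each row stores the id of its parent.
conn.executescript("""
    CREATE TABLE categories (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        parent_id INTEGER REFERENCES categories(id)  -- NULL for top-level rows
    );
    INSERT INTO categories (id, name, parent_id) VALUES
        (1, 'food', NULL),
        (2, 'veggies', 1),
        (3, 'potatoes', 2),
        (4, 'utensils', NULL);
""")

def category_path(conn, category_id):
    """Walk from a category up to its root and return 'root > ... > leaf'."""
    rows = conn.execute("""
        WITH RECURSIVE ancestors(id, name, parent_id, depth) AS (
            SELECT id, name, parent_id, 0 FROM categories WHERE id = ?
            UNION ALL
            SELECT c.id, c.name, c.parent_id, a.depth + 1
            FROM categories c
            JOIN ancestors a ON c.id = a.parent_id
        )
        SELECT name FROM ancestors ORDER BY depth DESC
    """, (category_id,)).fetchall()
    return " > ".join(name for (name,) in rows)

path = category_path(conn, 3)
print(path)
```

A nested set model would trade this recursive lookup for cheaper reads and more expensive inserts; it is not shown here.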
Another choice is to use something like a tag system, where you have a list of tags and associate those with the products through a many to many relationship. I think we all know what tags are, but if you're not sure, then look at the tags on Stack Overflow. These can help improve search results and help tie related products together even if they are in different categories. For example you could have
veggies > potatoes
utensils > potato peelers
They are related, but you probably won't put them in the same category.
You could even use both!
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
Consider a trip itinerary. There are 20 possible stops on a tour. A standard tour involves stops 1 through 20 in order. However, each user may create their own tour consisting of 5 or more stops in any order with possibility for repeats. What is the most efficient way to model this in a database?
If we use a join table
user_id, stop_id, order
we would have millions of records very quickly but we could easily pull the stop & user attributes on queries.
If we stored the stops as an array,
user_id, stop_id_array_in_order
we have a much smaller, non-normalized table and we cannot easily access the stop attributes.
Are there other options that allow for accessing of parent attributes while minimizing table size?
I would define the entities and create tables for them with the relations between them in separate tables as you described in the first example:
users table
tours table
stops table
tours_users table (a User can go to a Tour more than once)
stops_order table: stop_id, order, tours_users_id
For querying the tables: for any user whose tour you want to check, you can do this with the tours_users table. If the stops need to be retrieved, you can easily join the tours_users table with the stops_order table through tours_users_id.
If the tables are indexed correctly, there should be no problem with performance, and you will be using the relational database engine as you are supposed to.
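A rough sketch of that layout, using Python's sqlite3 module, might look like this. The column names are guesses beyond what is listed above, and the ordering column is called position here because ORDER is a reserved word in SQL. The composite primary key on stops_order doubles as the index used to fetch one itinerary in order:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed columns; only the table names come from the answer above.
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tours (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE stops (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tours_users (
        id INTEGER PRIMARY KEY,
        tour_id INTEGER REFERENCES tours(id),
        user_id INTEGER REFERENCES users(id)
    );
    CREATE TABLE stops_order (
        tours_users_id INTEGER REFERENCES tours_users(id),
        position INTEGER,                      -- 'order' is reserved in SQL
        stop_id INTEGER REFERENCES stops(id),
        PRIMARY KEY (tours_users_id, position) -- also serves as the lookup index
    );
    INSERT INTO users VALUES (1, 'alice');
    INSERT INTO tours VALUES (1, 'custom tour');
    INSERT INTO stops VALUES (1, 'Harbor'), (2, 'Museum'), (3, 'Old Town');
    INSERT INTO tours_users VALUES (1, 1, 1);
    -- Five stops in a custom order, with repeats.
    INSERT INTO stops_order VALUES (1, 1, 3), (1, 2, 1), (1, 3, 3), (1, 4, 2), (1, 5, 1);
""")

# Pull the stop attributes for one user's itinerary, in visiting order.
itinerary = [name for (name,) in conn.execute("""
    SELECT s.name
    FROM tours_users tu
    JOIN stops_order so ON so.tours_users_id = tu.id
    JOIN stops s ON s.id = so.stop_id
    WHERE tu.user_id = ?
    ORDER BY so.position
""", (1,))]
print(itinerary)
```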
You're thinking that saving some space will help you. It won't. It's also arguable how much space you'd actually save.
You'd also be using an unordered data structure, and that's something you don't want. You want an ordered structure (a table) which can relate to other records. That's exactly the reason why we normalize tables: so we can extrapolate all kinds of data without altering physical location. The other benefit is that ordered structures can be indexed, which reduces the time spent finding records. The tradeoff is spending space to keep the index records.
However, millions, billions, even trillions of rows are OK. Just imagine how difficult it would be to query a structure where an array is saved as a comma-separated list in a column (or multiple columns). It would be a nightmare to write a query, and performance would go down linearly as the number of records goes up.
TL;DR: keep it normalized.
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
In building a web app recently, I started thinking about the information returned from a query I was making:
Find the user information and (for simplicity's sake) the associated phone numbers tied to this user. Something as simple as:
SELECT a.fname, a.lname, b.phone
FROM users a
JOIN users_phones b
ON (a.userid = b.userid)
WHERE a.userid = 12345;
No problem here (yes, I'm preventing injection, etc.; that's not the point of this question). When I think about the data that is returned, though, I am returning (potentially) several rows of information with that user's name on each one. Let's say that single user has 1000 phone numbers associated with it. That's the same first name and last name being returned 1000 times per call. Let's also assume I want to return a lot more than just the first name and last name of that user, so I'm starting to return quite a bit of extra data which I really only needed once.
Are there circumstances in which it is "more appropriate" to make multiple calls to a database?
e.g.
SELECT firstname, lastname
FROM users
WHERE userid = 12345;
SELECT phone
FROM users_phones
WHERE userid = 12345;
If the answer is yes, is there a good/proper method of determining when to use multiple queries versus a single one?
I think that really depends on your use case. In the example you gave, it seems to make sense to return it as two queries, especially if you're passing that info back to a mobile device where you want to make sure you send as little data as possible (not everyone has unlimited data).
I'd probably stick a DISTINCT in those queries as well if that's going to make a difference based on your tables.
A query with a JOIN may be slower than two independent queries. It really depends on the type of access you're doing.
For your example, I'd go with the two query approach. These queries could be executed in parallel, they could be cached, and there's no real reason to JOIN other than for arbitrary presentation concerns.
You'll also want to be concerned about returning duplicate data. In your example it looks like fname and lname would be repeated for each and every phone number, resulting in a lot of data being transmitted that's actually not useful. This is because of the one-to-many relationship you've described.
Generally you'll want to JOIN if it means sending less data, or because the two queries are not independent.
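To put a number on the duplication, here is a small sketch with Python's sqlite3 module. The 1000 phone numbers are generated test data, and counting returned values is only a crude proxy for bytes on the wire. It runs the JOIN and the two separate queries and compares how many values each approach returns:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userid INTEGER PRIMARY KEY, fname TEXT, lname TEXT);
    CREATE TABLE users_phones (userid INTEGER REFERENCES users(userid), phone TEXT);
    INSERT INTO users VALUES (12345, 'Ada', 'Lovelace');
""")
# Made-up sample data: 1000 phone numbers for one user.
conn.executemany(
    "INSERT INTO users_phones VALUES (12345, ?)",
    [(f"555-{n:04d}",) for n in range(1000)],
)

# One query with a JOIN: the name is repeated on every row.
joined = conn.execute("""
    SELECT a.fname, a.lname, b.phone
    FROM users a
    JOIN users_phones b ON (a.userid = b.userid)
    WHERE a.userid = ?
""", (12345,)).fetchall()

# Two independent queries: the name comes back once.
user = conn.execute(
    "SELECT fname, lname FROM users WHERE userid = ?", (12345,)
).fetchone()
phones = [p for (p,) in conn.execute(
    "SELECT phone FROM users_phones WHERE userid = ?", (12345,)
)]

joined_values = sum(len(row) for row in joined)  # 3 values per phone number
split_values = len(user) + len(phones)           # 2 values plus one per phone
print(joined_values, split_values)
```

Whether the extra round trip of the second query costs more than the duplicated columns depends on your latency and row width, so measure before deciding.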
This should be driven by the application. Basically, you retrieve in one query all the information needed in one place. If you take this question page as an example, you see your user ID, the reputation counter, and the badge counters. There's no need to retrieve other user profile information when I first display the question page.
Only when one clicks on the user ID is the rest of the profile queried, and maybe not even all of it, as there are several tabs on the profile page.
However, if your application is guaranteed to access all 1000 phone numbers at once, along with the user's name, then you probably should fetch them all together.
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
I have reviewed some q&a but thought something specific to my subject would help me get off the fence.
I have an app that calculates pricing based on several different formulas and hundreds of different material types.
User A may use formula A and materials A, B, C.
User B uses formula A and materials A, B, C, plus they want to add a material that no one else uses: material unique_A.
When user A is on the app, he doesn't want to see user B's unique material.
I was thinking of using a unique table of materials for each user so that it is "faster" or "more efficient" to grab the list of materials, instead of trying to set up some sort of on/off function that grabs only the materials the user wants from one global table.
Which way is better? One table or a unique table for each user?
You can have a table of all materials.
materials = (id, name, other attributes...)
and a table of users:
myusers = (id, name, etc....)
then you can have a table that basically represents the many to many relationship between these two:
user_materials = (user_id, material_id)
You can then select the specific materials used by a user by joining these tables. Application-wise, this arrangement is better than trying to create a table for each user, which would make queries difficult. This way you also have an answer to the question: which users are using material A?
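Here is a minimal sketch of those three tables using Python's sqlite3 module. The sample rows and the two helper functions (materials_for, users_of) are made up for illustration. User A sees only A, B, C, user B also sees unique_A, and the reverse question can be answered from the same link table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE materials (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE myusers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE user_materials (
        user_id INTEGER NOT NULL REFERENCES myusers(id),
        material_id INTEGER NOT NULL REFERENCES materials(id),
        PRIMARY KEY (user_id, material_id)  -- one row per user/material pair
    );
    INSERT INTO materials VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'unique_A');
    INSERT INTO myusers VALUES (1, 'user A'), (2, 'user B');
    INSERT INTO user_materials VALUES
        (1, 1), (1, 2), (1, 3),
        (2, 1), (2, 2), (2, 3), (2, 4);
""")

def materials_for(conn, user_id):
    """Names of the materials one user has chosen."""
    rows = conn.execute("""
        SELECT m.name
        FROM materials m
        JOIN user_materials um ON um.material_id = m.id
        WHERE um.user_id = ?
        ORDER BY m.id
    """, (user_id,))
    return [name for (name,) in rows]

def users_of(conn, material_name):
    """Names of the users who use a given material."""
    rows = conn.execute("""
        SELECT u.name
        FROM myusers u
        JOIN user_materials um ON um.user_id = u.id
        JOIN materials m ON m.id = um.material_id
        WHERE m.name = ?
        ORDER BY u.id
    """, (material_name,))
    return [name for (name,) in rows]

print(materials_for(conn, 1))
print(materials_for(conn, 2))
print(users_of(conn, "A"))
```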
Unless you have very few users, each with their own stable, non-changing items, I don't see any sense in doing this.
Plus, most likely you will not get into performance issues if you are talking about a domain of users and materials. It's not like there are millions of either, right?
One "best practice" for databases is to reduce duplication of information. Variations of that principle exist in just about any field of theory there is.
It means, however, that your approach of a unique table per user would not be a good idea.
Not only would it duplicate data, but maintaining such a database would become a gigantic task as the number of users increases.
I would prefer to have a global table of materials, a table of users, and a table recording which user wants which materials.
The 'one-table-approach' can be considered better because it reduces complexity, both in database and in the code which should access the database, and duplication of information.
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
I wanted to ask if views are really worth using.
From my understanding, a view is really just a query, and each time you query the view, the view runs its own query again to get fresh, up-to-date data.
This sounds to me like 2 queries are run.
Wouldn't it be faster to just run the query required and skip the view?
Please note: I would be using simple views, but even if they were quite complex I assume the same principle applies.
My type of view: say 3 tables with 6 columns each, where 2 columns of each table are added into the view with a couple of maths equations to refine the data a touch.
What do others do? Skip or use them?
Typically views are set up to make selects easier to understand and at the same time give guidance to the database engine on how to optimize the query. By creating a view you tell the database engine that you're going to be selecting from this frequently and to spend more time optimizing the query plan so selects from the view will be faster. The upside of this is that when it comes time to parse and plan the query, you'll save some execution time because the optimization has already been performed. It could be as little as a few milliseconds you save, or potentially much more (for very large result sets).
You're correct that views are not designed to be a performance benefit in MySQL.
What they are designed to do is make other queries built on them to be simpler to read, and to make sure that other users and programmers have a better chance at using the data correctly. Think of them as a way to virtually de-normalize the data without taking the size/performance hit of actually de-normalizing the data.
Just as the simplest case, let's take orders and line items. Each order has one or more line items.
The orders table might have the following columns:
ID
Status
Created_at
Paid_on
And the line_items table might have the following columns:
LI_ID
order_id
sku_id
quantity
price
What you'll find, when writing code and queries is that you are going to be doing the following join all the time -
orders
join line_items on line_items.order_id = orders.id
This could be simplified by creating a view:
create view order_lines as
select * from orders
join line_items on line_items.order_id = orders.id
So your query would go from:
select orders.id, sum(price) from orders
join line_items on line_items.order_id = orders.id
where created_at >= '2011-12-01' and created_at < '2012-01-01'
group by orders.id;
to:
select id, sum(price) from order_lines
where created_at >= '2011-12-01' and created_at < '2012-01-01'
group by id;
The DB will execute both of these exactly the same way, but one is easier to read. Admittedly in this case, not MUCH easier to read, but easier to read and code.
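If you want to try this yourself, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL, with made-up sample rows. It only checks that both queries return the same rows; it does not inspect the query plan, so it says nothing about how a particular engine executes them:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: SQLite stand-in for MySQL; the sample rows are invented.
conn.executescript("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY, status TEXT, created_at TEXT, paid_on TEXT
    );
    CREATE TABLE line_items (
        li_id INTEGER PRIMARY KEY,
        order_id INTEGER REFERENCES orders(id),
        sku_id INTEGER, quantity INTEGER, price INTEGER
    );
    CREATE VIEW order_lines AS
        SELECT * FROM orders
        JOIN line_items ON line_items.order_id = orders.id;
    INSERT INTO orders VALUES
        (1, 'paid', '2011-12-05', '2011-12-06'),
        (2, 'paid', '2011-12-20', '2011-12-21'),
        (3, 'new',  '2012-01-03', NULL);
    INSERT INTO line_items VALUES
        (1, 1, 100, 1, 500),
        (2, 1, 101, 2, 250),
        (3, 2, 100, 1, 500),
        (4, 3, 102, 1, 900);
""")

# The join written out by hand.
direct = conn.execute("""
    select orders.id, sum(price) from orders
    join line_items on line_items.order_id = orders.id
    where created_at >= '2011-12-01' and created_at < '2012-01-01'
    group by orders.id
    order by orders.id
""").fetchall()

# The same question asked through the view.
via_view = conn.execute("""
    select id, sum(price) from order_lines
    where created_at >= '2011-12-01' and created_at < '2012-01-01'
    group by id
    order by id
""").fetchall()

print(direct == via_view)
```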
The query optimizer is usually able to combine the view query with the query that uses the view in such a way that only a single query is run, so the objection you have to views doesn't really apply.
See also:
MySQL VIEW as performance troublemaker
View vs. Table Valued Function vs. Multi-Statement Table Valued Function
Should I use a view, a stored procedure, or a user-defined function?
Views can be provided to applications or users that need a straightforward view of data that isn't necessarily in one table (or that expose limited fields from one table). That means they don't have to understand the data and how it relates; they just get the data they need. You create the complex query, optimize it, and they just use the resulting view.