Is it ever "ok" to use multiple queries instead of one? [closed] - mysql

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
In building a web app recently, I started thinking about the information returned from a query I was making:
Find the user information and (for simplicity's sake) the associated phone numbers tied to this user. Something as simple as:
SELECT a.fname, a.lname, b.phone
FROM users a
JOIN users_phones b
ON (a.userid = b.userid)
WHERE a.userid = 12345;
No problem here (yes, I'm preventing injection, etc.; that's not the point of this question). When I think about the data that is returned, though, I am returning (potentially) several rows with that user's name on each one. Let's say that single user has 1000 phone numbers associated with it. That's the same first name and last name being returned 1000 times in one call. Let's also assume I want to return a lot more than just the first and last name of that user, so I'm starting to return quite a bit of extra data on every row that I really only needed once.
Are there circumstances in which it is "more appropriate" to make multiple calls to a database?
e.g.
SELECT firstname, lastname
FROM users
WHERE userid = 12345;
SELECT phone
FROM users_phones
WHERE userid = 12345;
If the answer is yes, is there a good/proper method of determining when to use multiple queries versus a single one?

I think that really depends on your use case. In the example you gave, it seems to make sense to return it as two queries, especially if you're passing that info back to a mobile device where you want to make sure you send as little data as possible (not everyone has unlimited data).
I'd probably stick a DISTINCT in those queries as well if that's going to make a difference based on your tables.

A query with a JOIN may be slower than two independent queries. It really depends on the type of access you're doing.
For your example, I'd go with the two query approach. These queries could be executed in parallel, they could be cached, and there's no real reason to JOIN other than for arbitrary presentation concerns.
You'll also want to be concerned about returning duplicate data. In your example it looks like fname and lname would be repeated for each and every phone number, resulting in a lot of data being transmitted that's actually not useful. This is because of the one-to-many relationship you've described.
Generally you'll want to JOIN if it means sending less data, or because the two queries are not independent.
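To make the duplication concrete, here is a minimal sketch using Python's built-in sqlite3 as a stand-in for MySQL. The table and column names come from the question; the sample names and phone numbers are made up for illustration. It simply counts how many values each approach sends back for one user with three phone numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userid INTEGER PRIMARY KEY, fname TEXT, lname TEXT);
    CREATE TABLE users_phones (userid INTEGER, phone TEXT);
    INSERT INTO users VALUES (12345, 'Ada', 'Lovelace');
    INSERT INTO users_phones VALUES
        (12345, '555-0100'), (12345, '555-0101'), (12345, '555-0102');
""")

# Single JOIN: fname and lname are repeated on every phone row.
joined = conn.execute(
    "SELECT a.fname, a.lname, b.phone FROM users a "
    "JOIN users_phones b ON a.userid = b.userid WHERE a.userid = ?",
    (12345,),
).fetchall()

# Two independent queries: the name is fetched exactly once.
user = conn.execute(
    "SELECT fname, lname FROM users WHERE userid = ?", (12345,)
).fetchone()
phones = [
    row[0]
    for row in conn.execute(
        "SELECT phone FROM users_phones WHERE userid = ?", (12345,)
    )
]

joined_values = sum(len(row) for row in joined)  # 3 rows of 3 columns
split_values = len(user) + len(phones)           # 2 name fields + 3 phones
```

With 1000 phone numbers and a wider user row the gap grows accordingly, although the extra round trip of the second query also has a cost, so it is worth measuring in your own setup.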

This should be driven by the application. Basically, you retrieve in one query all the information needed in one place. If you take this question page as an example, you see your user ID, the reputation counter, and the badge counters. There's no need to retrieve other user profile information when I first display the question page.
Only when one clicks on the user ID the rest of the profile may be queried, and may be not even all of it, as there are several tabs on the profile page.
However, if your application is guaranteed to access all 1000 phone numbers at once, along with the user's name, then you probably should fetch them all together.

Related

Which would be better table structure considering - faster execution and optimal structure? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 2 years ago.
I am implementing multi-login system in a web application in Laravel. In the application a user can register with multiple social platforms and all those accounts should be considered as one.
I have 2 ways to implement this:
Method I:
users table
- id
- username
- email
- google_id
- facebook_id
- github_id
- twitter_id
- ...other columns
Setting all google_id, facebook_id, twitter_id by default NULL. Saving each value based on when user registers with that platform through the application.
Method II:
users table
- id
- username
- email
- ...other columns
social-login table
- id
- user_id
- social_type // facebook or google or twitter etc.
- uid // unique identifier returned from each platform
Using Method I, I get better performance, as I only need to query one table, but Method II provides a better table structure.
Which method should I use? Consider that there would be a lot of requests, and in all of them we receive the uid, not the user_id, to fetch any information about the user.
The queries would be something like below:
SELECT * FROM users WHERE id=2;
SELECT * FROM `social-login` WHERE user_id=2;
SELECT * FROM users as U JOIN `social-login` as S ON S.user_id=U.id WHERE U.id=2
I would go with option 2.
Reasons
You can always add new social types without changing the schema, whereas with the first option you would have to add another column for each new social type.
A normalized database helps prevent data inconsistencies.
I'd like to add one more suggestion.
Create another table
social_type table
id | name
1 | Facebook
2 | Google
And reference its id in the social-login table
id | user_id | social_type_id | uuid
Benefits
The database has control over the types. With this, a user can only choose options that exist in the system and are valid for it, so you control the set of values. Previously, a user could send any random value such as qawqdq (granted, you could have other checks) and it would be stored.
With the previous approach, some rows might have "facebook" in lowercase, some "FACEBOOK" in uppercase, and some "FaceBook". A lookup table keeps one particular value across all the entries.
Again, performance would be affected by this as it would add an extra join. So it's completely optional. I'd suggest you do it this way if performance is not affected by a huge amount and is not that big of a concern to you.
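As a rough illustration of option 2 with the lookup table, here is a sketch using Python's built-in sqlite3 in place of MySQL. The table is named social_login (underscore) here only to avoid quoting, and the UNIQUE constraint, the sample rows, and the helper names find_user and rejects_unknown_type are assumptions added for the example, not part of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, email TEXT);
    CREATE TABLE social_type (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
    CREATE TABLE social_login (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        social_type_id INTEGER NOT NULL REFERENCES social_type(id),
        uid TEXT NOT NULL,
        UNIQUE (social_type_id, uid)  -- also gives an index for uid lookups
    );
    INSERT INTO users VALUES (2, 'alice', 'alice@example.com');
    INSERT INTO social_type VALUES (1, 'Facebook'), (2, 'Google');
    INSERT INTO social_login (user_id, social_type_id, uid)
        VALUES (2, 1, 'fb-123'), (2, 2, 'g-456');
""")

def find_user(conn, social_type, uid):
    """Look a user up by the platform uid, which is what each request carries."""
    return conn.execute(
        "SELECT u.id, u.username FROM users u "
        "JOIN social_login s ON s.user_id = u.id "
        "JOIN social_type t ON t.id = s.social_type_id "
        "WHERE t.name = ? AND s.uid = ?",
        (social_type, uid),
    ).fetchone()

def rejects_unknown_type(conn):
    """Return True if the database refuses a social_type_id that does not exist."""
    try:
        conn.execute(
            "INSERT INTO social_login (user_id, social_type_id, uid) "
            "VALUES (2, 99, 'x')"
        )
    except sqlite3.IntegrityError:
        return True
    return False
```

Since every request arrives with a uid, the index behind the UNIQUE constraint is probably what matters most for lookup speed, more than whether one or two tables are involved.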
I would choose option 2.
It is the cleaner structure
You can store the data better
I don't think that the time loss can be that problematic for your performance. I would only choose option 1 if the time loss is significantly high.

How can I improve this database, and are the relations between the tables fine? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
I will use this database to build a website using Laravel Framework.
From my experience, when you are developing an online store, the information about orders should be stored separately, not through relations.
Let me give you an example:
I order product A and my order is being processed. Meanwhile, you delete product A from your database (for whatever reason). If my order only holds the product_id, what will happen?
Also, you should make an intersection table for users and payment details, since users may have more than one credit card. An intersection table for users and delivery addresses would also be easier to manage than a text column.
As others mentioned this is too broad to answer.
But I can give you some pointers to remember.
Normalization
This is the primary purpose of relationships. Basically, it means not duplicating data: you wouldn't put a state name in every address; you would have a state table and put its id in the address.
You have some of this, but you can over-normalize too. Take the payment status. These values likely won't change much over time, so you could use an ENUM field, which is basically a text field with a list of acceptable values.
Pretty much, if you have a one-to-one relationship, you don't really need another table for it. The only reason I can think of dates back to when InnoDB didn't have full-text support: you could make a MyISAM table and an InnoDB table to get the benefits of both. Otherwise it just makes things harder, because you have an extra join and you have to make sure that your tables don't become a many-to-one relationship.
An example in yours is Orders and Delivery Addresses. This would probably be one-to-one; it's not like you can deliver the same item to more than one address. You're probably thinking, "I can reuse those addresses for different orders," but if you read below you might see why that is not always a good idea. That is not to say you can't use them again; you just probably shouldn't allow edits to them, and that can cause a whole cascade of issues.
Consistency
When dealing with things like orders, you should bake as much data into the table as you can. This is denormalization. The reason is that products can be deleted, addresses can change, and so on, and you don't want these things affecting your orders later. By baking that data in, you don't have to worry about not being able to change those things. Obviously you still want some links, like the one to the User, but you may want to bake in the email they used for that order. That way, if they later say "I didn't get the email," you know which email was used for the order rather than the one they currently have; maybe they changed it.
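A minimal sketch of that "baking in" idea, using Python's built-in sqlite3 as a stand-in for MySQL. The products and order_items tables, their columns, and the add_order_item helper are hypothetical names chosen for the example, since the original schema is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price_cents INTEGER);
    CREATE TABLE order_items (
        id INTEGER PRIMARY KEY,
        order_id INTEGER,
        product_id INTEGER,        -- kept as a loose reference only
        product_name TEXT,         -- copied at order time
        unit_price_cents INTEGER   -- copied at order time
    );
    INSERT INTO products VALUES (1, 'Product A', 1999);
""")

def add_order_item(conn, order_id, product_id):
    """Copy the product's current name and price into the order line."""
    conn.execute(
        "INSERT INTO order_items "
        "(order_id, product_id, product_name, unit_price_cents) "
        "SELECT ?, id, name, price_cents FROM products WHERE id = ?",
        (order_id, product_id),
    )

add_order_item(conn, 100, 1)

# The product disappears from the catalogue while the order is in progress.
conn.execute("DELETE FROM products WHERE id = 1")

snapshot = conn.execute(
    "SELECT product_name, unit_price_cents FROM order_items WHERE order_id = 100"
).fetchone()
remaining_products = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]
```

The order line still reads correctly after the delete because it no longer depends on the products row.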
Hierarchies
This is specifically for the category table. You may want to look at some of the hierarchy models, like nested set or adjacency list. This will eliminate tables and allow more nesting levels. It's quite a bit harder to set up, but it's far more flexible.
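For the adjacency list variant, a sketch might look like the following. It uses Python's built-in sqlite3 and a recursive common table expression; MySQL 8.0 and later also support WITH RECURSIVE, but older versions do not, so treat that part as an assumption about your server. The categories table, its sample rows, and the path_to_root helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES categories(id),  -- NULL for top level
        name TEXT
    );
    INSERT INTO categories VALUES
        (1, NULL, 'food'),
        (2, 1, 'veggies'),
        (3, 2, 'potatoes'),
        (4, NULL, 'utensils');
""")

def path_to_root(conn, category_id):
    """Walk parent_id links upward and return names from the root down."""
    rows = conn.execute(
        """
        WITH RECURSIVE chain(id, parent_id, name, depth) AS (
            SELECT id, parent_id, name, 0 FROM categories WHERE id = ?
            UNION ALL
            SELECT c.id, c.parent_id, c.name, chain.depth + 1
            FROM categories c JOIN chain ON c.id = chain.parent_id
        )
        SELECT name FROM chain ORDER BY depth DESC
        """,
        (category_id,),
    ).fetchall()
    return [name for (name,) in rows]
```

A single self-referencing table handles any nesting depth, which is the flexibility mentioned above; nested sets trade harder writes for cheaper subtree reads.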
Another choice is to use something like a tag system, where you have a list of tags and associate those with the products through a many-to-many relationship. I think we all know what tags are, but if you're not sure, then look at the tags on Stack Overflow. These can help improve search results and help tie related products together even if they are in different categories. For example, you could have
veggies > potatoes
utensils > potato peelers
They are related, but you probably won't put them in the same category.
You could even use both!

Separate table for banned users? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
What would be the "correct" way of organizing banned users?
Should I simply add a new column in the existing users table called is_banned that acts as a boolean or should I create a new table called banned_users that acts as a pivot table with the user_id?
The same question goes for administrators. Should I create a new table for site admins or just create a new column called is_admin?
What about performance of the two options?
Thanks.
What happens with the next type of users - add another table? Better not.
You could add a new column called type or something like that. One way would be it containing a number indicating the type like
1 = normal user
2 = admin
3 = banned
or you could even add another table called user_types that refer to it, but that would only be necessary if you have the types changing over time.
If you need to combine types - users having multiple types at once, then you could make the column a bit field.
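If you go the bit field route, note that the values need to be powers of two (1, 2, 4, ...) rather than 1, 2, 3, so that they can be combined without ambiguity. A small sketch in Python, where the UserType enum and the has_type helper are hypothetical names for the example:

```python
from enum import IntFlag

class UserType(IntFlag):
    NORMAL = 1
    ADMIN = 2
    BANNED = 4

def has_type(stored_value, user_type):
    """Check whether the integer stored in the column has the given bit set."""
    return bool(UserType(stored_value) & user_type)

# What would be written to the integer column for a banned admin.
stored = int(UserType.ADMIN | UserType.BANNED)
```

In SQL the equivalent check would be a bitwise AND on the column, e.g. `WHERE type & 4 <> 0`, which generally cannot use an ordinary index on that column.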
When do you need separate tables?
When the different kinds of users have different attributes, so that the tables for each type of user would differ.
You need to think about how the banning concept will play out in the real world. Do you just want a flag? What about when they were banned and by whom? past banning history? a response mechanism for the banned? A list of complaints, with user/date/reason?
Data models are the most difficult part of a system to evolve, so you want to think about all manner of possible futures, even stuff you don't have on the roadmap just yet.
You might decide, for efficiency, that you want a ban table and a banned column. But there's a price to be paid for that too, since you're now capturing the same fact in multiple places.
The issues are subtle and sometimes complex. Don't accept blanket one-size-fits-all answers.
A scalable solution that satisfies a multitude of criteria would be this:
table that contains user data, users
table that contains roles - roles
junction table that connects the two, user2roles
You keep the user data separate from their actual role in your app - every user will at least have name and last name, those are not related to their permissions or roles.
You will most likely need to add more roles. For example, one role is being an admin. Another role is being banned. Another role can be being banned for a week, 2 weeks etc - basically you can add those as you go, without needing to alter your tables to support future functionality.
Your application (php, python, whatever) collects the data and then acts upon those roles.
Now you have a system with established relations that you can scale and that's easy enough for kids in kindergarten to understand.
This is a simplified system that mixes permissions with roles, you can further expand it but IMO it's better to keep it simple.
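A minimal sketch of the three tables, using Python's built-in sqlite3 as a stand-in for whatever database you use. The table names users, roles, and user2roles come from the answer; the columns, sample rows, and the roles_for helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE roles (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE user2roles (
        user_id INTEGER REFERENCES users(id),
        role_id INTEGER REFERENCES roles(id),
        PRIMARY KEY (user_id, role_id)  -- a user holds each role at most once
    );
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO roles VALUES (1, 'admin'), (2, 'banned');
    INSERT INTO user2roles VALUES (1, 1), (2, 2);
""")

def roles_for(conn, user_id):
    """Return the role names attached to a user, sorted alphabetically."""
    rows = conn.execute(
        "SELECT r.name FROM roles r "
        "JOIN user2roles ur ON ur.role_id = r.id "
        "WHERE ur.user_id = ? ORDER BY r.name",
        (user_id,),
    )
    return [name for (name,) in rows]
```

Adding a new role later is then an INSERT into roles rather than an ALTER TABLE, and the application decides what each role name means.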

Strategy to display very wide table [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 11 years ago.
I have a table that has 23 columns of data in it that I need to display. It's obviously unreasonably wide, and I am looking for a strategy to make it a little bit more manageable for my users.
That sounds like it would be better off just making the data available as CSV so that users can download it and read it in their favorite spreadsheet program. I know from experience that nothing made our users(1) happier (with one notable exception) than when we added this option in, and it's really quite easy to do. (Yeah, putting everything in a slick web interface is a nice goal, but sometimes you get better results by not working nearly so hard.)
(1) Our users are scientists. Physicists, in particular, but I'm told that biologists are the same. Your users might be different; check!
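To show how little work the CSV option can be, here is a sketch using Python's standard csv module. The rows_to_csv helper and the sample columns are hypothetical; in a real app the rows would come from your query and the text would be sent with a download header.

```python
import csv
import io

def rows_to_csv(header, rows):
    """Serialize a header and row tuples to CSV text, quoting where needed."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()

text = rows_to_csv(["id", "name", "note"], [(1, "Ada", "likes, commas")])
```

Using the csv module rather than joining strings by hand means values containing commas, quotes, or newlines are escaped for you.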
I think that 99% of the time a user is not interested in that much data at once, so try to split it somehow:
Try to show a couple of main columns, and use jQuery and popups to show the details for each row, including data from the other columns.
Possibly not all users are interested in all columns. Show the columns that are common to all users, and put an option above the table to show/hide additional columns.
If neither applies, then just show all 23 columns with a horizontal scrollbar; there is no other option. If you really need this for some complex reporting purpose, perhaps provide the ability to reorder columns so that users can put the columns they are interested in side by side.
However, I'm certain that your report can be split, either into several more specific reports that each target only part of the data, or in some other way.
What normally is done in databases (as you could see your table like one), is to split it up.
Especially if you have a lot of copied rows, like e.g. 10 columns are equal for all rows.
Example:
Table with customers having bought something.
The first 10 columns are for the customer's name, address, telephone etc.
If these 10 columns are identical across all rows for a given customer, then you can move them to a customer table and use an ID (or another unique column) to link the two.
However, if the 23 columns do not have such repeating values, then maybe the best thing you can do is offer some kind of column selection, or show multiple tables that each contain only certain columns.
E.g., suppose you have a customer table with 23 columns of information about each customer: you can have one table with address information, one with company information, etc.

What is the most efficient way to store a sort-order on a group of records in a database? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
Assume PHP/MYSQL but I don't necessarily need actual code, I'm just interested in the theory behind it.
A good use-case would be Facebook's photo gallery page. You can drag and drop a photo on the page, which fires an Ajax event to save the new sort order. I'm implementing something very similar.
For example, I have a database table "photos" with about a million records:
photos
id : int,
userid : int,
albumid : int,
sortorder : int,
filename : varchar,
title : varchar
Let's say I have an album with 100 photos. I drag/drop a photo into a new location and the Ajax event fires off to save on the server.
Should I be passing the entire array of photo ids back to the server and updating every record? Assume input validation by "WHERE userid=loggedin_id", so malicious users can only mess with the sort order of their own photos
Should I be passing the photo id, its previous sortorder index and its new sortorder index, retrieve all records between these 2 indices, sort them, then update their orders?
What happens if there are thousands of photos in a single gallery and the sort order is changed?
What about just using an integer column which defines the order? By default you assign multiples of 1000, like 1000, 2000, 3000, and if you move 3000 between 1000 and 2000 you change it to 1500. So in most cases you don't need to update the other numbers at all. I use this approach and it works well. You could also use a double, but then you don't have control over precision and rounding errors, so I'd rather not.
So the algorithm would look like: say you move B to position after A. First perform select to see the order of the record next to A. If it is at least +2 higher than the order of A then you just set order of B to fit in between. But if it's just +1 higher (there is no space after A), you select the bordering records of B to see how much space is on this side, divide by 2 and then add this value to the order of all the records between A and B. That's it!
(Note that you should use transaction/locking for any algorithm which contains more than a single query, so this applies to this case too. The easiest way is to use InnoDB transaction.)
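Here is a rough sketch of the gapped-integer idea, using Python's built-in sqlite3 in place of MySQL/InnoDB. One deliberate simplification: when there is no room next to the anchor, this version renumbers the whole album in steps of 1000 instead of shifting only the records between A and B as described above. The move_after and album_order helpers and the trimmed-down photos table are assumptions for the example.

```python
import sqlite3

GAP = 1000

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE photos (id INTEGER PRIMARY KEY, albumid INTEGER, sortorder INTEGER);
    INSERT INTO photos VALUES (1, 7, 1000), (2, 7, 2000), (3, 7, 3000);
""")

def album_order(conn, albumid):
    """Return photo ids of an album in display order."""
    rows = conn.execute(
        "SELECT id FROM photos WHERE albumid = ? ORDER BY sortorder", (albumid,)
    )
    return [pid for (pid,) in rows]

def move_after(conn, albumid, photo_id, anchor_id):
    """Place photo_id directly after anchor_id within the album."""
    with conn:  # commits on success, rolls back if an exception is raised
        (anchor_order,) = conn.execute(
            "SELECT sortorder FROM photos WHERE id = ?", (anchor_id,)
        ).fetchone()
        # Smallest sortorder after the anchor, ignoring the photo being moved.
        (next_order,) = conn.execute(
            "SELECT MIN(sortorder) FROM photos "
            "WHERE albumid = ? AND sortorder > ? AND id != ?",
            (albumid, anchor_order, photo_id),
        ).fetchone()
        if next_order is None:
            new_order = anchor_order + GAP  # anchor is last: append
        elif next_order - anchor_order >= 2:
            new_order = (anchor_order + next_order) // 2  # fits in the gap
        else:
            # No room left: renumber the album with fresh gaps (simplification).
            ids = [pid for pid in album_order(conn, albumid) if pid != photo_id]
            ids.insert(ids.index(anchor_id) + 1, photo_id)
            conn.executemany(
                "UPDATE photos SET sortorder = ? WHERE id = ?",
                [((i + 1) * GAP, pid) for i, pid in enumerate(ids)],
            )
            return
        conn.execute(
            "UPDATE photos SET sortorder = ? WHERE id = ?", (new_order, photo_id)
        )
```

In the common case this is two reads and one single-row write, and the renumbering branch is only hit after repeated inserts into the same gap.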
Store as a linked list, sortorder is a foreign key reference to the next photo_id in the set.
this would probably be a 'linked list' construct.
To me the second method of updating is the way to go (update only the range that changes). You mention "What happens if there are thousands of photos in a single gallery ...", and to me that is never going to happen. Let's take your Facebook example. Facebook doesn't show thousands of photos on one page; they split them up into about 10-20 per page.
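A sketch of the range-only update, assuming sortorder holds dense zero-based positions within an album (0, 1, 2, ...), which the question does not actually state. It uses Python's built-in sqlite3 as a stand-in for MySQL; the move_photo_range and ordered_ids helpers and the sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE photos (id INTEGER PRIMARY KEY, albumid INTEGER, sortorder INTEGER);
    INSERT INTO photos VALUES
        (10, 7, 0), (11, 7, 1), (12, 7, 2), (13, 7, 3), (14, 7, 4);
""")

def ordered_ids(conn, albumid):
    """Return photo ids of an album in display order."""
    rows = conn.execute(
        "SELECT id FROM photos WHERE albumid = ? ORDER BY sortorder", (albumid,)
    )
    return [pid for (pid,) in rows]

def move_photo_range(conn, albumid, photo_id, old_index, new_index):
    """Move one photo and shift only the rows between the two positions."""
    with conn:  # commits on success, rolls back if an exception is raised
        if new_index > old_index:
            # Rows after the old slot, up to the new slot, slide back by one.
            conn.execute(
                "UPDATE photos SET sortorder = sortorder - 1 "
                "WHERE albumid = ? AND sortorder > ? AND sortorder <= ?",
                (albumid, old_index, new_index),
            )
        elif new_index < old_index:
            # Rows from the new slot up to just before the old slot slide forward.
            conn.execute(
                "UPDATE photos SET sortorder = sortorder + 1 "
                "WHERE albumid = ? AND sortorder >= ? AND sortorder < ?",
                (albumid, new_index, old_index),
            )
        conn.execute(
            "UPDATE photos SET sortorder = ? WHERE id = ? AND albumid = ?",
            (new_index, photo_id, albumid),
        )
```

The client only sends the photo id plus the two indices, and the server issues at most two UPDATE statements regardless of album size.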
The way I'd do this in a nonrelational database is to store a list of photo IDs on the 'album' entity/record, in the order desired. Reordering the photos results in reordering the list, and only a single database write.
Some SQL databases (e.g., PostgreSQL) have native list datatypes, but MySQL doesn't. On MySQL you could serialize the list as a string or binary value.
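A sketch of the serialized-list variant, storing the ordered photo ids as JSON text on the album row. It uses Python's built-in sqlite3 as a stand-in for MySQL; the albums table, the photo_order column, and the move_photo helper are hypothetical names for the example.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE albums (id INTEGER PRIMARY KEY, photo_order TEXT)")
conn.execute("INSERT INTO albums VALUES (?, ?)", (7, json.dumps([1, 2, 3, 4])))

def move_photo(conn, album_id, photo_id, new_index):
    """Reorder the stored list and write it back with a single UPDATE."""
    (raw,) = conn.execute(
        "SELECT photo_order FROM albums WHERE id = ?", (album_id,)
    ).fetchone()
    order = json.loads(raw)
    order.remove(photo_id)
    order.insert(new_index, photo_id)
    conn.execute(
        "UPDATE albums SET photo_order = ? WHERE id = ?",
        (json.dumps(order), album_id),
    )
    return order
```

The trade-off is that the database can no longer sort or filter by position itself, and concurrent reorders of the same album need some form of locking or versioning.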
3rd-normal-form trained database gurus will scream at you that this is a terrible approach, but RDBMSes are optimized for OLAP type queries, where query flexibility is more important than read performance. Webapps are best written with a 'write heavy, read light' strategy in mind, and this sort of denormalization is exactly in line with that.