Testing my website for database performance & traffic handling [closed] - mysql

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I am developing a website using Django 1.7 on Python 3.4 with MySQL as the database engine. I plan to test it over the next 15-20 days. The site resembles LinkedIn in functionality and complexity, and I expect around 20-30 thousand users in the next 6 months.
I learnt MySQL only while developing this website. I am using django-debug-toolbar and have tried to reduce query time and the number of joins. I have a few questions:
What tools can be used to generate HTTP requests that automatically fill the database and also query various pages?
Is django-debug-toolbar enough for profiling and optimization, considering the increasing number of requests from multiple users?
Should I work on reducing the number of database hits or the size of the querysets Django caches, and how will RAM usage affect the performance of the website?
Considering that I have no prior experience of database administration or of running a website, how should I determine whether the website is performing up to the mark? Please share best practices, as I am quite unfamiliar with them.

The single biggest factor in SQL performance is the number of rows in the tables you use. You should figure out how to load 50 thousand fake users and 1 million fake nodes into your test database.
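As a rough illustration of loading fake rows, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in so it runs anywhere; the `users` table, its columns, and the helper names are made up for the example, and against MySQL you would run the same kind of bulk insert through your own driver or through Django itself.

```python
import random
import sqlite3
import string


def random_name(rng, length=8):
    """Return a random lowercase string to use as a fake username."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def load_fake_users(conn, count, seed=0):
    """Insert `count` fake rows into a hypothetical `users` table."""
    rng = random.Random(seed)  # fixed seed so test runs are repeatable
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users ("
        "id INTEGER PRIMARY KEY, username TEXT, email TEXT)"
    )
    rows = (
        (f"{random_name(rng)}{i}", f"user{i}@example.com") for i in range(count)
    )
    # One transaction around a bulk insert avoids paying for a commit per row.
    with conn:
        conn.executemany(
            "INSERT INTO users (username, email) VALUES (?, ?)", rows
        )


conn = sqlite3.connect(":memory:")  # stand-in for the real test database
load_fake_users(conn, 50_000)
user_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(user_count)
```

The same pattern (generate rows lazily, insert in bulk inside one transaction) should carry over to the larger tables; only the row generator changes.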
Then guess which of your pageviews will be most common. Find a free load testing tool on the net (there are quite a few) and use it to hit that page hard on your server.
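If you would rather script it than pick a tool, a very small load generator can be sketched with the Python standard library alone. Everything here is an assumption for illustration: the function names, the URL, and the request counts. A dedicated load testing tool will give you far better reports; this only shows the idea of firing concurrent requests and collecting latencies.

```python
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch_url(url):
    """Fetch one page and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.status


def timed_call(fetch, url):
    """Return how many seconds a single fetch takes."""
    start = time.perf_counter()
    fetch(url)
    return time.perf_counter() - start


def run_load(url, total_requests, concurrency, fetch=fetch_url):
    """Issue `total_requests` fetches using `concurrency` worker threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(
            pool.map(lambda _: timed_call(fetch, url), range(total_requests))
        )
    return {
        "requests": len(latencies),
        "median_s": statistics.median(latencies),
        "max_s": max(latencies),
    }


# Dry run with a stub so the sketch works without a server. Drop the `fetch`
# argument and point the (hypothetical) URL at your own test server for real use.
report = run_load(
    "http://127.0.0.1:8000/", 50, 5, fetch=lambda url: time.sleep(0.001)
)
print(report)
```

Run it against the test server only, and raise `concurrency` gradually while watching the median and maximum latency.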
Figure out which queries are slow. Add appropriate indexes or, if you must, redesign your database, and get those queries to be fast.
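To see what an index changes, compare the query plan before and after adding it. The sketch below uses sqlite3 and its `EXPLAIN QUERY PLAN` so that it is self-contained; MySQL has its own `EXPLAIN` statement that serves the same purpose, with different output. The `posts` table and index name are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, body TEXT)"
)
with conn:
    conn.executemany(
        "INSERT INTO posts (author_id, body) VALUES (?, ?)",
        ((i % 1000, "text") for i in range(100_000)),
    )

query = "SELECT id FROM posts WHERE author_id = ?"


def plan(conn, sql, params):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_before = plan(conn, query, (42,))  # full table scan expected
conn.execute("CREATE INDEX idx_posts_author ON posts (author_id)")
plan_after = plan(conn, query, (42,))  # index lookup expected

print(plan_before)
print(plan_after)
```

The first plan reports a scan of the whole table; the second mentions the new index. That difference is what you are looking for when you fix a slow query.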
Then guess at your second most popular pageview and repeat. Keep going until you run out of time.
Keep in mind that this is guesswork. As your service ramps up in users, you need to keep an eye on the pageviews your real users prefer, and keep an eye on those slow queries.
This will, if you add users at the rate you plan, take a sizeable fraction of your time during your first year in operation.
Read a web site called http://use-the-index-luke.com/

Related

Should I use many database in one application? [duplicate]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
We are in the planning stages for a new multi-tenant SaaS app and have hit a deciding point. When designing a multi-tenant application, is it better to go for one monolithic database that holds all customer data (Using a 'customer_id' column) or is it better to have an independent database per customer? Regardless of the database decisions, all tenants will run off of the same codebase.
It seems to me that having separate databases makes backups / restorations MUCH easier, but at the cost of increased complexity in development and upgrades (Much easier to upgrade 1 database vs 500). It also is easier / possible to split individual customers off to separate dedicated servers if the situation warrants the move. At the same time, aggregating data becomes much more difficult when trying to get a broad overview of how customers are using the software.
We expect to have less than 250 customers for at least a year after launch, but they will be large customers and more will follow afterward.
As this is our first leap into SaaS, we are definitely looking to do this right from the start.
This is a bit long for a comment.
In most cases, you want one database with a separate customer id column in the appropriate tables. This makes it much easier to maintain the application. For instance, it is much easier to replace a stored procedure in one database than in 250 databases.
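A minimal sketch of the single-database layout, using sqlite3 so it runs as-is. The `customers` and `invoices` tables, their columns, and the helper functions are hypothetical; the point is only that every tenant-facing query filters on `customer_id`, while cross-tenant reporting stays a single query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoices (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers (id),
        amount_cents INTEGER NOT NULL
    );
    -- Leading the index with customer_id supports per-tenant lookups.
    CREATE INDEX idx_invoices_customer ON invoices (customer_id, id);
    INSERT INTO customers (id, name) VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO invoices (customer_id, amount_cents)
    VALUES (1, 1000), (1, 2500), (2, 700);
    """
)


def invoices_for(conn, customer_id):
    """Tenant-facing query: always filtered on customer_id."""
    return conn.execute(
        "SELECT id, amount_cents FROM invoices "
        "WHERE customer_id = ? ORDER BY id",
        (customer_id,),
    ).fetchall()


def totals_by_customer(conn):
    """Cross-tenant reporting is one GROUP BY in the shared database."""
    return dict(
        conn.execute(
            "SELECT customer_id, SUM(amount_cents) "
            "FROM invoices GROUP BY customer_id"
        ).fetchall()
    )


acme_invoices = invoices_for(conn, 1)
totals = totals_by_customer(conn)
print(acme_invoices, totals)
```

With one database per customer, `totals_by_customer` would have to connect to every database in turn, which is the aggregation cost the question mentions.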
In terms of scalability, there is probably no issue. If you really wanted to, you could partition your tables by client.
There are some reasons why you would want a separate database per client:
Access control: maintaining access control at the database level is much easier than at the row level.
Customization: customizing the software for a client is much easier if you can just work in a single environment.
Performance bottlenecks: if the data is really large and/or there are really large numbers of transactions on the system, it might be simpler (and cheaper) to distribute databases on different servers rather than maintain a humongous database.
However, I think the default should be one database because of maintainability and consistency.
By the way, as for backup and restore: if a client requires this functionality, you will probably want to write custom scripts anyway. Although you could use the database-level backup and restore, you might have some particular needs, such as maintaining consistency with data not stored in the database.

Database Design: When should I use multiple database? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
Background:
I'm trying to build the backend services of an app. This app has rooms that a user can join. When a user joins a room, s/he can exchange some data with other users through a socket. Outside of the room, the user can view the processed data from the transactions that happened inside the rooms s/he joined. The room list, room information, and the data transactions inside the room should be stored in a database.
My idea is to create a project with one database. However, an experienced developer suggested splitting the database into two:
One that uses MongoDB for storing data transactions happening inside the room.
One that uses MySQL for storing and returning room list, room information and analytics of what happened inside a room.
Problem I see with using multiple databases:
I did some research and, from what I understand, using multiple databases is not recommended but could be done if the data are unrelated. Displaying analytics will require processing the data transactions that happened inside a room and also displaying the room's information. If I use the two-database approach, I will need to retrieve data from both databases to achieve this.
Question:
I personally think it's easier to use a single database, since I don't see the data inside and outside of the room as 'unrelated'. Am I missing an important point about when to use multiple databases?
Thanks in advance. Have a good day.
You can look at this problem from two perspectives; technical and practical.
Technically, if your back-end is expected to become very complex or scaled, it is recommended to break it down into multiple microservices, each in charge of a tiny task. Ideally, these small services should help you achieve separation of concerns, so each service only works with one piece of the data. If other services need to read and/or modify that piece of data, they have to go through the service in charge.
In this case, depending on the data each service is dealing with, you can pick the proper database. For instance, if you have transactional data, you can use MySQL, MongoDB for large schemaless content, or Elasticsearch if you want to perform a text search.
Practically, it is recommended to start small with one service and database (if you prefer to develop your app sooner), and then, over time break it down into multiple services as you need to add and/or improve features.
There are multiple points to keep in mind. First, if you expect to have a large user base, you should start development with the right architecture from the beginning to avoid scaling issues. Second, sometimes one database cannot perform the task you need. For example, it would be very inefficient to do a text search in MySQL. Finally, there is no absolute right way of doing things. However you do one thing, another friend might show up tomorrow and ask you why you did not do it his/her way. The most important thing is to start building, then learn and improve along the way. Facebook was started with MySQL and it worked fine initially. Could it survive on one database type today? I suspect the answer is no, but would it have made sense for them to add all the N databases that they have now back then?
Good luck developing!

MySQL performance issues in my mind [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
First of all, I'm not a very experienced developer. I build mid-size apps in PHP, MySQL and JavaScript.
One thing makes it hard for me to design a MySQL InnoDB database before each project: performance. I always worry that if I create a normalized database schema and later have to join several tables (say 5-6, usually with a few many-to-many and many-to-one relationships between them), performance will suffer a LOT once each of those tables has around 100k rows.
My projects are usually analytics platforms. I therefore expect around 100M clicks in total, and I usually have to join this table to many others (each with around 100k rows) to display data. I usually build summary tables for the clicks but cannot do the same for the other tables.
I'm not sure whether I should worry about future performance at this stage. I currently manage a few of these applications with 30M+ clicks, where the tables I join to the Clicks table have 40k+ rows. The performance is pretty bad: a SELECT usually takes more than 10-20s to complete, even though I believe I have proper indexing and an appropriate innodb_buffer_pool_size.
I've read a lot saying that the key to an optimized database is its design. That's why I usually think about the DB schema a LOT before creating it.
Do I really have to worry about creating DB schemas where I'll have to join 5-6 many-to-many/many-to-one/one-to-many tables, or is that quite usual and something MySQL should handle easily?
Is there anything else that I should consider before creating a DB schema?
My usual server setup is a MySQL server with 4GB RAM + 2 vCPUs to serve the DB, and a web server with 4GB RAM + 2 vCPUs. Both run Ubuntu 16.04 with the latest MySQL (5.7.21) and PHP7-fpm.
Gordon is right. RDBMSs are made to handle your kind of workload.
If you're using virtual machines (cloud, etc) to host your stuff, you can generally increase your RAM, vCPU count, and IO capacity simply by spending more money. But, usually, throwing money at DBMS performance problems is less helpful than throwing better indexes at them.
At the scale of 100M rows, query performance is a legitimate concern. You will, as your project develops, need to revisit your DBMS indexing to optimize the queries you're actually using. So plan on that. The thing is, you cannot and will not know until you get lots of data what your actual performance issues will be.
Read this for a preview of what's coming: https://use-the-index-luke.com/ .
One piece of advice: partitioning of tables generally doesn't solve performance problems except under very specific circumstances.
Look up this acronym: YAGNI.
And go do your project. Spend your present effort getting it working.

MySQL: How many queries per page is too many? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
I'm new to MySQL and something that's quickly becoming obvious to me is that it feels considerably easier to create several database queries per page as opposed to a few of them.... but I don't really have a feel for how many queries might be too many, or at what point I should invest more precious time to combining queries, spending time figuring out clever joins, etc.
I'm therefore wondering if there are some kind of "mental benchmarks" experienced folks here use with regard to number of queries per page, and if so, how many might be too many?
I understand that the correct answer in any context is related to what's needed to satisfy an application's functional requirements. However, on projects where client requirements may be flexible or not properly set, or on projects where you as the developer have full control (e.g. sites you develop for yourself), you may be able to negotiate between functionality and performance... basically, to just cut trivial features if coding requirements impact performance and you're unable to optimise it any further.
I would appreciate any views on this.
Thanks
There's no set number, "page" is arbitrary enough - one could be doing one database task while another could have 2 dozen widgets each with their own task.
One good rule of thumb though: the moment you put a SELECT inside a loop that's processing rows of another SELECT, stop. It might seem fast enough early on, but data tends to grow, and the number of queries grows with every row the outer SELECT returns (multiplying again for each further level of nesting), so expect it to become a bottleneck at some point. Even if the single query ends up being significantly slower, you'll be better off in the long run (and there are always stored procs, query caching, etc).
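To make the rule of thumb concrete, here is a small sketch of both shapes using sqlite3 (chosen only so it runs without a server; the `authors` and `posts` tables are invented). The nested version issues one query per author on top of the first one, while the joined version issues a single query whatever the number of authors.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors (id, name) VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO posts (author_id, title) VALUES (1, 'a1'), (1, 'a2'), (2, 'b1');
    """
)


def titles_nested(conn):
    """SELECT inside a loop over another SELECT: 1 + N queries."""
    queries = 1
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    result = {}
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id",
            (author_id,),
        ).fetchall()
        queries += 1
        result[name] = [title for (title,) in rows]
    return result, queries


def titles_joined(conn):
    """Same result from one query; LEFT JOIN keeps authors without posts."""
    rows = conn.execute(
        "SELECT a.name, p.title FROM authors a "
        "LEFT JOIN posts p ON p.author_id = a.id ORDER BY a.id, p.id"
    ).fetchall()
    result = {}
    for name, title in rows:
        titles = result.setdefault(name, [])
        if title is not None:
            titles.append(title)
    return result, 1


nested_result, nested_queries = titles_nested(conn)
joined_result, joined_queries = titles_joined(conn)
print(nested_queries, joined_queries)
```

With two authors the difference is 3 queries against 1; with ten thousand authors it is 10,001 against 1.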
It depends how often the page is used, the latency between the app server and database server, and a lot of other factors.
For a page which only displays data, my gut feeling is that 100 is too many. However, there are some cases where that may be acceptable.
In practice you should only optimise where necessary, which means you optimise the pages that people use the most, and ignore the minor ones.
In particular, if the pages are not available to the public and the (few) authorised users hardly ever use them, there is no incentive to make them faster.
If there is a real performance problem which you believe comes from having too many queries, enable the general query log (which may make performance worse, I'm afraid) and analyse the most common queries with a view to eliminating them.
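One way to analyse a query log is to strip the literal values so that queries differing only in their parameters count as the same statement. The sketch below assumes the SQL statements have already been extracted from the log into a list of strings; the sample statements and the `normalize` helper are made up, and the regular expressions are deliberately crude.

```python
import re
from collections import Counter


def normalize(sql):
    """Replace literals with '?' so similar queries group together."""
    sql = re.sub(r"'[^']*'", "?", sql)  # quoted string literals
    sql = re.sub(r"\b\d+\b", "?", sql)  # standalone numeric literals
    return re.sub(r"\s+", " ", sql).strip().lower()


# Hypothetical statements pulled out of a query log.
statements = [
    "SELECT * FROM users WHERE id = 17",
    "SELECT * FROM users WHERE id = 42",
    "SELECT name FROM settings WHERE key = 'theme'",
    "select * from users  where id = 99",
]

counts = Counter(normalize(s) for s in statements)
for shape, count in counts.most_common():
    print(count, shape)
```

The query shapes at the top of the list are the first candidates for caching, combining, or removing.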
You might find that there are some "low hanging fruits" - simple queries on rarely changing data which are called on most popular pages, which you can easily eliminate (for example, have your app server fetch the data on a cron job into a local file and read it from there). Or even "lower hanging fruits" like queries which are completely unnecessary.
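As a variation on the cron-job-and-local-file idea, rarely changing data can also be kept in the application process and refreshed after a fixed time. This is a different technique from the one described above, sketched here under assumptions: `TTLCache`, `load_countries`, and the 300-second lifetime are all invented for the example, and the loader stands in for a real database query.

```python
import time


class TTLCache:
    """Cache one loaded value and reload it after `ttl_seconds`."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._expires_at = None  # None means nothing has been loaded yet

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._loader()  # the only place the query runs
            self._expires_at = now + self._ttl
        return self._value


calls = []


def load_countries():
    """Hypothetical stand-in for a query on rarely changing data."""
    calls.append(1)
    return ["DE", "FR", "US"]


fake_now = [0.0]  # controllable clock so the example needs no real waiting
cache = TTLCache(load_countries, ttl_seconds=300, clock=lambda: fake_now[0])

first = cache.get()   # loads from the "database"
second = cache.get()  # served from memory
fake_now[0] = 301.0
third = cache.get()   # expired, so it loads again
print(len(calls))
```

Three page views cost two loads here; on a busy page the saving is whatever number of requests arrive within the lifetime. The trade-off is that users may see data up to `ttl_seconds` old.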
The difficulty with trying to combine multiple queries is that it tends to go against code-reuse and code maintainability, so you should only do it if it is ABSOLUTELY necessary; it doesn't sound like you have enough data yet to make that determination.