Can MySQL queries dramatically slow my website? - mysql

Hey, I'm building a website that requires about five MySQL "SELECT * FROM" queries on each page, and I'd like to know whether that can slow the page load in any way.

Yes.
Here are some useful links to help you understand how to measure MySQL performance and make changes to improve it:
Linux Mag MySQL Tuning article
MySQL Docs (Memory Use)
MySQL Performance Cookbook

MySQL will have no impact on the download speed (i.e., the time it takes for HTML content to get from your server to a visitor's web browser). However, the queries may create a delay between when your server gets the request and when it can send that HTML. Here's the sequence of events:
Visitor sends a request: "Please send me example.com/some-page"
Your server does some work to generate what some-page is supposed to look like and produce appropriate HTML
Your server sends that page to the visitor
MySQL doesn't affect #1 or #3, but of course it's a key part of what's happening in #2.
The big question is: how much of an impact will it have? If your five SELECT queries are each selecting one row from a table with only a hundred rows, the total impact on performance will be negligible.
If, on the other hand, each query is doing complex JOINs and subqueries on large tables, you could easily notice a difference.
The easiest way to get a sense of this impact is to connect directly to your MySQL server (i.e., not through your PHP script) and run those queries to see how long they take. If they're running slowly, you can always come back to StackOverflow for advice on how to make a particular query run more efficiently.
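If you'd rather time queries from a script than from the mysql command-line client, a minimal timing harness looks like this. It is a sketch: SQLite is used here only so the example is self-contained, and the table and query are made up; with MySQL you would open the connection with your driver of choice (e.g. mysqlclient or PyMySQL) and the pattern is identical.

```python
import sqlite3
import time

def time_query(conn, sql, params=()):
    """Run a query, return its rows and how long it took in milliseconds."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    elapsed_ms = (time.perf_counter() - start) * 1000
    return rows, elapsed_ms

# Demo against an in-memory SQLite table so the sketch runs anywhere;
# with MySQL you would create the connection with your driver instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

rows, ms = time_query(conn, "SELECT * FROM users")
print(f"{len(rows)} rows in {ms:.2f} ms")
```

Run each of your five queries through something like this a few times; if one of them dominates the total, that's the one to optimize.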

Sure. Especially if the tables contain lots of rows.

If you have queries that take a long time, then the page will appear to take longer to load. Once the server has finished creating the HTML to send to the client (where the queries happen), the download speed will depend on how big the content of the page is.

Related

Which would be more efficient, having each user create a database connection, or caching?

I'm not sure if caching would be the correct term for this but my objective is to build a website that will be displaying data from my database.
My problem: There is a high probability of a lot of traffic and all data is contained in the database.
My hypothesized solution: would it be faster to create a separate program (in Java, for example) that connects to the database every couple of seconds and updates the HTML files (where the data is displayed) with the new data? (This would also increase security, since users would never connect to the database directly.) Or should I just have each user create a connection to MySQL (using PHP) and fetch the data?
If you've had any experiences in a similar situation please share, and I'm sorry if I didn't word the title correctly, this is a pretty specific question and I'm not even sure if I explained myself clearly.
Here are some thoughts for you to think about.
First, I do not recommend generating static files; trust MySQL instead. However, work on configuring your environment to support your traffic and application.
You should understand your data a little more (how often does the data in your tables change? What kinds of queries are you running against it? Are those queries optimized?).
Make sure your tables are optimized and indexed correctly. Make sure all your queries run fast (nothing causing long row locks).
If your tables are not updated very often, you should consider using the MySQL query cache, as this will reduce your I/O and increase query speed. (BUT wait! If your tables are updated all the time, this will kill your server performance big time.)
Setting the query cache to "ON" is, in my experience, almost always a bad idea unless the data in your tables never changes. With it set to "ON", MySQL will cache every query; then, as soon as the data in a table changes, MySQL has to invalidate every cached query that touches it, and it works harder clearing the cache, which gives you bad performance. I like to set it to "DEMAND" instead,
as from there you can control which queries should be cached and which should not, using SQL_CACHE and SQL_NO_CACHE.
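For reference, with query_cache_type set to DEMAND, caching becomes opt-in per query. The table names below are hypothetical:

```sql
-- Cached on demand: the result is stored until the underlying table changes
SELECT SQL_CACHE id, name FROM products WHERE category = 'books';

-- Never cached: for queries whose results change constantly
SELECT SQL_NO_CACHE COUNT(*) FROM sessions WHERE active = 1;
```

(Note that the query cache was deprecated in MySQL 5.7 and removed entirely in MySQL 8.0, so this advice applies to older versions.)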
Another thing you want to review is your server configuration and specs.
How much physical RAM does your server have?
What type of hard drives are you using? SSDs? If not, at what speed do they rotate? Perhaps 15k RPM?
What OS are you running MySQL on?
How is the RAID set up on your hard drives? RAID 10 or RAID 50 will help you out a lot here.
Your processor speed will make a big difference.
If you are not using MySQL 5.6.20+, you should consider upgrading, as MySQL has been improved to help you even more.
Is your innodb_buffer_pool_size set to around 75% of your total physical RAM? Are you using InnoDB tables?
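As a sketch, the InnoDB settings in question live in my.cnf; the values below are illustrative starting points, not recommendations for any particular hardware:

```ini
[mysqld]
# The biggest single InnoDB win: cache data and indexes in RAM.
# A common starting point on a dedicated DB server is 50-75% of physical RAM.
innodb_buffer_pool_size = 6G

# Redo log buffer; the default of a few MB is usually fine.
innodb_log_buffer_size = 16M
```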
You can also use MySQL replication to increase the read capacity for your data. You have multiple servers with the same data, and you can point half of your traffic to read from server A and the other half from server B, so the same work is shared across multiple servers.
Here is one argument for you to think about: Facebook uses MySQL and handles millions of hits per second, yet they stay up essentially 100% of the time. True, they have an enormous budget and a huge network, but the idea here is that you can trust MySQL to get the job done.

Slow remote queries

I am working on a Rails application and was using SQLite during development and the speed had been very fast.
I have started to use a remote MySQL database hosted by Amazon and am getting very slow query times. Besides trying to optimize the remote database, is there anything on the Rails side of things I can do?
Local database access vs. remote access will show a significant difference in speed. Since you haven't provided any specifics, I can't zero in on the issue, but I can make a suggestion:
Try caching your queries and views as much as possible. This will reduce the amount of queries you need to do. This works well especially for static data like menus.
Optimization is the key. Make sure you eliminate as many unnecessary queries as you can, and that the queries you do make request only the fields you need, using the select method.
Profile the various components involved. The database server itself is one of them; network latency is another. While there is probably little you can do about the latency, you can likely tune the database side quite a lot, starting with profiling the queries and moving on to tweaking the server itself.
Knowing where to look will help you start with the best approach. As for caching, always keep it in mind, but it can prove quite problematic depending on the nature of your application.

How significant is the "amount of queries used" to site performance/speed?

I've been wondering how significant the number of queries used is to site performance/speed.
I have two websites running on different engines (one on IPB and another on MyBB). The IPB one uses fewer queries (only 14 on average), but it runs slower than the MyBB one, which uses more (20 on average).
I thought this was because the IPB is heavily modded, so I ran a fresh install on my localhost.
But the result was still the same: the IPB (which uses fewer queries) runs slower than MyBB (which uses more).
This makes me wonder: how does the number of queries affect site performance? Is it significant?
Well, the quantity of queries is one factor, but you also have to consider each individual query: does it do joins? Lots of math? Subqueries? Do the tables have indexes? Etc.
Take a look at these links on optimisation, knowing how to optimise something can tell you what can cause it to slow down.
http://www.fiftyfoureleven.com/weblog/web-development/programming-and-scripts/mysql-optimization-tip
http://msdn.microsoft.com/en-us/library/ff650689.aspx
http://hungred.com/useful-information/ways-optimize-sql-queries/
I believe this question has many answers; there are many things to consider. For example, a query that returns a LOT of rows will take more time than one against a table with just a few records. Another factor is the number of queries, as you note in your post. Another case I can think of is how you issue the queries: if you are making them through jQuery or JavaScript Ajax calls, they will take considerably longer.
Sheer number of queries per page does not correlate with site responsiveness and speed.
What does matter is:
What are those queries actually doing?
How well can the DB server cope with the load they generate?
Are they executed serially or in parallel?
It depends on what you consider 'significant'.
Let's consider two scenarios:
We have 'lots' of queries that take quite some time to execute
It's bad, simply because it takes a long time to process them all.
We have 'lots' of queries that take very little time to execute
If your database server is on a different machine than your web server, this might be a problem due to communication overhead: the web server and database server will most probably spend more time communicating than processing each query (think of network latency).
If your database server is on the same machine as your web server, it may not affect site performance much, since communication between the two is very quick. BUT there are other things to consider. For example, you might be locking some tables A LOT for update/select queries, and that will decrease site performance considerably.
It's always better to execute fewer queries.
You should remember that on an average website, roughly 80% of perceived performance comes from the frontend and only 20% from the backend.
So for website performance it is in most cases more relevant to optimize the frontend; that is where you can score the big wins quickly.
Well, of course there are scenarios with sites displaying LOTS of data coming from a DB. There it is worth thinking about optimizing backend performance, and optimizing backend performance means optimizing database work in most cases.
A good book about this is "High Performance Web Sites" by Steve Souders.

What would be the best DB cache to use for this application?

I am about 70% of the way through developing a web application which contains what is essentially a largish data table of around 50,000 rows.
The app itself is a filtering app providing various different ways of filtering this table, such as range filtering by number, drag-and-drop filtering that ultimately performs regexp filtering, live text searching, and I could go on and on.
Because of this, I coded my MySQL queries in a modular fashion, so that the query itself is put together dynamically depending on the type of filtering happening.
At the moment each filtering action takes between 250 and 350 ms in total, on average. For example:
The user grabs one end of a visual slider and drags it inwards; when they let go, a range-filtering query is dynamically put together by my PHP code and the results are returned as a JSON response. The total time from the user letting go of the slider until all the data is received and the table redrawn is between 250 and 350 ms on average.
I am concerned with scalability further down the line, as users can be expected to perform a huge number of these filtering actions in a short space of time in order to find the data they are looking for.
I have toyed with doing some fancy cache-expiry work with memcached but couldn't get it to play ball with my dynamically generated queries. Although everything would cache correctly, I had trouble expiring the cache when the query changed and keeping the data relevant. I am, however, extremely inexperienced with memcached, and my first few attempts have led me to believe it isn't the right tool for this job (due to the highly dynamic nature of the queries), although this app could ultimately see very high concurrent usage.
So... My question really is, are there any caching mechanisms/layers that I can add to this sort of application that would reduce hits on the server? Bearing in mind the dynamic queries.
Or... If memcached is the best tool for the job, and I am missing a piece of the puzzle with my early attempts, can you provide some information or guidance on using memcached with an application of this sort?
Huge thanks to all who respond.
EDIT: I should mention that the database is MySQL. The site itself is running on Apache with an nginx proxy. But this question relates purely to speeding up and reducing the database hits, of which there are many.
I should also add that the quoted 250-350 ms round-trip time is fully remote, as in from a remote computer accessing the website. The time includes DNS lookup, data retrieval, etc.
If I understand your question correctly, you're essentially asking for a way to reduce the number of queries against the database, even though very few of them will be exactly the same.
You essentially have three choices:
Live with a large number of queries against your database; optimise the database with appropriate indexes and normalise the data as far as you can. Make sure to avoid common performance pitfalls in your query building (lots of ORs in ON clauses or WHERE clauses, for instance). Provide views for mashup queries, etc.
Cache the generic queries in memcached or similar, that is, without some or all filters. And apply the filters in the application layer.
Implement a search index server, like SOLR.
I would recommend the first, though. A round-trip time of 250-350 ms sounds a bit high even for complex queries, and it sounds like you have a lot to gain by just improving what you already have at this stage.
For much higher workloads, I'd suggest solution number 3, it will help you achieve what you are trying to do while being a champ at handling lots of different queries.
Use Memcache and set the key to be the filtering query or some unique key based on the filter. Ideally you would write your application to expire the key as new data is added.
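One way to handle the expiry problem with dynamically built queries is a "version key": hash the generated SQL into the cache key and include a per-table version counter that every write bumps, so stale entries are simply never looked up again. A minimal sketch, with a dict standing in for the memcached client and all names made up:

```python
import hashlib

cache = {}  # stand-in for a memcached client (same get/set idea)

def table_version(table):
    # Per-table counter; bumping it makes every old key unreachable.
    return cache.get(f"version:{table}", 0)

def bump_version(table):
    # Call this on every write to `table` instead of deleting keys one by one.
    cache[f"version:{table}"] = table_version(table) + 1

def cached_query(table, sql, run_query):
    # Key = table name + current version + hash of the dynamically built SQL.
    digest = hashlib.sha1(sql.encode()).hexdigest()
    key = f"{table}:{table_version(table)}:{digest}"
    if key not in cache:
        cache[key] = run_query(sql)
    return cache[key]

calls = []
def fake_run(sql):
    calls.append(sql)          # pretend this hits MySQL
    return ["row1", "row2"]

cached_query("items", "SELECT * FROM items WHERE price < 10", fake_run)
cached_query("items", "SELECT * FROM items WHERE price < 10", fake_run)  # hit
bump_version("items")  # data changed: old cached entries are now stale
cached_query("items", "SELECT * FROM items WHERE price < 10", fake_run)
print(len(calls))  # 2: the DB was queried twice, not three times
```

With real memcached the stale entries simply age out via LRU; you never have to enumerate and delete them.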
You can only make good use of caches when you occasionally run the same query.
A good way to work with memcache caches is to define a key that matches the function that calls it. For example, if the model named UserModel has a method getUser($userID), you could cache all users as USER_id. For more advanced functions (Model2::largerFunction($arg1, $arg2)) you can simply use MODEL2_arg1_arg2 - this will make it easy to avoid namespace conflicts.
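That naming scheme could be sketched like this. A dict stands in for the memcache client, the model and method names are made up, and this variant also folds the method name into the key:

```python
def cache_key(model, method, *args):
    """Build a namespaced cache key, e.g. USERMODEL_GETUSER_42."""
    parts = [model.upper(), method.upper()] + [str(a) for a in args]
    return "_".join(parts)

store = {}  # stand-in for a memcache client

def get_user(user_id):
    key = cache_key("UserModel", "getUser", user_id)
    if key not in store:
        store[key] = {"id": user_id}  # pretend this row came from the DB
    return store[key]

print(cache_key("Model2", "largerFunction", "arg1", "arg2"))
# MODEL2_LARGERFUNCTION_arg1_arg2
```

Because every key starts with the model and method, two different methods can never collide, which is exactly the namespace-conflict problem the convention is meant to avoid.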
For fulltext searches, use a search indexer such as Sphinx or Apache Lucene. They improve your queries a LOT (I was able to do a fulltext search on a 10-million-record table on a 1.6 GHz Atom processor in less than 500 ms).

How many queries are too many?

I have to run about 10 MySQL queries per visitor on a single page. Is that very bad? I have quite good hosting, but still, could it break something? Thank you very much.
Drupal sites typically make anywhere from 150 to 400+ queries per request. The total time spent querying the database is still under 1s - it's not the number that kills the server, but the quality/complexity of the queries (and possibly the size of the dataset they search through).
I can't tell what queries you're talking about but on most sites 10 is not much at all.
If you're concerned with performance, you can always see how long your queries take to execute in a database management program, such as MySQL Workbench.
10 fast queries can be better than 1 slow one. Define what's acceptable in terms of response time and throughput, in normal and peak traffic conditions, and measure whether these 10 queries are a problem or not (i.e., whether they fail to meet your expectations).
If they are, then try to change your design and find a better solution.
How many queries are too many?
I will rephrase your question:
Is my app fast enough?
Come up with a business definition of "fast enough" for your application (based on business/user requirements), come up with a way to model all your usage scenarios and expected load, create simulations of that load and profile (trace/time) it.
This approach amounts to an educated guess. Anything short of it is pure speculation, and worthless.
If your application is already in production, and is working well in most cases, you can get feedback from users to determine pain points. From there, you can model those pain points and corresponding load, and profile.
Document your results. Once you make improvements to your application, you have a tool to determine if the optimizations you made achieved your goals.
When you are new to development, as I assume you are, I recommend focusing on the most logical and obvious way to avoid over-processing: usually, that means avoiding repeated queries by caching the result of the first execution and checking for cached results before running a query again.
After that, don't spend too much time thinking about the number of queries; focus on well-written code. That means good use of classes, methods, and functions. While you still have much to learn, you don't want to over-complicate every interaction with the database.
Enjoy what you are doing and keep it neat. That will result in code that is easier to debug, which in itself can lead to better performance once you have the knowledge to take your code further. The performance of an application can be improved very quickly if the original work is well written.
It depends on how many CPU cycles the sum of the queries will use.
One query can consume far more CPU cycles than 100; it all depends on their contents.
You could begin by optimizing them following this guide: http://beginner-sql-tutorial.com/sql-query-tuning.htm
I think it's not a problem. 10 queries is not that much for a site. Fewer is better, no question, but when you have 3,000-5,000, then you should think about your structure.
And when one of those queries scans a table with millions of rows without an index, then even 10 are too many.
I have seen a Typo3 site with a lot of extensions that made 7,500 queries even with the cache. This happens when you install and install and never look at what is actually happening.
But you can use logical JOINs across your tables so that you need fewer queries.
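The "fewer queries via logical JOINs" point is the classic N+1 fix. A self-contained sketch (SQLite is used here only for portability; the SQL itself is the same in MySQL, and the tables are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO posts VALUES (1, 1, 'hello'), (2, 2, 'world');
""")

# N+1 pattern: one query for the users, then one more query per user.
users = conn.execute("SELECT id, name FROM users").fetchall()
for uid, _name in users:
    conn.execute("SELECT title FROM posts WHERE user_id = ?", (uid,)).fetchall()
# -> 1 + len(users) round trips to the database

# One JOIN replaces them all: a single round trip, same information.
rows = conn.execute("""
    SELECT u.name, p.title
    FROM users u
    JOIN posts p ON p.user_id = u.id
""").fetchall()
print(len(rows))  # 2
```

On a remote database server, collapsing N+1 queries into one JOIN saves N network round trips, which usually dwarfs the query time itself.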
Well, there are big queries and small, trivial queries; which ones are yours? Generally, you should try to fetch the data in as few queries as possible. The heavier the load on the database server, the harder it will be to serve clients as traffic increases.
Just to add a bit of a different perspective to the other good answers:
First, to concur, the type and complexity of queries you are making will matter more 99% of the time than the number of queries.
However, in the rare situation where there is high latency on the network path to your database server (i.e. the db server is remote or such, not saying this is a logical or sane setup, but I have seen it done) then you want to minimize the number of queries done, because every single time you talk to the database server the network transmission time will take an order of magnitude or two longer than it takes to compute the query. This situation can really kill your page loading times, and so you'd really want to minimize the number of queries (actually, you just want to change your server setup...).