Creating a view that contains a view - performance hit? - sql-server-2008

When composing a view, should I stick with the base tables, or can I feel confident that including a view within a view will not hurt performance? I want to include the view because it would let me change one base view when the table design changes, as opposed to updating every single view that depends on the changed table. It just seems like the smarter thing to do, but I want to make sure I am not doing something that is considered bad practice or that hurts performance.

In SQL Server, views get resolved at compile time. There is a very small performance impact during compilation and no impact on the actual query execution. That assumes, however, that the same plan will be selected. If you nest views that contain complex joins, you might run into a situation where you access a table more often than necessary. The optimizer won't be able to figure that out, and the system will end up doing a lot more work than necessary. So be careful to put only views into a query that do not contain more tables than you would have accessed by writing the query without the view.
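To make the "more tables than necessary" point concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module rather than SQL Server, so the plan output and optimizer behaviour are only an analogy, and the table and view names (customers, orders, order_details, order_totals) are made up for illustration. The nested view only needs columns from orders, but because it is built on a view that joins customers, the plan still touches customers.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two tables and two nested views."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER,
            total REAL
        );
        INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
        INSERT INTO orders VALUES (10, 1, 25.0), (11, 2, 40.0);

        -- Base view (hypothetical): joins orders to customers.
        CREATE VIEW order_details AS
            SELECT orders.id AS order_id, orders.total, customers.name
            FROM orders JOIN customers ON customers.id = orders.customer_id;

        -- Nested view (hypothetical): only needs columns stored in orders.
        CREATE VIEW order_totals AS
            SELECT order_id, total FROM order_details;
    """)
    return conn


def plan_text(conn, sql):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "\n".join(row[3] for row in rows)


if __name__ == "__main__":
    conn = build_demo_db()
    print("Through the nested view:")
    print(plan_text(conn, "SELECT order_id, total FROM order_totals"))
    print("Directly against the base table:")
    print(plan_text(conn, "SELECT id, total FROM orders"))
```

On SQL Server you would compare the actual execution plans of the two queries instead; the idea is the same: check whether the view drags in tables the query does not need.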

Historically, some platforms have had "trouble" optimizing queries that incorporated multiple levels of views. I say "trouble", because most of the time even the poorly optimized queries were fast enough for me. (Almost all the time. But I try not to live near the bleeding edge.)
Several years ago, I decided I'd use views whenever it made sense to. Thoughtful use of views can greatly simplify complex databases; we all know that. But I decided to trust the optimizer to do a good enough job, and to trust the developers to release upgrades that made the optimizer better before my queries buried the server.
So if I thought a view would reduce the mental load on me, I created a view. If I needed to query a view of a view of a view, I just did it.
So far, that decision has proven to be a good one for me. I've never killed a server with a query, and I still understand my tables and views. (I still look at execution plans and test performance before I move a query to production, though.)

Related

Which one is faster view or subquery?

The question says it all. Which one is faster? And when should we use a view instead of a subquery, and vice versa, when it comes to speed optimisation?
I do not have a specific situation in mind, but I was thinking about this while trying some things with views in MySQL.
A smart optimizer will come up with the same execution plan either way. But if there were to be a difference, it would be because the optimizer was for some reason not able to correctly predict how the view would behave, meaning a subquery might, in some circumstances, have an edge.
But that's beside the point; this is a correctness issue. Views and subqueries serve different purposes. You use views to provide code re-use or security. Reaching for a subquery when you should use a view without understanding the security and maintenance implications is folly. Correctness trumps performance.
Neither is particularly efficient in MySQL. In any case, MySQL does no caching of data in views, so the view simply adds another step in query execution. This makes views slower than subqueries. Check out this blog post for some extra info: http://www.mysqlperformanceblog.com/2007/08/12/mysql-view-as-performance-troublemaker/
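As a small illustration of the "same result either way" point, the sketch below runs the same filter once through a view and once through an equivalent inline subquery. It uses Python's sqlite3 module, not MySQL, so it says nothing about MySQL's plans; the users table and active_users view are invented for the example.

```python
import sqlite3

# The same logical query, once through a view and once through a subquery.
VIEW_QUERY = "SELECT name FROM active_users WHERE age > 30 ORDER BY name"
SUBQUERY_QUERY = """
    SELECT name
    FROM (SELECT name, age FROM users WHERE active = 1) AS t
    WHERE age > 30
    ORDER BY name
"""


def build_db():
    """Create a tiny in-memory table plus a view over it (hypothetical names)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (name TEXT, age INTEGER, active INTEGER);
        INSERT INTO users VALUES
            ('ann', 35, 1), ('bob', 28, 1), ('cy', 50, 0), ('dee', 41, 1);
        CREATE VIEW active_users AS
            SELECT name, age FROM users WHERE active = 1;
    """)
    return conn


if __name__ == "__main__":
    conn = build_db()
    print(conn.execute(VIEW_QUERY).fetchall())
    print(conn.execute(SUBQUERY_QUERY).fetchall())
```

To see whether your own engine treats the two forms differently, compare their plans (EXPLAIN in MySQL) rather than relying on this toy.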
One possible alternative (if you can deal with slightly outdated data) is materialized views. Check out Flexviews for more info and an implementation.

SQL for large scale datasets

Some job descriptions include something such as "demonstrated skill in handling large-scale (massive) datasets using SQL".
I would like to know which kinds of SQL-related skill sets are required to meet the requirements of these jobs.
Designing a performant schema and knowing when to denormalize (and when you've got problems you can solve other ways.)
Efficient query design.
The intimate details of index design, to the point where you can make changes and get the results you expected.
How to build, support, and effectively make use of test data.
How to read all the breadcrumbs your server leaves in its trail (logs and query plan analyses in particular.)
How to tell how hardware, dbms software, and configuration work together and be able to adjust parameters and modify hardware without fear, and get the results you expected.
Everything related to SQL would be, IMO. Everything from query writing to DBA changes with large scale datasets.

How many queries are too many?

I have to run 10 MySQL queries at once for one person on one page. Is that very bad? I have quite good hosting, but still, can it break or something? Thank you very much.
Drupal sites typically make anywhere from 150 to 400+ queries per request. The total time spent querying the database is still under 1s - it's not the number that kills the server, but the quality/complexity of the queries (and possibly the size of the dataset they search through).
I can't tell what queries you're talking about but on most sites 10 is not much at all.
If you're concerned with performance, you can always see how long your queries take to execute in a database management program, such as MySQL Workbench.
10 fast queries can be better than 1 slow one. Define what's acceptable in terms of response time and throughput, under normal and peak traffic conditions, and measure whether these 10 queries are a problem or not (i.e. whether they fail to meet your expectations).
If they are, then try to change your design and find a better solution.
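A minimal sketch of the "define a budget, then measure" approach, assuming Python with the built-in sqlite3 module as a stand-in for your real database driver. The function names and the budget value are made up for illustration; a single run like this is only a rough indication, not a benchmark.

```python
import sqlite3
import time


def time_queries(conn, queries):
    """Run each query once and return (sql, elapsed_seconds) pairs."""
    timings = []
    for sql in queries:
        start = time.perf_counter()
        conn.execute(sql).fetchall()  # fetch everything so the work is really done
        timings.append((sql, time.perf_counter() - start))
    return timings


def over_budget(timings, budget_seconds):
    """True if the total query time exceeds the agreed budget."""
    total = sum(elapsed for _, elapsed in timings)
    return total > budget_seconds


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, label TEXT)")
    conn.executemany("INSERT INTO items (label) VALUES (?)",
                     [("item %d" % i,) for i in range(1000)])
    page_queries = ["SELECT COUNT(*) FROM items",
                    "SELECT label FROM items WHERE id = 42"]
    timings = time_queries(conn, page_queries)
    for sql, elapsed in timings:
        print("%.6fs  %s" % (elapsed, sql))
    print("over budget:", over_budget(timings, budget_seconds=0.2))
```

Run it under realistic data volumes and load, since timings on an empty development database tell you very little.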
How many queries are too many?
I will rephrase your question:
Is my app fast enough?
Come up with a business definition of "fast enough" for your application (based on business/user requirements), come up with a way to model all your usage scenarios and expected load, create simulations of that load and profile (trace/time) it.
This approach amounts to an educated guess. Anything short of it is pure speculation, and worthless.
If your application is already in production, and is working well in most cases, you can get feedback from users to determine pain points. From there, you can model those pain points and corresponding load, and profile.
Document your results. Once you make improvements to your application, you have a tool to determine if the optimizations you made achieved your goals.
When you are new to development, as I assume you are, I recommend focusing on the most logical and obvious way to avoid over-processing. That usually means not repeating a query: cache its first execution and check for cached results before running queries.
After that don't spend too much time thinking about the number of queries and focus on well-written code. That means a good use of classes, methods and functions. While still having much to learn, you do not want to over-complicate every interaction with the database.
Enjoy what you are doing and keep it neat. That will result in easier to debug code which in itself can lead to better performance when you have the knowledge to take your code further. The performance of an application can be improved very quickly if the original work is well-written.
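The caching advice above could look roughly like this. It is a sketch only, assuming Python with sqlite3 standing in for the real database; CachedQueryRunner is a hypothetical helper, and it deliberately has no invalidation, so it only makes sense for the lifetime of a single page request.

```python
import sqlite3


class CachedQueryRunner:
    """Remember the result of each (sql, params) pair for one request."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}
        self.db_hits = 0  # how many times we actually went to the database

    def fetch(self, sql, params=()):
        key = (sql, tuple(params))
        if key not in self.cache:
            self.db_hits += 1
            self.cache[key] = self.conn.execute(sql, params).fetchall()
        return self.cache[key]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE settings (name TEXT, value TEXT)")
    conn.execute("INSERT INTO settings VALUES ('theme', 'dark')")
    runner = CachedQueryRunner(conn)
    for _ in range(3):
        runner.fetch("SELECT value FROM settings WHERE name = ?", ("theme",))
    print("database hits:", runner.db_hits)
```

Keeping the cache behind one small class also fits the "keep it neat" advice: the rest of the code just calls fetch and does not care whether the result came from the database.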
It depends on how many CPU cycles the queries use in total.
1 query can consume far more CPU cycles than 100 others. It all depends on their contents.
You could begin by optimizing them following this guide: http://beginner-sql-tutorial.com/sql-query-tuning.htm
I think it's not a problem. 10 queries is not much for a site. Fewer is better, no question, but it is when you have 3000 - 5000 that you should think about your structure.
And if a single query goes through a table with millions of rows without an index, then even 10 are too many.
I have seen a Typo3 site with a lot of extensions that made 7500 requests even with the cache. This happens when you install and install and don't look at what happens.
But you can check whether logical JOINs over your tables would let you get by with fewer queries.
Well there are big queries and small trivial queries. Which ones are yours? Generally, you should try to fetch the data in as few queries as possible. The heavier the load is on the database server the harder it will be to serve the clients as the traffic increases.
Just to add a bit of a different perspective to the other good answers:
First, to concur, the type and complexity of queries you are making will matter more 99% of the time than the number of queries.
However, in the rare situation where there is high latency on the network path to your database server (i.e. the db server is remote or such, not saying this is a logical or sane setup, but I have seen it done) then you want to minimize the number of queries done, because every single time you talk to the database server the network transmission time will take an order of magnitude or two longer than it takes to compute the query. This situation can really kill your page loading times, and so you'd really want to minimize the number of queries (actually, you just want to change your server setup...).
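A back-of-the-envelope version of that latency argument, as a tiny Python sketch. The function name and all the millisecond figures are illustrative assumptions, not measurements; the model simply treats each query as one network round trip plus its execution time.

```python
def page_db_time_ms(query_count, round_trip_ms, execute_ms):
    """Total database time if every query costs one round trip plus execution."""
    return query_count * (round_trip_ms + execute_ms)


if __name__ == "__main__":
    # Assumed figures: 1 ms execution per small query, 5 ms for one combined query.
    print("10 queries, local db (1 ms round trip):  ",
          page_db_time_ms(10, 1, 1), "ms")
    print("10 queries, remote db (50 ms round trip):",
          page_db_time_ms(10, 50, 1), "ms")
    print("1 combined query, remote db:             ",
          page_db_time_ms(1, 50, 5), "ms")
```

With a local database the round trips barely register; with a high-latency link they dominate, which is why collapsing queries only pays off in that setup.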

Performance considerations when using MySQL VIEWs

I was considering using MySQL views to provide an abstraction when pulling data from the DB. As I was looking for material on this, I came across this article, which ends with:
"MySQL has long way to go getting queries with VIEWs properly optimized."
The article is from 2007. Is this still applicable? E.g., has MySQL solved these issues?
MySQL views work fine functionally, but they perform badly in the majority of cases.
This is because introducing even quite a simple view tends to cause a much worse query plan to be used by the optimiser. This makes using views impractical in the general case.
Often the server will materialise the entire view as a temporary table, which is not helpful for good performance (unless it happens to have only a very small number of rows in it).
So if you thought the explain plan was ok without a view, with a view it can turn terrible. This is still unresolved in the latest dev version of MySQL as far as I know.
You can have views, but don't expect decent (i.e. acceptable) performance.

Should I put my entire site on one database or split into two databases?

My webapp that I am developing will incorporate a forum (phpbb3 or smf), a wiki (doku wiki flat file storage), and my main web app.
All of these will share a User table. The forum I expect to see moderate usage (roughly 200 posts a day) to begin with and will increase from there. The main web app will be used heavily with triggers, mysql events, and stored procedures.
Would splitting the forum and main web app up into separate databases be the wiser choice (IE maintainability and performance)?
Why would you ever use two databases? What possible reason can there be? Please update the question to explain why you think two databases has some advantage over one database.
Generally, everyone tries to use one database to keep the management overhead down to an acceptable level. Why add complexity?
Keep your app simple to begin with. If you're not expecting a huge amount of traffic to begin with, then it's fine to have a single database. You can always upgrade your site later. Changing which database a table is stored in shouldn't require a huge amount of code.
It depends. Splitting is only worthwhile if it gives you a real opportunity for optimization; otherwise it is not. Also, even if you split, you still have to manage all the databases successfully, which adds a bit of overhead.
For the sake of maintainability in the future, I would definitely recommend against splitting DBs when you have a shared table. This will only serve to increase the size of your queries (since joins across DBs will require further qualification). This also will eventually lead to confusion when someone new can't figure out why their query doesn't work.
Furthermore, if the two DBs are actually running on separate servers or instances, you won't be able to join and will be forced to move back and forth between code and query to do something that would otherwise be a simple join. That may not be a big deal for simple one-row lookup queries, but for more complicated summary queries, it means offloading a lot of the processing that RDBMSs are specifically optimized for onto your more general-purpose programming language.
Separating into two databases does not guarantee better performance. Putting tables in different tablespaces (i.e. on different drives) could boost performance, since independent queries would less often compete for the same disk's read/write head.
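To show what "further qualification" means in practice, here is a sketch using Python's sqlite3 module, where ATTACH DATABASE plays the role of a second database on the same server (in MySQL you would write database.table instead). The forum schema name and the users and posts tables are hypothetical.

```python
import sqlite3


def build(conn):
    """Put users in the main database and posts in a second, attached one."""
    conn.execute("ATTACH DATABASE ':memory:' AS forum")
    conn.executescript("""
        CREATE TABLE main.users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE forum.posts (
            id INTEGER PRIMARY KEY,
            user_id INTEGER,
            title TEXT
        );
        INSERT INTO main.users VALUES (1, 'ann');
        INSERT INTO forum.posts VALUES (100, 1, 'Hello');
    """)


def posts_with_authors(conn):
    """Join across the two databases; note the schema prefix on each table."""
    return conn.execute("""
        SELECT users.name, posts.title
        FROM main.users
        JOIN forum.posts ON posts.user_id = users.id
    """).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build(conn)
    print(posts_with_authors(conn))
```

This only works because both databases are reachable through one connection. Once they live on separate servers, the join above has to be replaced by two queries and application code, which is the situation described in the paragraph above.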