My python application looks as follows:
def create_something(data):
    some_stuff
    insert_into_database(data)
Now the some_stuff is quite long and complicated, but it could in principle be done by a MySQL stored procedure. So I could instead add a BEFORE INSERT trigger and do all of it directly in the database.
The implementation difference is that right now the application is not stateless, essentially because of some_stuff. Moving all of this into a database procedure would let me make the application completely stateless, which I would like. But what is the performance difference between these two options?
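For illustration, here is a minimal sketch of the trigger variant. It uses Python's built-in sqlite3 module only so that it runs anywhere; the `things` table, its columns, and the slug rule are hypothetical stand-ins for some_stuff. SQLite triggers cannot rewrite NEW values, so the sketch uses AFTER INSERT plus an UPDATE, whereas a MySQL BEFORE INSERT trigger could assign with SET NEW.col directly.

```python
import sqlite3


def make_things_db():
    """Create an in-memory database whose trigger does the derived work."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE things (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            slug TEXT
        );
        -- Hypothetical stand-in for some_stuff: derive slug from name.
        CREATE TRIGGER things_fill_slug AFTER INSERT ON things
        BEGIN
            UPDATE things
               SET slug = lower(replace(trim(NEW.name), ' ', '-'))
             WHERE id = NEW.id;
        END;
    """)
    return conn


def create_something(conn, name):
    # The application only inserts; the trigger fills the derived column.
    cur = conn.execute("INSERT INTO things (name) VALUES (?)", (name,))
    conn.commit()
    return cur.lastrowid
```

This says nothing about speed by itself; it only shows that the application side shrinks to a single INSERT once the logic lives in the database.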
Fewer calls to the database is better -- especially if there is a long physical distance between client and server.
Some algorithms are really messy to implement in SQL, hence your app language is better.
Some algorithms are much more straightforward to implement in SQL, hence SQL is better. Example: A JOIN is 'better' than nested loops in the app.
Using Triggers can obfuscate the code; keep this in mind when deciding whether to use them in SQL versus writing out more SQL statements. Also, a Stored Procedure might be a clean way to express the code.
"Cursors" in Stored Procs are horrible -- both in syntax and speed.
A database is a repository for data; it is not a processing engine. Do not expect to do all processing in SQL.
A good interface (in this case, between SQL and your app) is a minimal one. For example, summarize the data in SQL and return only the aggregates, rather than shoveling lots of data to the app to summarize.
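The last point can be sketched as follows. This is an illustration only, using Python's built-in sqlite3 module and a hypothetical `orders` table; both functions return the same totals, but the second sends one row per customer across the interface instead of every order.

```python
import sqlite3


def make_orders_db():
    """In-memory database with a small, made-up orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("ann", 10.0), ("ann", 5.0), ("bob", 7.5)],
    )
    return conn


def totals_in_app(conn):
    # Pulls every row across the interface, then sums in Python.
    totals = {}
    for customer, amount in conn.execute("SELECT customer, amount FROM orders"):
        totals[customer] = totals.get(customer, 0.0) + amount
    return totals


def totals_in_sql(conn):
    # Lets the database aggregate; only one row per customer comes back.
    rows = conn.execute(
        "SELECT customer, SUM(amount) FROM orders GROUP BY customer"
    )
    return dict(rows)
```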
I'm already using triggers heavily to avoid unnecessary PHP code, but I haven't actually used many stored procedures yet.
I was asking myself whether it would actually make sense to move all SELECTs, INSERTs, and UPDATEs into stored procedures, and have no actual SQL in the application (besides transactions and such) but only procedure calls.
Would that actually make sense? What drawbacks would I face?
By reading the MySQL Doc, I found this:
"Stored routines can provide improved performance because less
information needs to be sent between the server and the client. The
tradeoff is that this does increase the load on the database server
because more of the work is done on the server side and less is done
on the client (application) side. Consider this if many client
machines (such as Web servers) are serviced by only one or a few
database servers. "
Well, if I used stored procedures mainly as wrappers around SQL code that I would execute anyway, would that really make much of a difference for the database server?
Mainly, I would like to know if this actually is a good idea, and what drawbacks I may face doing that.
Thanks for the help
Are they useful to anyone other than a database administrator? If I understand them correctly it's merely queries that can be saved directly into MySQL, so it'd be useless for any web development team to use them.
Stored procedures are code that runs on the database server.
They have a number of uses. Think: If I could run code directly on the database server, what could I use that for?
Among their many uses, stored procedures can be used to shift some of the processing load to the database server, to reduce network traffic, and to improve security.
http://en.wikipedia.org/wiki/Stored_procedure
Here are two good, simple advantages not covered in the other answers:
Security - parameterized stored procedures are safer than concatenating strings into SQL (search for "SQL injection" for about a million documents on this). However, parameterized queries are also good for this if your language supports them.
Simplification of Maintenance - It's a heck of a lot easier to update a stored procedure than to recompile code and re-deploy. In my 15 years of development I've learned this the hard way. If there's a chance the query might change, put it in a stored proc. It's SOOO much easier to work with than having to recompile and redeploy code.
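To make the security point concrete, here is a small sketch using Python's built-in sqlite3 module. The `users` table and the function names are made up for illustration. The first lookup splices input into the SQL text and can be subverted; the second binds the value as a parameter.

```python
import sqlite3


def make_users_db():
    """In-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def find_unsafe(conn, name):
    # Vulnerable: the input becomes part of the SQL text itself.
    sql = "SELECT secret FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(sql)]


def find_safe(conn, name):
    # The driver binds the value; it is never parsed as SQL.
    sql = "SELECT secret FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(sql, (name,))]
```

With the input `x' OR '1'='1`, the concatenating version matches every row, while the parameterized version looks for a user with that literal name and finds none.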
Added
They also reduce network chatter. If you have a lot of complex queries to run, you can have them all in one stored procedure, and your app only needs to make one call to do all the work.
Also, on most platforms, stored procedures have performance benefits. In SQL Server, for example, the Database Engine optimizes and saves the execution plan to speed things up.
these links also answer your question:
http://blog.sqlauthority.com/2007/04/13/sql-server-stored-procedures-advantages-and-best-advantage/
http://searchsqlserver.techtarget.com/news/1052737/Why-use-stored-procedures
And I can't take credit for this answer, but I think this quote is a good point, even though I consider myself to be pretty skilled on both sides of the equation - there is something to be said for specialized knowledge.
Advantage 4: Stored procedures are
usually written by database
developers/administrators. Persons
holding these roles are usually more
experienced in writing efficient
queries and SQL statements. This frees
the GUI application developers to
utilize their skills on the functional
and graphical presentation pieces of
the application. If you have your
people performing the tasks to which
they are best suited, then you will
ultimately produce a better overall
application.
Think of a stored procedure as a library function. Do you want to rewrite sqrt (for example) every time you have to compute a square root, or would you rather call a function to do it?
That's the benefit (in a nutshell) of stored procedures.
Stored procedures have lots of benefits. Among other things they help decouple application code from the database tables, simplify database maintenance and versioning and help take the best advantage of DBMS features such as query parameterisation, optimisation and security.
it'd be useless for any web
development team to use them
Huh? Stored procedures are extremely useful for any developers who need to use a database that supports them.
Stored Procedures can do far more than just query the database. They can contain any T-SQL statement. So you could use them to perform business logic, execute queries, do backups etc.
Many companies have a policy that all database activity is to be done via stored procedures. So, in no way would I say that a web development team would have no use for them. They might make great use of them.
On the other hand, in our company, we're not using them much at all for our next generation manufacturing applications. We're using an ORM (Linq-To-SQL) instead, and have very little use for stored procedures at this point. I suspect, though, that we'll still use them somewhat, in order to avoid several trips back and forth to the server. Some things are just more efficient if done in a stored procedure, if you're already doing work on the server anyway.
Back in the 90's, stored procedures were the safest way to prevent anyone from accessing the data tables directly.
First, they were meant to counter security issues. Second, they made the data easier to work with, as there were no ORM tools like today's.
These days, they are meant mainly for complex transactions. By complex, I mean something that cannot be solved with the simple CRUD operations that NHibernate or Entity Framework can do.
So, when a stored procedure only performs a SELECT, an INSERT, an UPDATE or a DELETE, you may be right that it is now somewhat useless, as you can perform these basic repeated operations through an ORM tool. However, when you have to build a report, for instance, that pulls together disparate data and performs further calculations, and the result is complex to compute, then you had better let the database engine work it out, as it is designed to compute such data.
In addition to what others have said about security, encapsulation, performance, etc. of stored procedures, I'd like to add that the usefulness of stored procedures increases with the richness of the stored procedure language.
I don't have much experience with MySQL, but as far as I know the stored procedure language is pretty limited.
T-SQL (in Microsoft SQL Server) is more capable, but has several shortcomings compared to full-featured programming languages. For example, it is not possible to declare a constant value in T-SQL, and until quite recently there was no exception handling, so error handling was a pain. There is no concept of a package, so all your code will be stand-alone procedures with no way to group them together except for a good naming convention. (Although it's true that you can write stored procedures in .NET languages.)
On the other hand, PL/SQL (in Oracle), is a full-featured programming language with complex data types, exception handling, packages for grouping procedures (with separate public and private sections), object types, and lots and lots of built-in packages that deal with everything from file access to compression and generating web pages. All that, plus seamless integration with the database and the SQL language. Entire applications can be built using PL/SQL, without "leaving the database", so to speak. Check out http://apex.oracle.com for an example of a massive (framework) application implemented in pure PL/SQL.
Let's say you insert two rows in different tables, the second insert requires the id from the first.
$sql = "INSERT INTO t1 (f1,f2...) VALUES (v1, v2...)";
mysql_query($sql, $conn);
$id = mysql_insert_id();
$sql2 = "INSERT INTO t2 (f1,f2,id,f3...) VALUES (v1,v2,$id,v3....)";
mysql_query($sql2, $conn);
You went to the database twice: two server requests and responses. If you can store the whole sequence (INSERT, capture the insert id, INSERT) on the server in my_proc, you only have to go once.
$sql = "CALL my_proc(arguments)";
mysql_query($sql);
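For comparison, the same single-call idea can be sketched in Python. This is only a sketch under assumptions: it expects a DB-API connection whose cursor implements the optional callproc() method described in PEP 249, and a procedure named my_proc that already exists on the server. The helper name create_pair is made up.

```python
def create_pair(conn, *args):
    """Run both inserts server-side with a single procedure call.

    Assumes `conn` is a DB-API connection whose cursors support the
    optional callproc() method, and that my_proc exists on the server.
    """
    cur = conn.cursor()
    try:
        # One round trip: the procedure performs both INSERTs.
        cur.callproc("my_proc", args)
        conn.commit()
    finally:
        cur.close()
```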
Think of them as procedures in any regular program. A way of encapsulating a chunk of logic under a single invocation method.
If you consider functions in programs useless, then this conversation ends here. If you think they're useful, then there's nowhere else to go either. Nothing forces you to use them, but they're available if you should choose to.
If I understand them correctly it's
merely queries that can be saved
directly into MySQL, so it'd be
useless for any web development team
to use them.
Even with that (limited) definition, you're implying that web development teams never have any need of querying a database? Really?
A well written set of stored procs can completely remove queries from your client application and replace all of that with calls to the procedures. Now, I'm not saying that that's the only way of doing things, or even the right way. There's quite the discussion about that going on overall. But it is a very valid way of using them.
So, useless for a webdev team?
In addition to all the answers given here, I would also like to point out that stored procedures are a way to save the execution plan of a query.
You may have a set of SQL statements you just call from your application, but each time you execute the query, SQL Server has no way of knowing that the query you just invoked is exactly the same as the one you called a few minutes ago (which would happen very frequently in a web application). So SQL Server has to repeat all the processing again (build the query plan and execute it).
Now if the same query had been encapsulated within a stored procedure, SQL Server would have saved the execution plan for that stored procedure, so that each time you call the sproc it doesn't have to recompile the execution plan. (It may even cache the data based on the parameters passed to the sproc, but I don't know exactly how this works.)
It is easier to performance tune code in stored procs than what most ORMs create. It is easier to use stored procs when there are multiple applications that access the same database and might need to do the same things. It is far easier to refactor databases when all code is in stored procs because you can easily see where the changes need to be made. It is easier to use stored procs for things that don't normally hook up to ORMs like SSIS or reporting applications. Using stored procs you can limit access to only what the proc does and not allow access directly to the tables or views. This is critical in enforcing internal controls on financial data for instance and helps prevent fraud.
I've written complex procs that were well over 1000 lines long. Try getting an ORM to write that kind of SQL. Then try to get it to run without timing out.
Is it considered crazy to store common SQL queries for my web app in a database for use in execution? Or is that common practice? Or is it impossible?
My thinking is, this way, I avoid hard-coding SQL into my application files, and add another level of abstraction.
Is this crazy? Is this what a stored procedure is? Or is that something else?
EDIT: The below answers are useful as a background for 'stored procedures', but didn't answer my core question: Is a 'stored procedure' just when I have a database table that contains queries that can be called? ie, something like this
INDEX | NAME | QUERY
1 | show_names | "SELECT names.first, names.last FROM names;"
2 | show_5_cities | "SELECT cities.city FROM cities LIMIT 0,5;"
etc.
Or is there a more complicated mechanism that encompasses the concept of stored procedures? Is my example an actual example of something people do?
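To make the table-of-queries idea above concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `saved_queries` table and the run_saved helper are hypothetical names. Note that this is not a stored procedure: the application fetches the query text and then sends it back for execution, so it costs two round trips and the database treats the text like any other ad hoc SQL.

```python
import sqlite3


def make_saved_query_db():
    """In-memory database holding data plus a table of named queries."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE names (first TEXT, last TEXT);
        INSERT INTO names VALUES ('Ada', 'Lovelace'), ('Alan', 'Turing');
        CREATE TABLE saved_queries (
            name  TEXT PRIMARY KEY,
            query TEXT NOT NULL
        );
        INSERT INTO saved_queries VALUES
            ('show_names', 'SELECT names.first, names.last FROM names');
    """)
    return conn


def run_saved(conn, name):
    # First trip: look up the SQL text by name.
    row = conn.execute(
        "SELECT query FROM saved_queries WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    # Second trip: execute the text we just fetched.
    return conn.execute(row[0]).fetchall()
```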
Along with MUG4N's great reasons on why to use stored procedures, here are three more:
Security
You can grant access to your application to execute stored procedures while denying direct table access.
Think defense in depth. If your app is cracked, then they will be limited to executing ONLY the procedures you have defined. This means things like 'drop table' would be explicitly disallowed, unless, of course, you have a procedure to do that.
Conversely, if your app is cracked and you allow the app full access to your SQL server, then one of two things will happen: either your data disappears, or the cracker easily gets a copy (or both).
Unit Testing.
It's much easier to unit test your queries if you can hit them directly without having to go through the application itself.
In Flight Changes:
If you need to modify a query AFTER you have published your site, it's much easier to just make a proc change than redeploy code that may have undergone other changes since the last deployment. For example, let's say you have a page that isn't performing all that well. After evaluation, you determine that just changing the joins on a query will fix this. Modify the proc and go.
Well, in my opinion you should definitely use stored procedures. And this is common practice!
Here are just two advantages of using stored procedures:
They will run in all environments, and there is no need to recreate the logic. Since they are on the database server, it makes no difference what application environment is used - the stored procedure remains consistent. If your setup involves different clients, different programming languages - the logic remains in one place. Web developers typically make less use of this feature, since the web server and database server are usually closely linked. However, in complex client-server setups, this is a big advantage. The clients are automatically always in sync with the procedure logic as soon as it's been updated.
They can reduce network traffic. Complex, repetitive tasks may require getting results, applying some logic to them, and using this to get more results. If this only has to be done on the database server, there is no need to send result sets and new queries back and forth from application server to database server. Network traffic is a common bottleneck causing performance issues, and stored procedures can help reduce this. More often though, it is the database server itself that is the bottleneck, so this may not be much of an advantage.
The idea certainly has its appeal, but the problem is that they are nearly impossible to scale. I have never seen a scalable solution to maintaining stored procs (especially in MySQL) that has not made me shudder.
Since it seems you're heading the PHP/MySQL route, I'll give a few examples of my experience with stored procs in MySQL:
They are generally far less readable and far more difficult to write than PHP.
They make debugging a nightmare
Trying to figure out why changing a value in table_1 triggers a change in table_2 (if you're even lucky enough to recognize that this happens) is much more difficult to determine by looking through dozens of stored procedures than it is to, say, look in the Model that handles changes to table_1.
To my knowledge there is no standardized & automated way to integrate stored procs / triggers / etc into any revision control system
A stored procedure is just one or more SQL statements that are "pre-compiled" and live inside the database. You call them to return one or more rows of data, or to update, insert, or delete data.
If you tell us what web framework and database you are using, we can give you actual examples of how to call a stored procedure, or at least point you to an article or two to get you going.
You could also consider using an ORM framework, such as Hibernate. This will allow you to get away from dealing with SQL code altogether. I am a .Net developer, so I'm not sure what is available to you on the PHP/MySQL platform, but I am sure there is a lot out there to choose from.
Think about it this way: when developing a commercial-grade tiered application, there are always people behind the database making it secure and reliable, other people behind the application logic, and others behind the web code, so you get the best of all of them working together.
Once the application has been designed, everyone starts on their own implementation. The database people give the others some kind of API that hides the SQL, so the developers don't have to think about it and can focus on their code. I have worked as a database developer and used some COM techniques to cope with expansion, modification, and reuse of the application logic. The database in these kinds of products is too important to leave in the wild, so security is a really serious issue.
But in most cases, web applications are made by web developers who tend to have no design time and make big changes at short notice, so they don't use stored procedures. Often they don't secure execution either, or they leave security to the application, which leaves the database unprotected and prone to attacks.
If you're doing everything yourself and changing your product often, you should avoid them, since it will be double work and most of the time useless. Once your logic stabilizes, you can start migrating your heavier queries to stored procedures.
As I said in a previous post, our Rails app has to interface with an E-A-V type of table in a third-party application that we're pulling data from. I had created a View to make the data normal but it is taking way too long to run. We had one of our offshore PHP developers create a stored procedure to help speed it up.
Now we run into the issue that we need to call this stored procedure from the Rails app, as well as provide searching and filtering. The view could do this because Rails was treating it as a traditional Rails model. How could I do this with the stored proc? Would we need to write custom searching and ordering (we were using Searchlogic)? Management is incapable of understanding the drawbacks of using a stored proc from Rails; all they say is that the current method is taking too long to load the data and needs to be fixed, but searching and filtering are critical functions.
EDIT I posted the code for this query here: Optimizing a strange MySQL Query. What is funny is that when I run this query in a GUI (Navicat) it runs in about 5 seconds, but on the web page it takes over a minute to run; the view is complicated for reasons I outline in the original post but I would think that MySQL optimizes and caches views like SQL Server does (or rather, how I read that SQL Server does) to improve performance.
You can call stored procedures from Rails, but you are going to lose most of the benefits of ActiveRecord, as the standard generated SQL will not work. You can use the native database connection and call it, but it's going to be a leaky abstraction. You may want to consider DataMapper.
Looking back at your last question, I would get the DBA to create a trigger to create a more relational structure from the data. The trigger would insert the EAV data into a table, which is the only way I know of to do materialized views in MySQL. This way you only pay a small incremental background cost on insert, and the application can run normally.
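A rough sketch of that trigger idea follows. It is written against Python's built-in sqlite3 module so it can run anywhere, and the `eav` and `people` tables and their attribute names are hypothetical. MySQL trigger syntax differs (for example, it would typically use INSERT ... ON DUPLICATE KEY UPDATE), so treat this as the shape of the approach rather than code for the third-party schema.

```python
import sqlite3


def make_eav_db():
    """In-memory database where a trigger flattens EAV rows on insert."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hypothetical EAV source table and flat target table.
        CREATE TABLE eav (entity_id INTEGER, attribute TEXT, value TEXT);
        CREATE TABLE people (
            entity_id INTEGER PRIMARY KEY,
            name      TEXT,
            email     TEXT
        );
        CREATE TRIGGER eav_to_people AFTER INSERT ON eav
        BEGIN
            -- Make sure a flat row exists for this entity.
            INSERT OR IGNORE INTO people (entity_id) VALUES (NEW.entity_id);
            -- Copy the value into the matching column, leave others alone.
            UPDATE people
               SET name  = CASE WHEN NEW.attribute = 'name'
                                THEN NEW.value ELSE name END,
                   email = CASE WHEN NEW.attribute = 'email'
                                THEN NEW.value ELSE email END
             WHERE entity_id = NEW.entity_id;
        END;
    """)
    return conn
```

The application can then treat `people` as an ordinary table, which is what lets an ORM model, searching, and ordering keep working.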
Anyway...
ActiveRecord::Base.connection.execute("call SP_name (#{param1}, #{param2}, ... )")
But there's an open ticket on Lighthouse indicating this approach may not work without changing some of the parameters to use the connection.