I am trying to improve my application's performance by combining multiple query calls into stored procedures. This will reduce network traffic and round trips, and also separate the data-processing logic from the application.
While I am at it, I would like to do this in the most efficient way.
As of now, I am planning to use PreparedStatement with the prepareCall method.
If there is a better way of doing it, please suggest it.
I will need to pass many IN params to the procedures and will also need OUT params back in the Java code.
As far as I know, there is no "better" way. You are already doing what is better for your application overall: using stored procedures instead of plain SQL. So you are ahead of the curve. If you are truly pressed for performance, you might also want to take a look at the database itself: caching, indexes, etc. On the application side, you can employ a cache layer such as memcached. But I guess these are not what you are looking for right now.
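For the IN and OUT parameters mentioned in the question, the usual JDBC route is Connection.prepareCall, which returns a CallableStatement (a subtype of PreparedStatement). Here is a minimal sketch, assuming a hypothetical procedure get_order_total(IN customer_id INT, OUT total DECIMAL); the procedure, its parameters, and the class name are illustrative and do not come from the question:

```java
import java.math.BigDecimal;
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Types;
import java.util.Collections;

class StoredProcCall {

    // Builds the JDBC escape syntax for a call, e.g. "{call p(?, ?)}".
    static String buildCall(String procedure, int paramCount) {
        String placeholders = String.join(", ", Collections.nCopies(paramCount, "?"));
        return "{call " + procedure + "(" + placeholders + ")}";
    }

    // Assumed (hypothetical) procedure:
    //   get_order_total(IN customer_id INT, OUT total DECIMAL(10,2))
    static BigDecimal orderTotal(Connection conn, int customerId) throws SQLException {
        try (CallableStatement cs = conn.prepareCall(buildCall("get_order_total", 2))) {
            cs.setInt(1, customerId);                   // bind the IN parameter
            cs.registerOutParameter(2, Types.DECIMAL);  // declare the OUT parameter
            cs.execute();                               // single round trip to the server
            return cs.getBigDecimal(2);                 // read the OUT value
        }
    }
}
```

With many IN parameters the pattern stays the same: one setter per IN placeholder and one registerOutParameter per OUT placeholder, all before execute().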
While your question can be answered quite easily by guiding you towards some tutorials, e.g. from:
JDBC and stored procedures
jOOQ and stored procedures
Spring and stored procedures
... I would still like to point out that there are a great number of caveats to using stored procedures in MySQL. Please consider Bill Karwin's answer to this Quora question:
http://www.quora.com/MySQL/What-are-the-reasons-not-to-use-or-not-use-stored-procedures
For instance, the first issue:
MySQL stored procedures are compiled the first time a session uses them - but the compiled version is discarded at the end of the session. Unlike stored procedures in Oracle or other RDBMS brands, which keep the compiled version persistently. This means that MySQL adds a lot of overhead to procedures, especially if your pattern is to call a procedure only once per session.
I'm porting a reporting application from .net / MSSQL to php / MySQL and could use some advice from the MySQL experts out there.
I haven't done anything with MySQL for a few years, and when I was last using it stored procedures were brand new and I was advised to stay away from them because of that newness.
So, now that it's 2011, I was wondering if there's anything inherently "bad" about using them, as they worked so well for this app in MSSQL. I know it will depend on the needs of my app, so here are the high level points: (This will run on Linux if that matters)
The app generates a very complex report, however it is NOT a high concurrency app, typically 1-2 users at a time, 5 concurrent would shock me. I can even throttle it to prevent more than 2 or so users from using it simultaneously, so a lot of concurrent users is not going to be a concern.
Virtually 100% of the heavy lifting in this app is in the MSSQL stored procedure. The data is uploaded via the web front end, the stored procedure then takes it from there, and eventually spits out a csv / excel file for the user a few minutes later.
This works great using an MSSQL stored procedure. However, it's a good 2000 lines of SQL code and I'm hesitant to submit the SQL statements one at a time via PHP as opposed to using a stored procedure. Most importantly, it works fine with the current architecture; I'm not looking to change it unless I have to in order to accommodate MySQL / PHP.
Any gotchas in using a MySQL stored procedure? Are they buggier than submitting SQL statements or anything odd like that?
Thanks in advance for everyone's thoughts on this.
Stored procedures in MySQL are quite verbose in syntax, and are hard to debug or profile. Personally I think they are very useful in some cases, but I would be very hesitant to try to maintain a 2000+ line stored procedure in MySQL.
Are they useful for anything outside of a database administrator? If I understand them correctly it's merely queries that can be saved directly into MySQL, so it'd be useless for any web development team to use them.
Stored procedures are code that runs on the database server.
They have a number of uses. Think: If I could run code directly on the database server, what could I use that for?
Among their many uses, stored procedures can be used to shift some of the processing load to the database server, to reduce network traffic, and to improve security.
http://en.wikipedia.org/wiki/Stored_procedure
Here are two good, simple advantages not covered in the other answers:
Security - parameterized stored procedures are safer than concatenating strings for SQL (Google "SQL injection" for about a million documents on this). However, parameterized queries are also good for this if your language supports them.
Simplification of Maintenance - It's a heck of a lot easier to update a stored procedure than to recompile code and re-deploy. In my 15 years of development I've learned this the hard way. If there's a chance the query might change, put it in a stored proc. It's SOOO much easier to work with than having to recompile and redeploy code.
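To illustrate the security point, here is a small sketch contrasting string concatenation with a bound parameter in JDBC. The names table and its first/last columns are assumptions made for the example only:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

class InjectionDemo {

    // Unsafe: user input becomes part of the SQL text itself.
    static String concatenated(String lastName) {
        return "SELECT first, last FROM names WHERE last = '" + lastName + "'";
    }

    // Safer: the SQL text is fixed and the value is bound separately.
    static final String PARAMETERIZED = "SELECT first, last FROM names WHERE last = ?";

    static boolean exists(Connection conn, String lastName) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(PARAMETERIZED)) {
            ps.setString(1, lastName);            // sent as data, never parsed as SQL
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next();
            }
        }
    }
}
```

Passing the input x' OR '1'='1 to concatenated() yields a WHERE clause that matches every row, whereas the bound version only ever compares the column against that literal string. A stored procedure with typed parameters gives the same protection, as long as it does not build dynamic SQL from its arguments.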
Added
They also reduce network chatter. If you have a lot of complex queries to run, you can have them all in one stored procedure, and your app only needs to make one call to do all the work.
Also, in most platforms, stored procedures have performance benefits. In SQL Server, for example, the Database Engine optimizes and saves the execution plan to speed things up.
These links also answer your question:
http://blog.sqlauthority.com/2007/04/13/sql-server-stored-procedures-advantages-and-best-advantage/
http://searchsqlserver.techtarget.com/news/1052737/Why-use-stored-procedures
And I can't take credit for this answer, but I think this quote is a good point, even though I consider myself to be pretty skilled on both sides of the equation - there is something to be said for specialized knowledge.
Advantage 4: Stored procedures are usually written by database developers/administrators. Persons holding these roles are usually more experienced in writing efficient queries and SQL statements. This frees the GUI application developers to utilize their skills on the functional and graphical presentation pieces of the application. If you have your people performing the tasks to which they are best suited, then you will ultimately produce a better overall application.
Think of a stored procedure as a library function. Do you want to rewrite sqrt (for example) every time you have to compute a square root, or would you rather call a function to do it?
That's the benefit (in a nutshell) of stored procedures.
Stored procedures have lots of benefits. Among other things they help decouple application code from the database tables, simplify database maintenance and versioning and help take the best advantage of DBMS features such as query parameterisation, optimisation and security.
it'd be useless for any web development team to use them
Huh? Stored procedures are extremely useful for any developers who need to use a database that supports them.
Stored Procedures can do far more than just query the database. They can contain any T-SQL statement. So you could use them to perform business logic, execute queries, do backups etc.
Many companies have a policy that all database activity is to be done via stored procedures. So, in no way would I say that a web development team would have no use for them. They might make great use of them.
On the other hand, in our company, we're not using them much at all for our next generation manufacturing applications. We're using an ORM (Linq-To-SQL) instead, and have very little use for stored procedures at this point. I suspect, though, that we'll still use them somewhat, in order to avoid several trips back and forth to the server. Some things are just more efficient if done in a stored procedure, if you're already doing work on the server anyway.
Back in the 90's, stored procedures were the safest way to prevent anyone from accessing the data tables directly.
First, they were a way to counter security issues. Second, they were meant to make the data easier to work with, as there were no ORM tools like today's.
These days, they are meant mainly for complex transactions. By complex, I mean something that cannot be solved with the simple CRUD operations that NHibernate or Entity Framework can do.
So, when a stored procedure only performs a SELECT, an INSERT, an UPDATE or a DELETE, you may be right that it is now somewhat useless, as you can perform these basic repeated operations through an ORM tool. However, when you have to build a report, for instance, that requires disparate data and further calculations, and the result is pretty complex to compute, then you had better let the database engine work it out, as it is designed to compute such data.
In addition to what others have said about security, encapsulation, performance, etc. of stored procedures, I'd like to add that the usefulness of stored procedures increases with the richness of the stored procedure language.
I don't have much experience with MySQL, but as far as I know the stored procedure language is pretty limited.
T-SQL (in Microsoft SQL Server) is more capable, but has several shortcomings compared to full-featured programming languages. For example, it is not possible to declare a constant value in T-SQL, and until quite recently there was no exception handling, so error handling was a pain. There is no concept of a package, so all your code will be stand-alone procedures with no way to group them together except for a good naming convention. (Although it's true that you can write stored procedures in .NET languages.)
On the other hand, PL/SQL (in Oracle), is a full-featured programming language with complex data types, exception handling, packages for grouping procedures (with separate public and private sections), object types, and lots and lots of built-in packages that deal with everything from file access to compression and generating web pages. All that, plus seamless integration with the database and the SQL language. Entire applications can be built using PL/SQL, without "leaving the database", so to speak. Check out http://apex.oracle.com for an example of a massive (framework) application implemented in pure PL/SQL.
Let's say you insert two rows in different tables, the second insert requires the id from the first.
$sql = "INSERT INTO t1 (f1,f2...) VALUES (v1, v2...)";
mysql_query($sql, $conn);
$id = mysql_insert_id();
$sql2 = "INSERT INTO t2 (f1,f2,id,f3...) VALUES (v1,v2,$id,v3....)";
mysql_query($sql2, $conn);
You went to the database twice: two server request/response round trips. If you store the whole process (INSERT, id = insert id, INSERT) on the server in my_proc, you only have to go there once.
$sql = "CALL my_proc(arguments)";
mysql_query($sql);
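What my_proc could look like on the server, and how a client calls it once, can be sketched as follows. This version uses JDBC rather than PHP, and the column lists, parameter types, and class name are hypothetical. It also assumes t1 has an AUTO_INCREMENT id so that LAST_INSERT_ID() returns the new key:

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

class TwoInsertsOneCall {

    // Hypothetical MySQL procedure: both inserts run server-side, and
    // LAST_INSERT_ID() passes the new t1 id to the second insert.
    static final String CREATE_PROC =
        "CREATE PROCEDURE my_proc(IN p_v1 VARCHAR(50), IN p_v2 VARCHAR(50)) "
      + "BEGIN "
      + "  INSERT INTO t1 (f1) VALUES (p_v1); "
      + "  INSERT INTO t2 (f1, id) VALUES (p_v2, LAST_INSERT_ID()); "
      + "END";

    // One-time setup. The body is sent as a single statement, so the
    // DELIMITER command of the mysql command-line client is not needed here.
    static void install(Connection conn) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.execute(CREATE_PROC);
        }
    }

    // The application then makes one call (one round trip) per pair of inserts.
    static void insertPair(Connection conn, String v1, String v2) throws SQLException {
        try (CallableStatement cs = conn.prepareCall("{call my_proc(?, ?)}")) {
            cs.setString(1, v1);
            cs.setString(2, v2);
            cs.execute();
        }
    }
}
```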
Think of them as procedures in any regular program. A way of encapsulating a chunk of logic under a single invocation method.
If you consider functions in programs useless, then this conversation ends here. If you think they're useful, then there's nowhere else to go either. Nothing forces you to use them, but they're available if you should choose to.
If I understand them correctly it's merely queries that can be saved directly into MySQL, so it'd be useless for any web development team to use them.
Even with that (limited) definition, you're implying that web development teams never have any need of querying a database? Really?
A well written set of stored procs can completely remove queries from your client application and replace all of that with calls to the procedures. Now, I'm not saying that that's the only way of doing things, or even the right way. There's quite the discussion about that going on overall. But it is a very valid way of using them.
So, useless for a webdev team?
In addition to all the answers given here, I would also like to point out that stored procedures are a way to save the execution plan of a query.
You may have a set of SQL statements that you just call from your application, but each time you execute the query, SQL Server has no way of knowing that the query you just invoked is exactly the same as the one you called a few minutes ago (which happens very frequently in a web application). So SQL Server has to repeat all the processing again (build the query plan and execute it).
Now if the same query had been encapsulated within a stored procedure, SQL Server would have saved the execution plan for that stored procedure, so that each time you call the sproc it doesn't have to recompile the execution plan. (It may even cache the data based on the parameters passed to the sproc, but I don't know exactly how this works.)
It is easier to performance tune code in stored procs than what most ORMs create. It is easier to use stored procs when there are multiple applications that access the same database and might need to do the same things. It is far easier to refactor databases when all code is in stored procs because you can easily see where the changes need to be made. It is easier to use stored procs for things that don't normally hook up to ORMs like SSIS or reporting applications. Using stored procs you can limit access to only what the proc does and not allow access directly to the tables or views. This is critical in enforcing internal controls on financial data for instance and helps prevent fraud.
I've written complex procs that were well over 1000 lines long. Try getting an ORM to write that kind of SQL. Then try to get it to run without timing out.
Is it considered crazy to store common SQL queries for my web app in a database for use in execution? Or is that common practice? Or is it impossible?
My thinking is, this way, I avoid hard-coding SQL into my application files, and add another level of abstraction.
Is this crazy? Is this what a stored procedure is? Or is that something else?
EDIT: The below answers are useful as a background for 'stored procedures', but didn't answer my core question: Is a 'stored procedure' just when I have a database table that contains queries that can be called? ie, something like this
INDEX | NAME | QUERY
1 | show_names | "SELECT names.first, names.last FROM names;"
2 | show_5_cities | "SELECT cities.city FROM cities LIMIT 0,5;"
etc.
Or is there a more complicated mechanism that encompasses the concept of stored procedures? Is my example an actual example of something people do?
Along with MUG4N's great reasons on why to use stored procedures, here are three more:
Security
You can grant access to your application to execute stored procedures while denying direct table access.
Think defense in depth. If your app is cracked, then they will be limited to executing ONLY the procedures you have defined. This means things like 'drop table' would be explicitly disallowed, unless, of course, you have a procedure to do that.
Conversely, if your app is cracked and you allow the app full access to your SQL server, then one of two things will happen: your data disappears and/or the cracker easily gets a copy.
Unit Testing.
It's much easier to unit test your queries if you can hit them directly without having to go through the application itself.
In Flight Changes:
If you need to modify a query AFTER you have published your site, it's much easier to just make a proc change than redeploy code that may have undergone other changes since the last deployment. For example, let's say you have a page that isn't performing all that well. After evaluation, you determine that just changing the joins on a query will fix this. Modify the proc and go.
Well, in my opinion you should definitely use stored procedures. And this is common practice!
Here are just two advantages of using stored procedures:
They will run in all environments, and there is no need to recreate the logic. Since they are on the database server, it makes no difference what application environment is used - the stored procedure remains consistent. If your setup involves different clients or different programming languages, the logic remains in one place. Web developers typically make less use of this feature, since the web server and database server are usually closely linked. However, in complex client-server setups, this is a big advantage. The clients are automatically in sync with the procedure logic as soon as it's been updated.
They can reduce network traffic. Complex, repetitive tasks may require getting results, applying some logic to them, and using this to get more results. If this only has to be done on the database server, there is no need to send result sets and new queries back and forth from application server to database server. Network traffic is a common bottleneck causing performance issues, and stored procedures can help reduce this. More often though, it is the database server itself that is the bottleneck, so this may not be much of an advantage.
The idea certainly has its appeal, but the problem is that they are nearly impossible to scale. I have never seen a scalable solution to maintaining stored procs (especially in MySQL) that has not made me shudder.
Since it seems you're heading the PHP/MySQL route, I'll give a few examples of my experience with stored procs in MySQL:
They are generally far less readable and far more difficult to write than PHP.
They make debugging a nightmare
Trying to figure out why changing a value in table_1 triggers a change in table_2 (if you're even lucky enough to recognize that this happens) is much more difficult to determine by looking through dozens of stored procedures than it is to, say, look in the Model that handles changes to table_1.
To my knowledge there is no standardized & automated way to integrate stored procs / triggers / etc into any revision control system
A stored procedure is just one or more SQL statements that are "pre-compiled" and live inside the database. You call them to return one or more rows of data, or to update, insert, or delete data.
If you tell us what web framework and database you are using, we can give you actual examples of how to call a stored procedure, or at least point you to an article or two to get you going.
You could also consider using an ORM framework, such as Hibernate. This will allow you to get away from dealing with SQL code altogether. I am a .Net developer, so I'm not sure what is available to you on the PHP/MySQL platform, but I am sure there is a lot out there to choose from.
Think about it this way: when developing a commercial-grade tiered application, there are always people behind the database making it secure and reliable, other people behind the application logic, and others behind the web code, so you get the best of everyone working together.
Once the application has been designed, everyone starts on their own implementation. The database people give the others some kind of API that hides the SQL, so the developers don't have to think about it and can focus on their code. I have worked as a database developer and used some COM techniques to handle expansion, modification, and reuse of the application logic. In this kind of product the database is too important to leave in the wild, so security is a really serious issue.
But in most cases, web applications are made by web developers who tend to have no design time and make big changes on short notice, so they don't use stored procedures. Often they don't even secure execution, or they leave security to the application, leaving the database unprotected and prone to attacks.
If you're doing everything yourself and changing your product often, you should avoid them, since it will be double work and most of the time useless. Once your logic stabilizes, you can start migrating your heavier queries to stored procedures.
I'm working on a Java based project that has a client program which needs to connect to a MySQL database on a remote server. This was implemented as follows:
Use JDBC to write the SQL queries to be executed which are then hosted as a servlet using Apache Tomcat and made accessible via XML-RPC. The client code uses XML-RPC to remotely execute these JDBC based functions. This allows us to keep our MySQL database non-public, restricts use to the pre-defined functions, and allows Tomcat to manage the database transactions (which I've been told is better than letting MySQL do it alone, but I really don't understand why). However, this approach requires a lot of boiler-plate code, and Tomcat is a huge memory hog on our server.
I'm looking for a better way to do this. One way I'm considering is to make the MySQL database publicly accessible, re-writing the JDBC based code as stored procedures, and restricting public use to these procedures only. The problem I see with this is that translating all the JDBC code to stored procedures will be difficult and time consuming. I'm also not too familiar with MySQL's permissions. Can one grant access to a stored procedure which performs select statements on a table, but also deny arbitrary select statements on that same table?
Any other ideas are welcome, as are thoughts and/or suggestions on the stored procedure solution.
Thank you!
You can probably get the RAM upgraded in your server for less than the cost of even a few days development time, so don't write any code if that's all you're getting from the exercise. Also, just because the memory is used inside of tomcat, it doesn't mean that tomcat itself is using it. The memory could be used up by data or by technical flaws in your code.
If you've tried additional RAM and it is being eaten up, then that smells like a coding issue, so I'd suggest using a profiler, or log data to try and work out what the root cause is before changing anything. If the cause is large data sets then using the database directly will only delay the inevitable, instead you'd need to look at things like paging, summarisation, client side caching, or redesigning clients to reduce the use of expensive queries. Using a profiler, or simply reviewing the code base, will also tell you if something is creating too many objects (especially strings, or XML nodes) or leaking memory.
Boilerplate code can be avoided by refactoring creatively, and it's good that you want to avoid repetition. It's unclear how much structure you might already have, but with a little work it's easy to centralise boilerplate JDBC calls. There is no fundamental reason JDBC code should be repeated; perhaps you could tell us what code is being repeated?
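One way the centralising could look, as a rough sketch only: a single helper owns the prepare/bind/execute/close sequence and callers supply just the SQL, the parameters, and a row mapper. The class and interface names here are made up for the example and are not taken from the poster's code:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class JdbcHelper {

    // Converts the current row of a ResultSet into a domain value.
    @FunctionalInterface
    interface RowMapper<T> {
        T map(ResultSet rs) throws SQLException;
    }

    // Owns the prepare/bind/execute/close boilerplate in one place.
    static <T> List<T> query(Connection conn, String sql, RowMapper<T> mapper, Object... params)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (int i = 0; i < params.length; i++) {
                ps.setObject(i + 1, params[i]);   // JDBC parameters are 1-based
            }
            try (ResultSet rs = ps.executeQuery()) {
                List<T> rows = new ArrayList<>();
                while (rs.next()) {
                    rows.add(mapper.map(rs));
                }
                return rows;
            }
        }
    }
}
```

A call site then shrinks to one expression, for example JdbcHelper.query(conn, "SELECT city FROM cities WHERE country = ?", rs -> rs.getString("city"), "NZ"), where the table and column names are again only placeholders.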
Finally, I'll venture that there are many good reasons to put a web tier over your database. Flexibility (of deployment), compatibility, control (over the SQL) and security are all good reasons to keep the web tier.
MySQL 5.0.3+ does have an execute privilege that you can set (without setting select privileges) that should allow you to get the functionality you seek.
However, note this mysql bug report with JDBC (well and a lot of other drivers).
When calling the [procedure] with JDBC, I get "java.sql.SQLException: Driver requires
declaration of procedure to either contain a '\nbegin' or '\n' to follow argument
declaration, or SELECT privilege on mysql.proc to parse column types."
the workaround is:
See "noAccessToProcedureBodies" in Connector/J 5.0.3 for a somewhat hackish, non-JDBC compliant
workaround.
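Putting the two pieces together, a sketch of what the setup might look like: an administrator grants EXECUTE on a single procedure without granting SELECT on the tables, and the client connects with the noAccessToProcedureBodies property set on the JDBC URL for MySQL's driver (Connector/J). The schema, procedure, account, and host names below are invented for illustration. The approach relies on MySQL procedures running with their definer's privileges, which is the default (SQL SECURITY DEFINER):

```java
class ExecuteOnlyAccess {

    // SQL an administrator could run to give an account EXECUTE on one procedure
    // without granting SELECT on the underlying tables.
    // Identifiers are concatenated as-is, so they must come from trusted configuration.
    static String grantExecute(String schema, String procedure, String user, String host) {
        return "GRANT EXECUTE ON PROCEDURE " + schema + "." + procedure
             + " TO '" + user + "'@'" + host + "'";
    }

    // JDBC URL carrying the driver property named in the bug report workaround.
    // Port 3306 is MySQL's default and is an assumption about the deployment.
    static String jdbcUrl(String host, String schema) {
        return "jdbc:mysql://" + host + ":3306/" + schema + "?noAccessToProcedureBodies=true";
    }
}
```

Because the driver cannot read the procedure definition in this mode, it cannot check parameter types for you, so it is worth testing each call's IN and OUT handling against a real server.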
I am sure you could implement your solution without much boilerplate, especially using something like Spring's remoting. Also, how much memory is Tomcat eating? I frankly believe that if it's just doing what you are describing, it could work in less than 128 MB (conservative guess).
Your alternative is the "correct by the book" way of solving the problem. I say build a prototype and see how it works. The major problems you could have are:
MySQL having some important gotcha in this regard
MySQL's Stored Procedure support being too primitive and forcing you to do a lot of work
Some other strange hiccup
I'm probably one of those MySQL haters, so the situation might be better than I think.