MySQL - CREATE FUNCTION - MODIFIES SQL DATA?

I am about to write a CREATE FUNCTION in MySQL, and I am wondering whether CREATE TEMPORARY TABLE counts toward the MODIFIES SQL DATA characteristic.
The function does not modify any permanent table, only a temporary table that it creates for optimization purposes.
Should I declare MODIFIES SQL DATA or only READS SQL DATA?
What is the real benefit of declaring MODIFIES SQL DATA or READS SQL DATA anyway?

As of now (MySQL 5.5), these characteristics serve only as in-code documentation.
From http://dev.mysql.com/doc/refman/5.5/en/create-procedure.html:
Several characteristics provide information about the nature of data
use by the routine. In MySQL, these characteristics are advisory only.
The server does not use them to constrain what kinds of statements a
routine will be permitted to execute.
This is in contrast with the [NOT] DETERMINISTIC characteristic, which serves as a hint to the optimizer about whether the results of the function can be cached.
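As a minimal sketch of where these characteristics sit in a definition, a function that only reads data might be declared as follows. The function name count_orders and the orders table are hypothetical and not part of the question. Per the documentation quoted above, writing MODIFIES SQL DATA instead of READS SQL DATA would not change what the body is permitted to do.

```sql
-- Hypothetical example: count_orders and the orders table are assumptions.
DELIMITER //

CREATE FUNCTION count_orders(p_customer_id INT)
RETURNS INT
NOT DETERMINISTIC   -- the result depends on table contents, not only on the argument
READS SQL DATA      -- advisory: the body reads data but does not write it
BEGIN
  DECLARE v_count INT;

  SELECT COUNT(*) INTO v_count
  FROM orders
  WHERE customer_id = p_customer_id;

  RETURN v_count;
END//

DELIMITER ;
```

The DELIMITER lines are only needed in the mysql command-line client, so that the semicolons inside the body do not end the CREATE FUNCTION statement early.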

Related

What is the advantage to declaring READS SQL DATA in a MySQL function?

READS SQL DATA means the routine contains statements that read data (for example, SELECT), but not statements that write data.
I understand what it means, but what is the advantage to the user? Does the function execute faster if declared?
I also use "DETERMINISTIC" with "READS SQL DATA" as "DETERMINISTIC" is for caching purposes. I can only assume that after the function is called, if you declare "READS SQL DATA" it does not update the buffer pool regarding the results of the function if called again.

Index creation in Data Generator

I'm generating a script from an existing MySQL schema using DataGrip's SQL Generator feature. I get a working script containing CREATE INDEX statements. I would prefer the indexes to be created by a KEY clause in the CREATE TABLE statement. I can't see an option in SQL Generator for that. Am I missing something? I have dozens of tables, so I can't just do it by hand.
The server is MySQL 5.7.
You can use SQL Generator | Generate: Definitions provided by RDBMS server to get the same result.
I found a solution that uses not the SQL Generator, which doesn't seem to be able to do what I want, but a raw export of the database structure. I select the schema (you can select various and multiple objects: schemas, tables, triggers, procedures, functions) and, on right-click, choose SQL Scripts -> Request and Copy original DDL, which copies the resulting script extracted from the database. You can then paste it wherever you want, for example into a SQL console or a text editor.

In MySQL, how can I have a stored procedure query the tables from the calling database instead of the database where the stored procedure is defined

For the sake of simplicity, let's say I have two databases with data, db_data_1 and db_data_2, which have the same set of tables, and a third database where my stored procedures are defined, say db_sp. Let's say my stored procedure is called my_sp.
When I try to call db_sp.my_sp from either db_data_1 or db_data_2, I get an error saying that the tables referenced in db_sp.my_sp don't exist.
How can I have db_sp.my_sp query the tables in the calling database instead of the database where my_sp is defined (namely db_sp)?
Thanks.
You must qualify the table names in your query with the database name in the stored procedure. SELECT col FROM db_data_1.tbl instead of SELECT col FROM tbl, for example.
The documentation says this:
USE statements within stored routines are not permitted. When a routine is invoked, an implicit USE db_name is performed (and undone when the routine terminates). This causes the routine to have the given default database while it executes. References to objects in databases other than the routine default database should be qualified with the appropriate database name.
Why is this so? It seems like a big pain in the xxx neck.
A big use of stored code is the hiding of data from unprivileged users. You can GRANT MySQL users access to stored procedures without granting access to the underlying tables. This restriction ties the tables to the procedures.
A user who has privileges only in the test database shouldn't be able to do this sort of thing.
USE production;
CALL test.get_all_user_private_data();
And, if you're USEing one database and you run stored code that's in a second database, it gets the data from that second database.
Your solution is to consider your stored code (procedures, functions) to be part of the schema definition for each database. They go along with your other data definition operations like CREATE TABLE. Don't try to put them in their own "code library" database, but put them in each database where they're needed.
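The two approaches described in this answer can be sketched as follows. This is an illustration only: col and tbl are the placeholder names used above, and the procedure bodies are assumptions rather than the asker's actual code.

```sql
DELIMITER //

-- Option A: keep the procedure in db_sp, but qualify every table name.
-- The procedure is then tied to db_data_1 no matter which database the caller is using.
CREATE PROCEDURE db_sp.my_sp()
BEGIN
  SELECT col FROM db_data_1.tbl;
END//

-- Option B: treat the procedure as part of each data schema.
-- Unqualified names resolve against the database the routine belongs to.
CREATE PROCEDURE db_data_1.my_sp()
BEGIN
  SELECT col FROM tbl;   -- reads db_data_1.tbl
END//

CREATE PROCEDURE db_data_2.my_sp()
BEGIN
  SELECT col FROM tbl;   -- reads db_data_2.tbl
END//

DELIMITER ;
```

With option B, a session that has run USE db_data_2 can simply CALL my_sp() and get the data from db_data_2.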

No columns returned SSIS

I am implementing an SSIS package and am currently trying to do the following:
1. Truncate the destination table.
2. Fetch the data by executing the stored procedure and insert it into the destination table.
I have created an Execute SQL task for step 1 and a data flow with an OLE DB source and OLE DB destination for step 2. It has been working so far, but it fails for one of my stored procedures that uses temp tables.
When I edit the OLE DB source and click the Preview button, I get the error "no column returned".
I know that SSIS has trouble generating columns when executing stored procedures that depend on temp tables. I converted the stored procedure to use table variables, and it now returns columns in SSIS when I do a preview. The only downside is that the stored procedure takes longer to execute: 1 hour 15 minutes, compared to 15 minutes with temp tables.
I did see a suggestion to use SET FMTONLY before executing the stored procedure, as an alternative to changing to table variables, but that didn't seem to work: I get a syntax or permission denied error.
Could somebody suggest a solution that does not compromise performance?
Sounds like you've already read all the approaches to using Temp tables in SSIS, including the IF 1=0... trick? If you haven't seen that one yet, google it.
You say that using Table Variables causes your stored procedure to take about 5 times longer than using Temp Tables. The most likely reason for that is that you are indexing your temp tables but not your table variables. If you didn't know that table variables can be indexed, they can. You might try that.
Finally, a solution that you haven't mentioned is that you can replace your temporary table with a real table that gets truncated when you're done using it.
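As a rough sketch of the table-variable indexing mentioned above: constraints declared inside the table variable definition create the indexes. The table variable name and columns here are hypothetical.

```sql
-- Hypothetical staging structure; only the constraint syntax matters here.
DECLARE @Staging TABLE (
    OrderId    INT      NOT NULL PRIMARY KEY CLUSTERED,   -- creates a clustered index
    CustomerId INT      NOT NULL,
    OrderDate  DATETIME NOT NULL,
    -- A UNIQUE constraint creates a nonclustered index.
    -- Appending OrderId keeps the key unique even when CustomerId repeats.
    UNIQUE NONCLUSTERED (CustomerId, OrderId)
);

INSERT INTO @Staging (OrderId, CustomerId, OrderDate)
VALUES (1, 10, '2020-01-01'),
       (2, 10, '2020-01-02');

SELECT OrderId, OrderDate
FROM @Staging
WHERE CustomerId = 10;
```

SQL Server 2014 and later also accept an inline INDEX clause in the table variable definition, which removes the need for the uniqueness workaround.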
Short comment:
Try EXEC WITH RESULT SETS and specify the metadata yourself for a proc with temp tables; or use the Script Component as a source and specify the Output columns yourself.
Long comment:
Technically speaking, it is the driver/database you are using in SSIS that would decide the behavior when working with temp tables.
Metadata is an important factor when using SSIS's pipeline components. By metadata, I mean the names of the columns, their data types etc that a pipeline component uses. When designing a data flow, someone/something should provide this metadata to the components that require it.
In most cases, SSIS automatically retrieves the metadata. Components that do not connect to an external data source, like Conditional Split, get their metadata from the other components they are connected to. For the pipeline components that connect to an external data source (like OLE DB source, OLE DB destination, Lookup, etc.), SSIS provides a mechanism to get this metadata without human involvement. This mechanism involves the driver connecting to the database and retrieving the metadata of the output. If the driver/database is capable of returning the metadata, then that metadata is used. If the driver/database is incapable, then you get the errors you are seeing. The rest of my comments are based on the assumption that you are using a SQL Server database in your question.
When working with a SQL Server database in SSIS, we typically use the native client drivers provided by Microsoft. These drivers try to get the metadata without actually executing the SQL statement (actual execution can have side effects and might take seconds, minutes, or hours, and you don't want side effects or long wait times during package design). So to get the metadata, the driver relies on the metadata of the actual objects used in the SQL command. If the command uses a physical table or view, SQL Server already has the metadata available and can supply it to the driver. If it is a temp table, SQL Server does not have the metadata until it can create the temp table. With the FMTONLY option, you can arrange for the temp tables to be created while avoiding any heavy processing or side effects, and thus retrieve metadata without penalties. From 2012 on, these native client drivers rely on newer functionality to retrieve metadata than the earlier drivers did: the driver uses the sp_describe_first_result_set procedure. So whether you can get metadata or not is determined by what sp_describe_first_result_set is able to do.
So while SSIS can automatically get the metadata (because of the driver/database), it does not automatically get the metadata in some cases (again because of the driver/database). In cases involving the second scenario, some other process (typically a human) can help the driver infer metadata or provide the metadata to the component directly.
To help the driver, in the case of SQL Server 2012 and later, you can use the WITH RESULT SETS clause to specify the output metadata. When this clause is present, the driver uses it and doesn't try to query the metadata from system objects, which avoids the error you would otherwise get. If you are using the drivers that came with SQL Server 2008, you can use FMTONLY. This option is at the driver/database level.
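A minimal sketch of the WITH RESULT SETS approach, usable as the SQL command of the source on SQL Server 2012 or later. The procedure name and the column list are hypothetical; the declared columns have to match what the procedure actually returns in number and be convertible in type.

```sql
-- Hypothetical procedure and columns: replace with the real result shape.
EXEC dbo.usp_GetCustomerTotals
WITH RESULT SETS
((
    CustomerId   INT            NOT NULL,
    CustomerName NVARCHAR(100)  NULL,
    TotalAmount  DECIMAL(18, 2) NULL
));
```

Because the metadata is stated in the command itself, the procedure body can keep using temp tables.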
Another option could be to use a Script Component as the Source and in the Output columns, you can specify the columns/metadata. SSIS would not try to retrieve metadata from the datasource in this case, but would rely on the definitions you provided in the Output section of the Script Component.
As you can see, both options involve a human (or some other process) specifying the metadata instead of SSIS trying to retrieve it automatically. I would prefer the first option when working with SQL Server and the second option when working with databases like MySQL.

Difference between stored procedure and function in SQL Server [duplicate]

When should I use a function rather than a stored procedure in SQL, and vice versa? What is the purpose of each?
Functions are computed values and cannot perform permanent environmental changes to SQL Server (i.e., no INSERT or UPDATE statements allowed).
A function can be used inline in SQL statements if it returns a scalar value or can be joined upon if it returns a result set.
A point worth noting from the comments, which summarizes the answer. Thanks to @Sean K Anderson:
Functions follow the computer-science definition in that they MUST return a value and cannot alter the data they receive as parameters (the arguments). Functions are not allowed to change anything, must have at least one parameter, and they must return a value. Stored procs do not have to have a parameter, can change database objects, and do not have to return a value.
Here's a table summarizing the differences:
| | Stored Procedure | Function |
|---|---|---|
| Returns | Zero or more values | A single value (which may be a scalar or a table) |
| Can use transaction? | Yes | No |
| Can output to parameters? | Yes | No |
| Can call each other? | Can call a function | Cannot call a stored procedure |
| Usable in SELECT, WHERE and HAVING statements? | No | Yes |
| Supports exception handling (via try/catch)? | Yes | No |
Functions and stored procedures serve separate purposes. Although it's not the best analogy, functions can be viewed literally as any other function you'd use in any programming language, but stored procs are more like individual programs or a batch script.
Functions normally have an output and optionally inputs. The output can then be used as the input to another function (a SQL Server built-in such as DATEDIFF, LEN, etc) or as a predicate to a SQL Query - e.g., SELECT a, b, dbo.MyFunction(c) FROM table or SELECT a, b, c FROM table WHERE a = dbo.MyFunc(c).
Stored procs are used to bind SQL queries together in a transaction, and interface with the outside world. Frameworks such as ADO.NET, etc. can't call a function directly, but they can call a stored proc directly.
Functions do have a hidden danger, though: they can be misused and cause rather nasty performance issues. Consider this query:
SELECT * FROM dbo.MyTable WHERE col1 = dbo.MyFunction(col2)
Where MyFunction is declared as:
CREATE FUNCTION MyFunction (@someValue INTEGER) RETURNS INTEGER
AS
BEGIN
    DECLARE @retval INTEGER
    -- Assign the looked-up value to the variable that is returned
    SELECT @retval = localValue
    FROM dbo.localToNationalMapTable
    WHERE nationalValue = @someValue
    RETURN @retval
END
What happens here is that the function MyFunction is called for every row in the table MyTable. If MyTable has 1000 rows, then that's another 1000 ad-hoc queries against the database. Similarly, if the function is called when specified in the column spec, then the function will be called for each row returned by the SELECT.
So you do need to be careful writing functions. If you do SELECT from a table in a function, you need to ask yourself whether it can be better performed with a JOIN in the parent stored proc or some other SQL construct (such as CASE ... WHEN ... ELSE ... END).
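As a sketch of the JOIN alternative suggested above, the lookup done by MyFunction can be folded into the query itself. This assumes nationalValue is unique in dbo.localToNationalMapTable, which the original does not state; with duplicates, the join could return extra rows.

```sql
-- Set-based equivalent of: SELECT * FROM dbo.MyTable WHERE col1 = dbo.MyFunction(col2)
-- Assumption: at most one mapping row exists per nationalValue.
SELECT t.*
FROM dbo.MyTable AS t
JOIN dbo.localToNationalMapTable AS m
    ON m.nationalValue = t.col2
WHERE t.col1 = m.localValue;
```

Here the optimizer sees both tables and can choose a join strategy, instead of running the lookup once per row of MyTable.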
Differences between stored procedures and user-defined functions:
Stored procedures cannot be used in Select statements.
Stored procedures support Deferred Name Resolution.
Stored procedures are generally used for performing business logic.
Stored procedures can return any datatype.
Stored procedures can accept a greater number of input parameters than user-defined functions. Stored procedures can have up to 2,100 input parameters.
Stored procedures can execute Dynamic SQL.
Stored procedures support error handling.
Non-deterministic functions can be used in stored procedures.
User-defined functions can be used in Select statements.
User-defined functions do not support Deferred Name Resolution.
User-defined functions are generally used for computations.
User-defined functions should return a value.
User-defined functions cannot return Images.
User-defined functions accept smaller numbers of input parameters than stored procedures. UDFs can have up to 1,023 input parameters.
Temporary tables cannot be used in user-defined functions.
User-defined functions cannot execute Dynamic SQL.
User-defined functions do not support error handling. RAISERROR and @@ERROR are not allowed in UDFs.
Some non-deterministic built-in functions cannot be used in UDFs. For example, NEWID() and RAND() are not allowed; GETDATE() was also disallowed before SQL Server 2005.
| Stored procedure | Function (user-defined function) |
|---|---|
| Procedure can return zero, one, or multiple values | Function can return only a single value |
| Procedure can have input and output parameters | Function can have only input parameters |
| Procedure cannot be called from a function | Function can be called from a procedure |
| Procedure allows SELECT as well as DML statements in it | Function allows only SELECT statements in it |
| Exceptions can be handled by a try-catch block in a procedure | A try-catch block cannot be used in a function |
| Transaction management is possible in a procedure | Transaction management is not possible in a function |
| Procedure cannot be used in a SELECT statement | Function can be embedded in a SELECT statement |
| Procedure can affect the state of the database, meaning it can perform CRUD operations | Function cannot affect the state of the database, meaning it cannot perform CRUD operations |
| Procedure can use temporary tables | Function cannot use temporary tables |
| Procedure can alter the server environment parameters | Function cannot alter the environment parameters |
| Use a procedure when you want to group a possibly complex set of SQL statements | Use a function when you want to compute and return a value for use in other SQL statements |
Write a user-defined function when you want to compute and return a value for use in other SQL statements; write a stored procedure when you want instead to group a possibly complex set of SQL statements. These are two pretty different use cases, after all!
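To make the procedure side of this comparison concrete, here is a sketch of a stored procedure that uses three things listed above as unavailable in functions: DML, an output parameter, and a transaction with try/catch. The procedure name and the dbo.Accounts table are hypothetical, and THROW assumes SQL Server 2012 or later.

```sql
-- Hypothetical procedure; dbo.Accounts(AccountId, Balance) is an assumed table.
CREATE PROCEDURE dbo.usp_TransferFunds
    @FromAccountId INT,
    @ToAccountId   INT,
    @Amount        DECIMAL(18, 2),
    @RowsChanged   INT OUTPUT          -- output parameter, not available in functions
AS
BEGIN
    SET NOCOUNT ON;
    SET @RowsChanged = 0;

    BEGIN TRY
        BEGIN TRANSACTION;

        UPDATE dbo.Accounts
        SET Balance = Balance - @Amount
        WHERE AccountId = @FromAccountId;
        SET @RowsChanged = @RowsChanged + @@ROWCOUNT;

        UPDATE dbo.Accounts
        SET Balance = Balance + @Amount
        WHERE AccountId = @ToAccountId;
        SET @RowsChanged = @RowsChanged + @@ROWCOUNT;

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        -- Undo both updates if either one failed, then re-raise the error.
        IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
        THROW;
    END CATCH
END;
GO

-- Usage: the caller reads the output parameter after EXEC.
DECLARE @n INT;
EXEC dbo.usp_TransferFunds
    @FromAccountId = 1, @ToAccountId = 2, @Amount = 50.00, @RowsChanged = @n OUTPUT;
SELECT @n AS RowsChanged;
```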
Basic differences
A function must return a value, but for a stored procedure it is optional (a procedure can return zero or n values).
Functions can have only input parameters, whereas procedures can have input and output parameters.
A function takes one mandatory input parameter, but a stored procedure may take 0 to n input parameters.
Functions can be called from a procedure, whereas procedures cannot be called from a function.
Advanced differences
A procedure allows SELECT as well as DML (INSERT/UPDATE/DELETE) statements in it, whereas a function allows only SELECT statements in it.
Procedures cannot be used in a SELECT statement, whereas a function can be embedded in a SELECT statement.
Stored procedures cannot be used anywhere in the WHERE/HAVING/SELECT section of SQL statements, whereas functions can be.
Functions that return tables can be treated as another rowset. This can be used in JOINs with other tables.
Inline functions can be thought of as views that take parameters, and they can be used in JOINs and other rowset operations.
Exceptions can be handled by a try-catch block in a procedure, whereas a try-catch block cannot be used in a function.
Transaction management is possible in a procedure but not in a function.
A user-defined function is an important tool available to a SQL Server programmer. You can use it inline in a SQL statement like so:
SELECT a, lookupValue(b), c FROM customers
where lookupValue is a UDF. This kind of functionality is not possible with a stored procedure. At the same time, you cannot do certain things inside a UDF. The basic thing to remember here is that UDFs:
cannot create permanent changes
cannot change data
A stored procedure can do those things.
For me the inline usage of a UDF is the most important usage of a UDF.
Stored procedures are used as scripts. They run a series of commands for you, and you can schedule them to run at certain times. They usually run multiple DML statements like INSERT, UPDATE, DELETE, etc., or even SELECT.
Functions are used as methods. You pass it something and it returns a result. Should be small and fast - does it on the fly. Usually used in a SELECT statement.
SQL Server functions, like cursors, are meant to be used as your last weapon! They do have performance issues and therefore using a table-valued function should be avoided as much as possible. Talking about performance is talking about a table with more than 1,000,000 records hosted on a server on a middle-class hardware; otherwise you don't need to worry about the performance hit caused by the functions.
Never use a function to return a result-set to an external code (like ADO.Net)
Use a combination of views and stored procs as much as possible. You can sometimes recover from future performance issues caused by growth using the suggestions that DTA (Database Engine Tuning Advisor) gives you, like indexed views and statistics.
For further reference, see: http://databases.aspfaq.com/database/should-i-use-a-view-a-stored-procedure-or-a-user-defined-function.html
Stored procedure:
Is like a miniature program in SQL Server.
Can be as simple as a select statement, or as complex as a long
script that adds, deletes, updates, and/or reads data from multiple
tables in a database.
(Can implement loops and cursors, which both allow you to work with
smaller results or row by row operations on data.)
Should be called using EXEC or EXECUTE statement.
Returns result sets and can also return values through OUTPUT parameters.
Supports transactions.
Function:
Can not be used to update, delete, or add records to the database.
Simply returns a single value or a table value.
Can only be used to select records. However, it can be called
very easily from within standard SQL, such as:
SELECT dbo.functionname('Parameter1')
or
SELECT Name, dbo.Functionname('Parameter1') FROM sysObjects
For simple reusable select operations, functions can simplify code.
Just be wary of using JOIN clauses in your functions. If your
function has a JOIN clause and you call it from another select
statement that returns multiple results, that function call will JOIN
those tables together for each line returned in the result set. So
though they can be helpful in simplifying some logic, they can also be a
performance bottleneck if they're not used properly.
Returns its value with RETURN; OUTPUT parameters are not available.
Does not support transactions.
To decide when to use which, the following points might help:
Stored procedures can't return a table variable, whereas functions can.
You can use stored procedures to alter the server environment parameters, whereas with functions you can't.
cheers
Start with functions that return a single value. The nice thing is you can put frequently used code into a function and return them as a column in a result set.
Then, you might use a function for a parameterized list of cities, such as dbo.GetCitiesIn('NY'), which returns a table that can be used in a join.
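A possible shape for such a function, written as an inline table-valued function. Only the name dbo.GetCitiesIn comes from the answer; the dbo.Cities and dbo.Orders tables and their columns are assumptions made for the example.

```sql
-- Hypothetical schema: dbo.Cities(CityId, CityName, StateCode), dbo.Orders(OrderId, CityId).
CREATE FUNCTION dbo.GetCitiesIn (@StateCode CHAR(2))
RETURNS TABLE
AS
RETURN
(
    SELECT c.CityId, c.CityName
    FROM dbo.Cities AS c
    WHERE c.StateCode = @StateCode
);
GO

-- The function result is joined like any other table.
SELECT o.OrderId, g.CityName
FROM dbo.Orders AS o
JOIN dbo.GetCitiesIn('NY') AS g
    ON g.CityId = o.CityId;
```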
It's a way of organizing code. Knowing when something is reusable and when it is a waste of time is something only gained through trial and error and experience.
Also, functions are a good idea in SQL Server. They are faster and can be quite powerful. Inline and direct selects. Careful not to overuse.
Here's a practical reason to prefer functions over stored procedures. If you have a stored procedure that needs the results of another stored procedure, you have to use an insert-exec statement. This means that you have to create a temp table and use an exec statement to insert the results of the stored procedure into the temp table. It's messy. One problem with this is that insert-execs cannot be nested.
If you're stuck with stored procedures that call other stored procedures, you may run into this. If the nested stored procedure simply returns a dataset, it can be replaced with a table-valued function and you'll no longer get this error.
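The insert-exec pattern described above looks roughly like this, followed by the table-valued function alternative. Both dbo.usp_GetCustomerTotals and dbo.fn_GetCustomerTotals are hypothetical names, and the column list is assumed.

```sql
-- Consuming a procedure's result set requires a temp table shaped like its output.
CREATE TABLE #Totals (
    CustomerId  INT,
    TotalAmount DECIMAL(18, 2)
);

INSERT INTO #Totals (CustomerId, TotalAmount)
EXEC dbo.usp_GetCustomerTotals;   -- fails if this procedure itself uses INSERT ... EXEC

SELECT CustomerId, TotalAmount
FROM #Totals
WHERE TotalAmount > 1000;

DROP TABLE #Totals;

-- If the inner procedure only returns a dataset, a table-valued function
-- can be queried directly, with no temp table and no nesting restriction.
SELECT CustomerId, TotalAmount
FROM dbo.fn_GetCustomerTotals()
WHERE TotalAmount > 1000;
```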
(this is yet another reason we should keep business logic out of the database)
I realize this is a very old question, but I don't see one crucial aspect mentioned in any of the answers: inlining into query plan.
Functions can be...
Scalar:
CREATE FUNCTION ... RETURNS scalar_type AS BEGIN ... END
Multi-statement table-valued:
CREATE FUNCTION ... RETURNS @r TABLE(...) AS BEGIN ... END
Inline table-valued:
CREATE FUNCTION ... RETURNS TABLE AS RETURN SELECT ...
The third kind (inline table-valued) are treated by the query optimizer essentially as (parametrized) views, which means that referencing the function from your query is similar to copy-pasting the function's SQL body (without actually copy-pasting), leading to the following benefits:
The query planner can optimize the inline function's execution just as it would any other sub-query (e.g. eliminate unused columns, push predicates down, pick different JOIN strategies etc.).
Combining several inline functions doesn't require materializing the result from the first one before feeding it to the next.
The above can lead to potentially significant performance savings, especially when combining multiple levels of functions.
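As an illustration of the inlining described above, the per-row scalar lookup from the earlier MyFunction example could be rewritten as an inline table-valued function and applied with CROSS APPLY. The function name dbo.fn_LocalValue is hypothetical; the tables and columns are the ones used in that earlier example.

```sql
-- Inline table-valued function: a single RETURN SELECT, no BEGIN ... END body.
CREATE FUNCTION dbo.fn_LocalValue (@nationalValue INT)
RETURNS TABLE
AS
RETURN
(
    SELECT m.localValue
    FROM dbo.localToNationalMapTable AS m
    WHERE m.nationalValue = @nationalValue
);
GO

-- The optimizer expands the function body into this query, much like a view,
-- so it can be planned as a join rather than as one call per row.
SELECT t.*
FROM dbo.MyTable AS t
CROSS APPLY dbo.fn_LocalValue(t.col2) AS f
WHERE t.col1 = f.localValue;
```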
NOTE: Looks like SQL Server 2019 will introduce some form of scalar function inlining as well.
It is mandatory for a function to return a value, while it is not for a stored procedure.
Only SELECT statements are accepted in a UDF; DML statements are not allowed.
A stored procedure accepts any statements, including DML statements.
A UDF allows only inputs and not outputs.
A stored procedure allows both inputs and outputs.
Catch blocks cannot be used in a UDF but can be used in a stored procedure.
Transactions are not allowed in a UDF, but they are allowed in a stored procedure.
Only table variables can be used in a UDF, not temporary tables.
A stored procedure allows both table variables and temporary tables.
A UDF cannot call a stored procedure, while a stored procedure can call a function.
A UDF can be used in a join clause, while a stored procedure cannot.
A stored procedure may return nothing at all. A UDF, on the contrary, must always return a value of its declared type.
Functions can be used in a SELECT statement, whereas procedures cannot.
Stored procedures take both input and output parameters, but functions take only input parameters.
Functions cannot return values of type text, ntext, image, or timestamp, whereas procedures can.
Functions can be used in computed column definitions in CREATE TABLE, but procedures cannot.
E.g.: create table <tablename>(name varchar(10), salary as dbo.getsal(name))
Here getsal is a user-defined function that returns a salary. When the table is created, no storage is allotted for the salary column and getsal is not executed. When values are fetched from this table, getsal is executed and its return value appears in the result set.
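In current SQL Server syntax, a column backed by a function like this is a computed column, declared with AS. A small sketch follows; getsal comes from the answer, while the dbo.SalaryLookup table, the dbo.Employees table, and the MONEY return type are assumptions.

```sql
-- Hypothetical lookup table: dbo.SalaryLookup(name, salary).
CREATE FUNCTION dbo.getsal (@name VARCHAR(10))
RETURNS MONEY
AS
BEGIN
    DECLARE @salary MONEY;

    SELECT @salary = s.salary
    FROM dbo.SalaryLookup AS s
    WHERE s.name = @name;

    RETURN @salary;
END;
GO

-- salary is a computed column: it is not stored, and getsal runs when the column is read.
CREATE TABLE dbo.Employees (
    name   VARCHAR(10) NOT NULL,
    salary AS dbo.getsal(name)
);
GO

SELECT name, salary FROM dbo.Employees;
```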
Generally, using stored procedures is better for performance.
For example, in previous versions of SQL Server, if you put the function in a JOIN condition, the cardinality estimate is 1 (before SQL 2012) or 100 (from SQL 2012 up to, but not including, SQL 2017), and the engine can generate a bad execution plan.
The same can happen if you put the function in a WHERE clause.
With SQL 2017, Microsoft introduced a feature called interleaved execution in order to produce a more accurate estimate, but the stored procedure remains the best solution.
For more details, see the following article by Joe Sack:
https://techcommunity.microsoft.com/t5/sql-server/introducing-interleaved-execution-for-multi-statement-table/ba-p/385417
In SQL Server, functions and stored procedures are two different types of entities.
Function: In a SQL Server database, functions are used to perform some action, and the action returns a result immediately.
There are two types of functions:
System-defined
User-defined
Stored procedure: In SQL Server, stored procedures are stored on the server and can return zero, one, or multiple values.
There are two types of stored procedures:
System stored procedures
User-defined procedures