Table name as a PostgreSQL function parameter

I want to pass a table name as a parameter in a Postgres function. I tried this code:
CREATE OR REPLACE FUNCTION some_f(param character varying) RETURNS integer
AS $$
BEGIN
    IF EXISTS (select * from quote_ident($1) where quote_ident($1).id=1) THEN
        return 1;
    END IF;
    return 0;
END;
$$ LANGUAGE plpgsql;
select some_f('table_name');
And I got this:
ERROR: syntax error at or near "."
LINE 4: ...elect * from quote_ident($1) where quote_ident($1).id=1)...
^
And here is the error I got when I changed it to select * from quote_ident($1) tab where tab.id=1:
ERROR: column tab.id does not exist
LINE 1: ...T EXISTS (select * from quote_ident($1) tab where tab.id...
Probably quote_ident($1) works, because without the where quote_ident($1).id=1 part I get 1, which means something is selected. Why might the first quote_ident($1) work while the second one does not? And how can this be solved?

Before you go there: for only a few known table names, it's typically simpler to avoid dynamic SQL and spell out the few code variants in separate functions or in a CASE construct.
That said, what you are trying to achieve can be simplified and improved:
CREATE OR REPLACE FUNCTION some_f(_tbl regclass, OUT result integer)
  LANGUAGE plpgsql AS
$func$
BEGIN
   EXECUTE format('SELECT (EXISTS (SELECT FROM %s WHERE id = 1))::int', _tbl)
   INTO result;
END
$func$;
Call with schema-qualified name (see below):
SELECT some_f('myschema.mytable'); -- would fail with quote_ident()
Or:
SELECT some_f('"my very uncommon table name"');
Major points
Use an OUT parameter to simplify the function. You can directly select the result of the dynamic SQL into it and be done. No need for additional variables and code.
EXISTS does exactly what you want. You get true if the row exists or false otherwise. There are various ways to do this, EXISTS is typically most efficient.
You seem to want an integer back, so I cast the boolean result from EXISTS to integer, which yields exactly what you had. I would return boolean instead (a sketch of that variant follows right after these points).
I use the object identifier type regclass as input type for _tbl. That does everything quote_ident(_tbl) or format('%I', _tbl) would do, but better, because:
.. it prevents SQL injection just as well.
.. it fails immediately and more gracefully if the table name is invalid / does not exist / is invisible to the current user. (A regclass parameter is only applicable for existing tables.)
.. it works with schema-qualified table names, where a plain quote_ident(_tbl) or format('%I', _tbl) would fail because they cannot resolve the ambiguity. You would have to pass and escape schema and table names separately.
It only works for existing tables, obviously.
I still use format(), because it simplifies the syntax (and to demonstrate how it's used), but with %s instead of %I. Typically, queries are more complex so format() helps more. For the simple example we could as well just concatenate:
EXECUTE 'SELECT (EXISTS (SELECT FROM ' || _tbl || ' WHERE id = 1))::int'
No need to table-qualify the id column while there is only a single table in the FROM list. No ambiguity possible in this example. (Dynamic) SQL commands inside EXECUTE have a separate scope, function variables or parameters are not visible there - as opposed to plain SQL commands in the function body.
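Here is a minimal sketch of the boolean variant mentioned above, under the same assumptions (a regclass parameter and an id column); the function name some_f_bool is just for illustration:

CREATE OR REPLACE FUNCTION some_f_bool(_tbl regclass, OUT result boolean)
  LANGUAGE plpgsql AS
$func$
BEGIN
   -- same dynamic query as above, just without the cast to integer
   EXECUTE format('SELECT EXISTS (SELECT FROM %s WHERE id = 1)', _tbl)
   INTO result;
END
$func$;

SELECT some_f_bool('myschema.mytable');  -- returns true / false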
Here's why you always escape user input for dynamic SQL properly:
db<>fiddle here demonstrating SQL injection
Old sqlfiddle
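To illustrate the point without the fiddle, here is a sketch of the kind of injection such a demo shows, assuming a naive variant that concatenates an unescaped text parameter; the table and column names are made up:

-- UNSAFE: _tbl is concatenated as plain text, not treated as an identifier
CREATE OR REPLACE FUNCTION some_f_unsafe(_tbl text, OUT result integer)
  LANGUAGE plpgsql AS
$func$
BEGIN
   EXECUTE 'SELECT (EXISTS (SELECT FROM ' || _tbl || ' WHERE id = 1))::int'
   INTO result;
END
$func$;

-- The "table name" smuggles in an arbitrary predicate and comments out the rest:
SELECT some_f_unsafe('mytable WHERE username = ''admin''))::int --');
-- executes: SELECT (EXISTS (SELECT FROM mytable WHERE username = 'admin'))::int -- WHERE id = 1))::int

A regclass parameter (or quote_ident() / format('%I', ...)) rejects such input because it is not a valid identifier.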

If at all possible, don't do this.
That's the answer: it's an anti-pattern. If the client knows the table it wants data from, then SELECT FROM ThatTable. If a database is designed in a way that this is required, it seems to be designed sub-optimally. If a data access layer needs to know whether a value exists in a table, it is easy to compose that SQL in the data access code, and pushing this code into the database is not good.
To me this seems like installing a device inside an elevator where one can type in the number of the desired floor. After the Go button is pressed, it moves a mechanical hand over to the correct button for the desired floor and presses it. This introduces many potential issues.
Please note: there is no intention of mockery, here. My silly elevator example was *the very best device I could imagine* for succinctly pointing out issues with this technique. It adds a useless layer of indirection, moving table name choice from a caller space (using a robust and well-understood DSL, SQL) into a hybrid using obscure/bizarre server-side SQL code.
Such responsibility-splitting through movement of query construction logic into dynamic SQL makes the code harder to understand. It violates a standard and reliable convention (how a SQL query chooses what to select) in the name of custom code fraught with potential for error.
Here are detailed points on some of the potential problems with this approach:
Dynamic SQL offers the possibility of SQL injection that is hard to recognize in the front end code or the back end code alone (one must inspect them together to see this).
Stored procedures and functions can access resources that the SP/function owner has rights to but the caller doesn't. As far as I understand, without special care, then by default when you use code that produces dynamic SQL and runs it, the database executes the dynamic SQL under the rights of the caller. This means you either won't be able to use privileged objects at all, or you have to open them up to all clients, increasing the surface area of potential attack to privileged data. Setting the SP/function at creation time to always run as a particular user (in SQL Server, EXECUTE AS) may solve that problem, but makes things more complicated. This exacerbates the risk of SQL injection mentioned in the previous point, by making the dynamic SQL a very enticing attack vector.
When a developer must understand what the application code is doing in order to modify it or fix a bug, he'll find it very difficult to get the exact SQL query being executed. SQL profiler can be used, but this takes special privileges and can have negative performance effects on production systems. The executed query can be logged by the SP but this increases complexity for questionable benefit (requiring accommodating new tables, purging old data, etc.) and is quite non-obvious. In fact, some applications are architected such that the developer does not have database credentials, so it becomes almost impossible for him to actually see the query being submitted.
When an error occurs, such as when you try to select a table that doesn't exist, you'll get a message along the lines of "invalid object name" from the database. That will happen exactly the same whether you're composing the SQL in the back end or the database, but the difference is, some poor developer who's trying to troubleshoot the system has to spelunk one level deeper into yet another cave below the one where the problem exists, to dig into the wonder-procedure that Does It All to try to figure out what the problem is. Logs won't show "Error in GetWidget", it will show "Error in OneProcedureToRuleThemAllRunner". This abstraction will generally make a system worse.
An example in pseudo-C# of switching table names based on a parameter:
string sql = $"SELECT * FROM {EscapeSqlIdentifier(tableName)};"
results = connection.Execute(sql);
While this does not eliminate every possible issue imaginable, the flaws I outlined with the other technique are absent from this example.

Inside plpgsql code, the EXECUTE statement must be used for queries in which table names or columns come from variables. Also, the IF EXISTS (<query>) construct is not allowed when the query is dynamically generated.
Here's your function with both problems fixed:
CREATE OR REPLACE FUNCTION some_f(param character varying) RETURNS integer
AS $$
DECLARE
    v int;
BEGIN
    EXECUTE 'select 1 FROM ' || quote_ident(param) || ' WHERE '
            || quote_ident(param) || '.id = 1' INTO v;
    IF v THEN return 1; ELSE return 0; END IF;
END;
$$ LANGUAGE plpgsql;
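Note that while IF EXISTS (<dynamic query>) is not allowed, the EXISTS expression itself can go inside the EXECUTE'd string. A sketch of the same function reworked that way (still assuming an id column):

CREATE OR REPLACE FUNCTION some_f(param character varying) RETURNS integer
AS $$
DECLARE
    v boolean;
BEGIN
    EXECUTE 'SELECT EXISTS (SELECT 1 FROM ' || quote_ident(param) || ' WHERE id = 1)'
    INTO v;
    IF v THEN RETURN 1; ELSE RETURN 0; END IF;
END;
$$ LANGUAGE plpgsql;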

I know this is an old thread, but I ran across it recently when trying to solve the same problem - in my case, for some fairly complex scripts.
Turning the entire script into dynamic SQL is not ideal. It's tedious and error-prone work, and you lose the ability to parameterize: parameters must be interpolated into constants in the SQL, with bad consequences for performance and security.
Here's a simple trick that lets you keep the SQL intact if you only need to select from your table - use dynamic SQL to create a temporary view:
CREATE OR REPLACE FUNCTION some_f(_tbl varchar) returns integer
AS $$
BEGIN
    drop view if exists myview;
    execute format('create temporary view myview as select * from %s', _tbl);
    -- now you can reference myview in the SQL
    IF EXISTS (select * from myview where myview.id=1) THEN
        return 1;
    END IF;
    return 0;
END;
$$ language plpgsql;
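A possible call, with my_table standing in for a real table that has an id column:

SELECT some_f('my_table');  -- 1 if a row with id = 1 exists, else 0

Two caveats: the %s placeholder does not escape the table name, so the injection concerns from the accepted answer apply here as well, and drop view if exists myview could drop a permanent view of that name if no temporary one exists, so it can be worth schema-qualifying the view as pg_temp.myview.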

The first doesn't actually "work" in the sense that you mean; it works only insofar as it does not generate an error.
Try SELECT * FROM quote_ident('table_that_does_not_exist'); and you will see why your function returns 1: the select returns a table with one column (named quote_ident) containing one row (the value of $1, in this particular case table_that_does_not_exist).
What you want to do will require dynamic SQL, which is actually the place that the quote_* functions are meant to be used.
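You can see this for yourself by running the function call in the FROM clause directly; the output below is roughly what psql shows:

SELECT * FROM quote_ident('table_that_does_not_exist');

        quote_ident
----------------------------
 table_that_does_not_exist
(1 row)

quote_ident() just returns its argument as a (quoted, if necessary) text value, and selecting from a function puts that single value into a single-row, single-column result; no table named table_that_does_not_exist is ever touched.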

If the question is rather to test whether the table is empty or not (instead of checking for id = 1), here is a simplified version of Erwin's function:
CREATE OR REPLACE FUNCTION isEmpty(tableName text, OUT zeroIfEmpty integer) AS
$func$
BEGIN
    EXECUTE format('SELECT COALESCE((SELECT 1 FROM %s LIMIT 1), 0)', tableName)
    INTO zeroIfEmpty;
END
$func$ LANGUAGE plpgsql;
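A usage sketch, with mytable as a placeholder; note that tableName is interpolated with %s, so the regclass / format('%I') considerations from the accepted answer apply here, too:

SELECT isEmpty('mytable');  -- 1 if mytable has at least one row, 0 if it is empty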

If you want the table name, column name and value to be passed to the function dynamically as parameters, use this code:
create or replace function total_rows(tbl_name text, column_name text, value int)
returns integer as $total$
declare
    total integer;
begin
    EXECUTE format('select count(*) from %s WHERE %s = %s', tbl_name, column_name, value) INTO total;
    return total;
end;
$total$ language plpgsql;
postgres=# select total_rows('tbl_name','column_name',2); --2 is the value
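Since %s does not quote identifiers, a somewhat safer sketch of the same idea treats the table and column names as identifiers with %I and passes the value as a bound parameter via USING; the name total_rows_safe is just to keep it apart:

create or replace function total_rows_safe(tbl_name text, column_name text, value int)
returns bigint as $total$
declare
    total bigint;
begin
    EXECUTE format('select count(*) from %I WHERE %I = $1', tbl_name, column_name)
    INTO total
    USING value;
    return total;
end;
$total$ language plpgsql;

Like format('%I') in general, this does not accept schema-qualified table names; use a regclass parameter for that, as in the accepted answer.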

I have PostgreSQL version 9.4 and I always use this code:
CREATE FUNCTION add_new_table(text) RETURNS void AS
$BODY$
begin
    execute
    'CREATE TABLE ' || $1 || '(
         item_1 type,
         item_2 type
     )';
end;
$BODY$
LANGUAGE plpgsql;
And then:
SELECT add_new_table('my_table_name');
It works well for me.
Attention! The example above is one of those that shows how not to do it if we want to keep the database safe while querying it :P
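For completeness, a sketch of the same function with the identifier escaped via format('%I', ...), so an odd or malicious name cannot break out of the CREATE TABLE statement; integer and text stand in for the original type placeholders:

CREATE OR REPLACE FUNCTION add_new_table(_tbl text) RETURNS void AS
$BODY$
begin
    execute format('CREATE TABLE %I (
                        item_1 integer,
                        item_2 text
                    )', _tbl);
end;
$BODY$
LANGUAGE plpgsql;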

Related

I want to make a procedure that has column and table as variables

I'm painfully new to SQL/MySQL as a whole, so I'm flying blind right now; apologies.
I made a procedure in MySQL that selects varchar data from a specific column and table, turns it into INT (the contents of said column are numerical to begin with) and outputs its values after a mathematical operation, as a (very simple) attempt at data masking. As follows:
CREATE PROCEDURE qdwh.mask_varchar_num2(tablename varchar(100), colname varchar (100))
BEGIN
    set @a=concat('select','(','(','(','(','select',colname ,'from',tablename,')','+','0',')','+','297',')','*','5',')','as','colname');
    prepare query from @a;
    execute query;
    deallocate prepare query;
END
but when I tried to call the procedure with the following line:
select [column] , mask_varchar_num2 ([column]) from [table];
an error "FUNCTION qdwh.mask_varchar_num2 does not exist" shows up. I wanted the script to output a select of the column in question after the conversion to INT and the mathematical operation done to it, so I can then use this procedure in a larger script ("create table ... as select" kind of stuff) to convert the whole table into masked data as needed.
Is there something I am missing, and what am I doing wrong? DBeaver acknowledges the procedure script as legit, so I don't know what's wrong. Thanks in advance for the advice.
Procedures are run by using call and cannot be called within a select query. To define a function, you need to use create function.
Not an answer, but here's what your select looks like:
set @colname='a';
set @tablename='t';
set @a=concat('select','(','(','(','(','select',@colname ,'from',@tablename,')','+','0',')','+','297',')','*','5',')','as','colname');
select @a;
'select((((selectafromt)+0)+297)*5)ascolname'
It's missing a lot of spaces between the tokens; a corrected version follows below.
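A sketch of the fix: keep the spaces (and, ideally, backtick-quote the identifiers) inside the CONCAT; the alias masked_value is made up, and inside the procedure the procedure parameters would be used instead of the session variables:

set @a = concat('select ((((select `', @colname, '` from `', @tablename, '`) + 0) + 297) * 5) as masked_value');
prepare query from @a;
execute query;
deallocate prepare query;

Backticks only guard against odd characters breaking the syntax; identifiers coming from user input should still be validated against a whitelist.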

Need to run a pl/pgsql fn that runs an INSERT

Here is what I have. I am trying to create an insert function that loads row data into a pre-existing table. I also want to run a check on specific column data to make sure the source data is not invalid.
The problem I seem to be having is getting it to run successfully. For some reason I can't get this to work, and I have tried various ways and researched diligently within the site (some answers come close but don't quite give me what I need). Here is basically what I have and want to achieve. I know it may be basic, so thanks in advance.
CREATE OR REPLACE FUNCTION Schema.insert_fn (arg_1 character varying , arg_2 integer)
RETURNS SETOF CHARACTER VARYING AS
$BODY$
BEGIN
    --should this insert use some kind of temp table?
    insert into <schema>.table1 (character varying, integer)
    values (arg_1 character varying, arg_2 integer);

    --If I wanted to run some sort of check on say arg_2
    If (select distinct (arg_2) from <schema>.table2 where invalid_date is not null)
    THEN
        raise notice 'Data has been invalidated';
    END IF;

    Return 'complete';
END;
$BODY$
Update
First, it told me that my return needs to have 'NEXT' or 'QUERY':
RETURN cannot have a parameter in function returning set;
use RETURN NEXT or RETURN QUERY at or near "'complete'"
Once I do this, the function creation completes. However, when I call it I get an error saying, for example:
invalid input syntax for type boolean: "arg_1"
I apologize if I come off a little vague. Obviously I can't give you the complete context of the arg names as they relate to what I am doing. I appreciate any help.
Update
I also receive this error:
more than one row returned by a subquery used as an expression
I did research on this issue as well and simply cannot find any kind of solution to at least get this to work; meaning, when I call it with no arguments, I receive this error.
Update @ErwinBrandstetter: I communicated that wrong. I meant: if
'col2 = arg_2 and invalid_date is NOT null', raise an exception.
What is happening is that the EXISTS statement returns true for any instance where a row is found. I tried 'WHERE EXISTS' and I got an error. The problem, I figure, is that the validated and invalidated data share the same unique id, which makes the EXISTS statement true (I didn't provide this info, mind you).
Update
@ErwinBrandstetter: It now operates successfully. Looks like all I needed to do was separate the two conditions. Thanks.
IF EXISTS (condition)
THEN
    INSERT
ELSEIF EXISTS (invalidated data condition)
THEN
    RAISE EXCEPTION 'DATA IS INVALIDATED';
END IF;
END;
Might work like this:
CREATE OR REPLACE FUNCTION schema.insert_fn (_arg1 text, _arg2 integer)
  RETURNS text AS  -- not SETOF!
$func$
BEGIN
   INSERT INTO <schema>.table1 (col1, col2)  -- names here! not types
   SELECT _arg1, _arg2                       -- again: names! cast only if needed
   WHERE  NOT EXISTS (                       -- only if _arg2 not invalidated
      SELECT 1
      FROM   <schema>.table2
      WHERE  col2 = _arg2
      AND    invalid_date IS NOT NULL
      );

   IF NOT FOUND THEN  -- No exception yet, no INSERT either --> invalidated
      RAISE EXCEPTION 'Data >>%<< has been invalidated.', _arg2;
   END IF;

   RETURN 'complete'::text;  -- return value probably useless
END
$func$ LANGUAGE plpgsql;
The most prominent error was that you declared the function to return a SETOF values, while you actually only return a single value. I might just use RETURNS void, since the return value does not carry information as is.
Read the manual here and here.
Use a SELECT with your INSERT to apply additional conditions directly.
There is more. See comments in code above.
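A possible call, with placeholder values (the table and column names inside the body are placeholders as well):

SELECT schema.insert_fn('some text', 42);  -- inserts and returns 'complete'
SELECT schema.insert_fn('some text', 7);   -- raises 'Data >>7<< has been invalidated.' if table2 marks 7 as invalidated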

Why can I reference a non-existing function within another function?

Here, I will show that referencing a non-existing function from another function is possible and that SQL Server doesn't check it until execution time:
USE [SomeDataBase];
SELECT dbo.Booo();
Obviously, if you don't have the function Booo, then an error will be generated saying that the function Booo is not recognized. This isn't a surprise, though!
Now, try this:
CREATE FUNCTION dbo.Foo()
RETURNS INT
AS
BEGIN
    DECLARE @Temp INT
    SET @Temp = (SELECT dbo.Booo())
    RETURN 1
END
Surprisingly, this script creates the function Foo despite the fact that the Booo function doesn't exist.
Any idea?
Why do you think that's a bug? Since the code isn't actually executed until you run the Foo function, there's a case to be made that that is the point where the check should be made.
Maybe you write your functions in a top-down manner, rather than a bottom-up manner, and you want to write the upper levels first, drilling down to specifics later.
Unless it's documented to work one way and it works another way, it's not a bug, just a disagreement between you and Microsoft :-)
If you do
CREATE FUNCTION dbo.Foo()
RETURNS INT
WITH SCHEMABINDING
AS
BEGIN
    DECLARE @Temp INT
    SET @Temp = (SELECT dbo.Booo())
    RETURN 1
END
You get your desired error and the function is not created. That does make altering the definition of dbo.Booo in the future more painful, however (you need to drop dbo.Foo first).
You can also use a SQL Server Data Tools project to validate things like references to non-existent objects/columns without using schemabinding.

If conditional in SQL Script for Mysql

In a SQL script that executes sequentially, is there a way to introduce an IF THEN ELSE conditional to control the flow of query execution?
I happened to run into this http://www.bennadel.com/blog/1340-MySQL-Does-Not-Support-IF-ELSE-Statements-In-General-SQL-Work-Flow.htm
which says that the IF THEN ELSE will not work in a sql script.
Is there another way around?
Basically, I want to run a particular "select colName from table" command and check if colName corresponds to a particular value. If it does, proceed with the rest of the script. Else, halt execution.
Please advise.
I just wrap my SQL script in a procedure, where conditional code is allowed. If you'd rather not leave the statements lying around, you can drop the procedure when you're done. Here's an example:
delimiter //
create procedure insert_games()
begin
    set @platform_id := (select id from platform where name = 'Nintendo DS');

    -- Only insert rows if the platform was found
    if @platform_id is not null then
        insert into game(name, platform_id) values('New Super Mario Bros', @platform_id);
        insert into game(name, platform_id) values('Mario Kart DS', @platform_id);
    end if;
end;
//
delimiter ;

-- Execute the procedure
call insert_games();

-- Drop the procedure
drop procedure insert_games;
If you haven't used procedures, the "delimiter" keyword might need some explanation. The first line switches the delimiter to "//" so that we can include semi-colons in our procedure definition without MySQL attempting to interpret them yet. Once the procedure has been created, we switch the delimiter back to ";" so we can execute statements as usual.
After doing some research, I think I may have found a way to work around this. I was looking for a way to verify whether a script had already been executed against a target database, primarily for version control of my databases. I have a table created to keep track of the scripts that have been executed and wanted some flow inside my scripts to check that table first before execution. While I have not completely solved the problem yet, I have created a simple script that basically does what I need; I just need to wrap the DDL into the selects based on the value of the variables (a sketch of that follows at the end of this answer).
step 1 - Setup a bit variable to hold the result
step 2 - do your select and set the variable if the result is found
step 3 - Do what you need to do on false result
step 4 - Do what you need to do on true result
Here is the example script
set @schemachangeid = 0;
select @schemachangeid := 1 from SchemaChangeLog where scriptname = '1_create_tables.sql';
select 'scriptalreadyran' from dual where @schemachangeid = 1;
select 'scriptnotran' from dual where @schemachangeid = 0;
I also recognize this is an old thread but maybe this will help someone out there trying to do this kind of thing outside of a stored procedure like me.
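A sketch of wrapping the DDL that way: build the statement conditionally from the flag and run it through a prepared statement. demo_table is a placeholder, and you would normally also insert a row into SchemaChangeLog after a successful run:

set @schemachangeid = 0;
select @schemachangeid := 1 from SchemaChangeLog where scriptname = '1_create_tables.sql';

-- run the DDL only if the script has not been logged yet, otherwise a harmless no-op
set @ddl = if(@schemachangeid = 0,
              'create table demo_table (id int primary key)',
              'select ''1_create_tables.sql already applied''');
prepare stmt from @ddl;
execute stmt;
deallocate prepare stmt;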

Does SQL Server Management Studio (or SQL Server) evaluate *all* expressions?

Here's my configuration:
I have a re-runnable batch script that I use to update my database.
Inside of that batch script, I have code that says the following:
If Table 'A' doesn't exist, then create Table 'A' and insert rows into it.
Later on in that batch script, I create a schema-bound indexed view on that table.
And if you didn't already know, indexed views require specific client settings.
Sometimes, when I re-run the script (that is, after the table has been created), SQL Server Management Studio evaluates the "insert rows" code, even though it is protected by the 'If this table doesn't exist' check, and yields the following error:
Msg 1934, Level 16, State 1, Line 15
INSERT failed because the following SET options have incorrect settings: 'CONCAT_NULL_YIELDS_NULL, ANSI_WARNINGS, ANSI_PADDING, ARITHABORT'. Verify that SET options are correct for use with indexed views and/or indexes on computed columns and/or filtered indexes and/or query notifications and/or XML data type methods and/or spatial index operations.
Please note: If someone were to try this INSERT statement in a vacuum, I would fully expect SSMS to generate this error.
But not when it's protected by a conditional block.
My Question:
Does the SSMS compiler evaluate all expressions, regardless of whether they will actually be executed?
Yes, it evaluates all of them. Take a look at this:
declare @i int
select @i = 1
if @i = 1
begin
    declare @i2 int
    set @i2 = 5
end
else
begin
    declare @i2 int
    set @i2 = 5
end
Msg 134, Level 15, State 1, Line 12
The variable name '@i2' has already been declared. Variable names must be unique within a query batch or stored procedure.
Another example with temp tables is here: What is deferred name resolution and why do you need to care?
Your only way out would be to wrap it inside dynamic SQL; a sketch follows below.
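A sketch of that workaround with placeholder object names; because the INSERT only exists inside a string, SQL Server does not compile it (and does not check the SET options) unless the branch is actually taken:

IF OBJECT_ID(N'dbo.TableA', N'U') IS NULL
BEGIN
    CREATE TABLE dbo.TableA (Id int NOT NULL, Val varchar(50) NOT NULL);

    -- compiled and checked only when this EXEC actually runs
    EXEC sys.sp_executesql N'INSERT INTO dbo.TableA (Id, Val) VALUES (1, ''x'');';
END;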
Note that most of the settings you mention are connection-level, i.e. in case you set/change them they stay in effect unless you close the connection or explicitly change their value.
Returning to your question. The error you mention looks like runtime error, i.e. the INSERT is actually being executed. It would be better if you could show your script (omitting details, but keeping batches).
Edit: it is not the SSMS compiler that evaluates the SQL you try to execute - it is SQL Server. What do you mean by 'evaluate'? Is it 'execute'? When you run a batch (which is what is actually being executed by the server), SQL Server first does syntactic analysis and throws an error if it finds any syntax error; nothing is executed at that point. If the syntax is OK, the server starts executing your batch.
Again, the error you show seems to be a runtime error - so I guess you should carefully watch for the conditions and track what happens (or provide us more details about 'sometimes').