Committing transactions while executing a PostgreSQL function

I have a PostgreSQL function which has to INSERT about 1.5 million rows into a table, and I want to see the table getting populated as each record is inserted. Currently, when I try with say about 1000 records, the table gets populated only after the complete function has executed; if I stop the function halfway through, no data gets populated at all. How can I make the records stay committed even if I stop the function after a certain number of records have been inserted?

This can be done using dblink. Below is an example with a single insert being committed; you will need to add your own loop logic and commit on each iteration (see the loop sketch after the example). See the dblink documentation: http://www.postgresql.org/docs/9.3/static/contrib-dblink-connect.html
CREATE OR REPLACE FUNCTION log_the_dancing(ip_dance_entry text)
  RETURNS INT AS
$BODY$
BEGIN
  -- Open a named connection back to the server; statements sent over it
  -- commit independently of the calling transaction.
  PERFORM dblink_connect('dblink_trans', 'dbname=sandbox port=5433 user=postgres');
  -- quote_literal() guards against quoting problems and SQL injection.
  PERFORM dblink('dblink_trans',
                 'INSERT INTO dance_log(dance_entry) SELECT ' || quote_literal(ip_dance_entry));
  PERFORM dblink('dblink_trans', 'COMMIT;');
  PERFORM dblink_disconnect('dblink_trans');
  RETURN 0;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION log_the_dancing(ip_dance_entry text)
OWNER TO postgres;
BEGIN TRANSACTION;
select log_the_dancing('The Flamingo');
select log_the_dancing('Break Dance');
select log_the_dancing('Cha Cha');
ROLLBACK TRANSACTION;
-- Show records committed even though we rolled back the outer transaction
select * from dance_log;
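A minimal sketch of the looped, batch-committing version the answer alludes to. It assumes a hypothetical staging table staging_entries as the source, and uses dblink_exec, which is the documented call for non-row-returning commands; each dblink_exec runs in its own remote transaction, so rows inserted so far survive even if the function is cancelled:
CREATE OR REPLACE FUNCTION load_in_batches()
  RETURNS void AS
$BODY$
DECLARE
  rec RECORD;
  n   bigint := 0;
BEGIN
  PERFORM dblink_connect('dblink_trans', 'dbname=sandbox port=5433 user=postgres');
  FOR rec IN SELECT dance_entry FROM staging_entries LOOP
    PERFORM dblink_exec('dblink_trans',
      'INSERT INTO dance_log(dance_entry) VALUES (' || quote_literal(rec.dance_entry) || ')');
    n := n + 1;
    -- report progress every 1000 rows
    IF n % 1000 = 0 THEN
      RAISE NOTICE '% rows sent', n;
    END IF;
  END LOOP;
  PERFORM dblink_disconnect('dblink_trans');
END;
$BODY$
LANGUAGE plpgsql VOLATILE;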

What you're asking for is generally called an autonomous transaction.
PostgreSQL does not support autonomous transactions at this time (9.4).
To support them properly it really needs stored procedures, not just the user-defined functions it currently supports. It is also very complicated to implement autonomous transactions in PostgreSQL for a variety of internal reasons related to its session and process model.
For now, use dblink as suggested by Bob.

If you have the flexibility to change from a function to a procedure: from PostgreSQL 11 onwards you can commit inside the procedure body, because procedures invoked with the CALL command support transaction control. So your function would be changed to a procedure and invoked with CALL, e.g.:
CREATE PROCEDURE transaction_test2()
LANGUAGE plpgsql
AS $$
DECLARE
    r RECORD;
BEGIN
    FOR r IN SELECT * FROM test2 ORDER BY x LOOP
        INSERT INTO test1 (a) VALUES (r.x);
        COMMIT;
    END LOOP;
END;
$$;
CALL transaction_test2();
Note that transaction control is only possible when CALL is executed at the top level, not from within an explicit transaction block. More details about transaction management in Postgres are available here: https://www.postgresql.org/docs/12/plpgsql-transactions.html
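Since the original question wants to commit every N records rather than after every row, here is a batched variant, sketched under the assumption of the same test1/test2 tables as the documentation example:
CREATE PROCEDURE bulk_load_in_batches()
LANGUAGE plpgsql
AS $$
DECLARE
    r RECORD;
    n bigint := 0;
BEGIN
    FOR r IN SELECT * FROM test2 ORDER BY x LOOP
        INSERT INTO test1 (a) VALUES (r.x);
        n := n + 1;
        IF n % 1000 = 0 THEN
            COMMIT;          -- make the last 1000 rows durable
        END IF;
    END LOOP;
    COMMIT;                  -- commit any remainder
END;
$$;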

For PostgreSQL 9.5 or newer you can use dynamic background workers provided by the pg_background extension, which effectively creates an autonomous transaction. Please refer to the extension's GitHub page. This solution is better than dblink. There is a complete guide on autonomous transaction support in PostgreSQL. There is also a third way to start an autonomous transaction in Postgres, but it requires some patching; see Peter Eisentraut's patch proposal for Oracle-style autonomous transactions.
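A minimal sketch of the pg_background pattern, assuming the extension is installed and reusing the dance_log table from above; the launched statement runs in its own worker with its own transaction:
CREATE EXTENSION pg_background;

-- Launch the INSERT in a background worker and block until it finishes;
-- the worker's commit is independent of the calling transaction.
SELECT * FROM pg_background_result(
    pg_background_launch('INSERT INTO dance_log(dance_entry) VALUES (''Tango'')')
) AS (result text);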

Related

MariaDB stored functions

I currently have all of my sql queries written in my PHP files, within each class method. Is it possible to move all of these queries into stored procedures or stored functions in the database & simply pass the corresponding values / arguments into them from PHP?
I have read some of the documentation & it still appears unclear.
Thank you. :)
DELIMITER $$
CREATE PROCEDURE `accounting`.`delete_invoice_line` (invoice_line_id INT)
BEGIN
    DELETE FROM invoice_line WHERE id = invoice_line_id;
END$$
DELIMITER ;
I had to figure out the format for creating the procedure. I am following this pattern and it appears to be working properly, then granting execute privileges to the user.
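For reference, that grant might look like the following, assuming a hypothetical application account 'app_user'@'localhost':
GRANT EXECUTE ON PROCEDURE `accounting`.`delete_invoice_line`
    TO 'app_user'@'localhost';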
Thank you all for your input. :)
Most queries can be moved into stored procedures, but probably not all of them. See MariaDB's documentation on which SQL statements cannot be used in stored procedures:
- ALTER VIEW; you can use CREATE OR REPLACE VIEW instead.
- LOAD DATA and LOAD TABLE. INSERT DELAYED is permitted, but the statement is handled as a regular INSERT.
- LOCK TABLES and UNLOCK TABLES.
- References to local variables within prepared statements inside a stored routine (use user-defined variables instead; see the sketch after this list).
- BEGIN (WORK) is treated as the beginning of a BEGIN ... END block, not a transaction, so START TRANSACTION needs to be used instead.
- The number of permitted recursive calls is limited to max_sp_recursion_depth. If this variable is 0 (the default), recursion is disabled. The limit does not apply to stored functions.
- Most statements that are not permitted in prepared statements are not permitted in stored programs. See Prepare Statement: Permitted statements for a list of statements that can be used. SIGNAL, RESIGNAL and GET DIAGNOSTICS are exceptions, and may be used in stored routines.
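A minimal sketch of the user-defined-variable workaround for the prepared-statement restriction; the procedure and its parameter are illustrative:
DELIMITER $$
CREATE PROCEDURE count_rows(IN tbl VARCHAR(64))
BEGIN
    -- Local variables cannot appear inside the prepared SQL text;
    -- build it into a user-defined (@) variable instead.
    -- (tbl is assumed trusted here; concatenation is injection-prone.)
    SET @sql = CONCAT('SELECT COUNT(*) FROM ', tbl);
    PREPARE stmt FROM @sql;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
END$$
DELIMITER ;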
Having said this, even though a SQL statement can be moved into a stored procedure, you may not necessarily want to do that due to code complexity or performance reasons.

BEGIN...END vs START TRANSACTION...COMMIT

What is the difference between doing:
START TRANSACTION
...
COMMIT
Or doing:
BEGIN
...
END
Does the latter autocommit, or what might be a practical example of using one or the other?
In both MySQL 5.7 and MySQL 8, BEGIN and END are the same as in T-SQL and represent a "compound statement", also known as "a block of code", just like curly braces in C, Java, C#, etc.
MySQL 5.7: https://dev.mysql.com/doc/refman/5.7/en/begin-end.html
MySQL 8.0: https://dev.mysql.com/doc/refman/8.0/en/begin-end.html
However, the BEGIN keyword is also (confusingly) overloaded as an alias for BEGIN WORK and START TRANSACTION, and their semantics depend on if they're being used inside a stored program or not:
Within all stored programs (stored procedures and functions, triggers, and events), the parser treats BEGIN [WORK] as the beginning of a BEGIN ... END block. Begin a transaction in this context with START TRANSACTION instead.
So:
START TRANSACTION
Always starts a transaction. You should prefer this syntax.
BEGIN:
If you're in a Stored Procedure, Function, Trigger or Event, then BEGIN by itself marks the start of a compound statement. You can only use START TRANSACTION to start a transaction.
If you're directly executing SQL against MySQL, then this also starts a transaction (as it's interpreted as BEGIN WORK). But it's silly and confusing to use it this way, so avoid it.
BEGIN WORK:
This is an alias for START TRANSACTION. I'd avoid using this completely to prevent confusion.
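A small sketch of the distinction inside a stored procedure; the accounts table, procedure name, and parameters are hypothetical:
DELIMITER $$
CREATE PROCEDURE transfer_funds(IN from_id INT, IN to_id INT, IN amt DECIMAL(10,2))
BEGIN                       -- compound-statement block, NOT a transaction
    START TRANSACTION;      -- this is what actually opens the transaction
    UPDATE accounts SET balance = balance - amt WHERE id = from_id;
    UPDATE accounts SET balance = balance + amt WHERE id = to_id;
    COMMIT;
END$$
DELIMITER ;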

Calling a REST API from a trigger or stored procedure in mysql?

I want to call a REST API with a POST method from a stored procedure or trigger in MySQL Server on Windows.
How do I perform this solely with MySQL?
You can use mysql-udf-http and then create a trigger like this:
delimiter $$
CREATE TRIGGER upd_check BEFORE UPDATE ON account
FOR EACH ROW
BEGIN
    IF NEW.amount > 0 THEN
        -- Build a JSON payload from the updated row
        SET @json = JSON_OBJECT('account_id', NEW.account_id, 'amount', NEW.amount);
        -- http_post() comes from the UDF; capture the result in a variable,
        -- since a trigger may not return a result set
        SET @result = http_post('http://restservice.example.com/account/post', @json);
    END IF;
END$$
delimiter ;
Basically you can't. And if you could (through the use of a UDF library) you shouldn't.
MySQL is a high-performance DB engine that you ought to respect for its core competencies. Other stakeholders are waiting for your query or transaction to wrap up, and they suffer while yours doesn't. It is not a webserver with callbacks from asynchronous calls. Triggers and stored procs need to get in and get out, as they are often used in a transactional setting.
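If the goal is to react to row changes, a common alternative is to have the trigger write to a queue table and let an external worker make the HTTP call. A minimal sketch, with the outbox table and trigger name being hypothetical:
CREATE TABLE api_outbox (
    id         INT AUTO_INCREMENT PRIMARY KEY,
    payload    JSON NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    sent_at    TIMESTAMP NULL
);

DELIMITER $$
CREATE TRIGGER queue_account_change AFTER UPDATE ON account
FOR EACH ROW
BEGIN
    IF NEW.amount > 0 THEN
        -- Stays inside the transaction; an external poller does the POST
        INSERT INTO api_outbox (payload)
        VALUES (JSON_OBJECT('account_id', NEW.account_id, 'amount', NEW.amount));
    END IF;
END$$
DELIMITER ;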

iBatis selectKey and transactions

I'm using iBatis with MySQL 5 in my Java app.
I have a persistent entity class
public class Entity {
    private int id;
    private String property;
    // setters and getters are omitted
}
Inserting a new entity is done as follows:
<insert id="insert" parameterClass="MyEntity">
    <selectKey resultClass="int" type="post" keyProperty="id">
        select LAST_INSERT_ID() as value
    </selectKey>
    {CALL insert_entity(#property#)}
</insert>
Transactions are managed inside the stored procedure as follows:
CREATE DEFINER=`user`@`%` PROCEDURE `insert_entity`(IN p_property VARCHAR(255))
BEGIN
    START TRANSACTION;
    INSERT INTO entities (property) VALUES (p_property);
    -- Do more stuff that requires the transaction: update more tables etc.
    COMMIT;
END;
What I'm trying to achieve is getting the newly inserted entity's id back into my Java code. With no concurrent DB updates, the setup above will do exactly what I want. The unclear part is what happens under concurrent DB updates, i.e. what is the exact timing and context in which iBatis executes the selectKey statement. I'd guess it will not be executed within the same transaction defined in the stored procedure, so it is possible that the id returned will not be the id of the entity I want.
The only solution I can think of is to avoid using selectKey:
<insert id="insert" parameterClass="MyEntity">
{CALL insert_entity(#property#, #id,mode=OUT#)}
</insert>
Returning the last inserted id from the stored procedure:
CREATE DEFINER=`user`@`%` PROCEDURE `insert_entity`(
    IN p_property VARCHAR(255),
    OUT p_id INTEGER(11)
)
BEGIN
    START TRANSACTION;
    INSERT INTO entities (property) VALUES (p_property);
    SELECT LAST_INSERT_ID() INTO p_id;
    -- Do more stuff that requires the transaction: update more tables etc.
    COMMIT;
END;
Is there any better solution for this problem?
Edited: MySQL documentation for LAST_INSERT_ID states:
The ID that was generated is maintained in the server on a per-connection basis. This means that the value returned by the function to a given client is the first AUTO_INCREMENT value generated for most recent statement affecting an AUTO_INCREMENT column by that client. This value cannot be affected by other clients, even if they generate AUTO_INCREMENT values of their own. This behavior ensures that each client can retrieve its own ID without concern for the activity of other clients, and without the need for locks or transactions.
So it seems the original solution with selectKey will work in all these cases. However, for complex stored procedures with multiple INSERT statements, the second approach is safer.
Firstly, I have to state the obvious: You should seriously try to avoid doing your own transaction management inside your stored procedure.
Assuming that this is your only option, I'd say the latter solution would be my preference, as it is clear to any developer that the id is returned from within the transaction.

MySQL trigger : is it possible to delete rows if table become too large?

When inserting a new row into a table T, I would like to check whether the table has grown beyond a certain threshold and, if so, delete the oldest record (creating some kind of FIFO in the end).
I thought I could simply use a trigger, but apparently MySQL doesn't allow a trigger to modify the table on which it is firing:
Code: 1442 Msg: Can't update table 'amoreAgentTST01' in stored function/trigger because it is already used by statement which invoked this stored function/trigger.
Here is the trigger I tried:
DELIMITER $$
CREATE TRIGGER test
AFTER INSERT ON amoreAgentTST01
FOR EACH ROW
BEGIN
    DECLARE table_size INTEGER;
    DECLARE new_row_size INTEGER;
    DECLARE threshold INTEGER;
    DECLARE oldest_update_time TIMESTAMP;
    SELECT SUM(OCTET_LENGTH(data)) INTO table_size FROM amoreAgentTST01;
    SELECT OCTET_LENGTH(NEW.data) INTO new_row_size;
    SELECT 500000 INTO threshold;
    -- FIFO means the OLDEST row goes, so take MIN(updatetime)
    SELECT MIN(updatetime) INTO oldest_update_time FROM amoreAgentTST01;
    IF (table_size + new_row_size) > threshold THEN
        DELETE FROM amoreAgentTST01 WHERE updatetime = oldest_update_time; -- and check it is not the current row
    END IF;
END$$
DELIMITER ;
Do you have any idea how to do this within the database?
Or is it clearly something to be done in my program?
Ideally you should have a dedicated archive strategy in a separate process that runs at off-peak times.
You could implement this either as a scheduled stored procedure (yuck) or an additional background worker thread within your application server, or a totally separate application service. This would be a good place to put other regular housekeeping jobs.
This has a few benefits. Apart from avoiding the trigger issue you're seeing, you should consider the performance implications of anything happening in a trigger. If you do many inserts, that trigger will do all that extra work and effectively halve the performance, not to mention the lock contention that will arise as other processes try to access the same table.
A separate process that does housekeeping work minimises lock contention, and allows the work to be carried out as a high-performance bulk operation, in a transaction.
One last thing - you should possibly consider archiving records to another table or database, rather than deleting them.
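If the cleanup must live in the database, a hedged sketch of the scheduled approach using the MySQL Event Scheduler (requires event_scheduler = ON); the table and column names come from the question, while the threshold, schedule, and batch size are illustrative:
DELIMITER $$
CREATE EVENT trim_amoreAgentTST01
ON SCHEDULE EVERY 10 MINUTE
DO
BEGIN
    DECLARE table_size INTEGER;
    SELECT SUM(OCTET_LENGTH(data)) INTO table_size FROM amoreAgentTST01;
    WHILE table_size > 500000 DO
        -- delete the oldest rows in bulk, FIFO-style
        DELETE FROM amoreAgentTST01 ORDER BY updatetime ASC LIMIT 100;
        SELECT SUM(OCTET_LENGTH(data)) INTO table_size FROM amoreAgentTST01;
    END WHILE;
END$$
DELIMITER ;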