SERIAL-like INT column - MySQL

I have an app where, depending on the type of transaction being added or updated, the ticket number may or may not increment. I can't use a SERIAL data type for the ticket number because it would increment regardless of the transaction type, so I defined the ticket number as an INT. So in a multi-user environment, if user A is adding or updating a transaction and user B is doing the same, I test for the transaction type, and if the next ticket number is required, then
LET ticket = (SELECT MAX(ticket) [WITH ADDLOCK or UPDLOCK?] FROM transactions) + 1
However, this has to be done exactly when the row is being committed or trouble will begin. Can you think of a better way of doing this with Informix, Oracle, MySQL, SQL Server, 4Js/Genero or another RDBMS? This is one of the main factors that will determine which RDBMS I'm going to rewrite my app in.

With the Informix DBMS, the SERIAL column will not change after it is inserted; indeed, you cannot update a SERIAL value at all. You can insert a new one with either 0 as the value - in which case a new value is generated - or you can insert some other value. If the other value already exists and there is a unique constraint, that will fail; if it does not exist, or if there is no unique constraint on the serial column, then it will succeed. If the value inserted is larger than the largest value previously inserted, then the next number to be inserted will be one larger again. If the number inserted is smaller, or negative, then there is no effect on the next number.
So, you could do your update without changing the value - no problem. If you need to change the number, you will have to do a delete and insert (or insert and delete), where the insert has a zero in it. If you prefer consistency and you use transactions, you could always delete and then (re)insert the row with the same number or with a zero to trigger a new number. This assumes you have a programming language running the SQL; I don't think you can tweak ISQL and Perform to do that automatically.
So, at this point, I don't see the problem on Informix.
With the appropriate version of IDS (anything that is supported), you can use SEQUENCE to control the values inserted too. This is based on the Oracle syntax and concept; DB2 also supports this. Other DBMS have other equivalent (but different) mechanisms for handling the auto-generated numbers.
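For example, with Oracle or a recent Informix (IDS), a sequence-based sketch might look like the following; the table and ticket column names come from the question, but the tran_type column and the 'SALE' value are only illustrative:
CREATE SEQUENCE ticket_seq START WITH 1 INCREMENT BY 1;
-- draw a new ticket number only when the transaction type requires one
INSERT INTO transactions (ticket, tran_type)
VALUES (ticket_seq.NEXTVAL, 'SALE');
-- for transaction types that do not get a new ticket, insert/update the row without touching ticket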

That's what sequences were created for, and they are supported by most databases (MySQL being the only one listed that does not have sequences - not 100% sure about Informix though).
Any algorithm that relies on the SELECT MAX(id) anti-pattern is either dead-slow in a multi-user environment or will simply not work correctly in a multi-user environment.
If you need to support MySQL as well, I'd recommend using the "native" "auto increment" type in each database (serial for PostgreSQL, auto_increment for MySQL, identity for SQL Server, sequence + trigger in Oracle and so on) and letting the driver return the generated ID value.
In JDBC there is a getGeneratedKeys() method and I'm sure other interfaces have something similar.
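As a hedged sketch of what that looks like in MySQL (the table and column names here are assumptions, not the original schema):
CREATE TABLE transactions (
    ticket INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    tran_type VARCHAR(20) NOT NULL
);
INSERT INTO transactions (tran_type) VALUES ('SALE');
-- LAST_INSERT_ID() is per-connection, so concurrent sessions cannot read each other's value
SELECT LAST_INSERT_ID();
On the JDBC side, this generated value is what getGeneratedKeys() returns after executing the INSERT with Statement.RETURN_GENERATED_KEYS.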

From your tags it's hard to tell what database you are using.
For SQL Server (since it's listed) I suggest
ticket_num = (SELECT MAX(ticket_number) FROM transactions with (updlock)) + 1
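Note that the UPDLOCK hint only helps if the SELECT and the subsequent INSERT run in the same transaction; a hedged sketch (HOLDLOCK added so the lock is held until commit; the tran_type column and 'SALE' value are illustrative):
BEGIN TRANSACTION;
    DECLARE @ticket INT;
    -- the lock hints block other sessions from reading the same MAX until we commit
    SELECT @ticket = ISNULL(MAX(ticket_number), 0) + 1
    FROM transactions WITH (UPDLOCK, HOLDLOCK);
    INSERT INTO transactions (ticket_number, tran_type) VALUES (@ticket, 'SALE');
COMMIT TRANSACTION;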

Related

Can I do Change Data Capture with MariaDb's Automatic Data Versioning

We're using MariaDb in production and we've added a MariaDb slave so that our data team can perform some ETL tasks from this slave to our data warehouse. However, MariaDb lacks a proper Change Data Capture (CDC) feature (i.e. the team wants to know which rows from the production tables changed since yesterday, in order to query only the rows that actually changed).
I saw that MariaDb 10.3 has an interesting feature that allows performing a SELECT on an older version of a table. However, I haven't found resources supporting the idea that it could be used for CDC - any feedback on this feature?
If not, we'll probably resort to streaming the slave's binlogs to our data warehouse, but that looks challenging.
Thanks for your help!
(As a supplement to Stefan's answer)
Yes, system versioning can be used for CDC, because the validity period given by ROW_START (when the object starts to be valid) and ROW_END (when the object becomes invalid) can be interpreted to tell whether an INSERT, UPDATE or DELETE happened. But it's more cumbersome than alternative CDC variants.
INSERT:
Object was found for the first time
ROW_START is the insertion time
UPDATE:
The object has been seen before (this is not its first version)
ROW_START is the update time
DELETE:
ROW_END lies in the past
there is no newer entry for this object
I'll add a picture to clarify this.
You can see that this versioning is space-saving because the information about the INSERT and DELETE of an object can be combined in one row, but checking for DELETEs is costly.
In the example above I used a table with a clear primary key, so checking for the same object is easy: just look at the id. If you want to capture changes in tables with a key combination, this can make the whole process more annoying.
Edit: another point is that the protocol data is kept in the same table as the "real" data. Maybe this is faster for an INSERT than known alternative solutions such as tracking via TRIGGER (like here), but if changes are made quite frequently on the table and you want to process/analyse the CDC data, this can cause performance problems.
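As a rough sketch of how the ROW_START/ROW_END interpretation above could be turned into a daily CDC query (using the accounts table created in the next answer; the one-day window is just an example):
-- versions that started in the last day (INSERTs and the "after" side of UPDATEs),
-- plus versions that ended in the last day (the "before" side of UPDATEs and DELETEs)
SELECT id, name, amount, ROW_START, ROW_END
FROM accounts FOR SYSTEM_TIME ALL
WHERE ROW_START >= NOW() - INTERVAL 1 DAY
   OR (ROW_END >= NOW() - INTERVAL 1 DAY AND ROW_END <= NOW());
Current row versions have a ROW_END far in the future, so the second condition only matches versions that were actually closed.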
MariaDB has supported System-Versioned Tables since version 10.3.4. System-versioned tables are specified in the SQL:2011 standard. They can be used to automatically capture previous versions of rows. Those versions can then be queried to retrieve their values as they were at a specific point in time.
The following text and code example are from the official MariaDB documentation:
With system-versioned tables, MariaDB Server tracks the points in time
when rows change. When you update a row on these tables, it creates a
new row to display as current without removing the old data. This
tracking remains transparent to the application. When querying a
system-versioned table, you can retrieve either the most current
values for every row or the historic values available at a given point
in time.
You may find this feature useful in efficiently tracking the time of
changes to continuously-monitored values that do not change
frequently, such as changes in temperature over the course of a year.
System versioning is often useful for auditing.
By adding SYSTEM VERSIONING to a newly created table or to an already existing one (using ALTER TABLE), the table is extended with row_start and row_end timestamp columns, which allow retrieving the version of a record that was valid between the start and end timestamps.
CREATE TABLE accounts (
id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(255),
amount INT
) WITH SYSTEM VERSIONING;
It is then possible to retrieve data as it was at a specific time (with SELECT * FROM accounts FOR SYSTEM_TIME AS OF '2019-06-18 11:00';), all versions within a specific time range
SELECT * FROM accounts
FOR SYSTEM_TIME
BETWEEN (NOW() - INTERVAL 1 YEAR)
AND NOW();
or all versions at once:
SELECT * FROM accounts
FOR SYSTEM_TIME ALL;
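For an existing production table (as in the question), versioning can also be switched on afterwards; a minimal sketch, assuming the accounts table already exists without versioning and that your MariaDB version supports DELETE HISTORY:
-- turn system versioning on for an existing table
ALTER TABLE accounts ADD SYSTEM VERSIONING;
-- old history can be purged once the ETL has consumed it (the timestamp is just an example)
DELETE HISTORY FROM accounts BEFORE SYSTEM_TIME '2019-06-01 00:00:00';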

How to create a trigger to prevent inserting duplicate rows in MySQL 5.0.27?

This question follows up on a previous question:
Is it possible to make a field in a MySQL table have max length = 700, be Unicode (3 bytes per character) and be unique?
I have a table MyText which has an ID column, a text column and many other columns:
Id - text - type ... & other fields
1 - this is my text
2 - xxxx
I want the text column to support Unicode, with a maximum length of 700 Unicode characters. I can't set UNIQUE (text) because MySQL only supports a 765-byte max length for a unique column, while Unicode takes 3 bytes per character, so I would need a 2100-byte (700*3) unique column.
So, the solution is to create a trigger that prevents the user from inserting a duplicate. For example, if a user inserts "THIS is My Text" (the comparison should be case-insensitive) into the MyText table, then MySQL should abort ALL queries that contain that insert statement and generate an SQLException to prevent the system from running the other queries.
OK, suppose you have to run a series of SQL statements in your Java code:
insert into MyText('THIS is My Text',1);
insert into OtherTable ('some text');
update otherTable...
Then, when the system executes insert into MyText('THIS is My Text',1);, it should stop running the other queries below it.
Also, some people suggest adding a prefix index to help MySQL do the SELECT quicker, but I am not sure it is necessary since I have the ID column, which is indexed.
Note: MySQL 5.0.27 is a 2006 release, which is pretty old, but I love it.
So how do I create a trigger that meets my requirement, or how else can I fix my problem?
I would recommend not using a trigger for this because of performance reasons.
Instead, create an additional column to store an MD5 or SHA1 hash of your value, and make that column unique using a constraint.
Both hashing functions (MD5() and SHA1()) exist in your version of MySQL. Alternatively, if it's easier to integrate this in your Java code, you could do the hashing in Java using the MessageDigest class.
The part of your question where you indicate that no further queries should be executed if the insert statement fails because of a duplicate is best handled using transactions. These are also supported in Java using plain JDBC or most ORM frameworks.
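A hedged sketch of that hash-column approach (the text_hash column name is an assumption, and the insert assumes Id is generated elsewhere; MD5() produces a 32-character hex string, and LOWER() makes the comparison case-insensitive, as requested):
ALTER TABLE MyText
    ADD COLUMN text_hash CHAR(32) NOT NULL,
    ADD UNIQUE KEY uq_mytext_text_hash (text_hash);
-- always store the hash of the lower-cased text alongside the text itself
INSERT INTO MyText (text, type, text_hash)
VALUES ('THIS is My Text', 1, MD5(LOWER('THIS is My Text')));
-- inserting the same text again, in any letter case, now fails with a duplicate-key error,
-- and the surrounding transaction can be rolled back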

Manually increment primary key - transactions and race conditions

This may not be a real world issue but is more like a learning topic.
Using PHP, MySQL and PDO, I know all about auto_increment and lastInsertId(). Consider that the primary key has no auto_increment attribute and we have to use something like SELECT MAX(id) FROM table in order to retrieve the last id, increment it manually and then INSERT INTO table (id) VALUES (:lastIdPlusOne). Wrap the whole code in beginTransaction and commit.
Is this approach safe? If users A and B load this script at the same time, what will happen in the end? Will both transactions fail? Or will both be successful (for instance, if the last id was 10, A will insert 11 and B will insert 12)?
Note that since I am a PHP & MySQL developer, I am more interested in MySQL behaviour in this case.
If both get the same max, then the one that inserts first will succeed, and the other(s) will fail.
To overcome this issue without using auto_increment fields, you may use a BEFORE INSERT trigger that does the job (new.id = max), i.e. the same logic, but in a trigger, so the DB server is the one that controls it.
Not sure though if this is 100% safe in a master-master replication environment in case of a server failure.
This is #eggyal's comment, which I quote here:
You must ensure that you use a locking read to fetch the MAX() in the first (select) query; it will then block until the transaction is committed. However, this is very poor design and should not be used in a production system.
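For completeness, a minimal sketch of that locking read in MySQL/InnoDB (the table name mytable and its single id column are placeholders; as noted above, this serialises writers and auto_increment remains the better production choice):
START TRANSACTION;
-- FOR UPDATE makes this a locking read: a second transaction running the same
-- SELECT blocks here until we COMMIT, so it cannot see the same MAX(id)
SELECT COALESCE(MAX(id), 0) + 1 INTO @next_id FROM mytable FOR UPDATE;
INSERT INTO mytable (id) VALUES (@next_id);
COMMIT;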

Race condition in MySQL SELECT SQL

What I'm trying to accomplish seems simple:
Db type: MyISAM
Table Structure: card_id, status
Query: select an unused card_id from a table, and set the row as "used".
Is it a race condition when two queries run at the same time and, before the status is updated, the same card_id is fetched twice?
I did some searching already. It seems LOCK TABLES is a solution, but it's overkill to me and needs the LOCK TABLES privilege.
Any Idea?
Thanks!
It really depends on what statements you are running.
For plain old UPDATE statements against a MyISAM table, MySQL will obtain a lock on the entire table, so there is no "race" condition between two sessions there. One session will wait until the lock is released, and then proceed with its own update (or will wait for a specified period, and abort with a "timeout").
BUT, if what you are asking about is two sessions both running a SELECT against a table to retrieve an identifier for a row to be updated, with both sessions retrieving the same row identifier and then both attempting to update the same row, then yes, that's a definite possibility, and one which really does have to be considered.
If that condition is not addressed, then it's basically going to be a matter of "last update wins", the second session will (potentially) overwrite the changes made by a previous update.
If that's an untenable situation for your application, then that does need to be addressed, either with a different design, or with some mechanism that prevents the second update from overwriting the update applied by the first update.
One approach, as you mentioned, is to avoid this situation by first obtaining an exclusive lock on the table (using a LOCK TABLES statement), then running a SELECT to obtain an identifier, and then running an UPDATE to update the identified row, and then finally, releasing the lock (using an UNLOCK TABLES statement.)
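A hedged sketch of that sequence for this question's table (the table name cards and the literal card id are placeholders):
LOCK TABLES cards WRITE;
SELECT card_id FROM cards WHERE status = 'unused' LIMIT 1;
-- the application reads the card_id returned above, then marks that exact row:
UPDATE cards SET status = 'used' WHERE card_id = 123;  -- 123 = the value just selected
UNLOCK TABLES;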
That's a workable approach for some low volume, low concurrency applications. But it does have some significant drawbacks. Of primary concern is reduced concurrency, due to the exclusive locks obtained on a single resource, which has the potential to cause a performance bottleneck.
Another alternative is a strategy called "optimistic locking". (As opposed to the previously described approach, which could be described as "pessimistic locking".)
For an "optimistic locking" strategy, an additional "counter" column is added to the table. Whenever an update is applied to a row in the table, the counter for that row is incremented by one.
To make use of this "counter" column, when a query retrieves a row that will (or might) be updated later, that query also retrieves the value of the counter column.
When an UPDATE is attempted, the statement also compares the current value of the "counter" column in the row with the previously retrieved value of the counter column. We just include a predicate in the WHERE clause of the UPDATE statement. For example:
UPDATE mytable
SET counter = counter + 1
, col = :some_new_value
WHERE id = :previously_fetched_row_identifier
AND counter = :previously_fetched_row_counter
If some other session has applied an update to the row we are attempting to update (sometime between the time our session retrieved the row and before our session is attempting to do the update), then the value of the "counter" column on that row will have been changed.
The predicate on our UPDATE statement checks for that, and if the "counter" has been changed, that will cause our update to NOT be applied. We can then detect this condition (i.e. the affected rows count will be a 0 rather than a 1) and our session can take some appropriate action. ("Hey! Some other session updated a row we were intending to update!")
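A compact sketch of that read-then-check flow (column names follow the UPDATE above; the literal id is a placeholder):
-- fetch the row together with its current counter value and remember both on the client
SELECT id, counter, col FROM mytable WHERE id = 42;
-- ... run the UPDATE shown above with those remembered values, then:
SELECT ROW_COUNT();  -- 1 = our update won; 0 = another session changed the row first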
There are some good write-ups on how to implement an "optimistic locking" strategy.
Some ORM frameworks (e.g. Hibernate, JPA) provide support for this type of locking strategy.
Unfortunately, MySQL does NOT provide support for a RETURNING clause in an UPDATE statement, such as:
UPDATE ...
SET status = 'used'
WHERE status = 'unused'
AND ROWNUM = 1
RETURNING card_id INTO ...
Other RDBMS (e.g. Oracle) do provide that kind of functionality. With that feature of the UPDATE statement available, we can simply run the UPDATE statement to both 1) locate a row with status = 'unused', 2) change the value of status = 'used', and 3) return the card_id (or whatever columns we want) of the row the we just updated.
That gets around the problem of having to run a SELECT and then running a separate UPDATE, with the potential of some other session updating the row between our SELECT and our UPDATE.
But the RETURNING clause is not supported in MySQL, and I've not found any reliable way of emulating this type of functionality from within MySQL.
This may work for you
I'm not entirely sure why I previously abandoned this approach using user variables (I mentioned above that I had played around with it). I think maybe I needed something more general, which would update more than one row and return a set of id values. Or maybe there was something that wasn't guaranteed about the behaviour of user variables. (Then again, I only reference user variables in carefully constructed SELECT statements; I don't use user variables in DML, possibly because I don't have a guarantee of their behaviour.)
Since you are interested in exactly ONE row, this sequence of three statements may work for you:
SELECT @id := NULL ;
UPDATE mytable
SET card_id = (@id := card_id)
, status = 'used'
WHERE status = 'unused'
LIMIT 1 ;
SELECT ROW_COUNT(), @id AS updated_card_id ;
It's IMPORTANT that these three statements run in the SAME database session (i.e. keep a hold of the database session; don't let go of it and get a new one.)
First, we initialize a user variable (@id) to a value which we won't confuse with a real card_id value from the table. (A SET @id := NULL statement would work as well, without returning a result like the SELECT statement does.)
Next, we run the UPDATE statement to 1) find one row where status = 'unused', 2) change the value of the status column to 'used', and 3) set the value of the @id user variable to the card_id value of the row we changed. (We'd want that card_id column to be integer type, not character, to avoid any possible character set translation issues.)
Next, we run a query to get the number of rows changed by the previous UPDATE statement, using the ROW_COUNT() function (we need to verify on the client side that this is 1), and retrieve the value of the @id user variable, which will be the card_id value from the row that was changed.
After I posted this question, I thought of a solution which is exactly the same as the one you mentioned at the end. I used an update statement, "update TABLE set status = 'used' where status = 'unused' limit 1", which returns the primary id of the TABLE, and then I can use this primary id to get the card_id. Say two update statements occur at the same time; as you said, "MySQL will obtain a lock on the entire table, so there is no 'race' condition between two sessions there", so this should solve my issue. But I am not sure why you said that MySQL does NOT provide support for a RETURNING-style statement.

MySQL and implementing something close to sequences?

I am currently in the process of moving from Oracle to MySQL and would like some advice on whether the way I am implementing something similar to sequences in MySQL is a good one.
Essentially, the way I am currently going to implement it is by having a separate table in MySQL for each sequence in Oracle, with a single column which represents the last_number, and incrementing this column whenever I insert a new row. That's one way; another way I could go about it is by creating a single table with several rows, one representing each sequence, and incrementing each row separately whenever I do an insert.
Another, simpler way of doing it would be to just do a SELECT MAX()+1 on the relevant column when inserting data.
I'm basically thinking of switching to the SELECT MAX()+1 option as it seems simpler to implement, but I would like some advice on which of these options you think would be the best, and whether there are any pitfalls I am currently not aware of when using SELECT MAX()+1.
Also, the reason I am not using auto_increment and the last_insert_id() function is that I want to follow the ANSI standard.
Thanks.
First of all: the MAX()+1 version is NOT guaranteed to give you a sequence if you use transactions at a high isolation level.
The way we typically use sequences (if we can't avoid them) is to create a table with an AUTO_INCREMENT value, INSERT INTO it, SELECT last_insert_id(), and then DELETE FROM the table WHERE field < $LASTINSERTID. This is of course done in a stored procedure.
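A minimal sketch of that emulated sequence (the ticket_seq table name is an assumption):
CREATE TABLE ticket_seq (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
) ENGINE=InnoDB;
-- typically wrapped in a stored procedure, as described above
INSERT INTO ticket_seq VALUES (NULL);
SET @next_id = LAST_INSERT_ID();             -- per-connection, so safe under concurrency
DELETE FROM ticket_seq WHERE id < @next_id;  -- keep the helper table small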
There is a read consistency problem, in that two sessions both running ...
insert into ... select max(..)+1 from ...
... at the same time both see the same value of max(...), hence they both try to insert the same new value.
You have the same problem with your table-of-maxima method, and you have to use a locking mechanism to avoid multiple sessions reading the same value. This leads to a concurrency problem where inserts to the table are serialised.