I need to create a unique account number for users signing up on my web application. The account number is created by incrementing a number:
user 1 - 9898000000001
user 2 - 9898000000002
...
I have the following stored procedure in a MySQL database.
Consider bank_id as '9898' below.
BEGIN
    SET @initialComId = '0000001';
    SET @table_value = NULL;
    SET @t = NULL;
    SELECT MAX(company_va) INTO @table_value FROM virtual_account_numbers;
    IF (@table_value IS NULL AND bank_id = 1) THEN
        SET @newComId = CONCAT(bank_id, '000001');
        INSERT INTO virtual_account_numbers (company_va, partner_banks_id) VALUES (@newComId, bank_id);
        SELECT company_va FROM virtual_account_numbers ORDER BY virtual_account_numbers_id DESC LIMIT 1;
    END IF;
END
With the above stored procedure, I am running into a deadlock if 10 users register at the same time.
Is there a better solution to this? The account number cannot be a randomly generated number; it has to be an incremented one.
All together:
Let the database manage the sequence via AUTO_INCREMENT.
Use LPAD to fill up with zeros, e.g. LPAD('1', 10, '0') = 0000000001.
Write the new account number into a VARCHAR field.
To aid searching, put an INDEX on the account number field.
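A minimal sketch of those steps, assuming the table and column names from the question and treating '9898' as the bank prefix (the 9-digit padding matches the 13-character numbers shown above):
CREATE TABLE virtual_account_numbers (
    virtual_account_numbers_id INT AUTO_INCREMENT PRIMARY KEY,
    partner_banks_id INT NOT NULL,
    company_va VARCHAR(20) NOT NULL DEFAULT '',
    INDEX idx_company_va (company_va)
);

-- Insert first, then build the account number from the AUTO_INCREMENT value:
INSERT INTO virtual_account_numbers (partner_banks_id) VALUES (1);
UPDATE virtual_account_numbers
SET company_va = CONCAT('9898', LPAD(virtual_account_numbers_id, 9, '0'))
WHERE virtual_account_numbers_id = LAST_INSERT_ID();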
I have two tables in MySQL like this
Users -> user_id , user_name , number_of_comments
Comments -> comment_id , comment , user_id
Is there a way to get the number of comments for each user and update it in the number_of_comments column automatically?
Not recommended, but it solves the problem nevertheless. For learning purposes only.
CREATE TRIGGER tr_ai_update_n_of_comments
AFTER INSERT ON comments
FOR EACH ROW
UPDATE users
SET number_of_comments = ( SELECT COUNT(*)
                           FROM comments
                           WHERE comments.user_id = NEW.user_id )
WHERE user_id = NEW.user_id;
If rows in comments may be updated (with the user_id value changing) and/or deleted, then create similar AFTER DELETE and AFTER UPDATE triggers.
PS. I strongly recommend removing the users.number_of_comments column entirely and calculating the actual comment count with a query when needed.
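For example, a query along these lines (using the two tables above) returns the current counts whenever they are needed:
SELECT u.user_id, u.user_name, COUNT(c.comment_id) AS number_of_comments
FROM users u
LEFT JOIN comments c ON c.user_id = u.user_id
GROUP BY u.user_id, u.user_name;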
If you agree that the value may be approximate (slightly different from the exact one), then you can use an incremental trigger.
CREATE TRIGGER tr_ai_update_n_of_comments
AFTER INSERT ON comments
FOR EACH ROW
UPDATE users
SET number_of_comments = number_of_comments + 1
WHERE user_id = NEW.user_id;
But just in case, also create a service stored procedure (or event) that periodically recalculates the accumulated value; a sketch of such an event is below.
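For instance, a nightly event like this could resynchronize the counters (a sketch; it assumes the event scheduler is enabled):
CREATE EVENT ev_recalculate_n_of_comments
ON SCHEDULE EVERY 1 DAY
DO
    UPDATE users u
    SET u.number_of_comments = ( SELECT COUNT(*)
                                 FROM comments c
                                 WHERE c.user_id = u.user_id );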
I have access to a reporting dataset (that I don't control) that we retrieve daily from a cloud service and store in a MySQL db to run advanced reporting and combine reports locally with 3rd-party data visualization software.
The data often has duplicate values on an id field that create problems when joining with other tables for data analysis.
For example:
+-------------+----------+------------+----------+
| workfile_id | zip_code | date | total |
+-------------+----------+------------+----------+
| 78002 | 90210 | 2016-11-11 | 2010.023 |
| 78002 | 90210 | 2016-12-22 | 427.132 |
+-------------+----------+------------+----------+
Workfile_id is duplicated because this is the same job, but additional work on the job was performed in a different month than the original work. Instead of the software creating another workfile_id for the job, the same one is used.
Doing joins with other tables on workfile_id is problematic when more than one of the same id is present, so I was wondering if it is possible to do one of two things:
Make duplicate workfile_ids unique. Have SQL append a number to the workfile_id when a duplicate is found. The first duplicate (i.e. the second occurrence of the same workfile_id) would need .01 appended to the end of the workfile_id. Then later, if another duplicate is inserted, the appended number should auto-increment, to .02 and so on for any subsequent duplicate workfile_id. This method would work best with our data, but I'm curious how difficult this would be for the server from a performance perspective. If I could schedule the alteration to take place after the data is inserted, to speed up the initial data insert, that would be ideal.
Sum the total columns and remove the duplicate workfile_id row. Have a task that identifies duplicate workfile_ids and sums the financial columns of the duplicates, replacing the original total with the new sum and deleting the 'new' row after the columns have been added together.
This is messier from a data preservation perspective, but is acceptable if the first solution isn't possible.
My assumption is that there will be significant overhead in having the server compare new workfile_id values to all existing workfile_id values each time data is inserted, but our dataset is small and new data is only inserted once daily, at 1:30am. It should also be feasible to limit the duplicate workfile_id search to rows inserted within the last 6 months.
Is finding duplicates in a column (workfile_id) and appending an auto-incrementing value onto the workfile_id possible?
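For what it's worth, the first option could also run as a scheduled post-load pass instead of a trigger. A rough sketch, assuming the table is called salesjournal (as in the edit below), that it has an AUTO_INCREMENT primary key id reflecting insert order, and that workfile_id is stored as text so a suffix like .01 can be appended:
-- 1) Work out, per not-yet-suffixed row, how many earlier rows share the same base workfile_id
CREATE TEMPORARY TABLE workfile_renumber AS
SELECT s2.id,
       (SELECT COUNT(*)
        FROM salesjournal d
        WHERE SUBSTRING_INDEX(d.workfile_id, '.', 1) = s2.workfile_id
          AND d.id < s2.id) AS prior_dups
FROM salesjournal s2
WHERE s2.workfile_id NOT LIKE '%.%';

-- 2) Append .01, .02, ... to the second and later occurrences; the first occurrence keeps the bare id
UPDATE salesjournal s
JOIN workfile_renumber r ON r.id = s.id
SET s.workfile_id = CONCAT(s.workfile_id, '.', LPAD(r.prior_dups, 2, '0'))
WHERE r.prior_dups > 0;

DROP TEMPORARY TABLE workfile_renumber;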
EDIT:
I'm having trouble getting my trigger to work based on sdsc81's answer below.
Any ideas?
DELIMITER //
CREATE TRIGGER append_subID_to_workfile_ID_salesjournal
AFTER INSERT
ON salesjournal FOR EACH ROW
BEGIN
    SET @COUNTER = ( SELECT (COUNT(*)-1) FROM salesjournal WHERE workfile_id = NEW.workfile_id );
    IF @COUNTER > 1 THEN
        UPDATE salesjournal SET workfile_id = CONCAT(workfile_id, @COUNTER) WHERE id = NEW.id;
    END IF;
END;//
DELIMITER ;
It's hard to know if the trigger isn't working at all, or if just the code in the trigger isn't working. I get no errors on insert. Is there any way to debug trigger errors?
Well, everything is possible ;)
You don't control the dataset, but you can modify the database, right?
Then you could use a trigger after every insert of a new value and update it if it's a duplicate. Something like:
SET @COUNTER = ( SELECT (COUNT(*)-1) FROM *your_table* WHERE workfile_id = NEW.workfile_id );
IF @COUNTER > 1 THEN
UPDATE *your_table* SET workfile_id = CONCAT(workfile_id, @COUNTER) WHERE some_unique_id = NEW.some_unique_id;
END IF;
If there is only one insert per day, and an index is defined on the workfile_id column (for example, as shown below), then it shouldn't be any problem for your server at all.
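For example, a hypothetical index on the salesjournal table from the edit above:
CREATE INDEX idx_workfile_id ON salesjournal (workfile_id);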
Also, you could implement the second solution, doing:
DELIMITER //
CREATE TRIGGER append_subID_to_workfile_ID_salesjournal
AFTER INSERT ON salesjournal FOR EACH ROW
BEGIN
    SET @COUNTER = ( SELECT (COUNT(*)-1) FROM salesjournal WHERE workfile_id = NEW.workfile_id );
    IF @COUNTER > 1 THEN
        UPDATE salesjournal SET total = total + NEW.total WHERE workfile_id = NEW.workfile_id AND id <> NEW.id;
        DELETE FROM salesjournal WHERE id = NEW.id;
    END IF;
END;//
DELIMITER ;
Hope this helps.
I need to create an entity form which has a unique identification column in the database. It's not a primary key column, and I need to display it on the form creation page. I've set this column as UNIQUE and NOT NULL. Now, whenever I create a new user, employee, or any other entity, I need to generate a sequence number in a format like the following and display it in the form:
ID_001, ID_002 ... ID_00N and so on.
EMP_001, EMP_002 ... EMP_00N and so on.
When the three-digit sequence number reaches its maximum of 999, the sequence should switch to four digits until 9999 is reached, so the employee code will be like EMP_1000. Simply reading the last insert id when creating the form will not work if more than one user is creating records simultaneously; there would be a conflict. I thought about creating a new table like sequence_generator, where I store a key-value pair of the entity and the last inserted id. Then, whenever the next insert happens, I can read from this table and increment by 1 for the new sequence number.
So how do I best implement this sequence generation, which must also be unique, in Java/MySQL/MyBatis/Spring?
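For reference, a minimal sketch of the sequence_generator idea described above, using SELECT ... FOR UPDATE inside a transaction so two concurrent requests cannot read the same value (the table and column names here are illustrative, not part of the question):
CREATE TABLE sequence_generator (
    entity_name VARCHAR(30) PRIMARY KEY,
    last_number INT NOT NULL
);
INSERT INTO sequence_generator VALUES ('EMP', 0);

-- Executed per request (e.g. from MyBatis) to reserve the next number:
START TRANSACTION;
SELECT last_number + 1 INTO @next
FROM sequence_generator WHERE entity_name = 'EMP' FOR UPDATE;
UPDATE sequence_generator SET last_number = @next WHERE entity_name = 'EMP';
-- Pad to three digits while below 1000, then use the plain number (EMP_999 -> EMP_1000)
SELECT CONCAT('EMP_', IF(@next < 1000, LPAD(@next, 3, '0'), @next)) AS employee_code;
COMMIT;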
I would create my own sequencing implementation using triggers. I am not very familiar with MySQL, so take my examples as pseudo-code. Your trigger would look like this:
Create a table with no auto-increment. Example:
CREATE TABLE EMPLOYEE (
ID CHAR(30), NAME CHAR(30)
)
Create a trigger with the logic to auto-increment your columns. Similar to:
DELIMITER //
CREATE TRIGGER EMPLOYEE_SEQUENCE BEFORE INSERT ON EMPLOYEE
FOR EACH ROW
BEGIN
    SET @PREPENDED_ZEROS = '';
    -- 'EMP_' is 4 characters, so the numeric part starts at position 5
    SET @ID_AS_NUMBER = (SELECT COALESCE(MAX(CAST(SUBSTRING(ID, 5) AS UNSIGNED)), 0) + 1 FROM EMPLOYEE);
    IF @ID_AS_NUMBER < 10 THEN
        SET @PREPENDED_ZEROS = '00';
    ELSEIF @ID_AS_NUMBER < 100 THEN
        SET @PREPENDED_ZEROS = '0';
    END IF;
    SET NEW.ID = CONCAT('EMP_', @PREPENDED_ZEROS, @ID_AS_NUMBER);
END; //
DELIMITER ;
When I try to run the update query below, it takes about 40 hours to complete. So I added a time limitation (see "Update query with time limitation" below), but it still takes nearly the same time to complete. Is there any way to speed up this update?
EDIT: What I really want to do is only get logs between some specific dates and run this update query on those records.
create table user
(userid varchar(30));
create table logs
( log_time timestamp,
log_detail varchar(100),
userid varchar(30));
insert into user values('user1');
insert into user values('user2');
insert into user values('user3');
insert into user values('');
insert into logs values(now(), 'no user mentioned', 'user3');
insert into logs values(now(), 'inserted by user2', 'user2');
insert into logs values(now(), 'inserted by user3', null);
Table before Update
log_time | log_detail        | userid |
---------|-------------------|--------|
..       | no user mentioned | user3  |
..       | inserted by user2 | user2  |
..       | inserted by user3 | (null) |
Update query
update logs join user
set logs.userid=user.userid
where logs.log_detail LIKE concat("%",user.userID,"%") and user.userID != "";
Update query with time limitation
update logs join user
set logs.userid = IF (logs.log_time between '2015-08-11 00:39:41' AND '2015-08-01 17:39:44', user.userID, null)
where logs.log_detail LIKE concat("%",user.userID,"%") and user.userID != "";
Table after update
log_time | log_detail        | userid |
---------|-------------------|--------|
..       | no user mentioned | user3  |
..       | inserted by user2 | user2  |
..       | inserted by user3 | user3  |
EDIT: Original question: Sql update statement with variable.
Log tables can easily fill up with tons of rows of data each month, and even the best indexing won't help, especially in the case of a LIKE operator. Your log_detail column is 100 characters long and your search pattern is CONCAT("%",user.userID,"%"). Using a function in a SQL command can slow things down because the function is doing extra computations. And what you're searching for, if your userID is John, is %John%. So your query will scan every row in that table because the index is semi-useless. If you didn't have the first %, the query would be able to utilize its index efficiently. As written, your query will, in effect, do an INDEX SCAN as opposed to an INDEX SEEK.
For more information on these concepts, see:
Index Seek VS Index Scan
Query tuning a LIKE operator
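To see the difference yourself, compare the two query plans (this assumes an index exists on log_detail; only the second pattern can use it):
EXPLAIN SELECT * FROM logs WHERE log_detail LIKE '%user3%';  -- leading wildcard: full scan
EXPLAIN SELECT * FROM logs WHERE log_detail LIKE 'user3%';   -- prefix search: index range scan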
Alright, what can you do about this? Two strategies.
Option 1 is to limit the number of rows that you're searching through. You had the right idea using time limitations to reduce the number of rows to search through. What I would suggest is to put the time limitation as the first expression in your WHERE clause. Most databases execute the first expression first, so when the second expression kicks in, it'll only scan through the rows returned by the first expression.
update logs join user
set logs.userid=user.userid
where logs.log_time between '2015-08-01' and '2015-08-11'
and logs.log_detail LIKE concat('%',user.userID,'%')
Option 2 depends on your control of the database. If you have total control (and the time and money), MySQL has a feature called Auto-Sharding. This is available in MySQL Cluster and MySQL Fabric. I won't go over those products in much detail, as the links provided below explain them much better than I could summarize, but the idea behind sharding is to split the rows into horizontal tables, so to speak. The idea is that you're not searching through one long database table, but instead across several sister tables at the same time. Searching through 10 tables of 10 million rows is faster than searching through 1 table of 100 million rows.
Database Sharding - Wikipedia
MySQL Cluster
MySQL Fabric
First, the right place to put the time limitation is in the where clause, not an if:
update logs l left join
user u
on l.log_detail LIKE concat("%", u.userID)
set l.userid = u.userID
where l.log_time between '2015-08-11 00:39:41' AND '2015-08-01 17:39:44';
If you want to set the others to NULL do this before:
update logs l
set l.userid = NULL
where l.log_time not between '2015-08-11 00:39:41' AND '2015-08-01 17:39:44';
But, if you really want this to be fast, you need to use an index for the join. It is possible that this will use an index on users(userid):
update logs l left join
user u
on cast(substring_index(l.log_detail, ' ', -1) as signed) = u.userID
set l.userid = u.userID
where l.log_time between '2015-08-11 00:39:41' AND '2015-08-01 17:39:44';
Look at the EXPLAIN for the equivalent select (sketched below). It is really important that the cast() be to the same type as UserId.
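For example, the equivalent select (mirroring the join condition above, with the time bounds written low-to-high) would be:
EXPLAIN
SELECT l.log_time, l.log_detail, u.userID
FROM logs l LEFT JOIN
     user u
     ON cast(substring_index(l.log_detail, ' ', -1) as signed) = u.userID
WHERE l.log_time BETWEEN '2015-08-01 17:39:44' AND '2015-08-11 00:39:41';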
You could add a new column called log_detail_reverse and set a trigger so that when you insert a new row, you also store the log_detail column in reverse character order using the MySQL function REVERSE(). In your update query, you then reverse the userID search as well. The net effect is that you transform your INDEX SCAN into an INDEX SEEK, which will be much faster.
update logs join user
set logs.userid=user.userid
where logs.log_time between '2015-08-01' and '2015-08-11'
and logs.log_detail_reverse LIKE concat(reverse(user.userID), '%')
MySQL Trigger
The Trigger could be something like:
DELIMITER //
CREATE TRIGGER log_details_in_reverse
BEFORE INSERT
ON logs FOR EACH ROW
BEGIN
    -- Store log_detail in reverse character order alongside the row being inserted.
    -- A BEFORE INSERT trigger can set NEW.* directly, so no primary-key lookup or
    -- self-UPDATE of the logs table is needed.
    SET NEW.log_detail_reverse = REVERSE(NEW.log_detail);
END; //
DELIMITER ;
One thing about speeding up updates is not to update records that need no update. You only want to update records in a certain time range where the user doesn't match the user mentioned in the log text. Hence limit the records to be updated in your where clause.
update logs
set userid = substring_index(log_detail, ' ', -1)
where log_time between '2015-08-11 00:39:41' AND '2015-08-01 17:39:44'
and not userid <=> substring_index(log_detail, ' ', -1);
We have an Orders table with an identity column (OrderID), but our order number is composed of OrderType (2 chars), OrderYear (2 chars) and OrderID (6 chars), 10 chars in total (i.e. XX12123456).
This counter has a limitation: once the identity reaches 999999 as OrderID, the next order would have an ID composed of 7 chars. Obviously we cannot have duplicate order IDs.
So we have created a table prefilled with progressive OrderID and OrderYear values (OrderID from 100000 to 999999, OrderYear from 12 to 16, for instance): a stored procedure begins a transaction with SERIALIZABLE isolation level, takes the first order ID not yet used, marks it as used and commits the transaction.
Since this is our Orders table, I'm worried about deadlocks when executing the order-ID calculation stored procedure, or about duplicated order IDs.
I'll test this with a console application that creates multiple concurrent threads and tries to extract order IDs, simulating a production load.
My doubts are:
Is there another method to safely simulate an identity column?
Should I consider using triggers?
Should I consider a different isolation level?
Other ideas? :D
Thanks!
[EDIT]
After googling and reading a bunch of MSDN documentation, I've found many examples showing how to manage errors and deadlocks and how to implement a kind of automatic retry directly in the stored procedure, as follows:
CREATE PROCEDURE [dbo].[sp_Ordine_GetOrderID]
    @AnnoOrdine AS NVARCHAR(2) = NULL OUTPUT,
    @IdOrdine AS INT = NULL OUTPUT
AS
SET NOCOUNT ON

DECLARE @retry AS INT
SET @retry = 2

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE

WHILE (@retry > 0)
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION OrderID

        SELECT TOP 1 @AnnoOrdine = AnnoOrdine, @IdOrdine = IdOrdine
        FROM ORDINI_PROGRESSIVI --WITH (ROWLOCK)
        WHERE Attivo = 1
        --ORDER BY AnnoOrdine ASC, IDOrdine ASC

        UPDATE ORDINI_PROGRESSIVI WITH (UPDLOCK)
        SET Attivo = 0
        WHERE AnnoOrdine = @AnnoOrdine AND IdOrdine = @IdOrdine

        IF ISNULL(@IdOrdine, '') = '' OR ISNULL(@AnnoOrdine, '') = ''
        BEGIN
            RAISERROR('Deadlock', 1, 1205)
        END

        SET @retry = 0
        COMMIT TRANSACTION OrderID

        SELECT @AnnoOrdine AS AnnoOrdine, @IdOrdine AS IdOrdine
    END TRY
    BEGIN CATCH
        IF (ERROR_NUMBER() = 1205)
            SET @retry = @retry - 1;
        ELSE
            SET @retry = -1;

        IF XACT_STATE() <> 0
            ROLLBACK TRANSACTION;
    END CATCH
END
This approach reduces deadlocks (there are none at all now), but sometimes I get an EMPTY output parameter.
Tested with 30 concurrent threads (i.e. 30 customer processes that insert orders at the same moment).
Here is a debug log with query durations, in milliseconds: http://nopaste.info/285f558758.html
Is this robust enough for production?
If you do discover that the current solution is creating problems (and it's possible that it won't), then an alternative would be to have a table for each ID type you want to create, with an identity column and a dummy field, i.e.:
CREATE TABLE ABTypeID (ABID int identity(1,1), dummy varchar(1))
You can then insert a record into this table and use the built-in functions to retrieve an identity, i.e.:
insert ABTypeID (dummy) values (null)
select Scope_Identity()
You can delete from these tables as and when you like, and truncate them at year end to reset the ID counters.
You can even wrap the insert in a transaction that gets rolled back; the identity value is not affected by the rollback, as in the sketch below.
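A small sketch of that, using the ABTypeID table above:
DECLARE @NewABID int;
BEGIN TRANSACTION;
    INSERT ABTypeID (dummy) VALUES (NULL);
    SET @NewABID = SCOPE_IDENTITY();
ROLLBACK TRANSACTION;  -- the consumed identity value survives the rollback

SELECT @NewABID AS NewABID;  -- e.g. use this as the OrderID portion of the order number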