Update query - incrementing int field value and if statement - MySQL

I am currently learning SQL on my local MySQL database. I have a table named transactions with 5 fields. I am running an update query against the row where name = 'jane'. Essentially, I want to add an if-style condition: when the difference between date_created and tran_date reaches one month, reset transactions = 0 and tran_date = '0000-00-00', and set date_created to the current date.
Query (updated):
UPDATE transactions SET transactions = transactions + 1, tran_date = CURDATE() WHERE name = 'jim'
Create tables and set values:
CREATE TABLE transactions
(
id int auto_increment primary key,
date_created DATE,
name varchar(20),
transactions int(6),
tran_date DATE
);
INSERT INTO transactions
(date_created, name, transactions, tran_date)
VALUES
(NOW(), 'jim', 0, '0000-00-00'),
(NOW(), 'jane', 0, '0000-00-00');

Your UPDATE syntax is wrong. You must not join the assignments in the SET clause with AND; use a comma there instead:
UPDATE transactions SET transactions = transactions + 1, tran_date = CURDATE() WHERE name = 'jim'
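The monthly reset described in the question can be handled with a conditional UPDATE of the same kind. A minimal sketch, assuming the reset should fire once a full month has passed since date_created and that the zero date '0000-00-00' is permitted by the server's sql_mode:
UPDATE transactions
SET transactions = 0,
    tran_date = '0000-00-00',
    date_created = CURDATE()
WHERE name = 'jane'
  AND date_created <= DATE_SUB(CURDATE(), INTERVAL 1 MONTH);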

ClickHouse deduplication/upsert with different functions per column

I have a ClickHouse table which looks like this:
CREATE TABLE test
(
id Int,
property_id Int,
created_at DateTime('UTC'),
modified_at DateTime('UTC'),
data Int,
json_str Nullable(String)
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(created_at)
ORDER BY (property_id, created_at);
When inserting new rows, I want to update (upsert) existing rows with matching id and property_id according to these rules:
created_at: Keep the earliest
modified_at: Keep the latest
data: Keep the value of the row with the latest modified_at
json_str: Ideally, deep merge json objects (stored as strings) of all matching rows
I did quite a bit of research and tried setting up a deduplication pipeline, using a source table, a destination table (ENGINE = AggregatingMergeTree) and a materialized view (using minState, maxState, argMaxState) but I couldn't figure it out so far. I'm running into errors related to primary key, partitioning, wrong aggregation functions, etc. Even a setup without merging json_str would be very helpful.
After a lot of trial and error, I found a solution (ignoring json_str for now):
-- Source table with duplicates
DROP TABLE IF EXISTS ingest;
CREATE TABLE ingest
(
id Int,
property_id Int,
created_at DateTime('UTC'), -- Should be preserved
modified_at DateTime('UTC'), -- Should be updated
data Int -- Should be updated
) ENGINE = MergeTree
ORDER BY (property_id, created_at);
-- Destination table without duplicates
DROP TABLE IF EXISTS dedup;
CREATE TABLE dedup
(
id Int,
property_id Int,
created_at_state AggregateFunction(min, DateTime),
modified_at_state AggregateFunction(max, DateTime),
data_state AggregateFunction(argMax, Int, DateTime)
) ENGINE = SummingMergeTree
ORDER BY (property_id, id);
-- Transformation pipeline
DROP VIEW IF EXISTS pipeline;
CREATE MATERIALIZED VIEW pipeline TO dedup
AS SELECT
id,
property_id,
minState(created_at) AS created_at_state,
maxState(modified_at) AS modified_at_state,
argMaxState(data, modified_at) AS data_state
FROM ingest
GROUP BY property_id, id;
-- Insert data with a duplicate
INSERT INTO ingest (id, property_id, created_at, modified_at, data)
VALUES (1, 100, '2022-01-01 08:00:00', '2022-01-01 08:00:00', 2000),
(1, 100, '2022-01-01 08:01:00', '2022-01-01 08:01:00', 3000),
(2, 100, '2022-01-01 08:00:00', '2022-01-01 08:00:00', 4000),
(3, 200, '2022-01-01 08:05:00', '2022-01-01 08:05:00', 5000);
-- Query deduplicated table with merge functions
SELECT id,
property_id,
toDateTime(minMerge(created_at_state), 'UTC') AS created_at,
toDateTime(maxMerge(modified_at_state), 'UTC') AS modified_at,
argMaxMerge(data_state) AS data
FROM dedup
GROUP BY property_id, id
ORDER BY id, property_id;
id  property_id  created_at           modified_at          data
1   100          2022-01-01 08:00:00  2022-01-01 08:01:00  3000
2   100          2022-01-01 08:00:00  2022-01-01 08:00:00  4000
3   200          2022-01-01 08:05:00  2022-01-01 08:05:00  5000
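Rows with the same (property_id, id) key are only collapsed during background merges, which is why the SELECT above still applies the -Merge combinators with GROUP BY. If you want to force the merge yourself, for example to inspect the dedup table directly, this is optional but should work:
OPTIMIZE TABLE dedup FINAL;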

INSERT/INNER JOIN MySQL

I am attempting to insert into the table 'refunds', but I need to reference the date created in another table, 'transactions'. Both tables share the transactionId value. I want to insert into the 'refunds' table only if the transaction's date created is less than 48 hours old. So far I have this insert statement, but I cannot get it to work with any sort of join.
INSERT IGNORE INTO refunds
SET
transactionId = ?,
refundAmount = ?
You can do an INSERT/SELECT, something like:
INSERT INTO refunds(transactionId, <col2>, ...)
SELECT transactionId, <col2>, ...
FROM transactions
WHERE date_col > DATE_SUB(CURRENT_DATE, INTERVAL 2 DAY)
This will let you INSERT your refund table based on rows from the transactions table.
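Applied to the question's tables it might look like the sketch below; note that the creation-time column name (dateCreated here) is an assumption, since the question only calls it "date created":
INSERT IGNORE INTO refunds (transactionId, refundAmount)
SELECT t.transactionId, ?          -- bound refund amount
FROM transactions t
WHERE t.transactionId = ?          -- bound transaction id
  AND t.dateCreated > NOW() - INTERVAL 48 HOUR;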

Inserting running total

I have a problem inserting a running total in a MySQL transactional database and need your help with solutions and opinions. The structure of my table is:
create table `wtacct` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`ACCOUNT_NO` varchar(16),
`AMOUNT` float(16,2),
`BALANCE` float(16,2)
);
[Please note other fields have been removed to keep the example simple]
I am recording the transaction as:
Dr 10 USD from account 1001 and
Cr 10 USD to account 2002
Insert query
INSERT INTO wtacct (ID, ACCOUNT_NO, AMOUNT, BALANCE)
VALUES ('', 1001, -10, 100), ('', 2002, 10, 5000);
I want the balance computed as:
BALANCE of account no 1001 = balance from the last transaction of account 1001 - 10.
My solutions and limitations
Solution 1
In the insert statement, put a subquery in the balance field:
select balance from wtacct where account_no=1001 and id in(select max(id) from wtacct where account_no=1001)
Limitation: MySQL does not allow selecting from the same table (wtacct) that the INSERT is targeting.
Solution 2
Using an INSERT INTO ... SELECT statement:
insert into wtacct select '' ID, 1001 ACCOUNT_NO, -10 AMOUNT, (BALANCE-10) BALANCE from wtacct where account_no=1001 and id in(select max(id) from wtacct where account_no=1001)
Limitation: for the first transaction there is no record in wtacct for account 1001, so the SELECT returns no rows and nothing is inserted.
Solution 3
Take the balance into a variable and use it in the insert statement.
select @balance1001 := balance from wtacct
where account_no=1001 and id in(select max(id) from wtacct where account_no=1001);
select @balance2002 := balance from wtacct
where account_no=2002 and id in(select max(id) from wtacct where account_no=2002);
INSERT INTO wtacct (ID, ACCOUNT_NO, AMOUNT, BALANCE)
VALUES ('', 1001, -10, @balance1001-10), ('', 2002, 10, @balance2002+10);
Limitation: the balance may change between the execution of the SELECT and the INSERT. It is also costly: three query executions are required.
Solution 4
Insert and then update Balance
INSERT INTO wtacct (ID, ACCOUNT_NO, AMOUNT, BALANCE)
VALUES ('', 1001, -10, 0);
UPDATE wtacct set balance = (ifnull((select balance from wtacct where account_no=1001 and id in(select max(id) from wtacct where id < last_insert_id() and account_no=1001)), 0) - 10)
where id = last_insert_id() and account_no=1001
...
Limitation: the query is costly; it requires 4 query executions (two inserts and two updates). Note: last_insert_id() here is the last auto-increment id (MySQL's LAST_INSERT_ID(), or the equivalent call in PHP).
Solution 5
Use a trigger on the insert statement. In the trigger, the balance is calculated from the last transaction's balance and the inserted amount.
Limitation: triggers do not support transactional behavior and may fail.
Please give your solution and opinion on the above approaches. Note that the examples above may contain syntax errors; please ignore them.
A big limitation I didn't see listed is a potential race condition, where two rows are being inserted into the table at the same time. There's a chance that the two inserts will both get the current "balance" from the same previous row.
One question: do you also have a separate "current balance" table that keeps a single value of the current "balance" for each account? Or are you only relying on the "balance" from the previous transaction?
Personally, I would track the current balance on a separate "account balance" table. And I would use BEFORE INSERT/UPDATE triggers to maintain the value in that row, and use that to return the current balance for the account.
For example, I would define a trigger like this, which fires when a row is inserted into the `wtacct` table:
DELIMITER $$
CREATE TRIGGER wtacct_bi
BEFORE INSERT ON wtacct
FOR EACH ROW
BEGIN
-- Treat a missing amount as zero
IF NEW.amount IS NULL THEN
SET NEW.amount = 0;
END IF
;
-- Add the amount to the account's current balance in a single statement,
-- capturing the new value in a user-defined variable
UPDATE acct a
SET a.balance = (@new_balance := a.balance + NEW.amount)
WHERE a.account_no = NEW.account_no
;
-- Store the running balance on the transaction row being inserted
SET NEW.balance = @new_balance
;
END$$
DELIMITER ;
The setup for that trigger...
CREATE TABLE acct
( account_no VARCHAR(16) NOT NULL PRIMARY KEY
, balance DECIMAL(20,2) NOT NULL DEFAULT 0
) ENGINE=InnoDB
;
CREATE TABLE wtacct
( id BIGINT NOT NULL PRIMARY KEY AUTO_INCREMENT
, account_no VARCHAR(16) NOT NULL COMMENT 'FK ref acct.account_no'
, amount DECIMAL(20,2) NOT NULL
, balance DECIMAL(20,2) NOT NULL
, FOREIGN KEY FK_wtacct_acct (account_no) REFERENCES acct (account_no)
ON UPDATE CASCADE ON DELETE RESTRICT
) ENGINE=InnoDB
;
My reason for using a separate "current balance" table is that there is only one row for the given account_no, and that row retains the current balance of the account.
The UPDATE statement in the trigger should obtain an exclusive lock on the row being updated. And that exclusive lock prevents any other UPDATE statement from simultaneously updating the same row. The execution of the UPDATE statement will add the `amount` from the current transaction row being inserted to the current balance.
If we were using Oracle or PostgreSQL, we could use a RETURNING clause to get the value that was assigned to the `balance` column.
In MySQL we can do a wonky workaround, using a user-defined variable. The new value we are going to assign to the column is first assigned to the user-defined variable, and then that is assigned to the column.
And we can assign the value of the user-defined variable to the `balance` column of the row being inserted into `wtacct`.
The purpose of this approach is to make the retrieval and update of the current balance in a single statement, to avoid any race conditions.
The UPDATE statement locates the row, obtains an exclusive (X) lock on the row, retrieves the current balance (the value from the `balance` column), calculates the new current balance, and assigns it back to the `balance` column. It then continues to hold the lock until the transaction completes.
Once the trigger completes, the INSERT statement (which initially fired the trigger) proceeds, attempting to insert the new row into `wtacct`. If that fails, then all of the changes made by the INSERT statement and execution of the trigger are rolled back, keeping everything consistent.
Once a COMMIT or ROLLBACK is issued by the session, the exclusive (X) lock held on the row(s) in `acct` are released, and other sessions can obtain locks on that row in `acct`.
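A quick way to exercise this setup, using the account number from the question and an arbitrary second amount (the 0 supplied for the balance column is overwritten by the trigger):
-- Seed the current-balance row, then insert two transactions
INSERT INTO acct (account_no, balance) VALUES ('1001', 0);
INSERT INTO wtacct (account_no, amount, balance) VALUES ('1001', -10, 0);
INSERT INTO wtacct (account_no, amount, balance) VALUES ('1001', 25, 0);
-- balance now shows the running total per row: -10, then 15
SELECT id, account_no, amount, balance FROM wtacct ORDER BY id;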
I have done it using a stored procedure for MySQL:
CREATE DEFINER=`root`@`%` PROCEDURE `example_add`(IN dr Int, IN cr Int)
BEGIN
DECLARE LID int;
DECLARE new_balance decimal(16,2);
-- Insert the transaction row first
INSERT INTO example (Debit, Credit)
VALUES (dr, cr);
SET LID = LAST_INSERT_ID();
-- Running balance = total debits minus total credits so far
SET new_balance = (SELECT SUM(Debit) - SUM(Credit) FROM example);
UPDATE example SET Balance = new_balance WHERE ID = LID;
END
Use it as example_add(10,0) or example_add(0,15), then select from the table and see the result.
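For instance, assuming the `example` table has an auto-increment ID plus Debit, Credit and Balance columns (its definition is not shown in the answer):
CALL example_add(10, 0);   -- debit 10: balance of the new row becomes 10
CALL example_add(0, 15);   -- credit 15: balance of the new row becomes -5
SELECT ID, Debit, Credit, Balance FROM example ORDER BY ID;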

Is it possible to do an operation only if a DATETIME is smaller than NOW() by some time interval?

I have a REPLACE INTO operation on my database, but I would like it to happen only if the DATETIME stored in that same table for that P_ID is more than, say, 10 minutes older than the new value.
table_a
P_ID  CHECK  DATETIME
=====================================
10    1      2013-06-27 13:23:23
5     0      2013-06-24 11:14:02
::
REPLACE INTO table_a (P_ID,CHECK,DATETIME) VALUES ('5','1','2013-06-24 11:10:00');
So, I would like this REPLACE to NOT happen because it hasn't been 10 minutes since last update.
Is this possible? Or does it take another query?
UPDATE 1: You can wrap your REPLACE INTO in a stored procedure
DELIMITER $$
CREATE PROCEDURE sp_replace(IN pid INT, IN chk INT, IN dt DATETIME)
BEGIN
IF 10 < COALESCE(TIMESTAMPDIFF(MINUTE,
(SELECT datetime FROM table_a WHERE p_id = pid), dt), 11) THEN
REPLACE INTO table_a (p_id, `check`, datetime) VALUES (pid, chk, dt);
END IF;
END$$
DELIMITER ;
Here is SQLFiddle demo
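Calling it with the values from the question skips the REPLACE, because the row stored for P_ID 5 is newer than the passed datetime:
CALL sp_replace(5, 1, '2013-06-24 11:10:00'); -- skipped: the stored datetime (11:14:02) is not 10+ minutes older than 11:10:00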
Assuming that P_ID has a unique constraint on it, you can use INSERT ... SELECT ... ON DUPLICATE KEY UPDATE to do the job, like this:
INSERT INTO table_a (p_id, `check`, datetime)
SELECT 5, 1, '2013-06-24 11:10:00'
FROM table_a
WHERE 10 < COALESCE(TIMESTAMPDIFF(MINUTE, '2013-06-24 11:00:00',
(SELECT datetime FROM table_a WHERE p_id = 5)), 11)
ON DUPLICATE KEY UPDATE `check` = VALUES(`check`), datetime = VALUES(datetime);
Here is SQLFiddle demo
Note: both REPLACE INTO and INSERT ... ON DUPLICATE KEY UPDATE change the auto-incremented primary key (f_id in your case).

Custom Report on User Data - SQL Server

Given the following very simple table:
Create Table tblUserLogins (
LoginNumber int PRIMARY KEY IDENTITY(1,1),
Username varchar(100),
LoginTime datetime
)
Basically when a user logs into the site, a record is created in this table, indicating the user logged in. (For security reasons the developers in my team do not have access to the tables holding the login information, so this was a work-around).
What I need, is some help writing a query which will actually return me the username of the user who logged on the most during a given period (supplied as input values to the procedure).
I can select the data between any 2 given dates using the following:
SELECT * FROM tblUserLogins
WHERE LoginTime BETWEEN @DateFrom AND @DateTo
However I am not sure how I can aggregate the user data, without first dumping the contents of the above query to a temporary table.
Any help would be gratefully received.
SELECT TOP 1
Username,
COUNT(Username) as Total
FROM
tblUserLogins
WHERE
LoginTime BETWEEN @DateFrom AND @DateTo
GROUP BY
Username
ORDER BY
Total DESC
Here is a full example:
Create Table #tblUserLogins (
LoginNumber int PRIMARY KEY IDENTITY(1,1),
Username varchar(100),
LoginTime datetime
)
declare @DateFrom datetime, @DateTo datetime
select @DateFrom = getdate()
insert into #tblUserLogins values ('test1', getdate())
insert into #tblUserLogins values ('test2', getdate())
insert into #tblUserLogins values ('test1', getdate())
select @DateTo = getdate()
SELECT TOP 1 Username, count(LoginTime) 'Total' FROM #tblUserLogins
WHERE LoginTime BETWEEN @DateFrom AND @DateTo
GROUP BY UserName
ORDER BY Total DESC, Username DESC
drop table #tblUserLogins
If more than one username has the same number of entries, it will only show one (the name closest to 'z').
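For the sample inserts above, the query returns Username = test1 with Total = 2, since 'test1' logged in twice within the window.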