I have a circumstance where I need to update some table rows, marking the ones that do not appear in an external data source as disabled (i.e. update active=0). The straightforward solution is to BEGIN a transaction, UPDATE every row to active=0, and then scan the remote data, doing an UPDATE for each entry that should be active=1 to set it back. I have around 1k rows, so this should be a relatively quick operation, even if there is a lot of inefficient query parsing.
However, this data will often not change at all, so in the majority of cases the net effect of the transaction will be zero change. If the database engine resolves the whole transaction, detects that nothing is changing, and writes nothing as a result, that would be ideal. However, if it is going to actually update every row, every time, I would rather find another solution.
Here's a demo. I created a table with a single integer column and inserted one row.
mysql> create table t ( i int );
mysql> insert into t set i = 42;
I check the current number of log writes.
mysql> show status like 'innodb_log_write_requests';
+---------------------------+---------+
| Variable_name             | Value   |
+---------------------------+---------+
| Innodb_log_write_requests | 5432152 |
+---------------------------+---------+
Then change the value in the row with an UPDATE and confirm it resulted in a log write:
mysql> update t set i = 43;
Query OK, 1 row affected (0.02 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> show status like 'innodb_log_write_requests';
+---------------------------+---------+
| Variable_name             | Value   |
+---------------------------+---------+
| Innodb_log_write_requests | 5432153 |
+---------------------------+---------+
Next, make an UPDATE that has no net effect.
mysql> update t set i = 43;
Query OK, 0 rows affected (0.00 sec)
Rows matched: 1 Changed: 0 Warnings: 0
Notice Changed: 0.
Look at the log write counter; it is also unchanged:
mysql> show status like 'innodb_log_write_requests';
+---------------------------+---------+
| Variable_name             | Value   |
+---------------------------+---------+
| Innodb_log_write_requests | 5432153 |
+---------------------------+---------+
I think it has pretty much been concluded that there is no disk I/O for your no-op. Let's discuss the task at hand:
Instead of actually modifying the database, can you keep an in-memory list of the items that might need to be disabled? After you have finished the scan, if there are any to disable, disable them all in a single UPDATE ... WHERE id IN (...).
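A minimal sketch of that approach, assuming a hypothetical table named items with columns id and active, and that the scan found ids 3, 17 and 42 missing from the external source (the id list itself is built in application memory during the scan):
UPDATE items SET active = 0 WHERE id IN (3, 17, 42);  -- one statement, only the missing ids
If the scan finds nothing to disable, skip the UPDATE entirely and nothing is written.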
On another topic... If you are actually doing this
BEGIN;
UPDATE ... SET active = 0;                 -- for all rows
COMMIT;
-- all rows are briefly disabled
BEGIN;
UPDATE ... SET active = 1 WHERE id = ...;  -- one row at a time
COMMIT;
Then you have a window where everything is disabled. You probably don't want that.
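For contrast, a minimal sketch of doing both steps inside one transaction (again assuming the hypothetical items table): other sessions at the default isolation level never see the intermediate all-disabled state, because none of it is visible to them until COMMIT.
BEGIN;
UPDATE items SET active = 0;                        -- tentatively disable everything
UPDATE items SET active = 1 WHERE id IN (1, 2, 4);  -- example ids: the ones the scan found still present
COMMIT;                                             -- other sessions only ever see the committed end state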
I am at the REPEATABLE-READ isolation level. Why does the scenario below make my SELECT wait?
I understood that all reads (SELECTs) at any isolation level are non-blocking.
What am I missing?
Session 1:
mysql> lock tables users write;
Query OK, 0 rows affected (0.00 sec)
Session 2:
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from users where id = 1; -- waits here until Session 1 releases the lock
Session 1:
mysql> unlock tables;
Query OK, 0 rows affected (0.00 sec)
Session 2:
mysql> select * from users where id = 1;
+----+-----------------+--------------------+------+---------------------+--------------------------------------------------------------+----------------+---------------------+---------------------+------------+
| id | name            | email              | rol  | email_verified_at   | password                                                     | remember_token | created_at          | updated_at          | deleted_at |
+----+-----------------+--------------------+------+---------------------+--------------------------------------------------------------+----------------+---------------------+---------------------+------------+
|  1 | Bella Lueilwitz | orlo19@example.com | NULL | 2022-08-01 17:22:29 | $2y$10$92IXUNpkjO0rOQ5byMi.Ye4oKoEa3Ro9llC/.og/at2.uheWG/igi | MvMlaX9TQj     | 2022-08-01 17:22:29 | 2022-08-01 17:22:29 | NULL       |
+----+-----------------+--------------------+------+---------------------+--------------------------------------------------------------+----------------+---------------------+---------------------+------------+
1 row in set (10.51 sec)
In this question the opposite is true:
Why doesn't LOCK TABLES [table] WRITE prevent table reads?
You reference a question about MySQL 5.0 posted in 2013. The answer from that time suggests that the client was allowed to get a result that had been cached in the query cache. Since then, MySQL 5.6 and 5.7 disabled the query cache by default, and MySQL 8.0 removed the feature altogether. This is a good thing.
The documentation says:
WRITE lock:
Only the session that holds the lock can access the table. No other session can access it until the lock is released.
That was true in the MySQL 5.0 days too, but the query cache allowed some clients to get around it. I suspect it wasn't reliable even then: if a client ran a query that happened not to be cached, it would presumably fall back to the documented behavior. In any case, the point is moot, because all currently supported versions of MySQL have the query cache disabled or removed.
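If you are on 5.6 or 5.7 and want to confirm the cache is off on your own server, checking the relevant system variable should be enough (on 8.0 the variable no longer exists):
mysql> SHOW VARIABLES LIKE 'query_cache_type';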
In MySQL, I first ran BEGIN; or START TRANSACTION; to start a transaction, as shown below:
mysql> BEGIN;
Query OK, 0 rows affected (0.00 sec)
Or:
mysql> START TRANSACTION;
Query OK, 0 rows affected (0.00 sec)
Then, I ran the query below to check whether the transaction was running, but I got nothing, as shown below:
mysql> SELECT trx_id FROM information_schema.innodb_trx;
Empty set (0.00 sec)
Just after this, I ran the query below to show all the rows of the person table:
mysql> SELECT * FROM person;
+----+------+
| id | name |
+----+------+
|  1 | John |
|  2 | Tom  |
+----+------+
2 rows in set (0.00 sec)
Then I ran the query below again, and this time I could see that the transaction was actually running:
mysql> SELECT trx_id FROM information_schema.innodb_trx;
+-----------------+
| trx_id          |
+-----------------+
| 284321631771648 |
+-----------------+
1 row in set (0.00 sec)
So, does merely running a BEGIN; or START TRANSACTION; statement really start a transaction?
Yes. start transaction starts a transaction. Otherwise, that statement would have a very misleading name.
You can check this by trying to do things you cannot do inside a transaction, e.g. change the isolation level:
START TRANSACTION;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
ERROR 1568 (25001): Transaction characteristics can't be changed while a transaction is in progress
So why don't you see it in the innodb_trx table?
Because innodb_trx only lists transactions that involve InnoDB. Until you, for example, query an InnoDB table, the InnoDB engine doesn't know about the transaction and doesn't list it, just as shown in your tests. Try changing the table to a MyISAM table: it will not show up in innodb_trx even after you run a query on it.
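A quick sketch of that check (the table and column names are made up for the example):
mysql> CREATE TABLE person_myisam (id INT, name VARCHAR(20)) ENGINE=MyISAM;
mysql> START TRANSACTION;
mysql> SELECT * FROM person_myisam;
mysql> SELECT trx_id FROM information_schema.innodb_trx;  -- expected to stay empty: no InnoDB table was touched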
A transaction is started at the MySQL level, not at the storage-engine level.
For example, the NDB engine also supports transactions and has its own transaction table, which only shows transactions that involve NDB tables.
I know that locks or MVCC in MySQL can be used for concurrency control, for example to implement repeatable reads. But I don't know how MVCC avoids phantom reads. Elsewhere I have read that this is generally implemented through MVCC plus gap locks, but my current understanding is that MVCC does not need locks, that is, both updates and deletions are handled using undo logs. If so, how do MVCC and the lock mechanism work together?
For example, to avoid phantom reads, would MVCC add a gap lock on some rows in T1? If so, what does MVCC do when an update occurs in T2: does it just append an update undo log as usual, or does it block?
MySQL (specifically, InnoDB) does not support REPEATABLE-READ for locking statements such as UPDATE, DELETE, or SELECT ... FOR UPDATE. These statements always take locks on the most recently committed row version, as if the transaction isolation level were READ-COMMITTED.
You can observe this happening:
mysql> create table mytable (id int primary key, x int);
Query OK, 0 rows affected (0.05 sec)
mysql> insert into mytable values (1, 42);
Query OK, 1 row affected (0.02 sec)
mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from mytable;
+----+------+
| id | x    |
+----+------+
|  1 |   42 |
+----+------+
So far, so good. Now open a second window and update the value:
mysql> update mytable set x = 84;
Query OK, 1 row affected (0.03 sec)
Rows matched: 1 Changed: 1 Warnings: 0
Now back in the first window, a non-locking read still views the original value because of REPEATABLE-READ, but a locking read views the most recently committed version:
mysql> select * from mytable;
+----+------+
| id | x    |
+----+------+
|  1 |   42 |
+----+------+
1 row in set (0.00 sec)
mysql> select * from mytable for update;
+----+------+
| id | x    |
+----+------+
|  1 |   84 |
+----+------+
1 row in set (0.00 sec)
mysql> select * from mytable;
+----+------+
| id | x    |
+----+------+
|  1 |   42 |
+----+------+
1 row in set (0.00 sec)
You can go back and forth as many times as you want, and the same transaction can return both values, depending on doing a locking read vs. non-locking read.
This is a strange behavior of InnoDB, but it allows reads not to be blocked. I have used other MVCC implementations, such as InterBase/Firebird, which solve this differently: they block the locking read until the transaction in the second window commits or rolls back. If it rolls back, the locking read can read the original value; if the other transaction commits, the locking read gets an error.
InnoDB makes a different choice in how to implement MVCC, to avoid blocking the read. But that causes the strange behavior where a locking read must see the latest committed row version.
As the song says, "you can't always get what you want."
Okay, so maybe this is deeper than I will ever need to go, but I want to be able to analyze this nested loop so that I can understand it.
Given:
mysql> describe t1;
+-------+----------+------+-----+---------+-------+
| Field | Type     | Null | Key | Default | Extra |
+-------+----------+------+-----+---------+-------+
| dt    | datetime | NO   | MUL | NULL    |       |
+-------+----------+------+-----+---------+-------+
1 row in set (0.00 sec)
And:
mysql> insert t1 values(101),(102),(103),(104),(105),(106),(107),(@c:=now());
Query OK, 8 rows affected (0.03 sec)
Records: 8 Duplicates: 0 Warnings: 0
And:
mysql> insert t1 select @c:=@c+interval 1 second from t1,t1 b,t1 c,t1 d,t1 e,t1 f;
Query OK, 262144 rows affected (1.94 sec)
Records: 262144 Duplicates: 0 Warnings: 0
So far my understanding is that (# of rows)^(# of tables joined) = (# of rows added).
My question is why this is the case. I cannot figure out exactly how MySQL is handling the rows and the user variable in order to produce the equation I have given here. My equation is clearly a simplified description of what the server actually does, since using 2 rows of data and 6 table references similarly gives 2^6 = 64.
Does anyone know exactly how this is manipulated? I have been working on this for 2 days and I cannot get my mind off of it...
Also, why is it inserting more than 6 (or maybe 36?) rows into the table in the first place? The statement only selects one expression from the tables, the previously inserted now() value plus 1 second, and resets @c based on the final change, so shouldn't it logically insert only a few rows?
To put it simply, I understand what is happening in the select @c:=@c+interval 1 second portion of the statement, but after that I am not quite sure.
How does:
select @c:=@c+interval 1 second;
+--------------------------+
| @c:=@c+interval 1 second |
+--------------------------+
| 2014-07-20 18:17:50      |
+--------------------------+
1 row in set (0.00 sec)
turn into this:
...
Query OK, 262144 rows affected (1.94 sec)
Records: 262144 Duplicates: 0 Warnings: 0
The answer that was satisfactory for me on this issue was so simple I can't believe that I didn't realize it before now.
Today I needed to quickly fill a table with values, so I decided to use another table I already had. For example, take table i, which has a column i with the values 1-50. I executed the following command to populate column i of my table c.
insert into c select i.i from i,i b;
Based on the formula above, I knew this would fill table c with exactly 2,500 rows: 50^2.
What this is actually doing is pairing each row of table i with every row of the second instance, i b (a Cartesian product), and inserting the resulting rows into the table. If you add more instances, the result of one join is multiplied again by the 50 rows of the next instance (for example i c).
I ended up with 50 copies of the values 1-50 in table c, 2,500 rows in total.
I hadn't noticed this multiplication before because I was using an incrementing variable: it created new rows based on the same multiplication, but left no obvious sign of it, since it wasn't duplicating matching values, it was only multiplying the number of rows to insert.
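A quick way to see the multiplication is to count the Cartesian product directly (a sketch using the same 50-row table i):
mysql> select count(*) from i, i b;       -- 50 * 50 = 2500
mysql> select count(*) from i, i b, i c;  -- 50 * 50 * 50 = 125000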
I am doing this:
insert into table_name(maxdate) values
((select max(date1) from table1)), -- goes in row1
((select max(date2) from table2)), -- goes in row2
.
.
.
((select max(date500) from table500)); -- goes in row 500
Is it possible that, during insertion, the order of inserting might change? E.g., when I do
select maxdate from table_name limit 500;
will I get this:
date1 date2 . . date253 date191 ...date500
Short answer:
No, not possible.
If you want to double-check:
mysql> create table letest (f1 varchar(50), f2 varchar(50));
Query OK, 0 rows affected (0.00 sec)
mysql> insert into letest (f1,f2) values
( (SELECT SLEEP(5)), 'first'),
( (SELECT SLEEP(1)), 'second');
Query OK, 2 rows affected, 1 warning (6.00 sec)
Records: 2 Duplicates: 0 Warnings: 0
mysql> select * from letest;
+------+--------+
| f1   | f2     |
+------+--------+
| 0    | first  |
| 0    | second |
+------+--------+
2 rows in set (0.00 sec)
mysql>
SLEEP(5) makes the first row be inserted after 5 seconds, and SLEEP(1) makes the second row be inserted after 5+1 seconds; that is why the query takes 6 seconds.
The warning that you see is
mysql> show warnings;
+-------+------+-------------------------------------------------------+
| Level | Code | Message                                               |
+-------+------+-------------------------------------------------------+
| Note  | 1592 | Statement may not be safe to log in statement format. |
+-------+------+-------------------------------------------------------+
1 row in set (0.00 sec)
This can affect you only if you are using a master-slave setup, because the statement is not safe for statement-based binary logging. For more info, see http://dev.mysql.com/doc/refman/5.1/en/replication-rbr-safe-unsafe.html
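If that warning matters in your setup, one way to avoid it (a sketch; changing the global value requires the appropriate privilege and only affects new sessions) is to switch to row-based logging:
mysql> SHOW VARIABLES LIKE 'binlog_format';
mysql> SET GLOBAL binlog_format = 'ROW';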
Later edit: please leave a comment if you find this answer not useful.
Yes, very possible.
You should consider a database table unordered, and the result of a SELECT statement without an ORDER BY clause as well. Every DBMS can choose how to implement tables (often even depending on the storage engine) and how to return the rows. Sure, many DBMSs happen to return your data in the order you inserted it, but never rely on that.
The order of the retrieved data may depend on the execution plan, and may even differ between runs of the same query, especially when only retrieving part of the data (TOP/LIMIT).
If you want to impose an order, add a field that orders your data. Yes, an AUTO_INCREMENT primary key will be enough in many cases. If you think you'll want to change the order someday, add another field.
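A minimal sketch of that (the id column and the DATETIME type are assumptions, added purely to carry the insertion order):
CREATE TABLE table_name (id INT AUTO_INCREMENT PRIMARY KEY, maxdate DATETIME);
-- ... run the multi-row INSERT from the question ...
SELECT maxdate FROM table_name ORDER BY id LIMIT 500;  -- rows come back in insertion order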