SELECT ... FOR UPDATE inside a UPDATE query - mysql

I've been trying to implement a simple script that locks a table from being read, updates some fields, and then unlocks it.
Here's my table:
mysql> SHOW COLUMNS FROM tb1;
+---------------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+---------+------+-----+---------+-------+
| id | int(11) | YES | PRI | NULL | |
| status | int(1) | YES | | 0 | |
+---------------+---------+------+-----+---------+-------+
Lets see if you guys understand what I'm trying to do:
Start transaction
SELECT all rows with status != 1 (it may return more than 1 row) with FOR UPDATE statement;
UPDATE the field status of the rows that has been selected in pass 2
Commit
I tried to achieve this in many ways, but I cant persist the SELECT data that I got in pass 2 and I can't use SELECT ... FOR UPDATE as a subquery of a UPDATE like this UPDATE tb1 SET status=1 WHERE id IN (SELECT id FROM tb1 WHERE status != 1 FOR UPDATE);
Is it possible to achieve this instead of updating row by row?

Related

SQL UPDATE multiple rows at once after SELECT

My sample table:
+------------+------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------+------------+------+-----+---------+-------+
| id | bigint(20) | YES | | NULL | |
| other_id | bigint(20) | YES | | NULL | |
| another_id | bigint(20) | YES | | NULL | |
+------------+------------+------+-----+---------+-------+
+------+----------+------------+
| id | other_id | another_id |
+------+----------+------------+
| 988 | 102 | NULL |
| 989 | 103 | NULL |
| 990 | 104 | NULL |
| 991 | 105 | NULL |
| 992 | 106 | NULL |
| 987 | 101 | NULL |
+------+----------+------------+
How would I SELECT and UPDATE the above table in one query to the effect of doing something like this for every row:
UPDATE
x
SET
another_id = 987
WHERE
id = 987
AND other_id = 101;
UPDATE
x
SET
another_id = 988
WHERE
id = 988
AND other_id = 102
I would hate to run a manual update like this for every row and would like to do it all in one go.
To me it seems that you simply want to set the value of another_id to id:
UPDATE
x
SET
another_id = id
You can provide a range of other_id values in the where clause if you need to restrict the number of rows updated:
UPDATE
x
SET
another_id = id
WHERE other_id IN (...) --list the values you want here.
Shadow is correct on his update statement. It looks like you want to carry all ID directly into the ANOTHER_ID. If you only want this to occur where the "Other_ID" is a given range, just add " where other_id between 101 and 21234" or whatever range you want it to happen for.
To see the results of what Shadow's answer would result in, change it to a simple SELECT statement to see. If it is correct, change to the update version. Example...
Select
ID,
ID AS Another_ID,
Other_ID
from
YourTable
you will get all records showing the two columns showing the ID AS The "Another_ID". It does NOT UPDATE The "Another_ID" column, just queries the value AS the result column name. Again, if you wanted only certain range of numbers, just add
where Other_ID between 101 and 21234
(or whatever value range)
Now, to see as an UPDATE command is exactly as Shadow TRIED to explain..
update YourTable set
AnotherID = ID
and ALL records get updated... If within specific range... use the same where clause as the Select.
If you want to try this without messing up production data, work with a temporary bogus table you can always delete after you are done..
insert into MyTempTable
( ID,
Another_ID,
Other_ID
)
select ID, Another_ID, Other_ID
From YourTable
where ID between 500 and 800
Now you have a test table to play with the insert table and see the impact...

MySQL: Strange behavior of UPDATE query (ERROR 1062 Duplicate entry)

I have a MySQL database the stores news articles with the publications date (just day information), the source, and category. Based on these I want to generate a table that holds the article counts w.r.t. to these 3 parameters.
Since for some combinations of these 3 parameters there might be no article, a simple GROUP BY won't do. I therefore first generate a table news_article_counts with all possible combinations of the 3 parameters, and an default article_count of 0 -- like this:
SELECT * FROM news_article_counts;
+--------------+------------+----------+---------------+
| published_at | source | category | article_count |
+------------- +------------+----------+---------------+
| 2016-08-05 | 1826089206 | 0 | 0 |
| 2016-08-05 | 1826089206 | 1 | 0 |
| 2016-08-05 | 1826089206 | 2 | 0 |
| 2016-08-05 | 1826089206 | 3 | 0 |
| 2016-08-05 | 1826089206 | 4 | 0 |
| ... | ... | ... | ... |
+--------------+------------+----------+---------------+
For testing, I now created a temporary table tmp as the GROUP BY result from the original news article table:
SELECT * FROM tmp LIMIT 6;
+--------------+------------+----------+-----+
| published_at | source | category | cnt |
+--------------+------------+----------+-----+
| 2016-08-05 | 1826089206 | 3 | 1 |
| 2003-09-19 | 1826089206 | 4 | 1 |
| 2005-08-08 | 1826089206 | 3 | 1 |
| 2008-07-22 | 1826089206 | 4 | 1 |
| 2008-11-26 | 1826089206 | 8 | 1 |
| ... | ... | ... | ... |
+--------------+------------+----------+-----+
Given these two tables, the following query works as expected:
SELECT * FROM news_article_counts c, tmp t
WHERE c.published_at = t.published_at AND c.source = t.source AND c.category = t.category;
But now I need to update the article_count of table news_article_counts with the values in table tmp where the 3 parameters match up. For this I'm using the following query (I've tried different ways but with the same results):
UPDATE
news_article_counts c
INNER JOIN
tmp t
ON
c.published_at = t.published_at AND
c.source = t.source AND
c.category = t.category
SET
c.article_count = t.cnt;
Executing this query yields this error:
ERROR 1062 (23000): Duplicate entry '2018-04-07 14:46:17-1826089206-1' for key 'uniqueIndex'
uniqueIndex is a joint index over published_at, source, category of table news_article_counts. But this shouldn't be a problem since I do not -- as far as I can tell -- update any of those 3 values, only article_count.
What confuses me most is that in the error it mentions the timestamp I executed the query (here: 2018-04-07 14:46:17). I have no absolutely idea where this comes into play. In fact, some rows in news_article_counts now have 2018-04-07 14:46:17 as value for published_at. While this explains the error, I cannot see why published_at gets overwritten with the current timestamp. There is no ON UPDATE CURRENT_TIMESTAMP on this column; see:
CREATE TABLE IF NOT EXISTS `test`.`news_article_counts` (
`published_at` TIMESTAMP NOT NULL,
`source` INT UNSIGNED NOT NULL,
`category` INT UNSIGNED NOT NULL,
`article_count` INT UNSIGNED NOT NULL DEFAULT 0,
UNIQUE INDEX `uniqueIndex` (`published_at` ASC, `source` ASC, `category` ASC))
ENGINE = MyISAM
DEFAULT CHARACTER SET = utf8mb4;
What am I missing here?
UPDATE 1: I actually checked the table definition of news_article_counts in the database. And there's indeed the following:
mysql> SHOW COLUMNS FROM news_article_counts;
+---------------+------------------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+------------------+------+-----+-------------------+-----------------------------+
| published_at | timestamp | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| source | int(10) unsigned | NO | | NULL | |
| category | int(10) unsigned | NO | | NULL | |
| article_count | int(10) unsigned | NO | | 0 | |
+---------------+------------------+------+-----+-------------------+-----------------------------+
But why is on update CURRENT_TIMESTAMP set. I double and triple-checked my CREATE TABLE statement. I removed the joint index, I added an artificial primary key (auto_increment). Nothing help. I've even tried to explicitly remove these attributes from published_at with:
ALTER TABLE `news_article_counts` CHANGE `published_at` `published_at` TIMESTAMP NOT NULL;
Nothing seems to work for me.
It looks like you have the explicit_defaults_for_timestamp system variable disabled. One of the effects of this is:
The first TIMESTAMP column in a table, if not explicitly declared with the NULL attribute or an explicit DEFAULT or ON UPDATE attribute, is automatically declared with the DEFAULT CURRENT_TIMESTAMP and ON UPDATE CURRENT_TIMESTAMP attributes.
You could try enabling this system variable, but that could potentially impact other applications. I think it only takes effect when you're actually creating a table, so it shouldn't affect any existing tables.
If you don't to make a system-level change like this, you could add an explicit DEFAULT attribute to the published_at column of this table, then it won't automatically add ON UPDATE.

Restrict insertion based on a count

So, I need to safely restrict the insertion of entries in a table based on the count of other entries in that same table. Say we have the following table:
resource:(id, foreign_key)
I need to create up to a number of entries based on the foreign key. So, as soon as I reach a count, let's say 100 for our example, I want to restrict creating more entries.
The obvious answer would be something like that:
count the entries with the specified foreign key.
if count < limit insert the new entry
And in fact, that's what I have been using. The thing is, this approach is not fail-proof since between 1 and 2 there might occur another insertion. I considered the possibility of using transactions but (unless I'm completely misunderstanding transactions) this has the same issue:
start transaction
insert the new entry
if entries have exceeded the limit, rollback. otherwise commit
Now, say we already have 99/100 entries and two transactions run at the same time. They both will commit since they don't see each-other's entries.
Short of actually creating the entry and then delete it if it's invalid (which feels kindof messy in my mind) I can't think of a way to solve this issue. Any ideas?
edit: upon request I'm providing sample data:
table1
+-------------+------------------+------+-----+----------------+
| Field | Type | Null | Key | Extra |
+-------------+------------------+------+-----+----------------+
| id | int(10) unsigned | NO | PRI | auto_increment |
| limit | int(10) unsigned | NO | MUL | |
+-------------+------------------+------+-----+----------------+
table2
+-------------+------------------+------+-----+----------------+
| Field | Type | Null | Key | Extra |
+-------------+------------------+------+-----+----------------+
| id | int(10) unsigned | NO | PRI | auto_increment |
| foreign_id | int(10) unsigned | NO | MUL | |
+-------------+------------------+------+-----+----------------+
and some sample data:
table1
+----+----------+
| id | limit |
+----+----------+
| 1 | 5 |
+----+----------+
table2
+----+---------------+
| id | foreign_id |
+----+---------------+
| 1 | 1 |
+----+---------------+
| 2 | 1 |
+----+---------------+
| 3 | 1 |
+----+---------------+
| 4 | 1 |
+----+---------------+
At this point, let's say that two users attempt to create table2 entries. The first one will have to be accepted and the 2nd rejected.
With the first approach, if both users go through step 1 (counting the old entries) and then through step 2 (insert the new entry) both entries will be created.
With the second approach, if both of them run at the same time, they both will count 4 slots before themselves and commit instead of one of them rollbacking.
Halo Mate, a Stored Procedure similar to this structure may help you
UPDATE
DROP PROCEDURE IF EXISTS sp_insert_record;
DELIMITER //
CREATE PROCEDURE sp_insert_record(
IN insert_value1 INT(9),
IN chosen_id INT(9)
)
BEGIN
SELECT id, `limit`
INTO #id, #limit
FROM table1
WHERE id = chosen_id;
START TRANSACTION;
INSERT INTO table2 (id, foreign_id)
VALUES (insert_value1, chosen_id);
SELECT COUNT(id)
INTO #count
FROM table2
WHERE foreign_id = #id;
IF #count <= #limit THEN
COMMIT;
ELSE
ROLLBACK;
END IF;
END//
DELIMITER ;
By using a Stored Procedure, you can also add any validation or process based on your requirements.
Hope this can be of help, cheers!

Delete duplicate rows GROUP BY with LIKE

I have a string in database (mysql) which is like:
{"StateId":73,"CallTime":"\/Date(1336365498912+0500)\/","CallId":"1336365489.14157","Target":"agi://127.0.0.1"}},"Profile":{"$type":"DataWriter.DbProfile, DataWriterObjects","Name":"DataService","Provider":"mssql","ConnectionString":"Data Source=localhost\\mydb; Database=mydb; User Id=sa; Password=admin;"}}
The string is a JSON object which contains multiple fields. The problem is that I have multiple duplicate rows which I want to remove from the database. A row is considered a duplicate if the CallId and StateId is same but the CallTime is different. So first I want to get list of the duplicates (GROUP BY) of those rows which have CallId same and ignore the difference in CallTime. The below record has different CallTime from the first one but same CallId, hence it is considered a duplicate (basically need not to consider CallTime for duplicate)
{"StateId":73,"CallTime":"\/Date(1336365498913+0500)\/","CallId":"1336365489.14157","Target":"agi://127.0.0.1"}},"Profile":{"$type":"DataWriter.DbProfile, DataWriterObjects","Name":"DataService","Provider":"mssql","ConnectionString":"Data Source=localhost\\mydb; Database=mydb; User Id=sa; Password=admin;"}}
So how do I do a GROUP BY? Basically everything in the GROUP BY should be matched ignoring the CallTime value.
The table structure is
mysql> describe Statements;
+------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+-------------+------+-----+---------+----------------+
| SequenceId | bigint(10) | NO | PRI | NULL | auto_increment |
| Profile | varchar(32) | YES | MUL | NULL | |
| CacheItem | text | NO | | NULL | |
+------------+-------------+------+-----+---------+----------------+
After that I want to delete the duplicates. Anyone help me out?
I think your database is not atomic enough, you may have to split out your JSON string into separate fields

Break this Query into batches

I have a MySQL table contacts, with structure as follows
+--------------+----------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+----------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| contactee_id | int(11) | NO | MUL | 0 | |
| contacter_id | int(11) | NO | MUL | 0 | |
+--------------+----------+------+-----+---------+----------------+
contactee_id and contacter_id are both ids, which together defines a relationship between two users. In order to calculate the count of relations, a user have, I have the following query
INSERT INTO followers (id, followers)
SELECT contactee_id, 1
FROM contacts
ON DUPLICATE KEY
UPDATE followers = followers + 1
The problem with this query is that it locks the contacts table for too long (more than 16 minutes). I want to get it done in batches, so that the SQL does not locks contacts table for too long. Few ways, I thought of, but they all need to lock the entire table. Is there a way this could be done?
If you just want the count of relations use the count and group by together like
SELECT contactee_id,count(contacter_id) FROM contacts group by contactee_id;
This will give you all the contactee_id and the number of contacter_id's for each contactee
Run query for some records and then save the id of the last record in a table or filesystem, start next query from that id and update it every cycle.