I would like to query a database table for some of its oldest entries and update them with a second query afterwards.
But how can I prevent another process (doing the same thing) from getting the same rows back from its SELECT query, so that the UPDATE part modifies the entries twice?
As far as I see a simple transaction cannot prevent this from happening.
Use the SELECT ... FOR UPDATE mechanism inside a transaction to do this (see http://dev.mysql.com/doc/refman/5.0/en/innodb-locking-reads.html).
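A minimal sketch of that pattern, assuming an InnoDB table. The table name jobs and the columns id, created_at and processed are hypothetical, as are the literal ids in the UPDATE:

```sql
START TRANSACTION;

-- Locking read: a second session running the same statement blocks on
-- these rows until this transaction commits or rolls back.
SELECT id
FROM jobs
WHERE processed = 0
ORDER BY created_at
LIMIT 10
FOR UPDATE;

-- Update exactly the rows returned above (1, 2, 3 stand in for those ids).
UPDATE jobs
SET processed = 1
WHERE id IN (1, 2, 3);

COMMIT;
```

Once the first transaction commits, the second session's locking read sees the updated rows, so they no longer match processed = 0 and are not picked up twice. On MySQL 8.0 and later, adding SKIP LOCKED to the locking read should let the second session skip the locked rows instead of waiting.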
I don't have real code, sorry, only an explanation of the problem.
I would like to understand the best way to solve it.
I have 3 queries:
The first one is a long transaction which performs an SQL INSERT statement on a table.
The second query COUNTs the number of rows in that table after the INSERT took place.
The third query UPDATEs one field of the previously inserted record with the count number retrieved by the second query.
So far so good. My 3 queries are executed correctly.
Now suppose that these 3 queries are executed inside an API call. If multiple API calls are executed simultaneously, the second query (the COUNT) retrieves a wrong value, and consequently the third query (the UPDATE) also writes a wrong value.
In addition, I get deadlocks on the INSERT query, because while one API call is making the INSERT, the SELECT COUNT of a second API call tries to read the table at the same time.
My question is what would be the best approach to solve this kind of problem.
I don't need code. I just would like to understand the best way to go.
Would I need to lock all the tables, for example?
It is unclear what you are doing, but this might be faster:
CREATE TEMPORARY TABLE t ...; -- all columns except count
INSERT INTO t ...; -- the incoming data
SELECT COUNT(*) INTO @ct FROM t; -- @ct is a user-defined session variable
INSERT INTO real_table
(...) -- including the count-column last
SELECT ..., @ct FROM t; -- Note how count is tacked on last
Assume I want to find ids that appear in both mode=1 and mode=2:
SELECT id FROM tab a WHERE mode=1 AND (SELECT COUNT(*) FROM tab b WHERE b.mode=2 AND a.id=b.id) > 0
I need this query to run very quickly, even though the table contains millions of rows (I already have an index on id1 and on id2). Is there a way to create something like a view that contains this query and is updated automatically every time the table changes, so that the results are prepared for me in advance?
You can create a table called summary_tab. Use a programming language or command line to execute a query like this:
insert into summary_tab
select id from ...
Then, use a task scheduler like cron to execute the script or command line every few minutes.
The other option is to create an AFTER INSERT trigger on your table that executes a query like this and updates the summary table. However, if the query takes a long time and/or you are inserting a lot of records into the tab table, the trigger will slow the inserts down.
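A rough sketch of such a trigger, under several assumptions that are not in the question: id and mode are integers, summary_tab has a primary key on id, and the trigger name is made up. It only reacts to inserts, so deletes or updates on tab would need their own handling:

```sql
-- Hypothetical summary table; the primary key lets INSERT IGNORE skip duplicates.
CREATE TABLE summary_tab (id INT PRIMARY KEY);

DELIMITER //
CREATE TRIGGER tab_after_insert AFTER INSERT ON tab
FOR EACH ROW
BEGIN
  -- For mode 1 look for a mode 2 row with the same id, and vice versa
  -- (3 - NEW.mode maps 1 to 2 and 2 to 1).
  IF NEW.mode IN (1, 2)
     AND EXISTS (SELECT 1 FROM tab
                 WHERE id = NEW.id AND mode = 3 - NEW.mode) THEN
    INSERT IGNORE INTO summary_tab (id) VALUES (NEW.id);
  END IF;
END//
DELIMITER ;
```

Because the trigger does a single lookup per inserted row, its cost depends mostly on whether that lookup can use an index on id (or on id and mode).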
You could also try something like this:
select id
from tab
where mode in (1, 2)
group by id
having count(*) = 2
Check the speed and results of this query. If it is not fast enough, try creating an index on id, another index on mode, and a composite index on (id, mode), and see whether one of them makes the query fast enough that you don't need a summary table.
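One caveat with the query above: COUNT(*) = 2 only means "appears in both modes" if each (id, mode) pair occurs at most once in tab. If duplicates are possible, a variant like the following sketch may be safer. The index name is hypothetical:

```sql
-- Composite index so the filter and grouping can be served from the index.
CREATE INDEX idx_tab_id_mode ON tab (id, mode);

SELECT id
FROM tab
WHERE mode IN (1, 2)
GROUP BY id
HAVING COUNT(DISTINCT mode) = 2;  -- both modes present, duplicates ignored
```

Running EXPLAIN on the SELECT shows whether the index is actually being used.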
Good morning.
I have a table in a MySQL database.
There are 5 robots that each write about 10 records per hour into this table.
Every 3 months a script that I have created makes a copy of the table and then deletes all the table entries (this way I can keep the IDs in a certain order).
My question is this.
These are two different statements:
CREATE TABLE omologationResult_{date} AS SELECT * FROM omologationResult
DELETE FROM omologationResult
If the script is copying the table and a robot adds a record in the meantime, there's no problem, because the SQL statement runs from the lowest ID to the end. But what happens if the script is deleting the table contents while a robot is writing to it? Do I lose the last robot record?
And if so, what can I do to make a copy of the table and then remove only the data that I've copied?
Thank you
Yes, this is not a safe operation because it's not atomic. It's quite possible for another thread to insert rows into that table between your CREATE .. SELECT and the DELETE. One option you have is to use a multi-table DELETE:
CREATE TABLE omologationResult_{date} AS SELECT * FROM omologationResult;
DELETE omologationResult FROM omologationResult
INNER JOIN omologationResult_{date} ON omologationResult_{date}.id = omologationResult.id;
This ensures that only rows that exist in both tables are deleted from omologationResult.
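A different technique that avoids the copy step altogether is to swap in an empty table with RENAME TABLE, which MySQL performs atomically. This is only a sketch: omologationResult_new and the date suffix 20240101 are made-up names, and as far as I know the fresh table starts with a new AUTO_INCREMENT counter, so check whether that matters for your IDs:

```sql
-- Empty table with the same columns and indexes as the live one.
CREATE TABLE omologationResult_new LIKE omologationResult;

-- Atomic swap: the robots never see a missing table, and every row written
-- before the swap ends up in the archive table.
RENAME TABLE omologationResult TO omologationResult_20240101,
             omologationResult_new TO omologationResult;
```

With this approach there is no window between "copy" and "delete" in which a newly written record could be lost.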
I make 2 queries in a transaction: a SELECT (containing a JOIN clause) and an UPDATE. The data in the selected rows must not change before the update is done, so I'm using the FOR UPDATE clause. My question is: does FOR UPDATE apply only to the rows selected from the table specified in the FROM clause, or also to rows from the joined tables? My DBMS is MySQL.
The documentation simply says that the lock is on rows read without excepting joined tables, so it should be on all records on all the joined tables. If you want to lock only the rows in one of the tables, you can do that separately: 'SELECT 1 FROM keytable WHERE ... FOR UPDATE'.
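That said, whether the lock is needed to prevent an update between the SELECT and the UPDATE depends on the storage engine and isolation level. Under InnoDB's default REPEATABLE READ, a plain SELECT is a non-locking consistent read and does not stop another transaction from changing the rows; a locking read (LOCK IN SHARE MODE or FOR UPDATE) or the SERIALIZABLE isolation level does. Compared with a shared lock, FOR UPDATE also prevents another transaction from taking its own read lock on the rows, which could otherwise cause a deadlock because the UPDATE cannot be applied until the other transaction releases its read lock.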
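To illustrate locking the rows of only one table, here is a sketch with two hypothetical InnoDB tables, orders and customers (the table names, column names and status values are all assumptions):

```sql
START TRANSACTION;

-- Locking read on the one table whose rows must not change.
SELECT id FROM orders WHERE status = 'new' FOR UPDATE;

-- Plain, non-locking read for the joined data.
SELECT o.id, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.status = 'new';

UPDATE orders SET status = 'done' WHERE status = 'new';

COMMIT;
```

On MySQL 8.0 and later, the joined query can instead end with FOR UPDATE OF o to restrict the lock to one table in a single statement.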
UPDATE myTable SET niceColumn=1 WHERE someVal=1;
SELECT * FROM myTable WHERE someVal=1;
Is there a way to combine these two queries into one? I mean, can I run an update query that also shows the rows it updates? Here I use the "WHERE someVal=1" filter twice, and I don't want that. I also think that if someVal changes before the SELECT query runs, I will have trouble with what I get back (e.g. the UPDATE runs, and after that someVal becomes 0 because of another script).
Wrap the two queries in a transaction with the desired ISOLATION LEVEL so that no other threads can affect the locked rows between the update and the select.
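A minimal sketch of that wrapping, assuming myTable uses InnoDB and picking REPEATABLE READ only as an example level:

```sql
-- Applies to the next transaction started in this session.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

START TRANSACTION;

-- Rows changed here stay exclusively locked until COMMIT.
UPDATE myTable SET niceColumn = 1 WHERE someVal = 1;

-- Other sessions cannot modify the updated rows before this read.
SELECT * FROM myTable WHERE someVal = 1;

COMMIT;
```

This protects the updated rows themselves. Rows that already had niceColumn = 1, or rows that other sessions insert, may still show up in the SELECT.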
Actually, even what you have done will not show exactly the rows it updated, because in the meantime (after the update) some other process may add or change rows.
It will also show all the matching records, including the ones updated yesterday :)
If I wanted to see exactly which rows were changed, I would go with a temp table. First select the IDs of all the rows to be updated into a temp table. Then perform the update based on the row IDs in the temp table, and finally return the rows listed in the temp table.
CREATE TEMPORARY TABLE to_be_updated
SELECT id
FROM myTable
WHERE someVal = 1;
UPDATE myTable
SET niceColumn = 1
WHERE id IN (SELECT * FROM to_be_updated);
SELECT *
FROM myTable
WHERE id IN (SELECT * FROM to_be_updated);
If in your real code the conditional part (where and so on) is too long to repeat, just put it in a variable that you use in both queries.
Unless you encounter a different problem, you shouldn't need these two combined.