UPDATE myTable SET niceColumn=1 WHERE someVal=1;
SELECT * FROM myTable WHERE someVal=1;
Is there a way to combine these two queries into one? I mean, can I run an update query that also returns the rows it updated? Here I use the same "WHERE someVal=1" filter twice, which I want to avoid. Also, if someVal changes between the update and the select I will have trouble knowing what I actually updated (e.g. the update runs, and right after that someVal becomes 0 because of some other script).
Wrap the two queries in a transaction with the desired ISOLATION LEVEL so that no other threads can affect the locked rows between the update and the select.
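A minimal sketch of that, assuming InnoDB (the isolation level here is just an example; FOR UPDATE locks the matching rows so the update and the returned result set agree):
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ; -- applies to the next transaction
START TRANSACTION;
SELECT * FROM myTable WHERE someVal = 1 FOR UPDATE; -- lock and read the rows
UPDATE myTable SET niceColumn = 1 WHERE someVal = 1;
COMMIT;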
Actually, even what you have done will not show the rows it updated, because meanwhile (after the update) some process may add/change rows.
And this will show all the records, including the ones updated yesterday :)
If I want to see exactly which rows were changed, I would go with a temp table. First select into a temp table all the row IDs to be updated, then perform the update based on the IDs in the temp table, and finally return the rows from the temp table.
CREATE TEMPORARY TABLE to_be_updated
SELECT id
FROM myTable
WHERE someVal = 1;
UPDATE myTable
SET niceColumn = 1
WHERE id IN (SELECT * FROM to_be_updated);
SELECT *
FROM myTable
WHERE id IN (SELECT * FROM to_be_updated);
If in your real code the conditional part (the WHERE clause and so on) is too long to repeat, put it in a variable that you use in both queries.
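If that variable is to live in SQL itself rather than in your application code, one possibility is a session variable plus prepared statements (a sketch, reusing the condition from the question):
SET @cond = 'someVal = 1';
SET @upd = CONCAT('UPDATE myTable SET niceColumn = 1 WHERE ', @cond);
PREPARE stmt FROM @upd;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
SET @sel = CONCAT('SELECT * FROM myTable WHERE ', @cond);
PREPARE stmt FROM @sel;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;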
Unless you encounter a different problem, you shouldn't need these two combined.
I have a MySQL database with just 1 table:
Fields are: blocknr (not unique), btcaddress (not unique), txid (not unique), vin, vinvoutnr, netvalue.
Indexes exist on both btcaddress and txid.
I need to delete all "deletable" record pairs.
Conditions are:
txid must be the same (there can be more than 2 records with same txid)
vinvoutnr must be the same
vin must be different (it can have only two values, 0 and 1, so one record must have 0 and the other must have 1)
In a table of 36M records, about 33M records will be deleted.
I've used this:
delete t1
from registration t1
inner join registration t2
where t1.txid=t2.txid and t1.vinvoutnr=t2.vinvoutnr and t1.vin<>t2.vin;
It works but takes 5 hours.
Maybe this would work too (not tested yet):
delete t1
from registration as t1, registration as t2
where t1.txid=t2.txid and t1.vinvoutnr=t2.vinvoutnr and t1.vin<>t2.vin;
Or do I forget about a delete query and instead make a new table containing all the non-deletable rows, then drop the original?
Database can be offline for this delete query.
Based on your question, you are deleting most of the rows in the table. That is just really expensive. A better approach is to empty the table and re-populate it:
create table temp_registration as
<query for the rows to keep here>;
truncate table registration;
insert into registration
select *
from temp_registration;
Your logic is a bit hard to follow, but I think the logic for the rows to keep is:
select r.*
from registration r
where not exists (select 1
from registration r2
where r2.txid = r.txid and
r2.vinvoutnr = r.vinvoutnr and
r2.vin <> r.vin
);
For best performance, you want an index on registration(txid, vinvoutnr, vin).
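For example (the index name is just a suggestion):
CREATE INDEX idx_txid_vinvoutnr_vin ON registration (txid, vinvoutnr, vin);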
Given that you expect to remove the majority of your data it does sound like the simplest approach would be to create a new table with the correct data and then drop the original table as you suggest. Otherwise ADyson's corrections to the JOIN query might help to alleviate the performance issue.
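A sketch of that rebuild-and-swap approach, reusing the keep-query above (the registration_new/registration_old names are made up):
CREATE TABLE registration_new LIKE registration;
INSERT INTO registration_new
SELECT r.*
FROM registration r
WHERE NOT EXISTS (SELECT 1
                  FROM registration r2
                  WHERE r2.txid = r.txid AND
                        r2.vinvoutnr = r.vinvoutnr AND
                        r2.vin <> r.vin
                 );
RENAME TABLE registration TO registration_old, registration_new TO registration;
DROP TABLE registration_old;
RENAME TABLE swaps the two tables atomically, so readers never see an empty table.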
I have a database called 'master_database' and a table called 'info'
In the 'info' table I have multiple records and I need the 'email' field to not contain any duplicates but currently it does. What SQL command can I run to remove these duplicates?
You can find the rows that are repeated by using this:
SELECT email, COUNT(email) FROM info GROUP BY email HAVING COUNT(email) > 1;
DELETE FROM master_database.info
WHERE info.ID NOT IN (
    SELECT keep_id FROM (
        SELECT MAX(ID) AS keep_id
        FROM master_database.info
        GROUP BY email
    ) AS keepers
);
This keeps the latest (highest) ID for each email; the inner derived table is needed because MySQL will not let you select from the table you are deleting from. It assumes you have a unique ID where a higher number means a later record; if you have a last_edited timestamp it might be better to use the MAX of that.
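For instance, a hypothetical variant that keeps the most recently edited row per email (the last_edited column is assumed from the sentence above; the multi-table DELETE form also sidesteps the same-table restriction):
DELETE t
FROM master_database.info t
JOIN (SELECT email, MAX(last_edited) AS keep_time
      FROM master_database.info
      GROUP BY email) k
  ON t.email = k.email AND t.last_edited < k.keep_time;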
PLEASE TEST FIRST! Run the following to test:
SELECT * FROM master_database.info WHERE info.ID NOT IN (SELECT keep_id FROM (SELECT MAX(ID) AS keep_id FROM master_database.info GROUP BY email) AS keepers);
The rows it returns are the ones that will be deleted.
If you don't have a unique field at all (i.e. the rows really are exact duplicates), the following methods will work:
Go to relevant table.
Create select query to select only the relevant duplicate row pairs.
Amend query to delete but add limit like:
DELETE FROM [table name] WHERE [fieldname] LIKE '[value]' AND [fieldname] LIKE '[value]' LIMIT 1
Simulate first to check the effect, then run the delete to remove one of the duplicate pair; repeat if there are more than two copies.
or
Go to relevant table.
Create select query to select only the relevant rows.
Copy query for later pasting.
The Export button at the bottom of the query results will export just those results.
Paste query back but amend it to delete those rows.
Copy the exported INSERT query for one of the duplicate rows so you can reinsert it without the duplicate.
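If you would rather do that second method purely in SQL, a sketch (this assumes the rows are exact duplicates and the table can be rebuilt while nothing else writes to it; info_dedup is a made-up name):
CREATE TABLE info_dedup AS SELECT DISTINCT * FROM master_database.info;
TRUNCATE TABLE master_database.info;
INSERT INTO master_database.info SELECT * FROM info_dedup;
DROP TABLE info_dedup;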
I would like to query a database table for some of its oldest entries and update them with a second query afterwards.
But how can I prevent that another process (that does the same) will return the same rows by the SELECT query and the UPDATE part will modify the entries twice?
As far as I see a simple transaction cannot prevent this from happening.
Use the SELECT ... FOR UPDATE mechanism to do this (see http://dev.mysql.com/doc/refman/5.0/en/innodb-locking-reads.html)
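A minimal sketch of that pattern (the queue_items table and its columns are made up for illustration):
START TRANSACTION;
-- Lock the ten oldest unprocessed rows; concurrent workers block on these locks
SELECT * FROM queue_items WHERE processed = 0 ORDER BY created_at LIMIT 10 FOR UPDATE;
-- ... act on the rows, then mark them so nobody picks them up again
UPDATE queue_items SET processed = 1 WHERE processed = 0 ORDER BY created_at LIMIT 10;
COMMIT;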
I am trying to read 10 records from a MySQL table and update a field IsRead to 1 to avoid reading them twice. So when I read the data again, the next 10 records should be returned, not the records already read (as tracked by IsRead):
select * from tablename where IsRead=0 limit 10;
But my question is: how can I read and update the 10 records at the same time, using a single query?
EDIT
Previously I was reading and updating one record at a time, but now I want to avoid the double round trip (once for reading and once for updating). What is a suitable way to read and update 10 records without reading any record twice?
What you're looking for is not a single statement, but transactions.
Transactions are a way to make multiple statements ACID compliant. In short it means "all or nothing".
Code wise it would simply look something like this:
START TRANSACTION;
SELECT * FROM tablename WHERE IsRead = 0 ORDER BY created_or_whatever_column LIMIT 10 FOR UPDATE;
UPDATE tablename SET IsRead = 1 WHERE IsRead = 0 ORDER BY created_or_whatever_column LIMIT 10;
COMMIT;
Notice, that I added an order by clause. Using limit without order by doesn't make sense. There's no order in the data in a database, unless you specify it.
Also I added for update to the select statement, so the rows are locked until the transaction ends (with commit) so that no other transaction manipulates these rows in the meantime.
What you should also have a look at in this context are the isolation levels.
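For example, you can change the level for the current session like this (REPEATABLE READ is InnoDB's default; SERIALIZABLE is the strictest):
SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE;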
Here is what I'm trying to do, explained as a query:
DELETE FROM table ORDER BY dateRegistered DESC LIMIT 1000 *
I want to run such a query in a script I have already designed. Every time it runs, it should delete every record from the 1001st onwards (the older ones).
So it is kind of like setting a maximum row count, deleting all records older than the newest 1000.
Actually, is there a way to set that up in the CREATE statement?
Therefore: If i have 9023 rows in the database, when i run that query it should delete 8023 rows and leave me with 1000
If you have a unique ID for the rows, here is the theoretically correct way, but it is not very efficient (not even if you have an index on the dateRegistered column). Note the extra derived table: MySQL allows neither a LIMIT inside an IN subquery nor selecting from the table you are deleting from, so the inner result has to be materialized first:
DELETE FROM table
WHERE id NOT IN (
    SELECT id FROM (
        SELECT id
        FROM table
        ORDER BY dateRegistered DESC
        LIMIT 1000
    ) AS keepers
);
I think you would be better off by limiting the DELETE directly by date instead of number of rows.
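For example (the 30-day window is made up; choose whatever matches roughly 1000 rows in your data):
DELETE FROM table WHERE dateRegistered < NOW() - INTERVAL 30 DAY;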
I don't think there is a way to set that up in the CREATE TABLE statement, at least not a portable one.
The only way that immediately occurs to me for this exact job is to do it manually.
First, get a lock on the table. You don't want the row count changing while you're doing this. (If a lock is not practical for your app, you'll have to work out a more clever queuing system rather than using this method.)
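In MySQL that could be a table lock, for example (a sketch, using the placeholder name from the question):
LOCK TABLES `table` WRITE;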
Next, get current row count:
SELECT count(*) FROM table
Once you have that, simple maths will tell you how many rows need deleting. Let's say it said 1005 - you need to delete 5 rows.
DELETE FROM table ORDER BY dateRegistered ASC LIMIT 5
Now, unlock the table.
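In MySQL terms that is simply:
UNLOCK TABLES;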
If a lock isn't practical for your scenario, you'll have to be a bit more clever - for example, select the unique IDs of all the rows that need deleting, and queue them for gradual deletion. I'll let you work that out yourself :)