Basically I am creating a summary table.
The issue is that sometimes the data in the primary table is modified manually. I am using an ON DUPLICATE KEY UPDATE, however I also need something like ON MISSING KEY DELETE. The summary needs to update to the changed data.
Is the best solution really to simply delete all summary records and re-run the INSERT SELECT query? It just doesn't seem like a good idea.
Any keys not in the select query, should not be in the summary table.
After you've populated the summary_table, you could do this:
DELETE s FROM summary_table s LEFT OUTER JOIN original_table o ON s.id = o.id
WHERE o.id IS NULL;
That will remove from summary_table any rows where the id no longer exists in the original_table.
I don't think there's any way you can do this in one statement.
i'm not sure i understand, but it sounds like you want triggers on the primary table for INSERT and UPDATE that add to the summary, and another trigger on DELETE that subtracts from the summary...
If the summary query is fast and changes are sporadic, you can just rerun the summary. You might consider using triggers so you remove 1 when deleting, add 1 when inserting etc
Related
I want to do the following
Delete all the records from a table called survey and then set the status of deleted reference numbers to zero(0) which is in other table.
The query which I am using is below
Update ref_numbers rn SET rn.status = 0 where ref_no IN (DELETE FROM survey WHERE id < 302)
But this query is not working, I have also tired to modify the delete query like DELETE ref_no FROM survey...... but still it's not working. I think I am missing something which i don't know.
Any help would be highly appreciated.
You can't delete and update in the same operation. You can solve this problem better using either (1) a trigger -- maintain the reference count using an AFTER trigger on the child table, or (2) a view -- create the reference count as a correlated subquery in a view of the parent table.
I have an sql query that finds and groups these duplicates using very complicated conditions:
SELECT right(post_url, LOCATE('-', REVERSE(post_url),LOCATE('-',REVERSE(post_url))+1) -1) as name,
left(post_name,LOCATE('-',post_url,LOCATE('-',post_url)+1) - 1) as city,
post_title as original,ID,post_name,count(*)
FROM table WHERE post_type='finder'
GROUP BY name,city having count(*) > 1
To explain the query, post_url is basically a url name, ending with the name of someone, e.g : new-jersey-something-something-donald-t
I go to the second dash from the right and get the name that way. Then I get the city/state which is in the second dash from the left. In this manner, I've successfully found the duplicates in this database-but I'm having trouble thinking of a way to isolate the duplicate and delete it. In addition, I only want to delete the copy that does not have %near% in post_url. my question is, using the query here, how would I change this to delete the duplicate?
You're not going to be able to do it in one query. That's because you need to write a query that looks something like this:
DELETE FROM table
WHERE id IN (SELECT ... FROM table WHERE ...)
MySQL specifically prohibits this. You can't delete based on a subquery that references the same table. You also can't rewrite this query using JOINs.
There is an easy solution, though: use a temporary table and two queries.
-- build the list of IDs to delete
CREATE TEMPORARY TABLE temp
SELECT ... FROM table WHERE ...
-- now delete those items
DELETE FROM table
WHERE id IN (SELECT id FROM temp);
You can improve performance with JOINs and indexes.
The key to "isolating" the duplicates is to ensure that every item you want to delete has a primary key - that way you can easily build a list of IDs to delete. If your table don't have primary keys, you are reduced to doing WHERE clauses and JOINs on multiple columns - that gets messy very quickly.
So I know in MySQL it's possible to insert multiple rows in one query like so:
INSERT INTO table (col1,col2) VALUES (1,2),(3,4),(5,6)
I would like to delete multiple rows in a similar way. I know it's possible to delete multiple rows based on the exact same conditions for each row, i.e.
DELETE FROM table WHERE col1='4' and col2='5'
or
DELETE FROM table WHERE col1 IN (1,2,3,4,5)
However, what if I wanted to delete multiple rows in one query, with each row having a set of conditions unique to itself? Something like this would be what I am looking for:
DELETE FROM table WHERE (col1,col2) IN (1,2),(3,4),(5,6)
Does anyone know of a way to do this? Or is it not possible?
You were very close, you can use this:
DELETE FROM table WHERE (col1,col2) IN ((1,2),(3,4),(5,6))
Please see this fiddle.
A slight extension to the answer given, so, hopefully useful to the asker and anyone else looking.
You can also SELECT the values you want to delete. But watch out for the Error 1093 - You can't specify the target table for update in FROM clause.
DELETE FROM
orders_products_history
WHERE
(branchID, action) IN (
SELECT
branchID,
action
FROM
(
SELECT
branchID,
action
FROM
orders_products_history
GROUP BY
branchID,
action
HAVING
COUNT(*) > 10000
) a
);
I wanted to delete all history records where the number of history records for a single action/branch exceed 10,000. And thanks to this question and chosen answer, I can.
Hope this is of use.
Richard.
Took a lot of googling but here is what I do in Python for MySql when I want to delete multiple items from a single table using a list of values.
#create some empty list
values = []
#continue to append the values you want to delete to it
#BUT you must ensure instead of a string it's a single value tuple
values.append(([Your Variable],))
#Then once your array is loaded perform an execute many
cursor.executemany("DELETE FROM YourTable WHERE ID = %s", values)
Does anyone know what would be more efficient and use less resources:
Method 1-- Using a single SELECT statement to get data from one table and then iterating through it to execute multiple UPDATEs on another table. E.G. (pseudo-code, execute() runs query):
Query1_resultset = execute("SELECT item_id, sum(views) as view_count FROM tableA WHERE condition=1");
while(Query1_resultset as row) {
execute("UPDATE tableB SET view_count=row.view_count WHERE id=row.item_id");
}
Method 2-- Use a single INSERT.. ON DUPLICATE KEY UPDATE statement with a nested SELECT statement. E.G.:
INSERT INTO tableB (id, view_count) SELECT item_id, SUM(views) as view_count FROM tableA WHERE condition=1 ON DUPLICATE KEY UPDATE view_count=VALUES(view_count);
Note: ID on tableB is a primary key. There actually won't be any INSERTS because I know the key will exist. So it's all UPDATEs. Just using this statement to pass in a single query rather than multiple.
I'm really curious as to why either would be more efficient. Is it the number of queries that determines how quickly it will run? Where is the bottleneck?
I'm looking for something that will scale (the number of rows being updated grows daily).
Any ideas?
Thanks
It depens on your update/insert ratio. If you have lots of inserts and only a couple of updates than the INSERT ... ON DUPLICATE KEY UPDATE statement will be faster.
If you mainly have updates, than you would be better off with an UPDATE statement and an insert as fallback (if there was no update). You could use the multi table update clause to do it with a single update instead of a select followed by an update by the way. If you're doing both a SELECT and an UPDATE than the INSERT will definately be faster.
I think INSERT.. ON DUPLICATE KEY UPDATE is more efficient (otherwise, it wouldn't make much sense to add such an extension). By the way, your first example is not exactly the same as the second one - you neither use transactions nor you lock the table, so it's possible that the record returned by SELECT will not exist by the time you execute UPDATE.
Is there a way to remove all repeat rows from a MySQL database?
A couple of years ago, someone requested a way to delete duplicates. Subselects make it possible with a query like this in MySQL 4.1:
DELETE FROM some_table WHERE primaryKey NOT IN
(SELECT MIN(primaryKey) FROM some_table GROUP BY some_column)
Of course, you can use MAX(primaryKey) as well if you want to keep the newest record with the duplicate value instead of the oldest record with the duplicate value.
To understand how this works, look at the output of this query:
SELECT some_column, MIN(primaryKey) FROM some_table GROUP BY some_column
As you can see, this query returns the primary key for the first record containing each value of some_column. Logically, then, any key value NOT found in this result set must be a duplicate, and therefore it should be deleted.
These questions / answers might interest you :
How to delete duplicate records in mysql database?
How to delete Duplicates in MySQL table.
And idea that's often used when you are working with a big table is to :
Create a new table
Insert into that table the unique records (i.e. only one version of the duplicates in the original table, generally using a select distinct)
and use that new table in your application ; or drop the old table and rename the new one to the old name.
Good thing with this principle is you have the possibility to verify what's in the new table before dropping the old one -- always nice to check that sort of thing ^^