I have an issue where I'm filtering a table by a bunch of different values. There are about 30 different filters on this table, and since I'm still a novice with MySQL I have it done in a stored procedure that executes multiple DELETE queries against a temporary table to filter. This example only shows the filter that I'm having issues with, which is a DELETE FROM table WHERE value IN () query.
Here's a test schema:
CREATE TABLE accounts (
user_id INT(11) NOT NULL AUTO_INCREMENT,
name VARCHAR(40) NOT NULL,
PRIMARY KEY(user_id)
);
CREATE TABLE blocked (
user_id INT(11) NOT NULL,
other_id INT(11) NOT NULL
);
INSERT INTO accounts (name) VALUES ('Chris'), ('Andy');
INSERT INTO blocked (user_id, other_id) VALUES (1, 2);
The queries create two tables: the accounts table containing two rows, and the blocked table containing one row where user_id 1 has user_id 2 blocked.
Here's the query that's causing the problem (please note that the real queries are more complex than shown here, but the DELETE query is exactly the same, and the issue persists in the test example provided):
BEGIN
#user_in is an INT(11) input value passed in via CALL FUNCTION(ID).
CREATE TEMPORARY TABLE IF NOT EXISTS filtered AS (SELECT * FROM accounts);
DELETE FROM filtered WHERE user_id IN (SELECT other_id FROM blocked WHERE blocked.user_id = user_in);
SELECT * FROM filtered;
END
This query should delete the row with the user_id field of 2, as in the blocked table the only row is (1, 2).
Running the SELECT query directly providing the user_id returns the other_id of 2.
SELECT other_id FROM blocked WHERE blocked.other_id = 2;
However, the stored procedure returns both rows, instead of just one. Why?
NOTE: The above query is to show what is returned when the query SELECT other_id FROM blocked WHERE blocked.user_id = user_in, another example would be SELECT other_id FROM blocked WHERE blocked.user_id = 1 granted user_in is set to 1. Both of these queries will return a set of (2) which would make the delete query look like DELETE FROM filtered WHERE user_id IN (2). This is not working, for whatever reason.
To get a simple select of those users, use the following query:
SELECT * FROM accounts WHERE accounts.user_id NOT IN (SELECT distinct blocked.other_id from blocked)
To do it with a single SELECT, without deleting rows from the temporary table, use the following query:
BEGIN
CREATE TEMPORARY TABLE IF NOT EXISTS filtered AS (SELECT * FROM accounts WHERE accounts.user_id NOT IN (SELECT distinct blocked.other_id from blocked));
SELECT * from filtered;
END
There is no need to select everything into the temporary table first and then delete specific rows.
Hope it helps
EDIT:
I've read the question and am still a bit confused about your problem. But I checked this solution and it works perfectly, so I don't understand what the problem with it is. In your procedure you have
DELETE FROM filtered WHERE user_id IN (SELECT other_id FROM blocked WHERE blocked.user_id = user_in);
and after that you say that
SELECT other_id FROM blocked WHERE blocked.other_id = 2;
And I can say that blocked.other_id and blocked.user_id are two different columns.
No disrespect, but mixing up columns is an amateur mistake. :)
The problem here is with this statement:
DELETE FROM filtered WHERE user_id IN (SELECT other_id FROM blocked WHERE blocked.other_id = user_id);
Try changing it to this:
DELETE FROM filtered WHERE user_id
IN (SELECT other_id FROM blocked);
The reason is that the blocked table has both an other_id and a user_id column. Inside the subquery, the unqualified user_id resolves to blocked.user_id, so where you are attempting to correlate with the filtered table you are in fact comparing the other_id and user_id columns of the blocked table only. Those are not equal, so no delete happens.
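For comparison, here is a minimal sketch that keeps the per-user filter from the question and skips the temporary table entirely. The procedure name get_unblocked_accounts and the parameter name p_user_id are made up for illustration; the p_ prefix is only there so the parameter cannot be confused with a user_id column. DELIMITER is a directive of the mysql command-line client, so other clients may need a different way to define the procedure.

```sql
-- Sketch only: procedure and parameter names are hypothetical.
DROP PROCEDURE IF EXISTS get_unblocked_accounts;

DELIMITER //
CREATE PROCEDURE get_unblocked_accounts(IN p_user_id INT)
BEGIN
    -- Return every account that p_user_id has NOT blocked.
    SELECT a.*
    FROM accounts AS a
    WHERE NOT EXISTS (
        SELECT 1
        FROM blocked AS b
        WHERE b.user_id  = p_user_id    -- the caller's own id
          AND b.other_id = a.user_id    -- the account being tested
    );
END //
DELIMITER ;

CALL get_unblocked_accounts(1);
```

With the test schema from the question, this call should return only the row for Chris, because user 1 has blocked user 2. Every column reference is qualified with a table alias, which avoids the kind of silent mix-up described above.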
I have a temporary table called 'tempaction'. I wanted to select rows where 'ActionID' matches that of another table. I got the safe update mode error, I think as ActionID is part of a compound primary key. However, when I try
UPDATE action
SET Status = 'Sent'
WHERE ActionID IN( select ActionID from tempaction)
AND DeviceID IN( select DeviceID from tempaction);
I get a "temporary table cannot be reopened" error.
Checking both parts of the primary key has worked for the safe update error in the past. I also understand that I cannot reference a temporary table twice in the same statement.
How can I select rows with matching ActionIDs, or matching ActionIDs AND DeviceIDs, from this temporary table?
Temporary table:
CREATE TEMPORARY TABLE tempaction (ActionID BIGINT)
SELECT *
FROM action
WHERE DeviceID = '1234'
AND Status = 'Pending'
You can try an UPDATE with a JOIN instead of the IN subqueries, so that the temporary table is referenced only once:
UPDATE action a
JOIN
tempaction t ON a.ActionID = t.ActionID
SET
a.Status = 'Sent';
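If the match has to be on both ActionID and DeviceID, one option is to put both conditions into the same join, which still references the temporary table only once. This is a sketch that assumes tempaction contains a DeviceID column (it should, since it was filled with SELECT * FROM action).

```sql
-- Match on both key parts in a single join, so tempaction
-- appears only once in the statement.
UPDATE action AS a
JOIN tempaction AS t
  ON  a.ActionID = t.ActionID
  AND a.DeviceID = t.DeviceID
SET a.Status = 'Sent';
```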
I know that deleting duplicates from MySQL is often discussed here, but none of the solutions work well in my case.
So, I have a DB with address data, roughly like this:
ID; Anrede; Vorname; Nachname; Strasse; Hausnummer; PLZ; Ort; Nummer_Art; Vorwahl; Rufnummer
ID is the primary key and unique.
And I have entries like these, for example:
1;Herr;Michael;Müller;Testweg;1;55555;Testhausen;Mobile;012345;67890
2;Herr;Michael;Müller;Testweg;1;55555;Testhausen;Fixed;045678;877656
The different phone numbers are not the problem, because they are not relevant for me. So I just want to delete the duplicates in Lastname, Street and Zipcode, in this case ID 1 or ID 2; which of the two doesn't matter.
So far I have tried it like this, with a DELETE:
DELETE db
FROM Import_Daten db,
Import_Daten dbl
WHERE db.id > dbl.id AND
db.Lastname = dbl.Lastname AND
db.Strasse = dbl.Strasse AND
db.PLZ = dbl.PLZ;
And insert into a copy table:
INSERT INTO Import_Daten_1
SELECT MIN(db.id),
db.Anrede,
db.Firstname,
db.Lastname,
db.Branche,
db.Strasse,
db.Hausnummer,
db.Ortsteil,
db.Land,
db.PLZ,
db.Ort,
db.Kontaktart,
db.Vorwahl,
db.Durchwahl
FROM Import_Daten db,
Import_Daten dbl
WHERE db.lastname = dbl.lastname AND
db.Strasse = dbl.Strasse And
db.PLZ = dbl.PLZ;
The complete table contains over 10 million rows, and the size is my actual problem. MySQL runs on a MAMP server on a MacBook with 1.5 GHz and 4 GB RAM, so it is not really fast. The SQL statements are run in phpMyAdmin. At the moment I have no other system available.
You can write a stored procedure that selects a different chunk of data on each pass (for example, by row number between two values) and deletes only from that range. This way you delete your duplicates slowly, bit by bit.
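A rough sketch of such a chunked procedure is shown here. It walks the table in id ranges and reuses the self-join DELETE from the question for each range. The procedure name delete_duplicates_in_chunks, the parameter p_chunk_size, and the local variables are invented for illustration, and the column names (Lastname, Strasse, PLZ) are taken from the question's DELETE statement. It also assumes ids are positive and that an index on (Lastname, Strasse, PLZ) exists, otherwise each pass is still expensive.

```sql
-- Sketch only: names are hypothetical; DELIMITER is a mysql client directive.
DROP PROCEDURE IF EXISTS delete_duplicates_in_chunks;

DELIMITER //
CREATE PROCEDURE delete_duplicates_in_chunks(IN p_chunk_size INT)
BEGIN
    DECLARE v_start BIGINT DEFAULT 0;
    DECLARE v_max   BIGINT;

    SELECT MAX(id) INTO v_max FROM Import_Daten;

    -- Process ids in the half-open range (v_start, v_start + p_chunk_size].
    WHILE v_start <= v_max DO
        DELETE db
        FROM Import_Daten AS db
        JOIN Import_Daten AS dbl
          ON  db.Lastname = dbl.Lastname
          AND db.Strasse  = dbl.Strasse
          AND db.PLZ      = dbl.PLZ
          AND db.id > dbl.id            -- keep the row with the lowest id
        WHERE db.id >  v_start
          AND db.id <= v_start + p_chunk_size;

        SET v_start = v_start + p_chunk_size;
    END WHILE;
END //
DELIMITER ;

CALL delete_duplicates_in_chunks(100000);
```

Because the row with the lowest id in each group is never deleted, every later chunk can still find it and remove its own duplicates, so the end result matches the single big DELETE.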
A more effective two-table solution can look like the following.
We can store only the data we really need to delete and only the fields that contain duplicate information.
Let's assume we are looking for duplicate data in the Lastname, Branche and Hausnummer fields.
Drop any helper table left over from a previous run:
DROP TABLE IF EXISTS data_to_delete;
Create the table and populate it with the duplicate groups (I assume all fields have the VARCHAR(255) type):
CREATE TABLE data_to_delete (
id BIGINT COMMENT 'this field will contain ID of row that we will not delete',
cnt INT,
Lastname VARCHAR(255),
Branche VARCHAR(255),
Hausnummer VARCHAR(255)
) AS SELECT
min(t1.id) AS id,
count(*) AS cnt,
t1.Lastname,
t1.Branche,
t1.Hausnummer
FROM Import_Daten AS t1
GROUP BY t1.Lastname, t1.Branche, t1.Hausnummer
HAVING count(*)>1 ;
Now let's delete the duplicate data and leave only one record from each set of duplicates:
DELETE Import_Daten
FROM Import_Daten INNER JOIN data_to_delete
ON Import_Daten.Lastname=data_to_delete.Lastname
AND Import_Daten.Branche=data_to_delete.Branche
AND Import_Daten.Hausnummer = data_to_delete.Hausnummer
WHERE Import_Daten.id != data_to_delete.id;
DROP TABLE data_to_delete;
You can add a new column e.g. uq and make it UNIQUE.
ALTER TABLE Import_Daten
ADD COLUMN `uq` BINARY(16) NULL,
ADD UNIQUE INDEX `uq_UNIQUE` (`uq` ASC);
When this is done, you can execute an UPDATE query like this:
UPDATE IGNORE Import_Daten
SET
uq = UNHEX(
MD5(
CONCAT(
Import_Daten.Lastname,
Import_Daten.Strasse,
Import_Daten.PLZ
)
)
)
WHERE
uq IS NULL;
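Because of IGNORE, any row whose hash would violate the unique index is skipped instead of updated. Once all entries have been processed, the duplicates are therefore the rows whose uq field is still NULL, and they can be removed.
Executing the query again at that point gives: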
0 row(s) affected, 1 warning(s): 1062 Duplicate entry...
For newly added rows, always create the uq hash, and consider using it as the primary key once all entries are unique.
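The removal step itself could look like the sketch below. One caveat worth checking first: CONCAT() returns NULL if any of its arguments is NULL, so rows with a NULL in one of the hashed columns also end up with uq = NULL even though they are not duplicates. The count query is only there as a sanity check before deleting.

```sql
-- How many rows have no hash (skipped duplicates, or rows with a NULL input)?
SELECT COUNT(*) AS rows_without_hash
FROM Import_Daten
WHERE uq IS NULL;

-- Remove them once the count looks plausible.
DELETE FROM Import_Daten
WHERE uq IS NULL;
```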
I'm currently working on a project with a MySQL Db of more than 8 million rows. I have been provided with a part of it to test some queries on it. It has around 20 columns out of which 5 are of use to me. Namely: First_Name, Last_Name, Address_Line1, Address_Line2, Address_Line3, RefundID
I have to create a unique but random RefundID for each row; that is not the problem. The problem is to create the same RefundID for those rows whose First_Name, Last_Name, Address_Line1, Address_Line2 and Address_Line3 are the same.
This is my first real work related to MySQL with such large row count. So far I have created these queries:
-- Creating Temporary Table --
CREATE temporary table tempT (SELECT tt.First_Name, count(tt.Address_Line1) as
a1, count(tt.Address_Line2) as a2, count(tt.Address_Line3) as a3, tt.RefundID
FROM `tempTable` tt GROUP BY First_Name HAVING a1 >= 2 AND a2 >= 2 AND a3 >= 2);
-- Updating Rows with First_Name from tempT --
UPDATE `tempTable` SET RefundID = FLOOR(RAND()*POW(10,11))
WHERE First_Name IN (SELECT First_Name FROM tempT WHERE First_Name is not NULL);
This update query keeps running and never finishes; tempT has more than 30K rows. The query will then be run on the main DB with more than 800K rows.
Can someone help me out with this?
Regards
The solutions that seem obvious to me....
Don't use a random value - use a hash:
UPDATE yourtable
-- MD5() takes a single string, so the parts are joined with CONCAT_WS first
SET refundid = MD5(CONCAT_WS('|', 'some static salt', First_Name
, Last_Name, Address_Line1, Address_Line2, Address_Line3));
The problem is that if you are using an integer value for the refundId, the hash has to be truncated, and then there's a good chance of getting a collision (hint: CONV(SUBSTR(MD5(...),1,16),16,10) gives a number that fits in an UNSIGNED BIGINT). But you didn't say what the type of the field was, nor how strict the 'unique' requirement was. It does carry out the update in a single pass, though.
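Spelled out, the integer variant of the hint might look like the following sketch. It assumes refundid is a BIGINT UNSIGNED column, which the question does not confirm. MD5() accepts a single string, so the fields are joined with CONCAT_WS first; note that CONCAT_WS skips NULL arguments.

```sql
-- Sketch: keep the first 16 hex digits (64 bits) of the MD5 hash
-- and convert them from base 16 to base 10.
UPDATE yourtable
SET refundid = CAST(
    CONV(
        SUBSTR(
            MD5(CONCAT_WS('|', 'some static salt', First_Name, Last_Name,
                          Address_Line1, Address_Line2, Address_Line3)),
            1, 16),
        16, 10)
    AS UNSIGNED);
```

Rows with identical name and address fields get the same value, since the hash depends only on those fields and the salt.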
An alternate approach, which creates a densely packed sequence of numbers, is to create a temporary table with the unique values from the original table and a random value. Order by the random value and set a monotonically increasing refundId, then use this as a lookup table or update the original table:
CREATE TEMPORARY TABLE temptable (refundId BIGINT) AS
SELECT d.*, RAND() AS randomvalue
FROM (SELECT DISTINCT First_Name
, Last_Name, Address_Line1, Address_Line2, Address_Line3
FROM yourtable) AS d;
SET @counter = -1;
UPDATE temptable SET refundId = (@counter := @counter + 1)
ORDER BY randomvalue;
There are other solutions too - but the more efficient ones rely on having multiple copies of the data and/or using a procedural language.
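The final step of the lookup-table approach, copying the ids back into the original table, could be sketched like this. It assumes temptable holds one row per distinct name and address combination together with its refundId. The NULL-safe comparison operator <=> is used so that rows with NULL address lines still find their match, since DISTINCT treats NULLs as equal.

```sql
-- Copy the per-group id from the lookup table back to every matching row.
UPDATE yourtable AS y
JOIN temptable AS t
  ON  y.First_Name    <=> t.First_Name
  AND y.Last_Name     <=> t.Last_Name
  AND y.Address_Line1 <=> t.Address_Line1
  AND y.Address_Line2 <=> t.Address_Line2
  AND y.Address_Line3 <=> t.Address_Line3
SET y.RefundID = t.refundId;
```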
Try using the following:
UPDATE `tempTable` x SET RefundID = FLOOR(RAND()*POW(10,11))
WHERE exists (SELECT 1 FROM tempT y WHERE First_Name is not NULL and x.First_Name=y.First_Name);
In MySQL, it is often more efficient to use join with update than to filter through the where clause using a subquery. The following might perform better:
UPDATE `tempTable` join
(SELECT distinct First_Name
FROM tempT
WHERE First_Name is not NULL
) fn
on `tempTable`.First_Name = fn.First_Name
SET RefundID = FLOOR(RAND()*POW(10,11));
Hi, I ran into a situation where I mistakenly ran the table's batch file, which consists of some INSERT statements, without dropping the table first. In detail:
I have a table, alert_priority, with records like
Id priority_name
--- --------------
1 P0
2 P1
3 P2
Then, by mistake, I executed the table's script file, which consists of some INSERT statements, without dropping alert_priority first, and now the records in the table look like this:
Id priority_name
--- --------------
1 P0
2 P1
3 P2
1 P0
2 P1
3 P2
Now I want to delete the extra records (the records after Id 3), so that I am left with all the records that were present before I executed the script file.
Although I have the option to drop the table and execute the script file once again, I wanted to know whether there is a way to do this with an SQL query.
I have no primary keys in the table.
First, consider making your ID field AI (auto increment) and even PK (primary key).
In order to remove the duplicated rows, we will create a new table and copy only the distinct rows into it.
After that, drop the original table.
CREATE TEMPORARY TABLE bad_temp AS SELECT DISTINCT * FROM alert_priority
You can copy all unique records into a new table, then delete the old table:
CREATE TABLE new_table AS SELECT DISTINCT * FROM old_table;
In SQL Server it would be easy using ROW_NUMBER, but alas MySQL (before version 8.0) doesn't have a function like that :-(
The best way to solve it would be as follows:
1. Create a new table identical in structure to the first, but with no data.
2. Use the query: INSERT INTO name_of_new_table SELECT DISTINCT * FROM name_of_old_table
3. Drop the old table.
4. Rename the new table to whatever the old table was called.
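The steps above can be sketched in MySQL as follows. The names alert_priority_new and alert_priority_old are placeholders chosen for this example; this variant renames the old table out of the way first and drops it only at the end, so the original data stays available until the swap has succeeded.

```sql
-- 1. Empty copy with the same structure.
CREATE TABLE alert_priority_new LIKE alert_priority;

-- 2. Copy each distinct row exactly once.
INSERT INTO alert_priority_new
SELECT DISTINCT * FROM alert_priority;

-- 3. and 4. Swap the tables, then drop the old one.
RENAME TABLE alert_priority     TO alert_priority_old,
             alert_priority_new TO alert_priority;
DROP TABLE alert_priority_old;
```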
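CREATE TABLE new_tbl (
id INT AUTO_INCREMENT PRIMARY KEY, -- an AUTO_INCREMENT column must be a key
priority_name VARCHAR(255) -- the column type is an assumption
);
INSERT INTO new_tbl (priority_name)
SELECT priority_name FROM old_tbl GROUP BY priority_name;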
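To just delete the duplicate new rows and leave the old ones in place (on the basis that I assume there are already other tables whose rows refer to the original rows), use the query below. Note that it relies on the re-inserted rows having received higher Id values than the originals; if the copies carry identical Ids, as in the sample data, it removes both copies.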
DELETE FROM alert_priority
WHERE Id IN (SELECT MaxId
FROM (SELECT priority_name, MAX(Id) AS MaxId, COUNT(Id) AS CountId
FROM alert_priority
GROUP BY priority_name
HAVING CountId > 1) AS dup);
The following query gives you the ids of all records that you want to keep:
SELECT min(id)
FROM alert_priority
GROUP BY priority_name
HAVING count(*) > 1
OR min(id) = max(id)
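To remove all duplicates, run this query (the inner SELECT is wrapped in a derived table because MySQL does not allow a subquery to select directly from the table being deleted from):
DELETE FROM alert_priority
WHERE id NOT IN (
SELECT keep_id FROM (
SELECT min(id) AS keep_id
FROM alert_priority
GROUP BY priority_name
HAVING count(*) > 1
OR min(id) = max(id)
) AS kept_rows
);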