I have a MySQL table (TABLE1) with 400 thousand records:
CREATE TABLE `TABLE1` (
`ID` bigint(20) NOT NULL AUTO_INCREMENT,
`NAME` varchar(255) NOT NULL,
`VALUE` varchar(255) NOT NULL,
`UID` varchar(255) NOT NULL,
`USER_ID` varchar(255) DEFAULT NULL,
PRIMARY KEY (`ID`),
UNIQUE KEY `ukey1` (`VALUE`,`NAME`,`UID`),
UNIQUE KEY `ukey2` (`UID`,`NAME`,`VALUE`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `TABLE2` (
`ID` bigint(20) NOT NULL AUTO_INCREMENT,
`UID` varchar(255) DEFAULT NULL,
`TABLE3ID` bigint(20) NOT NULL,
PRIMARY KEY (`ID`),
KEY `FKEY` (`TABLE3ID`),
CONSTRAINT `FKEY` FOREIGN KEY (`TABLE3ID`) REFERENCES `TABLE3` (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `TABLE3` (
`ID` bigint(20) NOT NULL AUTO_INCREMENT,
`TYPEID` bigint(20) NOT NULL,
PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
The following query is very slow: it runs for hours and finally fails.
delete t1 from TABLE1 t1
inner join TABLE2 t2 on t1.UID=t2.UID
inner join TABLE3 t3 on t2.TABLE3ID=t3.ID
where t3.TYPEID in (234,3434) and t1.USER_ID is not null and t1.USER_ID <> '12345';
Visual explain shows the following, and adding an index on UID is not helping. How can I optimize the performance of this query?
What I have tried so far:
I tried adding an index on TABLE1.UID.
I tried converting the join into a subquery.
Even a simple query like SELECT * FROM TABLE3 where UID="SOMEUID" takes 800+ ms to fetch data.
Change it to a JOIN.
DELETE t1
FROM TABLE1 AS t1
JOIN (SELECT uid FROM ...) AS t2 ON t1.uid = t2.uid
WHERE USER_ID is not null and USER_ID <> '12345';
I've found that MySQL implements WHERE uid IN (subquery) very poorly sometimes. Instead of getting all the results of the subquery and looking them up in the index of the table, it scans the table and performs the subquery for each row, then checks if the uid is in that result.
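Applied to the tables in the question, the derived-table join might look like the sketch below. The inner SELECT is only a guess at what would replace the elided subquery, based on the joins in the original DELETE, and the alias d is made up for illustration. Running the COUNT first shows how many rows the DELETE would touch.

```sql
-- Preview: how many TABLE1 rows would the DELETE remove?
SELECT COUNT(*)
FROM TABLE1 AS t1
JOIN (
    -- Assumed replacement for the elided subquery: UIDs reachable via TABLE2/TABLE3.
    SELECT DISTINCT t2.UID
    FROM TABLE2 AS t2
    JOIN TABLE3 AS t3 ON t3.ID = t2.TABLE3ID
    WHERE t3.TYPEID IN (234, 3434)
) AS d ON d.UID = t1.UID
WHERE t1.USER_ID IS NOT NULL AND t1.USER_ID <> '12345';

-- Same join, this time deleting the matching TABLE1 rows.
DELETE t1
FROM TABLE1 AS t1
JOIN (
    SELECT DISTINCT t2.UID
    FROM TABLE2 AS t2
    JOIN TABLE3 AS t3 ON t3.ID = t2.TABLE3ID
    WHERE t3.TYPEID IN (234, 3434)
) AS d ON d.UID = t1.UID
WHERE t1.USER_ID IS NOT NULL AND t1.USER_ID <> '12345';
```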
First of all, make a backup of that table. This is the first rule for running delete queries, because a mistake can ruin the data; take whatever other precautions you consider necessary beforehand.
( uid1,uid2,...uid45000)
What do the values between the parentheses mean? Do you need to compare against all the UID values in the list, or only some of them?
I ask because you can avoid listing all the UIDs manually, like this:
delete from TABLE1 where UID in (SELECT T.UID FROM TABLE1 as T where T.UID is not NULL and USER_ID <> '12345');
Before doing this, please check what you want between the parentheses, and run the command in a TEST environment first with dummy values.
Take into consideration that the UID field is a varchar, which is why this operation takes much longer than it would with integer values.
The other way is to create a new table and copy into it the data you need to keep from the old table, then truncate the original table and reinsert the values from the new table into the old one.
Before running any solution, please check all your restrictions with your teammates and test with dummy values.
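The copy, truncate and reinsert approach could look roughly like this. It is a sketch under assumptions: TABLE1_keep is a hypothetical staging table name, the keep condition is the negation of the DELETE filter from the question, and no other table has a foreign key pointing at TABLE1.

```sql
-- TABLE1_keep is a hypothetical staging table with the same columns and indexes.
CREATE TABLE TABLE1_keep LIKE TABLE1;

-- Copy only the rows the original DELETE would have left in place.
INSERT INTO TABLE1_keep
SELECT t1.*
FROM TABLE1 AS t1
WHERE t1.USER_ID IS NULL
   OR t1.USER_ID = '12345'
   OR NOT EXISTS (
        SELECT 1
        FROM TABLE2 AS t2
        JOIN TABLE3 AS t3 ON t3.ID = t2.TABLE3ID
        WHERE t2.UID = t1.UID
          AND t3.TYPEID IN (234, 3434)
      );

-- TRUNCATE cannot be rolled back, so compare the row counts before going on.
SELECT (SELECT COUNT(*) FROM TABLE1)      AS before_rows,
       (SELECT COUNT(*) FROM TABLE1_keep) AS kept_rows;

TRUNCATE TABLE TABLE1;
INSERT INTO TABLE1 SELECT * FROM TABLE1_keep;
DROP TABLE TABLE1_keep;
```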
I would split your uid filter list into chunks (100 per chunk or some other size; you need to test) and iterate or multithread over them.
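A single chunk might look like the statement below. The three UID literals are placeholders; a real chunk would hold around 100 values taken from your list, and the statement would be repeated for each chunk.

```sql
-- One chunk of the UID list; 'uid1' ... 'uid3' are placeholder values.
DELETE FROM TABLE1
WHERE UID IN ('uid1', 'uid2', 'uid3')
  AND USER_ID IS NOT NULL
  AND USER_ID <> '12345';
-- Repeat with the next chunk until the list is exhausted.
```

With autocommit enabled each chunk commits on its own, so a failure part way through only loses the current chunk.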
I have two tables with the following structure and example content. Table one has membership_no set to the correct values, but table two has some incorrect values in the membership_no column. I need to query both tables, check where the membership_no values are not equal, and then update table two's membership_no column with the value from table one.
Table One:
id membership_no
====================
800960 800960
800965 800965
Table Two:
id membership_no
====================
800960 800970
800965 800975
Update query so far. It is not catching all of the incorrect values from table two.
UPDATE
tabletwo
INNER JOIN
tableone ON tabletwo.id = tableone.id
SET
tabletwo.membership_no = tableone.membership_no;
EDIT: Including SHOW CREATE and SELECT queries for unmatched membership_no column values.
Table One SHOW:
CREATE TABLE `n2z7m3_kiduka_accounts_j15` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`membership_no` int(11) NOT NULL,
...
`membershipyear` varchar(100) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `user_id` (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=800987 DEFAULT CHARSET=utf8
Table Two SHOW:
CREATE TABLE `n2z7m3_kiduka_accounts` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`membership_no` int(11) NOT NULL,
...
`membershipyear` varchar(100) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `user_id` (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=801072 DEFAULT CHARSET=utf8
SELECT query for unmatched membership_no column values:
SELECT
u.name,
a.membership_no as 'Joomla 1.5 accounts table',
j.membership_no as 'Joomla 3.0 accounts table'
FROM
n2z7m3_kiduka_accounts_j15 AS a
INNER JOIN n2z7m3_users AS u ON a.user_id = u.id
INNER JOIN n2z7m3_kiduka_accounts AS j ON a.user_id = j.membership_no
and a.membership_no != j.membership_no
ORDER BY u.name;
While Tim's Answer is perfectly valid, another variation is to add the filter qualifier to the ON clause such that:
UPDATE tabletwo
INNER JOIN
tableone ON tabletwo.id = tableone.id AND tabletwo.membership_no <> tableone.membership_no
SET
tabletwo.membership_no = tableone.membership_no;
This means you don't need the WHERE filter: the join still processes all rows, but the UPDATE acts only on those with differing membership_no values. Because it is an INNER JOIN, a row is produced only when both tables match; otherwise it is skipped.
EDIT:
If you suspect you still have a problem, what does MySQL respond to the command? Do you get a specific error notice? With 80k rows, it may take a while for the command to actually process, so are you giving it time to complete, or is PHP or the system aborting it when the execution time expires? (Raise the execution time limits in PHP and MySQL and rerun the query, just to see whether it then completes successfully.)
Suggestion
As another suggestion, I think your UNIQUE KEY should also be your AUTO_INCREMENT (A_I) key, so for both tables:
DROP INDEX `user_id` ON <table> #removes the current unique index.
then
CREATE UNIQUE INDEX `id` ON <table> (`id`) #adds a unique index to the A_I column.
You just need to add a WHERE clause:
UPDATE
tabletwo
INNER JOIN
tableone
ON tabletwo.id = tableone.id
SET
tabletwo.membership_no = tableone.membership_no
WHERE tabletwo.membership_no <> tableone.membership_no
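Before running the UPDATE, it may help to count what it will touch. The sketch below uses the same simplified table names as the question. The second query is an extra check of my own: it counts tabletwo rows that have no partner in tableone on id, which the INNER JOIN would silently skip.

```sql
-- Rows the UPDATE would change.
SELECT COUNT(*) AS rows_to_fix
FROM tabletwo
INNER JOIN tableone ON tabletwo.id = tableone.id
WHERE tabletwo.membership_no <> tableone.membership_no;

-- Rows in tabletwo that the INNER JOIN never sees.
SELECT COUNT(*) AS unmatched_rows
FROM tabletwo
LEFT JOIN tableone ON tabletwo.id = tableone.id
WHERE tableone.id IS NULL;
```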
I am trying to delete all rows from table1 that have a matching PK in table2. I am getting error 1175 even though my WHERE clause uses a key. I'm familiar with toggling safe mode, but that should not be necessary, as my WHERE clause does contain the Primary Key of both tables. Any suggestions to resolve this would be greatly appreciated. Further details below.
table1 structure:
CREATE TABLE `table1` (
`pkfield` varchar(10) NOT NULL,
`field1` varchar(3) DEFAULT NULL,
`field2` varchar(1) DEFAULT NULL,
`field3` varchar(1) DEFAULT NULL,
PRIMARY KEY (`pkfield`),
UNIQUE KEY `pkfield_UNIQUE` (`pkfield`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
table2 structure:
CREATE TABLE `table2` (
`pkfield` varchar(10) NOT NULL,
PRIMARY KEY (`pkfield`),
UNIQUE KEY `pkfield_UNIQUE` (`pkfield`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Delete Query:
DELETE table1.*, table2.* FROM table1 INNER JOIN table2
WHERE table1.pkfield=table2.pkfield;
Action Output Response:
Error Code: 1175. You are using safe update mode and you tried to update a table without a WHERE that uses a KEY column To disable safe mode, toggle the option in Preferences -> SQL Queries and reconnect.
Thank you.
I think an EXISTS clause might accomplish what you described:
DELETE
FROM table1 t1
WHERE EXISTS
(SELECT 'x'
FROM table2 t2
where t2.pkfield = t1.pkfield)
I included a mock condition in the WHERE part of the query to bypass the Safe Update though I still don't know why the previous condition doesn't fulfill the requirement of having a PK in the WHERE.
DELETE FROM table1 WHERE (pkfield IN (SELECT pkfield FROM table2)) and pkfield<>""
The mock condition just checks that the field is not empty, which it never would be, since it is the table's key.
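If toggling safe mode for a single statement is acceptable, another option is to switch it off for the current session only. This is a sketch, assuming you only want to delete from table1 (as the question text says) and that your account is allowed to change the session variable:

```sql
-- Disable safe update mode for this session only.
SET SQL_SAFE_UPDATES = 0;

-- Delete the table1 rows whose key also exists in table2; table2 is left untouched.
DELETE table1
FROM table1
INNER JOIN table2 ON table1.pkfield = table2.pkfield;

-- Turn safe update mode back on.
SET SQL_SAFE_UPDATES = 1;
```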
I have the following (simplified) database schema:
CREATE TABLE IF NOT EXISTS `wm_renderings` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`formula_id` int(11) NOT NULL,
`creation_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`svg` text COLLATE utf8_bin NOT NULL,
PRIMARY KEY (`id`),
KEY `formula_id` (`formula_id`)
);
where formula_id is a foreign key.
I want to get the latest rendering for every formula_id. But when I write
SELECT `id`, `formula_id`, `svg`
FROM `wm_renderings`
GROUP BY `formula_id`
ORDER BY `creation_time` DESC
I would get a "random" rendering for each formula_id.
My approach would be to get all formula ids and then send a query for every single formula_id:
SELECT `id`, `formula_id`, `svg`
FROM `wm_renderings`
WHERE `formula_id` = 42
ORDER BY `creation_time` DESC
LIMIT 1
However, that would be a lot of queries.
Can I get the same with only one query?
There are various ways to get the max/latest row per group; one of them is a left join:
select t1.* from wm_renderings t1
left join wm_renderings t2 on t1.formula_id = t2.formula_id
and t1.creation_time < t2.creation_time
where t2.id is null
Here is the documentation on it
http://dev.mysql.com/doc/refman/5.0/en/example-maximum-column-group-row.html
The left join and the uncorrelated sub-query are considered to be better in terms of performance.
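For comparison, the uncorrelated sub-query variant for the same table might look like this sketch. The aliases r and m and the column alias latest are made up for illustration. If two renderings of one formula share the same creation_time, both rows are returned.

```sql
SELECT r.id, r.formula_id, r.svg
FROM wm_renderings AS r
JOIN (
    -- Latest timestamp per formula, computed once for the whole table.
    SELECT formula_id, MAX(creation_time) AS latest
    FROM wm_renderings
    GROUP BY formula_id
) AS m
  ON m.formula_id = r.formula_id
 AND m.latest = r.creation_time;
```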
I have two tables, which I need to merge, and they are:
CREATE TABLE IF NOT EXISTS `legacy_bookmarks` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`url` text,
`title` text,
`snippet` text,
`datetime` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `datetime` (`datetime`),
FULLTEXT KEY `title` (`title`,`snippet`)
)
And:
CREATE TABLE IF NOT EXISTS `legacy_links` (
`id` mediumint(11) NOT NULL AUTO_INCREMENT,
`user_id` mediumint(11) NOT NULL,
`bookmark_id` int(11) NOT NULL,
`status` enum('public','private') NOT NULL DEFAULT 'public',
UNIQUE KEY `id` (`id`),
KEY `bookmark_id` (`bookmark_id`)
)
As you can see, "legacy_links" contains the ID for "legacy_bookmarks". Am I able to merge the two, based on this relationship?
I can easily change the name of the ID column in "legacy_bookmarks" to "bookmark_id", if that makes things any easier.
Just so you know, the order of the columns, and their types, must be exact, because the data from this combined table is then to be imported into the new "bookmarks" table.
Also, I'd need to be able to include additional columns (a "modification" column, populated with the "datetime" values), and change the order of the ones I have.
Any takers?
[Up to you to change the order of the columns]
CREATE TABLE `legacy_linkss` AS
SELECT b.id, b.url, b.title, b.snippet, b.datetime AS modification, l.user_id, l.status
FROM
`legacy_links` l
JOIN `legacy_bookmarks` b ON b.id = l.bookmark_id
;
Afterwards, once you have checked the consistency and manually added the constraints, you may:
DROP TABLE `legacy_links`;
DROP TABLE `legacy_bookmarks`;
RENAME TABLE `legacy_linkss` TO `legacy_links`;
Yes, it's called a join, and you would do it like so:
SELECT *
FROM legacy_bookmarks lb
INNER JOIN legacy_links ll ON ll.bookmark_id = lb.id
I have a table that contains two bigint columns: beginNumber, endNumber, defined as UNIQUE. The ID is the Primary Key.
ID | beginNumber | endNumber | Name | Criteria
The second table contains a number. I want to retrieve the record from table1 when the Number from table2 falls between its two numbers. This is the query:
select distinct t1.Name, t1.Country
from t1
where t2.Number
BETWEEN t1.beginIpNum AND t1.endNumber
The query is taking too much time as I have so many records. I don't have experience with databases, but I read that indexing the table will improve the search, so that MySQL does not have to pass through every row searching for my Number, and that this can be done by, for example, having UNIQUE values. I made beginNumber and endNumber in table1 UNIQUE. Is this all I can do? Is there any possible way to improve the time? Please provide detailed answers.
EDIT:
table1:
CREATE TABLE `t1` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`beginNumber` bigint(20) DEFAULT NULL,
`endNumber` bigint(20) DEFAULT NULL,
`Name` varchar(255) DEFAULT NULL,
`Criteria` varchar(455) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `beginNumber_UNIQUE` (`beginNumber`),
UNIQUE KEY `endNumber_UNIQUE` (`endNumber`)
) ENGINE=InnoDB AUTO_INCREMENT=327 DEFAULT CHARSET=utf8
table2:
CREATE TABLE `t2` (
`id2` int(11) NOT NULL AUTO_INCREMENT,
`description` varchar(255) DEFAULT NULL,
`Number` bigint(20) DEFAULT NULL,
PRIMARY KEY (`id2`),
UNIQUE KEY `description_UNIQUE` (`description`)
) ENGINE=InnoDB AUTO_INCREMENT=433 DEFAULT CHARSET=utf8
This is a toy example of the tables but it shows the concerned part.
I'd suggest an index on t2.Number like this:
ALTER TABLE t2 ADD INDEX numindex(Number);
Your query won't work as written because it won't know which t2 to use. Try this:
SELECT DISTINCT t1.Name, t1.Criteria
FROM t1
WHERE EXISTS (SELECT * FROM t2 WHERE t2.Number BETWEEN t1.beginNumber AND t1.endNumber);
Without the t2.Number index EXPLAIN gives this query plan:
id | select_type | table | type | rows | Extra
1 | PRIMARY | t1 | ALL | 1 | Using where; Using temporary
2 | DEPENDENT SUBQUERY | t2 | ALL | 1 | Using where
With an index on t2.Number, you get this plan:
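id | select_type | table | type | possible_keys | key | key_len | rows | Extra
1 | PRIMARY | t1 | ALL | | | | 1 | Using where; Using temporary
2 | DEPENDENT SUBQUERY | t2 | index | numindex | numindex | 9 | 1 | Using where; Using index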
The important part to understand is that type ALL means a full table scan, which is slower than reading from an index.
This is a good place to use a B-tree index. Note that for InnoDB tables B-tree is already the default index type (hash is the default only for MEMORY tables). B-tree indexes work best when you often sort or use BETWEEN on a column.
CREATE INDEX index_name
ON table_name (column_name)
USING BTREE
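For the tables in the question, a concrete version might look like the sketch below. The index name t2_number_idx is hypothetical, and the CREATE INDEX should be skipped if t2.Number is already indexed. SHOW INDEX lists an Index_type column for every index, so you can check which type each one actually uses.

```sql
-- t2_number_idx is a hypothetical name; skip this if t2.Number already has an index.
CREATE INDEX t2_number_idx ON t2 (Number) USING BTREE;

-- The Index_type column shows the type of each index on the table.
SHOW INDEX FROM t2;
SHOW INDEX FROM t1;
```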