MySQL Query with Subqueries taking longer than it should - mysql

I've been trying to find the cause of the slowdown in this query. It was originally a DELETE query, but I've been using SELECT * FROM to test it.
This is the query in question:
SELECT * FROM table1
where table1.id IN (
#Per a friend's suggestion, I wrapped the subquery in a subquery (yo dawg) to "cache" it. It works on other queries, but not this time.
SELECT id FROM (
(
SELECT id FROM (
SELECT table1.id FROM table1
LEFT JOIN table2 ON table2.id = table1.salesperson_id
LEFT JOIN table3 ON table3.id = table2.user_id
LEFT JOIN table4 ON table3.office_id = table4.id
WHERE table1.type = "Snapshot"
AND table4.id = 25 OR table4.parent_id =25
LIMIT 500
) AS ids )
) AS moreIds
)
The table in question is 16 GB.
The server it's being run against is beefy enough not to be a bottleneck.
The fields id, salesperson_id and type are all indexed. I checked it 5 times.
The subquery itself runs extremely fast. Subquery:
SELECT id FROM (
SELECT table1.id FROM table1
LEFT JOIN table2 ON table2.id = table1.salesperson_id
LEFT JOIN table3 ON table3.id = table2.user_id
LEFT JOIN table4 ON table3.office_id = table4.id
WHERE table1.type = "Snapshot"
AND table4.id = 25 OR table4.parent_id =25
LIMIT 500
) AS ids
In the processlist the query is stuck in the state of "SENDING DATA". But Workbench indicates that the query is still running.
Here's an EXPLAIN SELECT of the query (columns: id, select_type, table, type, possible_keys, key, key_len, ref, rows, Extra)
'1', 'PRIMARY', 'table1', 'index', NULL, 'SALES_FK_ON_SALES_STATE', '5', NULL, '36688459', 'Using where; Using index'
'2', 'DEPENDENT SUBQUERY', '<derived3>', 'ALL', NULL, NULL, NULL, NULL, '500', 'Using where'
'3', 'DERIVED', '<derived4>', 'ALL', NULL, NULL, NULL, NULL, '500', ''
'4', 'DERIVED', 'table4', 'index_merge', 'PRIMARY,IDX_9F61CEFC727ACA70', 'PRIMARY,IDX_9F61CEFC727ACA70', '4,5', NULL, '67', 'Using union(PRIMARY,IDX_9F61CEFC727ACA70); Using where; Using index'
'4', 'DERIVED', 'table3', 'ref', 'PRIMARY,IDX_C077730FFFA0C224', 'IDX_C077730FFFA0C224', '5', 'hugeDb.table4.id', '381', 'Using where; Using index'
'4', 'DERIVED', 'table2', 'ref', 'PRIMARY,UNIQ_36E3BDB1A76ED395', 'UNIQ_36E3BDB1A76ED395', '5', 'hugeDb.table3.id', '1', 'Using where; Using index'
'4', 'DERIVED', 'table1', 'ref', 'SALESPERSON,SALES_FK_ON_SALES_STATE', 'SALES_FK_ON_SALES_STATE', '5', 'hugeDb.table2.id', '115', 'Using where'
Here are the SHOW CREATE TABLES
CREATE TABLE `table4` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`logo_file_id` int(11) DEFAULT NULL,
`contact_address_id` int(11) DEFAULT NULL,
`billing_address_id` int(11) DEFAULT NULL,
`parent_id` int(11) DEFAULT NULL,
`name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`url` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`fax` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`contact_name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`active` tinyint(1) NOT NULL,
`date_modified` datetime DEFAULT NULL,
`date_created` datetime NOT NULL,
`license_number` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`list_name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`email` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`routing_address_id` int(11) DEFAULT NULL,
`billed_separately` tinyint(1) DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `UNIQ_9F61CEFCA7E1931C` (`logo_file_id`),
KEY `IDX_9F61CEFC320EF6E2` (`contact_address_id`),
KEY `IDX_9F61CEFC79D0C0E4` (`billing_address_id`),
KEY `IDX_9F61CEFC727ACA70` (`parent_id`),
KEY `IDX_9F61CEFC40F0487C` (`routing_address_id`),
-- CONSTRAINT `FK_9F61CEFC320EF6E2` FOREIGN KEY (`contact_address_id`) REFERENCES `other_irrelevant_table` (`id`),
-- CONSTRAINT `FK_9F61CEFC79D0C0E4` FOREIGN KEY (`billing_address_id`) REFERENCES `other_irrelevant_table` (`id`),
-- CONSTRAINT `FK_9F61CEFCA7E1931C` FOREIGN KEY (`logo_file_id`) REFERENCES `other_irrelevant_table` (`id`),
-- CONSTRAINT `FK_9F61CEFCE346079F` FOREIGN KEY (`routing_address_id`) REFERENCES `other_irrelevant_table` (`id`),
CONSTRAINT `FK_9F61CEFC727ACA70` FOREIGN KEY (`parent_id`) REFERENCES `table4` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=750 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `table3` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`office_id` int(11) DEFAULT NULL,
`user_id` int(11) DEFAULT NULL,
`active` tinyint(1) NOT NULL,
`date_modified` datetime DEFAULT NULL,
`date_created` datetime NOT NULL,
`profile_id` int(11) DEFAULT NULL,
`deleted` tinyint(1) NOT NULL,
PRIMARY KEY (`id`),
KEY `IDX_C077730FFFA0C224` (`office_id`),
KEY `IDX_C077730FA76ED395` (`user_id`),
KEY `IDX_C077730FCCFA12B8` (`profile_id`),
-- CONSTRAINT `FK_C077730FA76ED395` FOREIGN KEY (`user_id`) REFERENCES `other_irrelevant_table` (`id`),
-- CONSTRAINT `FK_C077730FCCFA12B8` FOREIGN KEY (`profile_id`) REFERENCES `other_irrelevant_table` (`id`),
CONSTRAINT `FK_C077730FFFA0C224` FOREIGN KEY (`office_id`) REFERENCES `table4` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=382425 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `table2` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) DEFAULT NULL,
`active` tinyint(1) NOT NULL,
`date_modified` datetime DEFAULT NULL,
`date_created` datetime NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `UNIQ_36E3BDB1A76ED395` (`user_id`),
CONSTRAINT `FK_36E3BDB1A76ED395` FOREIGN KEY (`user_id`) REFERENCES `table3` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=174049 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `table1` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`salesperson_id` int(11) DEFAULT NULL,
`count_active_contracts` int(11) NOT NULL,
`average_initial_price` decimal(12,2) NOT NULL,
`average_contract_value` decimal(12,2) NOT NULL,
`total_sold` int(11) NOT NULL,
`total_active` int(11) NOT NULL,
`active` tinyint(1) NOT NULL,
`date_modified` datetime DEFAULT NULL,
`date_created` datetime NOT NULL,
`type` varchar(255) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
`services_scheduled_today` int(11) NOT NULL,
`services_scheduled_week` int(11) NOT NULL,
`services_scheduled_month` int(11) NOT NULL,
`services_scheduled_summer` int(11) NOT NULL,
`serviced_today` int(11) NOT NULL,
`serviced_this_week` int(11) NOT NULL,
`serviced_this_month` int(11) NOT NULL,
`serviced_this_summer` int(11) NOT NULL,
`autopay_account_percentage` decimal(3,2) NOT NULL,
`value_per_door` decimal(12,2) NOT NULL,
`total_paid` int(11) NOT NULL,
`sales_status_summary` varchar(255) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
`total_serviced` int(11) NOT NULL,
`services_scheduled_year` int(11) NOT NULL,
`serviced_this_year` int(11) NOT NULL,
`services_scheduled_yesterday` int(11) NOT NULL,
`serviced_yesterday` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `SALESPERSON` (`type`),
KEY `SALES_FK_ON_SALES_STATE` (`salesperson_id`),
CONSTRAINT `SALES_FK_ON_SALES_STATE` FOREIGN KEY (`salesperson_id`) REFERENCES `table2` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=181662521 DEFAULT CHARSET=utf8;

When you see "DEPENDENT SUBQUERY" in the explain, it isn't caching the result of the subquery. It's re-executing the subquery many times (once for each distinct value in the outermost query). I see in the explain that your outermost query is examining 36 million rows. So this is probably running the subquery many, many times.
This is documented here: https://dev.mysql.com/doc/refman/5.7/en/explain-output.html
For DEPENDENT SUBQUERY, the subquery is re-evaluated only once for each set of different values of the variables from its outer context. For UNCACHEABLE SUBQUERY, the subquery is re-evaluated for each row of the outer context.
One way to avoid this is to use a subquery as a derived table instead of as the argument to an IN() predicate. This is a better way to do a semi-join like you're doing.
SELECT ... FROM TableA
WHERE TableA.id IN (SELECT id FROM ...)
Should be equivalent to:
SELECT ... FROM TableA
JOIN (SELECT DISTINCT id FROM ...) AS TableB
ON TableA.id = TableB.id
The use of DISTINCT in the subquery means there's only one row per id returned by the subquery, so the join won't multiply the number of rows from TableA if there are multiple matches. This makes it a semi-join.
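A minimal sketch of that equivalence, run against an in-memory SQLite database as a stand-in for MySQL (so it shows only that the results match, not how either engine plans the query). The `Matches` table and its rows are hypothetical; `TableA` and `TableB` follow the placeholder names above.

```python
import sqlite3

# SQLite stands in for MySQL here; this checks result equivalence, not speed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE Matches (id INTEGER);  -- hypothetical id source with duplicates
    INSERT INTO TableA VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO Matches VALUES (1), (1), (3), (3), (3);
""")

# Semi-join written as an IN() predicate.
in_rows = conn.execute("""
    SELECT id, label FROM TableA
    WHERE id IN (SELECT id FROM Matches)
    ORDER BY id
""").fetchall()

# Same semi-join written as a join against a DISTINCT derived table.
join_distinct_rows = conn.execute("""
    SELECT TableA.id, TableA.label FROM TableA
    JOIN (SELECT DISTINCT id FROM Matches) AS TableB ON TableA.id = TableB.id
    ORDER BY TableA.id
""").fetchall()

# Without DISTINCT, each duplicate id multiplies the TableA row.
join_plain_rows = conn.execute("""
    SELECT TableA.id, TableA.label FROM TableA
    JOIN (SELECT id FROM Matches) AS TableB ON TableA.id = TableB.id
    ORDER BY TableA.id
""").fetchall()

print(in_rows)
print(join_distinct_rows)
print(len(join_plain_rows))
```

The first two results are identical, while the join without DISTINCT returns one row per duplicate in `Matches`.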
The following should do better:
SELECT table1.*
FROM table1
JOIN (
SELECT table1.id FROM table1
LEFT JOIN table2 ON table2.id = table1.salesperson_id
LEFT JOIN table3 ON table3.id = table2.user_id
LEFT JOIN table4 ON table3.office_id = table4.id
WHERE table1.type = 'Snapshot'
AND table4.id = 25 OR table4.parent_id =25
LIMIT 500
) AS ids ON table1.id = ids.id;
You might also try to get rid of the index_merge. You're getting that because you're using OR for two different indexed columns in table4. It uses both indexes, and then unions them. Sometimes† it's better to use a UNION of two subqueries explicitly, instead of relying on the index_merge.
SELECT table1.*
FROM table1
JOIN (
SELECT table1.id FROM table1
JOIN table2 ON table2.id = table1.salesperson_id
JOIN table3 ON table3.id = table2.user_id
JOIN (
SELECT id FROM table4 WHERE id=25
UNION
SELECT id FROM table4 WHERE parent_id=25
) AS t4 ON table3.office_id = t4.id
WHERE table1.type = 'Snapshot'
LIMIT 500
) AS ids ON table1.id = ids.id;
You're also using LEFT JOIN unnecessarily, so I replaced it with JOIN. The MySQL optimizer will silently convert it to an inner join, but I think you should study what LEFT JOIN means, and use it when it's called for.
† I say "sometimes" because which method is best might depend on your data, so you should test it both ways.
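To illustrate the LEFT JOIN point, here is a small sketch using an in-memory SQLite database in place of MySQL. The `sales` and `offices` tables and their rows are made up for the example; they only mimic the shape of table1 and table4.

```python
import sqlite3

# SQLite stands in for MySQL; table names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE offices (id INTEGER PRIMARY KEY, parent_id INTEGER);
    CREATE TABLE sales (id INTEGER PRIMARY KEY, office_id INTEGER);
    INSERT INTO offices VALUES (25, NULL), (30, 25), (40, 7);
    INSERT INTO sales VALUES (1, 25), (2, 30), (3, 40), (4, NULL);
""")

# A LEFT JOIN with no filter keeps every sales row, matched or not.
left_all = conn.execute("""
    SELECT sales.id FROM sales
    LEFT JOIN offices ON offices.id = sales.office_id
    ORDER BY sales.id
""").fetchall()

# Filtering on columns of the right-hand table discards the unmatched
# (NULL-extended) rows, so the LEFT JOIN behaves like an inner join.
left_filtered = conn.execute("""
    SELECT sales.id FROM sales
    LEFT JOIN offices ON offices.id = sales.office_id
    WHERE offices.id = 25 OR offices.parent_id = 25
    ORDER BY sales.id
""").fetchall()

inner_filtered = conn.execute("""
    SELECT sales.id FROM sales
    JOIN offices ON offices.id = sales.office_id
    WHERE offices.id = 25 OR offices.parent_id = 25
    ORDER BY sales.id
""").fetchall()

print(left_all)
print(left_filtered)
print(inner_filtered)
```

The filtered LEFT JOIN and the plain JOIN return the same rows, which is why writing JOIN states the intent more directly.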

Because I needed to limit a DELETE query with joins (which isn't possible in MySQL), there is another option. It is in no way the better one (can't beat Bill's answer).
But it works, and the query is extremely fast, albeit not very flexible, because there is a minimum number of rows it can pull, which for a 38M-row table is 575k (no idea why).
But here it is:
SELECT COUNT(*) FROM table1
JOIN table2 ON table2.id = table1.salesperson_id
JOIN table3 ON table3.id = table2.user_id
JOIN table4 ON table3.office_id = table4.id
WHERE table1.type = "Snapshot"
AND table4.id = 113 OR table4.parent_id =113
AND RAND()<=0.001;
But Bill's answer should be more than enough for everyone.
P.S. I'll ask the question about RAND() in a Where Clause and will post the link here. Maybe it will help some desperate dev in 2025.
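One possible explanation for that row floor, and something worth checking in the other WHERE clauses in this thread: AND binds more tightly than OR, so `A AND B OR C AND RAND()<=0.001` is read as `(A AND B) OR (C AND RAND()<=0.001)`, and every row matching the first group is returned no matter what RAND() yields. The sketch below shows the precedence effect with an in-memory SQLite database standing in for MySQL (both follow the same AND-before-OR rule). The table `t` and its rows are hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL; table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (type TEXT, office_id INTEGER, parent_id INTEGER);
    INSERT INTO t VALUES
        ('Snapshot', 25, NULL),  -- the office itself
        ('Snapshot', 30, 25),    -- child office
        ('Other',    31, 25),    -- child office, wrong type
        ('Snapshot', 40, 7);     -- unrelated office
""")

# Parsed as (type = 'Snapshot' AND office_id = 25) OR parent_id = 25,
# so the type filter does not apply to the parent_id branch.
unparenthesized = conn.execute("""
    SELECT office_id FROM t
    WHERE type = 'Snapshot' AND office_id = 25 OR parent_id = 25
    ORDER BY office_id
""").fetchall()

# Explicit parentheses apply the type filter to both branches.
parenthesized = conn.execute("""
    SELECT office_id FROM t
    WHERE type = 'Snapshot' AND (office_id = 25 OR parent_id = 25)
    ORDER BY office_id
""").fetchall()

print(unparenthesized)
print(parenthesized)
```

The unparenthesized form also returns the 'Other' row for office 31. If the intent is "Snapshot rows for office 25 or its children", the parenthesized form is the one to use.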

You got carried away with nesting, etc.
SELECT table1.*
FROM
(
SELECT table1.id
FROM table1
JOIN table2 ON table2.id = table1.salesperson_id
JOIN table3 ON table3.id = table2.user_id
JOIN table4 ON table3.office_id = table4.id
WHERE table1.type = "Snapshot"
AND table4.id = 25
OR table4.parent_id =25
LIMIT 500
) AS ids
JOIN table1 USING(id)
Some discussion:
It is better to find the 500 ids and throw them into a tmp table than to haul around all the columns of table1.*. Hence the subquery with LIMIT 500.
Bill's UNION seems to be unnecessary since the Optimizer decided to use "index merge union". This may be only the second time I have seen that feature in use!
IN ( SELECT ... ) is probably never faster than an equivalent JOIN or EXISTS, whichever is appropriate. (JOIN is appropriate for your case.)
For table4, you have a perfectly good 'natural' PK in logo_file_id; why not get rid of id and promote that to PK? (Similarly in table2.)
Aarrgghh... By doing my previous suggestion, you can bypass table2!
table1 has 181M rows? INT is always 4 bytes. You have a lot of columns that sound like small counters; consider using TINYINT UNSIGNED (1 byte; range: 0..255) or SMALLINT UNSIGNED. That should shrink the size of the table significantly, thereby speeding up cacheability and use of the table somewhat.

Related

How to optimize this simple MySQL query

This query is taking 450ms
SELECT `u`.`user_id`, `c`.`company`
FROM `users` AS `u`
LEFT JOIN `companies` AS `c` ON `c`.`user_id` = `u`.`user_id`
WHERE `u`.`user_id` = 'search_term'
OR `u`.`lname` LIKE 'search_term%'
OR `u`.`email` LIKE 'search_term%'
OR `c`.`company` LIKE 'search_termeo%'
tables:
users (260250 rows)
companies (570 rows)
structures:
- users:
CREATE TABLE `users` (
`user_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`region_id` int(10) unsigned NOT NULL,
`fname` varchar(30) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`lname` varchar(30) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`email` varchar(50) COLLATE utf8mb4_unicode_ci NOT NULL,
`password` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`phone` varchar(20) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`active` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`user_id`),
KEY `idx_lname` (`lname`),
KEY `idx_email` (`email`),
UNIQUE KEY `unq_region_id_email` (`region_id`, `email`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
- companies:
CREATE TABLE `companies` (
`user_id` int(10) unsigned NOT NULL,
`company` varchar(35) COLLATE utf8mb4_unicode_ci NOT NULL,
`vat_num` varchar(20) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
PRIMARY KEY (`user_id`),
KEY `idx_company` (`company`) USING BTREE,
CONSTRAINT `users_companies_ibfk_1` FOREIGN KEY (`user_id`) REFERENCES `users` (`user_id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
The result of explain query
I think 450ms is too much for such a query and such a small amount of data,
and I want to know if there is something to optimize.
The query was run in Querious v3 on an iMac 2017, 3.4 GHz, 16 GB.
MySQL: 5.7.26 on MAMP Pro v5.7
OR conditions that span different fields, or that are range-based (such as <, >, LIKE), really decrease MySQL's ability to take advantage of indexes. You can restructure such a query by breaking it into separate, simpler queries that you then UNION. Separating it out like this allows MySQL to use a different index for each query within the UNION.
SELECT `u`.`user_id`, `c`.`company`
FROM `users` AS `u` LEFT JOIN `companies` AS `c` ON `c`.`user_id` = `u`.`user_id`
WHERE `u`.`user_id` = 'search_term'
UNION DISTINCT
SELECT `u`.`user_id`, `c`.`company`
FROM `users` AS `u` LEFT JOIN `companies` AS `c` ON `c`.`user_id` = `u`.`user_id`
WHERE `u`.`lname` LIKE 'search_term%'
UNION DISTINCT
SELECT `u`.`user_id`, `c`.`company`
FROM `users` AS `u` LEFT JOIN `companies` AS `c` ON `c`.`user_id` = `u`.`user_id`
WHERE `u`.`email` LIKE 'search_term%'
UNION DISTINCT
SELECT `u`.`user_id`, `c`.`company`
FROM `users` AS `u` INNER JOIN `companies` AS `c` ON `c`.`user_id` = `u`.`user_id`
WHERE `c`.`company` LIKE 'search_termeo%'
;
Also, note that I changed the last one's JOIN to an INNER since any condition on the right-hand table of a LEFT JOIN (that isn't "without a match from that table") is basically an INNER JOIN anyway.
UNION DISTINCT is used to prevent records that satisfied multiple conditions from being repeated, however... if companies.company is not unique (i.e. company id 1 called "Blah" and company id 12 also called "Blah") then those will also be merged where they would not be in your original query; if it is a potential issue, that can be remedied by also including company_id in each SELECT.
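As a sanity check that the rewrite returns the same rows, here is a small sketch using an in-memory SQLite database in place of MySQL. It only compares results; it says nothing about which indexes MySQL would pick. The sample rows are invented, the tables are cut down to the columns involved, and the `user_id` branch is left out to keep the example short.

```python
import sqlite3

# SQLite stands in for MySQL; rows are hypothetical and tables are simplified.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, lname TEXT, email TEXT);
    CREATE TABLE companies (user_id INTEGER PRIMARY KEY, company TEXT);
    INSERT INTO users VALUES
        (1, 'Smith', 'smith@example.com'),
        (2, 'Jones', 'smithy@example.com'),
        (3, 'Brown', 'brown@example.com'),
        (4, 'Green', 'green@example.com');
    INSERT INTO companies VALUES (3, 'Smith & Co'), (4, 'Acme');
""")
params = {"p": "smith%"}

# Original shape: one query with OR across columns of both tables.
or_rows = conn.execute("""
    SELECT u.user_id, c.company
    FROM users AS u LEFT JOIN companies AS c ON c.user_id = u.user_id
    WHERE u.lname LIKE :p OR u.email LIKE :p OR c.company LIKE :p
    ORDER BY u.user_id
""", params).fetchall()

# Rewritten shape: one simple query per condition, merged with UNION.
union_rows = conn.execute("""
    SELECT u.user_id, c.company
    FROM users AS u LEFT JOIN companies AS c ON c.user_id = u.user_id
    WHERE u.lname LIKE :p
    UNION
    SELECT u.user_id, c.company
    FROM users AS u LEFT JOIN companies AS c ON c.user_id = u.user_id
    WHERE u.email LIKE :p
    UNION
    SELECT u.user_id, c.company
    FROM users AS u INNER JOIN companies AS c ON c.user_id = u.user_id
    WHERE c.company LIKE :p
    ORDER BY 1
""", params).fetchall()

print(or_rows)
print(union_rows)
```

User 1 matches both the lname and email branches but appears once, because UNION removes the duplicate row.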

select count, group by and having optimization

I have this query
SELECT
t2.counter_id,
t2.hash_counter,
count(1) AS cnt
FROM
table1 t1
RIGHT JOIN
table2 t2 USING(counter_id)
WHERE
t2.hash_id = 973
GROUP BY
t1.counter_id
HAVING
cnt < 8000
Here are the tables.
CREATE TABLE `table1` (
`id` varchar(255) NOT NULL,
`platform` varchar(32) DEFAULT NULL,
`version` varchar(10) DEFAULT NULL,
`edition` varchar(2) NOT NULL DEFAULT 'us',
`counter_id` int(11) NOT NULL,
`created_on` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `counter_id` (`counter_id`)
) ENGINE=InnoDB
CREATE TABLE `table2` (
`counter_id` int(11) NOT NULL AUTO_INCREMENT,
`hash_id` int(11) DEFAULT NULL,
`hash_counter` int(11) DEFAULT NULL,
PRIMARY KEY (`counter_id`),
UNIQUE KEY `counter_key` (`hash_id`,`hash_counter`)
) ENGINE=InnoDB
The "EXPLAIN" shows "Using index; Using temporary; Using filesort" for table t2. Is there any way to get rid of the temporary/filesort, or any other ideas about optimizing this query?
Your comment above gives more insight into what you want. It is always better to explain more about what you are trying to achieve - just looking at the non-working SQL leads people down the wrong path.
So, you want to know which table2 rows have < 8000 table1 rows?
Why not this:
select *
from table2 as t2
where hash_id = 973
and 8000 > (select count(*) from table1 as t1 where t1.counter_id = t2.counter_id)
;
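A small sketch of a correlated COUNT(*) filter that keeps table2 rows having fewer than a threshold number of table1 rows, run on an in-memory SQLite database standing in for MySQL. The rows are invented, the tables are reduced to the relevant columns, and the threshold is 3 rather than 8000 so the data stays small.

```python
import sqlite3

# SQLite stands in for MySQL; rows and the small threshold are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id TEXT PRIMARY KEY, counter_id INTEGER NOT NULL);
    CREATE TABLE table2 (
        counter_id INTEGER PRIMARY KEY,
        hash_id INTEGER,
        hash_counter INTEGER
    );
    INSERT INTO table2 VALUES (1, 973, 10), (2, 973, 11), (3, 973, 12), (4, 500, 13);
    INSERT INTO table1 VALUES ('a', 1), ('b', 1), ('c', 1), ('d', 1), ('e', 2);
""")
threshold = 3  # stands in for 8000

# Keep table2 rows for hash_id 973 whose table1 row count is below the threshold.
rows = conn.execute("""
    SELECT t2.counter_id,
           t2.hash_counter,
           (SELECT COUNT(*) FROM table1 AS t1
             WHERE t1.counter_id = t2.counter_id) AS cnt
    FROM table2 AS t2
    WHERE t2.hash_id = 973
      AND (SELECT COUNT(*) FROM table1 AS t1
            WHERE t1.counter_id = t2.counter_id) < ?
    ORDER BY t2.counter_id
""", (threshold,)).fetchall()

print(rows)
```

Counter 3 has no table1 rows and is reported with a count of 0, which an outer join combined with count(1) would not show, since count(1) also counts the NULL-extended row.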

Improving the MySQL Query

I have the following query, which filters the rows with replyAutoId=0 and then fetches the most recent record for each propertyId. The query takes 0.23225 sec to fetch just 5,435 of 21,369 rows, and I want to improve this. Is there a better way of writing this query? Any suggestions?
SELECT pc1.* FROM (SELECT * FROM propertyComment WHERE replyAutoId=0) as pc1
LEFT JOIN propertyComment as pc2
ON pc1.propertyId= pc2.propertyId AND pc1.updatedDate < pc2.updatedDate
WHERE pc2.propertyId IS NULL
The SHOW CREATE TABLE propertyComment Output:
CREATE TABLE `propertyComment` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`propertyId` int(11) NOT NULL,
`agentId` int(11) NOT NULL,
`comment` longtext COLLATE utf8_unicode_ci NOT NULL,
`replyAutoId` int(11) NOT NULL,
`updatedDate` datetime NOT NULL,
`contactDate` date NOT NULL,
`status` enum('Y','N') COLLATE utf8_unicode_ci NOT NULL DEFAULT 'N',
`clientStatusId` int(11) NOT NULL,
`adminsId` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `propertyId` (`propertyId`),
KEY `agentId` (`agentId`),
KEY `status` (`status`),
KEY `adminsId` (`adminsId`),
KEY `replyAutoId` (`replyAutoId`)
) ENGINE=MyISAM AUTO_INCREMENT=21404 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
Try to get rid of the nested query.
The following query should give the same result as your original query:
SELECT pc1.*
FROM propertyComment AS pc1
LEFT JOIN propertyComment AS pc2
ON pc1.propertyID = pc2.propertyId AND pc1.updatedDate < pc2.updatedDate
WHERE pc1.replyAutoId = 0 AND pc2.propertyID IS NULL
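A quick way to convince yourself the two forms agree is to run both on a handful of rows. The sketch below does that with an in-memory SQLite database in place of MySQL; the rows are made up, and the table is trimmed to the four columns the query touches.

```python
import sqlite3

# SQLite stands in for MySQL; rows are hypothetical and the table is trimmed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE propertyComment (
        id INTEGER PRIMARY KEY,
        propertyId INTEGER NOT NULL,
        replyAutoId INTEGER NOT NULL,
        updatedDate TEXT NOT NULL
    );
    INSERT INTO propertyComment VALUES
        (1, 100, 0, '2020-01-01 10:00:00'),
        (2, 100, 0, '2020-01-02 10:00:00'),
        (3, 200, 0, '2020-01-01 10:00:00'),
        (4, 200, 5, '2020-01-03 10:00:00'),
        (5, 300, 0, '2020-01-05 10:00:00');
""")

# Original form: filter in a derived table, then anti-join on later rows.
nested = conn.execute("""
    SELECT pc1.id
    FROM (SELECT * FROM propertyComment WHERE replyAutoId = 0) AS pc1
    LEFT JOIN propertyComment AS pc2
      ON pc1.propertyId = pc2.propertyId AND pc1.updatedDate < pc2.updatedDate
    WHERE pc2.propertyId IS NULL
    ORDER BY pc1.id
""").fetchall()

# Flattened form: same anti-join, with the filter moved to the WHERE clause.
flat = conn.execute("""
    SELECT pc1.id
    FROM propertyComment AS pc1
    LEFT JOIN propertyComment AS pc2
      ON pc1.propertyId = pc2.propertyId AND pc1.updatedDate < pc2.updatedDate
    WHERE pc1.replyAutoId = 0 AND pc2.propertyId IS NULL
    ORDER BY pc1.id
""").fetchall()

print(nested)
print(flat)
```

Both forms return ids 2 and 5. Property 200 is missing from both because its newest row is a reply (replyAutoId = 5), and pc2 is not filtered by replyAutoId in either query.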
SELECT pc1.* FROM (SELECT * FROM propertyComment WHERE replyAutoId=0) as pc1
LEFT JOIN (SELECT propertyID, updatedDate from propertyComment order by 1,2) as pc2
ON pc1.propertyId= pc2.propertyId AND pc1.updatedDate < pc2.updatedDate
WHERE pc2.propertyId IS NULL
You also don't have any indexes?
If you did on primary key, you're not joining on it, so why include it?
Why not select only the columns you're interested in from table B? This will limit the number of columns you're selecting from table B. Since you're pulling everything from table A where replyAutoID = 0, it wouldn't make much sense to limit the columns there. This should speed it up a little.

How to optimize this query as the in array seems to slow things down significantly

I am looking to find out the best way to optimize a query like this:
SELECT
a.ID,
a.ECPCodeID,
a.RegDate,
a.BusName,
a.City,
a.AccountNum,
b.ID as RepCodeID,
b.RepCode
FROM ECPs_Registration a,
Reps_Codes b
WHERE (SUBSTR(a.PostalCode,1,5)IN(SELECT
SUBSTR(Zip,1,5)
FROM Reps_Zip
WHERE RepCodeID = b.ID)
AND a.AccountNum NOT IN(SELECT
ShipTo
FROM Reps_ShipTo))
OR a.AccountNum IN(SELECT
ShipTo
FROM Reps_ShipTo
WHERE RepCodeID = b.ID)
ORDER BY b.RepCode,a.BusName,a.City
I know there are more factors involved such as indexes and such, I just am asking about the query part of it for now. Mainly, since I have to go through the Reps_ShipTo and Reps_Zip tables for tons of records. I thought about changing something like:
a.AccountNum NOT IN (SELECT ShipTo FROM Reps_ShipTo)
INTO
(SELECT count(*) FROM Reps_ShipTo WHERE a.AccountNum = ShipTo) = 0
Not sure if that is proper or if there is a better way. Any help would be appreciated. Thanks.
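For what it's worth, the three ways of writing that exclusion can be compared on a toy data set. The sketch below uses an in-memory SQLite database as a stand-in for MySQL and checks only that NOT IN, the COUNT(*) = 0 form, and NOT EXISTS return the same rows when ShipTo has no NULLs; it says nothing about which one MySQL runs fastest. The rows are invented, and ShipTo is declared as an integer here purely to keep the comparison simple.

```python
import sqlite3

# SQLite stands in for MySQL; rows are hypothetical and columns are simplified.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ECPs_Registration (ID INTEGER PRIMARY KEY, AccountNum INTEGER NOT NULL);
    CREATE TABLE Reps_ShipTo (
        ID INTEGER PRIMARY KEY,
        RepCodeID INTEGER NOT NULL,
        ShipTo INTEGER NOT NULL
    );
    INSERT INTO ECPs_Registration VALUES (1, 1001), (2, 1002), (3, 1003);
    INSERT INTO Reps_ShipTo VALUES (1, 7, 1001), (2, 8, 1003);
""")

not_in = conn.execute("""
    SELECT a.ID FROM ECPs_Registration a
    WHERE a.AccountNum NOT IN (SELECT ShipTo FROM Reps_ShipTo)
    ORDER BY a.ID
""").fetchall()

count_zero = conn.execute("""
    SELECT a.ID FROM ECPs_Registration a
    WHERE (SELECT COUNT(*) FROM Reps_ShipTo s WHERE s.ShipTo = a.AccountNum) = 0
    ORDER BY a.ID
""").fetchall()

# NOT EXISTS can stop at the first matching row instead of counting them all.
not_exists = conn.execute("""
    SELECT a.ID FROM ECPs_Registration a
    WHERE NOT EXISTS (SELECT 1 FROM Reps_ShipTo s WHERE s.ShipTo = a.AccountNum)
    ORDER BY a.ID
""").fetchall()

print(not_in, count_zero, not_exists)
```

All three return only registration 2, the one account with no ShipTo row.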
EDIT:
Schema:
CREATE TABLE IF NOT EXISTS `ECPs_Codes` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`ECPCode` char(4) NOT NULL,
PRIMARY KEY (`ID`),
KEY `ECPCode` (`ECPCode`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 ;
CREATE TABLE IF NOT EXISTS `ECPs_Registration` (
`RegDate` datetime NOT NULL,
`ID` int(10) NOT NULL AUTO_INCREMENT,
`ECPCodeID` int(11) NOT NULL,
`FirstName` varchar(200) NOT NULL,
`LastName` varchar(200) NOT NULL,
`BusName` varchar(200) NOT NULL,
`Address` varchar(200) NOT NULL,
`Address2` varchar(200) NOT NULL,
`City` varchar(100) NOT NULL,
`Province` char(2) NOT NULL,
`Country` varchar(100) NOT NULL,
`PostalCode` varchar(10) NOT NULL,
`Email` varchar(200) NOT NULL,
`AccountNum` int(8) NOT NULL,
PRIMARY KEY (`ID`),
KEY `ECPCodeID` (`ECPCodeID`),
KEY `PostalCode` (`PostalCode`),
KEY `AccountNum` (`AccountNum`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE IF NOT EXISTS `Reps_Codes` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`Name` varchar(50) NOT NULL,
`RepCode` varchar(16) NOT NULL,
`AllAccess` tinyint(4) NOT NULL,
PRIMARY KEY (`ID`),
KEY `RepCode` (`RepCode`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE IF NOT EXISTS `Reps_ShipTo` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`RepCodeID` int(11) NOT NULL,
`ShipTo` varchar(20) COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`ID`),
KEY `RepID` (`RepCodeID`),
KEY `ShipTo` (`ShipTo`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE IF NOT EXISTS `Reps_Zip` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`RepCodeID` int(11) NOT NULL,
`Zip` varchar(10) NOT NULL,
PRIMARY KEY (`ID`),
KEY `RepCodeID` (`RepCodeID`),
KEY `Zip` (`Zip`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
There are two things that massively hurt performance on your query.
You are joining two tables by combining multiple conditions, each needing subqueries
You're doing a join on two tables using SUBSTR(Zip,1,5)=SUBSTR(postalcode,1,5)
The logic behind your query seems to be something like:
For every ECPs_Registration find the matching record in Reps_Codes
using the following rules:
If there is a matching record in Reps_ShipTo for that registration, use that table to look it up (primary match)
If there isn't a matching record in Reps_ShipTo, seek through Reps_Zip for a matching RepCode by Zipcode-match (secondary)
Now if the above fully describes your situation, you should probably start off by redesigning your database.
The Reps_ShipTo table creates a 0:N relationship between ECPs_Registration and Reps_Codes. Such relations don't need an extra table - they can simply be stored as nullable foreign keys - in your case a RepCodeId in ECPs_Registration would do the trick, and would remove the entire Reps_ShipTo table from the database.
You should probably also create (yes, redundant) extra columns that only store the first 5 letters of the zip codes in both ECPs_Registration and Reps_Zip. This will allow simple equality matches instead of the SUBSTR-functions. Or, you might decide to do this match only once for every record, and store the result in above RepCodeId, which totally eliminates the dual join.
The following query assumes you for some reason don't want to or can't change your database:
SELECT
a.ID, a.ECPCodeID, a.RegDate, a.BusName, a.City, a.AccountNum,
IF(b1.ID IS NOT NULL, b1.ID, b2.ID) as RepCodeID,
IF(b1.ID IS NOT NULL, b1.RepCode, b2.RepCode) as MyRepCode
FROM ECPs_Registration a
LEFT JOIN Reps_ShipTo ON (Reps_ShipTo.Shipto=a.AccountNum)
LEFT JOIN Reps_Codes b1 ON (b1.ID=Reps_ShipTo.RepCodeId)
LEFT JOIN Reps_Zip ON (SUBSTR(Zip,1,5)=SUBSTR(a.postalcode,1,5))
LEFT JOIN Reps_Codes b2 ON (b2.ID=Reps_Zip.RepCodeID)
ORDER BY MyRepCode,a.BusName,a.City
Without your database schema and sample data, I have no way to test if above query actually works and has the same result as your original.
SELECT
a.ID,
a.ECPCodeID,
a.RegDate,
a.BusName,
a.City,
a.AccountNum,
b.ID as RepCodeID,
b.RepCode
FROM ECPs_Registration a, Reps_Codes b
INNER JOIN Reps_Zip as r on SUBSTR(a.PostalCode,1,5) = SUBSTR(r.Zip,1,5)
LEFT JOIN Reps_ShipTo as rs on a.AccountNum = rs.ShipTo
LEFT JOIN ShipTo as s on a.AccountNum = s.ShipTo
WHERE (s.id is null or rs.id is null)
ORDER BY b.RepCode,a.BusName,a.City

Can I do a sort of DELETE with JOIN?

I have this kind of table in my MySql Database :
CREATE TABLE `forum_categories` (
`id` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`title` VARCHAR(255) NOT NULL,
`description` VARCHAR(255) NOT NULL,
`date` DATETIME NOT NULL,
PRIMARY KEY (`id`)
)
COLLATE='utf8_general_ci'
ENGINE=MyISAM
ROW_FORMAT=DEFAULT
CREATE TABLE `forum_topics` (
`id` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`category_id` INT(11) UNSIGNED NOT NULL,
`title` VARCHAR(255) NOT NULL,
`author` VARCHAR(255) NOT NULL,
`date` DATETIME NOT NULL,
`visits` INT(11) UNSIGNED NOT NULL DEFAULT '0',
`sticky` TINYINT(11) UNSIGNED NOT NULL DEFAULT '0',
PRIMARY KEY (`id`)
)
COLLATE='utf8_general_ci'
ENGINE=MyISAM
ROW_FORMAT=DEFAULT
And I'd like, for example, to remove the category (from the table forum_categories) with id=4.
But, when I do this, I'd like to remove all rows on the table forum_topics with category_id=4.
Is it possible to do a sort of DELETE+JOIN?
Unfortunately (as you can see) my host provider doesn't support InnoDB (what a shame..), so I can't use FOREIGN KEYS :(
SOLUTION
Solved with :
DELETE forum_categories.*, forum_topics.* , forum_visits.*, forum_messages.*
FROM forum_categories
JOIN forum_topics ON forum_categories.id=forum_topics.category_id
JOIN forum_visits ON forum_topics.id=forum_visits.topic
JOIN forum_messages ON forum_topics.id=forum_messages.topic_id
WHERE forum_categories.id=4
you can use the multi-table syntax also:
delete a.*, b.* from forum_categories a inner join forum_topics b on a.id = b.category_id where a.id = 4
Set up a TRIGGER to provide the "cascading" effect.
This MySQL cascading example should provide what you are looking for. It specifically calls out how to do it with MyISAM-based tables.
looks like you might be stuck with
DELETE FROM forum_topics WHERE category_id = 4
DELETE FROM forum_categories WHERE id = 4
in the same call.
I addressed this question a while back
Mysql - delete multi table