Optimize MySQL query to find Duplicates? - mysql

I originally asked this question here.
I'm using the following query to return all duplicate records with the same first and last name. The trick is that the contact_id has to be in descending order.
The problem is that the database has a few million records in the "contacts" table, and the query takes several minutes to complete.
I have the contact_firstName, contact_lastName, contact_client_id, and contact_id all indexed in the database.
Any other ideas on how this query can be optimized a little further?
SELECT c.contact_id, c.contact_purl, c.contact_firstName, c.contact_lastName, c.contact_organization
FROM (
SELECT contact_purl, contact_firstName, contact_lastName, MIN(contact_id) AS MinID
FROM contacts
WHERE contact_client_id = 1
GROUP BY contact_purl HAVING COUNT(contact_id) > 1) t
INNER JOIN contacts c
ON t.contact_purl = c.contact_purl
AND c.contact_client_id = 1
AND t.MinID <> c.contact_id
ORDER BY contact_id asc
EXPLAIN:
SCHEMA:
CREATE TABLE IF NOT EXISTS `contacts` (
`contact_id` int(11) NOT NULL AUTO_INCREMENT,
`contact_client_id` int(11) DEFAULT NULL,
`contact_sales_id` int(11) DEFAULT NULL,
`contact_campaign_id` int(11) DEFAULT NULL,
`contact_purl` varchar(100) NOT NULL,
`contact_purl1` varchar(50) DEFAULT NULL,
`contact_purl2` varchar(50) DEFAULT NULL,
`contact_firstName` varchar(50) NOT NULL,
`contact_lastName` varchar(50) NOT NULL,
`contact_organization` varchar(100) DEFAULT NULL,
`contact_url_organization` varchar(200) DEFAULT NULL,
`contact_position` varchar(50) DEFAULT NULL,
`contact_email` varchar(100) DEFAULT NULL,
`contact_phone` varchar(20) DEFAULT NULL,
`contact_fax` varchar(20) NOT NULL,
`contact_address1` varchar(100) DEFAULT NULL,
`contact_address2` varchar(100) DEFAULT NULL,
`contact_city` varchar(100) DEFAULT NULL,
`contact_state` varchar(20) DEFAULT NULL,
`contact_zip` varchar(10) DEFAULT NULL,
`contact_IP` varchar(50) DEFAULT NULL,
`contact_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`contact_pw` varchar(200) NOT NULL,
`contact_subscribed` varchar(1) NOT NULL DEFAULT 'Y',
`contact_import` varchar(200) DEFAULT NULL,
`contacts_c_1` varchar(500) DEFAULT NULL,
`contacts_c_2` varchar(500) DEFAULT NULL,
`contacts_c_3` varchar(500) DEFAULT NULL,
`contacts_c_4` varchar(500) DEFAULT NULL,
`contacts_c_5` varchar(500) DEFAULT NULL,
`contacts_c_6` varchar(500) DEFAULT NULL,
`contacts_c_7` varchar(500) DEFAULT NULL,
`contacts_c_8` varchar(500) DEFAULT NULL,
`contacts_c_9` varchar(500) DEFAULT NULL,
`contacts_c_10` varchar(500) DEFAULT NULL,
`contacts_c_11` varchar(500) DEFAULT NULL,
`contacts_c_12` varchar(500) DEFAULT NULL,
`contacts_c_13` varchar(500) DEFAULT NULL,
`contacts_c_14` varchar(500) DEFAULT NULL,
`contacts_c_15` varchar(500) DEFAULT NULL,
`contacts_c_16` varchar(500) DEFAULT NULL,
`contacts_c_17` varchar(500) DEFAULT NULL,
`contacts_c_18` varchar(500) DEFAULT NULL,
`contacts_c_19` varchar(500) DEFAULT NULL,
`contacts_c_20` varchar(500) DEFAULT NULL,
`contacts_c_21` varchar(500) DEFAULT NULL,
`contacts_c_22` varchar(500) DEFAULT NULL,
`contacts_c_23` varchar(500) DEFAULT NULL,
`contacts_c_24` varchar(500) DEFAULT NULL,
`contacts_c_25` varchar(500) DEFAULT NULL,
`contacts_c_26` varchar(500) DEFAULT NULL,
`contacts_c_27` varchar(500) DEFAULT NULL,
`contacts_c_28` varchar(500) DEFAULT NULL,
`contacts_c_29` varchar(500) DEFAULT NULL,
`contacts_c_30` varchar(500) DEFAULT NULL,
`contacts_c_31` varchar(500) DEFAULT NULL,
`contacts_c_32` varchar(500) DEFAULT NULL,
`contacts_c_33` varchar(500) DEFAULT NULL,
`contacts_c_34` varchar(500) DEFAULT NULL,
`contacts_c_35` varchar(500) DEFAULT NULL,
`contacts_c_36` varchar(500) DEFAULT NULL,
`contacts_c_37` varchar(500) DEFAULT NULL,
`contacts_c_38` varchar(500) DEFAULT NULL,
`contacts_c_39` varchar(500) DEFAULT NULL,
`contacts_c_40` varchar(500) DEFAULT NULL,
`contacts_c_41` varchar(500) DEFAULT NULL,
`contacts_c_42` varchar(500) DEFAULT NULL,
`contacts_c_43` varchar(500) DEFAULT NULL,
`contacts_c_44` varchar(500) DEFAULT NULL,
`contacts_c_45` varchar(500) DEFAULT NULL,
`contacts_c_46` varchar(500) DEFAULT NULL,
`contacts_c_47` varchar(500) DEFAULT NULL,
`contacts_c_48` varchar(500) DEFAULT NULL,
`contacts_c_49` varchar(500) DEFAULT NULL,
`contacts_c_50` varchar(500) DEFAULT NULL,
`contacts_i_1` varchar(100) DEFAULT NULL,
`contacts_i_2` varchar(100) DEFAULT NULL,
`contacts_i_3` varchar(100) DEFAULT NULL,
`contacts_i_4` varchar(100) DEFAULT NULL,
`contacts_i_5` varchar(100) DEFAULT NULL,
`contacts_i_6` varchar(100) DEFAULT NULL,
`contacts_i_7` varchar(100) DEFAULT NULL,
`contacts_i_8` varchar(100) DEFAULT NULL,
`contacts_i_9` varchar(100) DEFAULT NULL,
`contacts_i_10` varchar(100) DEFAULT NULL,
`contacts_i_11` varchar(100) DEFAULT NULL,
`contacts_i_12` varchar(100) DEFAULT NULL,
`contacts_i_13` varchar(100) DEFAULT NULL,
`contacts_i_14` varchar(100) DEFAULT NULL,
`contacts_i_15` varchar(100) DEFAULT NULL,
PRIMARY KEY (`contact_id`),
KEY `contact_campaign_id` (`contact_campaign_id`),
KEY `contact_client_id` (`contact_client_id`),
KEY `contact_purl2` (`contact_purl2`),
KEY `contact_purl1` (`contact_purl1`),
KEY `contact_purl` (`contact_purl`)
)

You could do it simply by joining the table to itself, like this:
SELECT DISTINCT c1.contact_id, c1.contact_firstName, c1.contact_lastName,
RIGHT(c1.contact_lastName,1) AS nameNum
FROM
contacts c1 INNER JOIN contacts c2
ON c1.contact_firstName = c2.contact_firstName
AND c1.contact_lastName = c2.contact_lastName
AND c1.contact_client_id = 1
AND c2.contact_client_id = 1
AND c1.contact_id <> c2.contact_id
ORDER BY c1.contact_id DESC
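
To see what a self-join of this kind returns, here is a minimal sketch on a handful of rows. It uses Python's built-in sqlite3 as a stand-in for MySQL so that it runs anywhere; the cut-down table and the sample names are made up for illustration. It filters both aliases by contact_client_id so that rows belonging to other clients are not reported.

```python
import sqlite3

# SQLite stands in for MySQL; this is a hypothetical, cut-down `contacts`
# table holding only the columns the query touches.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE contacts (
           contact_id INTEGER PRIMARY KEY,
           contact_client_id INTEGER,
           contact_firstName TEXT NOT NULL,
           contact_lastName TEXT NOT NULL)"""
)
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?, ?)",
    [
        (1, 1, "Ann", "Lee"),
        (2, 1, "Ann", "Lee"),  # same name as row 1
        (3, 1, "Bob", "Ray"),  # unique within client 1
        (4, 1, "Ann", "Lee"),  # same name as rows 1 and 2
        (5, 2, "Bob", "Ray"),  # same name as row 3, but another client
    ],
)

# Pair every row with any *other* row of the same client and name.
# DISTINCT collapses the repeats produced when a name occurs 3+ times.
rows = conn.execute(
    """SELECT DISTINCT c1.contact_id, c1.contact_firstName, c1.contact_lastName
       FROM contacts c1
       INNER JOIN contacts c2
          ON c1.contact_firstName = c2.contact_firstName
         AND c1.contact_lastName = c2.contact_lastName
         AND c1.contact_id <> c2.contact_id
       WHERE c1.contact_client_id = 1
         AND c2.contact_client_id = 1
       ORDER BY c1.contact_id DESC"""
).fetchall()

duplicate_ids = [row[0] for row in rows]
print(duplicate_ids)
```

Note that this form lists every member of a duplicate group, including the lowest contact_id, whereas the query in the question leaves the first entry out.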

Would it make sense to keep track of the number of contacts for each client separately and store it in another column vs doing that sub-select query every time?
With that number of records, it might be more efficient to store some of those calculations instead of having to query them in real time.
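
One possible reading of this suggestion is a small summary table that is refreshed periodically, so that the duplicate check becomes a lookup instead of a GROUP BY over millions of rows. The sketch below uses Python's sqlite3 in place of MySQL; the table name contact_purl_counts, its columns, and the refresh_counts helper are all hypothetical and not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Cut-down stand-in for the real `contacts` table.
    CREATE TABLE contacts (
        contact_id INTEGER PRIMARY KEY,
        contact_client_id INTEGER,
        contact_purl TEXT NOT NULL
    );
    -- Hypothetical summary table: one row per (client, purl).
    CREATE TABLE contact_purl_counts (
        contact_client_id INTEGER NOT NULL,
        contact_purl TEXT NOT NULL,
        min_id INTEGER NOT NULL,
        dup_count INTEGER NOT NULL,
        PRIMARY KEY (contact_client_id, contact_purl)
    );
    INSERT INTO contacts VALUES
        (1, 1, 'annlee'), (2, 1, 'annlee'), (3, 1, 'bobray'), (4, 2, 'annlee');
    """
)


def refresh_counts(connection):
    """Rebuild the summary table from scratch inside one transaction."""
    with connection:
        connection.execute("DELETE FROM contact_purl_counts")
        connection.execute(
            """INSERT INTO contact_purl_counts
               SELECT contact_client_id, contact_purl,
                      MIN(contact_id), COUNT(*)
               FROM contacts
               WHERE contact_client_id IS NOT NULL
               GROUP BY contact_client_id, contact_purl"""
        )


refresh_counts(conn)

# The expensive aggregate is now a plain primary-key range lookup.
duplicates = conn.execute(
    """SELECT contact_purl, min_id, dup_count
       FROM contact_purl_counts
       WHERE contact_client_id = 1 AND dup_count > 1
       ORDER BY contact_purl"""
).fetchall()
print(duplicates)
```

The trade-off is staleness: the summary is only as fresh as the last refresh, so it suits reports better than checks that must see rows inserted a moment ago.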

Related

Find the list of all the customers with a name that contains a letter between two letters

My table, customers, is defined as follows:
CREATE TABLE `customers` (
`customer_id` int(11) DEFAULT NULL,
`account_num` double DEFAULT NULL,
`lname` varchar(50) DEFAULT NULL,
`fname` varchar(50) DEFAULT NULL,
`mi` varchar(50) DEFAULT NULL,
`address1` varchar(50) DEFAULT NULL,
`address2` varchar(50) DEFAULT NULL,
`address3` varchar(50) DEFAULT NULL,
`address4` varchar(50) DEFAULT NULL,
`postal_code` varchar(50) DEFAULT NULL,
`region_id` int(11) DEFAULT NULL,
`phone1` varchar(50) DEFAULT NULL,
`phone2` varchar(50) DEFAULT NULL,
`birthdate` datetime DEFAULT NULL,
`marital_status` varchar(50) DEFAULT NULL,
`yearly_income` varchar(50) DEFAULT NULL,
`gender` varchar(50) DEFAULT NULL,
`total_children` smallint(6) DEFAULT NULL,
`num_children_at_home` smallint(6) DEFAULT NULL,
`education` varchar(50) DEFAULT NULL,
`member_card` varchar(50) DEFAULT NULL,
`occupation` varchar(50) DEFAULT NULL,
`houseowner` varchar(50) DEFAULT NULL,
`num_cars_owned` smallint(6) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
I have to find the list of all the customers with a name that contains a letter between "a" and "d" as the second letter.
My query is not working as needed:
SELECT *
FROM customers
WHERE fname REGEXP '^[A-D]';
You can try this.
SELECT *
FROM customers
WHERE fname REGEXP '^.[A-Da-d]{1}';
SQLFiddle
SELECT *
FROM customers
WHERE SUBSTR(fname,2,1) REGEXP '^[A-D]';
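It can also be done without regex. Note that MySQL's LIKE does not support bracket character classes such as [A-D] (that is SQL Server syntax), so compare the second character directly instead:
SELECT *
FROM customers
WHERE UPPER(SUBSTRING(fname, 2, 1)) BETWEEN 'A' AND 'D';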
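
A quick way to check a second-letter filter is to run it against a few sample names. The sketch below uses Python's sqlite3 as a stand-in for MySQL (SQLite has no REGEXP implementation by default, so it compares the second character with BETWEEN, which works in both databases). The sample names and the two-column table are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical two-column stand-in for the `customers` table.
conn.execute("CREATE TABLE customers (customer_id INTEGER, fname TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [
        (1, "Sam"),    # second letter 'a' -> match
        (2, "Eddie"),  # second letter 'd' -> match
        (3, "Tom"),    # second letter 'o' -> no match
        (4, "Abby"),   # second letter 'b' -> match
        (5, "Al"),     # second letter 'l' -> no match
        (6, "J"),      # no second letter  -> no match
        (7, None),     # NULL name         -> no match
    ],
)

# Take exactly one character starting at position 2, fold it to upper
# case, and test it against the inclusive range A..D.
rows = conn.execute(
    """SELECT fname
       FROM customers
       WHERE UPPER(SUBSTR(fname, 2, 1)) BETWEEN 'A' AND 'D'
       ORDER BY customer_id"""
).fetchall()

matches = [row[0] for row in rows]
print(matches)
```

Wrapping the column in functions prevents an index on fname from being used, which is also true of the REGEXP versions, so on a large table all of these variants scan every row.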

How can I connect the Primary Key, Foreign Key and Alternative Key all together?

I downloaded this database (it includes the schema): https://wyobiz.wy.gov/business/database.aspx
I want to connect the entire database to basically one table.
So far I have this:
/*Table structure for table `filing` */
DROP TABLE IF EXISTS `filing`;
CREATE TABLE `filing` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`FILING_ID` varchar(200) DEFAULT NULL,
`FILING_TYPE` varchar(200) DEFAULT NULL,
`FILING_SUBTYPE` varchar(200) DEFAULT NULL,
`WORD_DESIGN_TYPE` varchar(200) DEFAULT NULL,
`DURATION_TERM_TYPE` varchar(200) DEFAULT NULL,
`STATUS` varchar(200) DEFAULT NULL,
`SUB_STATUS` varchar(200) DEFAULT NULL,
`STANDING_TAX` varchar(200) DEFAULT NULL,
`STANDING_RA` varchar(200) DEFAULT NULL,
`STANDING_OTHER` varchar(200) DEFAULT NULL,
`PURPOSE` varchar(200) DEFAULT NULL,
`APPLICANT_TYPE` varchar(200) DEFAULT NULL,
`FILING_NUM` varchar(200) DEFAULT NULL,
`FILING_NAME` varchar(200) DEFAULT NULL,
`OLD_NAME` varchar(200) DEFAULT NULL,
`FICTITIOUS_NAME` varchar(200) DEFAULT NULL,
`DOMESTIC_YN` varchar(200) DEFAULT NULL,
`FILING_DATE` varchar(200) DEFAULT NULL,
`DELAYED_EFFECTIVE_DATE` varchar(200) DEFAULT NULL,
`EXPIRATION_DATE` varchar(200) DEFAULT NULL,
`INACTIVE_DATE` varchar(200) DEFAULT NULL,
`RA_RESIGN_CERT_LETTER_DATE` varchar(200) DEFAULT NULL,
`CONVERTED_YN` varchar(200) DEFAULT NULL,
`CONVERTED_FROM` varchar(200) DEFAULT NULL,
`CONVERTED_FROM_NAME` varchar(200) DEFAULT NULL,
`CONVERTED_DATE` varchar(200) DEFAULT NULL,
`ISSUE_ON_RECORD_YN` varchar(200) DEFAULT NULL,
`TRANSFERRED_TO` varchar(200) DEFAULT NULL,
`TRANSFERRED_DATE` varchar(200) DEFAULT NULL,
`FORMATION_LOCALE` varchar(200) DEFAULT NULL,
`CONTINUED_FROM_LOCALE` varchar(200) DEFAULT NULL,
`DOMESTICATED_FROM_LOCALE` varchar(200) DEFAULT NULL,
`FORM_HOME_JURIS_DATE` varchar(200) DEFAULT NULL,
`COMMON_SHARES` varchar(200) DEFAULT NULL,
`COMMON_PAR_VALUE` varchar(200) DEFAULT NULL,
`PREFERRED_SHARES` varchar(200) DEFAULT NULL,
`PREFERRED_PAR_VALUE` varchar(200) DEFAULT NULL,
`ADDITIONAL_STOCK_YN` varchar(200) DEFAULT NULL,
`PRINCIPLE_ADDR1` varchar(200) DEFAULT NULL,
`PRINCIPLE_ADDR2` varchar(200) DEFAULT NULL,
`PRINCIPLE_ADDR3` varchar(200) DEFAULT NULL,
`PRINCIPLE_CITY` varchar(200) DEFAULT NULL,
`PRINCIPLE_STATE` varchar(200) DEFAULT NULL,
`PRINCIPLE_POSTAL_CODE` varchar(200) DEFAULT NULL,
`PRINCIPLE_COUNTRY` varchar(200) DEFAULT NULL,
`MAIL_ADDR1` varchar(200) DEFAULT NULL,
`MAIL_ADDR2` varchar(200) DEFAULT NULL,
`MAIL_ADDR3` varchar(200) DEFAULT NULL,
`MAIL_CITY` varchar(1000) DEFAULT NULL,
`MAIL_STATE` varchar(1000) DEFAULT NULL,
`MAIL_POSTAL_CODE` varchar(1000) DEFAULT NULL,
`MAIL_COUNTRY` varchar(1000) DEFAULT NULL,
`STATE_OF_ORG` varchar(1000) DEFAULT NULL,
`ORG_DATE` varchar(1000) DEFAULT NULL,
`REG_US_OFFICE_YN` varchar(1000) DEFAULT NULL,
`REG_US_DATE` varchar(1000) DEFAULT NULL,
`REG_US_SERIAL_NUM` varchar(1000) DEFAULT NULL,
`REG_US_STATUS` varchar(1000) DEFAULT NULL,
`REG_US_APP_REFUSED_YN` varchar(1000) DEFAULT NULL,
`FIRST_USED_ANYWHERE_DATE` blob,
`FIRST_USED_WYO_DATE` blob,
`AR_EXEMPT_YN` blob,
`TRADEMARK_KEYWORDS` blob,
PRIMARY KEY (`ID`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
/*Data for the table `filing` */
/*Table structure for table `filing_annual_report` */
DROP TABLE IF EXISTS `filing_annual_report`;
CREATE TABLE `filing_annual_report` (
`FILING_ANNUAL_REPORT_ID` int(11) NOT NULL,
`FILING_ID` int(11) DEFAULT NULL,
`STATUS` varchar(1000) DEFAULT NULL,
`ANNUAL_REPORT_NUM` varchar(1000) DEFAULT NULL,
`FILING_YEAR` varchar(1000) DEFAULT NULL,
`FILING_DATE` varchar(1000) DEFAULT NULL,
`LICENSE_TAX_AMT` varchar(1000) DEFAULT NULL,
PRIMARY KEY (`FILING_ANNUAL_REPORT_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
/*Data for the table `filing_annual_report` */
/*Table structure for table `party` */
DROP TABLE IF EXISTS `party`;
CREATE TABLE `party` (
`PARTY_ID` int(11) NOT NULL,
`PARTY_TYPE` varchar(1000) DEFAULT NULL,
`SOURCE_ID` varchar(1000) DEFAULT NULL,
`SOURCE_TYPE` varchar(1000) DEFAULT NULL,
`ORG_NAME` varchar(1000) DEFAULT NULL,
`FIRST_NAME` varchar(1000) DEFAULT NULL,
`MIDDLE_NAME` varchar(1000) DEFAULT NULL,
`LAST_NAME` varchar(1000) DEFAULT NULL,
`INDIVIDUAL_TITLE` varchar(1000) DEFAULT NULL,
`ADDR1` varchar(1000) DEFAULT NULL,
`ADDR2` varchar(1000) DEFAULT NULL,
`ADDR3` varchar(1000) DEFAULT NULL,
`CITY` varchar(1000) DEFAULT NULL,
`COUNTY` varchar(1000) DEFAULT NULL,
`STATE` varchar(1000) DEFAULT NULL,
`POSTAL_CODE` varchar(1000) DEFAULT NULL,
`COUNTRY` varchar(1000) DEFAULT NULL,
PRIMARY KEY (`PARTY_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
/*Data for the table `party` */
Once you have imported the CSV Data into your DB, the schema seems to indicate PARTY is linked to FILING using the FK PARTY.SOURCE_ID and FILING_ANNUAL_REPORT is linked to FILING using the FK FILING_ANNUAL_REPORT.FILING_ID.
The query to get all entries in one table would be
select * from FILING f
join PARTY p on p.SOURCE_ID=f.FILING_ID
join FILING_ANNUAL_REPORT a on a.FILING_ID=f.FILING_ID
To speed up the query, build indexes before running it:
create index ff on FILING(filing_id);
create index pf on PARTY(source_id);
create index af on FILING_ANNUAL_REPORT(filing_id);
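
The join and indexes above can be tried on a tiny data set first. This is only a sketch: it uses Python's sqlite3 instead of MySQL, keeps three or so columns per table, stores every key as text for simplicity, and the sample rows are invented. It also shows one thing worth knowing before flattening the data: an inner join drops filings that have no annual report, while a LEFT JOIN keeps them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Cut-down, hypothetical versions of the three tables.
    CREATE TABLE filing (id INTEGER PRIMARY KEY, filing_id TEXT, filing_name TEXT);
    CREATE TABLE party (party_id INTEGER PRIMARY KEY, source_id TEXT, org_name TEXT);
    CREATE TABLE filing_annual_report (
        filing_annual_report_id INTEGER PRIMARY KEY,
        filing_id TEXT,
        filing_year TEXT
    );

    -- Same index names as in the answer.
    CREATE INDEX ff ON filing(filing_id);
    CREATE INDEX pf ON party(source_id);
    CREATE INDEX af ON filing_annual_report(filing_id);

    INSERT INTO filing VALUES (1, '100', 'Acme LLC'), (2, '200', 'Beta Inc');
    INSERT INTO party VALUES (10, '100', 'Agent One'), (11, '200', 'Agent Two');
    INSERT INTO filing_annual_report VALUES
        (1000, '100', '2012'), (1001, '100', '2013');
    """
)

# Inner joins: only filings that have both a party and a report survive.
inner_rows = conn.execute(
    """SELECT f.filing_name, p.org_name, a.filing_year
       FROM filing f
       JOIN party p ON p.source_id = f.filing_id
       JOIN filing_annual_report a ON a.filing_id = f.filing_id
       ORDER BY f.filing_id, a.filing_year"""
).fetchall()

# Left joins: every filing is kept, with NULLs where nothing matches.
left_rows = conn.execute(
    """SELECT f.filing_name, p.org_name, a.filing_year
       FROM filing f
       LEFT JOIN party p ON p.source_id = f.filing_id
       LEFT JOIN filing_annual_report a ON a.filing_id = f.filing_id
       ORDER BY f.filing_id, a.filing_year"""
).fetchall()

print(inner_rows)
print(left_rows)
```

Because a filing can have several parties and several reports, the flattened result repeats the filing columns once per party and report combination.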

uses of federated table doesn't work

I wanted to create a federated table so I have the following statement:
CREATE TABLE `ldap` (
`Key_Id` varchar(35) DEFAULT NULL,
`distinguishedName` varchar(250) DEFAULT NULL,
`name` varchar(50) DEFAULT NULL,
`givenName` varchar(50) DEFAULT NULL,
`sn` varchar(50) DEFAULT NULL,
`employeeID` varchar(7) DEFAULT NULL,
`employeeType` varchar(1) DEFAULT NULL,
`title` varchar(100) DEFAULT NULL,
`description` varchar(250) DEFAULT NULL,
`department` varchar(100) DEFAULT NULL,
`company` varchar(50) DEFAULT NULL,
`telephoneNumber` varchar(50) DEFAULT NULL,
`facsimileTelephoneNumber` varchar(30) DEFAULT NULL,
`homePhone` varchar(30) DEFAULT NULL,
`mobile` varchar(30) DEFAULT NULL,
`otherTelephone` varchar(250) DEFAULT NULL,
`otherMobile` varchar(30) DEFAULT NULL,
`otherHomePhone` varchar(30) DEFAULT NULL,
`ST` varchar(50) DEFAULT NULL,
`Mail` varchar(50) DEFAULT NULL,
`physicalDeliveryOfficeName` varchar(50) DEFAULT NULL,
`streetAddress` varchar(250) DEFAULT NULL,
`postalCode` varchar(50) DEFAULT NULL,
`L` varchar(50) DEFAULT NULL,
`sAMAccountName` varchar(50) DEFAULT NULL,
`houseIdentifier` varchar(100) DEFAULT NULL,
`info` varchar(250) DEFAULT NULL,
`showInAddressBook` tinyint(1) DEFAULT NULL,
`displayName` varchar(150) DEFAULT NULL,
KEY `IX_LDAP_distinguishedName` (`distinguishedName`),
KEY `IX_LDAP_employeeID` (`employeeID`)
)
ENGINE=FEDERATED
DEFAULT CHARSET=latin1
CONNECTION='mysql://admin:xxx#cdmysql:3305/node/ldap';
The ldap table has more than 1000 rows. I want to recreate this table on the other database, and I thought I had to use a federated table. But when I use the code above, the ldap table is created with no rows, and I don't get any errors. What can be the problem?
My version of MySQL is 5.1.73
Look at this post: "how can i enable federated engine in mysql after installation?"
The answer is there. I had the same issue as you, and to solve it I had to enable the FEDERATED engine in MySQL.
Regards! :)

MySQL remove uuid function from table

I have a table that was inherited from a different system and one of the fields has the UUID function enabled so no matter what ID I generate and try to insert, the table creates a completely different one automatically.
I would like to use a PHP function to create the ID instead, but I can't work out how to remove the UUID function from the field.
I am not sure what information will be needed to help you so please feel free to ask.
The field in question is
id, char(36)
The table definition is:
CREATE TABLE `users` (
`id` char(36) NOT NULL,
`user_name` varchar(60) default NULL,
`user_hash` varchar(32) default NULL,
`diary_weekly_view` varchar(500) NOT NULL,
`week_start` date NOT NULL,
`diary_monthly_view` varchar(300) NOT NULL,
`diary_view` int(1) NOT NULL default '1',
`account_search` varchar(1000) NOT NULL,
`cases_search` varchar(500) NOT NULL,
`serials_search` varchar(500) NOT NULL,
`type` varchar(8) NOT NULL,
`email` varchar(255) NOT NULL,
`notes` varchar(3000) NOT NULL,
`authenticate_id` varchar(100) default NULL,
`sugar_login` tinyint(1) default '1',
`first_name` varchar(30) default NULL,
`last_name` varchar(30) default NULL,
`reports_to_id` char(36) default NULL,
`is_admin` tinyint(1) default '0',
`receive_notifications` tinyint(1) default '1',
`date_entered` datetime NOT NULL,
`date_modified` datetime NOT NULL,
`modified_user_id` char(36) default NULL,
`created_by` char(36) default NULL,
`phone_office` varchar(50) default NULL,
`phone_mobile` varchar(50) default NULL,
`status` varchar(25) default NULL,
`address_street` varchar(150) default NULL,
`address_city` varchar(100) default NULL,
`address_state` varchar(100) default NULL,
`address_country` varchar(25) default NULL,
`address_postalcode` varchar(9) default NULL,
`user_preferences` text,
`deleted` tinyint(1) NOT NULL default '0',
`portal_only` tinyint(1) default '0',
`employee_status` varchar(25) default NULL,
`is_group` tinyint(1) default '0',
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf
OK, sussed it, I removed the index and then recreated the index with INDEX

MySQL MIN GROUP BY on large tables ( > 8000 rows)

I have the following query:
SELECT contact_purl, contact_firstName, contact_lastName, MIN( contact_id ) AS MinID
FROM contacts
WHERE contact_client_id = 1
GROUP BY contact_purl
HAVING COUNT( contact_id ) > 1
The purpose is to find any contacts with a duplicate "contact_purl," and return the first entry.
I'm running into a very strange problem. If the table has fewer than 8,000 rows, the query returns in less than 1 second. However, if the table has more than 8,000 rows, the query consistently takes about 338 seconds.
Here is the query plan for the table with ~5000 rows:
And for ~8000 rows:
The table...
CREATE TABLE IF NOT EXISTS `contacts` (
`contact_id` int(11) NOT NULL AUTO_INCREMENT,
`contact_client_id` int(11) DEFAULT NULL,
`contact_sales_id` int(11) DEFAULT NULL,
`contact_campaign_id` int(11) DEFAULT NULL,
`contact_purl` varchar(100) NOT NULL,
`contact_purl1` varchar(50) DEFAULT NULL,
`contact_purl2` varchar(50) DEFAULT NULL,
`contact_firstName` varchar(50) NOT NULL,
`contact_lastName` varchar(50) NOT NULL,
`contact_organization` varchar(100) DEFAULT NULL,
`contact_url_organization` varchar(200) DEFAULT NULL,
`contact_position` varchar(50) DEFAULT NULL,
`contact_email` varchar(100) DEFAULT NULL,
`contact_phone` varchar(20) DEFAULT NULL,
`contact_fax` varchar(20) NOT NULL,
`contact_address1` varchar(100) DEFAULT NULL,
`contact_address2` varchar(100) DEFAULT NULL,
`contact_city` varchar(100) DEFAULT NULL,
`contact_state` varchar(20) DEFAULT NULL,
`contact_zip` varchar(10) DEFAULT NULL,
`contact_IP` varchar(50) DEFAULT NULL,
`contact_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`contact_pw` varchar(200) NOT NULL,
`contact_subscribed` varchar(1) NOT NULL DEFAULT 'Y',
`contact_import` varchar(200) DEFAULT NULL,
`contacts_c_1` varchar(500) DEFAULT NULL,
`contacts_c_2` varchar(500) DEFAULT NULL,
`contacts_c_3` varchar(500) DEFAULT NULL,
`contacts_c_4` varchar(500) DEFAULT NULL,
`contacts_c_5` varchar(500) DEFAULT NULL,
`contacts_c_6` varchar(500) DEFAULT NULL,
`contacts_c_7` varchar(500) DEFAULT NULL,
`contacts_c_8` varchar(500) DEFAULT NULL,
`contacts_c_9` varchar(500) DEFAULT NULL,
`contacts_c_10` varchar(500) DEFAULT NULL,
`contacts_c_11` varchar(500) DEFAULT NULL,
`contacts_c_12` varchar(500) DEFAULT NULL,
`contacts_c_13` varchar(500) DEFAULT NULL,
`contacts_c_14` varchar(500) DEFAULT NULL,
`contacts_c_15` varchar(500) DEFAULT NULL,
`contacts_c_16` varchar(500) DEFAULT NULL,
`contacts_c_17` varchar(500) DEFAULT NULL,
`contacts_c_18` varchar(500) DEFAULT NULL,
`contacts_c_19` varchar(500) DEFAULT NULL,
`contacts_c_20` varchar(500) DEFAULT NULL,
`contacts_c_21` varchar(500) DEFAULT NULL,
`contacts_c_22` varchar(500) DEFAULT NULL,
`contacts_c_23` varchar(500) DEFAULT NULL,
`contacts_c_24` varchar(500) DEFAULT NULL,
`contacts_c_25` varchar(500) DEFAULT NULL,
`contacts_c_26` varchar(500) DEFAULT NULL,
`contacts_c_27` varchar(500) DEFAULT NULL,
`contacts_c_28` varchar(500) DEFAULT NULL,
`contacts_c_29` varchar(500) DEFAULT NULL,
`contacts_c_30` varchar(500) DEFAULT NULL,
`contacts_c_31` varchar(500) DEFAULT NULL,
`contacts_c_32` varchar(500) DEFAULT NULL,
`contacts_c_33` varchar(500) DEFAULT NULL,
`contacts_c_34` varchar(500) DEFAULT NULL,
`contacts_c_35` varchar(500) DEFAULT NULL,
`contacts_c_36` varchar(500) DEFAULT NULL,
`contacts_c_37` varchar(500) DEFAULT NULL,
`contacts_c_38` varchar(500) DEFAULT NULL,
`contacts_c_39` varchar(500) DEFAULT NULL,
`contacts_c_40` varchar(500) DEFAULT NULL,
`contacts_c_41` varchar(500) DEFAULT NULL,
`contacts_c_42` varchar(500) DEFAULT NULL,
`contacts_c_43` varchar(500) DEFAULT NULL,
`contacts_c_44` varchar(500) DEFAULT NULL,
`contacts_c_45` varchar(500) DEFAULT NULL,
`contacts_c_46` varchar(500) DEFAULT NULL,
`contacts_c_47` varchar(500) DEFAULT NULL,
`contacts_c_48` varchar(500) DEFAULT NULL,
`contacts_c_49` varchar(500) DEFAULT NULL,
`contacts_c_50` varchar(500) DEFAULT NULL,
`contacts_i_1` varchar(100) DEFAULT NULL,
`contacts_i_2` varchar(100) DEFAULT NULL,
`contacts_i_3` varchar(100) DEFAULT NULL,
`contacts_i_4` varchar(100) DEFAULT NULL,
`contacts_i_5` varchar(100) DEFAULT NULL,
`contacts_i_6` varchar(100) DEFAULT NULL,
`contacts_i_7` varchar(100) DEFAULT NULL,
`contacts_i_8` varchar(100) DEFAULT NULL,
`contacts_i_9` varchar(100) DEFAULT NULL,
`contacts_i_10` varchar(100) DEFAULT NULL,
`contacts_i_11` varchar(100) DEFAULT NULL,
`contacts_i_12` varchar(100) DEFAULT NULL,
`contacts_i_13` varchar(100) DEFAULT NULL,
`contacts_i_14` varchar(100) DEFAULT NULL,
`contacts_i_15` varchar(100) DEFAULT NULL,
PRIMARY KEY (`contact_id`),
KEY `contact_campaign_id` (`contact_campaign_id`),
KEY `contact_client_id` (`contact_client_id`),
KEY `contact_purl2` (`contact_purl2`),
KEY `contact_purl1` (`contact_purl1`),
KEY `contact_purl` (`contact_purl`)
)
I have recently Optimized and Defragmented the table as well.
Any ideas on what would be causing this?
First off, thank you for posting your table structure, query, and EXPLAIN output in your question. I think you're crossing the memory / disk temporary table size boundary, thus the large performance change.
If you put a unique index on the contact_purl column, MySQL won't allow duplicates to be inserted. This would make your query unnecessary. Otherwise, I'd create an index on (contact_client_id, contact_purl) so MySQL can figure out what rows you want from the indexes directly.
You could also try separating the search for the columns and retrieving them by using a subquery. Something like this maybe:
SELECT contact_purl, contact_firstName, contact_lastName, contact_id
FROM contacts, (SELECT MIN(contact_id) AS MinID
FROM contacts
WHERE contact_client_id = 1
GROUP BY contact_purl
HAVING COUNT( contact_id ) > 1) nodups WHERE nodups.MinID = contacts.contact_id
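
To make the composite index and the subquery approach concrete, here is a small sketch. It runs on Python's sqlite3 rather than MySQL, uses a cut-down contacts table with invented rows, and the index name client_purl is a placeholder. It shows the shape of the query and its result, not the timing, which depends on the real data and server settings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical, cut-down `contacts` table.
    CREATE TABLE contacts (
        contact_id INTEGER PRIMARY KEY,
        contact_client_id INTEGER,
        contact_purl TEXT NOT NULL,
        contact_firstName TEXT NOT NULL,
        contact_lastName TEXT NOT NULL
    );
    -- Composite index: filter column first, grouping column second.
    CREATE INDEX client_purl ON contacts (contact_client_id, contact_purl);

    INSERT INTO contacts VALUES
        (1, 1, 'annlee', 'Ann', 'Lee'),
        (2, 1, 'annlee', 'Ann', 'Lee'),
        (3, 1, 'bobray', 'Bob', 'Ray'),
        (4, 2, 'bobray', 'Bob', 'Ray'),
        (5, 1, 'annlee', 'Ann', 'Lee');
    """
)

# The derived table only needs contact_client_id, contact_purl and
# contact_id, so the aggregate can be answered from the index; the wide
# name columns are fetched afterwards by primary key.
first_of_each_duplicate = conn.execute(
    """SELECT c.contact_purl, c.contact_firstName, c.contact_lastName,
              c.contact_id
       FROM contacts c
       JOIN (SELECT MIN(contact_id) AS MinID
             FROM contacts
             WHERE contact_client_id = 1
             GROUP BY contact_purl
             HAVING COUNT(contact_id) > 1) nodups
         ON nodups.MinID = c.contact_id
       ORDER BY c.contact_id"""
).fetchall()
print(first_of_each_duplicate)
```

Only 'annlee' is reported: 'bobray' occurs twice in the table but only once for client 1, so it is not a duplicate within that client.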