Need help adding fields using MySQL

I am using a MySQL DB to manage employees. I maintain the DB in phpMyAdmin. I want to add 70 new fields to a table using SQL. I thought this would work. Can you tell me why it doesn't?
CREATE TABLE IF NOT EXISTS `dist` (
`e_employee1` varchar(255) NOT NULL,
`e_name1` varchar(255) NOT NULL,
`e_title1` varchar(255) NOT NULL,
`e_phone1` int(11) NOT NULL,
`e_ext1` int(11) NOT NULL,
`e_phone21` int(11) NOT NULL,
`e_email1` varchar(255) NOT NULL,
`e_employee2` varchar(255) NOT NULL,
`e_name2` varchar(255) NOT NULL,
`e_title2` varchar(255) NOT NULL,
`e_phone2` int(11) NOT NULL,
`e_ext2` int(11) NOT NULL,
`e_phone22` int(11) NOT NULL,
`e_email12` varchar(255) NOT NULL,
`e_employee3` varchar(255) NOT NULL,
`e_name3` varchar(255) NOT NULL,
`e_title3` varchar(255) NOT NULL,
`e_phone3` int(11) NOT NULL,
`e_ext3` int(11) NOT NULL,
`e_phone23` int(11) NOT NULL,
`e_email3` varchar(255) NOT NULL,
`e_employee4` varchar(255) NOT NULL,
`e_name4` varchar(255) NOT NULL,
`e_title4` varchar(255) NOT NULL,
`e_phone4` int(11) NOT NULL,
`e_ext4` int(11) NOT NULL,
`e_phone24` int(11) NOT NULL,
`e_email4` varchar(255) NOT NULL,
`e_employee5` varchar(255) NOT NULL,
`e_name5` varchar(255) NOT NULL,
`e_title5` varchar(255) NOT NULL,
`e_phone5` int(11) NOT NULL,
`e_ext5` int(11) NOT NULL,
`e_phone25` int(11) NOT NULL,
`e_email5` varchar(255) NOT NULL,
`e_employee6` varchar(255) NOT NULL,
`e_name6` varchar(255) NOT NULL,
`e_title6` varchar(255) NOT NULL,
`e_phone6` int(11) NOT NULL,
`e_ext6` int(11) NOT NULL,
`e_phone26` int(11) NOT NULL,
`e_email6` varchar(255) NOT NULL,
`e_employee7` varchar(255) NOT NULL,
`e_name7` varchar(255) NOT NULL,
`e_title7` varchar(255) NOT NULL,
`e_phone7` int(11) NOT NULL,
`e_ext7` int(11) NOT NULL,
`e_phone27` int(11) NOT NULL,
`e_email7` varchar(255) NOT NULL,
`e_employee8` varchar(255) NOT NULL,
`e_name8` varchar(255) NOT NULL,
`e_title8` varchar(255) NOT NULL,
`e_phone8` int(11) NOT NULL,
`e_ext8` int(11) NOT NULL,
`e_phone28` int(11) NOT NULL,
`e_email8` varchar(255) NOT NULL,
`e_employee9` varchar(255) NOT NULL,
`e_name9` varchar(255) NOT NULL,
`e_title9` varchar(255) NOT NULL,
`e_phone9` int(11) NOT NULL,
`e_ext9` int(11) NOT NULL,
`e_phone29` int(11) NOT NULL,
`e_email9` varchar(255) NOT NULL,
`e_employee10` varchar(255) NOT NULL,
`e_name10` varchar(255) NOT NULL,
`e_title10` varchar(255) NOT NULL,
`e_phone10` int(11) NOT NULL,
`e_ext10` int(11) NOT NULL,
`e_phone210` int(11) NOT NULL,
`e_email10` varchar(255) NOT NULL,
KEY `id` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=14 ;

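You are using CREATE TABLE, which only creates a new table. To add fields to an existing table, use ALTER TABLE instead. Because the statement says IF NOT EXISTS, MySQL will not report an error when the dist table already exists; it skips the statement (at most with a warning), so no fields are added.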
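As a rough sketch of what the ALTER TABLE version could look like, the short Python script below builds one statement that adds the seven per-employee columns for employees 1 through 10 (70 columns in total). The column suffixes and types are copied from the question, assuming that e_email12 was a typo for e_email2. The helper name build_alter_table is made up for illustration, and the generated SQL should be reviewed before it is run in phpMyAdmin.

```python
# Column suffixes and types copied from the question's CREATE TABLE statement.
FIELDS = [
    ("employee", "varchar(255)"),
    ("name", "varchar(255)"),
    ("title", "varchar(255)"),
    ("phone", "int(11)"),
    ("ext", "int(11)"),
    ("phone2", "int(11)"),
    ("email", "varchar(255)"),
]


def build_alter_table(table="dist", employees=10):
    """Return one ALTER TABLE statement adding every e_<field><n> column.

    Hypothetical helper: the table name and employee count are assumptions
    taken from the question, not from any existing tool.
    """
    clauses = [
        f"  ADD COLUMN `e_{field}{n}` {sql_type} NOT NULL"
        for n in range(1, employees + 1)
        for field, sql_type in FIELDS
    ]
    return f"ALTER TABLE `{table}`\n" + ",\n".join(clauses) + ";"


if __name__ == "__main__":
    # Print the statement so it can be pasted into the phpMyAdmin SQL tab.
    print(build_alter_table())
```

Putting all ADD COLUMN clauses in a single ALTER TABLE statement changes the table once, rather than running 70 separate statements.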

Related

prevent django test runner from creating stale table

I have a MariaDB database that used to be configured with CHARSET utf8 COLLATE utf8_general_ci but now uses CHARSET utf8mb4 COLLATE utf8mb4_unicode_ci. All tables have the same CHARSET and COLLATE as the database.
When I run ./manage.py test, the stack trace looks like this:
....
django.db.utils.OperationalError: (1118, 'Row size too large (> 8126). Changing some columns to TEXT or BLOB may help. In current row format, BLOB prefix of 0 bytes is stored inline.')
I managed to find the table that causes the trouble; the SQL query looks like the following. Note that I changed the names of the table and fields for security:
CREATE TABLE `troubling_table`
(
`id` INTEGER auto_increment NOT NULL PRIMARY KEY,
`no_tax` VARCHAR(20) NOT NULL,
`cd_pc` VARCHAR(7) NOT NULL,
`cd_wdept` VARCHAR(12) NOT NULL,
`id_write` VARCHAR(20) NULL,
`cd_docu` VARCHAR(10) NULL,
`dt_acct` VARCHAR(8) NULL,
`st_docu` VARCHAR(3) NULL,
`tp_drcr` VARCHAR(3) NULL,
`cd_acct` VARCHAR(20) NULL,
`amt` NUMERIC(19, 4) NULL,
`cd_partner` VARCHAR(20) NULL,
`nm_partner` VARCHAR(50) NULL,
`tp_job` VARCHAR(40) NULL,
`cls_job` VARCHAR(40) NULL,
`ads_hd` VARCHAR(400) NULL,
`nm_ceo` VARCHAR(40) NULL,
`dt_start` VARCHAR(8) NULL,
`dt_end` VARCHAR(8) NULL,
`am_taxstd` NUMERIC(19, 4) NULL,
`am_addtax` NUMERIC(19, 4) NULL,
`tp_tax` VARCHAR(10) NULL,
`no_company` VARCHAR(20) NULL,
`dts_insert` VARCHAR(20) NULL,
`id_insert` VARCHAR(20) NULL,
`dts_update` VARCHAR(20) NULL,
`id_update` VARCHAR(20) NULL,
`nm_note` VARCHAR(100) NULL,
`cd_bizarea` VARCHAR(12) NULL,
`cd_dept` VARCHAR(12) NULL,
`cd_cc` VARCHAR(12) NULL,
`cd_pjt` VARCHAR(20) NULL,
`cd_fund` VARCHAR(20) NULL,
`cd_budget` VARCHAR(20) NULL,
`no_cash` VARCHAR(20) NULL,
`st_mutual` VARCHAR(3) NULL,
`cd_card` VARCHAR(20) NULL,
`no_deposit` VARCHAR(20) NULL,
`cd_bank` VARCHAR(20) NULL,
`ucd_mng1` VARCHAR(20) NULL,
`ucd_mng2` VARCHAR(20) NULL,
`ucd_mng3` VARCHAR(20) NULL,
`ucd_mng4` VARCHAR(20) NULL,
`ucd_mng5` VARCHAR(20) NULL,
`cd_employ` VARCHAR(20) NULL,
`cd_mng` VARCHAR(20) NULL,
`no_bdocu` VARCHAR(20) NULL,
`no_bdoline` NUMERIC(4, 0) NULL,
`tp_docu` VARCHAR(3) NULL,
`no_acct` NUMERIC(5, 0) NULL,
`tp_trade` VARCHAR(10) NULL,
`no_check` VARCHAR(20) NULL,
`no_check1` VARCHAR(20) NULL,
`cd_exch` VARCHAR(10) NULL,
`rt_exch` NUMERIC(10, 4) NULL,
`cd_trade` VARCHAR(10) NULL,
`no_check2` VARCHAR(50) NULL,
`no_check3` VARCHAR(50) NULL,
`no_check4` VARCHAR(100) NULL,
`tp_cross` VARCHAR(1) NULL,
`erp_cd` VARCHAR(50) NULL,
`am_ex` NUMERIC(19, 4) NULL,
`tp_export` VARCHAR(1) NULL,
`no_to` VARCHAR(20) NULL,
`dt_shipping` VARCHAR(8) NULL,
`tp_gubun` VARCHAR(3) NULL,
`no_invoice` VARCHAR(20) NULL,
`no_item` VARCHAR(20) NULL,
`md_tax1` VARCHAR(4) NULL,
`nm_item1` VARCHAR(50) NULL,
`nm_size1` VARCHAR(20) NULL,
`qt_tax1` NUMERIC(17, 4) NULL,
`am_prc1` NUMERIC(19, 4) NULL,
`am_supply1` NUMERIC(19, 4) NULL,
`am_tax1` NUMERIC(19, 4) NULL,
`nm_note1` VARCHAR(20) NULL,
`cd_bizplan` VARCHAR(20) NULL,
`cd_bgacct` VARCHAR(10) NULL,
`cd_mngd1` VARCHAR(20) NULL,
`nm_mngd1` VARCHAR(100) NULL,
`cd_mngd2` VARCHAR(20) NULL,
`nm_mngd2` VARCHAR(100) NULL,
`cd_mngd3` VARCHAR(20) NULL,
`nm_mngd3` VARCHAR(100) NULL,
`cd_mngd4` VARCHAR(20) NULL,
`nm_mngd4` VARCHAR(100) NULL,
`cd_mngd5` VARCHAR(20) NULL,
`nm_mngd5` VARCHAR(100) NULL,
`cd_mngd6` VARCHAR(20) NULL,
`nm_mngd6` VARCHAR(100) NULL,
`cd_mngd7` VARCHAR(20) NULL,
`nm_mngd7` VARCHAR(100) NULL,
`cd_mngd8` VARCHAR(20) NULL,
`nm_mngd8` VARCHAR(100) NULL,
`yn_iss` VARCHAR(1) NULL,
`final_status` VARCHAR(2) NULL,
`no_bill` VARCHAR(24) NULL,
`tp_bill` VARCHAR(1) NULL,
`tp_record` VARCHAR(1) NULL,
`tp_etcacct` VARCHAR(1) NULL,
`st_gware` VARCHAR(3) NULL,
`sell_dam_nm` VARCHAR(30) NULL,
`sell_dam_email` VARCHAR(50) NULL,
`sell_dam_mobil` VARCHAR(20) NULL,
`nm_pumm` VARCHAR(100) NULL,
`jeonjasend15_yn` VARCHAR(1) NULL,
`dt_write` VARCHAR(8) NULL,
`st_tax` VARCHAR(1) NULL,
`md_tax2` VARCHAR(4) NULL,
`nm_item2` VARCHAR(50) NULL,
`nm_size2` VARCHAR(20) NULL,
`qt_tax2` NUMERIC(17, 4) NULL,
`am_prc2` NUMERIC(19, 4) NULL,
`am_supply2` NUMERIC(19, 4) NULL,
`am_tax2` NUMERIC(19, 4) NULL,
`nm_note2` VARCHAR(20) NULL,
`md_tax3` VARCHAR(4) NULL,
`nm_item3` VARCHAR(50) NULL,
`nm_size3` VARCHAR(20) NULL,
`qt_tax3` NUMERIC(17, 4) NULL,
`am_prc3` NUMERIC(19, 4) NULL,
`am_supply3` NUMERIC(19, 4) NULL,
`am_tax3` NUMERIC(19, 4) NULL,
`nm_note3` VARCHAR(20) NULL,
`md_tax4` VARCHAR(4) NULL,
`nm_item4` VARCHAR(50) NULL,
`nm_size4` VARCHAR(20) NULL,
`qt_tax4` NUMERIC(17, 4) NULL,
`am_prc4` NUMERIC(19, 4) NULL,
`am_supply4` NUMERIC(19, 4) NULL,
`am_tax4` NUMERIC(19, 4) NULL,
`nm_note4` VARCHAR(20) NULL,
`no_asset` VARCHAR(20) NULL,
`nm_bigo` VARCHAR(100) NULL,
`nm_ptr` VARCHAR(20) NULL,
`ex_hp` VARCHAR(15) NULL,
`ex_emil` VARCHAR(100) NULL,
`no_biztax` VARCHAR(8) NULL,
`yn_import` VARCHAR(1) NULL,
`ref_no_docu` VARCHAR(20) NULL,
`cd_fx` VARCHAR(2) NULL,
`fx_bill` VARCHAR(20) NULL,
`no_iss` VARCHAR(24) NULL,
`file_attach` VARCHAR(100) NULL,
`tp_evidence` VARCHAR(4) NULL,
`st_bizbox` VARCHAR(1) NULL,
`tp_input` VARCHAR(30) NULL,
`sell_dam_tel` VARCHAR(20) NULL,
`no_car` VARCHAR(20) NULL,
`no_carbody` VARCHAR(17) NULL,
`dec_lease` VARCHAR(100) NULL,
`no_tdocu` VARCHAR(20) NULL,
`no_tdoline` NUMERIC(4, 0) NULL,
`cd_bizcar` VARCHAR(20) NULL,
`cd_taxacct` VARCHAR(10) NULL,
`yn_fixasset` VARCHAR(1) NULL
)
If I run this query in an SQL editor, I get the same error as in Django. The error didn't happen when I created the database with the old character set and collation, but it is raised when I run the tests. The different charset may be one of the reasons.
So I deleted the model and applied the migrations, since that model is no longer in use. The table is therefore stale.
But even after the model is removed, the Django test runner still tries to create that stale table.
Does the Django test runner go through every migration file from the start? Is that why I can't run tests when a model was originally created with the old charset and collation?
How can I prevent the Django test runner from creating a stale table that no longer exists, given that the old table conflicts with the new charset and collation, without changing the database's charset and collation?

Count bookings per trip - return 0 if none

I have two tables,
CREATE TABLE `voyages` (
`voyage_id` int(11) NOT NULL,
`voyage_type` int(11) NOT NULL,
`voyage_groupBooking` tinyint(4) NOT NULL DEFAULT 0,
`voyage_live` tinyint(4) NOT NULL DEFAULT 0,
`voyage_featured` tinyint(4) NOT NULL DEFAULT 0,
`voyage_name` varchar(60) NOT NULL,
`voyage_slug` varchar(60) NOT NULL,
`voyage_shortDescription` varchar(150) NOT NULL,
`voyage_shortPageDescription` text NOT NULL,
`voyage_tag` varchar(20) DEFAULT NULL,
`voyage_detail` text NOT NULL,
`voyage_ageBracket` text NOT NULL DEFAULT '14-18',
`voyage_included` varchar(150) NOT NULL,
`voyage_image` text DEFAULT NULL,
`voyage_startDate` date NOT NULL,
`voyage_startTime` time NOT NULL,
`voyage_endDate` date NOT NULL,
`voyage_cost` decimal(11,2) NOT NULL,
`voyage_miles` int(11) DEFAULT NULL,
`voyage_hours` int(11) DEFAULT NULL,
`voyage_ports` int(11) DEFAULT NULL,
`voyage_deposit` int(2) NOT NULL DEFAULT 0,
`voyage_crewBerth` tinyint(2) NOT NULL,
`voyage_Afterguard` tinyint(2) NOT NULL,
`voyage_map` text DEFAULT NULL,
`voyage_mapZoom` tinyint(4) NOT NULL DEFAULT 8,
`voyage_addressName` varchar(150) NOT NULL,
`voyage_streetAddress` varchar(150) NOT NULL,
`voyage_locality` varchar(150) NOT NULL,
`voyage_postalCode` varchar(150) NOT NULL,
`voyage_region` varchar(150) NOT NULL,
`voyage_country` varchar(150) NOT NULL,
`voyage_gallery` text DEFAULT NULL,
`voyage_deleted` tinyint(4) NOT NULL DEFAULT 0
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
And
CREATE TABLE `bookings` (
`booking_id` int(11) NOT NULL,
`booking_status` tinyint(4) NOT NULL DEFAULT 0,
`booking_reference` varchar(60) NOT NULL,
`booking_stripeCustomerReference` varchar(150) NOT NULL,
`booking_stripeDepositInvoice` varchar(150) DEFAULT NULL,
`booking_stripeBalanceInvoice` varchar(150) DEFAULT NULL,
`booking_depositCharged` decimal(10,2) NOT NULL DEFAULT 0.00,
`booking_balanceCharged` decimal(10,2) NOT NULL DEFAULT 0.00,
`booking_totalPaid` decimal(10,2) NOT NULL DEFAULT 0.00,
`booking_voyageID` int(11) NOT NULL,
`booking_firstName` varchar(60) NOT NULL,
`booking_lastName` varchar(60) NOT NULL,
`booking_dob` date NOT NULL,
`booking_gender` varchar(15) NOT NULL,
`booking_nationality` varchar(60) NOT NULL,
`booking_passport` varchar(25) DEFAULT NULL,
`booking_email` varchar(150) NOT NULL,
`booking_mobile` varchar(60) DEFAULT NULL,
`booking_house` varchar(150) NOT NULL,
`booking_street` varchar(150) NOT NULL,
`booking_city` varchar(60) NOT NULL,
`booking_county` varchar(60) DEFAULT NULL,
`booking_postcode` varchar(20) NOT NULL,
`booking_medical` text NOT NULL,
`booking_allergies` text NOT NULL,
`booking_swim` tinyint(4) NOT NULL,
`booking_diet` tinyint(4) NOT NULL,
`booking_emergFirstName` varchar(60) NOT NULL,
`booking_emergLastName` varchar(60) NOT NULL,
`booking_emergHouse` varchar(150) NOT NULL,
`booking_emergStreet` varchar(150) NOT NULL,
`booking_emergCity` varchar(60) NOT NULL,
`booking_emergCounty` varchar(60) DEFAULT NULL,
`booking_emergPostCode` varchar(20) NOT NULL,
`booking_emergMobile` varchar(60) NOT NULL,
`booking_emergPhone` varchar(60) DEFAULT NULL,
`booking_emergRelationship` varchar(150) NOT NULL,
`booking_dec1` tinyint(4) NOT NULL,
`booking_dec2` tinyint(4) NOT NULL,
`booking_dec3` tinyint(4) NOT NULL,
`booking_dec4` tinyint(4) NOT NULL,
`booking_media1` tinyint(4) DEFAULT 0,
`booking_media2` int(11) DEFAULT 0,
`booking_contractEmail` varchar(150) NOT NULL,
`booking_contractName` varchar(60) NOT NULL,
`booking_contractDate` date NOT NULL,
`booking_adminNotes` text DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
I'm trying to count how many bookings there are for each voyage, returning 0 if there are none.
e.g:
Voyage 1 = 0,
Voyage 2 = 3,
Voyage 3 = 5
and so on.
At the moment I have the following, but it doesn't seem to be working. I only have one row of test data in the bookings table and 17 voyages in the voyages table.
SELECT voyage_name, voyage_id, bookings.booking_voyageID,
COUNT(bookings.booking_voyageID) AS bookingcount
FROM voyages
LEFT JOIN bookings ON voyages.voyage_id = bookings.booking_voyageID
ORDER BY voyage_name asc
I need my SQL query to return a count of 0 if there are no bookings.
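COUNT(bookings.booking_voyageID) is not the problem by itself: it only counts non-NULL values, so a voyage with no bookings already gives 0. What is missing is a GROUP BY; without it, MySQL either rejects the query or collapses all voyages into a single row.
If you want the 0/1 logic to be explicit, you can use SUM with a CASE expression instead.
Try this: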
SELECT
voyage_id,
bookings.booking_voyageID,
SUM(CASE WHEN bookings.booking_voyageID IS NULL THEN 0 ELSE 1 END) AS bookingcount
FROM
voyages
LEFT JOIN
bookings
ON
voyages.voyage_id = bookings.booking_voyageID
GROUP BY
voyage_id
ORDER BY
voyage_id;
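The grouping behaviour can be checked on a tiny data set. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL; the two tables are cut down to the columns the query needs, and the sample rows are invented to match the example counts in the question. Once GROUP BY is present, COUNT on the joined column and the SUM(CASE ...) expression give the same numbers here.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE voyages (voyage_id INTEGER PRIMARY KEY, voyage_name TEXT NOT NULL);
    CREATE TABLE bookings (booking_id INTEGER PRIMARY KEY, booking_voyageID INTEGER NOT NULL);
    INSERT INTO voyages VALUES (1, 'Voyage 1'), (2, 'Voyage 2'), (3, 'Voyage 3');
    INSERT INTO bookings (booking_voyageID) VALUES (2), (2), (2), (3), (3), (3), (3), (3);
    """
)

query = """
    SELECT v.voyage_name,
           COUNT(b.booking_voyageID) AS bookingcount,
           SUM(CASE WHEN b.booking_voyageID IS NULL THEN 0 ELSE 1 END) AS bookingsum
    FROM voyages AS v
    LEFT JOIN bookings AS b ON v.voyage_id = b.booking_voyageID
    GROUP BY v.voyage_id, v.voyage_name
    ORDER BY v.voyage_name
"""
rows = conn.execute(query).fetchall()
for row in rows:
    # Each row is (voyage_name, bookingcount, bookingsum).
    print(row)
```

Voyage 1 has no bookings, so the LEFT JOIN produces one row with a NULL booking_voyageID for it, and both expressions evaluate to 0.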

Rails 4 Migrations from MySQL Tables

I have a pre seeded database for Countries/Regions/Cities. Is it possible to generate the migration file automatically for these tables?
CREATE TABLE IF NOT EXISTS `cities` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`country_id` int(11) unsigned NOT NULL,
`region_id` int(11) unsigned NOT NULL,
`name` varchar(255) NOT NULL,
`Latitude` float NOT NULL,
`Longitude` float NOT NULL,
`TimeZone` varchar(10) NOT NULL,
`DmaId` smallint(6) DEFAULT NULL,
`County` varchar(25) DEFAULT NULL,
`Code` varchar(4) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=42965 ;
CREATE TABLE IF NOT EXISTS `countries` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
`FIPS104` varchar(2) NOT NULL,
`ISO2` varchar(2) NOT NULL,
`ISO3` varchar(3) NOT NULL,
`ISON` varchar(4) NOT NULL,
`Internet` varchar(2) NOT NULL,
`Capital` varchar(25) DEFAULT NULL,
`MapReference` varchar(50) DEFAULT NULL,
`NationalitySingular` varchar(35) DEFAULT NULL,
`NationalityPlural` varchar(35) DEFAULT NULL,
`Currency` varchar(30) DEFAULT NULL,
`CurrencyCode` varchar(3) DEFAULT NULL,
`Population` bigint(20) DEFAULT NULL,
`Title` varchar(50) DEFAULT NULL,
`Comment` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=276 ;
CREATE TABLE IF NOT EXISTS `regions` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`country_id` int(11) unsigned NOT NULL,
`name` varchar(255) NOT NULL,
`Code` varchar(8) NOT NULL,
`ADM1Code` char(4) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=5400 ;
You can generate db/schema.rb from the existing database with:
rake db:schema:dump

MySQL MIN GROUP BY on large tables ( > 8000 rows)

I have the following query:
SELECT contact_purl, contact_firstName, contact_lastName, MIN( contact_id ) AS MinID
FROM contacts
WHERE contact_client_id = 1
GROUP BY contact_purl
HAVING COUNT( contact_id ) > 1
The purpose is to find any contacts with a duplicate contact_purl and return the first entry.
I'm running into a very strange problem: if the table has fewer than 8,000 rows, the query finishes in less than 1 second. However, if the table has more than 8,000 rows, the query consistently takes about 338 seconds.
Here is the query plan for the table with ~5000 rows:
And for ~8000 rows:
The table...
CREATE TABLE IF NOT EXISTS `contacts` (
`contact_id` int(11) NOT NULL AUTO_INCREMENT,
`contact_client_id` int(11) DEFAULT NULL,
`contact_sales_id` int(11) DEFAULT NULL,
`contact_campaign_id` int(11) DEFAULT NULL,
`contact_purl` varchar(100) NOT NULL,
`contact_purl1` varchar(50) DEFAULT NULL,
`contact_purl2` varchar(50) DEFAULT NULL,
`contact_firstName` varchar(50) NOT NULL,
`contact_lastName` varchar(50) NOT NULL,
`contact_organization` varchar(100) DEFAULT NULL,
`contact_url_organization` varchar(200) DEFAULT NULL,
`contact_position` varchar(50) DEFAULT NULL,
`contact_email` varchar(100) DEFAULT NULL,
`contact_phone` varchar(20) DEFAULT NULL,
`contact_fax` varchar(20) NOT NULL,
`contact_address1` varchar(100) DEFAULT NULL,
`contact_address2` varchar(100) DEFAULT NULL,
`contact_city` varchar(100) DEFAULT NULL,
`contact_state` varchar(20) DEFAULT NULL,
`contact_zip` varchar(10) DEFAULT NULL,
`contact_IP` varchar(50) DEFAULT NULL,
`contact_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`contact_pw` varchar(200) NOT NULL,
`contact_subscribed` varchar(1) NOT NULL DEFAULT 'Y',
`contact_import` varchar(200) DEFAULT NULL,
`contacts_c_1` varchar(500) DEFAULT NULL,
`contacts_c_2` varchar(500) DEFAULT NULL,
`contacts_c_3` varchar(500) DEFAULT NULL,
`contacts_c_4` varchar(500) DEFAULT NULL,
`contacts_c_5` varchar(500) DEFAULT NULL,
`contacts_c_6` varchar(500) DEFAULT NULL,
`contacts_c_7` varchar(500) DEFAULT NULL,
`contacts_c_8` varchar(500) DEFAULT NULL,
`contacts_c_9` varchar(500) DEFAULT NULL,
`contacts_c_10` varchar(500) DEFAULT NULL,
`contacts_c_11` varchar(500) DEFAULT NULL,
`contacts_c_12` varchar(500) DEFAULT NULL,
`contacts_c_13` varchar(500) DEFAULT NULL,
`contacts_c_14` varchar(500) DEFAULT NULL,
`contacts_c_15` varchar(500) DEFAULT NULL,
`contacts_c_16` varchar(500) DEFAULT NULL,
`contacts_c_17` varchar(500) DEFAULT NULL,
`contacts_c_18` varchar(500) DEFAULT NULL,
`contacts_c_19` varchar(500) DEFAULT NULL,
`contacts_c_20` varchar(500) DEFAULT NULL,
`contacts_c_21` varchar(500) DEFAULT NULL,
`contacts_c_22` varchar(500) DEFAULT NULL,
`contacts_c_23` varchar(500) DEFAULT NULL,
`contacts_c_24` varchar(500) DEFAULT NULL,
`contacts_c_25` varchar(500) DEFAULT NULL,
`contacts_c_26` varchar(500) DEFAULT NULL,
`contacts_c_27` varchar(500) DEFAULT NULL,
`contacts_c_28` varchar(500) DEFAULT NULL,
`contacts_c_29` varchar(500) DEFAULT NULL,
`contacts_c_30` varchar(500) DEFAULT NULL,
`contacts_c_31` varchar(500) DEFAULT NULL,
`contacts_c_32` varchar(500) DEFAULT NULL,
`contacts_c_33` varchar(500) DEFAULT NULL,
`contacts_c_34` varchar(500) DEFAULT NULL,
`contacts_c_35` varchar(500) DEFAULT NULL,
`contacts_c_36` varchar(500) DEFAULT NULL,
`contacts_c_37` varchar(500) DEFAULT NULL,
`contacts_c_38` varchar(500) DEFAULT NULL,
`contacts_c_39` varchar(500) DEFAULT NULL,
`contacts_c_40` varchar(500) DEFAULT NULL,
`contacts_c_41` varchar(500) DEFAULT NULL,
`contacts_c_42` varchar(500) DEFAULT NULL,
`contacts_c_43` varchar(500) DEFAULT NULL,
`contacts_c_44` varchar(500) DEFAULT NULL,
`contacts_c_45` varchar(500) DEFAULT NULL,
`contacts_c_46` varchar(500) DEFAULT NULL,
`contacts_c_47` varchar(500) DEFAULT NULL,
`contacts_c_48` varchar(500) DEFAULT NULL,
`contacts_c_49` varchar(500) DEFAULT NULL,
`contacts_c_50` varchar(500) DEFAULT NULL,
`contacts_i_1` varchar(100) DEFAULT NULL,
`contacts_i_2` varchar(100) DEFAULT NULL,
`contacts_i_3` varchar(100) DEFAULT NULL,
`contacts_i_4` varchar(100) DEFAULT NULL,
`contacts_i_5` varchar(100) DEFAULT NULL,
`contacts_i_6` varchar(100) DEFAULT NULL,
`contacts_i_7` varchar(100) DEFAULT NULL,
`contacts_i_8` varchar(100) DEFAULT NULL,
`contacts_i_9` varchar(100) DEFAULT NULL,
`contacts_i_10` varchar(100) DEFAULT NULL,
`contacts_i_11` varchar(100) DEFAULT NULL,
`contacts_i_12` varchar(100) DEFAULT NULL,
`contacts_i_13` varchar(100) DEFAULT NULL,
`contacts_i_14` varchar(100) DEFAULT NULL,
`contacts_i_15` varchar(100) DEFAULT NULL,
PRIMARY KEY (`contact_id`),
KEY `contact_campaign_id` (`contact_campaign_id`),
KEY `contact_client_id` (`contact_client_id`),
KEY `contact_purl2` (`contact_purl2`),
KEY `contact_purl1` (`contact_purl1`),
KEY `contact_purl` (`contact_purl`)
)
I have recently Optimized and Defragmented the table as well.
Any ideas on what would be causing this?
First off, thank you for posting your table structure, query, and EXPLAIN output in your question. I think you're crossing the memory / disk temporary table size boundary, thus the large performance change. If you put a unique index on the contact_purl column, MySQL won't allow duplicates to be inserted. This would make your query unnecessary. Otherwise, I'd create an index on (contact_client_id, contact_purl) so MySQL can figure out what rows you want from the indexes directly. You could also try separating the search for the columns and retrieving them by using a subquery. Something like this maybe:
SELECT contact_purl, contact_firstName, contact_lastName, contact_id
FROM contacts,
(SELECT MIN(contact_id) AS MinID
FROM contacts
WHERE contact_client_id = 1
GROUP BY contact_purl
HAVING COUNT(contact_id) > 1) nodups
WHERE nodups.MinID = contacts.contact_id
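To see what the composite index and the derived-table query return, here is a small sketch using Python's built-in sqlite3 module in place of MySQL. The table is reduced to the columns the query touches, the index name idx_client_purl and the sample rows are invented for illustration, and the implicit join is written as an explicit JOIN. It only checks the result of the query; it says nothing about timing on the real table.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE contacts (
        contact_id        INTEGER PRIMARY KEY,
        contact_client_id INTEGER,
        contact_purl      TEXT NOT NULL,
        contact_firstName TEXT NOT NULL,
        contact_lastName  TEXT NOT NULL
    );
    -- Composite index suggested in the answer (the index name is made up).
    CREATE INDEX idx_client_purl ON contacts (contact_client_id, contact_purl);
    INSERT INTO contacts VALUES
        (1, 1, 'JohnSmith', 'John', 'Smith'),
        (2, 1, 'JaneDoe',   'Jane', 'Doe'),
        (3, 1, 'JohnSmith', 'John', 'Smith'),
        (4, 2, 'JaneDoe',   'Jane', 'Doe'),
        (5, 1, 'JohnSmith', 'John', 'Smith');
    """
)

# The inner query finds the lowest contact_id of each duplicated purl;
# the outer query fetches the remaining columns by primary key.
query = """
    SELECT c.contact_purl, c.contact_firstName, c.contact_lastName, c.contact_id
    FROM contacts AS c
    JOIN (
        SELECT MIN(contact_id) AS MinID
        FROM contacts
        WHERE contact_client_id = 1
        GROUP BY contact_purl
        HAVING COUNT(contact_id) > 1
    ) AS nodups ON nodups.MinID = c.contact_id
"""
first_entries = conn.execute(query).fetchall()
print(first_entries)
```

Only JohnSmith is duplicated within client 1 in this sample, so the query returns the row with the lowest contact_id for that purl. JaneDoe appears twice overall but only once for client 1, so it is not reported.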

Optimize MySQL query to find Duplicates?

I originally asked this question here.
I'm using the following query to return all duplicate records with the same first and last name. The trick is that the contact_id has to be in descending order.
The problem is that the database has a few million records in the "contacts" table, and the query takes several minutes to complete.
I have the contact_firstName, contact_lastName, contact_client_id, and contact_id all indexed in the database.
Any other ideas on how this query can be optimized a little further?
SELECT c.contact_id, c.contact_purl, c.contact_firstName, c.contact_lastName, c.contact_organization
FROM (
SELECT contact_purl, contact_firstName, contact_lastName, MIN(contact_id) AS MinID
FROM contacts
WHERE contact_client_id = 1
GROUP BY contact_purl HAVING COUNT(contact_id) > 1) t
INNER JOIN contacts c
ON t.contact_purl = c.contact_purl
AND c.contact_client_id = 1
AND t.MinID <> c.contact_id
ORDER BY contact_id asc
EXPLAIN:
SCHEMA:
CREATE TABLE IF NOT EXISTS `contacts` (
`contact_id` int(11) NOT NULL AUTO_INCREMENT,
`contact_client_id` int(11) DEFAULT NULL,
`contact_sales_id` int(11) DEFAULT NULL,
`contact_campaign_id` int(11) DEFAULT NULL,
`contact_purl` varchar(100) NOT NULL,
`contact_purl1` varchar(50) DEFAULT NULL,
`contact_purl2` varchar(50) DEFAULT NULL,
`contact_firstName` varchar(50) NOT NULL,
`contact_lastName` varchar(50) NOT NULL,
`contact_organization` varchar(100) DEFAULT NULL,
`contact_url_organization` varchar(200) DEFAULT NULL,
`contact_position` varchar(50) DEFAULT NULL,
`contact_email` varchar(100) DEFAULT NULL,
`contact_phone` varchar(20) DEFAULT NULL,
`contact_fax` varchar(20) NOT NULL,
`contact_address1` varchar(100) DEFAULT NULL,
`contact_address2` varchar(100) DEFAULT NULL,
`contact_city` varchar(100) DEFAULT NULL,
`contact_state` varchar(20) DEFAULT NULL,
`contact_zip` varchar(10) DEFAULT NULL,
`contact_IP` varchar(50) DEFAULT NULL,
`contact_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`contact_pw` varchar(200) NOT NULL,
`contact_subscribed` varchar(1) NOT NULL DEFAULT 'Y',
`contact_import` varchar(200) DEFAULT NULL,
`contacts_c_1` varchar(500) DEFAULT NULL,
`contacts_c_2` varchar(500) DEFAULT NULL,
`contacts_c_3` varchar(500) DEFAULT NULL,
`contacts_c_4` varchar(500) DEFAULT NULL,
`contacts_c_5` varchar(500) DEFAULT NULL,
`contacts_c_6` varchar(500) DEFAULT NULL,
`contacts_c_7` varchar(500) DEFAULT NULL,
`contacts_c_8` varchar(500) DEFAULT NULL,
`contacts_c_9` varchar(500) DEFAULT NULL,
`contacts_c_10` varchar(500) DEFAULT NULL,
`contacts_c_11` varchar(500) DEFAULT NULL,
`contacts_c_12` varchar(500) DEFAULT NULL,
`contacts_c_13` varchar(500) DEFAULT NULL,
`contacts_c_14` varchar(500) DEFAULT NULL,
`contacts_c_15` varchar(500) DEFAULT NULL,
`contacts_c_16` varchar(500) DEFAULT NULL,
`contacts_c_17` varchar(500) DEFAULT NULL,
`contacts_c_18` varchar(500) DEFAULT NULL,
`contacts_c_19` varchar(500) DEFAULT NULL,
`contacts_c_20` varchar(500) DEFAULT NULL,
`contacts_c_21` varchar(500) DEFAULT NULL,
`contacts_c_22` varchar(500) DEFAULT NULL,
`contacts_c_23` varchar(500) DEFAULT NULL,
`contacts_c_24` varchar(500) DEFAULT NULL,
`contacts_c_25` varchar(500) DEFAULT NULL,
`contacts_c_26` varchar(500) DEFAULT NULL,
`contacts_c_27` varchar(500) DEFAULT NULL,
`contacts_c_28` varchar(500) DEFAULT NULL,
`contacts_c_29` varchar(500) DEFAULT NULL,
`contacts_c_30` varchar(500) DEFAULT NULL,
`contacts_c_31` varchar(500) DEFAULT NULL,
`contacts_c_32` varchar(500) DEFAULT NULL,
`contacts_c_33` varchar(500) DEFAULT NULL,
`contacts_c_34` varchar(500) DEFAULT NULL,
`contacts_c_35` varchar(500) DEFAULT NULL,
`contacts_c_36` varchar(500) DEFAULT NULL,
`contacts_c_37` varchar(500) DEFAULT NULL,
`contacts_c_38` varchar(500) DEFAULT NULL,
`contacts_c_39` varchar(500) DEFAULT NULL,
`contacts_c_40` varchar(500) DEFAULT NULL,
`contacts_c_41` varchar(500) DEFAULT NULL,
`contacts_c_42` varchar(500) DEFAULT NULL,
`contacts_c_43` varchar(500) DEFAULT NULL,
`contacts_c_44` varchar(500) DEFAULT NULL,
`contacts_c_45` varchar(500) DEFAULT NULL,
`contacts_c_46` varchar(500) DEFAULT NULL,
`contacts_c_47` varchar(500) DEFAULT NULL,
`contacts_c_48` varchar(500) DEFAULT NULL,
`contacts_c_49` varchar(500) DEFAULT NULL,
`contacts_c_50` varchar(500) DEFAULT NULL,
`contacts_i_1` varchar(100) DEFAULT NULL,
`contacts_i_2` varchar(100) DEFAULT NULL,
`contacts_i_3` varchar(100) DEFAULT NULL,
`contacts_i_4` varchar(100) DEFAULT NULL,
`contacts_i_5` varchar(100) DEFAULT NULL,
`contacts_i_6` varchar(100) DEFAULT NULL,
`contacts_i_7` varchar(100) DEFAULT NULL,
`contacts_i_8` varchar(100) DEFAULT NULL,
`contacts_i_9` varchar(100) DEFAULT NULL,
`contacts_i_10` varchar(100) DEFAULT NULL,
`contacts_i_11` varchar(100) DEFAULT NULL,
`contacts_i_12` varchar(100) DEFAULT NULL,
`contacts_i_13` varchar(100) DEFAULT NULL,
`contacts_i_14` varchar(100) DEFAULT NULL,
`contacts_i_15` varchar(100) DEFAULT NULL,
PRIMARY KEY (`contact_id`),
KEY `contact_campaign_id` (`contact_campaign_id`),
KEY `contact_client_id` (`contact_client_id`),
KEY `contact_purl2` (`contact_purl2`),
KEY `contact_purl1` (`contact_purl1`),
KEY `contact_purl` (`contact_purl`)
)
You could do it simply by joining the table to itself, like this:
SELECT DISTINCT c1.contact_id, c1.contact_firstName, c1.contact_lastName,
RIGHT(c1.contact_lastName,1) AS nameNum
FROM
contacts c1 INNER JOIN contacts c2
ON c1.contact_firstName = c2.contact_firstName
AND c1.contact_lastName = c2.contact_lastName
AND c2.contact_client_id = 1
AND c1.contact_id <> c2.contact_id
ORDER BY c1.contact_id DESC
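The self-join can be tried on a few rows with the sketch below, which uses Python's built-in sqlite3 module as a stand-in for MySQL. Two details differ from the query above and are assumptions of this sketch: the nameNum column is left out, and the client filter is applied to both sides of the join so that only contacts of client 1 are compared with each other. The sample rows are invented.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE contacts (
        contact_id        INTEGER PRIMARY KEY,
        contact_client_id INTEGER,
        contact_firstName TEXT NOT NULL,
        contact_lastName  TEXT NOT NULL
    );
    INSERT INTO contacts VALUES
        (1, 1, 'John', 'Smith'),
        (2, 1, 'Jane', 'Doe'),
        (3, 1, 'John', 'Smith'),
        (4, 1, 'Ann',  'Lee'),
        (5, 1, 'John', 'Smith');
    """
)

# A row is a duplicate if another row with a different id shares its name.
query = """
    SELECT DISTINCT c1.contact_id, c1.contact_firstName, c1.contact_lastName
    FROM contacts AS c1
    INNER JOIN contacts AS c2
        ON  c1.contact_firstName = c2.contact_firstName
        AND c1.contact_lastName  = c2.contact_lastName
        AND c1.contact_id <> c2.contact_id
    WHERE c1.contact_client_id = 1
      AND c2.contact_client_id = 1
    ORDER BY c1.contact_id DESC
"""
duplicates = conn.execute(query).fetchall()
print(duplicates)
```

Note that this form returns every row that has a namesake, including the earliest one, whereas the query in the question excludes the row with the lowest contact_id.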
Would it make sense to keep track of the number of contacts for each client separately and store it in another column vs doing that sub-select query every time?
With that number of records, it might be more efficient to store some of those calculations instead of having to query them in real time.