I have a complex query that is dynamically assembled based upon search criteria. However, in its simplest form, it is still very slow. The main table it runs against has ~10M records. I ran an explain against a 'base' query and the first row of the explain looks bad (at least to a novice dba like me). I have read a couple tutorials about EXPLAIN, but I still am unsure how to fix the query. So, the first row of the results seems to indicate the problem, but I don't know what to do with it. I couldn't make a composite key that long even if I wanted to and some of the field names in that possible_keys column are not even in the patients table. Any help will be greatly appreciated.
id,select_type,table,type,possible_keys,key,key_len,ref,rows,Extra
1,SIMPLE,patients,range,"PRIMARY,location,appt_date,status,radiologist,contract,lastname,paperwork,images_archived,hash,created,document_attached,all_images_archived,last_image_archived,modality,study_uid,company,second_access,firstname,report_delivered,ssn,order_entry_status,dob,tech,doctor,mobile_facility,accession,location_appt_date,location_created,location_lastname,ref,person_seq",location_appt_date,55,NULL,573534,"Using index condition; Using where; Using temporary; Using filesort"
1,SIMPLE,receivable_transactions,ref,patient_seq,patient_seq,4,ris-dev.patients.seq,1,NULL
1,SIMPLE,patients_dispatch,ref,patient_seq,patient_seq,4,ris-dev.patients.seq,1,NULL
1,SIMPLE,mobile_facility,ref,"unique_index,name,location",unique_index,115,"ris-dev.patients.mobile_facility,const",1,"Using where"
1,SIMPLE,mobile_facility_service_areas,eq_ref,PRIMARY,PRIMARY,4,ris-dev.mobile_facility.service_area,1,NULL
Edit: same EXPLAIN, but reformatted to be easier to read:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE patients range PRIMARY location_appt_date 55 NULL 573534 Using index condition; Using where; Using temporary; Using filesort
location
appt_date
status
radiologist
contract
lastname
paperwork
images_archived
hash
created
document_attached
all_images_archived
last_image_archived
modality
study_uid
company
second_access
firstname
report_delivered
ssn
order_entry_status
dob
tech
doctor
mobile_facility
accession
location_appt_date
location_created
location_lastname
ref
person_seq
1 SIMPLE receivable_transactions ref patient_seq patient_seq 4 ris-dev.patients.seq 1 NULL
1 SIMPLE patients_dispatch ref patient_seq patient_seq 4 ris-dev.patients.seq 1 NULL
1 SIMPLE mobile_facility ref unique_index unique_index 115 ris-dev.patients.mobile_facility,const 1 Using where
name
location
1 SIMPLE mobile_facility_service_areas eq_ref PRIMARY PRIMARY 4 ris-dev.mobile_facility.service_area 1 NULL
The explain is setup against the following query and table structures.
SELECT patients.fax_in_queue, patients.modality, patients.stat, patients.created, patients.seq, patients.lastname,
patients.firstname, patients.appt_date, patients.status, patients.contract, patients.location, patients.unique_hash,
patients.images_archived, patients.report_delivered, patients.doctor, patients.mobile_facility, patients.history,
patients.dob, patients.all_images_archived, patients.order_entry_status, patients.tech, patients.radiologist,
patients.last_image_archived, patients.state, patients.ss_comments, patients.completed, patients.report_status,
patients.have_paperwork, patients.facility_room_number, patients.facility_station_name, patients.facility_bed,
patients.findings_level, patients.document_attached, patients.study_start, patients.company, patients.accession,
patients.number_images, patients.client_number_images, patients.sex, patients.threshhold , GROUP_CONCAT(CONCAT(CONCAT(receivable_transactions.modifier, " "),
receivable_transactions.description) SEPARATOR ", ") AS rt_desc , patients_dispatch.seq AS doc_seq, patients_dispatch.requisition_last_sent,
patients_dispatch.requisition_signed_by_file_seq, patients_dispatch.requisition_signed, patients_dispatch.order_reason, patients_dispatch.order_comments,
patients_dispatch.order_taken, patients_dispatch.order_tech_last_notified, patients_dispatch.order_tech_in_transit, patients_dispatch.order_tech_in,
patients_dispatch.order_tech_out, patients_dispatch.order_tech_ack, patients_dispatch.addr1 AS d_addr1, patients_dispatch.addr2 AS d_addr2,
patients_dispatch.city AS d_city, patients_dispatch.state AS d_state, patients_dispatch.zip AS d_zip, CONCAT(patients.status, order_tech_out,
order_tech_in, order_tech_in_transit) as pseudo_status , mobile_facility.requisition_fax, mobile_facility.station_list, mobile_facility.address1 as mf_addr1,
mobile_facility.address2 as mf_addr2, mobile_facility.city as mf_city, mobile_facility.state as mf_state, mobile_facility.zip as mf_zip,
mobile_facility.phone as mf_phone, mobile_facility.phone2 as mf_phone2, mobile_facility_service_areas.name as mf_service_area
FROM patients LEFT JOIN receivable_transactions ON patients.seq = receivable_transactions.patient_seq
LEFT JOIN patients_dispatch ON patients.seq = patients_dispatch.patient_seq
LEFT JOIN mobile_facility ON patients.location = mobile_facility.location AND patients.mobile_facility = mobile_facility.name
LEFT JOIN mobile_facility_service_areas ON mobile_facility.service_area = mobile_facility_service_areas.seq
WHERE patients.location = "XYZCompany" AND ((patients.appt_date >= '2020-03-19' AND patients.appt_date <= '2020-03-19 23:59:59')
OR (patients.appt_date <= '2020-03-19' AND patients.status < 'X'))
GROUP BY patients.seq DESC
ORDER BY patients.status, patients.order_entry_status, pseudo_status, patients.order_entry_status,patients.lastname);
CREATE TABLE `patients` (
`seq` int(11) NOT NULL AUTO_INCREMENT,
`person_seq` int(11) NOT NULL,
`firstname` varchar(20) NOT NULL DEFAULT '',
`lastname` varchar(30) NOT NULL DEFAULT '',
`middlename` varchar(20) NOT NULL DEFAULT '',
`ref` varchar(50) NOT NULL DEFAULT '',
`location` varchar(50) NOT NULL DEFAULT '',
`doctor` varchar(50) NOT NULL,
`radiologist` varchar(20) NOT NULL DEFAULT '',
`contract` varchar(50) NOT NULL,
`history` mediumtext NOT NULL,
`dob` varchar(15) NOT NULL DEFAULT '0000-00-00',
`appt_date` date NOT NULL DEFAULT '0000-00-00',
`status` tinyint(4) NOT NULL DEFAULT '0',
`tech` varchar(50) NOT NULL DEFAULT '',
`created` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`ss_comments` mediumtext NOT NULL,
`mobile_facility` varchar(60) NOT NULL DEFAULT '',
`facility_room_number` varchar(50) NOT NULL,
`facility_bed` varchar(20) NOT NULL,
`facility_station_name` varchar(50) NOT NULL,
`stat` tinyint(4) NOT NULL DEFAULT '0',
`have_paperwork` tinyint(4) NOT NULL DEFAULT '0',
`completed` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`sex` char(1) NOT NULL DEFAULT '',
`unique_hash` varchar(100) NOT NULL DEFAULT '',
`number_images` int(11) NOT NULL DEFAULT '0',
`client_number_images` int(11) NOT NULL,
`images_archived` tinyint(4) NOT NULL DEFAULT '0',
`completed_fax` varchar(10) NOT NULL DEFAULT '0' COMMENT 'This is the number the completed report is faxed to.',
`report_delivered` tinyint(4) NOT NULL DEFAULT '0',
`report_delivered_time` datetime NOT NULL,
`document_attached` tinyint(4) NOT NULL DEFAULT '0',
`modality` varchar(3) NOT NULL,
`last_image_archived` datetime NOT NULL,
`all_images_archived` tinyint(4) NOT NULL DEFAULT '0',
`fax_in_queue` varchar(12) NOT NULL,
`accession` varchar(100) NOT NULL,
`study_uid` varchar(100) NOT NULL,
`order_entry_status` tinyint(4) NOT NULL,
`compare_to` varchar(15) NOT NULL,
`state` varchar(3) NOT NULL,
`company` int(11) NOT NULL,
`second_access` varchar(50) NOT NULL,
`threshhold` datetime NOT NULL,
`report_status` tinyint(4) NOT NULL,
`second_id` varchar(50) NOT NULL,
`rad_alerted` tinyint(4) NOT NULL,
`assigned` datetime NOT NULL,
`findings_level` tinyint(4) NOT NULL,
`report_viewed` tinyint(4) NOT NULL,
`study_received` datetime NOT NULL,
`study_start` datetime NOT NULL,
`study_end` datetime NOT NULL,
`completed_email` varchar(50) NOT NULL,
`completed_send` varchar(255) NOT NULL,
`ssn` varchar(12) NOT NULL,
`exorder_number` varchar(30) NOT NULL,
`exvisit_number` varchar(30) NOT NULL,
`row_updated` timestamp NOT NULL ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`seq`),
KEY `location` (`location`),
KEY `appt_date` (`appt_date`),
KEY `status` (`status`),
KEY `radiologist` (`radiologist`),
KEY `contract` (`contract`),
KEY `lastname` (`lastname`),
KEY `paperwork` (`have_paperwork`),
KEY `images_archived` (`images_archived`),
KEY `hash` (`unique_hash`),
KEY `created` (`created`),
KEY `document_attached` (`document_attached`),
KEY `all_images_archived` (`all_images_archived`),
KEY `last_image_archived` (`last_image_archived`),
KEY `modality` (`modality`),
KEY `study_uid` (`study_uid`),
KEY `company` (`company`),
KEY `second_access` (`second_access`),
KEY `firstname` (`firstname`),
KEY `report_delivered` (`report_delivered`),
KEY `ssn` (`ssn`),
KEY `order_entry_status` (`order_entry_status`),
KEY `dob` (`dob`),
KEY `tech` (`tech`),
KEY `doctor` (`doctor`),
KEY `mobile_facility` (`mobile_facility`),
KEY `accession` (`accession`),
KEY `location_appt_date` (`location`,`appt_date`),
KEY `location_created` (`location`,`created`),
KEY `location_lastname` (`location`,`lastname`),
KEY `ref` (`ref`),
KEY `person_seq` (`person_seq`)
) ENGINE=InnoDB AUTO_INCREMENT=10242952 DEFAULT CHARSET=latin1;
CREATE TABLE `receivable_transactions` (
`seq` int(11) NOT NULL AUTO_INCREMENT,
`patient_seq` int(11) NOT NULL DEFAULT '0',
`cptcode` varchar(15) NOT NULL DEFAULT '',
`modifier` char(2) NOT NULL DEFAULT '',
`description` varchar(100) NOT NULL DEFAULT '',
`amount` decimal(6,2) NOT NULL DEFAULT '0.00',
`type` char(2) NOT NULL DEFAULT '',
`transaction` varchar(10) NOT NULL DEFAULT '',
`radiologist` varchar(20) NOT NULL DEFAULT '',
`status` tinyint(4) NOT NULL DEFAULT '0',
`completed` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`created` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`report_meta_seq` int(11) NOT NULL DEFAULT '0',
`report_header` varchar(255) NOT NULL,
`report_body` blob NOT NULL,
`report_impression` mediumtext NOT NULL,
`report_hide` tinyint(4) NOT NULL,
`radiologist_group` varchar(50) NOT NULL,
`addendum` int(4) NOT NULL DEFAULT '0',
`addendum_type` varchar(20) NOT NULL,
`peer_review` int(4) NOT NULL DEFAULT '0',
`qa_reason` varchar(255) NOT NULL DEFAULT '',
`qa_agree` decimal(2,1) NOT NULL DEFAULT '0.0',
`findings` tinyint(4) NOT NULL,
`comments` mediumtext NOT NULL,
`company` int(11) NOT NULL,
`row_updated` timestamp NOT NULL ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`seq`),
KEY `patient_seq` (`patient_seq`),
KEY `cptcode` (`cptcode`),
KEY `transaction` (`transaction`),
KEY `type` (`type`),
KEY `created` (`created`),
KEY `radiologist` (`radiologist`),
KEY `status` (`status`),
KEY `report_meta_seq` (`report_meta_seq`),
KEY `Billing Check Dropdown` (`status`,`completed`),
KEY `qa_agree` (`qa_agree`),
KEY `peer_review` (`peer_review`),
KEY `addendum` (`addendum`),
KEY `company` (`company`),
KEY `completed` (`completed`)
) ENGINE=InnoDB AUTO_INCREMENT=9380351 DEFAULT CHARSET=latin1;
CREATE TABLE `patients_dispatch` (
`seq` int(11) NOT NULL AUTO_INCREMENT,
`patient_seq` int(11) NOT NULL,
`order_taken` datetime NOT NULL,
`order_taken_by` varchar(50) NOT NULL,
`order_person_calling` varchar(50) NOT NULL,
`order_supervising_physician` varchar(50) NOT NULL,
`order_trip_count` tinyint(4) NOT NULL,
`order_trip_count_max` tinyint(4) NOT NULL,
`order_trip_visit` tinyint(4) NOT NULL,
`order_tech_in` datetime NOT NULL,
`order_tech_out` datetime NOT NULL,
`order_ssn` varchar(12) NOT NULL,
`order_service_request_time` datetime NOT NULL,
`order_reason` varchar(255) NOT NULL,
`order_tech_ack` datetime NOT NULL,
`order_tech_assigned` datetime NOT NULL,
`order_tech_last_notified` datetime NOT NULL,
`requisition_last_sent` datetime NOT NULL,
`requisition_signed` datetime NOT NULL,
`requisition_signed_by` varchar(50) NOT NULL,
`requisition_signed_by_text` varchar(75) NOT NULL,
`requisition_signed_by_file_seq` int(11) NOT NULL,
`order_comments` mediumtext NOT NULL,
`order_tech_in_transit` datetime NOT NULL,
`fasting` tinyint(1) NOT NULL,
`collection_time` time DEFAULT NULL,
`addr1` varchar(100) NOT NULL,
`addr2` varchar(100) NOT NULL,
`city` varchar(30) NOT NULL,
`state` varchar(3) NOT NULL,
`zip` varchar(12) NOT NULL,
`phone` varchar(15) NOT NULL,
`mileage_start` int(11) NOT NULL,
`mileage_end` int(11) NOT NULL,
PRIMARY KEY (`seq`),
KEY `patient_seq` (`patient_seq`)
) ENGINE=InnoDB AUTO_INCREMENT=2261091 DEFAULT CHARSET=latin1;
CREATE TABLE `mobile_facility` (
`seq` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(60) NOT NULL,
`location` varchar(50) DEFAULT NULL,
`address1` varchar(50) NOT NULL,
`address2` varchar(50) NOT NULL,
`city` varchar(50) NOT NULL,
`state` varchar(2) NOT NULL,
`zip` varchar(10) NOT NULL,
`phone` varchar(15) NOT NULL,
`phone2` varchar(15) NOT NULL,
`fax` varchar(110) NOT NULL,
`rads_can_read` text NOT NULL,
`rads_cant_read` text NOT NULL,
`only_techs` text NOT NULL,
`never_modalities` varchar(255) NOT NULL COMMENT 'A serialized list of modalities a facility may not use.',
`station_list` mediumtext NOT NULL,
`email` varchar(255) NOT NULL,
`misc1` varchar(255) NOT NULL,
`latitude` float NOT NULL DEFAULT '0',
`longitude` float NOT NULL DEFAULT '0',
`affiliation` int(11) NOT NULL COMMENT 'mobile_facility_affiliations seq',
`branch` int(11) NOT NULL COMMENT 'mobile_facility_branches seq',
`service_area` int(11) NOT NULL COMMENT 'mobile_facility_service_areas seq',
`other_id` varchar(50) NOT NULL COMMENT 'Usually used for HL7',
`facility_type` varchar(2) DEFAULT NULL,
`no_stat` tinyint(1) NOT NULL DEFAULT '0' COMMENT 'Should the facility allow stat priority on patients?',
`facility_notes` varchar(512) DEFAULT NULL,
`requisition_fax` varchar(110) NOT NULL,
`report_template` text NOT NULL,
`all_orders_stat` tinyint(1) NOT NULL,
`sms_notification` varchar(15) NOT NULL,
`tat` varchar(10) NOT NULL,
`npi` varchar(15) NOT NULL,
`NMXR` tinyint(4) NOT NULL DEFAULT '0',
`billing_type` varchar(10) NOT NULL,
`salesman` varchar(75) NOT NULL,
`created_at` datetime NOT NULL,
`updated_at` datetime NOT NULL,
`default_bill_to` tinyint(4) NOT NULL DEFAULT '0',
PRIMARY KEY (`seq`),
UNIQUE KEY `unique_index` (`name`,`location`),
KEY `name` (`name`),
KEY `location` (`location`)
) ENGINE=InnoDB AUTO_INCREMENT=155104 DEFAULT CHARSET=latin1;
CREATE TABLE `mobile_facility_service_areas` (
`seq` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(50) NOT NULL,
`location` varchar(50) NOT NULL,
PRIMARY KEY (`seq`)
) ENGINE=InnoDB AUTO_INCREMENT=841 DEFAULT CHARSET=latin1;
It's only using the index on location, but that only narrows down the search to about a half a million rows. You'd like it to use an index to further narrow down by the appt_date.
However, the use of OR in your WHERE clause is causing a problem. It can't decide how to use the index.
Here's what I suggest:
Drop the index on location because it's redundant with the other indexes that have location as their first column.
Replace the index on location_appt_date with an index on location_appt_date_status.
ALTER TABLE patients
DROP KEY location,
DROP KEY location_appt_date,
ADD KEY location_appt_date_status (location, appt_date, status);
Refactor the query to use UNION instead of OR:
SELECT ... (all the columns you have) ...
FROM (
SELECT * FROM patients USE INDEX (location_appt_date_status)
WHERE location = 'XYZCompany' AND appt_date >= '2020-03-19' AND appt_date < '2020-03-20'
UNION
SELECT * FROM patients USE INDEX (location_appt_date_status)
WHERE location = 'XYZCompany' AND appt_date <= '2020-03-19' AND status < 'X'
) AS p
LEFT JOIN receivable_transactions FORCE INDEX (patient_seq)
ON p.seq = receivable_transactions.patient_seq
LEFT JOIN patients_dispatch FORCE INDEX (patient_seq)
ON p.seq = patients_dispatch.patient_seq
INNER JOIN mobile_facility FORCE INDEX (unique_index)
ON p.location = mobile_facility.location AND p.mobile_facility = mobile_facility.name
INNER JOIN mobile_facility_service_areas FORCE INDEX (PRIMARY)
ON mobile_facility.service_area = mobile_facility_service_areas.seq
GROUP BY p.seq
ORDER BY p.status, p.order_entry_status, pseudo_status, p.order_entry_status, p.lastname
You might not need all the USE INDEX() / FORCE INDEX() optimizer hints I used. I did those because I was testing with empty tables, and that can confuse the optimizer.
Let me focus on the part that affects optimization the most:
FROM patients AS p
LEFT JOIN receivable_transactions AS rt ON p.seq = rt.patient_seq
LEFT JOIN patients_dispatch AS pd ON p.seq = pd.patient_seq
LEFT JOIN mobile_facility AS mf ON p.location = mf.location
AND p.mobile_facility AS mf = mf.name
LEFT JOIN mobile_facility_service_areas AS sa ON mf.service_area = sa.seq
WHERE p.location = "XYZCompany"
AND ((p.appt_date >= '2020-03-19'
AND p.appt_date <= '2020-03-19 23:59:59')
OR (p.appt_date <= '2020-03-19'
AND p.status < 'X')
)
GROUP BY p.seq DESC
ORDER BY p.status, p.order_entry_status, pseudo_status, p.order_entry_status,
p.lastname);
The biggest issue is the OR. It often prevents most optimizations. The usual fix is to turn it into a UNION:
( SELECT ...
FROM .. JOIN ..
WHERE p.location = "XYZCompany"
AND p.appt_date >= '2020-03-19'
AND p.appt_date < '2020-03-19' + INTERVAL 1 DAY
...
)
UNION ALL
( SELECT ...
FROM .. JOIN ..
WHERE p.location = "XYZCompany"
AND p.appt_date <= '2020-03-19'
AND p.status < 'X'
...
)
Each select can benefit from this composite index on patients:
(location, appt_date, status)
The < 'X' is problematic because two ranges (appt_date and status) cannot both be used effectively. What are the possible values of status? If there is only one value before 'X', say 'M', then this would be much better: p.status = 'M' together with another index: (location, status, appt_date)
SELECT lots of stuff, then GROUP BY p.seq -- This is probably create strange results. (Search for ONLY_FULL_GROUP_BY for more discussion). It may be better to first get the patients.seq values (since that is all you are filtering on), then join to the other tables. This would eliminate the GROUP BY, or at least force you to deal with which row to fetch from each of the other tables.
range location_appt_date 55 573534 Using index condition; Using where; Using temporary; Using filesort -- says
55 = 2+50 (for varchar(50)) + 3 (for date) -- neither is NULL.
Based on the 55, I wonder if it is so well optimized that the OR->UNION is not needed.
"Using index condition" is internally called ICP (Index Condition Pushdown) if you want further understanding.
"Using filesort" may be an understatement -- There are probably two sorts, one for GROUP BY, one for ORDER BY. EXPLAIN FORMAT=JSON SELECT ... would make it clear. (And hence my hint that the GROUP BY should be avoided.
You have some redundant indexes (not relevant to much other than disk space): INDEX(a,b), INDEX(a) --> toss INDEX(a).
patients has an awful number of indexes.
The other tables seem to have adequate indexes for your query.
I'm getting some strange timing values from Mysql running a "simple" query.
This is the DDL of the table:
CREATE TABLE `frame` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`createdBy` varchar(255) DEFAULT NULL,
`createdDate` datetime(6) NOT NULL,
`lastModifiedBy` varchar(255) DEFAULT NULL,
`lastModifiedDate` datetime(6) DEFAULT NULL,
`sid` varchar(36) NOT NULL,
`version` bigint(20) NOT NULL,
`brand` varchar(255) DEFAULT NULL,
`category` varchar(255) DEFAULT NULL,
`colorCode` varchar(255) DEFAULT NULL,
`colorDescription` varchar(255) DEFAULT NULL,
`description` longtext,
`imageUrl` varchar(255) DEFAULT NULL,
`lastPurchase` datetime(6) DEFAULT NULL,
`lastPurchasePrice` decimal(19,2) DEFAULT NULL,
`lastSell` datetime(6) DEFAULT NULL,
`lastSellPrice` decimal(19,2) DEFAULT NULL,
`line` varchar(255) DEFAULT NULL,
`manufacturer` varchar(255) DEFAULT NULL,
`manufacturerCode` varchar(255) DEFAULT NULL,
`name` varchar(255) NOT NULL,
`preset` bit(1) NOT NULL DEFAULT b'0',
`purchasePrice` decimal(19,2) DEFAULT NULL,
`salesPrice` decimal(19,2) DEFAULT NULL,
`sku` varchar(255) NOT NULL,
`stock` bit(1) NOT NULL DEFAULT b'1',
`thumbUrl` varchar(255) DEFAULT NULL,
`upc` varchar(255) DEFAULT NULL,
`arm` int(11) DEFAULT NULL,
`bridge` int(11) DEFAULT NULL,
`caliber` int(11) DEFAULT NULL,
`gender` varchar(255) DEFAULT NULL,
`lensColor` varchar(255) DEFAULT NULL,
`material` varchar(255) DEFAULT NULL,
`model` varchar(255) NOT NULL,
`sphere` decimal(10,2) DEFAULT NULL,
`type` varchar(255) NOT NULL,
`taxRate_id` bigint(20) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `UK_k7s4esovkoacsc264bcjrre13` (`sid`),
UNIQUE KEY `UK_ajh6mr6a6qg6mgy8t9nevdym1` (`sku`),
UNIQUE KEY `UK_boqikmg9o89j8q0o5ujkj33b3` (`upc`),
KEY `idx_manufacturer` (`manufacturer`),
KEY `idx_brand` (`brand`),
KEY `idx_line` (`line`),
KEY `idx_colorcode` (`colorCode`),
KEY `idx_preset` (`preset`),
KEY `idx_manufacturer_model_color_caliber` (`manufacturer`,`model`,`colorCode`,`caliber`),
KEY `FK1nau29fd70s1nq905dgs6ft85` (`taxRate_id`),
CONSTRAINT `FK1nau29fd70s1nq905dgs6ft85` FOREIGN KEY (`taxRate_id`) REFERENCES `taxrate` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=392179 DEFAULT CHARSET=utf8;
The query is created programatically from my application. The "strange" syntax (NULL IS NULL OR condition) is very convenient to me in order to make more compact my code and removing the need to create a different query based on the numbers of parameters.
For who understand how Hibernate HQL and JPA works, this is the query:
This query is generated when the user is not setting any filter, so all parameters in my condition are null and this is how the query comes out.
SELECT SQL_NO_CACHE COUNT(frame0_.`id`) AS col_0_0_ FROM `Frame` frame0_
WHERE (NULL IS NULL OR NULL LIKE CONCAT('%', NULL, '%') OR frame0_.`manufacturer` LIKE CONCAT('%', NULL, '%') OR frame0_.`manufacturerCode`=NULL OR frame0_.`sku`=NULL OR frame0_.`upc`=NULL OR frame0_.`line` LIKE CONCAT('%', NULL, '%') OR frame0_.`model` LIKE CONCAT('%', NULL, '%')) AND (NULL IS NULL OR frame0_.`manufacturer`=NULL) AND (NULL IS NULL OR frame0_.`line`=NULL) AND (NULL IS NULL OR frame0_.`caliber`=NULL) AND (NULL IS NULL OR frame0_.`type`=NULL) AND (NULL IS NULL OR frame0_.`material`=NULL) AND (NULL IS NULL OR frame0_.`model`=NULL) AND (NULL IS NULL OR frame0_.`colorCode`=NULL)
The query takes about 0.105s on a table of 137548 rows.
The EXPLAIN of the previous query returns:
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE frame0_ \N ALL \ N \N \N \N 137548 100.00 \N
The previous query is identical to this one:
SELECT SQL_NO_CACHE COUNT(frame0_.`id`) AS col_0_0_ FROM `Frame` frame0_
This query takes just 0.05s for the same result in the same table.
Why for Mysql they are different and the first is taking so much time? Is there a way to improve performance of the first query keeping the syntax "NULL IS NULL or condition"?
I think Ryan has right, so the many or statement made the query so bad.
You should programatically build queries for better performance. So if the user not select by a possible filter, than you shouldn't include to the query!
(HQL)
if(!StringUtils.isEmpty(manufacturer)) {
query.and(m.manufacturer.eq(manufacturer))
}
I have this query:
SELECT
c.*,
cv.views
FROM
content AS c
JOIN
content_views AS cv ON cv.content = c.record_num
WHERE
c.enabled = 1
ORDER BY
cv.views
Quite simple, but it's really slow... Is there a way to make it faster ?
This is my EXPLAIN:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE c ref enabled_2,enabled enabled 4 const 23947 Using temporary; Using filesort
1 SIMPLE cv eq_ref PRIMARY PRIMARY 4 c.record_num 1
EDIT 2016-02-24
Please note that usually, I use a LIMIT so the number of records returned in the EXPLAIN isn't entirely accurate, however for the sake of simplicity and because the performance doesn't change with the LIMIT or without it, I have removed it.
As requested in the comments, this is the result of my SHOW CREATE TABLE. As you can see, one of my table is MyISAM while the other is InnoDB.
CREATE TABLE `content` (
`title` varchar(255) NOT NULL DEFAULT '',
`filename` varchar(255) NOT NULL DEFAULT '',
`filename_2` varchar(255) NOT NULL,
`filename_3` varchar(255) NOT NULL,
`orig_filename` varchar(255) NOT NULL,
`trailer_filename` varchar(255) NOT NULL,
`thumbnail` varchar(255) NOT NULL DEFAULT '',
`embed` text NOT NULL,
`description` text NOT NULL,
`paysite` int(11) NOT NULL DEFAULT '0',
`keywords` varchar(255) NOT NULL,
`model` varchar(255) NOT NULL DEFAULT '',
`scheduled_date` date NOT NULL DEFAULT '0000-00-00',
`date_added` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`encoded_date` datetime NOT NULL,
`rating` int(5) NOT NULL DEFAULT '0',
`length` int(11) NOT NULL DEFAULT '0',
`submitter` int(11) NOT NULL DEFAULT '0',
`ip` varchar(15) NOT NULL,
`approved` int(11) NOT NULL DEFAULT '0',
`hotlinked` varchar(1024) NOT NULL,
`plug_url` varchar(255) NOT NULL,
`enabled` int(11) NOT NULL DEFAULT '0',
`main_thumb` int(11) NOT NULL DEFAULT '3',
`xml` varchar(32) NOT NULL,
`photos` int(11) NOT NULL DEFAULT '0',
`mobile` varchar(255) NOT NULL,
`modeltmp` varchar(255) NOT NULL,
`movie_width` int(11) NOT NULL,
`movie_height` int(11) NOT NULL,
`token` varchar(255) DEFAULT NULL,
`source_thumb_url` varchar(255) NOT NULL,
`related` varchar(1024) NOT NULL,
`force_related` varchar(255) NOT NULL,
`record_num` int(11) NOT NULL AUTO_INCREMENT,
`webvtt_src` text NOT NULL,
`category_thumb` int(11) NOT NULL,
`related_date` date NOT NULL,
`publish_ready` tinyint(1) NOT NULL,
PRIMARY KEY (`record_num`),
KEY `encoded_date` (`encoded_date`,`photos`,`enabled`),
KEY `filename` (`filename`),
KEY `scheduled_date` (`scheduled_date`),
KEY `enabled_2` (`enabled`,`length`,`photos`),
KEY `enabled` (`enabled`,`encoded_date`,`photos`),
KEY `rating` (`rating`,`enabled`,`photos`),
KEY `token` (`token`),
KEY `submitter` (`submitter`),
FULLTEXT KEY `keywords` (`keywords`,`title`),
FULLTEXT KEY `title` (`title`),
FULLTEXT KEY `description` (`description`),
FULLTEXT KEY `keywords_2` (`keywords`)
) ENGINE=MyISAM AUTO_INCREMENT=124207 DEFAULT CHARSET=latin1
CREATE TABLE `content_views` (
`views` int(11) NOT NULL,
`content` int(11) NOT NULL,
PRIMARY KEY (`content`),
KEY `views` (`views`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
For this query:
SELECT c.*, cv.views
FROM content c JOIN
content_views cv
ON cv.content = c.record_num
WHERE c.enabled = 1
ORDER BY cv.views;
The best indexes are probably content(enabled, record_num) and content_views(content, views). I am guessing that the performance even with these indexes will be similar to what you have now.
I've build web application as a tool to eliminate unnecessary data in peoples table, this application mainly to filter all data of peoples who valid to get an election rights. At first, it wasn't a problem when the main table still had few rows, but it is really bad (6 seconds) when the table is filled with about 200K rows (really worse because the table will be up to 6 million rows).
I have table design like below, and I am doing a join with 4 tables (region table start from province, city, district and town). Each region table is related to each other with their own id:
CREATE TABLE `peoples` (
`id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
`id_prov` smallint(2) NOT NULL,
`id_city` smallint(2) NOT NULL,
`id_district` smallint(2) NOT NULL,
`id_town` smallint(4) NOT NULL,
`tps` smallint(4) NOT NULL,
`urut_xls` varchar(20) NOT NULL,
`nik` varchar(20) NOT NULL,
`name` varchar(60) NOT NULL,
`place_of_birth` varchar(60) NOT NULL,
`birth_date` varchar(30) NOT NULL,
`age` tinyint(3) NOT NULL DEFAULT '0',
`sex` varchar(20) NOT NULL,
`marital_s` varchar(20) NOT NULL,
`address` varchar(160) NOT NULL,
`note` varchar(60) NOT NULL,
`m_name` tinyint(1) NOT NULL DEFAULT '0',
`m_birthdate` tinyint(1) NOT NULL DEFAULT '0' ,
`format_birthdate` tinyint(1) NOT NULL DEFAULT '0' ,
`m_sex` tinyint(1) NOT NULL DEFAULT '0' COMMENT ,
`m_m_status` tinyint(1) NOT NULL DEFAULT '0' ,
`sex_double` tinyint(1) NOT NULL DEFAULT '0',
`id_import` bigint(10) NOT NULL,
`id_workspace` tinyint(4) unsigned NOT NULL DEFAULT '0',
`stat_valid` smallint(1) NOT NULL DEFAULT '0' ,
`add_manual` tinyint(1) unsigned NOT NULL DEFAULT '0' ,
`insert_by` varchar(12) NOT NULL,
`update_by` varchar(12) DEFAULT NULL,
`mark_as_duplicate` smallint(1) NOT NULL DEFAULT '0' ,
`mark_as_trash` smallint(1) NOT NULL DEFAULT '0' ,
`in_date_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `ind_import` (`id_import`),
KEY `ind_duplicate` (`mark_as_duplicate`),
KEY `id_workspace` (`id_workspace`),
KEY `tambah_manual` (`tambah_manual`),
KEY `il` (`stat_valid`,`mark_as_trash`,`in_date_time`),
KEY `region` (`id_prov`,`id_kab`,`id_kec`,`id_kel`,`tps`),
KEY `name` (`name`),
KEY `place_of_birth` (`place_of_birth`),
KEY `ind_birth` (`birthdate`(10)),
KEY `ind_sex` (`sex`(2))
) ENGINE=MyISAM AUTO_INCREMENT=1 DEFAULT CHARSET=latin1;
town:
CREATE TABLE `town` (
`id` smallint(4) NOT NULL,
`id_district` smallint(2) NOT NULL,
`id_city` smallint(2) NOT NULL,
`id_prov` smallint(2) NOT NULL,
`name_town` varchar(60) NOT NULL,
`handprint` blob,
`pps_1` varchar(60) DEFAULT NULL,
`pps_2` varchar(60) DEFAULT NULL,
`pps_3` varchar(60) DEFAULT NULL,
`tpscount` smallint(2) DEFAULT NULL,
`pps_4` varchar(60) DEFAULT NULL,
`pps_5` varchar(60) DEFAULT NULL,
PRIMARY KEY (`id_prov`,`id_kab`,`id_kec`,`id`),
KEY `name_town` (`name_town`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
and the query like
SELECT `E`.`id`, `E`.`id_prov`, `E`.`id_city`, `E`.`id_district`, `E`.`id_town`,
`B`.`name_prov`,`C`.`name_city`,`D`.`name_district`, `A`.`name_town`,
`E`.`tps`, `E`.`urut_xls`, `E`.`nik`,`E`.`name`,`E`.`place_of_birth`,
`E`.`birth_date`, `E`.age, `E`.`sex`, `E`.`marital_s`, `E`.`address`,
`E`.`note`
FROM peoples E
JOIN test_prov B ON E.id_prov = B.id
JOIN test_city C ON E.id_city = C.id
AND (C.id_prov=B.id)
JOIN test_district D ON E.id_district = D.id
AND ((D.id_city = C.id) AND (D.id_prov= B.id))
JOIN test_town A ON E.id_town = A.id
AND ((A.id_district = D.id)
AND (A.id_city = C.id)
AND (A.id_prov = B.id))
AND E.stat_valid=1
AND E.mark_as_trash=0
mark_as_trash is a mark column which only contain 1 and zero just to know if the data has been mark as a deleted record, and stat_valid is the filtered result value - if value is 1 then the data is valid to get the rights of election.
I've tried to see the explain but no column is used as an index lookup. I believe that's the problem why the application so slow in 200K rows. The query above only shows two conditions, but the application has a feature to filter by name, place of birth, birth date, age with ranges and so on.
How can I make this perform better?
Can a city be in two provinces? If not then why do you check C.id_prov=B.id if E.id_city = C.id should give you just one row?
Also it seems that your query is slow because you're selecting 200k rows. Indexes will improve performance but do you really need all the rows at once? You should use pagination (limit, offset).