Improve a query from Explain results - mysql

I have a complex query that is dynamically assembled based on search criteria. Even in its simplest form, it is very slow. The main table it runs against has ~10M records. I ran EXPLAIN against a 'base' query, and the first row of the output looks bad (at least to a novice DBA like me). I have read a couple of tutorials about EXPLAIN, but I am still unsure how to fix the query. The first row of the results seems to indicate the problem, but I don't know what to do with it. I couldn't make a composite key that long even if I wanted to, and some of the field names in that possible_keys column are not even in the patients table. Any help will be greatly appreciated.
id,select_type,table,type,possible_keys,key,key_len,ref,rows,Extra
1,SIMPLE,patients,range,"PRIMARY,location,appt_date,status,radiologist,contract,lastname,paperwork,images_archived,hash,created,document_attached,all_images_archived,last_image_archived,modality,study_uid,company,second_access,firstname,report_delivered,ssn,order_entry_status,dob,tech,doctor,mobile_facility,accession,location_appt_date,location_created,location_lastname,ref,person_seq",location_appt_date,55,NULL,573534,"Using index condition; Using where; Using temporary; Using filesort"
1,SIMPLE,receivable_transactions,ref,patient_seq,patient_seq,4,ris-dev.patients.seq,1,NULL
1,SIMPLE,patients_dispatch,ref,patient_seq,patient_seq,4,ris-dev.patients.seq,1,NULL
1,SIMPLE,mobile_facility,ref,"unique_index,name,location",unique_index,115,"ris-dev.patients.mobile_facility,const",1,"Using where"
1,SIMPLE,mobile_facility_service_areas,eq_ref,PRIMARY,PRIMARY,4,ris-dev.mobile_facility.service_area,1,NULL
Edit: same EXPLAIN, but reformatted to be easier to read:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|---|---|---|---|---|---|---|---|---|---|
| 1 | SIMPLE | patients | range | PRIMARY, location, appt_date, status, radiologist, contract, lastname, paperwork, images_archived, hash, created, document_attached, all_images_archived, last_image_archived, modality, study_uid, company, second_access, firstname, report_delivered, ssn, order_entry_status, dob, tech, doctor, mobile_facility, accession, location_appt_date, location_created, location_lastname, ref, person_seq | location_appt_date | 55 | NULL | 573534 | Using index condition; Using where; Using temporary; Using filesort |
| 1 | SIMPLE | receivable_transactions | ref | patient_seq | patient_seq | 4 | ris-dev.patients.seq | 1 | NULL |
| 1 | SIMPLE | patients_dispatch | ref | patient_seq | patient_seq | 4 | ris-dev.patients.seq | 1 | NULL |
| 1 | SIMPLE | mobile_facility | ref | unique_index, name, location | unique_index | 115 | ris-dev.patients.mobile_facility,const | 1 | Using where |
| 1 | SIMPLE | mobile_facility_service_areas | eq_ref | PRIMARY | PRIMARY | 4 | ris-dev.mobile_facility.service_area | 1 | NULL |
The EXPLAIN was run against the following query and table structures.
SELECT patients.fax_in_queue, patients.modality, patients.stat, patients.created, patients.seq, patients.lastname,
patients.firstname, patients.appt_date, patients.status, patients.contract, patients.location, patients.unique_hash,
patients.images_archived, patients.report_delivered, patients.doctor, patients.mobile_facility, patients.history,
patients.dob, patients.all_images_archived, patients.order_entry_status, patients.tech, patients.radiologist,
patients.last_image_archived, patients.state, patients.ss_comments, patients.completed, patients.report_status,
patients.have_paperwork, patients.facility_room_number, patients.facility_station_name, patients.facility_bed,
patients.findings_level, patients.document_attached, patients.study_start, patients.company, patients.accession,
patients.number_images, patients.client_number_images, patients.sex, patients.threshhold , GROUP_CONCAT(CONCAT(CONCAT(receivable_transactions.modifier, " "),
receivable_transactions.description) SEPARATOR ", ") AS rt_desc , patients_dispatch.seq AS doc_seq, patients_dispatch.requisition_last_sent,
patients_dispatch.requisition_signed_by_file_seq, patients_dispatch.requisition_signed, patients_dispatch.order_reason, patients_dispatch.order_comments,
patients_dispatch.order_taken, patients_dispatch.order_tech_last_notified, patients_dispatch.order_tech_in_transit, patients_dispatch.order_tech_in,
patients_dispatch.order_tech_out, patients_dispatch.order_tech_ack, patients_dispatch.addr1 AS d_addr1, patients_dispatch.addr2 AS d_addr2,
patients_dispatch.city AS d_city, patients_dispatch.state AS d_state, patients_dispatch.zip AS d_zip, CONCAT(patients.status, order_tech_out,
order_tech_in, order_tech_in_transit) as pseudo_status , mobile_facility.requisition_fax, mobile_facility.station_list, mobile_facility.address1 as mf_addr1,
mobile_facility.address2 as mf_addr2, mobile_facility.city as mf_city, mobile_facility.state as mf_state, mobile_facility.zip as mf_zip,
mobile_facility.phone as mf_phone, mobile_facility.phone2 as mf_phone2, mobile_facility_service_areas.name as mf_service_area
FROM patients LEFT JOIN receivable_transactions ON patients.seq = receivable_transactions.patient_seq
LEFT JOIN patients_dispatch ON patients.seq = patients_dispatch.patient_seq
LEFT JOIN mobile_facility ON patients.location = mobile_facility.location AND patients.mobile_facility = mobile_facility.name
LEFT JOIN mobile_facility_service_areas ON mobile_facility.service_area = mobile_facility_service_areas.seq
WHERE patients.location = "XYZCompany" AND ((patients.appt_date >= '2020-03-19' AND patients.appt_date <= '2020-03-19 23:59:59')
OR (patients.appt_date <= '2020-03-19' AND patients.status < 'X'))
GROUP BY patients.seq DESC
ORDER BY patients.status, patients.order_entry_status, pseudo_status, patients.order_entry_status, patients.lastname;
CREATE TABLE `patients` (
`seq` int(11) NOT NULL AUTO_INCREMENT,
`person_seq` int(11) NOT NULL,
`firstname` varchar(20) NOT NULL DEFAULT '',
`lastname` varchar(30) NOT NULL DEFAULT '',
`middlename` varchar(20) NOT NULL DEFAULT '',
`ref` varchar(50) NOT NULL DEFAULT '',
`location` varchar(50) NOT NULL DEFAULT '',
`doctor` varchar(50) NOT NULL,
`radiologist` varchar(20) NOT NULL DEFAULT '',
`contract` varchar(50) NOT NULL,
`history` mediumtext NOT NULL,
`dob` varchar(15) NOT NULL DEFAULT '0000-00-00',
`appt_date` date NOT NULL DEFAULT '0000-00-00',
`status` tinyint(4) NOT NULL DEFAULT '0',
`tech` varchar(50) NOT NULL DEFAULT '',
`created` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`ss_comments` mediumtext NOT NULL,
`mobile_facility` varchar(60) NOT NULL DEFAULT '',
`facility_room_number` varchar(50) NOT NULL,
`facility_bed` varchar(20) NOT NULL,
`facility_station_name` varchar(50) NOT NULL,
`stat` tinyint(4) NOT NULL DEFAULT '0',
`have_paperwork` tinyint(4) NOT NULL DEFAULT '0',
`completed` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`sex` char(1) NOT NULL DEFAULT '',
`unique_hash` varchar(100) NOT NULL DEFAULT '',
`number_images` int(11) NOT NULL DEFAULT '0',
`client_number_images` int(11) NOT NULL,
`images_archived` tinyint(4) NOT NULL DEFAULT '0',
`completed_fax` varchar(10) NOT NULL DEFAULT '0' COMMENT 'This is the number the completed report is faxed to.',
`report_delivered` tinyint(4) NOT NULL DEFAULT '0',
`report_delivered_time` datetime NOT NULL,
`document_attached` tinyint(4) NOT NULL DEFAULT '0',
`modality` varchar(3) NOT NULL,
`last_image_archived` datetime NOT NULL,
`all_images_archived` tinyint(4) NOT NULL DEFAULT '0',
`fax_in_queue` varchar(12) NOT NULL,
`accession` varchar(100) NOT NULL,
`study_uid` varchar(100) NOT NULL,
`order_entry_status` tinyint(4) NOT NULL,
`compare_to` varchar(15) NOT NULL,
`state` varchar(3) NOT NULL,
`company` int(11) NOT NULL,
`second_access` varchar(50) NOT NULL,
`threshhold` datetime NOT NULL,
`report_status` tinyint(4) NOT NULL,
`second_id` varchar(50) NOT NULL,
`rad_alerted` tinyint(4) NOT NULL,
`assigned` datetime NOT NULL,
`findings_level` tinyint(4) NOT NULL,
`report_viewed` tinyint(4) NOT NULL,
`study_received` datetime NOT NULL,
`study_start` datetime NOT NULL,
`study_end` datetime NOT NULL,
`completed_email` varchar(50) NOT NULL,
`completed_send` varchar(255) NOT NULL,
`ssn` varchar(12) NOT NULL,
`exorder_number` varchar(30) NOT NULL,
`exvisit_number` varchar(30) NOT NULL,
`row_updated` timestamp NOT NULL ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`seq`),
KEY `location` (`location`),
KEY `appt_date` (`appt_date`),
KEY `status` (`status`),
KEY `radiologist` (`radiologist`),
KEY `contract` (`contract`),
KEY `lastname` (`lastname`),
KEY `paperwork` (`have_paperwork`),
KEY `images_archived` (`images_archived`),
KEY `hash` (`unique_hash`),
KEY `created` (`created`),
KEY `document_attached` (`document_attached`),
KEY `all_images_archived` (`all_images_archived`),
KEY `last_image_archived` (`last_image_archived`),
KEY `modality` (`modality`),
KEY `study_uid` (`study_uid`),
KEY `company` (`company`),
KEY `second_access` (`second_access`),
KEY `firstname` (`firstname`),
KEY `report_delivered` (`report_delivered`),
KEY `ssn` (`ssn`),
KEY `order_entry_status` (`order_entry_status`),
KEY `dob` (`dob`),
KEY `tech` (`tech`),
KEY `doctor` (`doctor`),
KEY `mobile_facility` (`mobile_facility`),
KEY `accession` (`accession`),
KEY `location_appt_date` (`location`,`appt_date`),
KEY `location_created` (`location`,`created`),
KEY `location_lastname` (`location`,`lastname`),
KEY `ref` (`ref`),
KEY `person_seq` (`person_seq`)
) ENGINE=InnoDB AUTO_INCREMENT=10242952 DEFAULT CHARSET=latin1;
CREATE TABLE `receivable_transactions` (
`seq` int(11) NOT NULL AUTO_INCREMENT,
`patient_seq` int(11) NOT NULL DEFAULT '0',
`cptcode` varchar(15) NOT NULL DEFAULT '',
`modifier` char(2) NOT NULL DEFAULT '',
`description` varchar(100) NOT NULL DEFAULT '',
`amount` decimal(6,2) NOT NULL DEFAULT '0.00',
`type` char(2) NOT NULL DEFAULT '',
`transaction` varchar(10) NOT NULL DEFAULT '',
`radiologist` varchar(20) NOT NULL DEFAULT '',
`status` tinyint(4) NOT NULL DEFAULT '0',
`completed` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`created` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`report_meta_seq` int(11) NOT NULL DEFAULT '0',
`report_header` varchar(255) NOT NULL,
`report_body` blob NOT NULL,
`report_impression` mediumtext NOT NULL,
`report_hide` tinyint(4) NOT NULL,
`radiologist_group` varchar(50) NOT NULL,
`addendum` int(4) NOT NULL DEFAULT '0',
`addendum_type` varchar(20) NOT NULL,
`peer_review` int(4) NOT NULL DEFAULT '0',
`qa_reason` varchar(255) NOT NULL DEFAULT '',
`qa_agree` decimal(2,1) NOT NULL DEFAULT '0.0',
`findings` tinyint(4) NOT NULL,
`comments` mediumtext NOT NULL,
`company` int(11) NOT NULL,
`row_updated` timestamp NOT NULL ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`seq`),
KEY `patient_seq` (`patient_seq`),
KEY `cptcode` (`cptcode`),
KEY `transaction` (`transaction`),
KEY `type` (`type`),
KEY `created` (`created`),
KEY `radiologist` (`radiologist`),
KEY `status` (`status`),
KEY `report_meta_seq` (`report_meta_seq`),
KEY `Billing Check Dropdown` (`status`,`completed`),
KEY `qa_agree` (`qa_agree`),
KEY `peer_review` (`peer_review`),
KEY `addendum` (`addendum`),
KEY `company` (`company`),
KEY `completed` (`completed`)
) ENGINE=InnoDB AUTO_INCREMENT=9380351 DEFAULT CHARSET=latin1;
CREATE TABLE `patients_dispatch` (
`seq` int(11) NOT NULL AUTO_INCREMENT,
`patient_seq` int(11) NOT NULL,
`order_taken` datetime NOT NULL,
`order_taken_by` varchar(50) NOT NULL,
`order_person_calling` varchar(50) NOT NULL,
`order_supervising_physician` varchar(50) NOT NULL,
`order_trip_count` tinyint(4) NOT NULL,
`order_trip_count_max` tinyint(4) NOT NULL,
`order_trip_visit` tinyint(4) NOT NULL,
`order_tech_in` datetime NOT NULL,
`order_tech_out` datetime NOT NULL,
`order_ssn` varchar(12) NOT NULL,
`order_service_request_time` datetime NOT NULL,
`order_reason` varchar(255) NOT NULL,
`order_tech_ack` datetime NOT NULL,
`order_tech_assigned` datetime NOT NULL,
`order_tech_last_notified` datetime NOT NULL,
`requisition_last_sent` datetime NOT NULL,
`requisition_signed` datetime NOT NULL,
`requisition_signed_by` varchar(50) NOT NULL,
`requisition_signed_by_text` varchar(75) NOT NULL,
`requisition_signed_by_file_seq` int(11) NOT NULL,
`order_comments` mediumtext NOT NULL,
`order_tech_in_transit` datetime NOT NULL,
`fasting` tinyint(1) NOT NULL,
`collection_time` time DEFAULT NULL,
`addr1` varchar(100) NOT NULL,
`addr2` varchar(100) NOT NULL,
`city` varchar(30) NOT NULL,
`state` varchar(3) NOT NULL,
`zip` varchar(12) NOT NULL,
`phone` varchar(15) NOT NULL,
`mileage_start` int(11) NOT NULL,
`mileage_end` int(11) NOT NULL,
PRIMARY KEY (`seq`),
KEY `patient_seq` (`patient_seq`)
) ENGINE=InnoDB AUTO_INCREMENT=2261091 DEFAULT CHARSET=latin1;
CREATE TABLE `mobile_facility` (
`seq` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(60) NOT NULL,
`location` varchar(50) DEFAULT NULL,
`address1` varchar(50) NOT NULL,
`address2` varchar(50) NOT NULL,
`city` varchar(50) NOT NULL,
`state` varchar(2) NOT NULL,
`zip` varchar(10) NOT NULL,
`phone` varchar(15) NOT NULL,
`phone2` varchar(15) NOT NULL,
`fax` varchar(110) NOT NULL,
`rads_can_read` text NOT NULL,
`rads_cant_read` text NOT NULL,
`only_techs` text NOT NULL,
`never_modalities` varchar(255) NOT NULL COMMENT 'A serialized list of modalities a facility may not use.',
`station_list` mediumtext NOT NULL,
`email` varchar(255) NOT NULL,
`misc1` varchar(255) NOT NULL,
`latitude` float NOT NULL DEFAULT '0',
`longitude` float NOT NULL DEFAULT '0',
`affiliation` int(11) NOT NULL COMMENT 'mobile_facility_affiliations seq',
`branch` int(11) NOT NULL COMMENT 'mobile_facility_branches seq',
`service_area` int(11) NOT NULL COMMENT 'mobile_facility_service_areas seq',
`other_id` varchar(50) NOT NULL COMMENT 'Usually used for HL7',
`facility_type` varchar(2) DEFAULT NULL,
`no_stat` tinyint(1) NOT NULL DEFAULT '0' COMMENT 'Should the facility allow stat priority on patients?',
`facility_notes` varchar(512) DEFAULT NULL,
`requisition_fax` varchar(110) NOT NULL,
`report_template` text NOT NULL,
`all_orders_stat` tinyint(1) NOT NULL,
`sms_notification` varchar(15) NOT NULL,
`tat` varchar(10) NOT NULL,
`npi` varchar(15) NOT NULL,
`NMXR` tinyint(4) NOT NULL DEFAULT '0',
`billing_type` varchar(10) NOT NULL,
`salesman` varchar(75) NOT NULL,
`created_at` datetime NOT NULL,
`updated_at` datetime NOT NULL,
`default_bill_to` tinyint(4) NOT NULL DEFAULT '0',
PRIMARY KEY (`seq`),
UNIQUE KEY `unique_index` (`name`,`location`),
KEY `name` (`name`),
KEY `location` (`location`)
) ENGINE=InnoDB AUTO_INCREMENT=155104 DEFAULT CHARSET=latin1;
CREATE TABLE `mobile_facility_service_areas` (
`seq` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(50) NOT NULL,
`location` varchar(50) NOT NULL,
PRIMARY KEY (`seq`)
) ENGINE=InnoDB AUTO_INCREMENT=841 DEFAULT CHARSET=latin1;

EXPLAIN shows the query using the location_appt_date index, but it still expects to examine about half a million rows, so in practice the index is doing little more than narrowing the search by location. You'd like it to narrow the search further by appt_date.
However, the use of OR in your WHERE clause is causing a problem. The optimizer can't settle on one tight range over the index that covers both alternatives.
Here's what I suggest:
Drop the index on location because it's redundant with the other indexes that have location as their first column.
Replace the index on location_appt_date with an index on location_appt_date_status.
ALTER TABLE patients
DROP KEY location,
DROP KEY location_appt_date,
ADD KEY location_appt_date_status (location, appt_date, status);
Refactor the query to use UNION instead of OR:
SELECT ... (all the columns you have) ...
FROM (
SELECT * FROM patients USE INDEX (location_appt_date_status)
WHERE location = 'XYZCompany' AND appt_date >= '2020-03-19' AND appt_date < '2020-03-20'
UNION
SELECT * FROM patients USE INDEX (location_appt_date_status)
WHERE location = 'XYZCompany' AND appt_date <= '2020-03-19' AND status < 'X'
) AS p
LEFT JOIN receivable_transactions FORCE INDEX (patient_seq)
ON p.seq = receivable_transactions.patient_seq
LEFT JOIN patients_dispatch FORCE INDEX (patient_seq)
ON p.seq = patients_dispatch.patient_seq
LEFT JOIN mobile_facility FORCE INDEX (unique_index)
ON p.location = mobile_facility.location AND p.mobile_facility = mobile_facility.name
LEFT JOIN mobile_facility_service_areas FORCE INDEX (PRIMARY)
ON mobile_facility.service_area = mobile_facility_service_areas.seq
GROUP BY p.seq
ORDER BY p.status, p.order_entry_status, pseudo_status, p.order_entry_status, p.lastname
You might not need all the USE INDEX() / FORCE INDEX() optimizer hints I used. I did those because I was testing with empty tables, and that can confuse the optimizer.
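One way to find out whether the hints are needed is to run EXPLAIN on each branch of the UNION by itself, without any hints, once the location_appt_date_status index from the ALTER TABLE above exists. This is only a sketch: the literal values are the placeholders from the question, and what the optimizer picks will depend on your data and MySQL version.

```sql
-- Branch 1: a single day of appointments for one location.
EXPLAIN
SELECT * FROM patients
WHERE location = 'XYZCompany'
  AND appt_date >= '2020-03-19' AND appt_date < '2020-03-20';

-- Branch 2: everything up to that day that is still below the status threshold.
-- 'X' is the placeholder value used in the question.
EXPLAIN
SELECT * FROM patients
WHERE location = 'XYZCompany'
  AND appt_date <= '2020-03-19' AND status < 'X';
```

If the key column already shows location_appt_date_status for both statements, the USE INDEX clauses can probably be dropped.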

Let me focus on the part that affects optimization the most:
FROM patients AS p
LEFT JOIN receivable_transactions AS rt ON p.seq = rt.patient_seq
LEFT JOIN patients_dispatch AS pd ON p.seq = pd.patient_seq
LEFT JOIN mobile_facility AS mf ON p.location = mf.location
AND p.mobile_facility = mf.name
LEFT JOIN mobile_facility_service_areas AS sa ON mf.service_area = sa.seq
WHERE p.location = "XYZCompany"
AND ((p.appt_date >= '2020-03-19'
AND p.appt_date <= '2020-03-19 23:59:59')
OR (p.appt_date <= '2020-03-19'
AND p.status < 'X')
)
GROUP BY p.seq DESC
ORDER BY p.status, p.order_entry_status, pseudo_status, p.order_entry_status,
p.lastname;
The biggest issue is the OR. It often prevents most optimizations. The usual fix is to turn it into a UNION:
( SELECT ...
FROM .. JOIN ..
WHERE p.location = "XYZCompany"
AND p.appt_date >= '2020-03-19'
AND p.appt_date < '2020-03-19' + INTERVAL 1 DAY
...
)
UNION ALL
( SELECT ...
FROM .. JOIN ..
WHERE p.location = "XYZCompany"
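AND p.appt_date < '2020-03-19' -- strictly earlier: the first SELECT already returns every row dated 2020-03-19, so UNION ALL yields no duplicates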
AND p.status < 'X'
...
)
Each select can benefit from this composite index on patients:
(location, appt_date, status)
The < 'X' is problematic because two ranges (appt_date and status) cannot both be used effectively. What are the possible values of status? If there is only one value before 'X', say 'M', then this would be much better: p.status = 'M' together with another index: (location, status, appt_date)
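As a sketch of that alternative, assuming status really does have a single interesting value: the index name below is made up for illustration, and 'M' is only the placeholder from the paragraph above (status is a tinyint in the posted schema, so the real value would be numeric).

```sql
-- Hypothetical index name. Both equality columns come first,
-- the range column (appt_date) comes last.
ALTER TABLE patients
  ADD INDEX location_status_appt_date (location, status, appt_date);

-- Two equality tests followed by a single range, which this index can serve:
SELECT p.seq
FROM patients AS p
WHERE p.location = 'XYZCompany'
  AND p.status = 'M'               -- placeholder; substitute the real status value
  AND p.appt_date <= '2020-03-19';
```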
SELECT lots of stuff, then GROUP BY p.seq -- This will probably create strange results. (Search for ONLY_FULL_GROUP_BY for more discussion.) It may be better to first get the patients.seq values (since that is all you are filtering on), then join to the other tables. This would eliminate the GROUP BY, or at least force you to deal with which row to fetch from each of the other tables.
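A rough sketch of that "ids first, then join" shape, under some assumptions: only a handful of the columns are shown, the 'X' threshold is the placeholder from the question, and the GROUP_CONCAT is moved into a correlated subquery so that no outer GROUP BY is needed. The patients_dispatch and mobile_facility joins are left out here; patients_dispatch has no unique key on patient_seq in the posted schema, so you would have to decide which row to take.

```sql
SELECT p.seq, p.lastname, p.firstname, p.appt_date, p.status,
       -- One concatenated string per patient, without JOIN + GROUP BY.
       ( SELECT GROUP_CONCAT(CONCAT(rt.modifier, ' ', rt.description) SEPARATOR ', ')
           FROM receivable_transactions AS rt
          WHERE rt.patient_seq = p.seq ) AS rt_desc
FROM (
        -- Step 1: collect only the primary keys that match the filter.
        SELECT seq FROM patients
         WHERE location = 'XYZCompany'
           AND appt_date >= '2020-03-19'
           AND appt_date <  '2020-03-19' + INTERVAL 1 DAY
        UNION DISTINCT
        SELECT seq FROM patients
         WHERE location = 'XYZCompany'
           AND appt_date <= '2020-03-19'
           AND status < 'X'        -- placeholder from the question
     ) AS ids
-- Step 2: fetch the wide rows only for those keys.
JOIN patients AS p ON p.seq = ids.seq
ORDER BY p.status, p.order_entry_status, p.lastname;
```

Because the inner SELECTs touch only seq and the indexed columns, they have a chance of being satisfied from the index alone before the wide patients rows are read.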
The EXPLAIN row for patients (range, location_appt_date, key_len 55, 573534 rows, Using index condition; Using where; Using temporary; Using filesort) says several things.
key_len 55 = 2 (length prefix) + 50 (for varchar(50), 1 byte per character in latin1) + 3 (for date); neither column is nullable, so no NULL-flag byte is added.
Since 55 bytes covers both columns of the index, I wonder if it is already so well optimized that the OR->UNION is not needed.
"Using index condition" is internally called ICP (Index Condition Pushdown) if you want further understanding.
"Using filesort" may be an understatement -- There are probably two sorts, one for GROUP BY, one for ORDER BY. EXPLAIN FORMAT=JSON SELECT ... would make it clear. (And hence my hint that the GROUP BY should be avoided.)
You have some redundant indexes (not relevant to much other than disk space): INDEX(a,b), INDEX(a) --> toss INDEX(a).
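For instance, going by the CREATE TABLE statements in the question, two indexes outside patients look like leading prefixes of wider ones. This is a sketch only; before dropping anything, check that no query names these indexes in a USE INDEX / FORCE INDEX hint.

```sql
-- mobile_facility: KEY name (name) is a prefix of
-- UNIQUE KEY unique_index (name, location).
ALTER TABLE mobile_facility DROP INDEX `name`;

-- receivable_transactions: KEY status (status) is a prefix of
-- KEY `Billing Check Dropdown` (status, completed).
ALTER TABLE receivable_transactions DROP INDEX `status`;
```

On patients, the single-column location index is in the same position relative to location_appt_date, location_created and location_lastname.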
patients has an awful number of indexes.
The other tables seem to have adequate indexes for your query.
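To follow up on the EXPLAIN FORMAT=JSON suggestion earlier in this answer, here is a cut-down statement that keeps the GROUP BY plus ORDER BY shape of the original. The column list is shortened for illustration and the filter is simplified to one branch of the OR; the point is the structure of the output, not the result.

```sql
EXPLAIN FORMAT=JSON
SELECT p.seq,
       GROUP_CONCAT(rt.description SEPARATOR ', ') AS rt_desc
FROM patients AS p
LEFT JOIN receivable_transactions AS rt ON p.seq = rt.patient_seq
WHERE p.location = 'XYZCompany'
  AND p.appt_date <= '2020-03-19'
GROUP BY p.seq
ORDER BY p.status, p.lastname;
```

In the JSON, look for the nodes that describe grouping and ordering (in MySQL 5.7 these are typically named grouping_operation and ordering_operation, each with using_temporary_table and using_filesort flags; treat the exact names as an assumption for your version). If both report a sort, that confirms the two separate sorts.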

Related

Optimizing MySQL query with INNER JOINS, LEFT JOINS, GROUP BY and HAVING

I'm having trouble optimizing this really big query and I can't change the table structure except creating additional indexes and small adjustments.
SELECT
'Fattura Prodotti Postali' AS `type`,
SUM(dpd.qta) AS `products_count_quantity`,
COUNT(dpd.IDlavorazione_dett) AS `products_count`,
GROUP_CONCAT(DISTINCT dp.prod_totali - CAST(dp.opzione1 AS UNSIGNED) SEPARATOR ' |-| ') AS `process_products_count`,
GROUP_CONCAT(DISTINCT dp.IDdistinta) AS `product_code`,
GROUP_CONCAT(DISTINCT dp.data_distinta) AS `process_date`,
GROUP_CONCAT(DISTINCT dp.IDesito) AS `process_status_id`,
GROUP_CONCAT(DISTINCT dp.note) AS `process_note`,
GROUP_CONCAT(DISTINCT dp.IDlavorazione) AS `unique_id`,
SUM(dpd.tariffa) AS `products_total`,
SUM(IF(o.IDdoc IS NULL,1,0)*dpd.qta) AS `tobill_count_quantity`,
SUM(IF(NOT o.IDdoc IS NULL,1,0)*dpd.qta) AS `billed_count_quantity`,
SUM(IF(o.IDdoc IS NULL,1,0)) AS `tobill_count`,
SUM(IF(o.IDdoc IS NULL, dpd.tariffa,0)) AS `tobill_total`,
SUM(IF(NOT o.IDdoc IS NULL,1,0)) AS `billed_count`,
SUM(IF(NOT o.IDdoc IS NULL, dpd.tariffa,0)) AS `billed_total`,
SUM(IF(o.IDdoc IS NULL, dpd.tariffa,0)-IF(NOT o.IDdoc IS NULL, dpd.tariffa,0)) AS `bill_diff`,
COUNT(dpd.IDlavorazione_dett) AS `products_count`,
SUM(dpd.tariffa) AS `products_total`,
SUM(dpd.tariffa*(dpd.iva/100)) AS `products_vat`,
SUM(dpd.tariffa*(dpd.sconto/100.0)) AS `products_discount`,
SUM(dpd.tariffa*(1.0-dpd.sconto/100.0)*(dpd.iva/100)) AS `products_discount_vat`,
SUM(dpd.tariffa*(1+(dpd.iva/100.0))) AS `products_total_vat`,
SUM(ROUND(dpd.tariffa*(1.0-dpd.sconto/100.0),5)) AS `products_total_discount`,
SUM(dpd.tariffa*(1.0-dpd.sconto/100.0)*(1+(dpd.iva/100.0))) AS `products_total_discount_vat`,
SUM(dpd.qta) AS `products_quantity`,
SUM(IF(o.IDdoc IS NULL,1,0)) AS `tobill_count`,
SUM(IF(o.IDdoc IS NULL, dpd.tariffa,0)) AS `tobill_total`,
SUM(IF(o.IDdoc IS NULL, dpd.tariffa*(1+(dpd.iva/100.0)),0)) AS `tobill_total_vat`,
SUM(IF(NOT o.IDdoc IS NULL,1,0)) AS `billed_count`,
SUM(IF(NOT o.IDdoc IS NULL, dpd.tariffa,0)) AS `billed_total`,
SUM(IF(NOT o.IDdoc IS NULL, dpd.tariffa*(1+(dpd.iva/100.0)),0)) AS `billed_total_vat`
FROM doc_prodottipostali_dett dpd
INNER JOIN tracking t ON (dpd.IDlavorazione_dett=t.product_id)
LEFT JOIN prodotti_pp ppp ON (dpd.IDprodotto=ppp.IDprodotto)
LEFT JOIN categorie_pp cpp ON (ppp.categoria=cpp.IDcategoria)
LEFT JOIN categorie_pp cppp ON (CAST(dpd.IDcategoria AS UNSIGNED)=cppp.IDcategoria)
INNER JOIN doc_prodottipostali dp ON (dpd.IDlavorazione=dp.IDlavorazione)
LEFT JOIN ordini o ON (dpd.IDfattura=o.IDdoc)
WHERE
(
(dp.tipo = 'PT' AND t.date >= '2022-05-15 00:00:00' AND t.date <= '2022-07-26 23:59:59') AND
((IF(dpd.IDprodotto>0, cpp.codice, IF(CAST(dpd.IDcategoria AS UNSIGNED)>0, cppp.codice, 'NONE')) NOT IN ('LAW','AR','CAD','EMESSOCAD') OR IF(dpd.IDprodotto>0, cpp.codice, IF(CAST(dpd.IDcategoria AS UNSIGNED)>0, cppp.codice, 'NONE')) IS NULL)) AND
((t.last IN (-1,2,3,6,7,10,11,34,35,130,131,258,514,4098,4354,8194,8450))) AND
(((dp.opzione2 = 'PI' AND dp.data_distinta < '2022-08-25')) OR ((dp.data_distinta >= '2022-08-25')) OR ((o.IDdoc >= '1'))) AND
(((t.last IN (-1,2,3,6,7,10,11,34,35,130,131,258,514,4098,4354,8194,8450))))
) AND
(
((NOT t.last IN (-1,4,5,6,7,20,21,132,133,149,516,532,1028,1157,8197)))
)
GROUP BY dpd.IDlavorazione
HAVING (1=1 AND ((tobill_count > '0')))
ORDER BY dp.data_distinta ASC;
The CREATE TABLE statements are as follows:
CREATE TABLE `doc_prodottipostali_dett` (
`IDlavorazione_dett` int(11) NOT NULL AUTO_INCREMENT,
`IDlavorazione` int(11) NOT NULL,
`IDdistinta` varchar(45) NOT NULL,
`IDcategoria` varchar(45) NOT NULL,
`IDprodotto` int(11) NOT NULL,
`codiceabarre` varchar(255) NOT NULL,
`codiceavviso` varchar(255) NOT NULL DEFAULT '',
`IDvettore` int(11) NOT NULL,
`IDlistino` int(11) NOT NULL,
`rif` varchar(45) NOT NULL,
`IDmittente` int(11) NOT NULL,
`IDdestinatario` int(11) NOT NULL,
`ufficio_mittente` varchar(45) NOT NULL,
`nome_lavoro` varchar(255) NOT NULL,
`IDpostino` int(11) DEFAULT NULL,
`note` longtext NOT NULL,
`allegati` char(1) NOT NULL,
`utente` varchar(45) NOT NULL,
`peso` decimal(10,5) NOT NULL,
`tariffa` decimal(10,5) NOT NULL,
`sconto` float NOT NULL DEFAULT 0,
`iva` int(11) NOT NULL,
`qta` int(11) NOT NULL,
`am` char(1) NOT NULL,
`cp` char(1) NOT NULL,
`eu` char(1) NOT NULL,
`aa` char(10) NOT NULL,
`ee` char(10) NOT NULL,
`stato` int(11) NOT NULL,
`lavorato` int(1) NOT NULL,
`IDesito` int(11) NOT NULL,
`esito` varchar(45) NOT NULL,
`data_op` datetime NOT NULL,
`fatturato` int(11) NOT NULL,
`data_fatt` date NOT NULL,
`IDfattura` int(11) NOT NULL,
`data` datetime DEFAULT NULL,
`data_ar` datetime DEFAULT NULL,
`nome_ar` varchar(255) DEFAULT NULL,
`file_ar` varchar(255) DEFAULT NULL,
`IDflusso_dett` int(11) NOT NULL DEFAULT 0,
`type` varchar(2) NOT NULL DEFAULT '',
`typology` varchar(2) NOT NULL DEFAULT '',
`related_id` int(11) NOT NULL DEFAULT 0,
`related_ar_id` int(11) NOT NULL DEFAULT 0,
`unregistered` int(1) NOT NULL DEFAULT 0,
`total_attachments` int(11) NOT NULL DEFAULT 0,
`repeated_recipient` int(4) DEFAULT NULL,
`law_tomanage` tinyint(4) NOT NULL DEFAULT 0,
`law_towork` int(11) NOT NULL DEFAULT 0,
`law_toprint` int(11) NOT NULL DEFAULT 0,
`notlaw_tocomplete` int(11) NOT NULL DEFAULT 0,
PRIMARY KEY (`IDlavorazione_dett`),
KEY `IDvettore` (`IDvettore`),
KEY `IDmittente` (`IDmittente`),
KEY `IDdestinatario` (`IDdestinatario`),
KEY `IDlavorazione` (`IDlavorazione`),
KEY `IDfattura` (`IDfattura`),
KEY `codiceabarre` (`codiceabarre`),
KEY `rif` (`rif`),
KEY `IDpostino` (`IDpostino`),
KEY `IDcategoria` (`IDcategoria`),
KEY `IDprodotto` (`IDprodotto`),
KEY `IDlistino` (`IDlistino`),
KEY `codiceavviso` (`codiceavviso`),
KEY `IDflusso_dett` (`IDflusso_dett`),
KEY `typology` (`typology`),
KEY `nome_ar` (`nome_ar`),
KEY `unregistered` (`unregistered`),
KEY `related_ar_id` (`related_ar_id`),
KEY `related_id` (`related_id`) USING BTREE,
KEY `type` (`type`),
KEY `stato` (`stato`),
KEY `lavorato` (`lavorato`),
KEY `law_tomanage` (`law_tomanage`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `tracking` (
`tracking_id` int(11) NOT NULL AUTO_INCREMENT,
`product_id` int(11) NOT NULL,
`status_id` int(11) NOT NULL,
`lat` double NOT NULL DEFAULT 0,
`lng` double NOT NULL DEFAULT 0,
`entity_id` int(11) NOT NULL,
`sub_entity_id` int(11) NOT NULL DEFAULT 0,
`date` datetime NOT NULL,
`note` text DEFAULT NULL,
`last` int(11) NOT NULL DEFAULT 0,
`date_last` datetime NOT NULL DEFAULT current_timestamp(),
`tracking_rel` int(11) NOT NULL DEFAULT 0,
`quantity_from` int(11) NOT NULL DEFAULT 1,
`quantity_to` int(11) NOT NULL DEFAULT 1,
`price` decimal(14,4) NOT NULL DEFAULT 0.0000,
`price_unit` decimal(14,4) NOT NULL DEFAULT 0.0000,
`vat` decimal(14,4) NOT NULL DEFAULT 0.0000,
`invoice_id` int(11) NOT NULL DEFAULT 0,
`management_status` int(11) NOT NULL DEFAULT 0,
`package_id` int(11) NOT NULL DEFAULT 0,
PRIMARY KEY (`tracking_id`),
KEY `product_id` (`product_id`),
KEY `status_id` (`status_id`),
KEY `entity_id` (`entity_id`),
KEY `sub_entity_id` (`sub_entity_id`),
KEY `date` (`date`),
KEY `last` (`last`),
KEY `package_id` (`package_id`),
KEY `management_status` (`management_status`),
KEY `date_last` (`date_last`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `prodotti_pp` (
`IDprodotto` int(10) unsigned NOT NULL AUTO_INCREMENT,
`codice` varchar(255) NOT NULL DEFAULT '',
`codiceabarre` varchar(255) NOT NULL DEFAULT '',
`descrizione` longtext NOT NULL,
`prezzo` decimal(10,2) NOT NULL DEFAULT 0.00,
`prezzo_2` decimal(10,2) NOT NULL DEFAULT 0.00,
`prezzo_3` decimal(10,2) NOT NULL DEFAULT 0.00,
`UM` char(3) NOT NULL DEFAULT '',
`peso` decimal(10,2) NOT NULL DEFAULT 0.00,
`datapv` date NOT NULL DEFAULT '0000-00-00',
`iva` decimal(10,0) NOT NULL DEFAULT 0,
`fornitore` varchar(45) NOT NULL DEFAULT '',
`categoria` int(10) unsigned NOT NULL DEFAULT 0,
`trasp` decimal(10,2) NOT NULL DEFAULT 0.00,
`dettaglio` longtext NOT NULL,
`ico` varchar(255) NOT NULL DEFAULT '',
`foto` varchar(255) NOT NULL DEFAULT '',
`visibility_web` char(1) NOT NULL DEFAULT '',
`composito` char(1) NOT NULL DEFAULT '',
`sottocat` int(10) unsigned NOT NULL DEFAULT 0,
`tipologia` int(10) NOT NULL DEFAULT 0,
`marchio` int(10) unsigned NOT NULL DEFAULT 0,
`disp` char(1) NOT NULL DEFAULT '',
`vetrina1` char(1) NOT NULL DEFAULT '',
`vetrina2` char(1) NOT NULL DEFAULT '',
`click` decimal(10,0) NOT NULL DEFAULT 0,
`IDnote` int(11) NOT NULL DEFAULT 1,
PRIMARY KEY (`IDprodotto`),
KEY `categoria` (`categoria`),
KEY `codice` (`codice`)
) ENGINE=InnoDB DEFAULT CHARSET=UTF8;
CREATE TABLE `categorie_pp` (
`IDcategoria` int(10) unsigned NOT NULL AUTO_INCREMENT,
`IDmadre` int(11) NOT NULL DEFAULT 0,
`codice` varchar(255) NOT NULL,
`nome` varchar(45) NOT NULL DEFAULT '',
`visibility_web` char(1) NOT NULL DEFAULT '',
PRIMARY KEY (`IDcategoria`),
KEY `codice` (`codice`),
KEY `IDmadre` (`IDmadre`)
) ENGINE=InnoDB DEFAULT CHARSET=UTF8;
CREATE TABLE `doc_prodottipostali` (
`IDlavorazione` int(11) NOT NULL AUTO_INCREMENT,
`codice_lavorazione` varchar(255) NOT NULL,
`IDdistinta` varchar(45) NOT NULL,
`tipo` varchar(45) NOT NULL,
`data_distinta` date NOT NULL,
`data_lavorazione` date NOT NULL,
`IDcliente` int(11) DEFAULT 1,
`IDpagamento` int(11) NOT NULL,
`note` longtext NOT NULL,
`utente` varchar(45) NOT NULL,
`stato` int(11) NOT NULL,
`data_op` datetime NOT NULL,
`opzioni` varchar(45) NOT NULL,
`rif` varchar(45) NOT NULL,
`opzione1` varchar(45) NOT NULL,
`opzione2` varchar(45) NOT NULL,
`allegati` int(11) NOT NULL DEFAULT 0,
`IDesito` int(11) NOT NULL,
`esito` varchar(45) NOT NULL,
`prod_totali` int(11) NOT NULL,
`prod_accettati` int(11) NOT NULL,
`prod_fatturati` int(11) NOT NULL,
`prod_chiusi` int(11) NOT NULL,
`prod_end_shipping` int(11) NOT NULL,
`stato_fatt` char(1) NOT NULL,
`IDfattura` int(11) NOT NULL DEFAULT 0,
`IDrel` int(11) NOT NULL DEFAULT 0,
`IDflow` int(11) NOT NULL DEFAULT 0,
`laws_tomanage` tinyint(4) NOT NULL DEFAULT 0,
`idSender` int(11) NOT NULL,
PRIMARY KEY (`IDlavorazione`),
KEY `IDdistinta` (`IDdistinta`),
KEY `IDfattura` (`IDfattura`),
KEY `IDrel` (`IDrel`),
KEY `allegati` (`allegati`),
KEY `tipo` (`tipo`),
KEY `IDesito` (`IDesito`),
KEY `laws_tomanage` (`laws_tomanage`),
KEY `data_distinta` (`data_distinta`)
) ENGINE=InnoDB DEFAULT CHARSET=UTF8;
CREATE TABLE `ordini` (
`IDdoc` int(11) NOT NULL AUTO_INCREMENT,
`IDsoggetto` int(11) DEFAULT 1,
`numero` varchar(45) NOT NULL DEFAULT '0',
`numero_web` int(11) NOT NULL,
`datadoc` date NOT NULL DEFAULT '0000-00-00',
`tipodoc` varchar(255) NOT NULL,
`type_doc` varchar(4) NOT NULL DEFAULT '',
`modpag` int(10) unsigned NOT NULL DEFAULT 0,
`div_dest` longtext NOT NULL,
`IDagente` varchar(6) NOT NULL DEFAULT '0',
`IDvettore` int(10) unsigned NOT NULL DEFAULT 0,
`imballo` decimal(20,5) NOT NULL DEFAULT 0.00000,
`colli` decimal(10,0) NOT NULL DEFAULT 0,
`datascad` date NOT NULL DEFAULT '0000-00-00',
`stato` char(1) NOT NULL DEFAULT '',
`note` longtext NOT NULL,
`importo` decimal(20,5) NOT NULL DEFAULT 0.00000,
`tipo` varchar(4) NOT NULL DEFAULT '',
`causale` varchar(45) NOT NULL DEFAULT '',
`azione` char(1) NOT NULL DEFAULT '',
`utente` varchar(45) NOT NULL DEFAULT '',
`data_op` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`acconto` decimal(20,5) NOT NULL DEFAULT 0.00000,
`iva` decimal(20,5) NOT NULL DEFAULT 0.00000,
`sconto` decimal(20,5) NOT NULL DEFAULT 0.00000,
`n_doc_passive` varchar(45) NOT NULL DEFAULT '',
`porto` varchar(45) NOT NULL DEFAULT '',
`dataora_rit` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`ordine_web` char(1) NOT NULL DEFAULT '',
`clonato` char(1) NOT NULL DEFAULT '',
`trasp` decimal(20,5) NOT NULL DEFAULT 0.00000,
`varie` decimal(20,5) NOT NULL DEFAULT 0.00000,
`banca_appo` int(11) NOT NULL DEFAULT 0,
`IDmagazzino` int(11) NOT NULL DEFAULT 0,
`pv` int(11) NOT NULL DEFAULT 0,
`fornitore` int(11) NOT NULL DEFAULT 0,
`qta_colli` decimal(10,0) NOT NULL DEFAULT 0,
`bolli` decimal(20,5) NOT NULL DEFAULT 0.00000,
`agibilitamezzi` varchar(255) NOT NULL DEFAULT '',
`opzioneprezzo` varchar(45) NOT NULL DEFAULT '',
`emailcc` varchar(255) NOT NULL DEFAULT '',
`U_IBAN` varchar(45) NOT NULL,
`stato_fatt` char(1) NOT NULL,
`IDfattura` int(11) NOT NULL,
`data_fatt` date NOT NULL,
`digital_idinvoice` varchar(255) NOT NULL DEFAULT '0',
`digital_idupload` bigint(11) NOT NULL DEFAULT 0,
`digital_status` int(11) NOT NULL DEFAULT 0,
`digital_type` varchar(2) NOT NULL DEFAULT '',
`digital_office` int(11) NOT NULL,
`digital_sectional` char(2) NOT NULL DEFAULT '',
`digital_last_office` int(11) NOT NULL,
`digital_flag` varchar(4) NOT NULL,
`determines_code` varchar(255) NOT NULL,
`determines_date` date NOT NULL,
`determines_id` varchar(255) NOT NULL,
`flag_2` varchar(255) NOT NULL,
`flag` int(11) NOT NULL,
`flag_note` longtext NOT NULL,
`flag_3` varchar(255) NOT NULL,
`type_op` varchar(255) NOT NULL,
`IDpartenza_rel` int(11) NOT NULL,
`IDdestinazione_rel` int(11) NOT NULL,
`doc_insinuazionepassivo` varchar(255) NOT NULL,
`doc_insinuazionepassivo_data` date NOT NULL DEFAULT '0000-00-00',
PRIMARY KEY (`IDdoc`),
KEY `IDsoggetto` (`IDsoggetto`),
KEY `IDfattura` (`IDfattura`),
KEY `pv` (`pv`),
KEY `azione` (`azione`),
KEY `numero` (`numero`),
KEY `tipodoc` (`tipodoc`),
KEY `datadoc` (`datadoc`),
KEY `digital_sectional` (`digital_sectional`),
KEY `digital_type` (`digital_type`),
KEY `digital_office` (`digital_office`),
KEY `digital_status` (`digital_status`),
KEY `type_doc` (`type_doc`),
KEY `flag_2` (`flag_2`),
KEY `flag` (`flag`),
KEY `flag_3` (`flag_3`),
KEY `type_op` (`type_op`)
) ENGINE=InnoDB DEFAULT CHARSET=UTF8;
These are the number of rows for each table:
tracking:42231628
doc_prodottipostali_det:11316150
doc_prodottipostali:40556
ordini:40360
prodotti_pp:52
categorie_pp:30
I've tried making a subquery on the tracking table (the biggest), trying to filter out most of the rows but it didn't work.
This is the explain I get, it looks fine except for the "Using temporary" and "Using filesort".
The problem is that the query runs fine when it processes a small number of rows, but it slows down sharply as soon as it has to process more than 1 million rows.
Explain
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE t range product_id,date,last date 5 NULL 69266 Using index condition; Using where; Using temporary; Using filesort
1 SIMPLE dpd eq_ref PRIMARY,IDlavorazione PRIMARY 4 db.t.product_id 1
1 SIMPLE ppp eq_ref PRIMARY PRIMARY 4 db.dpd.IDprodotto 1 Using where
1 SIMPLE cpp eq_ref PRIMARY PRIMARY 4 db.ppp.categoria 1 Using where
1 SIMPLE cppp eq_ref PRIMARY PRIMARY 4 func 1 Using where
1 SIMPLE o eq_ref PRIMARY PRIMARY 4 db.dpd.IDfattura 1 Using index
1 SIMPLE dp eq_ref PRIMARY,tipo,data_distinta PRIMARY 4 db.dpd.IDlavorazione 1 Using where
When you try to extend the time range on the tracking table t.date >= '2022-05-15 00:00:00' AND t.date <= '2022-07-26 23:59:59' the optimizer switches to a full table scan.
I've tried forcing the 'date' index on the tracking table but it still slows down with big ranges. The users are supposed to search without the time range too, so an index on the date isn't really the best option.
Update:
I've fixed the malformed query by aggregating the nonaggregated fields. The GROUP_CONCAT fields returned the same values and the ONLY_FULL_GROUP_BY setting was disabled, which is why it was working normally.
I'll be more specific about the problems I'm facing. When a user searches with a wider date range, for example from '2022-03-15 00:00:00' to '2022-07-26 23:59:59', the query takes 5 minutes 24 seconds to complete with the following explain:
Explain with extended date range:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE dp ref PRIMARY,tipo,data_distinta tipo 137 const 20400 Using index condition; Using where; Using temporary; Using filesort
1 SIMPLE dpd ref PRIMARY,IDlavorazione IDlavorazione 4 db.dp.IDlavorazione 115
1 SIMPLE ppp eq_ref PRIMARY PRIMARY 4 db.dpd.IDprodotto 1 Using where
1 SIMPLE cpp eq_ref PRIMARY PRIMARY 4 db.ppp.categoria 1 Using where
1 SIMPLE cppp eq_ref PRIMARY PRIMARY 4 func 1 Using where
1 SIMPLE o eq_ref PRIMARY PRIMARY 4 db.dpd.IDfattura 1 Using where; Using index
1 SIMPLE t ref product_id,date,last product_id 4 db.dpd.IDlavorazione_dett 1 Using where
Update 2:
Removing the ORDER BY clause removes 'Using temporary' and 'Using filesort' from the explain. I guess it's an index issue, even though the fields are covered by indexes. Even without the ORDER BY clause, the query still takes 5 minutes.
Your query appears to filter your tracking table on IN-lists on last and a date range on date. A multi-column index on those two columns, with the equality-matched column first and the range-matched column second may help you. And your query uses product_id from that table. So this covering index is worth a try.
ALTER TABLE tracking
ADD INDEX last_date (last, date, product_id);
You can look up covering indexes to learn more.
For starters, give these indexes a try:
dpd: INDEX(IDprodotto, IDcategoria, IDlavorazione_dett, IDlavorazione, IDfattura)
dpd: INDEX(IDlavorazione_dett, IDlavorazione, IDprodotto, IDcategoria, IDfattura)
dp: INDEX(tipo, opzione2, data_distinta, IDlavorazione)
dp: INDEX(IDlavorazione)
o: INDEX(IDdoc)
t: INDEX(date, last, product_id)
t: INDEX(product_id)
ppp: INDEX(IDprodotto, categoria)
categorie_pp: INDEX(IDcategoria)
categorie_pp: INDEX(codice, IDcategoria)
When adding a composite index, DROP any plain index(es) with the same leading columns. That is, when you have both INDEX(a) and INDEX(a,b), toss the former.
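As a minimal sketch of that rule applied to the tracking table: it assumes the existing index named date (the name shown in the key column of the EXPLAIN) covers only the date column, and the new index name date_last_product is hypothetical.

```sql
-- Sketch only: assumes `date` is a single-column index on tracking(date).
-- `date_last_product` is a made-up name for the new composite index.
ALTER TABLE tracking
  ADD INDEX date_last_product (date, last, product_id),  -- composite index from the list above
  DROP INDEX date;                                        -- same leading column, now redundant
```

Doing both changes in one ALTER TABLE avoids a window in which neither index exists. Check SHOW CREATE TABLE tracking first to confirm what the existing index actually contains.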
CAST(dpd.IDcategoria AS UNSIGNED) -- Perhaps you should change the datatype of IDcategoria so you can avoid the conversion here? (That might lead to better index usage.)
The main things slowing down the query (and making it difficult to optimize):
WHERE clause referencing multiple tables.
OR
NOT
GROUP BY after JOINing 7 tables
But I don't see how to improve on any of them.

MySQL Slow ORDER BY when done on JOIN value?

I have this query:
SELECT
c.*,
cv.views
FROM
content AS c
JOIN
content_views AS cv ON cv.content = c.record_num
WHERE
c.enabled = 1
ORDER BY
cv.views
Quite simple, but it's really slow... Is there a way to make it faster ?
This is my EXPLAIN:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE c ref enabled_2,enabled enabled 4 const 23947 Using temporary; Using filesort
1 SIMPLE cv eq_ref PRIMARY PRIMARY 4 c.record_num 1
EDIT 2016-02-24
Please note that usually, I use a LIMIT so the number of records returned in the EXPLAIN isn't entirely accurate, however for the sake of simplicity and because the performance doesn't change with the LIMIT or without it, I have removed it.
As requested in the comments, this is the result of my SHOW CREATE TABLE. As you can see, one of my tables is MyISAM while the other is InnoDB.
CREATE TABLE `content` (
`title` varchar(255) NOT NULL DEFAULT '',
`filename` varchar(255) NOT NULL DEFAULT '',
`filename_2` varchar(255) NOT NULL,
`filename_3` varchar(255) NOT NULL,
`orig_filename` varchar(255) NOT NULL,
`trailer_filename` varchar(255) NOT NULL,
`thumbnail` varchar(255) NOT NULL DEFAULT '',
`embed` text NOT NULL,
`description` text NOT NULL,
`paysite` int(11) NOT NULL DEFAULT '0',
`keywords` varchar(255) NOT NULL,
`model` varchar(255) NOT NULL DEFAULT '',
`scheduled_date` date NOT NULL DEFAULT '0000-00-00',
`date_added` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`encoded_date` datetime NOT NULL,
`rating` int(5) NOT NULL DEFAULT '0',
`length` int(11) NOT NULL DEFAULT '0',
`submitter` int(11) NOT NULL DEFAULT '0',
`ip` varchar(15) NOT NULL,
`approved` int(11) NOT NULL DEFAULT '0',
`hotlinked` varchar(1024) NOT NULL,
`plug_url` varchar(255) NOT NULL,
`enabled` int(11) NOT NULL DEFAULT '0',
`main_thumb` int(11) NOT NULL DEFAULT '3',
`xml` varchar(32) NOT NULL,
`photos` int(11) NOT NULL DEFAULT '0',
`mobile` varchar(255) NOT NULL,
`modeltmp` varchar(255) NOT NULL,
`movie_width` int(11) NOT NULL,
`movie_height` int(11) NOT NULL,
`token` varchar(255) DEFAULT NULL,
`source_thumb_url` varchar(255) NOT NULL,
`related` varchar(1024) NOT NULL,
`force_related` varchar(255) NOT NULL,
`record_num` int(11) NOT NULL AUTO_INCREMENT,
`webvtt_src` text NOT NULL,
`category_thumb` int(11) NOT NULL,
`related_date` date NOT NULL,
`publish_ready` tinyint(1) NOT NULL,
PRIMARY KEY (`record_num`),
KEY `encoded_date` (`encoded_date`,`photos`,`enabled`),
KEY `filename` (`filename`),
KEY `scheduled_date` (`scheduled_date`),
KEY `enabled_2` (`enabled`,`length`,`photos`),
KEY `enabled` (`enabled`,`encoded_date`,`photos`),
KEY `rating` (`rating`,`enabled`,`photos`),
KEY `token` (`token`),
KEY `submitter` (`submitter`),
FULLTEXT KEY `keywords` (`keywords`,`title`),
FULLTEXT KEY `title` (`title`),
FULLTEXT KEY `description` (`description`),
FULLTEXT KEY `keywords_2` (`keywords`)
) ENGINE=MyISAM AUTO_INCREMENT=124207 DEFAULT CHARSET=latin1
CREATE TABLE `content_views` (
`views` int(11) NOT NULL,
`content` int(11) NOT NULL,
PRIMARY KEY (`content`),
KEY `views` (`views`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
For this query:
SELECT c.*, cv.views
FROM content c JOIN
content_views cv
ON cv.content = c.record_num
WHERE c.enabled = 1
ORDER BY cv.views;
The best indexes are probably content(enabled, record_num) and content_views(content, views). I am guessing that the performance even with these indexes will be similar to what you have now.
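A sketch of how those two indexes could be created; the index names are hypothetical, and as noted above the gain may be small.

```sql
-- Index names below are made up for illustration.
ALTER TABLE content
  ADD INDEX enabled_record_num (enabled, record_num);

ALTER TABLE content_views
  ADD INDEX content_views_cover (content, views);
```

Re-run the EXPLAIN afterwards to see whether the optimizer actually picks them up.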

mysql Query optimization task

Below are the query and the tables it is being run on:
SELECT * FROM
tfl_acquistions a,
tfl_property_attributes b WHERE
a.id = b.property_id AND
attribute_id ='111' AND
a.id ='53a8288c03a6823';
Table tfl_acquistions
CREATE TABLE `tfl_acquistions` (
`id` VARCHAR(32) NOT NULL DEFAULT '',
`address` VARCHAR(100) NOT NULL DEFAULT '',
`city` VARCHAR(50) NOT NULL DEFAULT '',
`state` VARCHAR(10) NOT NULL DEFAULT '',
`zip` VARCHAR(10) NOT NULL DEFAULT '',
`county` VARCHAR(50) NOT NULL DEFAULT '',
`country` VARCHAR(50) NOT NULL DEFAULT '',
`status` ENUM('Y','N') NOT NULL DEFAULT 'Y',
`customer_case` VARCHAR(25) NOT NULL DEFAULT '',
`circle_id` INT(11) NOT NULL DEFAULT '0',
`visneta_id` VARCHAR(45) NOT NULL DEFAULT '',
`add_date` DATE NOT NULL DEFAULT '0000-00-00',
`apt_no` VARCHAR(10) NOT NULL DEFAULT '',
`profile_picture` VARCHAR(256) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
INDEX `address` (`address`),
INDEX `city` (`city`),
INDEX `state` (`state`),
INDEX `zip` (`zip`),
INDEX `status` (`status`),
INDEX `customer_case` (`customer_case`),
INDEX `circle_id` (`circle_id`),
INDEX `visneta_id` (`visneta_id`)
)
Table tfl_property_attributes
CREATE TABLE `tfl_property_attributes` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`property_id` VARCHAR(32) NOT NULL,
`attribute_id` INT(11) NOT NULL DEFAULT '0',
`value` VARCHAR(500) NOT NULL DEFAULT '',
`update_date` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`update_by` INT(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
INDEX `attribute_id` (`attribute_id`),
INDEX `property_id` (`property_id`),
INDEX `property_id_2` (`property_id`, `attribute_id`)
)
I have been given the task of optimizing this query and I am new to this. Any help is appreciated.
Try:
SELECT * FROM
tfl_acquistions a JOIN
tfl_property_attributes b ON a.id = b.property_id WHERE
b.property_id = '53a8288c03a6823' AND b.attribute_id = '111';
This way MySQL will be able to use the index property_id_2 (property_id, attribute_id) you have created on the second table. Currently, it can't use any indexes.
Try putting EXPLAIN keyword in front of queries to see how MySQL plans to perform them, you'll see that your previous query does not use any index.
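For example, the two forms can be compared side by side like this (a sketch using only the queries shown above; the plans you get depend on your MySQL version and data, so check the key column in each result rather than assuming they differ):

```sql
-- Original comma-join form
EXPLAIN
SELECT * FROM
tfl_acquistions a,
tfl_property_attributes b WHERE
a.id = b.property_id AND
attribute_id = '111' AND
a.id = '53a8288c03a6823';

-- Rewritten form filtering on the second table's columns
EXPLAIN
SELECT * FROM
tfl_acquistions a JOIN
tfl_property_attributes b ON a.id = b.property_id WHERE
b.property_id = '53a8288c03a6823' AND b.attribute_id = '111';
```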

Am I wrong in table design or wrong in selected index when made the table?

I've built a web application as a tool to eliminate unnecessary data in the peoples table; its main job is to filter the records of people who are eligible to vote. It wasn't a problem while the main table had only a few rows, but it is really slow (6 seconds) now that the table holds about 200K rows, and it will get much worse because the table will grow to 6 million rows.
I have table design like below, and I am doing a join with 4 tables (region table start from province, city, district and town). Each region table is related to each other with their own id:
CREATE TABLE `peoples` (
`id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
`id_prov` smallint(2) NOT NULL,
`id_city` smallint(2) NOT NULL,
`id_district` smallint(2) NOT NULL,
`id_town` smallint(4) NOT NULL,
`tps` smallint(4) NOT NULL,
`urut_xls` varchar(20) NOT NULL,
`nik` varchar(20) NOT NULL,
`name` varchar(60) NOT NULL,
`place_of_birth` varchar(60) NOT NULL,
`birth_date` varchar(30) NOT NULL,
`age` tinyint(3) NOT NULL DEFAULT '0',
`sex` varchar(20) NOT NULL,
`marital_s` varchar(20) NOT NULL,
`address` varchar(160) NOT NULL,
`note` varchar(60) NOT NULL,
`m_name` tinyint(1) NOT NULL DEFAULT '0',
`m_birthdate` tinyint(1) NOT NULL DEFAULT '0',
`format_birthdate` tinyint(1) NOT NULL DEFAULT '0',
`m_sex` tinyint(1) NOT NULL DEFAULT '0',
`m_m_status` tinyint(1) NOT NULL DEFAULT '0',
`sex_double` tinyint(1) NOT NULL DEFAULT '0',
`id_import` bigint(10) NOT NULL,
`id_workspace` tinyint(4) unsigned NOT NULL DEFAULT '0',
`stat_valid` smallint(1) NOT NULL DEFAULT '0' ,
`add_manual` tinyint(1) unsigned NOT NULL DEFAULT '0' ,
`insert_by` varchar(12) NOT NULL,
`update_by` varchar(12) DEFAULT NULL,
`mark_as_duplicate` smallint(1) NOT NULL DEFAULT '0' ,
`mark_as_trash` smallint(1) NOT NULL DEFAULT '0' ,
`in_date_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `ind_import` (`id_import`),
KEY `ind_duplicate` (`mark_as_duplicate`),
KEY `id_workspace` (`id_workspace`),
KEY `tambah_manual` (`add_manual`),
KEY `il` (`stat_valid`,`mark_as_trash`,`in_date_time`),
KEY `region` (`id_prov`,`id_city`,`id_district`,`id_town`,`tps`),
KEY `name` (`name`),
KEY `place_of_birth` (`place_of_birth`),
KEY `ind_birth` (`birth_date`(10)),
KEY `ind_sex` (`sex`(2))
) ENGINE=MyISAM AUTO_INCREMENT=1 DEFAULT CHARSET=latin1;
town:
CREATE TABLE `town` (
`id` smallint(4) NOT NULL,
`id_district` smallint(2) NOT NULL,
`id_city` smallint(2) NOT NULL,
`id_prov` smallint(2) NOT NULL,
`name_town` varchar(60) NOT NULL,
`handprint` blob,
`pps_1` varchar(60) DEFAULT NULL,
`pps_2` varchar(60) DEFAULT NULL,
`pps_3` varchar(60) DEFAULT NULL,
`tpscount` smallint(2) DEFAULT NULL,
`pps_4` varchar(60) DEFAULT NULL,
`pps_5` varchar(60) DEFAULT NULL,
PRIMARY KEY (`id_prov`,`id_city`,`id_district`,`id`),
KEY `name_town` (`name_town`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
and the query looks like this:
SELECT `E`.`id`, `E`.`id_prov`, `E`.`id_city`, `E`.`id_district`, `E`.`id_town`,
`B`.`name_prov`,`C`.`name_city`,`D`.`name_district`, `A`.`name_town`,
`E`.`tps`, `E`.`urut_xls`, `E`.`nik`,`E`.`name`,`E`.`place_of_birth`,
`E`.`birth_date`, `E`.age, `E`.`sex`, `E`.`marital_s`, `E`.`address`,
`E`.`note`
FROM peoples E
JOIN test_prov B ON E.id_prov = B.id
JOIN test_city C ON E.id_city = C.id
AND (C.id_prov=B.id)
JOIN test_district D ON E.id_district = D.id
AND ((D.id_city = C.id) AND (D.id_prov= B.id))
JOIN test_town A ON E.id_town = A.id
AND ((A.id_district = D.id)
AND (A.id_city = C.id)
AND (A.id_prov = B.id))
AND E.stat_valid=1
AND E.mark_as_trash=0
mark_as_trash is a flag column that only contains 1 or 0 and indicates whether the record has been marked as deleted. stat_valid holds the filter result: if the value is 1, the person is eligible to vote.
I've looked at the EXPLAIN output, but no index is used for the lookup. I believe that is why the application is so slow at 200K rows. The query above only shows two conditions, but the application can also filter by name, place of birth, birth date, age ranges and so on.
How can I make this perform better?
Can a city be in two provinces? If not then why do you check C.id_prov=B.id if E.id_city = C.id should give you just one row?
Also it seems that your query is slow because you're selecting 200k rows. Indexes will improve performance but do you really need all the rows at once? You should use pagination (limit, offset).
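A minimal sketch of such a paginated query, assuming a page size of 50 and ordering by the primary key (both are illustrative choices, not taken from the question; the column list is shortened):

```sql
-- Page size (50) and ORDER BY column are assumptions for illustration.
SELECT E.id, E.nik, E.name, E.place_of_birth, E.birth_date
FROM peoples E
WHERE E.stat_valid = 1
  AND E.mark_as_trash = 0
ORDER BY E.id
LIMIT 50 OFFSET 0;   -- next page: OFFSET 50, then OFFSET 100, ...
```

An explicit ORDER BY keeps the pages stable between requests. The joins to the region tables can be added back once the page of rows is small.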

MySql query optimization help

I have a few queries and am not able to figure out how to optimize them:
QUERY 1
select *
from t_twitter_tracking
where classified is null and tweetType='ENGLISH'
order by id limit 500;
QUERY 2
Select
count(*) as cnt,
DATE_FORMAT(CONVERT_TZ(wrdTrk.createdOnGMTDate,'+00:00','+05:30'),'%Y-%m-%d')
as dat
from
t_twitter_tracking wrdTrk
where
wrdTrk.word like ('dell')
and CONVERT_TZ(wrdTrk.createdOnGMTDate,'+00:00','+05:30')
between '2010-12-12 00:00:00' and '2010-12-26 00:00:00'
group by dat;
Both these queries run on the same table,
CREATE TABLE `t_twitter_tracking` (
`id` BIGINT(20) NOT NULL AUTO_INCREMENT,
`word` VARCHAR(200) NOT NULL,
`tweetId` BIGINT(100) NOT NULL,
`twtText` VARCHAR(800) NULL DEFAULT NULL,
`language` TEXT NULL,
`links` TEXT NULL,
`tweetType` VARCHAR(20) NULL DEFAULT NULL,
`source` TEXT NULL,
`sourceStripped` TEXT NULL,
`isTruncated` VARCHAR(40) NULL DEFAULT NULL,
`inReplyToStatusId` BIGINT(30) NULL DEFAULT NULL,
`inReplyToUserId` INT(11) NULL DEFAULT NULL,
`rtUsrProfilePicUrl` TEXT NULL,
`isFavorited` VARCHAR(40) NULL DEFAULT NULL,
`inReplyToScreenName` VARCHAR(40) NULL DEFAULT NULL,
`latitude` BIGINT(100) NOT NULL,
`longitude` BIGINT(100) NOT NULL,
`retweetedStatus` VARCHAR(40) NULL DEFAULT NULL,
`statusInReplyToStatusId` BIGINT(100) NOT NULL,
`statusInReplyToUserId` BIGINT(100) NOT NULL,
`statusFavorited` VARCHAR(40) NULL DEFAULT NULL,
`statusInReplyToScreenName` TEXT NULL,
`screenName` TEXT NULL,
`profilePicUrl` TEXT NULL,
`twitterId` BIGINT(100) NOT NULL,
`name` TEXT NULL,
`location` VARCHAR(100) NULL DEFAULT NULL,
`bio` TEXT NULL,
`url` TEXT NULL COLLATE 'latin1_swedish_ci',
`utcOffset` INT(11) NULL DEFAULT NULL,
`timeZone` VARCHAR(100) NULL DEFAULT NULL,
`frenCnt` BIGINT(20) NULL DEFAULT '0',
`createdAt` DATETIME NULL DEFAULT NULL,
`createdOnGMT` VARCHAR(40) NULL DEFAULT NULL,
`createdOnServerTime` DATETIME NULL DEFAULT NULL,
`follCnt` BIGINT(20) NULL DEFAULT '0',
`favCnt` BIGINT(20) NULL DEFAULT '0',
`totStatusCnt` BIGINT(20) NULL DEFAULT NULL,
`usrCrtDate` VARCHAR(200) NULL DEFAULT NULL,
`humanSentiment` VARCHAR(30) NULL DEFAULT NULL,
`replied` BIT(1) NULL DEFAULT NULL,
`replyMsg` TEXT NULL,
`classified` INT(32) NULL DEFAULT NULL,
`createdOnGMTDate` DATETIME NULL DEFAULT NULL,
`locationDetail` TEXT NULL,
`geonameid` INT(11) NULL DEFAULT NULL,
`country` VARCHAR(255) NULL DEFAULT NULL,
`continent` CHAR(2) NULL DEFAULT NULL,
`placeLongitude` FLOAT NULL DEFAULT NULL,
`placeLatitude` FLOAT NULL DEFAULT NULL,
PRIMARY KEY (`id`),
INDEX `id` (`id`, `word`),
INDEX `createdOnGMT_index` (`createdOnGMT`) USING BTREE,
INDEX `word_index` (`word`) USING BTREE,
INDEX `location_index` (`location`) USING BTREE,
INDEX `classified_index` (`classified`) USING BTREE,
INDEX `tweetType_index` (`tweetType`) USING BTREE,
INDEX `getunclassified_index` (`classified`, `tweetType`) USING BTREE,
INDEX `timeline_index` (`word`, `createdOnGMTDate`, `classified`) USING BTREE,
INDEX `createdOnGMTDate_index` (`createdOnGMTDate`) USING BTREE,
INDEX `locdetail_index` (`country`, `id`) USING BTREE,
FULLTEXT INDEX `twtText_index` (`twtText`)
)
COLLATE='utf8_general_ci'
ENGINE=MyISAM
ROW_FORMAT=DEFAULT
AUTO_INCREMENT=12608048;
The table has more than 10 million records. How can I optimize it?
EDITED
Explain on 2nd query
"id";"select_type";"table";"type";"possible_keys";"key";"key_len";"ref";"rows";"Extra"
"1";"SIMPLE";"wrdTrk";"range";"word_index,word_createdOnGMT";"word_index";"602";NULL;"222847";"Using where; Using temporary; Using filesort"
Regards,
Rohit
In Query2, I suggest that:
1. Remove DATE_FORMAT and CONVERT_TZ. You can do the formatting and the time zone conversion in PHP instead, both for the output and for the bounds of the BETWEEN condition.
2. like ('dell'): I don't see any '%', so you can use wrdTrk.word = 'dell' to make it faster.
The convert_tz in the where condition needs to be removed,
Select
count(*) as cnt,
DATE_FORMAT(CONVERT_TZ(wrdTrk.createdOnGMTDate,'+00:00','+05:30'),'%Y-%m-%d')
as dat
from
t_twitter_tracking wrdTrk
where
wrdTrk.word like ('dell')
and CONVERT_TZ(wrdTrk.createdOnGMTDate,'+00:00','+05:30')
between '2010-12-12 00:00:00' and '2010-12-26 00:00:00'
group by dat;
With CONVERT_TZ applied to the column, MySQL has to convert and compare every row to find the right result. Instead, just pass already-converted bounds to the query, which should improve the query time tremendously. Here the +05:30 bounds are converted to GMT:
Select
count(*) as cnt,
DATE_FORMAT(CONVERT_TZ(wrdTrk.createdOnGMTDate,'+00:00','+05:30'),'%Y-%m-%d')
as dat
from
t_twitter_tracking wrdTrk
where
wrdTrk.word like ('dell')
and wrdTrk.createdOnGMTDate
between '2010-12-11 18:30:00' and '2010-12-25 18:30:00'
group by dat;