Help me optimize this query - MySQL

I have this query for an application that I am designing. There is a table of references, an authors table, and a reference_authors table. There is a subquery to return all authors for a given reference, which I then display formatted in PHP. The subquery and the main query run individually are both nice and speedy. However, as soon as the subquery is put into the main query, the whole thing takes over 120s to run. I would appreciate some fresh eyes on this one.
Thanks.
SELECT
rf.reference_id,
rf.reference_type_id,
rf.article_title,
rf.publication,
rf.annotation,
rf.publication_year,
(SELECT GROUP_CONCAT(a.author_name)
FROM authors_final AS a
INNER JOIN reference_authors AS ra2 ON ra2.author_id = a.author_id
WHERE ra2.reference_id = rf.reference_id
GROUP BY ra2.reference_id) AS authors
FROM
references_final AS rf
INNER JOIN reference_authors AS ra ON rf.reference_id = ra.reference_id
LEFT JOIN reference_institutes AS ri ON rf.reference_id = ri.reference_id;
Here is the fixed query. Thanks guys for the recommendations.
SELECT
rf.reference_id,
rf.reference_type_id,
rf.article_title,
rf.publication,
rf.annotation,
rf.publication_year,
GROUP_CONCAT(a.author_name) AS authors
FROM
references_final as rf
INNER JOIN (reference_authors AS ra INNER JOIN authors_final AS a ON ra.author_id = a.author_id)
ON rf.reference_id = ra.reference_id
LEFT JOIN reference_institutes AS ri ON rf.reference_id = ri.reference_id
GROUP BY rf.reference_id

Although not every subquery can be rewritten as an inner join, I think yours can.
From 120 seconds to 78 milliseconds is not a bad improvement--about three orders of magnitude. Take the rest of the day off.
When you come back tomorrow, start looking for other subqueries in your source code.

You say the subquery is nice and speedy in isolation, but it's now obviously running once per row of the outer query: 100 rows = 100 subqueries.
Assuming you have indexes on all your foreign keys, that's as good as it gets as a subquery.
One option is to left join the authors and accept the resulting Cartesian product: you'll have a lot more rows returned and will need some code to get to the same end result, but it will put less strain on the DB and will run quicker.
If you've got paging on and are returning, say, 10 rows, issuing 10 individual calls to get the authors in isolation would also be pretty quick.
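As a minimal sketch, the foreign-key indexes assumed above might look like this (index names are illustrative; authors_final.author_id is presumably already its primary key and needs no extra index):
ALTER TABLE reference_authors
    ADD INDEX idx_ra_reference (reference_id),
    ADD INDEX idx_ra_author (author_id);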

Related

Is there any difference, performance-wise, between these two queries? (Repeating the WHERE clause inside the subquery) MySQL

I have a query that goes something like this.
Select *
FROM FaultCode FC
JOIN (
SELECT INNER_E.* FROM Equipment INNER_E
) E USING(EquipmentID)
LEFT JOIN AssetType AT ON AT.id_asset_type = E.id_asset_type AND AT.id_language = 'en-us'
LEFT JOIN Project P ON E.current_id_project = P.id_project
WHERE E.id_organization = 100057 AND E.equipment_status = 'ACTIVE'
AND FC.code_status = 'OPEN'
As you can see, there is a WHERE clause in the outer main query.
But also, on the inside, we have an inner join on the derived table SELECT INNER_E.* FROM Equipment INNER_E. This inner join means we only retrieve the fault codes that match rows in the equipment table (correct me if I'm wrong).
I am trying to optimize this query.
My question is, does it make any difference to do this
Select *
FROM FaultCode FC
JOIN (
SELECT INNER_E.* FROM Equipment INNER_E
WHERE INNER_E.id_organization = 100057 AND INNER_E.equipment_status = 'ACTIVE'
) E USING(EquipmentID)
LEFT JOIN AssetType AT ON AT.id_asset_type = E.id_asset_type AND AT.id_language = 'en-us'
LEFT JOIN Project P ON E.current_id_project = P.id_project
WHERE E.id_organization = 100057 AND E.equipment_status = 'ACTIVE'
AND FC.code_status = 'OPEN'
That is, repeating the WHERE clause inside the inner subquery, to further limit it before it joins. Or does the optimizer know to do this automatically?
I tried implementing that change, and strangely enough it seemed to only make my query slower. Is there any way I can optimize the query above, or, since it's pretty simple, is that the best it's going to get without indexes?
I tried running the EXPLAIN SELECT statement, but I have a hard time parsing what it's telling me. Are there any good resources I can look into to learn some tips or techniques for optimizing my query?
I don't have any aggregate functions in my Select fields. So is the only real answer Indexes?
Why is the first subquery needed? Perhaps simply
Select *
FROM FaultCode FC
JOIN Equipment AS E USING(EquipmentID)
LEFT JOIN AssetType AT ON AT.id_asset_type = E.id_asset_type
AND AT.id_language = 'en-us'
LEFT JOIN Project P ON E.current_id_project = P.id_project
WHERE E.id_organization = 100057
AND E.equipment_status = 'ACTIVE'
AND FC.code_status = 'OPEN';
Likely Indexes:
FC: INDEX(code_status, EquipmentID)
E: INDEX(id_organization, equipment_status, EquipmentID)
Probably unwise to do SELECT * -- It will give you all the columns of all 4 tables. (Without further details, I cannot suggest any "covering" indexes, which seems likely for AT.)
With my version of the query, your question about repeating the WHERE vanishes. With your version, it is likely to help. I don't think the Optimizer is smart enough to catch on to what you are doing.
Show us the EXPLAINs. We can help some with what the cryptic stuff is saying. (And what it is not saying.)
"the best it's going to get without indexes" -- Are you saying you have no indexes??! Not even a PRIMARY KEY for each table? "So is the only real answer Indexes?" Every time you write a query against a non-tiny table, you should ask "do the table(s) have adequate indexes for this query?"

SQL join taking a lot of time

I am trying to execute this query, but it is taking more than 5 hours even though the database is just 20 MB. This is my code. Here I am joining 11 tables on reg_id. I need all columns with distinct values. Please guide me on how to rearrange the query.
SELECT *
FROM degree
JOIN diploma
ON degree.reg_id = diploma.reg_id
JOIN further_studies
ON diploma.reg_id = further_studies.reg_id
JOIN iti
ON further_studies.reg_id = iti.reg_id
JOIN personal_info
ON iti.reg_id = personal_info.reg_id
JOIN postgraduation
ON personal_info.reg_id = postgraduation.reg_id
JOIN puc
ON postgraduation.reg_id = puc.reg_id
JOIN skills
ON puc.reg_id = skills.reg_id
JOIN sslc
ON skills.reg_id = sslc.reg_id
JOIN license
ON sslc.reg_id = license.reg_id
JOIN passport
ON license.reg_id = passport.reg_id
GROUP BY fullname
Please help me if I made any mistakes.
This is a bit long for a comment.
The first problem with your query is that you are using select * with group by fullname. You have zillions of columns in the select that are not in the group by. Unless you really, really, really know what you are doing (which I doubt), this is the wrong way to write a query.
Your performance problem is undoubtedly due to Cartesian products and a lack of indexes. You are joining across different dimensions -- such as skills and degrees. The result is a product of all the possibilities, so the result set can grow multiplicatively as the data grows.
And then, the question is: do you have indexes on the keys used in the joins? For performance, you generally want such indexes.
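As a hedged sketch of avoiding that product, collapse each dimension table to one row per reg_id before joining (assuming fullname lives in personal_info; a column name like skill_name is an assumption, since the schema wasn't posted):
SELECT p.reg_id, p.fullname, sk.skill_list
FROM personal_info AS p
LEFT JOIN (
    SELECT reg_id, GROUP_CONCAT(skill_name) AS skill_list -- skill_name is assumed
    FROM skills
    GROUP BY reg_id
) AS sk ON sk.reg_id = p.reg_id;
Repeat the same pattern for degree, diploma, and the other dimension tables rather than joining them row against row.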
I think the problem is in the query. First, make sure the GROUP BY fullname is what you want, and try naming some columns instead of using *.

SELECT DISTINCT statement in MySQL is taking 10 minutes

I'm reasonably new to MySQL and I'm trying to select a distinct set of rows using this statement:
SELECT DISTINCT sp.atcoCode, sp.name, sp.longitude, sp.latitude
FROM `transportdata`.stoppoints as sp
INNER JOIN `vehicledata`.gtfsstop_times as st ON sp.atcoCode = st.fk_atco_code
INNER JOIN `vehicledata`.gtfstrips as trip ON st.trip_id = trip.trip_id
INNER JOIN `vehicledata`.gtfsroutes as route ON trip.route_id = route.route_id
INNER JOIN `vehicledata`.gtfsagencys as agency ON route.agency_id = agency.agency_id
WHERE agency.agency_id IN (1,2,3,4);
However, the select statement is taking around 10 minutes, so something is clearly afoot.
One significant factor is that the table gtfsstop_times is huge. (~250 million records)
Indexes seem to be set up properly; all the above joins are using indexed columns. Table sizes are, roughly:
gtfsagencys - 4 rows
gtfsroutes - 56,000 rows
gtfstrips - 5,500,000 rows
gtfsstop_times - 250,000,000 rows
`transportdata`.stoppoints - 400,000 rows
The server has 22 GB of memory, I've set the InnoDB buffer pool to 8 GB, and I'm using MySQL 5.6.
Can anybody see a way of making this run faster? Or indeed, at all!
Does it matter that the stoppoints table is in a different schema?
EDIT:
EXPLAIN SELECT... returns this: [EXPLAIN output was posted as an image and is not reproduced here]
It looks like you are trying to find a collection of stop points, based on certain criteria. And, you're using SELECT DISTINCT to avoid duplicate stop points. Is that right?
It looks like atcoCode is a unique key for your stoppoints table. Is that right?
If so, try this:
SELECT sp.name, sp.longitude, sp.latitude, sp.atcoCode
FROM `transportdata`.stoppoints AS sp
JOIN (
SELECT DISTINCT st.fk_atco_code AS atcoCode
FROM `vehicledata`.gtfsroutes AS route
JOIN `vehicledata`.gtfstrips AS trip ON trip.route_id = route.route_id
JOIN `vehicledata`.gtfsstop_times AS st ON trip.trip_id = st.trip_id
WHERE route.agency_id BETWEEN 1 AND 4
) ids ON sp.atcoCode = ids.atcoCode
This does a few things: It eliminates a table (agency) which you don't seem to need. It changes the search on agency_id from IN(a,b,c) to a range search, which may or may not help. And finally it relocates the DISTINCT processing from a situation where it has to handle a whole ton of data to a subquery situation where it only has to handle the ID values.
(JOIN and INNER JOIN are the same. I used JOIN to make the query a bit easier to read.)
This should speed you up a bit. But, it has to be said, a quarter gigarow table is a big table.
Having 250M records, I would shard the gtfsstop_times table on one column. Then each sharded table can be joined in a separate query that runs in parallel in its own thread; you'll only need to merge the result sets.
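A sketch of what that might look like, assuming gtfsstop_times were split into four shard tables by trip_id (the shard table names are invented for illustration):
SELECT DISTINCT st.fk_atco_code
FROM `vehicledata`.gtfsstop_times_0 AS st -- one of four hypothetical shards
INNER JOIN `vehicledata`.gtfstrips AS trip ON st.trip_id = trip.trip_id
INNER JOIN `vehicledata`.gtfsroutes AS route ON trip.route_id = route.route_id
WHERE route.agency_id IN (1,2,3,4);
Run the same statement against gtfsstop_times_1 through _3 in parallel connections, then union and deduplicate the fk_atco_code sets in the application.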
The trick is to reduce how many rows of gtfsstop_times MySQL has to evaluate. In this case it first evaluates every row in the inner join of gtfsstop_times and transportdata.stoppoints, right? How many rows does transportdata.stoppoints have? Then it evaluates the WHERE clause, and then DISTINCT. How does it do DISTINCT? By looking at every single row multiple times to determine whether there are other rows like it. That would take forever, right?
GROUP BY, however, quickly squishes all the matching rows together without evaluating each one individually. I normally use joins to quickly reduce the number of rows the query needs to evaluate, and then I look at my grouping.
In this case you want to replace DISTINCT with grouping.
Try this:
SELECT sp.name, sp.longitude, sp.latitude, sp.atcoCode
FROM `transportdata`.stoppoints as sp
INNER JOIN `vehicledata`.gtfsstop_times as st ON sp.atcoCode = st.fk_atco_code
INNER JOIN `vehicledata`.gtfstrips as trip ON st.trip_id = trip.trip_id
INNER JOIN `vehicledata`.gtfsroutes as route ON trip.route_id = route.route_id
INNER JOIN `vehicledata`.gtfsagencys as agency ON route.agency_id = agency.agency_id
WHERE agency.agency_id IN (1,2,3,4)
GROUP BY sp.name
, sp.longitude
, sp.latitude
, sp.atcoCode
There are other valuable answers to your question; mine is an addition to them. I assume sp.atcoCode and st.fk_atco_code are indexed columns in their tables.
If you can validate up front that the agency ids in the WHERE clause are the ones you want, you can eliminate the join to `vehicledata`.gtfsagencys entirely, as you are not fetching any records from that table.
SELECT DISTINCT sp.atcoCode, sp.name, sp.longitude, sp.latitude
FROM `transportdata`.stoppoints as sp
INNER JOIN `vehicledata`.gtfsstop_times as st ON sp.atcoCode = st.fk_atco_code
INNER JOIN `vehicledata`.gtfstrips as trip ON st.trip_id = trip.trip_id
INNER JOIN `vehicledata`.gtfsroutes as route ON trip.route_id = route.route_id
WHERE route.agency_id IN (1,2,3,4);

Need help speeding up a MySQL query

I need a query that quickly shows the articles within a particular module (a subset of articles) that a user has NOT uploaded a PDF for. The query I am using below takes about 37 seconds, given there are 300,000 articles in the Article table, and 6,000 articles in the Module.
SELECT *
FROM article a
INNER JOIN article_module_map amm ON amm.article=a.id
WHERE amm.module = 2 AND
a.id NOT IN (
SELECT afm.article
FROM article_file_map afm
INNER JOIN article_module_map amm ON amm.article = afm.article
WHERE afm.organization = 4 AND
amm.module = 2
)
What I am doing in the above query is first truncating the list of articles to the selected module, and then further truncating that list to the articles that are not in the subquery. The subquery is generating a list of the articles that an organization has already uploaded PDFs for. Hence, the end result is a list of articles that an organization has not yet uploaded PDFs for.
Help would be hugely appreciated, thanks in advance!
EDIT 2012/10/25
With #fthiella's help, the below query ran in an astonishing 1.02 seconds, down from 37+ seconds!
SELECT a.* FROM (
SELECT article.* FROM article
INNER JOIN article_module_map
ON article.id = article_module_map.article
WHERE article_module_map.module = 2
) AS a
LEFT JOIN article_file_map
ON a.id = article_file_map.article
AND article_file_map.organization=4
WHERE article_file_map.id IS NULL
I am not sure that I understand the logic and the structure of the tables correctly. This is my query:
SELECT
article.id
FROM
article
INNER JOIN
article_module_map
ON article.id = article_module_map.article
AND article_module_map.module=2
LEFT JOIN
article_file_map
ON article.id = article_file_map.article
AND article_file_map.organization=4
WHERE
article_file_map.id IS NULL
I extract all of the articles that have module 2, and then select those for which organization 4 didn't provide a file.
I used a LEFT JOIN instead of a subquery. In some circumstances this could be faster.
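As a side note, the same anti-join can also be written with NOT EXISTS, which MySQL often executes much like the LEFT JOIN ... IS NULL form; a minimal sketch:
SELECT a.*
FROM article a
INNER JOIN article_module_map amm ON amm.article = a.id AND amm.module = 2
WHERE NOT EXISTS (
    SELECT 1
    FROM article_file_map afm
    WHERE afm.article = a.id AND afm.organization = 4
);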
EDIT Thank you for your comment. I wasn't sure it would run faster, but it surprises me that it is so much slower! Anyway, it was worth a try!
Now, out of curiosity, I would like to try all the combinations of LEFT/INNER JOIN and subquery, to see which one runs faster, eg:
SELECT *
FROM
(SELECT *
FROM
article INNER JOIN article_module_map
ON article.id = article_module_map.article
WHERE
article_module_map.module=2) AS a
LEFT JOIN
etc.
maybe removing *, and I would like to see what changes between putting the condition in the WHERE clause versus the ON clause... anyway, I think it doesn't help much; you should concentrate on indexes now.
Indexes on keys/foreign keys should be okay already, but what if you add an index on article_module_map.module and/or article_file_map.organization?
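For instance (index names are illustrative), making those indexes composite so they also cover the join columns:
ALTER TABLE article_module_map ADD INDEX amm_module_article (module, article);
ALTER TABLE article_file_map ADD INDEX afm_org_article (organization, article);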
When optimizing queries, I usually check the following points:
First: I would avoid using * in the SELECT clause; instead, name the specific fields you want. This can increase speed dramatically (I had one query that took 7 seconds with *, and naming the fields brought it down to 0.1 s).
Second: As #Adder says, add indexes to your tables.
Third: Try using INNER JOIN instead of WHERE amm.module = 2 AND a.id NOT IN ( ... ). I think I read (I don't remember it well, so take it with a grain of salt) that MySQL usually optimizes INNER JOINs well, and since your subquery is just a filter, three INNER JOINs plus a WHERE might retrieve the rows faster.

Query's result set is too big

I have a query that can be fast or slow depending on how many records I'm fetching. Here's a table showing the number in my LIMIT clause and the corresponding time it takes to execute the query and fetch the results:
LIMIT | Seconds (Duration/Fetch)
------+-------------------------
10 | 0.030/ 0.0
100 | 0.062/ 0.0
1000 | 1.700/ 0.8
10000 | 25.000/100.0
As you can see, it's fine up to at least 1,000 but 10,000 is really slow, mostly due to a high fetch time. I don't understand why the growth of the fetch time isn't linear but I am grabbing over 200 columns from over 70 tables, so the fact that the result set takes a long time to fetch is not a surprise.
What I'm fetching, by the way, is data on all the accounts at a certain bank. The bank I'm dealing with has about 160,000 accounts so I ultimately need to fetch 160,000 rows from the database.
It's obviously not going to be feasible to try to fetch 160,000 rows at once (at least not unless I can somehow dramatically optimize my query). It seems to me that the biggest chunk I can reasonably grab is 1,000 rows, so I wrote a script that would run the query over and over with a SELECT INTO OUTFILE, limit and offset. Then, at the end, I take all the CSV files I dumped and cat them together. It works but it's slow. It takes hours. I have the script running right now and it's only dumped 43,000 rows in about an hour.
Should I attack this problem at the query optimization level or does the long fetch time suggest I should focus elsewhere? What would you recommend I do?
If you want to see the query you can see it here.
The answer is going to greatly depend on what you're doing with the data. Querying 215 columns through 29 joins will never be quick for non-trivial record sizes.
If you're trying to display 160,000 records to the user, you should page the results and only fetch one page at a time. This will keep the result set small enough that even a relatively inefficient query will return quickly. In this case, you will also want to examine just how much data the user needs in order to select or manipulate the data. Chances are good that you can pare it down to a handful of fields and some aggregates (count, sum, etc) that will let the user make an informed decision about which records they want to work with. Use LIMIT with an offset to pull single pages of arbitrary size.
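A minimal paging sketch, using columns that appear in your full query (the column list and page size are illustrative):
SELECT account.id, account.account_number, account.balance
FROM account
ORDER BY account.id -- a deterministic ORDER BY keeps the pages stable
LIMIT 1000 OFFSET 2000; -- page 3 at 1,000 rows per page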
If you need to export the data for reporting purposes, ensure that you are only pulling the exact data that the report needs. Eliminate joins where possible and use subqueries where you need an aggregate of child data. You'll want to tune/add indexes for the frequently used joins and criteria; in the case of your provided query, that means ib.id and the myriad of foreign keys you're joining through. You can leave off the boolean columns, because there are not enough distinct values to form a meaningful index.
Regardless of what you're trying to accomplish, removing some of the joins and columns will inherently speed up your processing. The amount of heavy lifting that MySQL needs to do to fill that query is your main stumbling block.
I've restructured your query to hopefully offer a significant performance improvement. Using STRAIGHT_JOIN tells MySQL to join in the order stated (or as adjusted here). The inner-most query, aliased "PreQuery", starts from your criteria on the import bundle and generic import, goes to the account import, and then to the account. By pre-applying the WHERE clause there (and, as you test, adding your LIMIT clause there too), you pre-join these tables and get them out of the way before wasting any time pulling in the customer, address, and other information. Within the query, I've adjusted the joins/left joins to better show the relationships of the underlying linked tables (primarily for anyone else reading along).
As another person noted, what I've done in the PreQuery could serve as the basis of a master list of Account.ID records used for paging. I would be curious how its performance compares to your existing query, especially in the 10,000-row limit range.
The PreQuery gets the unique elements (including the Account ID used downstream, plus bank, month, year, and category), so those tables don't have to be rejoined in the rest of the joining process.
SELECT STRAIGHT_JOIN
PreQuery.*,
customer.customer_number,
customer.name,
customer.has_bad_address,
address.line1,
address.line2,
address.city,
state.name,
address.zip,
po_box.line1,
po_box.line2,
po_box.city,
po_state.name,
po_box.zip,
customer.date_of_birth,
northway_account.cffna,
northway_account.cfinsc,
customer.deceased,
customer.social_security_number,
customer.has_internet_banking,
customer.safe_deposit_box,
account.has_bill_pay,
account.has_e_statement,
branch.number,
northway_product.code,
macatawa_product.code,
account.account_number,
account.available_line,
view_macatawa_atm_card.number,
view_macatawa_debit_card.number,
uc.code use_class,
account.open_date,
account.balance,
account.affinion,
northway_account.ytdsc,
northway_account.ytdodf,
northway_account.ytdnsf,
northway_account.rtckcy,
northway_account.rtckwy,
northway_account.odwvey,
northway_account.ytdscw,
northway_account.feeytd,
customer.do_not_mail,
northway_account.aledq1,
northway_account.aledq2,
northway_account.aledq3,
northway_account.aledq4,
northway_account.acolq1,
northway_account.acolq2,
northway_account.acolq3,
northway_account.acolq4,
o.officer_number,
northway_account.avg_bal_1,
northway_account.avg_bal_2,
northway_account.avg_bal_3,
account.maturity_date,
account.interest_rate,
northway_account.asslc,
northway_account.paidlc,
northway_account.lnuchg,
northway_account.ytdlc,
northway_account.extfee,
northway_account.penamt,
northway_account.cdytdwaive,
northway_account.cdterm,
northway_account.cdtcod,
account.date_of_last_statement,
northway_account.statement_cycle,
northway_account.cfna1,
northway_account.cfna2,
northway_account.cfna3,
northway_account.cfna4,
northway_account.cfcity,
northway_account.cfstate,
northway_account.cfzip,
northway_account.actype,
northway_account.sccode,
macatawa_account.account_type_code,
macatawa_account.account_type_code_description,
macatawa_account.advance_code,
macatawa_account.amount_last_advance,
macatawa_account.amount_last_payment,
macatawa_account.available_credit,
macatawa_account.balance_last_statement,
macatawa_account.billing_day,
macatawa_account.birthday_3,
macatawa_account.birthday_name_2,
macatawa_account.ceiling_rate,
macatawa_account.class_code,
macatawa_account.classified_doubtful,
macatawa_account.classified_loss,
macatawa_account.classified_special,
macatawa_account.classified_substandard,
macatawa_account.closed_account_flag,
macatawa_account.closing_balance,
macatawa_account.compounding_code,
macatawa_account.cost_center_full,
macatawa_account.cytd_aggregate_balance,
macatawa_account.cytd_amount_of_advances,
macatawa_account.cytd_amount_of_payments,
macatawa_account.cytd_average_balance,
macatawa_account.cytd_average_principal_balance,
macatawa_account.cytd_interest_paid,
macatawa_account.cytd_number_items_nsf,
macatawa_account.cytd_number_of_advanes,
macatawa_account.cytd_number_of_payments,
macatawa_account.cytd_number_times_od,
macatawa_account.cytd_other_charges,
macatawa_account.cytd_other_charges_waived,
macatawa_account.cytd_reporting_points,
macatawa_account.cytd_service_charge,
macatawa_account.cytd_service_charge_waived,
macatawa_account.date_closed,
macatawa_account.date_last_activity,
macatawa_account.date_last_advance,
macatawa_account.date_last_payment,
macatawa_account.date_paid_off,
macatawa_account.ddl_code,
macatawa_account.deposit_rate_index,
macatawa_account.employee_officer_director_full_desc,
macatawa_account.floor_rate,
macatawa_account.handling_code,
macatawa_account.how_paid_code,
macatawa_account.interest_frequency,
macatawa_account.ira_plan,
macatawa_account.load_rate_code,
macatawa_account.loan_rate_code,
macatawa_account.loan_rating_code,
macatawa_account.loan_rating_code_1_full_desc,
macatawa_account.loan_rating_code_2_full_desc,
macatawa_account.loan_rating_code_3_full_desc,
macatawa_account.loan_to_value_ratio,
macatawa_account.maximum_credit,
macatawa_account.miscellaneous_code_full_desc,
macatawa_account.months_to_maturity,
macatawa_account.msa_code,
macatawa_account.mtd_agg_available_balance,
macatawa_account.naics_code,
macatawa_account.name_2,
macatawa_account.name_3,
macatawa_account.name_line,
macatawa_account.name_line_2,
macatawa_account.name_line_3,
macatawa_account.name_line_1,
macatawa_account.net_payoff,
macatawa_account.opened_by_responsibility_code_full,
macatawa_account.original_issue_date,
macatawa_account.original_maturity_date,
macatawa_account.original_note_amount,
macatawa_account.original_note_date,
macatawa_account.original_prepaid_fees,
macatawa_account.participation_placed_code,
macatawa_account.participation_priority_code,
macatawa_account.pay_to_account,
macatawa_account.payment_code,
macatawa_account.payoff_principal_balance,
macatawa_account.percent_participated_code,
macatawa_account.pmtd_number_deposit_type_1,
macatawa_account.pmtd_number_deposit_type_2,
macatawa_account.pmtd_number_deposit_type_3,
macatawa_account.pmtd_number_type_1,
macatawa_account.pmtd_number_type_2,
macatawa_account.pmtd_number_type_6,
macatawa_account.pmtd_number_type_8,
macatawa_account.pmtd_number_type_9,
macatawa_account.principal,
macatawa_account.purpose_code,
macatawa_account.purpose_code_full_desc,
macatawa_account.pytd_number_of_items_nsf,
macatawa_account.pytd_number_of_times_od,
macatawa_account.rate_adjuster,
macatawa_account.rate_over_split,
macatawa_account.rate_under_split,
macatawa_account.renewal_code,
macatawa_account.renewal_date,
macatawa_account.responsibility_code_full,
macatawa_account.secured_unsecured_code,
macatawa_account.short_first_name_1,
macatawa_account.short_first_name_2,
macatawa_account.short_first_name_3,
macatawa_account.short_last_name_1,
macatawa_account.short_last_name_2,
macatawa_account.short_last_name_3,
macatawa_account.statement_cycle,
macatawa_account.statement_rate,
macatawa_account.status_code,
macatawa_account.tax_id_number_name_2,
macatawa_account.tax_id_number_name_3,
macatawa_account.teller_alert_1,
macatawa_account.teller_alert_2,
macatawa_account.teller_alert_3,
macatawa_account.term,
macatawa_account.term_code,
macatawa_account.times_past_due_01_29,
macatawa_account.times_past_due_01_to_29_days,
macatawa_account.times_past_due_30_59,
macatawa_account.times_past_due_30_to_59_days,
macatawa_account.times_past_due_60_89,
macatawa_account.times_past_due_60_to_89_days,
macatawa_account.times_past_due_over_90,
macatawa_account.times_past_due_over_90_days,
macatawa_account.tin_code_name_1,
macatawa_account.tin_code_name,
macatawa_account.tin_code_name_2,
macatawa_account.tin_code_name_3,
macatawa_account.total_amount_past_due,
macatawa_account.waiver_od_charge,
macatawa_account.waiver_od_charge_description,
macatawa_account.waiver_service_charge_code,
macatawa_account.waiver_transfer_advance_fee,
macatawa_account.short_first_name,
macatawa_account.short_last_name
FROM
( SELECT STRAIGHT_JOIN DISTINCT
b.name bank,
ib.YEAR,
ib.MONTH,
ip.category,
Account.ID
FROM import_bundle ib
JOIN generic_import gi ON ib.id = gi.import_bundle_id
JOIN account_import AI ON gi.id = ai.generic_import_id
JOIN Account ON AI.ID = account.account_import_id
JOIN import_profile ip ON gi.import_profile_id = ip.id
JOIN bank b ON ib.Bank_ID = b.id
WHERE
IB.ID = 95
AND IB.Active = 1
AND GI.Active = 1
LIMIT 1000 ) PreQuery
JOIN Account on PreQuery.ID = Account.ID
JOIN Customer on Account.Customer_ID = Customer.ID
JOIN Officer on Account.Officer_ID = Officer.ID
LEFT JOIN branch ON Account.branch_id = branch.id
LEFT JOIN cd_type ON account.cd_type_id = cd_type.id
LEFT JOIN use_class uc ON account.use_class_id = uc.id
LEFT JOIN account_type at ON account.account_type_id = at.id
LEFT JOIN northway_account ON account.id = northway_account.account_id
LEFT JOIN macatawa_account ON account.id = macatawa_account.account_id
LEFT JOIN view_macatawa_debit_card ON account.id = view_macatawa_debit_card.account_id
LEFT JOIN view_macatawa_atm_card ON account.id = view_macatawa_atm_card.account_id
LEFT JOIN original_address OA ON Account.ID = OA.account_id
JOIN Account_Address AA ON Account.ID = AA.account_id
JOIN address ON AA.address_id = address.id
JOIN state ON address.state_id = state.id
LEFT JOIN Account_po_box APB ON Account.ID = APB.account_id
LEFT JOIN address po_box ON APB.address_id = po_box.id
LEFT JOIN state po_state ON po_box.state_id = po_state.id
LEFT JOIN Account_macatawa_product amp ON account.id = amp.account_id
LEFT JOIN macatawa_product ON amp.macatawa_product_id = macatawa_product.id
LEFT JOIN product_type pt ON macatawa_product.product_type_id = pt.id
LEFT JOIN harte_hanks_service_category hhsc ON macatawa_product.harte_hanks_service_category_id = hhsc.id
LEFT JOIN core_file_type cft ON macatawa_product.core_file_type_id = cft.id
LEFT JOIN Account_northway_product anp ON account.id = anp.account_id
LEFT JOIN northway_product ON anp.northway_product_id = northway_product.id
The non-linear increase in fetch time is likely the result of key buffers filling up, and probably other memory-related issues as well. You should both optimize the query using EXPLAIN to maximize the use of indexes, and tune your MySQL server settings.
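As a minimal sketch of that advice (these are standard MySQL statements; appropriate values depend on your workload): prepend EXPLAIN to the restructured query and watch for an access type of ALL (a full scan) or "Using temporary; Using filesort" in the Extra column, then check whether the relevant buffers can hold the hot data:
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
SHOW VARIABLES LIKE 'key_buffer_size'; -- key buffers only matter for MyISAM tables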