Related
When I add the nested query for invCount, my query time goes from .03 sec to 14 sec. The query works and I get correct values, but it is very, very slow in comparison. Is that just because I have to many conditions in that query? When I take it out and still have the second nested query, the time is still .03 secs. There is clearly something about the first nested query the database doesn't like, but I am not seeing what it is. I have a foreign key set for all the inner join lines too. Any help or ideas would be appreciated.
SELECT a.*,
f.name,
f.partNumber,
f.showInAdminStore,
f.showInPublicStore,
f.productImage,
r.mastCatID,
(SELECT COUNT(b.inventoryID)
FROM storeInventory b
INNER JOIN events c ON c.eventID = b.eventID
WHERE b.pluID = a.pluID
AND b.listPrice = a.listPrice
AND b.unlimitedQty = a.unlimitedQty
AND (b.packageID = a.packageID OR (b.packageID IS NULL AND a.packageID IS NULL))
AND b.orderID IS NULL
AND c.isOpen = '1'
AND b.paymentTypeID <= '2'
AND (b.inCart < '$cartTime' OR b.inCart IS NULL) ) AS invCount,
(SELECT COUNT(x.inventoryID)
FROM storeInventory x
WHERE x.packageID = a.inventoryID) AS packageCount
FROM storeInventory a
INNER JOIN storePLUs f ON f.pluID = a.pluID
INNER JOIN storeCategories r ON r.catID = f.catID
INNER JOIN events d ON d.eventID = a.eventID
WHERE a.storeFrontID = '1'
AND a.orderID IS NULL
AND a.paymentTypeID <= '2'
AND d.isOpen = '1'
GROUP BY a.packageID, a.unlimitedQty, a.listPrice, a.pluID
Table from query output
UPDATE: 12/12/2022
I changed the line checking the packageID to "AND (b.packageID <=> a.packageID)" as suggested and that cut my query time down to 7.8 seconds from 14 seconds. Thanks for the pointer. I will definitely use that in the future for NULL comparisons.
using "count(*)" took about half a second off. When I take the first nested query out, it drops down to .05 seconds even with the other nested queries in there, so I feel like there is still something causing issues. I tried running it without the other "AND (b.inCart < '$cartTime' OR b.inCart IS NULL)" line and that did take about a second off, but no where what I was hoping for. Is there an operand that includes NULL on a less than comparison? I also tried running it without the inner join in the nested query and that didn't change much at all. Of course removing any of that, throughs the values off and they become incorrect, so I can't run it that way.
Here is my current query setup that still pulls correct values.
SELECT a.*,
f.name,
f.partNumber,
f.showInAdminStore,
f.showInPublicStore,
f.productImage,
r.mastCatID,
(SELECT COUNT(*)
FROM storeInventory b
INNER JOIN events c ON c.eventID = b.eventID
WHERE b.pluID = a.pluID
AND b.listPrice = a.listPrice
AND b.unlimitedQty = a.unlimitedQty
AND (b.packageID <=> a.packageID)
AND b.orderID IS NULL
AND c.isOpen = '1'
AND b.paymentTypeID <= '2'
AND (b.inCart < '$cartTime' OR b.inCart IS NULL) ) AS invCount,
(SELECT COUNT(x.inventoryID)
FROM storeInventory x
WHERE x.packageID = a.inventoryID) AS packageCount
FROM storeInventory a
INNER JOIN storePLUs f ON f.pluID = a.pluID
INNER JOIN storeCategories r ON r.catID = f.catID
INNER JOIN events d ON d.eventID = a.eventID
WHERE a.storeFrontID = '1'
AND a.orderID IS NULL
AND a.paymentTypeID <= '2'
AND d.isOpen = '1'
GROUP BY a.packageID, a.unlimitedQty, a.listPrice, a.pluID
I am not familiar with the term 'Composite indexes' Is that something different than these?
Screenshot of ForeignKeys on Table a
I think
AND (b.packageID = a.packageID
OR (b.packageID IS NULL
AND a.packageID IS NULL)
)
can be simplified to ( https://dev.mysql.com/doc/refman/8.0/en/comparison-operators.html#operator_equal-to ):
AND ( b.packageID <=> a.packageID )
Use COUNT(*) instead of COUNT(x.inventoryID) unless you check for not-NULL.
The subquery to compute packageCount seems strange; you seem to count inventories but join on packages.
The need to reach into another table to check isOpen is part of the performance problem. If eventID is not the PRIMARY KEYforevents, then add INDEX(eventID, isOpen)`.
Some other indexes that may help:
a: INDEX(storeFrontID, orderID, paymentTypeID)
a: INDEX(packageID, unlimitedQty, listPrice, pluID)
b: INDEX(pluID, listPrice, unlimitedQty, orderID)
f: INDEX(pluID, catID)
r: INDEX(catID, mastCatID)
x: INDEX(packageID, inventoryID)
After OP's Update
There is no way to do (x<y OR x IS NULL) except by switching to a UNION. In your case, it is pretty easy to do the conversion. Replace
( SELECT COUNT(*) ... AND ( b.inCart < '$cartTime'
OR b.inCart IS NULL ) ) AS invCount,
with
( SELECT COUNT(*) ... AND b.inCart < '$cartTime' ) +
( SELECT COUNT(*) ... AND b.inCart IS NULL ) AS invCount,
Revised indexes:
storePLUs:
INDEX(pluID, catID)
storeCategories:
INDEX(catID, mastCatID)
events:
INDEX(isOpen, eventID)
storeInventory:
INDEX(pluID, listPrice, unlimitedQty, orderID, packageID)
INDEX(pluID, listPrice, unlimitedQty, orderID, inCart)
INDEX(packageID, inventoryID)
INDEX(storeFrontID, orderID, paymentTypeID)
When executing a query statement, the speed is very slow.
SELECT
T1.APPL_SEQ
, T1.COMP_CD
, (SELECT COMP_NM FROM tb_company WHERE COMP_CD = T1.COMP_CD) AS COMP_NM
, T1.GPROD_CD
, (SELECT GPROD_NM FROM tb_gprod WHERE GPROD_CD = T1.GPROD_CD) AS GPROD_NM
, T1.SITE_CD
, (SELECT SITE_NM FROM tb_site WHERE SITE_CD = T1.SITE_CD) AS SITE_NM
, T1.INFLOW_CD
, T1.INFLOW_URL
, T1.STATUS
, T1.REG_DTM
, DECRYPTO(T1.NAME) AS NAME
, DECRYPTO(T1.HP) AS HP
, ifnull(T1.AGE,T1.`115`) AS AGE
, ifnull(T1.GENDER,T1.`116`) AS GENDER
, ifnull(T1.MEMO,T1.`120`) AS MEMO
, ifnull(T1.`105`,T1.`124`) AS TIME
, T1.`125` AS AGE_CHILD
, T2.API_YN
, T2.API_START_DT
, T2.API_END_DT
, T2.API_CD
, T2.DATA_INFLOWCD
, T2.CONFIRM_YN
, T2.SALE_YN
, T2.SALE_PRICE
, T2.BREAKDOWN
, T2.INPUT_DATE
, T3.DIST_YN
, T3.DIST_DT
,(select ifnull((select timestampdiff(DAY, T11.REG_DTM,T1.REG_DTM) AS DIFF2REGTIME from tb_applicant T11 WHERE T11.HP = T1.HP AND T11.GPROD_CD = T1.GPROD_CD AND T11.REG_DTM < T1.REG_DTM order by T11.REG_DTM desc limit 1),-1)) AS HP2_COUNT
FROM
tb_applicant T1
LEFT JOIN mm_applicant T2
ON T1.APPL_SEQ = T2.APPL_SEQ
LEFT JOIN dist_applicant T3
ON T1.APPL_SEQ = T3.APPL_SEQ
LEFT JOIN tb_site T4
ON T4.site_cd = T1.SITE_CD and T4.comp_cd = T1.COMP_CD and T4.gprod_cd = T1.GPROD_CD
WHERE 1=1
AND T1.APPL_SEQ > 147293
AND T4.is_use = 'Y'
$Sql_Search
ORDER BY
$Sql_OrderBy
) U1
, (SELECT #ROWNUM := 0) U2
) V1";
,(select ifnull((...),-1)) AS HP2_COUNT
This is part of why it's so slow.
This query calculates the number of months difference by comparing REG_DTM when the td_applicant table has the same data for HP, GPROD, and COMP.
I don't need to get the date difference, is there any way to improve the query speed?
The main problem are those subselect in the select. As #Akina suggested, you should move them in FROM and make them as join.
They way you have done implies that each subselect is executed for each row returned by the main select.
You have 4 subselect that mean if you have 100 rows you execute 1 (main select) + (4*100) query so 401 instead of 1.
Using join allow the internal optimization engine to choose the best strategy to perform the query, in your way practically no optimization are applied.
I post a short example of how should be your query, didn't refactor the whole query since without database is a bit difficult to do it and I can easily produce a wrong query.
Notice that you select twice on tb_site with different condition, so is up to you to put the correct one.
SELECT T1.APPL_SEQ, T1.COMP_CD, T1.GPROD_CD, T1.SITE_CD
TC.COMP_NM,
TG.GPROD_NM,
TS.SITE_NM,
......
FROM tb_applicant T1
LEFT JOIN mm_applicant T2
JOIN tb_company TC on TC.COMP_CD = T1.COMP_CD
JOIN tb_gprod TG on GPROD_CD = T1.GPROD_CD
JOIN tb_site TS on TS.SITE_CD = T1.SITE_CD ON T1.APPL_SEQ = T2.APPL_SEQ
.......
MySQL Find Most Recent/Largest Record Per Group by order by, and how can i minimize / shorten this query and every time it returns the first row value of the group, whereas i like to select the last row value of a group, and sort the values based on ja.id ? I know this is a bad query can anyone suggest or provide me solution to shorten this query . I have used all the necessary Column Indexes in all the tables. How to shorten the query without the use of union all .both the queries in union all are same expect in the where statement.
SELECT
a.previous_status,
a.rejected_status,
a.rejection_reason_text,
a.rejection_reason,
a.rjaId,
a.refer_applied_status,
a.title,
a.playerId,
a.gameId,
a.gamePostDate,
a.game_referal_amount,
a.country,
a.country_name,
a.state,
a.location,
a.state_abb,
a.game_type,
a.appliedId,
a.appliedStatus,
a.admin_review,
a.is_req_referal_check,
a.memberId,
a.appliedEmail,
a.game_id,
a.referred_id,
a.memStateAbb,
a.memState,
a.memZipcode,
a.memCity,
a.memCountryNme,
a.memCountry,
a.appliedMemberName,
a.first_name,
a.primary_contact,
a.last_name,
a.addressbookImage,
a.userImage,
a.last_login,
a.user_experience_year,
a.user_experience_month,
a.current_designation,
a.current_player,
a.appliedDate,
a.addressbook_id,
a.joiningdate,
a.gameStatus,
a.gameReferalAmountType,
a.gameFreezeStatus,
a.gameFreezeMsg,
a.app_assign_back_to_rp_count,
a.applied_source,
a.max_id,
a.gamesApplied,
a.gamesAppliedId,
SUM(a.totalgameApplied) AS totalgameApplied,
a.application_assign_to_rp_status,
a.rpAppliedSource,
a.applied_on
FROM
(
(
SELECT
ja.previous_status,
ja.rejected_status,
ja.rejection_reason_text,
ja.rejection_reason,
rja.id AS rjaId,
rja. STATUS AS refer_applied_status,
jp.title,
jp.user_user_id AS playerId,
jp.id AS gameId,
jp.posted_on AS gamePostDate,
jp.game_referal_amount,
jp.country,
jp.country_name,
jp.state,
jp.location,
jp.state_abb,
jp.game_type,
ja.id AS appliedId,
IFNULL(ja. STATUS, '') AS appliedStatus,
IFNULL(ja.admin_review, '') AS admin_review,
ja.is_req_referal_check,
usr.id AS memberId,
rja.email AS appliedEmail,
rja.game_id,
rja.referred_id,
mem.state_abb AS memStateAbb,
mem.state AS memState,
mem.zipcode AS memZipcode,
mem.city AS memCity,
mem.country_name AS memCountryNme,
mem.country_code AS memCountry,
usc. NAME AS appliedMemberName,
usc.first_name,
IFNULL(
mem.primary_contact,
usc.phone_number
) AS primary_contact,
usc.last_name,
usc.profileimage_path AS addressbookImage,
usr.profile_image AS userImage,
usr.last_login,
mem.user_experience_year,
mem.user_experience_month,
mem.current_designation,
mem.current_player,
rja.create_date AS appliedDate,
rja.addressbook_id,
IFNULL(ja.joining_date, '') AS joiningdate,
jp. STATUS AS gameStatus,
jp.games_referal_amount_type AS gameReferalAmountType,
jp.game_freeze_status AS gameFreezeStatus,
jp.game_freeze_message AS gameFreezeMsg,
ja.app_assign_back_to_rp_count,
ja.applied_source,
MAX(rja.id) AS max_id,
GROUP_CONCAT(
jp.title
ORDER BY
rja.create_date DESC
) AS gamesApplied,
GROUP_CONCAT(DISTINCT(jp.id)) AS gamesAppliedId,
COUNT(DISTINCT(jp.id)) totalgameApplied,
ja.application_assign_to_rp_status,
1 AS rpAppliedSource,
rja.create_date AS applied_on
FROM
(`refer_gameapplied` AS rja)
JOIN `games_post` AS jp ON `jp`.`id` = `rja`.`game_id`
JOIN `user_socialconnections` AS usc ON `rja`.`addressbook_id` = `usc`.`id`
LEFT JOIN `user_user` AS usr ON `usr`.`email` = `rja`.`email`
LEFT JOIN `user_member` AS mem ON `mem`.`user_id` = `usr`.`id`
LEFT JOIN `game_applied` AS ja ON `ja`.`id` = `rja`.`applied_id`
WHERE
`rja`.`referby_id` = '2389'
GROUP BY
`rja`.`email`
)
UNION ALL
(
SELECT
ja.previous_status,
ja.rejected_status,
ja.rejection_reason_text,
ja.rejection_reason,
jr.id AS rjaId,
jrtm. STATUS AS refer_applied_status,
jp.title,
jp.user_user_id AS playerId,
jp.id AS gameId,
jp.posted_on AS gamePostDate,
jp.game_referal_amount,
jp.country,
jp.country_name,
jp.state,
jp.location,
jp.state_abb,
jp.game_type,
ja.id AS appliedId,
IFNULL(ja. STATUS, '') AS appliedStatus,
IFNULL(ja.admin_review, '') AS admin_review,
ja.is_req_referal_check,
usr.id AS memberId,
jrtm.referto_email AS refappliedEmail,
jr.game_id,
jrtm.id,
mem.state_abb AS memStateAbb,
mem.state AS memState,
mem.zipcode AS memZipcode,
mem.city AS memCity,
mem.country_name AS memCountryNme,
mem.country_code AS memCountry,
usc. NAME AS appliedMemberName,
usc.first_name,
IFNULL(
mem.primary_contact,
usc.phone_number
) AS primary_contact,
usc.last_name,
usc.profileimage_path AS addressbookImage,
usr.profile_image AS userImage,
usr.last_login,
mem.user_experience_year,
mem.user_experience_month,
mem.current_designation,
mem.current_player,
jrtm.refer_on AS appliedDate,
jrtm.referto_addressbookid,
IFNULL(ja.joining_date, '') AS joiningdate,
jp. STATUS AS gameStatus,
jp.games_referal_amount_type AS gameReferalAmountType,
jp.game_freeze_status AS gameFreezeStatus,
jp.game_freeze_message AS gameFreezeMsg,
ja.app_assign_back_to_rp_count,
ja.applied_source,
MAX(jrtm.id) AS max_id,
GROUP_CONCAT(
jp.title
ORDER BY
jr.refer_on DESC
) AS gamesApplied,
GROUP_CONCAT(DISTINCT(jp.id)) AS gamesAppliedId,
COUNT(DISTINCT(jp.id)) totalgameApplied,
ja.application_assign_to_rp_status,
2 AS rpAppliedSource,
jrtm.refer_on AS applied_on
FROM
(`game_refer` AS jr)
JOIN `game_refer_to_member` AS jrtm ON `jrtm`.`rid` = `jr`.`id`
JOIN `games_post` AS jp ON `jp`.`id` = `jr`.`game_id`
JOIN `user_socialconnections` AS usc ON `jrtm`.`referto_addressbookid` = `usc`.`id`
LEFT JOIN `user_user` AS usr ON `usr`.`email` = `jrtm`.`referto_email`
LEFT JOIN `user_member` AS mem ON `mem`.`user_id` = `usr`.`id`
LEFT JOIN `game_applied` AS ja ON `ja`.`referred_by` = `jrtm`.`id`
WHERE
`jrtm`.`status` = '1'
AND `jr`.`referby_user_id` = '2389'
AND `jrtm`.`refer_source` NOT IN ('4')
GROUP BY
`jrtm`.`referto_email`
)
) a
GROUP BY
a.appliedEmail
ORDER BY
a.gamesAppliedId DESC
It sounds like a "groupwise-max" problem. I added a tag that you should research.
At least get rid of the columns that are not relevant to the question.
Try tossing the tables other than jrtm and jr to see if the performance problem persists. (I'm thinking that the LEFT JOINs may be red herrings.)
Try with one part of the UNION, then with the other. This may identify which of the two is more of a burden.
Some indexes to add:
rja: (referby_id, create_date)
jrtm: (status, referto_email)
jrtm: (rid, status, referto_email)
jr: (referby_user_id, refer_on)
DISTINCT is not a function. Don't use parents in DISTINCT(jp.id).
My Sql query takes more time to execute from mysql database server . There are number of tables are joined with sb_tblproperty table. sb_tblproperty is main table that contain more than 1,00,000 rows . most of table contain 50,000 rows.
How to optimize my sql query to fast execution. I have also used indexing.
indexing Explain - query - structure
SELECT `t1`.`propertyId`, `t1`.`projectId`,
`t1`.`furnised`, `t1`.`ownerID`, `t1`.`subType`,
`t1`.`fors`, `t1`.`size`, `t1`.`unit`,
`t1`.`bedrooms`, `t1`.`address`, `t1`.`dateConfirm`,
`t1`.`dateAdded`, `t1`.`floor`, `t1`.`priceAmount`,
`t1`.`priceRate`, `t1`.`allInclusive`, `t1`.`booking`,
`t1`.`bookingRate`, `t1`.`paidPercetage`,
`t1`.`paidAmount`, `t1`.`is_sold`, `t1`.`remarks`,
`t1`.`status`, `t1`.`confirmedStatus`, `t1`.`source`,
`t1`.`companyName` as company, `t1`.`monthly_rent`,
`t1`.`per_sqft`, `t1`.`lease_duration`,
`t1`.`lease_commencement`, `t1`.`lock_in_period`,
`t1`.`security_deposit`, `t1`.`security_amount`,
`t1`.`total_area_leased`, `t1`.`lease_escalation_amount`,
`t1`.`lease_escalation_years`, `t2`.`propertyTypeName` as
propertyTypeName, `t3`.`propertySubTypeName` subType,
`t3`.`propertySubTypeId` subTypeId, `Owner`.`ContactName`
ownerName, `Owner`.`companyName`, `Owner`.`mobile1`,
`Owner`.`otherPhoneNo`, `Owner`.`mobile2`,
`Owner`.`email`, `Owner`.`address` as caddress,
`Owner`.`contactType`, `P`.`projectName` as project,
`P`.`developerName` as developer, `c`.`name` as city,
if(t1.projectId="", group_concat( distinct( L.locality)),
group_concat( distinct(L2.locality))) as locality, `U`.`firstname`
addedBy, `U1`.`firstname` confirmedBy
FROM `sb_tblproperty` as t1
JOIN `sb_contact` Owner ON `Owner`.`id` = `t1`.`ownerID`
JOIN `tbl_city` C ON `c`.`id` = `t1`.`city`
JOIN `sb_propertytype` t2 ON `t1`.`propertyType`= `t2`.`propertyTypeId`
JOIN `sb_propertysubtype` t3 ON `t1`.`subType` =`t3`.`propertySubTypeId`
LEFT JOIN `sb_tbluser` U ON `t1`.`addedBy` = `U`.`userId`
LEFT JOIN`sb_tbluser` U1 ON `t1`.`confirmedBy` = `U1`.`userId`
LEFT JOIN `sb_tblproject` P ON `P`.`id` = `t1`.`projectId` LEFT
JOIN `sb_tblpropertylocality` PL ON `t1`.`propertyId` = `PL`.`propertyId`
LEFT JOIN `sa_localitiez` L ON `L`.`id` = `PL`.`localityId`
LEFT JOIN `sb_tblprojectlocality` PROL ON `PROL`.`projectId` = `P`.`id`
LEFT JOIN `sa_localitiez` L2 ON `L2`.`id` = `PROL`.`localityId`
LEFT JOIN `sb_tblfloor` F
ON `F`.`floorName` =`t1`.`floor`
WHERE `t1`.`is_sold` != '1' GROUP BY `t1`.`propertyId`
ORDER BY `t1`.`dateConfirm`
DESC LIMIT 1000
Please provide the EXPLAIN.
Meanwhile, try this:
SELECT ...
FROM (
SELECT propertyId
FROM sb_tblproperty
WHERE `is_sold` = 0
ORDER BY `dateConfirm` DESC
LIMIT 1000
) AS x
JOIN `sb_tblproperty` as t1 ON t1.propertyId = x.propertyId
JOIN `sb_contact` Owner ON `Owner`.`id` = `t1`.`ownerID`
JOIN `tbl_city` C ON `c`.`id` = `t1`.`city`
...
LEFT JOIN `sb_tblfloor` F ON `F`.`floorName` =`t1`.`floor`
ORDER BY `t1`.`dateConfirm` DESC -- yes, again
Together with
INDEX(is_sold, dateConfirm)
How can t1.projectId="" ? Isn't projectId the PRIMARY KEY? (This is one of many reasons for needing the SHOW CREATE TABLE.)
If my suggestion leads to "duplicate" rows (that is, multiple rows with the same propertyId), don't simply add back the GROUP BY propertyId. Instead figure out why, and avoid the need for the GROUP BY. (That is probably the performance issue.)
A likely case is the GROUP_CONCAT. A common workaround is to change from
GROUP_CONCAT( distinct( L.locality)) AS Localities,
...
LEFT JOIN `sa_localitiez` L ON `L`.`id` = `PL`.`localityId`
to
( SELECT GROUP_CONCAT(distinct locality)
FROM sa_localitiez
WHERE id = PL.localityId ) AS Localities
...
# and remove the JOIN
I'd need to optimize the following query which takes up to 10 minutes to run.
Performing the explain it seems to be running on all 350815 rows of the "table_3" table and 1 for all the others.
General rules to place indexes the propper way? Should I think about using multidimensional indexes? Where should I use them at first on the JOINS, the WHERE or the GROUP BY, if I remember right there should be a hierarchy to follow. Also If I have 1 row for all tables but one (in the row column of the explain table) how can I optimize usually my optimization consists in ending up with only one row for all columns but one.
All tables average from 100k to 1000k+ rows.
CREATE TABLE datab1.sku_performance
SELECT
table1.sku,
CONCAT(table1.sku,' ',table1.fk_container ) as sku_container,
table1.price as price,
SUM( CASE WHEN ( table1.fk_table1_status = 82
OR table1.fk_table1_status = 119
OR table1.fk_table1_status = 124
OR table1.fk_table1_status = 141
OR table1.fk_table1_status = 131) THEN 1 ELSE 0 END)
/ COUNT( DISTINCT id_catalog_school_class) as qty_returned,
SUM( CASE WHEN ( table1.fk_table1_status In (23,13,44,65,6,75,8,171,12,166))
THEN 1 ELSE 0 END)
/ COUNT( DISTINCT id_catalog_school_class) as qt,
container.id_container as container_id,
container.idden as container_idden,
container.delivery_badge,
catalog_school.id_catalog_school,
LEFT(catalog_school.flight_fair,2) as departing_country,
catalog_school.weight,
catalog_school.flight_type,
catalog_school.price,
table_3.id_table_3,
table_3.fk_catalog_brand,
MAX( LEFT( table_3.note,3 )) AS supplier,
GROUP_CONCAT( product_number, ' by ',FORMAT(catalog_school_class.quantity,0)
ORDER BY product_number ASC SEPARATOR ' + ') as supplier_prod,
Sum( distinct( catalog_school_class.purch_pri * catalog_school_class.quantity)) AS final_purch_pri,
catalog_groupp.idden as supplier_idden,
catalog_category_details.id_catalog_category,
catalog_category_details.cat1 as product_cat1,
catalog_category_details.cat2 as product_cat2,
COUNT( distinct catalog_school_class.id_catalog_school_class) as setinfo,
datab1.pageviewgrouped.pv as page_views,
Sum(distinct(catalog_school_class.purch_pri * catalog_school_class.quantity)) AS purch_pri,
container_has_table_3.position,
max( table1.created_at ) as last_order_date
FROM
table1
LEFT JOIN container
ON table1.fk_container = container.id_container
LEFT JOIN catalog_school
ON table1.sku = catalog_school.sku
LEFT JOIN table_3
ON catalog_school.fk_table_3 = table_3.id_table_3
LEFT JOIN container_has_table_3
ON table_3.id_table_3 = container_has_table_3.fk_table_3
LEFT JOIN datab1.pageviewgrouped
on table_3.id_table_3 = datab1.pageviewgrouped.url
LEFT JOIN datab1.catalog_category_details
ON datab1.catalog_category_details.id_catalog_category = table_3_has_catalog_minority.fk_catalog_category
LEFT JOIN catalog_groupp
ON table_3.fk_catalog_groupp = catalog_groupp.id_catalog_groupp
LEFT JOIN table_3_has_catalog_minority
ON table_3.id_table_3 = table_3_has_catalog_minority.fk_table_3
LEFT JOIN catalog_school_class
ON catalog_school.id_catalog_school = catalog_school_class.fk_catalog_school
WHERE
table_3.status_ok = 1
AND catalog_school.status = 'active'
AND table_3_has_catalog_minority.is_primary = '1'
GROUP BY
table1.sku,
table1.fk_container;
rows per table :
.table1 960096 to 1.3mn rows
.container 9275 to 13000 rows
.catalog_school 709970 to 1 mn rows
.table_3 709970 to 1 mn rows
.container_has_table_3 709970 to 1 mn rows
.pageviewgrouped 500000 rows
.catalog_school_class 709970 to 1 mn rows
.catalog_groupp 3000 rows
.table_3_has_catalog_minority 709970 to 1 mn rows
.catalog_category_details 659 rows
Too much to put into a single comment, so I'll add here and adjust later as possibly needed... You have LEFT JOINs everywhere, but your WHERE clause is specifically qualifying fields from the Table_3, Catalog_School and Table_3_has_catalog_minority. This by default changes them to INNER JOINs.
With respect to your where clause
WHERE
table_3.status_ok = 1
AND catalog_school.status = 'active'
AND table_3_has_catalog_minority.is_primary = '1'
Which table / column would have the smallest results based on these criteria. ex: Table_3.Status_ok = 1 might have 500k records but table_3_has_catalog_minority.is_primary may only have 65k and catalog_school.status = 'active' may have 430k.
Also, some of your columns are not qualified with the table they are coming from. Can you please confirm... such as "id_catalog_school_class" and "product_number"
SOMETIMES, changing the order of the tables, with good knowledge of the makeup of the data and in MySQL adding a "STRAIGHT_JOIN" keyword can improve performance. This was something I've had in the past working with gov't database of contracts and grants with 20+ million records and joining to about 15+ lookup tables. It went from hanging the server to getting the query finished in less than 2 hrs. Considering the amount of data I was dealing with, that was actually a good time.
AFTER dissecting this thing some, I restructured a bit more for readability, added aliases for table references and changed the order of the query and have some suggested indexes. To help the query, I tried moving the Catalog_School table to the first position and added the STRAIGHT_JOIN. The index is based on the STATUS first to match the WHERE clause, THEN I included the SKU as it is first element of the GROUP BY, then the other columns used to join to the subsequent tables. By having these columns in the index, it can qualify the joins without having to go to the raw data.
By changing the group by to the Catalog_School.SKU instead of table_1.SKU the index from catalog_school can be used to help optimize that. It is the same value since the join from the catalog_school.sku = table_1.sku. I also added index references for table_1 and table_3 that are suggestions -- again, to preemptively qualify the joins without going to the raw data pages of the tables.
I would be interested in knowing the final performance (better or worse) from your data.
TABLE INDEX ON...
catalog_school ( status, sku, fk_table_3, id_catalog_school )
table_1 ( sku, fk_container )
table_3 ( id_table_3, status_ok, fk_catalog_groupp )
SELECT STRAIGHT_JOIN
CS.sku,
CONCAT(CS.sku,' ',T1.fk_container ) as sku_container,
T1.price as price,
SUM( CASE WHEN ( T1.fk_table1_status IN ( 82, 119, 124, 141, 131)
THEN 1 ELSE 0 END)
/ COUNT( DISTINCT CSC.id_catalog_school_class) as qty_returned,
SUM( CASE WHEN ( T1.fk_table1_status In (23,13,44,65,6,75,8,171,12,166))
THEN 1 ELSE 0 END)
/ COUNT( DISTINCT CSC.id_catalog_school_class) as qt,
CS.id_catalog_school,
LEFT(CS.flight_fair,2) as departing_country,
CS.weight,
CS.flight_type,
CS.price,
T3.id_table_3,
T3.fk_catalog_brand,
MAX( LEFT( T3.note,3 )) AS supplier,
C.id_container as container_id,
C.idden as container_idden,
C.delivery_badge,
GROUP_CONCAT( product_number, ' by ',FORMAT(CSC.quantity,0)
ORDER BY product_number ASC SEPARATOR ' + ') as supplier_prod,
Sum( distinct( CSC.purch_pri * CSC.quantity)) AS final_purch_pri,
CGP.idden as supplier_idden,
CCD.id_catalog_category,
CCD.cat1 as product_cat1,
CCD.cat2 as product_cat2,
COUNT( distinct CSC.id_catalog_school_class) as setinfo,
PVG.pv as page_views,
Sum(distinct(CSC.purch_pri * CSC.quantity)) AS purch_pri,
CHT3.position,
max( T1.created_at ) as last_order_date
FROM
catalog_school CS
JOIN table1 T1
ON CS.sku = T1.sku
LEFT JOIN container C
ON T1.fk_container = C.id_container
LEFT JOIN catalog_school_class CSC
ON CS.id_catalog_school = CSC.fk_catalog_school
JOIN table_3 T3
ON CS.fk_table_3 = T3.id_table_3
JOIN table_3_has_catalog_minority T3HCM
ON T3.id_table_3 = T3HCM.fk_table_3
LEFT JOIN datab1.catalog_category_details CCD
ON T3HCM.fk_catalog_category = CCD.id_catalog_category
LEFT JOIN container_has_table_3 CHT3
ON T3.id_table_3 = CHT3.fk_table_3
LEFT JOIN datab1.pageviewgrouped PVG
on T3.id_table_3 = PVG.url
LEFT JOIN catalog_groupp CGP
ON T3.fk_catalog_groupp = CGP.id_catalog_groupp
WHERE
CS.status = 'active'
AND T3.status_ok = 1
AND T3HCM.is_primary = '1'
GROUP BY
CS.sku,
T1.fk_container;