I would like to optimize my SQL query. When the investors_diversificacao table holds a lot of data, the query becomes very slow and never finishes loading; with little data in that table it loads fine.
How can I reduce the query's response time?
SQL query:
SELECT investors_positivador.cod_cliente, investors_saldo_financeiro.nome_cliente, investors_base_assessores.squad, investors_base_assessores.nome_investor, investors_saldo_financeiro.saldo_d0,
SUM(CASE WHEN investors_posicao_geral.vencimento <= #vencimento THEN investors_posicao_geral.financeiro ELSE 0 END) AS vencimentos_ate_data,
ROUND(SUM(CASE WHEN investors_diversificacao.produto = 'Fundos' THEN investors_diversificacao.net / 6 ELSE 0 END), 2) AS fundos_ate_data, investors_guia_fundos.liquidez_total, investors_positivador.contatar_liquidity_map,
investors_positivador.id, investors_posicao_geral.vencimento, investors_base_assessores.nome_assessor
FROM investors_positivador
INNER JOIN investors_saldo_financeiro ON investors_positivador.cod_cliente = investors_saldo_financeiro.cod_cliente
INNER JOIN investors_base_assessores ON investors_positivador.cod_assessor = investors_base_assessores.cod_assessor
INNER JOIN investors_posicao_geral ON investors_positivador.cod_cliente = investors_posicao_geral.cod_cliente
LEFT OUTER JOIN investors_diversificacao ON investors_positivador.cod_cliente = investors_diversificacao.cod_cliente
LEFT OUTER JOIN investors_guia_fundos ON investors_diversificacao.cnpj = investors_guia_fundos.cnpj
WHERE (investors_base_assessores.nome_investor = #investor_nome) AND (investors_saldo_financeiro.saldo_d0 > 0)
OR (investors_base_assessores.nome_investor = #investor_nome) AND (investors_posicao_geral.financeiro > 0)
OR (investors_base_assessores.nome_investor = #investor_nome) AND (investors_diversificacao.net > 0)
GROUP BY investors_positivador.cod_cliente
Some of these indexes may help:
investors_positivador: INDEX(cod_cliente, contatar_liquidity_map, id)
investors_saldo_financeiro: INDEX(cod_cliente, saldo_d0, nome_cliente)
investors_base_assessores: INDEX(nome_investor, cod_assessor, squad, nome_assessor)
investors_posicao_geral: INDEX(cod_cliente, financeiro, vencimento)
investors_diversificacao: INDEX(cod_cliente, net, produto, cnpj)
investors_guia_fundos: INDEX(cnpj, liquidez_total)
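One way to check whether an index like these is actually picked up is to look at the query plan. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for MySQL (an assumption: the planners and their EXPLAIN output differ), with a cut-down investors_diversificacao table. The index name idx_div_cover and the extra column outros are hypothetical.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE investors_diversificacao (
        cod_cliente INTEGER,
        net         REAL,
        produto     TEXT,
        cnpj        TEXT,
        outros      TEXT   -- hypothetical column that the query never reads
    );
    -- Same column order as the suggested index; the name is made up.
    CREATE INDEX idx_div_cover
        ON investors_diversificacao (cod_cliente, net, produto, cnpj);
""")

# The query reads only columns that are part of the index.
plan_rows = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT cod_cliente, net, produto, cnpj
    FROM investors_diversificacao
    WHERE cod_cliente = 42
""").fetchall()

# The fourth field of each plan row is the human-readable detail.
plan_text = " ".join(row[3] for row in plan_rows)
print(plan_text)
```

In MySQL the equivalent check is to run EXPLAIN on the real query and look for "Using index" in the Extra column.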
Are you sure you want that WHERE clause? All the ORs are irrelevant, and Riggs' version simplifies to just
WHERE investors_base_assessores.nome_investor = #investor_nome
AND investors_diversificacao.net > 0
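One part of this that can be checked mechanically is that the nome_investor test, which is repeated in all three OR branches of the original WHERE, can be factored out. The sketch below is a plain truth-table check in Python with made-up function names; it ignores NULLs, and it does not check whether the other two branches can really be dropped, which depends on the data.

```python
from itertools import product

def original_where(investor, saldo, financeiro, net):
    # Shape of the original WHERE: the investor test repeated in each branch.
    return (investor and saldo) or (investor and financeiro) or (investor and net)

def factored_where(investor, saldo, financeiro, net):
    # Investor test written once, the three alternatives grouped together.
    return investor and (saldo or financeiro or net)

# Try every combination of the four conditions being true or false.
all_cases = list(product([False, True], repeat=4))
equivalent = all(original_where(*case) == factored_where(*case) for case in all_cases)
print(equivalent)
```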
Don't round before summing; you will get extra rounding errors:
ROUND(SUM(IF(investors_diversificacao.produto = 'Fundos',
investors_diversificacao.net, 0
)
) / 6 -- divide after summing and before rounding
, 2) AS fundos_ate_data
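To illustrate the general point, here is a small Python sketch with made-up values that compares rounding each row's share before summing against summing first and rounding once. It uses Decimal with half-up rounding as an assumption about how the amounts should be rounded.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

def round2(value):
    # Round to two decimal places, halves rounded up.
    return value.quantize(CENT, rounding=ROUND_HALF_UP)

# Made-up sample of net values for one client.
nets = [Decimal("100.00"), Decimal("100.00"), Decimal("100.00")]

# Rounding every row's share first: each 16.666... becomes 16.67.
rounded_per_row = sum(round2(net / 6) for net in nets)

# Summing first, then dividing and rounding once.
rounded_once = round2(sum(nets) / 6)

print(rounded_per_row, rounded_once)
```

With only three rows the two results already differ by a cent, and the gap grows with the number of rows.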
My SQL query runs normally in MySQL Workbench (despite a 15-second wait), but when I run it from another application (phpMyAdmin, etc.) it gives me several errors, such as:
Timeout
Error in SELECT clause: expression near '"'.
Missing FROM clause.
Below is the query code:
SELECT investors_positivador.cod_cliente, investors_saldo_financeiro.nome_cliente, investors_base_assessores.squad, investors_base_assessores.nome_investor, investors_saldo_financeiro.saldo_d0,
SUM(investors_posicao_geral.financeiro) AS vencimentos_ate_data, investors_posicao_geral.vencimento,
CASE WHEN investors_diversificacao.produto = "FUNDO" THEN
ROUND(SUM( IF( investors_diversificacao.produto = "FUNDO", investors_diversificacao.net / 6, 0 ) ), 2)
ELSE
"0,00"
END AS fundos_ate_data
,investors_guia_fundos.liquidez_total
,investors_positivador.contatar_liquidity_map
,investors_positivador.id
FROM investors_positivador INNER JOIN
investors_saldo_financeiro ON investors_positivador.cod_cliente = investors_saldo_financeiro.cod_cliente LEFT OUTER JOIN
investors_base_assessores ON investors_positivador.cod_assessor = investors_base_assessores.cod_assessor LEFT OUTER JOIN
investors_posicao_geral ON investors_positivador.cod_cliente = investors_posicao_geral.cod_cliente
LEFT OUTER JOIN investors_diversificacao ON investors_diversificacao.cod_cliente = investors_positivador.cod_cliente AND investors_diversificacao.net != "0,00"
LEFT OUTER JOIN investors_guia_fundos ON investors_guia_fundos.cnpj = investors_diversificacao.cnpj
WHERE investors_base_assessores.nome_investor = "Daiane Costa" AND investors_saldo_financeiro.saldo_d0 > 0 AND investors_posicao_geral.financeiro > 0
GROUP BY cod_cliente
For the waiting time, you could try wrapping your query in a transaction. That may help in optimizing your query
Please help me sort out this issue.
I have a database with 150,000 business records; each business record has its own business category (e.g. Bars, Pubs, Restaurant).
I am using this SQL to get a listing of categories based on the visitor's location.
SELECT
ROUND(6371*acos(cos(radians('52.28231599999999'))*cos(radians(bizprof.vLatitude))*cos(radians(bizprof.vLongitude)-radians('-1.584927'))+sin(radians('52.28231599999999'))*sin(radians(bizprof.vLatitude))),2) AS distance,
`bizcat`.`vCategoryName`,
`bizcat`.`iCategoryId` FROM `business_profile` `bizprof`
LEFT JOIN `users` `u` ON u.iUserId = bizprof.iUserId
AND u.tiIsProfileSet = 1
AND u.tiIsActive = 1
AND u.tiIsDeleted = 0
LEFT JOIN `business_categories` `bizcat` ON bizcat.iCategoryId = bizprof.iCategoryId
GROUP BY `bizcat`.`iCategoryId`
HAVING distance >= 0 AND distance <= 10
This query takes too much time to return the data.
Any ideas?
Use ST_Distance_Sphere(g1, g2 [, radius]) and spatial indexes.
Move the conditions AND u.tiIsProfileSet = 1 AND u.tiIsActive = 1 AND u.tiIsDeleted = 0 from the join into the WHERE clause.
Avoid the third join; fetch the data from business_categories with a separate query (e.g. via the relation).
Try executing this query:
SELECT
ST_Distance_Sphere(Point('-1.584927','52.28231599999999'), Point(`bizprof`.`vLongitude`,`bizprof`.`vLatitude`), 6370986 ) AS `distance`,
`bizprof`.`iCategoryId`
FROM `business_profile` `bizprof`
LEFT JOIN `users` `u` ON `u`.`iUserId` = `bizprof`.`iUserId`
WHERE 1=1
AND `u`.`tiIsProfileSet` = 1
AND `u`.`tiIsActive` = 1
AND `u`.`tiIsDeleted` = 0
HAVING distance >= 0 AND distance <= 10*1000
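Note the units: ST_Distance_Sphere returns meters, which is why the threshold above is 10*1000, whereas the original formula multiplied by 6371 and so produced kilometers. As a rough cross-check outside the database, the Python sketch below computes a great-circle distance with the haversine formula, using the radius passed in the query. The function name is made up, and it is not assumed to match MySQL's internal calculation digit for digit.

```python
import math

def sphere_distance_m(lat1, lon1, lat2, lon2, radius_m=6370986.0):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 so rounding noise can never push asin out of its domain.
    return 2 * radius_m * math.asin(math.sqrt(min(1.0, a)))

# Visitor location from the question, as (latitude, longitude).
visitor = (52.28231599999999, -1.584927)
# Hypothetical business one degree of latitude further north.
one_degree_north = (visitor[0] + 1.0, visitor[1])

d = sphere_distance_m(*visitor, *one_degree_north)
within_10km = d <= 10 * 1000   # same threshold as the HAVING clause
print(round(d), within_10km)
```

Also note the argument order: Point() in the query takes longitude first and latitude second, while the helper here takes latitude first.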
Just some suggestions.
Be sure you have proper composite indexes on:
table business_profile, columns (iUserId, iCategoryId)
table users, columns (iUserId, tiIsProfileSet, tiIsActive, tiIsDeleted)
table business_categories, column (iCategoryId)
Also, you should not use GROUP BY without an aggregate function (if you need distinct results, add DISTINCT to the SELECT).
You could also filter in the WHERE clause (repeating the distance expression) instead of in HAVING:
SELECT
ROUND(6371*acos(cos(radians('52.28231599999999'))*cos(radians(bizprof.vLatitude))*cos(radians(bizprof.vLongitude)-radians('-1.584927'))+sin(radians('52.28231599999999'))*sin(radians(bizprof.vLatitude))),2) AS distance,
`bizcat`.`vCategoryName`,
`bizcat`.`iCategoryId`
FROM `business_profile` `bizprof`
LEFT JOIN `users` `u` ON u.iUserId = bizprof.iUserId
AND u.tiIsProfileSet = 1
AND u.tiIsActive = 1
AND u.tiIsDeleted = 0
LEFT JOIN `business_categories` `bizcat` ON bizcat.iCategoryId = bizprof.iCategoryId
WHERE ROUND(6371*acos(cos(radians('52.28231599999999'))*cos(radians(bizprof.vLatitude))*cos(radians(bizprof.vLongitude)-radians('-1.584927'))+sin(radians('52.28231599999999'))*sin(radians(bizprof.vLatitude))),2) >= 0
AND ROUND(6371*acos(cos(radians('52.28231599999999'))*cos(radians(bizprof.vLatitude))*cos(radians(bizprof.vLongitude)-radians('-1.584927'))+sin(radians('52.28231599999999'))*sin(radians(bizprof.vLatitude))),2) <= 10
I've got a script where checked-in employees can see each other's stock inventory, linked to their personal stock location, so each checked-in employee can see which items are in stock at the different locations. However, I want the main stock (id 1, which is not attached to an employee) to always be shown, but I can't get the query right because one of the WHERE conditions is clearly not correct:
`stock_locations`.`location_id` = 1 AND
`workschedule`.`checkedIn` = 1 AND
Remember, the main stock is not linked to an employee, so it doesn't show up in the workschedule table. If I remove the first condition, it shows all the checked-in employees with their locations, but not the main stock. If I remove the second condition, it shows only the main stock.
How can I solve this within SQL? This is the full statement:
SELECT
`item_quantities`.`item_id`,
`stock_locations`.`location_name`,
`item_quantities`.`quantity`,
`people`.`first_name`
FROM
`item_quantities`
JOIN `stock_locations` ON `item_quantities`.`location_id` = `stock_locations`.`location_id`
JOIN `items` ON `item_quantities`.`item_id` = `items`.`item_id`
LEFT JOIN `workschedule` ON `workschedule`.`linked_storage` = `stock_locations`.`location_id`
LEFT JOIN `people` ON `workschedule`.`employee_id` = `people`.`person_id`
WHERE
`stock_locations`.`location_id` = 1 AND
`workschedule`.`checkedIn` = 0 AND
`items`.`unit_price` != 0 AND
`items`.`deleted` = 0 AND
`stock_locations`.`deleted` = 0 NULL
Thanks in advance!
Make it an OR statement inside of parens.
(`stock_locations`.`location_id` = 1 OR `workschedule`.`checkedIn` = 1) AND
This will return all records that match either the main stock or the employee.
You need to use the OR operator. Clearly both things can't happen at the same time, so you need to specify each set of acceptable conditions.
SELECT
`item_quantities`.`item_id`,
`stock_locations`.`location_name`,
`item_quantities`.`quantity`,
`people`.`first_name`
FROM
`item_quantities`
JOIN `stock_locations`
ON `item_quantities`.`location_id` = `stock_locations`.`location_id`
JOIN `items`
ON `item_quantities`.`item_id` = `items`.`item_id`
LEFT JOIN `workschedule`
ON `workschedule`.`linked_storage` = `stock_locations`.`location_id`
LEFT JOIN `people`
ON `workschedule`.`employee_id` = `people`.`person_id`
WHERE
`stock_locations`.`location_id` = 1
OR (
`workschedule`.`checkedIn` = 1
AND `items`.`unit_price` != 0
AND `items`.`deleted` = 0
AND `stock_locations`.`deleted` = 0
)
You have LEFT JOIN, but your WHERE clause turns them into inner joins. Fixing that will probably fix your problem:
SELECT . . .
FROM item_quantities iq JOIN
stock_locations sl
ON iq.`location_id` = sl.`location_id` JOIN
items i
ON iq.`item_id` = i.`item_id` LEFT JOIN
workschedule ws
ON ws.`linked_storage` = sl.`location_id` AND
ws.`checkedIn` = 0 LEFT JOIN
--------^
people p
ON ws.`employee_id` = p.`person_id`
WHERE sl.`location_id` = 1 AND
i.`unit_price` != 0 AND
i.`deleted` = 0 AND
sl.`deleted` = 0
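The effect of filtering a LEFT JOINed table in WHERE versus in ON can be reproduced on a toy data set. The sketch below uses Python's sqlite3 module as a stand-in for the real database, with cut-down tables and made-up rows: location 1 is the main stock with no workschedule row, "Van A" has a checked-in employee, and "Van B" has one who is not checked in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stock_locations (location_id INTEGER PRIMARY KEY, location_name TEXT);
    CREATE TABLE workschedule (employee_id INTEGER, linked_storage INTEGER, checkedIn INTEGER);
    -- Made-up sample rows.
    INSERT INTO stock_locations VALUES (1, 'Main'), (2, 'Van A'), (3, 'Van B');
    INSERT INTO workschedule VALUES (10, 2, 1), (11, 3, 0);
""")

def names(sql):
    # Run a query and return the location names as a plain list.
    return [row[0] for row in conn.execute(sql)]

# Filtering the LEFT JOINed table in WHERE discards the NULL-extended row,
# so the main stock disappears (the join behaves like an inner join).
where_only = names("""
    SELECT sl.location_name
    FROM stock_locations sl
    LEFT JOIN workschedule ws ON ws.linked_storage = sl.location_id
    WHERE ws.checkedIn = 1
    ORDER BY sl.location_id
""")

# OR in the WHERE clause keeps the main stock as well as checked-in employees.
with_or = names("""
    SELECT sl.location_name
    FROM stock_locations sl
    LEFT JOIN workschedule ws ON ws.linked_storage = sl.location_id
    WHERE sl.location_id = 1 OR ws.checkedIn = 1
    ORDER BY sl.location_id
""")

# Checked-in test moved into ON; WHERE keeps the main stock or any matched row.
in_on = names("""
    SELECT sl.location_name
    FROM stock_locations sl
    LEFT JOIN workschedule ws
           ON ws.linked_storage = sl.location_id AND ws.checkedIn = 1
    WHERE sl.location_id = 1 OR ws.employee_id IS NOT NULL
    ORDER BY sl.location_id
""")

print(where_only, with_or, in_on)
```

On this sample the last two queries return the same rows; which form reads better is a matter of taste, and the filters on items from the real query would still need to be added.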
I have 2 queries :
1.
SELECT u.unitId unitId,
u.unitScode 'unitScode',
( Cast(Count(vd.Date) AS FLOAT) / u.timeDiff ) * 100 'BookingCount',
u.tradeStartTime,
u.tradeStopTime,
u.minimumSlot,
u.maximumTerm
FROM #allDates vd
INNER JOIN CommcmlBookingDetail cd
ON vd.Date BETWEEN cd.dtFromTime AND cd.dtToTime
AND Datepart(minute, vd.date) = Datepart(minute, cd.dtFromTime)
INNER JOIN CommCmlBooking cb
ON cb.hMy = cd.hBooking
AND cb.iStatus = 1
AND cb.iType = 525
INNER JOIN #unitsInfo u
ON u.unitId = cb.hUnit
AND Cast(vd.Date AS DATE) BETWEEN Cast(#BookingFromDate AS DATE) AND Cast(#BookingToDate AS DATE)
AND Cast(vd.Date AS TIME) BETWEEN Cast(u.tradeStartTime AS TIME) AND Cast(u.tradeStopTime AS TIME)
WHERE cb.hRecord = case when #amendmentId = 0 then cb.hRecord else #amendmentId end
GROUP BY u.unitId,
u.unitScode,
u.minimumSlot,
u.tradeStartTime,
u.timeDiff,
u.tradeStopTime,
u.maximumTerm;
2.
INSERT INTO #tempBookingCount
SELECT u.unitId,
u.timeDiff
FROM #allDates vd
INNER JOIN CommcmlBookingDetail cd
ON vd.Date BETWEEN cd.dtFromTime AND cd.dtToTime
AND Datepart(minute, vd.date) = Datepart(minute, cd.dtFromTime)
INNER JOIN CommCmlBooking cb
ON cb.hMy = cd.hBooking
AND cb.iStatus = 1
AND cb.iType = 525
INNER JOIN #unitsInfo u
ON u.unitId = cb.hUnit
AND Cast(vd.Date AS DATE) BETWEEN Cast(#BookingFromDate AS DATE) AND Cast(#BookingToDate AS DATE)
AND Cast(vd.Date AS TIME) BETWEEN Cast(u.tradeStartTime AS TIME) AND Cast(u.tradeStopTime AS TIME)
WHERE cb.hRecord = case when #amendmentId = 0 then cb.hRecord else #amendmentId end
INSERT INTO #unitBookingCount
SELECT tt.unitID,
u.unitScode,
( Cast(Count(tt.unitID) AS FLOAT) / tt.timeDiff ) * 100,
u.tradeStartTime,
u.tradeStopTime,
u.minimumSlot,
u.maximumTerm
FROM #tempBookingCount tt
INNER JOIN #unitsInfo u
ON u.unitId = tt.unitID
GROUP BY tt.unitID,
tt.timeDiff,
u.tradeStartTime,
u.tradeStopTime,
u.minimumSlot,
u.maximumTerm,
u.unitScode
I separated the first query into two parts, and I can see a huge difference in performance!
The first query takes 14 seconds when executed for 5 months, whereas the second takes 4 seconds.
Your original query uses 2 table variables, #allDates and #unitsInfo.
By using table variables, you aren't giving SQL Server a fighting chance of optimizing the query: there are no statistics on table variables, so the row count estimates and the query plan suffer.
One reference, you can find many more:
http://blogs.msdn.com/b/psssql/archive/2010/08/24/query-performance-and-table-variables.aspx
Try the original with temp tables (#TempTables) instead of table variables.
I need to optimize the following query, which takes up to 10 minutes to run.
According to EXPLAIN, it seems to scan all 350815 rows of the "table_3" table and 1 row for each of the others.
Are there general rules for placing indexes properly? Should I think about using multidimensional indexes? Where should I apply them first: on the JOINs, the WHERE, or the GROUP BY? If I remember right, there is a hierarchy to follow. Also, if EXPLAIN already shows 1 row for every table but one (in the rows column), how can I optimize further? Usually my optimization consists of ending up with only one row for all tables but one.
All tables range from 100k to 1000k+ rows.
CREATE TABLE datab1.sku_performance
SELECT
table1.sku,
CONCAT(table1.sku,' ',table1.fk_container ) as sku_container,
table1.price as price,
SUM( CASE WHEN ( table1.fk_table1_status = 82
OR table1.fk_table1_status = 119
OR table1.fk_table1_status = 124
OR table1.fk_table1_status = 141
OR table1.fk_table1_status = 131) THEN 1 ELSE 0 END)
/ COUNT( DISTINCT id_catalog_school_class) as qty_returned,
SUM( CASE WHEN ( table1.fk_table1_status In (23,13,44,65,6,75,8,171,12,166))
THEN 1 ELSE 0 END)
/ COUNT( DISTINCT id_catalog_school_class) as qt,
container.id_container as container_id,
container.idden as container_idden,
container.delivery_badge,
catalog_school.id_catalog_school,
LEFT(catalog_school.flight_fair,2) as departing_country,
catalog_school.weight,
catalog_school.flight_type,
catalog_school.price,
table_3.id_table_3,
table_3.fk_catalog_brand,
MAX( LEFT( table_3.note,3 )) AS supplier,
GROUP_CONCAT( product_number, ' by ',FORMAT(catalog_school_class.quantity,0)
ORDER BY product_number ASC SEPARATOR ' + ') as supplier_prod,
Sum( distinct( catalog_school_class.purch_pri * catalog_school_class.quantity)) AS final_purch_pri,
catalog_groupp.idden as supplier_idden,
catalog_category_details.id_catalog_category,
catalog_category_details.cat1 as product_cat1,
catalog_category_details.cat2 as product_cat2,
COUNT( distinct catalog_school_class.id_catalog_school_class) as setinfo,
datab1.pageviewgrouped.pv as page_views,
Sum(distinct(catalog_school_class.purch_pri * catalog_school_class.quantity)) AS purch_pri,
container_has_table_3.position,
max( table1.created_at ) as last_order_date
FROM
table1
LEFT JOIN container
ON table1.fk_container = container.id_container
LEFT JOIN catalog_school
ON table1.sku = catalog_school.sku
LEFT JOIN table_3
ON catalog_school.fk_table_3 = table_3.id_table_3
LEFT JOIN container_has_table_3
ON table_3.id_table_3 = container_has_table_3.fk_table_3
LEFT JOIN datab1.pageviewgrouped
on table_3.id_table_3 = datab1.pageviewgrouped.url
LEFT JOIN datab1.catalog_category_details
ON datab1.catalog_category_details.id_catalog_category = table_3_has_catalog_minority.fk_catalog_category
LEFT JOIN catalog_groupp
ON table_3.fk_catalog_groupp = catalog_groupp.id_catalog_groupp
LEFT JOIN table_3_has_catalog_minority
ON table_3.id_table_3 = table_3_has_catalog_minority.fk_table_3
LEFT JOIN catalog_school_class
ON catalog_school.id_catalog_school = catalog_school_class.fk_catalog_school
WHERE
table_3.status_ok = 1
AND catalog_school.status = 'active'
AND table_3_has_catalog_minority.is_primary = '1'
GROUP BY
table1.sku,
table1.fk_container;
rows per table :
.table1 960096 to 1.3mn rows
.container 9275 to 13000 rows
.catalog_school 709970 to 1 mn rows
.table_3 709970 to 1 mn rows
.container_has_table_3 709970 to 1 mn rows
.pageviewgrouped 500000 rows
.catalog_school_class 709970 to 1 mn rows
.catalog_groupp 3000 rows
.table_3_has_catalog_minority 709970 to 1 mn rows
.catalog_category_details 659 rows
Too much to put into a single comment, so I'll add here and adjust later as possibly needed... You have LEFT JOINs everywhere, but your WHERE clause is specifically qualifying fields from the Table_3, Catalog_School and Table_3_has_catalog_minority. This by default changes them to INNER JOINs.
With respect to your where clause
WHERE
table_3.status_ok = 1
AND catalog_school.status = 'active'
AND table_3_has_catalog_minority.is_primary = '1'
Which table/column would give the smallest result set based on these criteria? For example, Table_3.Status_ok = 1 might match 500k records, but table_3_has_catalog_minority.is_primary may match only 65k and catalog_school.status = 'active' may match 430k.
Also, some of your columns are not qualified with the table they come from, such as "id_catalog_school_class" and "product_number". Can you please confirm which tables they belong to?
SOMETIMES, changing the order of the tables, with good knowledge of the makeup of the data and in MySQL adding a "STRAIGHT_JOIN" keyword can improve performance. This was something I've had in the past working with gov't database of contracts and grants with 20+ million records and joining to about 15+ lookup tables. It went from hanging the server to getting the query finished in less than 2 hrs. Considering the amount of data I was dealing with, that was actually a good time.
AFTER dissecting this thing some, I restructured a bit more for readability, added aliases for table references and changed the order of the query and have some suggested indexes. To help the query, I tried moving the Catalog_School table to the first position and added the STRAIGHT_JOIN. The index is based on the STATUS first to match the WHERE clause, THEN I included the SKU as it is first element of the GROUP BY, then the other columns used to join to the subsequent tables. By having these columns in the index, it can qualify the joins without having to go to the raw data.
By changing the GROUP BY to catalog_school.sku instead of table1.sku, the index on catalog_school can be used to help optimize it. It is the same value, since the join is on catalog_school.sku = table1.sku. I also added index suggestions for table1 and table_3, again to qualify the joins up front without going to the raw data pages of the tables.
I would be interested in knowing the final performance (better or worse) from your data.
TABLE             INDEX ON...
catalog_school    ( status, sku, fk_table_3, id_catalog_school )
table1            ( sku, fk_container )
table_3           ( id_table_3, status_ok, fk_catalog_groupp )
SELECT STRAIGHT_JOIN
CS.sku,
CONCAT(CS.sku,' ',T1.fk_container ) as sku_container,
T1.price as price,
SUM( CASE WHEN ( T1.fk_table1_status IN ( 82, 119, 124, 141, 131))
THEN 1 ELSE 0 END)
/ COUNT( DISTINCT CSC.id_catalog_school_class) as qty_returned,
SUM( CASE WHEN ( T1.fk_table1_status In (23,13,44,65,6,75,8,171,12,166))
THEN 1 ELSE 0 END)
/ COUNT( DISTINCT CSC.id_catalog_school_class) as qt,
CS.id_catalog_school,
LEFT(CS.flight_fair,2) as departing_country,
CS.weight,
CS.flight_type,
CS.price,
T3.id_table_3,
T3.fk_catalog_brand,
MAX( LEFT( T3.note,3 )) AS supplier,
C.id_container as container_id,
C.idden as container_idden,
C.delivery_badge,
GROUP_CONCAT( product_number, ' by ',FORMAT(CSC.quantity,0)
ORDER BY product_number ASC SEPARATOR ' + ') as supplier_prod,
Sum( distinct( CSC.purch_pri * CSC.quantity)) AS final_purch_pri,
CGP.idden as supplier_idden,
CCD.id_catalog_category,
CCD.cat1 as product_cat1,
CCD.cat2 as product_cat2,
COUNT( distinct CSC.id_catalog_school_class) as setinfo,
PVG.pv as page_views,
Sum(distinct(CSC.purch_pri * CSC.quantity)) AS purch_pri,
CHT3.position,
max( T1.created_at ) as last_order_date
FROM
catalog_school CS
JOIN table1 T1
ON CS.sku = T1.sku
LEFT JOIN container C
ON T1.fk_container = C.id_container
LEFT JOIN catalog_school_class CSC
ON CS.id_catalog_school = CSC.fk_catalog_school
JOIN table_3 T3
ON CS.fk_table_3 = T3.id_table_3
JOIN table_3_has_catalog_minority T3HCM
ON T3.id_table_3 = T3HCM.fk_table_3
LEFT JOIN datab1.catalog_category_details CCD
ON T3HCM.fk_catalog_category = CCD.id_catalog_category
LEFT JOIN container_has_table_3 CHT3
ON T3.id_table_3 = CHT3.fk_table_3
LEFT JOIN datab1.pageviewgrouped PVG
on T3.id_table_3 = PVG.url
LEFT JOIN catalog_groupp CGP
ON T3.fk_catalog_groupp = CGP.id_catalog_groupp
WHERE
CS.status = 'active'
AND T3.status_ok = 1
AND T3HCM.is_primary = '1'
GROUP BY
CS.sku,
T1.fk_container;