MySQL row count - mysql

I have a very large table (~1 000 000 rows) and a complicated query with unions, joins and WHERE clauses (the user can select different ORDER BY columns and directions). I need a row count for pagination. If I run the query without counting rows, it completes very quickly. What is the fastest way to implement pagination?
I tried EXPLAIN SELECT and SHOW TABLE STATUS to get an approximate row count, but the estimate is very different from the real row count.
My query looks like this (simplified):
SELECT * FROM (
(
SELECT * FROM table_1
LEFT JOIN `table_a` ON table_1.record_id = table_a.id
LEFT JOIN `table_b` ON table_a.id = table_b.record_id
WHERE table_1.a > 10 AND table_a.b < 500 AND table_b.c = 1
ORDER BY x ASC
LIMIT 0, 10
)
UNION
(
SELECT * FROM table_2
LEFT JOIN `table_a` ON table_2.record_id = table_a.id
LEFT JOIN `table_b` ON table_a.id = table_b.record_id
WHERE table_2.d < 10 AND table_a.e > 500 AND table_b.f = 1
ORDER BY x ASC
LIMIT 0, 10
)
) tbl ORDER BY x ASC LIMIT 0, 10
Without the limit, the query returns about 100 000 rows. How can I get this approximate count in the fastest way?
My production query looks like this:
SELECT SQL_CALC_FOUND_ROWS * FROM (
(
SELECT
articles_log.id AS log_id, articles_log.source_table,
articles_log.record_id AS id, articles_log.dat AS view_dat,
articles_log.lang AS view_lang, '1' AS view_count, '1' AS unique_view_count,
articles_log.user_agent, articles_log.ref, articles_log.ip,
articles_log.ses_id, articles_log.bot, articles_log.source_type, articles_log.link,
articles_log.user_country, articles_log.user_platform,
articles_log.user_os, articles_log.user_browser,
`contents`.dat AS source_dat, `contents_trans`.header, `contents_trans`.custom_text
FROM articles_log
INNER JOIN `contents` ON articles_log.record_id = `contents`.id
AND articles_log.source_table = 'contents'
INNER JOIN `contents_trans` ON `contents`.id = `contents_trans`.record_id
AND `contents_trans`.lang='lv'
WHERE articles_log.dat > 0
AND articles_log.dat >= 1488319200
AND articles_log.dat <= 1489355999
AND articles_log.bot = '0'
AND (articles_log.record_id NOT LIKE '%\_404' AND articles_log.record_id <> '404'
OR articles_log.source_table <> 'contents')
)
UNION
(
SELECT
articles_log.id AS log_id, articles_log.source_table,
articles_log.record_id AS id, articles_log.dat AS view_dat,
articles_log.lang AS view_lang, '1' AS view_count, '1' AS unique_view_count,
articles_log.user_agent, articles_log.ref, articles_log.ip,
articles_log.ses_id, articles_log.bot,
articles_log.source_type, articles_log.link,
articles_log.user_country, articles_log.user_platform,
articles_log.user_os, articles_log.user_browser,
`news`.dat AS source_dat, `news_trans`.header, `news_trans`.custom_text
FROM articles_log
INNER JOIN `news` ON articles_log.record_id = `news`.id
AND articles_log.source_table = 'news'
INNER JOIN `news_trans` ON `news`.id = `news_trans`.record_id
AND `news_trans`.lang='lv'
WHERE articles_log.dat > 0
AND articles_log.dat >= 1488319200
AND articles_log.dat <= 1489355999
AND articles_log.bot = '0'
AND (articles_log.record_id NOT LIKE '%\_404' AND articles_log.record_id <> '404'
OR articles_log.source_table <> 'contents')
)
) tbl ORDER BY view_dat ASC LIMIT 0, 10
Many thanks!

If you can use UNION ALL instead of UNION (which is a shortcut for UNION DISTINCT), in other words, if you don't need to remove duplicates, you can try adding the counts of the two subqueries:
SELECT
(
SELECT COUNT(*) FROM table_1
LEFT JOIN `table_a` ON table_1.record_id = table_a.id
LEFT JOIN `table_b` ON table_a.id = table_b.record_id
WHERE table_1.a > 10 AND table_a.b < 500 AND table_b.c = 1
)
+
(
SELECT COUNT(*) FROM table_2
LEFT JOIN `table_a` ON table_2.record_id = table_a.id
LEFT JOIN `table_b` ON table_a.id = table_b.record_id
WHERE table_2.d < 10 AND table_a.e > 500 AND table_b.f = 1
)
AS cnt
Without ORDER BY and without UNION the engine might not need to create a huge temp table.
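To illustrate when adding the branch counts is safe, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the tiny tables and values are made up for the demonstration. The summed count matches the UNION ALL row count, but not the UNION count once the branches share a row.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (record_id INTEGER, a INTEGER);
    CREATE TABLE table_2 (record_id INTEGER, d INTEGER);
    INSERT INTO table_1 VALUES (1, 11), (2, 12), (3, 5);
    INSERT INTO table_2 VALUES (1, 3), (4, 4), (5, 50);
""")

# Sum of the per-branch counts: no ORDER BY, no combined result set.
summed = conn.execute("""
    SELECT (SELECT COUNT(*) FROM table_1 WHERE a > 10)
         + (SELECT COUNT(*) FROM table_2 WHERE d < 10) AS cnt
""").fetchone()[0]

# Row count of the combined result when duplicates are kept.
union_all = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT record_id FROM table_1 WHERE a > 10
        UNION ALL
        SELECT record_id FROM table_2 WHERE d < 10
    )
""").fetchone()[0]

# Row count when duplicates are removed (record_id 1 is in both branches).
union_distinct = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT record_id FROM table_1 WHERE a > 10
        UNION
        SELECT record_id FROM table_2 WHERE d < 10
    )
""").fetchone()[0]

print(summed, union_all, union_distinct)  # 4 4 3
```

So the shortcut only gives the pagination total if the two branches can never produce identical rows, or if duplicates are acceptable.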
Update
For your original query, try the following:
Select only COUNT(*).
Remove OR articles_log.source_table <> 'contents' from the first part (contents), since we know it is never true there.
Remove AND (articles_log.record_id NOT LIKE '%\_404' AND articles_log.record_id <> '404' OR articles_log.source_table <> 'contents') from the second part (news), since OR articles_log.source_table <> 'contents' is always true there, which makes the whole condition always true.
Remove the joins with contents and news. You can join the *_trans tables directly using record_id.
Remove articles_log.dat > 0, since it is redundant with articles_log.dat >= 1488319200.
The resulting query:
SELECT (
SELECT COUNT(*)
FROM articles_log
INNER JOIN `contents_trans`
ON `contents_trans`.record_id = articles_log.record_id
AND `contents_trans`.lang='lv'
WHERE articles_log.bot = '0'
AND articles_log.dat >= 1488319200
AND articles_log.dat <= 1489355999
AND articles_log.record_id NOT LIKE '%\_404'
AND articles_log.record_id <> '404'
) + (
SELECT COUNT(*)
FROM articles_log
INNER JOIN `news_trans`
ON `news_trans`.record_id = articles_log.record_id
AND `news_trans`.lang='lv'
WHERE articles_log.bot = '0'
AND articles_log.dat >= 1488319200
AND articles_log.dat <= 1489355999
) AS cnt
Try the following index combinations:
articles_log(bot, dat, record_id)
contents_trans(lang, record_id)
news_trans(lang, record_id)
or
contents_trans(lang, record_id)
news_trans(lang, record_id)
articles_log(record_id, bot, dat)
Which combination is better depends on the data.
I might be wrong on one or more points, since I don't know your data and business logic. If so, try to adjust the others.

You can get the count when you run the query using SQL_CALC_FOUND_ROWS, as explained in the documentation:
select SQL_CALC_FOUND_ROWS *
. . .
And then running:
select FOUND_ROWS()
However, the first run needs to generate all the data, so you are going to get up to 20 possible rows -- I don't think it respects LIMIT in subqueries.
Given the structure of your query and what you want to do, I would think first about optimizing the query. For instance, is UNION really needed (it incurs overhead for removing duplicates)? As pointed out in a comment, your joins are really inner joins disguised as outer joins. Indexes might improve performance.
You might want to ask another question, providing sample data and desired results to get advice on such issues.

Related

mysql Multiple left joins using count

I have been researching this for hours, and the best code I have come up with is this, from an example I found on overstack. I have been through several derivations, but the following is the only query that returns the correct data. The problem is that it takes over 139s (more than 2 minutes) to return only 30 rows of data. I'm stuck. (life_p is a 'likes'
SELECT
logos.id,
logos.in_gallery,
logos.active,
logos.pubpriv,
logos.logo_name,
logos.logo_image,
coalesce(cc.Count, 0) as CommentCount,
coalesce(lc.Count, 0) as LikeCount
FROM logos
left outer join(
select comments.logo_id, count( * ) as Count from comments group by comments.logo_id
) cc on cc.logo_id = logos.id
left outer join(
select life_p.logo_id, count( * ) as Count from life_p group by life_p.logo_id
) lc on lc.logo_id = logos.id
WHERE logos.active = '1'
AND logos.pubpriv = '0'
GROUP BY logos.id
ORDER BY logos.in_gallery desc
LIMIT 0, 30
I'm not sure what's wrong. If I do them singly, meaning I remove the COALESCE and one of the joins:
SELECT
logos.id,
logos.in_gallery,
logos.active,
logos.pubpriv,
logos.logo_name,
logos.logo_image,
count( * ) as lc
FROM logos
left join life_p on life_p.logo_id = logos.id
WHERE logos.active = '1'
AND logos.pubpriv = '0'
GROUP BY logos.id
ORDER BY logos.in_gallery desc
LIMIT 0, 30
that runs in less than half a second (200-300 ms).
Here is a link to the explain: https://logopond.com/img/explain.png
MySQL has a peculiar quirk that allows a group by clause that does not list all non-aggregating columns. This is NOT a good thing and you should always specify ALL non-aggregating columns in the group by clause.
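Note that when counting over joined tables, COUNT(expr) ignores NULLs, whereas COUNT(*) counts every row, including the NULL-extended row that a LEFT JOIN produces for a logo with no matches. So don't use COUNT(*); count a column from the joined table instead. Because two one-to-many tables are joined here, each logo's comments are also multiplied by its likes, so count DISTINCT values of a unique key from each joined table. From these points I would suggest the following query structure (assuming comments and life_p each have a unique id column).
SELECT
logos.id
, logos.in_gallery
, logos.active
, logos.pubpriv
, logos.logo_name
, logos.logo_image
, COUNT(DISTINCT cc.id) AS CommentCount -- assumes comments.id is unique; COUNT never returns NULL, so no COALESCE
, COUNT(DISTINCT lc.id) AS LikeCount -- assumes life_p.id is unique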
FROM logos
LEFT OUTER JOIN comments cc ON cc.logo_id = logos.id
LEFT OUTER JOIN life_p lc ON lc.logo_id = logos.id
WHERE logos.active = '1'
AND logos.pubpriv = '0'
GROUP BY
logos.id
, logos.in_gallery
, logos.active
, logos.pubpriv
, logos.logo_name
, logos.logo_image
ORDER BY logos.in_gallery DESC
LIMIT 0, 30
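The difference between COUNT(*), COUNT(column) and COUNT(DISTINCT key) over two LEFT JOINs is easy to see on a toy data set. This is only a sketch: it runs on SQLite through Python's sqlite3 module rather than on MySQL, and the id columns on comments and life_p are assumptions, since the question does not show the table definitions.

```python
import sqlite3

# SQLite stand-in for MySQL; the schema is assumed for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE logos (id INTEGER PRIMARY KEY);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, logo_id INTEGER);
    CREATE TABLE life_p (id INTEGER PRIMARY KEY, logo_id INTEGER);
    INSERT INTO logos VALUES (1), (2);
    INSERT INTO comments (logo_id) VALUES (1), (1);      -- logo 1: 2 comments
    INSERT INTO life_p (logo_id) VALUES (1), (1), (1);   -- logo 1: 3 likes
""")

rows = conn.execute("""
    SELECT logos.id,
           COUNT(*)              AS star_count,     -- counts NULL-extended rows too
           COUNT(cc.logo_id)     AS plain_count,    -- ignores NULLs, but sees 2 x 3 rows
           COUNT(DISTINCT cc.id) AS comment_count,  -- one per distinct comment
           COUNT(DISTINCT lc.id) AS like_count      -- one per distinct like
    FROM logos
    LEFT JOIN comments cc ON cc.logo_id = logos.id
    LEFT JOIN life_p lc ON lc.logo_id = logos.id
    GROUP BY logos.id
    ORDER BY logos.id
""").fetchall()

for row in rows:
    print(row)
# (1, 6, 6, 2, 3)
# (2, 1, 0, 0, 0)
```

Logo 2 has no comments or likes: COUNT(*) still reports 1, while the column counts report 0. Logo 1 shows the join fan-out: 2 comments times 3 likes gives 6 joined rows, and only the DISTINCT counts recover 2 and 3.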
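If you continue to have performance issues, look at the execution plan (EXPLAIN) and consider adding indexes to suit.
You can create indexes on the joining fields:
ALTER TABLE tableName ADD INDEX idx__tableName__fieldName (fieldName);
In your case that would be something like this (cc and lc are only aliases; the indexes go on the underlying tables):
ALTER TABLE comments ADD INDEX idx__comments__logo_id (logo_id);
ALTER TABLE life_p ADD INDEX idx__life_p__logo_id (logo_id);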
I don't really like it, because I've always read that subqueries are bad and that joins perform better under stress, but in this particular case a subquery seems to be the only way to pull the correct data in under half a second consistently. Thanks for the suggestions, everyone.
SELECT
logos.id,
logos.in_gallery,
logos.active,
logos.pubpriv,
logos.logo_name,
logos.logo_image,
(Select COUNT(comments.logo_id) FROM comments
WHERE comments.logo_id = logos.id) AS coms,
(Select COUNT(life_p.logo_id) FROM life_p
WHERE life_p.logo_id = logos.id) AS floats
FROM logos
WHERE logos.active = '1' AND logos.pubpriv = '0'
ORDER BY logos.in_gallery desc
LIMIT ". $start .",". $pageSize ."
You can also create a mapping table to speed up your query. Try:
CREATE TABLE mapping_comments AS
SELECT
comments.logo_id,
count(*) AS Count
FROM
comments
GROUP BY
comments.logo_id
Then change your code. The derived table
left outer join( ... ) cc on cc.logo_id = logos.id
should become
left outer join mapping_comments as cc on cc.logo_id = logos.id
Then each time a new comment is added to the comments table you need to update your mapping table, or you can create a trigger (optionally calling a stored procedure) to do it automatically when the comments table changes.

Count on multiple tables with missing zero counts

I am running this query to return data with count < 50. It works fine while the count is > 0 and < 50, but when the count becomes 0, it does not return the data. The count is defined by `coupons`.`status`. When the count is zero, there is no data in the coupons table with status 1. This creates the issue, as the whole row is omitted.
SELECT count(*) AS count, clients.title, plans.name
FROM `coupons`
INNER JOIN `clients` ON `coupons`.`client_id` = `clients`.`id`
INNER JOIN `plans` ON `coupons`.`plan_id` = `plans`.`id`
WHERE `coupons`.`status` = 1
GROUP BY `coupons`.`client_id`, `coupons`.`plan_id`
HAVING count < 50
Please help how to fix it.
Table definitions.
coupons (id, client_id, plan_id, customer_id, status, code)
plans (id, name)
clients (id, name...)
client_plans (id, client_id, plan_id)
Basically, a client can have multiple plans and a plan can belong to multiple clients.
The coupons table stores predefined coupons which can be allocated to customers. Non-allocated coupons have status 0, whereas allocated coupons get status 1.
Here I am trying to fetch the non-allocated coupon count per client and per plan, where the count is either less than 50 or has reached 0.
For example:
If the coupons table has 10 rows with client_id = 1 and plan_id = 1 and status 1, it should return a count of 10, but when the table has 0 rows with client_id = 1 and plan_id = 1 and status 1, the above query does not return anything.
Thank you all for your inputs, this worked.
select
sum(CASE WHEN `coupons`.`status` = 1 THEN 1 ELSE 0 END) as count,
clients.title,
plans.name
from
`clients`
left join
`coupons`
on
`coupons`.`client_id` = `clients`.`id`
left join
`plans`
on
`coupons`.`plan_id` = `plans`.`id`
group by
`coupons`.`client_id`,
`coupons`.`plan_id`
having
count < 50
With the inner joins, the query is not going to return any "zero" counts.
If you want to return "zero" counts, you are going to need an outer join somewhere.
But it's not clear what you are actually trying to count.
Assuming that what you are trying to get is a count of rows from coupons, for every possible combination of rows from plans and clients, you could do something like this:
SELECT COUNT(`coupons`.`client_id`) AS `count`
, clients.title
, plans.name
FROM `plans`
CROSS
JOIN `clients`
LEFT
JOIN `coupons`
ON `coupons`.`client_id` = `clients`.`id`
AND `coupons`.`plan_id` = `plans`.`id`
AND `coupons`.`status` = 1
GROUP
BY `clients`.`id`
, `plans`.`id`
HAVING `count` < 50
This is just a guess at result set you are expecting to return. Absent table definitions, example data, and the expected result, we're just guessing.
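Why the status filter has to sit in the ON clause rather than in WHERE can be shown with a small sketch. It uses SQLite via Python's sqlite3 module as a stand-in for MySQL, and the rows are invented for the demonstration ('Acme', 'Basic' and 'Pro' are hypothetical values).

```python
import sqlite3

# SQLite stand-in for MySQL; sample data is made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE plans (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE coupons (id INTEGER PRIMARY KEY, client_id INTEGER,
                          plan_id INTEGER, status INTEGER);
    INSERT INTO clients VALUES (1, 'Acme');
    INSERT INTO plans VALUES (1, 'Basic'), (2, 'Pro');
    INSERT INTO coupons (client_id, plan_id, status)
    VALUES (1, 1, 1), (1, 1, 1), (1, 2, 0);
""")

# Filter inside ON: unmatched combinations survive with a count of 0.
filter_in_on = conn.execute("""
    SELECT clients.title, plans.name, COUNT(coupons.id) AS cnt
    FROM plans
    CROSS JOIN clients
    LEFT JOIN coupons
           ON coupons.client_id = clients.id
          AND coupons.plan_id = plans.id
          AND coupons.status = 1
    GROUP BY clients.id, plans.id
    HAVING COUNT(coupons.id) < 50
    ORDER BY plans.id
""").fetchall()

# Filter inside WHERE: the NULL-extended rows are discarded again,
# so the outer join behaves like an inner join and the zero row vanishes.
filter_in_where = conn.execute("""
    SELECT clients.title, plans.name, COUNT(coupons.id) AS cnt
    FROM plans
    CROSS JOIN clients
    LEFT JOIN coupons
           ON coupons.client_id = clients.id
          AND coupons.plan_id = plans.id
    WHERE coupons.status = 1
    GROUP BY clients.id, plans.id
    ORDER BY plans.id
""").fetchall()

print(filter_in_on)     # [('Acme', 'Basic', 2), ('Acme', 'Pro', 0)]
print(filter_in_where)  # [('Acme', 'Basic', 2)]
```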
FOLLOWUP
Based on your comment, it sounds like you want conditional aggregation.
To "count" only the rows in coupons that have status=1, you can do something like this:
SELECT SUM( `coupons`.`status` = 1 ) AS `count`
, clients.title
, plans.name
FROM `coupons`
JOIN `plans`
ON `plans`.`id` = `coupons`.`plan_id`
JOIN `clients`
ON `clients`.`id` = `coupons`.`client_id`
GROUP
BY `clients`.`id`
, `plans`.`id`
HAVING `count` < 50
There are other expressions you can use to get the conditional "count". For example
SELECT COUNT( IF(`coupons`.`status`=1, 1, NULL) ) AS `count`
or
SELECT SUM( IF(`coupons`.`status`=1, 1, 0) ) AS `count`
or, for a more ANSI standards compatible approach
SELECT SUM( CASE WHEN `coupons`.`status` = 1 THEN 1 ELSE 0 END ) AS `count`
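As a quick check that these conditional-count expressions agree, here is a sketch on SQLite through Python's sqlite3 module (a stand-in for MySQL, with made-up rows). MySQL's IF() is not available in SQLite, so the IF variants are written with CASE here; the boolean SUM form works in both.

```python
import sqlite3

# SQLite stand-in for MySQL; only the columns needed for the demo are created.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE coupons (plan_id INTEGER, status INTEGER);
    INSERT INTO coupons VALUES (1, 1), (1, 1), (1, 0), (2, 0), (2, 0);
""")

counts = conn.execute("""
    SELECT plan_id,
           SUM(status = 1)                              AS by_boolean_sum,
           COUNT(CASE WHEN status = 1 THEN 1 END)       AS by_count_case,
           SUM(CASE WHEN status = 1 THEN 1 ELSE 0 END)  AS by_sum_case
    FROM coupons
    GROUP BY plan_id
    ORDER BY plan_id
""").fetchall()

print(counts)  # [(1, 2, 2, 2), (2, 0, 0, 0)]
```

Plan 2 has rows, but none with status 1, and every variant reports 0 for it instead of dropping the group. A group with no rows in coupons at all would still be missing, which is what the outer join above is for.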

Order a Group BY

There are a lot of questions on this topic, but I still can't figure out a way to make this work.
The query I'm doing is:
SELECT `b`.`ads_id` AS `ads_id`,
`b`.`bod_bedrag` AS `bod_bedrag`,
`a`. `ads_naam` AS `ads_naam`,
`a`.`ads_url` AS `ads_url`,
`a`.`ads_prijs` AS `ads_price`,
`i`.`url` AS `img_url`,
`c`.`url` AS `cat_url`
FROM `ads_market_bids` AS `b`
INNER JOIN `ads_market` AS `a`
ON `b`.`ads_id` = `a`.`id`
INNER JOIN `ads_images` AS `i`
ON `b`.`ads_id` = `i`.`ads_id`
INNER JOIN `ads_categories` AS `c`
ON `a`.`cat_id` = `c`.`id`
WHERE `i`.`img_order` = '0'
AND `b`.`u_id` = '285'
GROUP BY `b`.`ads_id`
HAVING MAX(b.bod_bedrag)
ORDER BY `b`.`bod_bedrag` ASC
But the problem I keep seeing is that I need b.bod_bedrag to be sorted before the GROUP BY takes place, or something like that. I don't know how to explain it exactly.
The bod_bedrag values I'm getting now are the lowest of the bids in the table. I need the highest.
I have tried everything, even thought of not grouping but using DISTINCT. That didn't work either. I tried ORDER BY MAX, and everything else I know or could find on the internet.
Image 1 is the situation without the group by. Order By works great (ofc).
Image 2 is with the group by. As you can see, the lowest bid is taken as bod_bedrag. I need the highest.
Judging by your output you want:
SELECT amb.ads_id,
MAX(amb.bod_bedrag) max_bod_bedrag,
am.ads_naam,
am.ads_url,
am.ads_prijs ads_price,
ai.url img_url,
ac.url cat_url
FROM ads_market_bids amb
JOIN ads_images ai
ON ai.ads_id = amb.ads_id
AND ai.img_order = 0
JOIN ads_market am
ON am.id = amb.ads_id
JOIN ads_categories ac
ON ac.id = am.cat_id
WHERE amb.u_id = 285
GROUP BY amb.ads_id,
am.ads_naam,
am.ads_url,
am.ads_prijs,
ai.url,
ac.url
ORDER BY max_bod_bedrag ASC
I have also removed all the unnecessary backtickery and the aliasing of columns to the same name.
Your HAVING was doing nothing, as all the groups 'have' a MAX(amb.bod_bedrag).
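The core of this answer, taking MAX() per group and ordering by the aggregate, can be tried on a reduced version of the bids table. The sketch below uses SQLite via Python's sqlite3 module instead of MySQL, keeps only the three columns that matter, and uses invented bid values.

```python
import sqlite3

# SQLite stand-in for MySQL; reduced, hypothetical version of ads_market_bids.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ads_market_bids (ads_id INTEGER, u_id INTEGER, bod_bedrag INTEGER);
    INSERT INTO ads_market_bids VALUES
        (10, 285, 5), (10, 285, 20),
        (11, 285, 7), (11, 285, 3),
        (12, 999, 100);
""")

# One row per ad for user 285, carrying that user's highest bid,
# sorted by the aggregated value rather than by an arbitrary row's value.
highest_bids = conn.execute("""
    SELECT ads_id, MAX(bod_bedrag) AS max_bod_bedrag
    FROM ads_market_bids
    WHERE u_id = 285
    GROUP BY ads_id
    ORDER BY max_bod_bedrag ASC
""").fetchall()

print(highest_bids)  # [(11, 7), (10, 20)]
```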
select distinct `b`.`ads_id` as `ads_id`, max(`b`.`bod_bedrag`) as `bod_bedrag`,
`a`.`ads_naam` as `ads_naam`, `a`.`ads_url` as `ads_url`, `a`.`ads_prijs` as `ads_price`,
`i`.`url` as `img_url`, `c`.`url` as `cat_url`
from `ads_market_bids` as `b`
inner join `ads_market` as `a` on `b`.`ads_id` = `a`.`id`
inner join `ads_images` as `i` on `b`.`ads_id` = `i`.`ads_id`
inner join `ads_categories` as `c` on `a`.`cat_id` = `c`.`id`
where `i`.`img_order` = '0' and `b`.`u_id` = '285'
group by b.ads_id, a.ads_naam, a.ads_url, a.ads_prijs, i.url, c.url
One approach is to simulate ROW_NUMBER() (which MySQL lacked before version 8.0). Unlike aggregates, which may come from disparate source records, it lets you select by record. It works by adding two variables to each row (it does not increase the number of rows). Then, using an ordered subquery, RN is set to 1 for the row with the highest b.bod_bedrag of each b.ads_id; all other rows per b.ads_id get a higher RN value. At the end we filter on RN = 1 (which equates to the record containing the highest bid value).
SELECT *
FROM (
SELECT
-- restart numbering at 1 whenever ads_id changes
@row_num := IF(@prev_value = `b`.`ads_id`, @row_num + 1, 1) AS RN
,`b`.`ads_id` AS `ads_id`
,`b`.`bod_bedrag` AS `bod_bedrag`
,`a`.`ads_naam` AS `ads_naam`
,`a`.`ads_url` AS `ads_url`
,`a`.`ads_prijs` AS `ads_price`
,`i`.`url` AS `img_url`
,`c`.`url` AS `cat_url`
-- remember the group key (ads_id), not the bid, for the next row
, @prev_value := `b`.`ads_id` AS prev_ads_id
FROM `ads_market_bids` AS `b`
INNER JOIN `ads_market` AS `a` ON `b`.`ads_id` = `a`.`id`
INNER JOIN `ads_images` AS `i` ON `b`.`ads_id` = `i`.`ads_id`
INNER JOIN `ads_categories` AS `c` ON `a`.`cat_id` = `c`.`id`
CROSS JOIN
( SELECT @row_num := 1
, @prev_value := ''
) vars
WHERE `i`.`img_order` = '0'
AND `b`.`u_id` = '285'
ORDER BY `b`.`ads_id`, `b`.`bod_bedrag` DESC
) ranked
WHERE RN = 1;
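MySQL user variables cannot be run outside MySQL, so here is the same running-variable logic spelled out in plain Python as a sketch. The function name top_bid_per_ad and the sample bids are made up; the two local variables play the roles of the row number and previous-value variables in the query.

```python
# Hypothetical sample rows standing in for the joined bid records.
bids = [
    {"ads_id": 10, "bod_bedrag": 5},
    {"ads_id": 10, "bod_bedrag": 20},
    {"ads_id": 11, "bod_bedrag": 7},
    {"ads_id": 11, "bod_bedrag": 3},
]


def top_bid_per_ad(rows):
    """Keep the row numbered 1 in each ads_id group (the highest bid)."""
    # Equivalent of ORDER BY ads_id, bod_bedrag DESC.
    ordered = sorted(rows, key=lambda r: (r["ads_id"], -r["bod_bedrag"]))
    row_num, prev_value = 0, None
    result = []
    for row in ordered:
        # Restart at 1 when the group key changes, otherwise increment.
        row_num = row_num + 1 if row["ads_id"] == prev_value else 1
        # Remember the group key for the next row.
        prev_value = row["ads_id"]
        if row_num == 1:  # equivalent of WHERE RN = 1
            result.append(row)
    return result


top = top_bid_per_ad(bids)
print(top)
```

The sort order is what makes row number 1 the highest bid. If the previous-value variable tracked the bid instead of the ads_id, the numbering would restart on almost every row and nearly every row would pass the filter.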
You can even turn off that silly GROUP BY extension, details in the man page:
MySQL Extensions to GROUP BY

MySQL Inner Join with where clause sorting and limit, subquery?

Everything in the following query results in one line for each invBlueprintTypes row with the correct information. But I'm trying to add something to it. See below the codeblock.
Select
blueprintType.typeID,
blueprintType.typeName Blueprint,
productType.typeID,
productType.typeName Item,
productType.portionSize,
blueprintType.basePrice * 0.9 As bpoPrice,
productGroup.groupName ItemGroup,
productCategory.categoryName ItemCategory,
blueprints.productionTime,
blueprints.techLevel,
blueprints.researchProductivityTime,
blueprints.researchMaterialTime,
blueprints.researchCopyTime,
blueprints.researchTechTime,
blueprints.productivityModifier,
blueprints.materialModifier,
blueprints.wasteFactor,
blueprints.maxProductionLimit,
blueprints.blueprintTypeID
From
invBlueprintTypes As blueprints
Inner Join invTypes As blueprintType On blueprints.blueprintTypeID = blueprintType.typeID
Inner Join invTypes As productType On blueprints.productTypeID = productType.typeID
Inner Join invGroups As productGroup On productType.groupID = productGroup.groupID
Inner Join invCategories As productCategory On productGroup.categoryID = productCategory.categoryID
Where
blueprints.techLevel = 1 And
blueprintType.published = 1 And
productType.marketGroupID Is Not Null And
blueprintType.basePrice > 0
So what I need to get in here is the following table with the columns below it so I can use the values timestamp and sort the entire result by profitHour
tablename: invBlueprintTypesPrices
columns: blueprintTypeID, timestamp, profitHour
I need this information with the following select in mind. Using a select to show my intention of the JOIN/in-query select or whatever that can do this.
SELECT * FROM invBlueprintTypesPrices
WHERE blueprintTypeID = blueprintType.typeID
ORDER BY timestamp DESC LIMIT 1
And I need the main row from table invBlueprintTypes to still show even if there is no result from invBlueprintTypesPrices. The LIMIT 1 is because I want the newest row possible, but deleting the older data is not an option since the history is needed.
If I've understood correctly, I think I need a subquery select, but how do I do that? I've tried adding the exact query above, with AS blueprintPrices after the query's closing ), but it did not work and gave an error, with the
WHERE blueprintTypeID = blueprintType.typeID
part being the focus of the error. I have no idea why. Can anyone solve this?
You'll need to use a LEFT JOIN to check for NULL values in invBlueprintTypesPrices. To mimic the LIMIT 1 per TypeId, you can use the MAX() or to truly make sure you only return a single record, use a row number -- this depends on whether you can have multiple max time stamps for each type id. Assuming not, then this should be close:
Select
...
From
invBlueprintTypes As blueprints
Inner Join invTypes As blueprintType On blueprints.blueprintTypeID = blueprintType.typeID
Inner Join invTypes As productType On blueprints.productTypeID = productType.typeID
Inner Join invGroups As productGroup On productType.groupID = productGroup.groupID
Inner Join invCategories As productCategory On productGroup.categoryID = productCategory.categoryID
Left Join (
SELECT MAX(timestamp) MaxTime, blueprintTypeID
FROM invBlueprintTypesPrices
GROUP BY blueprintTypeID
) blueprintTypePrice On blueprints.blueprintTypeID = blueprintTypePrice.blueprintTypeID
Left Join invBlueprintTypesPrices blueprintTypePrices On
blueprintTypePrice.blueprintTypeID = blueprintTypePrices.blueprintTypeID AND
blueprintTypePrice.MaxTime = blueprintTypePrices.timestamp
Where
blueprints.techLevel = 1 And
blueprintType.published = 1 And
productType.marketGroupID Is Not Null And
blueprintType.basePrice > 0
Order By
blueprintTypePrices.profitHour
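The "latest row per group via MAX() and two left joins" pattern can be checked on a cut-down schema. This sketch runs on SQLite through Python's sqlite3 module rather than MySQL, uses the column names given in the question (blueprintTypeID, timestamp, profitHour), and the sample values are invented.

```python
import sqlite3

# SQLite stand-in for MySQL; only the columns relevant to the join are kept.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invBlueprintTypes (blueprintTypeID INTEGER PRIMARY KEY);
    CREATE TABLE invBlueprintTypesPrices (
        blueprintTypeID INTEGER, timestamp INTEGER, profitHour REAL);
    INSERT INTO invBlueprintTypes VALUES (1), (2);
    -- blueprint 1 has two price rows, blueprint 2 has none
    INSERT INTO invBlueprintTypesPrices VALUES (1, 100, 5.0), (1, 200, 9.0);
""")

latest_prices = conn.execute("""
    SELECT b.blueprintTypeID, p.timestamp, p.profitHour
    FROM invBlueprintTypes b
    LEFT JOIN (
        SELECT blueprintTypeID, MAX(timestamp) AS MaxTime
        FROM invBlueprintTypesPrices
        GROUP BY blueprintTypeID
    ) latest ON latest.blueprintTypeID = b.blueprintTypeID
    LEFT JOIN invBlueprintTypesPrices p
           ON p.blueprintTypeID = latest.blueprintTypeID
          AND p.timestamp = latest.MaxTime
    ORDER BY b.blueprintTypeID
""").fetchall()

print(latest_prices)  # [(1, 200, 9.0), (2, None, None)]
```

Blueprint 2 stays in the result with NULL price columns, which is the "still show even if there is no result" requirement from the question.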
Assuming you might have the same max timestamp on two different records, replace the two left joins above with something like this, which computes a row number per blueprintTypeID:
Left Join (
SELECT @rn:=IF(@prevTypeId=blueprintTypeID,@rn+1,1) rn,
timestamp,
blueprintTypeID,
profitHour,
@prevTypeId:=blueprintTypeID AS prevTypeId
FROM (SELECT *
FROM invBlueprintTypesPrices
ORDER BY blueprintTypeID, timestamp DESC) t
JOIN (SELECT @rn:=0, @prevTypeId:=NULL) t2
) blueprintTypePrices On blueprints.blueprintTypeID = blueprintTypePrices.blueprintTypeID AND blueprintTypePrices.rn=1
You don't say where you are putting the subquery. If in the select clause, then you have a problem because you are returning more than one value.
You can't put this into the from clause directly, because you have a correlated subquery (not allowed).
Instead, you can put it in like this:
from . . .
(select *
from invBLueprintTypesPrices ibptp
where ibptp.timestamp = (select ibptp2.timestamp
from invBLueprintTypesPrices ibptp2
where ibptp.blueprintTypeId = ibptp2.blueprintTypeId
order by timestamp desc
limit 1
)
) ibptp
on ibptp.blueprintTypeId = blueprintType.TypeID
This identifies the most recent records for all the blueprintTypeids in the subquery. It then joins in the one that matches.

MySQL Update query with left join and group by

I am trying to create an update query and making little progress in getting the right syntax.
The following query is working:
SELECT t.Index1, t.Index2, COUNT( m.EventType )
FROM Table t
LEFT JOIN MEvents m ON
(m.Index1 = t.Index1 AND
m.Index2 = t.Index2 AND
(m.EventType = 'A' OR m.EventType = 'B')
)
WHERE (t.SpecialEventCount IS NULL)
GROUP BY t.Index1, t.Index2
It creates a list of triplets Index1,Index2,EventCounts.
It only does this for case where t.SpecialEventCount is NULL. The update query I am trying to write should set this SpecialEventCount to that count, i.e. COUNT(m.EventType) in the query above. This number could be 0 or any positive number (hence the left join). Index1 and Index2 together are unique in Table t and they are used to identify events in MEvent.
How do I have to modify the select query to become an update query? I.e. something like
UPDATE Table SET SpecialEventCount=COUNT(m.EventType).....
but I am confused what to put where and have failed with numerous different guesses.
I take it that (Index1, Index2) is a unique key on Table, otherwise I would expect the reference to t.SpecialEventCount to result in an error.
Edited query to use subquery as it didn't work using GROUP BY
UPDATE
Table AS t
LEFT JOIN (
SELECT
Index1,
Index2,
COUNT(EventType) AS NumEvents
FROM
MEvents
WHERE
EventType = 'A' OR EventType = 'B'
GROUP BY
Index1,
Index2
) AS m ON
m.Index1 = t.Index1 AND
m.Index2 = t.Index2
SET
t.SpecialEventCount = COALESCE(m.NumEvents, 0) -- rows with no matching events get 0 instead of NULL
WHERE
t.SpecialEventCount IS NULL
Doing a left join with a subquery will generate a giant
temporary table in-memory that will have no indexes.
For updates, try avoiding joins and using correlated
subqueries instead:
UPDATE
Table AS t
SET
t.SpecialEventCount = (
SELECT COUNT(m.EventType)
FROM MEvents m
WHERE m.EventType in ('A','B')
AND m.Index1 = t.Index1
AND m.Index2 = t.Index2
)
WHERE
t.SpecialEventCount IS NULL
Do some profiling, but this can be significantly faster in some cases.
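A small sketch of the correlated-subquery update, run on SQLite via Python's sqlite3 module as a stand-in for MySQL. The table is named t here because the question's Table is only a placeholder name, and the rows are made up.

```python
import sqlite3

# SQLite stand-in for MySQL; "t" replaces the placeholder table name "Table".
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (Index1 INTEGER, Index2 INTEGER, SpecialEventCount INTEGER);
    CREATE TABLE MEvents (Index1 INTEGER, Index2 INTEGER, EventType TEXT);
    INSERT INTO t VALUES (1, 1, NULL), (1, 2, NULL), (2, 1, 7);
    INSERT INTO MEvents VALUES (1, 1, 'A'), (1, 1, 'B'), (1, 1, 'C'), (2, 1, 'A');
""")

# The subquery is evaluated once per row being updated; COUNT returns 0
# when nothing matches, so no COALESCE is needed.
conn.execute("""
    UPDATE t
    SET SpecialEventCount = (
        SELECT COUNT(m.EventType)
        FROM MEvents m
        WHERE m.EventType IN ('A', 'B')
          AND m.Index1 = t.Index1
          AND m.Index2 = t.Index2
    )
    WHERE SpecialEventCount IS NULL
""")

updated = conn.execute(
    "SELECT Index1, Index2, SpecialEventCount FROM t ORDER BY Index1, Index2"
).fetchall()
print(updated)  # [(1, 1, 2), (1, 2, 0), (2, 1, 7)]
```

The row (2, 1) keeps its existing value of 7 because it was not NULL, and (1, 2) gets 0 rather than staying NULL.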
My example:
update card_crowd as cardCrowd
LEFT JOIN
(
select cc.id , count(ccr.crowd_id) as num -- count a column of the joined table so crowds without rows get 0, not 1
from card_crowd cc LEFT JOIN
card_crowd_r ccr on cc.id = ccr.crowd_id
group by cc.id
) as tt
on cardCrowd.id = tt.id
set cardCrowd.join_num = tt.num;