This query takes more than 40 seconds to execute on a table that has 200k rows
SELECT
my_robots.*,
(
SELECT count(id)
FROM hpsi_trading
WHERE estado <= 1 and idRobot = my_robots.id
) as openorders,
apikeys.apikey,
apikeys.apisecret
FROM my_robots, apikeys
WHERE estado <= 1
and idRobot = '2'
and ready = '1'
and apikeys.id = my_robots.idApiKey
and (my_robots.id LIKE '%0'
OR my_robots.id LIKE '%1'
OR my_robots.id LIKE '%2')
I know it is because of the count inside the query, but how could i fix this efficiently.
Edit: Explain
Thanks.
Use GROUP BY instead
SELECT my_robots.*,
count(id) as openorders,
apikeys.apikey,
apikeys.apisecret
FROM my_robots
JOIN apikeys ON apikeys.id = my_robots.idApiKey
LEFT JOIN hpsi_trading ON hpsi_trading.idRobot = my_robots.id and estado <= 1
WHERE estado <= 1 and
idRobot = '2' and
ready = '1' and
(
my_robots.id LIKE '%0' OR
my_robots.id LIKE '%1' OR
my_robots.id LIKE '%2'
)
GROUP BY my_robots.id, apikeys.apikey, apikeys.apisecret
Use explicit JOIN syntax. Some indexes will be needed to run it fast, however, the database structure is not clear from your post (and from your query as well).
The explain plan shows that the largest pain is selecting the data from the table hpsi_trading.
The challenge from the database's point of view is that the query contains a correlated subquery in the SELECT clause, which needs to be executed once for each result of the outer query (after filtering).
Replacing this subquery with a JOIN + GROUP BY will require MySQL to join between all these records (inflate) and only then deflate the data using GROUP BY, which might take time.
Instead, I would extract the subquery to a temporary table, which is grouped during creation, index it and join to it. That way, the subquery will run once, using a quick covering index, it will already group the data and only then join it to the other table.
This far, it's all pros. But, the con here is that extracting a subquery to a temporary table might require more effort on the development side.
Please try this version and let me know if it helped (if not, please provide a fresh EXPLAIN plan screenshot):
Creating the temp table:
CREATE TEMPORARY TABLE IF NOT EXISTS temp1 AS
SELECT idRobot, COUNT(id) as openorders
FROM hpsi_trading
WHERE estado <= 1
GROUP BY idRobot;
The modified query:
SELECT
my_robots.*,
temp1.openorders,
apikeys.apikey,
apikeys.apisecret
FROM
my_robots,
apikeys
LEFT JOIN temp1 on temp1.idRobot = my_robots.id
WHERE
estado <= 1 AND idRobot = '2'
AND ready = '1'
AND apikeys.id = my_robots.idApiKey
AND (my_robots.id LIKE '%0'
OR my_robots.id LIKE '%1'
OR my_robots.id LIKE '%2')
The indexes to add for this solution (I assumed from logic that estado, idRobot and ready are from the apikeys table. If that's not the case, let me know and I'll adjust the indexes):
ALTER TABLE `temp1` ADD INDEX `temp1_index_1` (idRobot);
ALTER TABLE `hpsi_trading` ADD INDEX `hpsi_trading_index_1` (idRobot, estado, id);
ALTER TABLE `apikeys` ADD INDEX `apikeys_index_1` (`idRobot`, `ready`, `id`, `estado`);
ALTER TABLE `my_robots` ADD INDEX `my_robots_index_1` (`idApiKey`);
Related
I have a query that selects ~8000 rows. When I execute the query it takes 0.1 sec.
When I copy the query into a view and execute the view it takes about 2 seconds. In the first row of explain it selects ~570K rows, i dont know why.
I dont understand the first Row and why it shows up only in the view explain
1 PRIMARY ALL NULL NULL NULL NULL
This is the query (yes i know im not a mysql pro and the query is not that efficent, but it works ans 0.1 sek would be ok for me. Does anyone know why it is so slow in a view?
MariaDB 10.5.9
select
`xxxxxxx`.`auftraege`.`Zustandigkeit` AS `Zustandigkeit`,
`xxxxxxx`.`auftraege`.`cms` AS `cms`,
`xxxxxxx`.`auftraege`.`auftrag_id` AS `auftrag_id`,
`xxxxxxx`.`angebot`.`angebot_id` AS `angebot_id`,
`xxxxxxx`.`kunden`.`kunde_id` AS `kid`,
`xxxxxxx`.`angebot`.`kunde_id` AS `kunde_id`,
`xxxxxxx`.`kunden`.`firma` AS `firma`,
`xxxxxxx`.`auftraege`.`gekuendigt` AS `gekuendigt`,
`xxxxxxx`.`kunden`.`ansprechpartnerVorname` AS `ansprechpartnerVorname`,
`xxxxxxx`.`kunden`.`ansprechpartner` AS `ansprechpartner`,
`xxxxxxx`.`auftraege`.`ampstatus` AS `ampstatus`,
`xxxxxxx`.`auftraege`.`autoMahnungen` AS `autoMahnungen`,
`xxxxxxx`.`kunden`.`mail` AS `mail`,
`xxxxxxx`.`kunden`.`ansprechpartnerAnrede` AS `ansprechpartnerAnrede`,
case
`xxxxxxx`.`kunden`.`ansprechpartnerAnrede`
when
'm'
then
concat('Herr ', ifnull(`xxxxxxx`.`kunden`.`ansprechpartnerVorname`, ''), ifnull(`xxxxxxx`.`kunden`.`ansprechpartner`, ''))
else
concat('Frau ', ifnull(`xxxxxxx`.`kunden`.`ansprechpartnerVorname`, ''), ifnull(`xxxxxxx`.`kunden`.`ansprechpartner`, ''))
end
AS `ansprechpartnerfullName`, `xxxxxxx`.`kunden`.`website` AS `website`, `xxxxxxx`.`personal`.`name_betrieb` AS `name_betrieb`, `xxxxxxx`.`kunden`.`prioritaet` AS `prioritaet`, `xxxxxxx`.`auftraege`.`infoemail` AS `infoemail`, `xxxxxxx`.`auftraege`.`keywords` AS `keywords`, `xxxxxxx`.`auftraege`.`ftp_h` AS `ftp_h`, `xxxxxxx`.`auftraege`.`ftp_u` AS `ftp_u`, `xxxxxxx`.`auftraege`.`ftp_pw` AS `ftp_pw`, `xxxxxxx`.`auftraege`.`lgi_h` AS `lgi_h`, `xxxxxxx`.`auftraege`.`lgi_u` AS `lgi_u`, `xxxxxxx`.`auftraege`.`lgi_pw` AS `lgi_pw`, `xxxxxxx`.`auftraege`.`autoRemind` AS `autoRemind`, `xxxxxxx`.`kunden`.`telefon` AS `telefon`, `xxxxxxx`.`kunden`.`mobilfunk` AS `mobilfunk`, `xxxxxxx`.`auftraege`.`kommentar` AS `kommentar`, `xxxxxxx`.`auftraege`.`phase` AS `phase`, `xxxxxxx`.`auftraege`.`datum` AS `datum`, `xxxxxxx`.`angebot`.`typ` AS `typ`,
case
`xxxxxxx`.`auftraege`.`gekuendigt`
when
'1'
then
'Ja'
else
'Nein'
end
AS `Gekuendigt ? `,
(
select
count(`xxxxxxx`.`status`.`aenderung`)
from
`xxxxxxx`.`status`
where
`xxxxxxx`.`status`.`auftrag_id` = `xxxxxxx`.`auftraege`.`auftrag_id`
)
AS `aenderungen`,
`xxxxxxx`.`auftraege`.`vertragStart` AS `vertragStart`,
`xxxxxxx`.`auftraege`.`vertragEnde` AS `vertragEnde`,
case
`xxxxxxx`.`auftraege`.`zahlungsart`
when
'U'
then
'Überweisung'
when
'L'
then
'Lastschrift'
else
'Unbekannt'
end
AS `Zahlungsart`, `xxxxxxx`.`kunden`.`yyyyy_piwik` AS `yyyyy_piwik`,
(
select
max(`xxxxxxx`.`status`.`datum`) AS `mxDTst`
from
`xxxxxxx`.`status`
where
`xxxxxxx`.`status`.`auftrag_id` = `xxxxxxx`.`auftraege`.`auftrag_id`
and `xxxxxxx`.`status`.`typ` = 'SEO'
)
AS `mxDTst`,
(
select
case
`xxxxxxx`.`rechnungen`.`beglichen`
when
'YES'
then
'isOk'
else
'isAffe'
end
AS `neuUwe`
from
(
`xxxxxxx`.`zahlungsplanneu`
join
`xxxxxxx`.`rechnungen`
on(`xxxxxxx`.`zahlungsplanneu`.`rechnungsnummer` = `xxxxxxx`.`rechnungen`.`rechnungsnummer`)
)
where
`xxxxxxx`.`zahlungsplanneu`.`auftrag_id` = `xxxxxxx`.`auftraege`.`auftrag_id`
and `xxxxxxx`.`rechnungen`.`beglichen` <> 'STO' limit 1
)
AS `neuer`,
(
select
group_concat(`xxxxxxx`.`kunden_keywords`.`keyword` separator ',')
from
`xxxxxxx`.`kunden_keywords`
where
`xxxxxxx`.`kunden_keywords`.`kunde_id` = `xxxxxxx`.`kunden`.`kunde_id`
)
AS `keyword`,
(
select
case
count(0)
when
0
then
'Cool'
else
'Uncool'
end
AS `AusfallVor`
from
`xxxxxxx`.`rechnungen`
where
`xxxxxxx`.`rechnungen`.`rechnung_tag` < current_timestamp() - interval 15 day
and `xxxxxxx`.`rechnungen`.`kunde_id` = `xxxxxxx`.`kunden`.`kunde_id`
and `xxxxxxx`.`rechnungen`.`beglichen` = 'NO' limit 1
)
AS `Liquidiert`
from
(
((((`xxxxxxx`.`auftraege`
join
`xxxxxxx`.`angebot`
on(`xxxxxxx`.`auftraege`.`angebot_id` = `xxxxxxx`.`angebot`.`angebot_id`))
join
`xxxxxxx`.`kunden`
on(`xxxxxxx`.`angebot`.`kunde_id` = `xxxxxxx`.`kunden`.`kunde_id`))
left join
`xxxxxxx`.`kunden_keywords`
on(`xxxxxxx`.`angebot`.`kunde_id` = `xxxxxxx`.`kunden_keywords`.`kunde_id`))
join
`xxxxxxx`.`personal`
on(`xxxxxxx`.`kunden`.`bearbeiter` = `xxxxxxx`.`personal`.`personal_id`))
left join
`xxxxxxx`.`status`
on(`xxxxxxx`.`auftraege`.`auftrag_id` = `xxxxxxx`.`status`.`auftrag_id`)
)
group by
`xxxxxxx`.`auftraege`.`auftrag_id`
order by
NULL
UPDATE 1
1. The View Itself (Duration 1.83 sec)
1.1 Create the View: This is the View i created, it only contains the query from above.
1.2 Executing the View: It takes 1.83 sek to execute the view
1.3 Analyze the View: This is the explain of the view
2. The view with added where clause (Duration 1.86 sec)
2.1 Analyze the View with added where clause #rick wanted me to add a where clause to the view, if i understood him correctly. This is the explain of the view, where i added a where clause, takes 1.86 sec.
3. The Query, that is the source of the view (Duration: 0.1 sec)
3.1 Execute the query directly This is the query, that is the source of the view, when i execute it directly to the server. It takes ~0.1 - 0.2 seconds.
3.2 Analyze the direct queryAnd this is the explain of the pure query.
Why the view is so much slower, by only cupsuling the query inside of the view?
Update 2
These are the indexes I have set
ALTER TABLE angebot ADD INDEX angebot_idx_angebot_id (angebot_id);
ALTER TABLE auftraege ADD INDEX auftraege_idx_auftrag_id (auftrag_id);
ALTER TABLE kunden ADD INDEX kunden_idx_kunde_id (kunde_id);
ALTER TABLE kunden_keywords ADD INDEX kunden_keywords_idx_kunde_id (kunde_id);
ALTER TABLE personal ADD INDEX personal_idx_personal_id (personal_id);
ALTER TABLE rechnungen ADD INDEX rechnungen_idx_rechnungsnummer_beglichen (rechnungsnummer,beglichen);
ALTER TABLE rechnungen ADD INDEX rechnungen_idx_beglichen_kunde_id_rechnung (beglichen,kunde_id,rechnung_tag);
ALTER TABLE status ADD INDEX status_idx_auftrag_id (auftrag_id);
ALTER TABLE status ADD INDEX status_idx_typ_auftrag_id_datum (typ,auftrag_id,datum);
ALTER TABLE zahlungsplanneu ADD INDEX zahlungsplanneu_idx_auftrag_id (auftrag_id);
Be consistent between tables. kunde_id, for example, seems to be declared differently between tables. This may be preventing some obvious optimizations. (There are 6 JOINs that say func in EXPLAIN`.)
Remove the extra parentheses in JOINs. They may be preventing what the Optimizer is happy to do -- rearrange the tables in a JOIN.
Turn the query inside out. By this, I mean to do the minimum amount of work to do the main JOIN. Collect mostly id(s). Then do the dependent subqueries in an outer select. Something like:
SELECT ... ( SELECT ... ), ...
FROM ( SELECT a1.id
FROM a AS a1
JOIN b ON ..
JOIN c ON .. )
JOIN a AS a2 ON a2.id = a1.id
JOIN d ON ...
The "inside-out" kludge may eliminate the need for the GROUP BY. (Your query is too complex for me to see for sure.) If so, then I call the problem "explode-implode" -- Your query first JOINs, producing a temp table with lots of rows ("explodes"). Then it does a GROUP BY ("implodes").
More
These indexes will probably help:
status: (auftrag_id, typ, datum, aenderung)
rechnungen: (beglichen, kunde_id, rechnung_tag)
rechnungen: (rechnungsnummer, beglichen)
zahlungsplanneu: (auftrag_id, rechnungsnummer)
kunden_keywords: (kunde_id, keyword) -- (unless `kunde_id` is the PK)
(I see from all 3 EXPLAINs that you probably have sufficient indexes on kunden_keywords and status. Show me what indexes you have, so I can see if the existing indexes are as good as my suggestions.) "Using index" == "covering index".
Near the end is this LEFT JOIN, but I did not spot any use for the table; perhaps it can be removed?
left join `kunden_keywords` on(`angebot`.`kunde_id` = `kunden_keywords`.`kunde_id`))
I'm having trouble optimising a simple SQL query but having serious issue with timing. I've written it three times and none of them work. Here is the original one I was hoping to work:
SELECT RSKADDR.*
FROM EDW_BASE.RCI_RISK_ADDRESS RSKADDR
INNER JOIN (
SELECT DISTINCT COVER_RISK_ID
FROM EDW_BASE.RCI_COVER_RISK_MASTER RSKMASTER
INNER JOIN
(SELECT DISTINCT CONTACT_ID, FOLLOW_UP_DATE
FROM EDW_STG.STG_CIM_SVOM03
WHERE OUTSTANDING = 1 AND QUEUE = 'CIM Update for Contact Address') ADDR_WF
ON RSKMASTER.CONTACT_CODE = ADDR_WF.CONTACT_ID
WHERE RSKMASTER.IS_STORNO != 1
AND RSKMASTER.PRODUCT_CODE = 'HOME'
AND ADDR_WF.FOLLOW_UP_DATE >= RSKMASTER.COVER_EFF_START_DATE
AND RSKMASTER.POLICY_STATUS_CODE = 'POLICY'
AND ADDR_WF.FOLLOW_UP_DATE <= RSKMASTER.COVER_EFF_END_DATE
) ACTVRSK
ON ACTVRSK.COVER_RISK_ID = RSKADDR.RISK_ID
The code in the first inner join works fast all the way to the end. That is, the second SELECT query (within the INNER JOIN query of the first and main SELECT query) works fast without a problem. The problem arises when I integrate the second SELECT query inside the INNER JOIN of the main SELECT query (select RSKADDR.*).
Then it seems the execution is never ending!
I tried other ways and same result:
SELECT RSKADDR.*
FROM EDW_BASE.RCI_RISK_ADDRESS RSKADDR
INNER JOIN EDW_BASE.RCI_COVER_RISK_MASTER RSKMASTER
ON RSKMASTER.COVER_RISK_ID = RSKADDR.RISK_ID
AND RSKMASTER.IS_STORNO != 1
AND RSKMASTER.PRODUCT_CODE = 'HOME'
AND RSKMASTER.POLICY_STATUS_CODE = 'POLICY'
INNER JOIN EDW_STG.STG_CIM_SVOM03 ADDR_WF
ON OUTSTANDING = 1 AND QUEUE = 'CIM Update for Contact Address'
AND RSKMASTER.CONTACT_CODE = ADDR_WF.CONTACT_ID
AND ADDR_WF.FOLLOW_UP_DATE >= RSKMASTER.COVER_EFF_START_DATE
AND ADDR_WF.FOLLOW_UP_DATE <= RSKMASTER.COVER_EFF_END_DATE
It's such an easy query and can't get it to work. How can I do this?
DISTINCT is a costly operation and seldom needed. It often indicates a bad database design or a poorly written query. In your query you are even doing this repeatedly; that doesn't look good.
The second query looks much better. As you say you get the same result, DISTINCT in the first query was superfluous obviously.
I see you doing joins, but all you select is data from one table. So why join then? Select from the table you want data from and put your criteria in WHERE where it belongs.
The following query may be faster, because it plainly shows that we are simply checking whether we find matches in the other tables or not. But then, MySQL was known for not performing too well with IN clauses, so that may depend on the Version you are using.
select *
from edw_base.rci_risk_address
where risk_id in
(
select rm.cover_risk_id
from edw_base.rci_cover_risk_master rm
where rm.is_storno <> 1
and rm.product_code = 'HOME'
and rm.policy_status_code = 'POLICY'
and exists
(
select *
from edw_stg.stg_cim_svom03 adr
where adr.contact_id = rm.contact_code
and adr.follow_up_date >= rm.cover_eff_start_date
and adr.follow_up_date <= rm.cover_eff_end_date
and adr.outstanding = 1
and adr.queue = 'CIM Update for Contact Address'
)
);
Anyway, with your second query or with mine, I suppose the following indexes would help:
create index idx1 on rci_cover_risk_master
(
product_code,
policy_status_code,
is_storno,
contact_code,
cover_eff_start_date,
cover_eff_end_date,
cover_risk_id
);
create index idx2 on stg_cim_svom03
(
contact_id,
follow_up_date,
outstanding,
queue
);
create index idx3 on rci_risk_address(risk_id);
From the query, you only need RSKADDR data, so no need for an INNER JOIN. You can do the same with EXISTS keyword. Try the below query
SELECT RSKADDR.*
FROM EDW_BASE.RCI_RISK_ADDRESS RSKADDR
WHERE EXISTS (
SELECT 1
FROM EDW_BASE.RCI_COVER_RISK_MASTER RSKMASTER
WHERE EXISTS
(SELECT 1
FROM EDW_STG.STG_CIM_SVOM03
WHERE OUTSTANDING = 1 AND QUEUE = 'CIM Update for Contact Address') ADDR_WF
AND RSKMASTER.CONTACT_CODE = ADDR_WF.CONTACT_ID
AND RSKMASTER.IS_STORNO != 1
AND RSKMASTER.PRODUCT_CODE = 'HOME'
AND ADDR_WF.FOLLOW_UP_DATE >= RSKMASTER.COVER_EFF_START_DATE
AND RSKMASTER.POLICY_STATUS_CODE = 'POLICY'
AND ADDR_WF.FOLLOW_UP_DATE <= RSKMASTER.COVER_EFF_END_DATE
)
AND RSKMASTER.COVER_RISK_ID = RSKADDR.RISK_ID
)
Note : I have not tested query as no schema available.
I'm not an expert in SQL, i have an sql statement :
SELECT * FROM articles WHERE article_id IN
(SELECT distinct(content_id) FROM contents_by_cats WHERE cat_id='$cat')
AND permission='true' AND date <= '$now_date_time' ORDER BY date DESC;
Table contents_by_cats has 11000 rows.
Table articles has 2700 rows.
Variables $now_date_time and $cat are php variables.
This query takes about 10 seconds to return the values (i think because it has nested SELECT statements) , and 10 seconds is a big amount of time.
How can i achieve this in another way ? (Views or JOIN) ?
I think JOIN will help me here but i don't know how to use it properly for the SQL statement that i mentioned.
Thanks in advance.
A JOIN is exactly what you are looking for. Try something like this:
SELECT DISTINCT articles.*
FROM articles
JOIN contents_by_cats ON articles.article_id = contents_by_cats.content_id
WHERE contents_by_cats.cat_id='$cat'
AND articles.permission='true'
AND articles.date <= '$now_date_time'
ORDER BY date DESC;
If your query is still not as fast as you would like then check that you have an index on articles.article_id and contents_by_cats.content_id and contents_by_cats.cat_id. Depending on the data you may want an index on articles.date as well.
Do note that if the $cat and $now_date_time values are coming from a user then you should really be preparing and binding the query rather than just dumping these values into the query.
This is the query we are starting with:
SELECT a.*
FROM articles a
WHERE article_id IN (SELECT distinct(content_id)
FROM contents_by_cats
WHERE cat_id ='$cat'
) AND
permission ='true' AND
date <= '$now_date_time'
ORDER BY date DESC;
Two things will help this query. The first is to rewrite it using exists rather than in and to simplify the subquery:
SELECT a.*
FROM articles a
WHERE EXISTS (SELECT 1
FROM contents_by_cats cbc
WHERE cbc.content_id = a.article_id and cat_id = '$cat'
) AND
permission ='true' AND
date <= '$now_date_time'
ORDER BY date DESC;
Second, you want indexes on both articles and contents_by_cats:
create index idx_articles_3 on articles(permission, date, article_id);
create index idx_contents_by_cats_2 on contents_by_cat(content_id, cat_id);
By the way, instead of $now_date_time, you can just use the now() function in MySQL.
Let's assume I have the following tables:
items table
item_id|view_count
item_views table
view_id|item_id|ip_address|last_view
What I would like to do is:
If last view of item with given item_id by given ip_address was 1+ hour ago I would like to increment view_count of item in items table. And as a result get the view count of item. How I will do it normally:
q = SELECT count(*) FROM item_views WHERE item_id='item_id' AND ip_address='some_ip' AND last_view < current_time-60*60
if(q==1) then q = UPDATE items SET view_count = view_count+1 WHERE item_id='item_id'
//and finally get view_count of item
q = SELECT view_count FROM items WHERE item_id='item_id'
Here I used 3 SQL queries. How can I merge it into one SQL query? And how can it affect the processing time? Will it be faster or slower than previous method?
I don't think your logic is correct for what you describe that you want. The query:
SELECT count(*)
FROM item_views
WHERE item_id='item_id' AND
ip_address='some_ip' AND
last_view < current_time-60*60
is counting the number of views longer ago than your time frame. I think you want:
last_view > current_time-60*60
and then have if q = 0 on the next line.
MySQL is pretty good with the performance of not exists, so the following should work well:
update items
set view_count = view_count+1
WHERE item_id='item_id' and
not exists (select 1
from item_views
where item_id='item_id' AND
ip_address='some_ip' AND
last_view > current_time-60*60
)
It will work much better with an index on item_views(item_id, ip_address, last_view) and an index on item(item_id).
In MySQL scripting, you could then write:
. . .
set view_count = (#q := view_count+1)
. . .
This would also give you the variable you are looking for.
update target
set target.view_count = target.view_count + 1
from items target
inner join (
select item_id
from item_views
where item_id = 'item_id'
and ip_address = 'some_ip'
and last_view < current_time - 60*60
) ref
on ref.item_id = target.item_id;
You can only combine the update statement with the condition using a join as in the above example; but you'll still need a separate select statement.
It may be slower on very large set and/or unindexed table.
I have a table like this (MySQL 5.0.x, MyISAM):
response{id, title, status, ...} (status: 1 new, 3 multi)
I would like to update the status from new (status=1) to multi (status=3) of all the responses if at least 20 have the same title.
I have this one, but it does not work :
UPDATE response SET status = 3 WHERE status = 1 AND title IN (
SELECT title FROM (
SELECT DISTINCT(r.title) FROM response r WHERE EXISTS (
SELECT 1 FROM response spam WHERE spam.title = r.title LIMIT 20, 1)
)
as u)
Please note:
I do the nested select to avoid the famous You can't specify target table 'response' for update in FROM clause
I cannot use GROUP BY for performance reasons. The query cost with a solution using LIMIT is way better (but it is less readable).
EDIT:
It is possible to do SELECT FROM an UPDATE target in MySQL. See solution here
The issue is on the data selected which is totaly wrong.
The only solution I found which works is with a GROUP BY:
UPDATE response SET status = 3
WHERE status = 1 AND title IN (SELECT title
FROM (SELECT title
FROM response
GROUP BY title
HAVING COUNT(1) >= 20)
as derived_response)
Thanks for your help! :)
MySQL doesn't like it when you try to UPDATE and SELECT from the same table in one query. It has to do with locking priorities, etc.
Here's how I would solve this problem:
SELECT CONCAT('UPDATE response SET status = 3 ',
'WHERE status = 1 AND title = ', QUOTE(title), ';') AS sql
FROM response
GROUP BY title
HAVING COUNT(*) >= 20;
This query produces a series of UPDATE statements, with the quoted titles that deserve to be updated embedded. Capture the result and run it as an SQL script.
I understand that GROUP BY in MySQL often incurs a temporary table, and this can be costly. But is that a deal-breaker? How frequently do you need to run this query? Besides, any other solutions are likely to require a temporary table too.
I can think of one way to solve this problem without using GROUP BY:
CREATE TEMPORARY TABLE titlecount (c INTEGER, title VARCHAR(100) PRIMARY KEY);
INSERT INTO titlecount (c, title)
SELECT 1, title FROM response
ON DUPLICATE KEY UPDATE c = c+1;
UPDATE response JOIN titlecount USING (title)
SET response.status = 3
WHERE response.status = 1 AND titlecount.c >= 20;
But this also uses a temporary table, which is why you try to avoid using GROUP BY in the first place.
I would write something straightforward like below
UPDATE `response`, (
SELECT title, count(title) as count from `response`
WHERE status = 1
GROUP BY title
) AS tmp
SET response.status = 3
WHERE status = 1 AND response.title = tmp.title AND count >= 20;
Is using GROUP BY really that slow ? The solution you tried to implement looks like requesting again and again on the same table and should be way slower than using GROUP BY if it worked.
This is a funny peculiarity with MySQL - I can't think of a way to do it in a single statement (GROUP BY or no GROUP BY).
You could select the appropriate response rows into a temporary table first then do the update by selecting from that temp table.
you'll have to use a temporary table:
create temporary table r_update (title varchar(10));
insert r_update
select title
from response
group
by title
having count(*) < 20;
update response r
left outer
join r_update ru
on ru.title = r.title
set status = case when ru.title is null then 3 else 1;