How can I optimize a MySQL UPDATE query?

I have a table with 300 000 records. The table contains duplicate rows, and I want to update the column "flag".
TABLE
------------------------------------
|number | flag | ... more column ...|
------------------------------------
|ABCD | 0 | ...................|
|ABCD | 0 | ...................|
|ABCD | 0 | ...................|
|BCDE | 0 | ...................|
|BCDE | 0 | ...................|
------------------------------------
I use this query for updating "flag" column:
UPDATE table i
INNER JOIN (SELECT number FROM table
GROUP BY number HAVING count(number) > 1 ) i2
ON i.number = i2.number
SET i.flag = '1'
This query runs very slowly (more than 600 seconds) for these 300 000 records.
How can I optimize this query?
STRUCTURE OF MY TABLE
CREATE TABLE IF NOT EXISTS `inv` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`pn` varchar(10) NOT NULL COMMENT 'Part Number',
`qty` int(5) NOT NULL,
`qty_old` int(5) NOT NULL,
`flag_qty` tinyint(1) NOT NULL,
`name` varchar(60) NOT NULL,
`vid` int(11) NOT NULL ,
`flag_d` tinyint(1) NOT NULL ,
`flag_u` tinyint(1) NOT NULL ,
`timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `pn` (`pn`),
KEY `name` (`name`),
KEY `vid` (`vid`),
KEY `pn_2` (`pn`),
KEY `flag_qty` (`flag_qty`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=0 ;
If "name" is duplicate I want to update flag_qty

If you do not already have an index on number, you should add one:
CREATE INDEX table_number ON table (number);
UPDATE: Try this:
UPDATE inv t1
INNER JOIN inv t2
ON t1.name = t2.name
AND t1.id <> t2.id
SET t1.flag_qty = 1;
You can create your table with just the duplicates by selecting this data directly into another table instead of doing this flag update first.
INSERT INTO duplicate_invs
SELECT DISTINCT inv1.*
FROM inv AS inv1
INNER JOIN inv AS inv2
ON inv1.name = inv2.name
AND inv1.id < inv2.id
If you can explain the logic for which rows get deleted from the inv table, it may be that the whole process can be done in one step.

Get MySQL to EXPLAIN the query to you. Then you will see what indexing would improve things.
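As an illustration of how an index changes the plan (a minimal sketch using Python with SQLite, whose `EXPLAIN QUERY PLAN` plays the role of MySQL's `EXPLAIN`; the table and index names are simplified from the question):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE inv (id INTEGER PRIMARY KEY, pn TEXT, name TEXT)")

# Without an index on name, a lookup by name must scan the whole table.
plan_scan = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM inv WHERE name = 'ABCD'"
).fetchall()

# After adding the index, the same lookup becomes an index search.
con.execute("CREATE INDEX inv_name ON inv (name)")
plan_search = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM inv WHERE name = 'ABCD'"
).fetchall()
```

In MySQL the equivalent check is simply prefixing the slow statement with `EXPLAIN` and looking at the `key` and `rows` columns.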

EXPLAIN will show you where it is slow, and here are some ideas for improving performance:
Add indexing
Use InnoDB foreign keys
Split the query into two and process them separately in the language you use
Write the same idea as a MySQL procedure (not sure whether this would be fast)
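The split-into-two-queries idea can be sketched like this (Python with SQLite standing in for MySQL; the table and column names follow the question, and the sketch assumes at least one duplicate exists):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE inv (id INTEGER PRIMARY KEY, name TEXT, flag_qty INTEGER DEFAULT 0)"
)
con.executemany(
    "INSERT INTO inv (name) VALUES (?)",
    [("ABCD",), ("ABCD",), ("ABCD",), ("BCDE",), ("BCDE",), ("XYZ",)],
)

# Step 1: a cheap read-only query finds the duplicated names.
dupes = [row[0] for row in con.execute(
    "SELECT name FROM inv GROUP BY name HAVING COUNT(*) > 1"
)]

# Step 2: flag only those rows, using a parameter list instead of a self-join.
placeholders = ",".join("?" * len(dupes))
con.execute(f"UPDATE inv SET flag_qty = 1 WHERE name IN ({placeholders})", dupes)
```

The point of the split is that the SELECT takes no write locks, so only the short final UPDATE contends with other writers.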

I would use a temp table: 1) select all relevant records into a temp table, with an INDEX on id; 2) update the table using something like this
UPDATE table i, tmp_i
SET i.flag = '1'
WHERE i.id = tmp_i.id

You can try this (assuming VB.net, but it can be implemented in any language).
Dim ids As String = Cmd.ExecuteScalar("select group_concat(number) from (SELECT number FROM table GROUP BY number HAVING count(number) > 1) t")
After you get the list of IDs (comma-delimited), use
UPDATE table i
SET i.flag = '1'
WHERE i.number in ( .... )
This can also be slow, but the first SELECT will not lock up your database, replication, etc., and the UPDATE itself will be faster.

Related

Query speed of insert/update SMA (simple moving average)

I would like to include a column in my table with the simple moving average of stock data. I have been able to create several queries which successfully do so, however the query speed is slow. My goal is to improve the query speed.
I have the following table:
CREATE TABLE `timeseries_test` (
`timeseries_id` int(11) NOT NULL AUTO_INCREMENT,
`stock_id` int(10) NOT NULL,
`date` date NOT NULL,
`open` decimal(16,8) NOT NULL,
`high` decimal(16,8) NOT NULL,
`low` decimal(16,8) NOT NULL,
`close` decimal(16,8) NOT NULL,
`adjusted_close` double(16,8) NOT NULL,
`volume` int(16) NOT NULL,
`dividend` double(16,8) NOT NULL,
`split_coefficient` double(16,15) NOT NULL,
`100sma` decimal(16,8) NOT NULL,
PRIMARY KEY (`timeseries_id`),
KEY `stock` (`stock_id`),
KEY `date` (`date`),
KEY `date_stock` (`stock_id`,`date`)
) ENGINE=InnoDB AUTO_INCREMENT=5444325 DEFAULT CHARSET=latin1
I have tried many different query formats, but they all take about 25 seconds per 5000 rows. The select query alone takes less than a second. Below is an example query:
UPDATE stock.timeseries_test t1 INNER JOIN (
SELECT a.timeseries_id,
Round( ( SELECT SUM(b.close) / COUNT(b.close)
FROM timeseries_test AS b
WHERE DATEDIFF(a.date, b.date) BETWEEN 0 AND 99 AND a.stock_id = b.stock_id
), 2 ) AS '100sma'
FROM timeseries_test AS a) t2
ON t1.`timeseries_id` = t2.`timeseries_id`
SET t1.100sma = t2.100SMA
WHERE t2.100sma = null
Below the explain query:
1 PRIMARY <derived2> NULL ALL NULL NULL NULL NULL 10385 10.00 Using where
1 UPDATE t1 NULL eq_ref PRIMARY PRIMARY 4 t2.timeseries_id 1 100.00 NULL
2 DERIVED a NULL index NULL date_stock 7 NULL 10385 100.00 Using index
3 DEPENDENT SUBQUERY b NULL ref stock,date_stock stock 4 stock.a.stock_id 5192 100.00 Using where
Any help is appreciated.
If you are running MySQL 8.0, I recommend window functions with a range specification; this avoids the need for a correlated subquery.
update stock.timeseries_test t1
inner join (
select timeseries_id,
avg(close) over(
partition by stock_id
order by date
range between interval 99 day preceding and current row
) `100sma`
from timeseries_test
) t2 on t1.timeseries_id = t2.timeseries_id
set t1.`100sma` = t2.`100sma`
It is quite unclear what the purpose of the original, outer WHERE clause is, so I removed it:
WHERE t2.`100sma` = null
If you do want to check for nullness, then you need is null; but doing so would pretty much defeat the whole logic of the update statement. Maybe you meant:
WHERE t1.`100sma` is null
Functions are not sargable. Instead of
DATEDIFF(a.date, b.date) BETWEEN 0 AND 99
use
a.date BETWEEN b.date AND b.date + INTERVAL 99 DAY
(or maybe a and b should be swapped)
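The sargability point shows up directly in the plan output of any engine; a small sketch (Python with SQLite, a hypothetical table with ISO date strings, where SQLite's `EXPLAIN QUERY PLAN` stands in for MySQL's `EXPLAIN`):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE ts (d TEXT, close REAL)")
con.execute("CREATE INDEX ts_d ON ts (d)")

# Wrapping the indexed column in a function hides it from the index,
# forcing a full scan...
plan_bad = con.execute(
    "EXPLAIN QUERY PLAN SELECT close FROM ts WHERE substr(d, 1, 4) = '2018'"
).fetchall()

# ...while a plain range predicate on the bare column can use the index.
plan_good = con.execute(
    "EXPLAIN QUERY PLAN SELECT close FROM ts "
    "WHERE d BETWEEN '2018-01-01' AND '2018-04-10'"
).fetchall()
```

The same contrast applies to `DATEDIFF(a.date, b.date)` versus a `BETWEEN` on the raw date column in the question's query.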
I suspect (from the column names) that the pair (stock_id,date) is unique and that timeseries_id is never really used. If those are correct, then
PRIMARY KEY (`timeseries_id`),
KEY `date_stock` (`stock_id`,`date`)
-->
PRIMARY KEY(`stock_id`,`date`)
The ON (timeseries_id) join condition would need to be changed to test both of those columns.
Also, toss this since there is another index that starts with the same column(s):
KEY `stock` (`stock_id`),

Optimizing MySQL CREATE TABLE Query

I have two tables I am trying to join in a third query and it seems to be taking far too long.
Here is the syntax I am using
CREATE TABLE active_users
(PRIMARY KEY ix_all (platform_id, login_year, login_month, person_id))
SELECT platform_id
, YEAR(my_timestamp) AS login_year
, MONTH(my_timestamp) AS login_month
, person_id
, COUNT(*) AS logins
FROM
my_login_table
GROUP BY 1,2,3,4;
CREATE TABLE active_alerts
(PRIMARY KEY ix_all (platform_id, alert_year, alert_month, person_id))
SELECT platform_id
, YEAR(alert_datetime) AS alert_year
, MONTH(alert_datetime) AS alert_month
, person_id
, COUNT(*) AS alerts
FROM
my_alert_table
GROUP BY 1,2,3,4;
CREATE TABLE all_data
(PRIMARY KEY ix_all (platform_id, theYear, theMonth, person_id))
SELECT a.platform_id
, a.login_year AS theyear
, a.login_month AS themonth
, a.person_id
, IFNULL(a.logins,0) AS logins
, IFNULL(b.alerts,0) AS job_alerts
FROM
active_users a
LEFT OUTER JOIN
active_alerts b
ON a.platform_id = b.platform_id
AND a.login_year = b.alert_year
AND a.login_month = b.alert_month
AND a.person_id = b.person_id;
The first table (logins) returns about half a million rows and takes less than 1 minute, the second table (alerts) returns about 200k rows and takes less than 1 minute.
If I run just the SELECT part of the third statement it runs in a few seconds, however as soon as I run it with the CREATE TABLE syntax it takes more than 30 minutes.
I have tried different types of indexes than a primary key, such as UNIQUE or INDEX as well as no key at all, but that doesn't seem to make much difference.
Is there something I can do to speed up the creation / insertion of this table?
EDIT:
Here is the output of the show create table statements
CREATE TABLE `active_users` (
`platform_id` int(11) NOT NULL,
`login_year` int(4) DEFAULT NULL,
`login_month` int(2) DEFAULT NULL,
`person_id` varchar(40) NOT NULL,
`logins` bigint(21) NOT NULL DEFAULT '0',
KEY `ix_all` (`platform_id`,`login_year`,`login_month`,`person_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
CREATE TABLE `alerts` (
`platform_id` int(11) NOT NULL,
`alert_year` int(4) DEFAULT NULL,
`alert_month` int(2) DEFAULT NULL,
`person_id` char(36) CHARACTER SET ascii COLLATE ascii_bin NOT NULL,
`alerts` bigint(21) NOT NULL DEFAULT '0',
KEY `ix_all` (`platform_id`,`alert_year`,`alert_month`,`person_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
and the output of the EXPLAIN
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE a (null) ALL (null) (null) (null) (null) 503504 100 (null)
1 SIMPLE b (null) ALL ix_all (null) (null) (null) 220187 100 Using where; Using join buffer (Block Nested Loop)
It's a bit of a hack but I figured out how to get it to run much faster.
I added a primary key to the third table on platform, year, month, person
I inserted the intersect data using an inner join, then insert ignore the left table plus a zero for alerts in a separate statement.
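The two-pass hack described above can be sketched as follows (Python with SQLite, which spells MySQL's INSERT IGNORE as INSERT OR IGNORE; the schemas are simplified to the key and count columns):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE active_users  (person_id TEXT PRIMARY KEY, logins INTEGER);
CREATE TABLE active_alerts (person_id TEXT PRIMARY KEY, alerts INTEGER);
CREATE TABLE all_data      (person_id TEXT PRIMARY KEY, logins INTEGER, alerts INTEGER);

INSERT INTO active_users  VALUES ('a', 10), ('b', 3), ('c', 7);
INSERT INTO active_alerts VALUES ('a', 2), ('c', 1);

-- Pass 1: the intersection, via a plain inner join (fast, both sides keyed).
INSERT INTO all_data
SELECT u.person_id, u.logins, al.alerts
FROM active_users u JOIN active_alerts al USING (person_id);

-- Pass 2: left-table leftovers with a zero for alerts; the primary key
-- makes INSERT OR IGNORE silently skip rows already added in pass 1.
INSERT OR IGNORE INTO all_data
SELECT person_id, logins, 0 FROM active_users;
""")
```

Together the two passes reproduce the LEFT OUTER JOIN result without the slow outer-join insert.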

MySql group by optimization - avoid tmp table and/or filesort

I have a slow query; without the group by it is fast (0.1-0.3 seconds), but with the (required) group by the duration is around 10-15s.
The query joins two tables, events (near 50 million rows) and events_locations (5 million rows).
Query:
SELECT `e`.`id` AS `event_id`,`e`.`time_stamp` AS `time_stamp`,`el`.`latitude` AS `latitude`,`el`.`longitude` AS `longitude`,
`el`.`time_span` AS `extra`,`e`.`entity_id` AS `asset_name`, `el`.`other_id` AS `geozone_id`,
`el`.`group_alias` AS `group_alias`,`e`.`event_type_id` AS `event_type_id`,
`e`.`entity_type_id`AS `entity_type_id`, el.some_id
FROM events e
INNER JOIN events_locations el ON el.event_id = e.id
WHERE 1=1
AND el.other_id = '1'
AND time_stamp >= '2018-01-01'
AND time_stamp <= '2019-06-02'
GROUP BY `e`.`event_type_id` , `el`.`some_id` , `el`.`group_alias`;
Table events:
CREATE TABLE `events` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`event_type_id` int(11) NOT NULL,
`entity_type_id` int(11) NOT NULL,
`entity_id` varchar(64) NOT NULL,
`alias` varchar(64) NOT NULL,
`time_stamp` datetime NOT NULL,
PRIMARY KEY (`id`),
KEY `entity_id` (`entity_id`),
KEY `event_type_idx` (`event_type_id`),
KEY `idx_events_time_stamp` (`time_stamp`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Table events_locations
CREATE TABLE `events_locations` (
`event_id` bigint(20) NOT NULL,
`latitude` double NOT NULL,
`longitude` double NOT NULL,
`some_id` bigint(20) DEFAULT NULL,
`other_id` bigint(20) DEFAULT NULL,
`time_span` bigint(20) DEFAULT NULL,
`group_alias` varchar(64) NOT NULL,
KEY `some_id_idx` (`some_id`),
KEY `idx_events_group_alias` (`group_alias`),
KEY `idx_event_id` (`event_id`),
CONSTRAINT `fk_event_id` FOREIGN KEY (`event_id`) REFERENCES `events` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
The explain:
+----+-------------+-------+--------+---------------------------------+---------+---------+-------------------------------------------+----------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------------------------+---------+---------+-------------------------------------------+----------+------------------------------------------------+
| 1 | SIMPLE | ea | ALL | 'idx_event_id' | NULL | NULL | NULL | 5152834 | 'Using where; Using temporary; Using filesort' |
| 1 | SIMPLE | e | eq_ref | 'PRIMARY,idx_events_time_stamp' | PRIMARY | '8' | 'name.ea.event_id' | 1 | |
+----+-------------+-------+--------+---------------------------------+---------+---------+-------------------------------------------+----------+------------------------------------------------+
2 rows in set (0.08 sec)
From the doc:
Temporary tables can be created under conditions such as these:
If there is an ORDER BY clause and a different GROUP BY clause, or if the ORDER BY or GROUP BY contains columns from tables other than the first table in the join queue, a temporary table is created.
DISTINCT combined with ORDER BY may require a temporary table.
If you use the SQL_SMALL_RESULT option, MySQL uses an in-memory temporary table, unless the query also contains elements (described later) that require on-disk storage.
I already tried:
Create an index by 'el.some_id , el.group_alias'
Decrease the varchar size to 20
Increase the size of sort_buffer_size and read_rnd_buffer_size;
Any suggestions for performance tuning would be much appreciated!
In your case the events table has an index on time_stamp. So before joining both tables, first select the required records from the events table for the specific date range, then join events_locations on the relation column.
Use MySQL's EXPLAIN keyword to see how your query reads the table records; it tells you how many rows are scanned to find the required records.
The number of rows scanned contributes to query execution time. Use the logic below to reduce the number of rows that are scanned.
SELECT
`e`.`id` AS `event_id`,
`e`.`time_stamp` AS `time_stamp`,
`el`.`latitude` AS `latitude`,
`el`.`longitude` AS `longitude`,
`el`.`time_span` AS `extra`,
`e`.`entity_id` AS `asset_name`,
`el`.`other_id` AS `geozone_id`,
`el`.`group_alias` AS `group_alias`,
`e`.`event_type_id` AS `event_type_id`,
`e`.`entity_type_id` AS `entity_type_id`,
`el`.`some_id` as `some_id`
FROM
(select
`id` AS `event_id`,
`time_stamp` AS `time_stamp`,
`entity_id` AS `asset_name`,
`event_type_id` AS `event_type_id`,
`entity_type_id` AS `entity_type_id`
from
`events`
WHERE
time_stamp >= '2018-01-01'
AND time_stamp <= '2019-06-02'
) AS `e`
JOIN `events_locations` `el` ON `e`.`event_id` = `el`.`event_id`
WHERE
`el`.`other_id` = '1'
GROUP BY
`e`.`event_type_id` ,
`el`.`some_id` ,
`el`.`group_alias`;
The relationship between these tables is 1:1, so I asked myself why a group by was required, and I found some duplicated rows: 200 in 50 000 rows. So, somehow, my system is inserting duplicates, and someone (years ago) added that group by instead of hunting down the bug.
So I will mark this as solved, more or less...

Help me optimize this MySql query

I have a MySql query that takes a very long time to run (about 7 seconds). The problem seems to be with the OR in this part of the query: "(tblprivateitem.userid=?userid OR tblprivateitem.userid=1)". If I skip the "OR tblprivateitem.userid=1" part, it takes only 0.01 seconds. As I need that part, I need to find a way to optimize this query. Any ideas?
QUERY:
SELECT
tbladdeditem.addeditemid,
tblprivateitem.iitemid,
tblprivateitem.itemid
FROM tbladdeditem
INNER JOIN tblprivateitem
ON tblprivateitem.itemid=tbladdeditem.itemid
AND (tblprivateitem.userid=?userid OR tblprivateitem.userid=1)
WHERE tbladdeditem.userid=?userid
EXPLAIN:
id select_type table type possible_keys key key_len ref rows extra
1 SIMPLE tbladdeditem ref userid userid 4 const 293 Using where
1 SIMPLE tblprivateitem ref userid,itemid itemid 4 tbladdeditem.itemid 2 Using where
TABLES:
tbladdeditem contains 1 100 000 rows:
CREATE TABLE `tbladdeditem` (
`addeditemid` int(11) NOT NULL auto_increment,
`itemid` int(11) default NULL,
`userid` mediumint(9) default NULL,
PRIMARY KEY (`addeditemid`),
KEY `userid` (`userid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
tblprivateitem contains 2 700 000 rows:
CREATE TABLE `tblprivateitem` (
`privateitemid` int(11) NOT NULL auto_increment,
`userid` mediumint(9) default '1',
`itemid` int(10) NOT NULL,
`iitemid` mediumint(9) default NULL,
PRIMARY KEY (`privateitemid`),
KEY `userid` (`userid`),
KEY `itemid` (`itemid`) //Changed this index to only use itemid instead
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
UPDATE
I made my queries and schema match your original question exactly, multi-column key and all. The only possible difference is that I populated each table with two million entries. My query (your query) runs in 0.15 seconds.
delimiter $$
set @userid = 6
$$
SELECT
tbladdeditem.addeditemid, tblprivateitem.iitemid, tblprivateitem.itemid
FROM tbladdeditem
INNER JOIN tblprivateitem
ON tblprivateitem.itemid=tbladdeditem.itemid
AND (tblprivateitem.userid=@userid or tblprivateitem.userid = 1)
WHERE tbladdeditem.userid=@userid
I have the same explain that you do, and with my data my query returns over a thousand matches without any issue at all. Being completely at a loss, as you really shouldn't be having these issues -- is it possible you are running a very limiting version of MySQL? Are you running 64-bit? Plenty of memory?
I had made the assumption that your query wasn't performing well, and when mine was, assumed I had fixed your problem. So now I eat crow. I'll post some of the avenues I went down. But I'm telling you, your query the way you posted it originally works just fine. I can only imagine your MySQL is thrashing to the hard drive or something. Sorry I couldn't be more help.
PREVIOUS RESPONSE (Which is also an update)
I broke down and recreated your problem in my own database. After trying independent indexes on userid and on itemid I was unable to get the query below a few seconds, so I set up very specific multi-column keys as directed by the query. Notice that on tbladdeditem the multi-column key begins with itemid, while on tblprivateitem the columns are reversed:
Here is the schema I used:
CREATE TABLE `tbladdeditem` (
`addeditemid` int(11) NOT NULL AUTO_INCREMENT,
`itemid` int(11) NOT NULL,
`userid` mediumint(9) NOT NULL,
PRIMARY KEY (`addeditemid`),
KEY `userid` (`userid`),
KEY `i_and_u` (`itemid`,`userid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `tblprivateitem` (
`privateitemid` int(11) NOT NULL AUTO_INCREMENT,
`userid` mediumint(9) NOT NULL DEFAULT '1',
`itemid` int(10) NOT NULL,
`iitemid` mediumint(9) NOT NULL,
PRIMARY KEY (`privateitemid`),
KEY `userid` (`userid`),
KEY `itemid` (`itemid`),
KEY `u_and_i` (`userid`,`itemid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
I filled each table with 2 million entries of random data. I made some assumptions:
userid varies from 1 to 2000
itemid varies between 1 and 10000
This gives each user about a thousand entries in each table.
Here are two versions of the query (I'm using workbench for my editor):
Version 1 - do all the filtering on the join.
Result: 0.016 seconds to return 1297 rows
delimiter $$
set @userid = 3
$$
SELECT
a.addeditemid,
p.iitemid,
p.itemid
FROM tblprivateitem as p
INNER JOIN tbladdeditem as a
ON (p.userid in (1, @userid))
AND p.itemid = a.itemid
AND a.userid = @userid
$$
Here's the explain:
EXPLAIN:
id select_type table type key ref rows extra
1 SIMPLE p range u_and_i 2150 Using where; Using index
1 SIMPLE a ref i_and_u 1 Using where; Using index
Version 2 - filter up front
Result: 0.015 seconds to return 1297 rows
delimiter $$
set @userid = 3
$$
SELECT
a.addeditemid,
p.iitemid,
p.itemid
from
(select userid, itemid, iitemid from tblprivateitem
where userid in (1, @userid)) as p
join tbladdeditem as a on p.userid = a.userid and a.itemid = p.itemid
where a.userid = @userid
$$
Here's the explain:
id select_type table type key ref rows extra
1 PRIMARY <derived2> ALL null null 2152
1 PRIMARY a ref i_and_u p.itemid,const 1 Using where; Using index
2 DERIVED p1 range u_and_i 2150 Using where
Since you have the predicate condition tbladdeditem.userid=?userid in the where clause, I don't think you need it in the join condition. Try removing it from the join condition. If you are using the OR to handle the case where the parameter is null, use Coalesce instead of the OR; if not, leave it as an OR.
-- If Or is to provide default for when (?userid is null...
SELECT a.addeditemid, p.iitemid, p.itemid
FROM tbladdeditem a
JOIN tblprivateitem p
ON p.itemid=a.itemid
WHERE a.userid=?userid
AND p.userid=Coalesce(?userid, 1)
-- if not then
SELECT a.addeditemid, p.iitemid, p.itemid
FROM tbladdeditem a
JOIN tblprivateitem p
ON p.itemid=a.itemid
WHERE a.userid=?userid
AND (p.userid=?userid Or p.userid = 1)
Second, if there is not an index on the userId column in these two tables, consider adding one.
Finally, if these all fail, try converting to two separate queries and unioning them together:
Select a.addeditemid, p.iitemid, p.itemid
From tbladdeditem a
Join tblprivateitem p
On p.itemid=a.itemid
And p.userId = a.Userid
Where p.userid=?userid
Union
Select a.addeditemid, p.iitemid, p.itemid
From tbladdeditem a
Join tblprivateitem p
On p.itemid=a.itemid
And p.userId = a.Userid
Where p.userid = 1
I would try this instead: on your original JOIN you have an OR associated with a parameter; move that to your WHERE clause.
SELECT
tbladdeditem.addeditemid,
tblprivateitem.iitemid,
tblprivateitem.itemid
FROM tbladdeditem
INNER JOIN tblprivateitem
ON tblprivateitem.itemid=tbladdeditem.itemid
WHERE tbladdeditem.userid=?userid
AND (tblprivateitem.userid=?userid OR tblprivateitem.userid=1)

MySQL: Update rows in table by iterating and joining with another one

I have a table papers
CREATE TABLE `papers` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`title` varchar(1000) CHARACTER SET utf8 COLLATE utf8_unicode_ci DEFAULT NULL,
`my_count` int(11) NOT NULL,
PRIMARY KEY (`id`),
FULLTEXT KEY `title_fulltext` (`title`)
) ENGINE=MyISAM AUTO_INCREMENT=1617432 DEFAULT CHARSET=utf8 COLLATE=utf8_bin
and another table link_table
CREATE TABLE `auth2paper2loc` (
`auth_id` int(11) NOT NULL,
`paper_id` int(11) NOT NULL,
`loc_id` int(11) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
The column papers.id in the upper table is the same as link_table.paper_id in the second table. I want to iterate through every row in the upper table, count how many times its id appears in the second table, and store that count in the column "my_count" in the upper table.
Example: if the paper with id = 1 appears 5 times as paper_id in link_table, then my_count = 5.
I can do that with a Python script, but it results in too many queries, and I have millions of entries, so it is really slow. And I can't figure out the right syntax to do this directly inside of MySQL.
This is what I am iterating about in a for-loop in Python (too slow):
SELECT count(link_table.auth_id) FROM link_table
WHERE link_table.paper_id = %s
UPDATE papers SET auth_count = %s WHERE id = %s
Could someone please tell me how to write this query? There must be a way to nest this and put it directly in MySQL so it is faster, isn't there?
How does this perform for you?
update papers a
set my_count = (select count(*)
from auth2paper2loc b
where b.paper_id = a.id);
Use either:
UPDATE PAPERS
SET my_count = (SELECT COUNT(b.paper_id)
FROM auth2paper2loc b
WHERE b.paper_id = PAPERS.id)
...or:
UPDATE PAPERS
LEFT JOIN (SELECT b.paper_id,
COUNT(b.paper_id) AS numCount
FROM auth2paper2loc b
GROUP BY b.paper_id) x ON x.paper_id = PAPERS.id
SET my_count = COALESCE(x.numCount, 0)
The COALESCE is necessary to convert the NULL to a zero when there aren't any instances of PAPERS.id in the auth2paper2loc table.
update papers left join
(select paper_id, count(*) total from auth2paper2loc group by paper_id) X
on papers.id = X.paper_id
set papers.my_count = IFNULL(X.total, 0)
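All three answers do the whole job in one set-based statement instead of a per-row loop. A runnable sketch of the correlated-subquery variant (Python with SQLite, which supports this form directly; SQLite lacks MySQL's multi-table UPDATE ... JOIN syntax, so the LEFT JOIN variants would need rewriting there):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE papers (
    id INTEGER PRIMARY KEY,
    title TEXT,
    my_count INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE auth2paper2loc (auth_id INTEGER, paper_id INTEGER, loc_id INTEGER);

INSERT INTO papers (id, title) VALUES (1, 'p1'), (2, 'p2'), (3, 'p3');
INSERT INTO auth2paper2loc (auth_id, paper_id) VALUES
    (10, 1), (11, 1), (12, 1), (10, 2);
""")

# One statement replaces the whole Python loop; COUNT(*) over an empty
# subset is 0, so papers with no links get my_count = 0 automatically.
con.execute("""
UPDATE papers
SET my_count = (SELECT COUNT(*) FROM auth2paper2loc b WHERE b.paper_id = papers.id)
""")
```

With an index on auth2paper2loc.paper_id, each subquery becomes an index-only count, which is what makes the set-based form fast at millions of rows.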