I'm writing an application to:
select a small recordset from a table of subscribers (150k records);
update those rows to indicate that an email is in the process of being sent;
send email to the subscribers in the recordset;
update the rows again to indicate that the email has been sent.
The wrinkle is that the table is simultaneously being accessed by multiple clients to distribute the email workload, which is why the intermediate update (marking rows as in-process) is used: to keep the different clients from selecting the same rows, which would result in multiple emails being sent to the same subscriber. I've applied some randomizing logic to reduce the likelihood of two clients working with the same data, but it still happens occasionally.
So now I am looking at using SELECT ... FOR UPDATE in order to lock the relevant rows (so another client won't select them). My question: is it better to write the UPDATE statement based on the IDs of the SELECT...FOR UPDATE statement, or to create a loop to update each row individually?
Here's what I've got so far:
DELIMITER $$
CREATE DEFINER=`mydef`@`%` PROCEDURE `sp_SubscribersToSend`(v_limit INTEGER)
BEGIN
START TRANSACTION;
SELECT _ID, email, date_entered, DATE_FORMAT(date_entered, '%b %e, %Y') AS 'date_entered_formatted'
FROM _subscribers
WHERE send_state = 'Send'
AND status = 'Confirmed'
LIMIT v_limit
FOR UPDATE;
[[UPDATE _subscribers SET send_state = 'Sending' WHERE _ID IN (...?)]]
[[OR]]
[[Loop through the resultset and update each row?]]
COMMIT;
END
A single UPDATE seems like it would be more efficient; what is the best way to turn the _ID column of the resultset into a comma-delimited list for the IN() clause? (I've been doing this client-side until now.) Or is there a better way altogether?
Instead of trying to create a comma-delimited list, just do an UPDATE with the same criteria as the SELECT
START TRANSACTION;
UPDATE _subscribers
SET send_state = 'Sending'
WHERE send_state = 'Send'
AND status = 'Confirmed'
ORDER BY <something>
LIMIT v_limit;
SELECT _ID, email, date_entered, DATE_FORMAT(date_entered, '%b %e, %Y') AS 'date_entered_formatted'
FROM _subscribers
WHERE send_state = 'Sending'
AND status = 'Confirmed'
ORDER BY <something>
LIMIT v_limit;
COMMIT;
The SELECT filters on 'Sending' rather than 'Send' because, within the transaction, the UPDATE has already changed those rows. The ORDER BY clause is necessary to ensure that both queries process the same rows; if you use LIMIT without ORDER BY, they could select a different subset of rows.
Thanks to Barmar, I took a different tack in the stored procedure:
SET @IDs := NULL;
UPDATE _subscribers
SET send_state = 'Sending'
WHERE send_state = 'Send'
AND status = 'Confirmed'
AND (SELECT @IDs := CONCAT_WS(',', _ID, @IDs))
LIMIT v_limit;
SELECT CONVERT(@IDs USING utf8);
As Barmar suggested, it does an UPDATE but also concatenates the IDs of the rows being updated into a variable. Just SELECT that variable, and it gives you a comma-delimited list that can be passed into a PREPARE statement. (I had to use CONVERT because SELECTing the variable directly returned a binary/blob value.) So this does not use SELECT...FOR UPDATE as I originally intended, but it does ensure that the different clients won't be working with the same rows.
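For reference, a minimal sketch of that PREPARE step (column list abbreviated). Note that IN (@IDs) would compare against the single string '42,17,9', so the list has to be spliced into the SQL text first:

-- @IDs holds e.g. '42,17,9'; build the statement text dynamically
SET @sql = CONCAT('SELECT _ID, email FROM _subscribers WHERE _ID IN (', @IDs, ')');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;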
Related
I am trying to reduce the number of queries my application uses to build the dashboard, so I am trying to gather all the info I will need in advance into one table. Most of the dashboard can be built in JavaScript using the JSON, which will reduce server load compared to doing tons of PHP foreach loops, which were resulting in excess queries.
With that in mind, I have a query that pulls together user information from 3 other tables and concatenates the results into JSON, grouped by family. I need to update the JSON object any time anything changes in any of the 3 tables, but I'm not sure what the "right" way to do this is.
I could set up a regular job to do an UPDATE statement where the date is newer than the last update, but that would miss new records; and if I do inserts instead, it misses updates. I could drop and rebuild the table, but it takes about 16 seconds to run the query as a whole, so that doesn't seem like the right answer.
Here is my initial query:
SET group_concat_max_len = 100000;
SELECT family_id, REPLACE(REPLACE(REPLACE(CONCAT("[", GROUP_CONCAT(family), "]"), "\\", ""), '"[', '['), ']"', ']') as family_members
FROM (
SELECT family_id,
JSON_OBJECT(
"customer_id", c.id,
"family_id", c.family_id,
"first_name", first_name,
"last_name", last_name,
"balance_0_30", pa.balance_0_30,
"balance_31_60", pa.balance_31_60,
"balance_61_90", pa.balance_61_90,
"balance_over_90", pa.balance_over_90,
"account_balance", pa.account_balance,
"lifetime_value", pa.lifetime_value,
"orders", CONCAT("[", past_orders, "]")
) AS family
FROM
customers AS c
LEFT JOIN accounting AS pa ON c.id = pa.customer_id
LEFT JOIN (
SELECT customer_id,
GROUP_CONCAT(
JSON_OBJECT(
"id", id,
"item", item,
"price", price,
"date_ordered", date_ordered
)
) as past_orders
FROM orders
WHERE date_ordered < NOW()
GROUP BY customer_id
) AS r ON r.customer_id = c.id
where c.user_id = 1
) AS results
GROUP BY family_id
I briefly looked into triggers, but what I was hoping for was something like:
create TRIGGER UPDATE_FROM_ORDERS
AFTER INSERT OR UPDATE
ON orders
(EXECUTE QUERY FROM ABOVE WHERE family_id = orders.family_id)
I was hoping to create something like that for each table, but at first glance it doesn't look like you can run complex queries like that one, where we are building nested JSON.
Am I wrong? Are triggers the right way to do this, or is there a better way?
As a demonstration:
DELIMITER $$
CREATE TRIGGER orders_au
AFTER UPDATE ON orders
FOR EACH ROW
BEGIN
SET group_concat_max_len = 100000
;
UPDATE target_table t
SET t.somecol = ( SELECT expr
FROM ...
WHERE somecol = NEW.family_id
ORDER BY ...
LIMIT 1
)
WHERE t.family_id = NEW.family_id
;
END$$
DELIMITER ;
Notes:
MySQL triggers are row-level triggers; a trigger fires once for each row affected by the triggering statement. MySQL does not support statement-level triggers.
The reference to NEW.family_id is a reference to the value of the family_id column of the row that was just updated, the row that the trigger was fired for.
MySQL prohibits the SQL statements in a trigger from modifying any rows in the table the trigger is defined on (here, orders). They can modify other tables.
SQL statements in a trigger body can be arbitrarily complex, as long as they don't return a resultset (no bare SELECT); DML INSERT/UPDATE/DELETE statements are allowed. DDL statements (most if not all) are disallowed in a MySQL trigger.
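Since MySQL has no combined INSERT OR UPDATE event, each event needs its own trigger. A sketch of the companion AFTER INSERT trigger, following the same skeleton (target_table, somecol and the inner query are placeholders, as above):
DELIMITER $$
CREATE TRIGGER orders_ai
AFTER INSERT ON orders
FOR EACH ROW
BEGIN
SET group_concat_max_len = 100000
;
UPDATE target_table t
SET t.somecol = ( SELECT expr
FROM ...
WHERE somecol = NEW.family_id
ORDER BY ...
LIMIT 1
)
WHERE t.family_id = NEW.family_id
;
END$$
DELIMITER ;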
I'm using MySQL 5.6 and I have this issue.
I'm trying to improve my bulk update strategy for this case.
I have a table called reserved_ids, provided by an external company, which is used to assign unique IDs to invoices. There is no other way to do this; I can't use auto_increment fields or simulated sequences.
I have this stored-procedure pseudocode to make the assignment:
-- declarations implied by the pseudocode (the cursor source is an assumption):
DECLARE done INT DEFAULT FALSE;
DECLARE internalID BIGINT;
DECLARE v_secuencial BIGINT;
DECLARE invoice_cursor CURSOR FOR
SELECT INVOICE_ID FROM MY_INVOICE WHERE RESERVED_ID IS NULL;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
START TRANSACTION;
OPEN invoice_cursor;
read_loop: LOOP
FETCH invoice_cursor INTO internalID;
IF done THEN
LEAVE read_loop;
END IF;
-- take the lowest free ID for this country/invoice type...
SELECT MIN(SECUENCIAL)
INTO v_secuencial
FROM RESERVED_IDS
WHERE COUNTRY_CODE = p_country_id AND INVOICE_TYPE = p_invoice_type;
-- ...then remove it from the pool and assign it to the invoice
DELETE FROM RESERVED_IDS WHERE SECUENCIAL = v_secuencial;
UPDATE MY_INVOICE SET RESERVED_ID = v_secuencial WHERE INVOICE_ID = internalID;
END LOOP read_loop;
CLOSE invoice_cursor;
COMMIT;
So it's: take one, remove it, assign it; then take the next, remove, assign... and so on.
This works, but it's very, very slow.
I don't know if there is an approach that would make this assignment faster.
I'm looking for something like INSERT INTO ... SELECT, but with an UPDATE statement, to assign 1000 or 2000 IDs at once, not one by one.
Any suggestion would be very helpful.
Thanks a lot.
EDIT 1: I have added WHERE clause details, because user @vmachan requested them. In the UPDATE...INVOICE clause, I don't filter by other criteria, because I have the direct and indexed invoice ID that I want to update. Thanks.
Finally, I have this solution. It's much faster than my initial approach.
The UPDATE query is
set @a=0;
set @b=0;
UPDATE MY_INVOICE
INNER JOIN
(
select
F.internal_id,
I.secuencial as RESERVED_ID,
CONCAT_WS(/* format your final invoice ID */) AS FINAL_MY_INVOICE_NUMBER
FROM
(
select if(@a, @a:=@a+1, @a:=1) as current_row, internal_id
from MY_INVOICE
where reserved_id is null
order by internal_id asc
limit 2000
) F
INNER JOIN
(
SELECT if(@b, @b:=@b+1, @b:=1) as current_row, secuencial
from reserved_ids
order by secuencial asc
limit 2000
) I USING (current_row)
) TEMP ON MY_INVOICE.internal_id = TEMP.internal_id
SET MY_INVOICE.RESERVED_ID = TEMP.RESERVED_ID, MY_INVOICE.FINAL_MY_INVOICE_NUMBER = TEMP.FINAL_MY_INVOICE_NUMBER
So, with autogenerated and correlated sequential numbers @a and @b, we can join two different, unrelated tables like MY_INVOICE and RESERVED_IDS.
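If the pairing trick is hard to see at first, here it is in isolation (t1, t2 and val are throwaway names, not tables from the question; note that relying on user-variable evaluation order inside a query is undocumented behavior that happens to work on MySQL 5.x, so test on your version):

set @x = 0;
set @y = 0;
-- each derived table numbers its rows; USING (rn) zips them positionally
select a.rn, a.val as left_val, b.val as right_val
from ( select @x := @x + 1 as rn, val from t1 order by val limit 1000 ) a
inner join
( select @y := @y + 1 as rn, val from t2 order by val limit 1000 ) b
using (rn);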
If you want to check this solution, please execute this tricky update following these steps:
Execute the @a initialization and then the first inner select in isolation: select if(@a, @a:=@a+1, ...
Execute the @b initialization and then the second inner select in isolation: select if(@b, @b:=@b+1, ...
Execute @a, @b and the big select that builds the TEMP auxiliary table: select F.internal_id, ...
Execute the UPDATE
Finally, remove the assigned IDs from the RESERVED_IDS table, as in the sketch below.
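That cleanup might look like this (a sketch, assuming every assigned value now sits in MY_INVOICE.reserved_id):

-- delete pool rows whose value has been assigned to an invoice
DELETE r
FROM reserved_ids r
INNER JOIN MY_INVOICE i ON i.reserved_id = r.secuencial;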
Assignment time dropped drastically. My initial solution was one by one; with this, you assign 2000 (or more) IDs in one single (OK, and a little tricky) update.
Hope this helps.
I have this query
SELECT * FROM outbox where Status=0 ;
Then I need to update the selected records so that Status equals 1,
i.e. (UPDATE outbox (the records selected by the SELECT query) SET Status = 1).
Any help?
This is a much harder problem than it sounds. Yes, in the simplistic case where you are only thinking of one user and a few records, it seems easy. But databases are designed to be ACID-compliant, with multiple users and multiple concurrent transactions all affecting the data at the same time. And there is no single statement in MySQL that does what you want (other databases support an OUTPUT clause, RETURNING, or something similar).
One structure that will work in MySQL is to place the items in a temporary table, then do the update, then return them. The following shows the semantics using transactions:
start transaction;
create temporary table TempOutboxStatus0 as
select *
from outbox
where status = 0;
update outbox o
set status = 1
where status = 0;
select *
from TempOutboxStatus0;
commit;
For the update, I actually prefer:
where exists (select 1 from TempOutboxStatus0 t where t.outboxid = o.outboxid);
because its intention is clearer -- and the code is safer in case the conditions subtly change.
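Spelled out, that variant might look like this (outboxid stands in for whatever the table's primary key actually is):

update outbox o
set status = 1
where exists (select 1 from TempOutboxStatus0 t where t.outboxid = o.outboxid);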
Note: you may want to use explicit table locks. Such considerations depend on the storage engine you are using.
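For instance, a sketch of the explicit-lock variant (mainly relevant for non-transactional engines such as MyISAM; TEMPORARY tables remain usable while the lock is held):

LOCK TABLES outbox WRITE;
CREATE TEMPORARY TABLE TempOutboxStatus0 AS
SELECT * FROM outbox WHERE status = 0;
UPDATE outbox SET status = 1 WHERE status = 0;
UNLOCK TABLES;
SELECT * FROM TempOutboxStatus0;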
BEGIN
Start transaction;
SELECT *
FROM
outbox
where
Status = 0 and
Is_Expired = 0 and
Service_ID=p_service_id
order by
Next_Try_Date asc FOR Update;
update outbox
set
Status=1
where
Status = 0 and
Is_Expired = 0 and
Service_ID=p_service_id;
commit;
END
Is this possible? It seems to work for me.
You can do something like this, where outbox is your table:
update outbox set Status = 1 where Status = 0
you can do it like below
$sql = mysql_query("SELECT * FROM outbox WHERE `Status` = 0");
while ($result = mysql_fetch_array($sql))
{
    // update each fetched row by its key; replace the placeholders
    // with your real column name and previously fetched value
    mysql_query("UPDATE `outbox` SET `Status` = 1
        WHERE `your_column_name` = '" . $result['your_column_name'] . "'");
}
I want to update multiple rows based on a SELECT sql query.
I want to do it ALL IN AN SQL SHELL!
Here is my select:
SELECT @myid := id, @mytitle := title
FROM event
WHERE pid>0 GROUP BY title
ORDER BY start;
Then, I want to do an update with this pseudocode:
foreach($mytitle as $t)
BEGIN
UPDATE event
SET pid=$myid
WHERE title=$t;
END
But I don't know how to make a loop in SQL.
Maybe there's a way to do it in a single SQL query?
I DON'T WANT ANY PHP!!! ONLY SQL SHELL CODE!!!
I want to update every row's pid with the id of the first occurrence of an event. start is a timestamp column.
I think this should do what you want, but if it doesn't (I'm not sure about joining a subquery in an UPDATE query) then you can use a temporary table instead.
UPDATE
event
JOIN (
SELECT
MIN(pid) AS minPID,
title
FROM
event
WHERE
pid > 0
GROUP BY
title
) AS findPIDsQuery ON event.title = findPIDsQuery.title
SET
event.pid = findPIDsQuery.minPID
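If your MySQL version does reject the joined subquery, the temporary-table fallback mentioned above might look like this (same logic, just materialized first):

CREATE TEMPORARY TABLE findPIDsQuery AS
SELECT MIN(pid) AS minPID, title
FROM event
WHERE pid > 0
GROUP BY title;

UPDATE event
JOIN findPIDsQuery ON event.title = findPIDsQuery.title
SET event.pid = findPIDsQuery.minPID;

DROP TEMPORARY TABLE findPIDsQuery;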
Pure SQL doesn't really have "loops", per se: it's a set-based, declarative language. I believe the following update will do what you want (though your problem statement leaves much to be desired; we know nothing about the underlying schema).
-- note: the self-reference is wrapped in a derived table to work around
-- MySQL error 1093 (you can't otherwise read the update target in a subquery)
update event t
set pid = ( select m.min_id
            from ( select x.title, min(x.id) as min_id
                   from event x
                   where x.pid > 0
                   group by x.title
                   having count(*) > 1 ) m
            where m.title = t.title
          )
Cheers!
I'm trying to write a function to SELECT the least-recently fetched value from a table in my database. I do this by SELECTing a row and then immediately changing the last_used field.
Because this involves a SELECT and UPDATE, I'm trying to do this with locks. The locks are to ensure that concurrent executions of this query won't operate on the same row.
The query runs perfectly fine in phpMyAdmin, but fails in Magento. I get the following error:
SQLSTATE[HY000]: General error
Error occurs here:
#0 /var/www/virtual/magentodev.com/htdocs/lib/Varien/Db/Adapter/Pdo/Mysql.php(249): PDOStatement->fetch(2)
Here is my model's code, including the SQL query:
$write = Mage::getSingleton('core/resource')->getConnection('core_write');
$sql = "LOCK TABLES mytable AS mytable_write WRITE, mytable AS mytable_read READ;
SELECT #val := unique_field_to_grab FROM mytable AS mytable_read ORDER BY last_used ASC LIMIT 1;
UPDATE mytable AS mytable_write SET last_used = unix_timestamp() WHERE unique_field_to_grab = #val LIMIT 1;
UNLOCK TABLES;
SELECT #val AS val;";
$result = $write->raw_fetchrow($sql, 'val');
I've also tried using raw_query and query instead of raw_fetchrow with no luck.
Any thoughts on why this doesn't work? Or is there a better way to accomplish this?
EDIT: I'm starting to think this may be related to the PDO driver, which Magento is definitely using. I think phpMyAdmin is using mysqli, but I can't confirm that.
Probably the function Magento uses doesn't support multiple SQL statements.
Call each statement separately.
exec("LOCK TABLES mytable AS mytable_write WRITE, mytable AS mytable_read READ");
exec("SELECT #val := unique_field_to_grab FROM mytable AS mytable_read ORDER BY last_used ASC LIMIT 1");
exec("UPDATE mytable AS mytable_write SET last_used = unix_timestamp() WHERE unique_field_to_grab = #val LIMIT 1");
exec("UNLOCK TABLES");
exec("SELECT #val AS val");
Use appropriate functions instead of exec().
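If the driver really only allows one statement per call, another option is to push the whole sequence server-side into a stored procedure and issue a single CALL. A sketch, not Magento-specific: LOCK TABLES is not allowed inside stored routines, so this version uses a transaction with SELECT ... FOR UPDATE instead, and the VARCHAR type for the field is an assumption:

DELIMITER $$
CREATE PROCEDURE sp_claim_least_recently_used()
BEGIN
DECLARE v_val VARCHAR(255); -- type of unique_field_to_grab is assumed
START TRANSACTION;
-- lock the least-recently-used row so concurrent callers can't grab it
SELECT unique_field_to_grab INTO v_val
FROM mytable
ORDER BY last_used ASC
LIMIT 1
FOR UPDATE;
UPDATE mytable SET last_used = UNIX_TIMESTAMP()
WHERE unique_field_to_grab = v_val;
COMMIT;
SELECT v_val AS val;
END$$
DELIMITER ;

The Magento side then runs a single statement: CALL sp_claim_least_recently_used()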