How to use MySQL's REPLACE Syntax in Rails - mysql

Can I use REPLACE without patching ActiveRecord or executing raw SQL in Rails 4?
I just want to save a record only if there is no corresponding data in its table.
I know the find_or_create_by method, but I guess it generates two queries, a SELECT and an INSERT.
If there is another INSERT query between the two, it will fail, right?
Or am I worrying too much? (The system I'm working on is not a mission critical one.)

REPLACE is a MySQL extension to the SQL standard.
As ActiveRecord tries to be as database agnostic as possible, I don't think adding behaviour specific to one database is a priority. The last mention I found of people asking for it was in forum posts from 2009.
Anyway, I'm not sure if using REPLACE would be the same as using find_or_create_by.
From MySQL Reference:
REPLACE works exactly like INSERT, except that if an old row in the table has the same value as a new row for a PRIMARY KEY or a UNIQUE index, the old row is deleted before the new row is inserted.
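To make the delete-then-insert behaviour concrete, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, since SQLite's REPLACE also removes the conflicting row before inserting; the accounts table and its columns are hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL here; the table and columns are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " user_id INTEGER PRIMARY KEY,"
    " nickname TEXT,"
    " balance INTEGER)"
)
conn.execute(
    "INSERT INTO accounts (user_id, nickname, balance) VALUES (1, 'ann', 50)"
)

# REPLACE names only user_id and nickname. The old row is deleted and the
# new row gets the column default (NULL) for balance.
conn.execute("REPLACE INTO accounts (user_id, nickname) VALUES (1, 'ann')")

row = conn.execute(
    "SELECT user_id, nickname, balance FROM accounts"
).fetchone()
row_count = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
print(row, row_count)
```

The stored balance is gone after the REPLACE, which is why it is not a drop-in substitute for "insert only if missing".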
find_or_create_by behaves differently, since it will leave the record as it is, if it already exists:
# File 'activerecord/lib/active_record/relation.rb', line 200
def find_or_create_by(attributes, &block)
  find_by(attributes) || create(attributes, &block)
end
Also, as you mentioned, find_or_create_by can have a race condition if you don't use it properly. You should use it like this (from ActiveRecord Documentation):
begin
  CreditAccount.find_or_create_by(user_id: user.id)
rescue ActiveRecord::RecordNotUnique
  retry
end
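The retry pattern only helps when the table has a unique index, because that index is what makes the losing INSERT fail. A rough Python sketch of the same idea follows, assuming SQLite as a stand-in database; the credit_accounts table, the find_or_create_account helper and the max_attempts parameter are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The UNIQUE constraint is what turns a lost race into a catchable error.
conn.execute(
    "CREATE TABLE credit_accounts ("
    " id INTEGER PRIMARY KEY,"
    " user_id INTEGER UNIQUE)"
)


def find_or_create_account(conn, user_id, max_attempts=3):
    """Return the id of the row for user_id, creating it if needed."""
    for _ in range(max_attempts):
        found = conn.execute(
            "SELECT id FROM credit_accounts WHERE user_id = ?", (user_id,)
        ).fetchone()
        if found is not None:
            return found[0]
        try:
            cur = conn.execute(
                "INSERT INTO credit_accounts (user_id) VALUES (?)", (user_id,)
            )
            conn.commit()
            return cur.lastrowid
        except sqlite3.IntegrityError:
            # Another writer inserted the row between SELECT and INSERT;
            # loop again so the SELECT finds it.
            continue
    raise RuntimeError("could not find or create the row")


first = find_or_create_account(conn, 42)
second = find_or_create_account(conn, 42)
total = conn.execute("SELECT COUNT(*) FROM credit_accounts").fetchone()[0]
```

Both calls return the same id and only one row exists, whichever caller wins the insert.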

Related

Update a specific column if it exists, without failing if it does not

I am working with an application which needs to function with any of 300+ different MySQL databases on the same server. The databases all have nearly identical table structures, with slight variations. For example, a particular column might be present in a table for only some of the databases.
I'm wondering if there is a way that, when performing an update on a table, I can update a specific column if it exists, but still successfully execute if the column does not exist.
For example, say I have a basic update statement like this:
UPDATE some_table
SET col1 = "some value",
    col2 = "another value",
    col3 = "a third value"
WHERE id = 567
What can I do to make it so that, if col3 doesn't actually exist when that query is run, the statement still executes and col1 and col2 are still updated with the new values?
I have tried using IF and CASE, but those seem to only allow changing the value based on some condition, not whether or not a column actually gets updated.
I know I can query the database for the existence of the column, then use a simple if condition in the application code to choose a different query. However, that requires me to query the database twice: once to see if the column exists, and again to actually update it. I'd prefer to do it with one SQL query if possible. I feel like the application code might start to get unwieldy with lots of extra code to check the existence of this-or-that column and conditionally build queries, instead of just having one query which works regardless of which database the application happens to be running against at the time.
To clarify, any given instance of the application only ever runs against one database; there is a different application instance for each database, but the instances will all be running the same code. These are legacy databases that legacy code is also relying on, so I don't want to modify the actual structures in the database to make them more consistent, for fear of breaking the legacy code.
No, the syntax of your SQL query, including all column identifiers you reference, must be fixed at the time it is parsed, before it validates that the columns exist.
A given UPDATE will either succeed fully or fail fully. There is no way to update some of the columns if the query fails to update all of them.
You have two choices:
Query INFORMATION_SCHEMA.COLUMNS first, to check what columns exist in the table for a given schema. Then format your UPDATE query, including clauses to set each column only if the column exists in that instance of the table.
Or...
Run several UPDATE statements, one for each column you want to update. Each statement will succeed or fail independently, but you can catch the error and continue on to the remaining statements. You can put all these statements in a transaction, so the set of changes is committed atomically, regardless of how many succeed (a single failed statement does not roll back a transaction).
Either way, it requires you to write more code. That's the unavoidable cost of supporting such variable table structure.
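The first choice might look roughly like the sketch below. It is an illustration under assumptions, not a MySQL implementation: it uses SQLite, where PRAGMA table_info plays the role that a query against INFORMATION_SCHEMA.COLUMNS would play in MySQL, and the helper names existing_columns and update_known_columns are hypothetical.

```python
import sqlite3


def existing_columns(conn, table):
    # SQLite stand-in for querying INFORMATION_SCHEMA.COLUMNS in MySQL.
    # `table` must come from trusted code, never from user input.
    return {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}


def update_known_columns(conn, table, row_id, values):
    """Update only those keys of `values` that exist as columns."""
    present = existing_columns(conn, table)
    usable = {col: val for col, val in values.items() if col in present}
    if not usable:
        return 0
    # Column names are safe to interpolate: each one was just confirmed
    # to be a real column of the table. Values go in as parameters.
    assignments = ", ".join(f"{col} = ?" for col in usable)
    cur = conn.execute(
        f"UPDATE {table} SET {assignments} WHERE id = ?",
        (*usable.values(), row_id),
    )
    return cur.rowcount


conn = sqlite3.connect(":memory:")
# This instance of the table has no col3.
conn.execute(
    "CREATE TABLE some_table (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT)"
)
conn.execute(
    "INSERT INTO some_table (id, col1, col2) VALUES (567, 'old', 'old')"
)

changed = update_known_columns(
    conn,
    "some_table",
    567,
    {"col1": "some value", "col2": "another value", "col3": "a third value"},
)
result = conn.execute(
    "SELECT col1, col2 FROM some_table WHERE id = 567"
).fetchone()
```

Since each application instance talks to a single database, the column set could be read once at startup and cached, so the extra query is not paid on every update.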

Check if rows exist in table

Consider this code. This code inserts the row to the database if it is not found, then only updates it if the row is found. The updateNode() method gives the entity some values based on the user input, so I called it in both cases.
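session.beginTransaction();
node = (Node) session.createQuery("from Node").uniqueResult();
if (node == null) {
    // No row yet: create the entity, fill it in, and persist it.
    node = new Node();
    updateNode();
    session.save(node);
} else {
    // Row exists: changes to the managed entity are flushed on commit.
    updateNode();
}
session.getTransaction().commit();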
Is there a better way of checking if rows exist in the table aside from using queries?
Is the cat alive or dead? You don't know without checking it, so you'll have to execute the query in the database.
I assume your question is about avoiding writing such a query manually and letting Hibernate do it instead. In that case you may want to look at querying by example/prototype.
Regardless of the approach taken, keep concurrency issues in mind though; you may want to apply some unique constraints and/or optimistic/pessimistic locks.
The only way to find out if something already exists in the database is to query it. However, you do not need a separate query. One query is enough instead of two, thanks to MySQL's INSERT ... ON DUPLICATE KEY UPDATE feature, and it doesn't need any additional Java coding either.
If you want to do this with Hibernate, it will have to be a custom query, and you will need to return the inserted row id with LAST_INSERT_ID in your query.
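As a rough illustration of the single-statement approach, the sketch below uses SQLite's analogous upsert syntax (INSERT ... ON CONFLICT ... DO UPDATE, which needs SQLite 3.24 or later) rather than MySQL's ON DUPLICATE KEY UPDATE. The node table, its columns and the upsert_node helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT UNIQUE,"
    " hits INTEGER)"
)


def upsert_node(conn, name):
    # SQLite spelling of an upsert. The MySQL form mentioned above is
    # INSERT ... ON DUPLICATE KEY UPDATE.
    conn.execute(
        "INSERT INTO node (name, hits) VALUES (?, 1) "
        "ON CONFLICT(name) DO UPDATE SET hits = hits + 1",
        (name,),
    )


upsert_node(conn, "root")  # no row yet, so this inserts
upsert_node(conn, "root")  # row exists, so this updates it in place
rows = conn.execute("SELECT name, hits FROM node").fetchall()
```

Either way, the statement depends on a primary key or unique index to detect the existing row.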

Manually increment primary key - Transaction and race condition

This may not be a real world issue but is more like a learning topic.
Using PHP, MySQL and PDO, I know all about auto_increment and lastInsertId(). Consider that the primary key has no auto_increment attribute and we have to use something like SELECT MAX(id) FROM table to retrieve the last id, increment it manually, and then INSERT INTO table (id) VALUES (:lastIdPlusOne). The whole code is wrapped in beginTransaction and commit.
Is this approach safe? If users A and B load this script at the same time, what will happen in the end? Will both transactions fail? Or will both succeed (for instance, if the last id was 10, A will insert 11 and B will insert 12)?
Note that since I am a PHP & MySQL developer, I am more interested in MySQL's behavior in this case.
If both got the same max, then the one that inserts first will succeed, and the other(s) will fail.
To overcome this issue without using auto_increment fields, you may use a BEFORE INSERT trigger that does the job (new.id = max), i.e. the same logic, but in a trigger, so the DB server is the one that controls it.
Not sure though if this is 100% safe in a master-master replication environment in case of a server failure.
This is @eggyal's comment, which I quote here:
You must ensure that you use a locking read to fetch the MAX() in the first (select) query; it will then block until the transaction is committed. However, this is very poor design and should not be used in a production system.
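The failure described above can be reproduced deterministically by performing both reads before either write. This is only a simulation on a single SQLite connection, not two real MySQL sessions, and the items table and read_next_id helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
conn.execute("INSERT INTO items (id) VALUES (10)")


def read_next_id(conn):
    # A plain, non-locking read of the current maximum.
    return conn.execute("SELECT MAX(id) + 1 FROM items").fetchone()[0]


# Both "users" read before either one writes, so both compute 11.
next_for_a = read_next_id(conn)
next_for_b = read_next_id(conn)

# A inserts first and succeeds.
conn.execute("INSERT INTO items (id) VALUES (?)", (next_for_a,))

# B then tries the same id and hits the duplicate primary key.
try:
    conn.execute("INSERT INTO items (id) VALUES (?)", (next_for_b,))
    b_failed = False
except sqlite3.IntegrityError:
    b_failed = True
```

With a locking read, B's SELECT would instead wait until A commits and then see 11 as the maximum.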

MySQL and implementing something close to sequences?

I am in the process of moving from Oracle to MySQL and would like some advice on whether the way I am implementing something similar to sequences in MySQL is a good one.
Essentially, I currently plan to have a separate table in MySQL for each sequence in Oracle, with a single column that represents the last_number, and to increment this column whenever I insert a new row. Another way would be to create a single table with several rows, one per sequence, and increment the relevant row whenever I do an insert.
A simpler way would be to just do a select max()+1 on the relevant column when inserting data.
I'm basically thinking of switching to the select max()+1 option as it seems simpler to implement, but I would like some advice on which of these options you think is best, and whether there are any pitfalls that I am not aware of when using select max()+1.
Also, the reason I am not using auto_increment and the function last_insert_id() is that I want to follow the ANSI standard.
Thanks.
First of all: The max()+1 version is NOT guaranteed to give you a sequence, if you use transactions in a high isolation level.
The way we typically use sequences (if we can't avoid them) is to create a table with an AUTO_INCREMENT value, INSERT INTO it, SELECT last_insert_id(), DELETE FROM table WHERE field<$LASTINSERTID. This is of course done in a stored procedure.
There is a read consistency problem, in that two sessions both running ...
insert into ... select max(..)+1 from ...
... at the same time both see the same value of max(...), hence they both try to insert the same new value.
You have the same problem with your table-of-maxima method, and you have to use a locking mechanism to avoid multiple sessions reading the same value. This leads to a concurrency problem where inserts to the table are serialised.
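A sketch of the single-table variant with locking might look like this. It assumes SQLite, where BEGIN IMMEDIATE takes the write lock up front; in MySQL the corresponding step would be a locking read such as SELECT ... FOR UPDATE inside the transaction. The sequences table, the order_seq name and the next_value helper are hypothetical.

```python
import sqlite3

# isolation_level=None lets the code issue BEGIN/COMMIT itself.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE sequences ("
    " name TEXT PRIMARY KEY,"
    " last_number INTEGER NOT NULL)"
)
conn.execute("INSERT INTO sequences (name, last_number) VALUES ('order_seq', 0)")


def next_value(conn, name):
    """Increment the named sequence and return its new value."""
    # Take the write lock before reading, so no other session can read
    # the same last_number while this transaction is open.
    conn.execute("BEGIN IMMEDIATE")
    try:
        conn.execute(
            "UPDATE sequences SET last_number = last_number + 1 WHERE name = ?",
            (name,),
        )
        value = conn.execute(
            "SELECT last_number FROM sequences WHERE name = ?", (name,)
        ).fetchone()[0]
        conn.execute("COMMIT")
        return value
    except Exception:
        conn.execute("ROLLBACK")
        raise


values = [next_value(conn, "order_seq") for _ in range(3)]
```

The lock is exactly the serialisation cost mentioned above: callers asking for the same sequence queue up behind one another.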

MySQL UPDATE vs INSERT and DELETE

I am working on a web app project with a rather large HTML form whose data needs to be stored in a table. The form and the insert are already done, but my client wants to be able to load the saved data back into the HTML form and change it. That is no problem in itself, but a question came up when I went to do the update: would it be appropriate to just keep the insert query and then delete the old row if it was an edit?
Basically, what already happens is when the form is submitted all of the data is put into a table using INSERT, I also have a flag called edit that contains the primary key ID if the data is for an existing field being updated. I can handle the update function two ways:
a) Create an actual update query with all the fields/data set and use an if/else to decide whether to run the update or insert query.
b) Do the insert every time but add a single line to DELETE WHERE row=editID after the insert is successful.
Since the Delete would only happen if the INSERT was successful I don't run the risk of deleting the data without inserting, thus losing the data, but since INSERT/DELETE is two queries, would it be less efficient than just using an if/else to decide whether to run an insert or update?
There is a second table that uses the auto-increment id as a foreign key, but this table has to be updated every time the form is submitted, so if I delete the row in table A, I will also be deleting the associated rows from table b. This seems like it would be bad programming practice, so I am leaning towards option a) anyway, but it is very tempting just to use the single line option. The DELETE would basically be as follows. Would this in fact be bad programming practice? Aside from conventions, are there any reasons why this is a "never do that!" type of code?
if ($insertFormResults) {
    $formId = mysql_insert_id();
    echo "Your form was saved successfully.";
    if (isset($_POST['edit'])) {
        // Cast to int so the posted value cannot inject SQL.
        $editId = (int) $_POST['edit'];
        $query = "DELETE FROM registerForm WHERE id = $editId";
        $result = mysql_query($query);
    }
}
Whilst the INSERT/DELETE option would work perfectly well I'd recommend against it as:
Unless you bundle the INSERT/DELETE into a single transaction, or better yet encapsulate it in a stored procedure, you run the theoretical risk of accumulating duplicates. If you use a stored procedure or a transaction, you are effectively just rewriting the UPDATE statement, which is obviously inefficient and will raise a few WTF eyebrows later from anyone maintaining your code.
Although it doesn't sound like an issue in your case, you are potentially impacting referential integrity, should you need that.
Furthermore, you are losing the rather useful ability to easily retrieve records in creation order.
Probably not a great consideration on a small application, but you are going to end up with a seriously fragmented database fairly quickly, which will slow data retrieval.
Update is only one round trip to the server, which is more efficient. Unless you have a reason that involves the possibility of bad data, always default to using an UPDATE.
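Option a) from the question can stay fairly small. The sketch below is an illustration in Python with SQLite rather than the asker's PHP and MySQL; the register_form and attendees tables and the save_form helper are hypothetical. It shows that an UPDATE keeps the primary key, so rows in a second table that reference it stay attached.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE register_form ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " name TEXT)"
)
conn.execute(
    "CREATE TABLE attendees ("
    " id INTEGER PRIMARY KEY,"
    " form_id INTEGER,"
    " label TEXT)"
)


def save_form(conn, name, edit_id=None):
    """Insert a new form, or update the existing one when edit_id is given."""
    if edit_id is None:
        cur = conn.execute(
            "INSERT INTO register_form (name) VALUES (?)", (name,)
        )
        return cur.lastrowid
    conn.execute(
        "UPDATE register_form SET name = ? WHERE id = ?", (name, edit_id)
    )
    return edit_id


form_id = save_form(conn, "first draft")
conn.execute(
    "INSERT INTO attendees (form_id, label) VALUES (?, 'guest')", (form_id,)
)

# Editing goes through UPDATE, so the id and the child row are untouched.
same_id = save_form(conn, "edited", edit_id=form_id)
linked = conn.execute(
    "SELECT COUNT(*) FROM attendees WHERE form_id = ?", (same_id,)
).fetchone()[0]
```

With the insert-then-delete approach, the edited form would get a new id and the child rows would have to be moved or recreated.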
It seems to me that doing the delete is pointless. If you run an UPDATE in MySQL, it will only change the record if the new values differ from what is already stored. Is there some reason why you would need to do a delete instead? I usually use a switch statement to catch update/delete calls from the user:
<?php
// $action is assumed to hold the operation requested by the user.
switch ($action) {
    case "delete":
        // code to run when $action equals "delete"
        break;
    case "edit":
        // code to run when $action equals "edit"
        break;
}
?>