Consider this code. It inserts the row into the database if it is not found, and otherwise only updates the existing row. The updateNode() method sets some values on the entity based on the user input, so I call it in both cases.
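session.beginTransaction();
node = (Node) session.createQuery("from Node").uniqueResult();
if (node == null) {
    // No row yet: create the entity, fill it in, and insert it.
    node = new Node();
    updateNode();
    session.save(node);
} else {
    // Row exists: changes to the persistent entity are flushed on commit.
    updateNode();
}
session.getTransaction().commit();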
Is there a better way of checking if rows exist in the table aside from using queries?
Is the cat alive or dead? You don't know without checking it, so you'll have to execute the query in the database.
I assume your question is about avoiding writing such a query manually and letting Hibernate do it instead. If so, you may want to look at querying by example/prototype.
Regardless of the approach taken, keep concurrency issues in mind though; you may want to apply some unique constraints and/or optimistic/pessimistic locks.
The only way to find out if something already exists in the database is to query it. However, you do not need a separate query: one statement is enough thanks to MySQL's INSERT ... ON DUPLICATE KEY UPDATE feature, and it needs no additional Java code either.
If you want to do this with Hibernate, it will have to be a custom query, and you will need to return the inserted row id with LAST_INSERT_ID in your query.
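As a rough illustration of the single-statement approach, here is a minimal sketch using plain JDBC rather than Hibernate. The table `node(name, payload)` with a unique key on `name`, and the class and method names, are assumptions made up for this example; they do not come from the question.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class NodeUpsert {

    // Hypothetical table: node(name VARCHAR(64) UNIQUE, payload VARCHAR(255)).
    // MySQL inserts the row, or updates payload if the unique key already exists.
    static final String UPSERT_SQL =
        "INSERT INTO node (name, payload) VALUES (?, ?) "
      + "ON DUPLICATE KEY UPDATE payload = VALUES(payload)";

    /** Runs the upsert and returns the row count reported by the driver. */
    static int upsert(Connection conn, String name, String payload) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(UPSERT_SQL)) {
            ps.setString(1, name);
            ps.setString(2, payload);
            return ps.executeUpdate();
        }
    }
}
```

This only works if the table really has a unique or primary key on the columns that identify the row; without one, every call inserts a new row.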
I am working with an application which needs to function with any of 300+ different MySQL databases on the same server. The databases all have nearly identical table structures, with slight variations. For example, a particular column might be present in a table for only some of the databases.
I'm wondering if there is a way that, when performing an update on a table, I can update a specific column if it exists, but still successfully execute if the column does not exist.
For example, say I have a basic update statement like this:
UPDATE some_table
SET col1 = "some value",
col2 = "another value",
col3 = "a third value"
WHERE id = 567
What can I do to make it so that, if col3 doesn't actually exist when that query is run, the statement still executes and col1 and col2 are still updated with the new values?
I have tried using IF and CASE, but those seem to only allow changing the value based on some condition, not whether or not a column actually gets updated.
I know I can query the database for the existence of the column, then use a simple if condition in the application code to choose a different query. However, that requires me to query the database twice: once to see if the column exists, and again to actually update it. I'd prefer to do it with one SQL query if possible. I feel like the application code might start to get unwieldy with lots of extra code to check the existence of this-or-that column and conditionally build queries, instead of just having one query which works regardless of which database the application happens to be running against at the time.
To clarify, any given instance of the application only ever runs against one database; there is a different application instance for each database, but the instances will all be running the same code. These are legacy databases that legacy code also relies on, so I don't want to modify the actual structures in the database to make them more consistent, for fear of breaking the legacy code.
No, the syntax of your SQL query, including all column identifiers you reference, must be fixed at the time it is parsed, before it validates that the columns exist.
A given UPDATE will either succeed fully or fail fully. There is no way to update some of the columns if the query fails to update all of them.
You have two choices:
Query INFORMATION_SCHEMA.COLUMNS first, to check what columns exist in the table for a given schema. Then format your UPDATE query, including clauses to set each column only if the column exists in that instance of the table.
Or...
Run several UPDATE statements, one for each column you want to update. Each statement will succeed or fail independently, but you can catch the error and continue on to the remaining statements. You can put all these statements in a transaction, so the set of changes is committed atomically, regardless of how many succeed (a single failed statement does not roll back a transaction).
Either way, it requires you to write more code. That's the unavoidable cost of supporting such variable table structure.
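The first choice could look roughly like the following sketch. It uses plain JDBC; the class name, method names, and the assumption that the key column is called `id` (as in the question's example) are illustrative only.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class ConditionalUpdate {

    /** Reads the column names of one table from INFORMATION_SCHEMA.COLUMNS. */
    static Set<String> existingColumns(Connection conn, String schema, String table)
            throws SQLException {
        String sql = "SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS "
                   + "WHERE TABLE_SCHEMA = ? AND TABLE_NAME = ?";
        Set<String> columns = new HashSet<>();
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, schema);
            ps.setString(2, table);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    columns.add(rs.getString(1).toLowerCase(Locale.ROOT));
                }
            }
        }
        return columns;
    }

    /**
     * Builds "UPDATE t SET a = ?, b = ? WHERE id = ?" from only those wanted
     * columns that exist, and appends the bind values to params in order.
     * Table and column names are concatenated, so they must be trusted input.
     */
    static String buildUpdate(String table, Map<String, Object> wanted,
                              Set<String> existing, Object id, List<Object> params) {
        List<String> assignments = new ArrayList<>();
        for (Map.Entry<String, Object> e : wanted.entrySet()) {
            if (existing.contains(e.getKey().toLowerCase(Locale.ROOT))) {
                assignments.add(e.getKey() + " = ?");
                params.add(e.getValue());
            }
        }
        if (assignments.isEmpty()) {
            throw new IllegalArgumentException("none of the requested columns exist");
        }
        params.add(id);
        return "UPDATE " + table + " SET " + String.join(", ", assignments)
             + " WHERE id = ?";
    }
}
```

Because each application instance only ever talks to one database, the result of existingColumns() could be read once at startup and cached, so the extra query is not paid on every update.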
I am in the process of moving from Oracle to MySQL and would like some advice on whether the way I am implementing something similar to sequences in MySQL is a good one.
My current plan is to have a separate table in MySQL for each Oracle sequence, with a single column holding the last_number, and to increment that column whenever I insert a new row. Another way would be to create a single table with one row per sequence and increment the relevant row whenever I do an insert.
A simpler way would be to just do a select max()+1 on the relevant column when inserting data.
I'm basically thinking of switching to the select max()+1 option as it seems simpler to implement, but I would like some advice on which of these options you think is best, and whether there are any pitfalls with select max()+1 that I am currently not aware of.
Also, the reason I am not using auto_increment and the function last_insert_id() is that I want to follow the ANSI standard.
Thanks.
First of all: The max()+1 version is NOT guaranteed to give you a sequence, if you use transactions in a high isolation level.
The way we typically use sequences (if we can't avoid them) is to create a table with an AUTO_INCREMENT value, INSERT INTO it, SELECT last_insert_id(), then DELETE FROM table WHERE field<$LASTINSERTID. This is of course done in a stored procedure.
There is a read consistency problem, in that two sessions both running ...
insert into ... select max(..)+1 from ...
... at the same time both see the same value of max(...), hence they both try to insert the same new value.
You have the same problem with your table-of-maxima method, and you have to use a locking mechanism to prevent multiple sessions from reading the same value. This leads to a concurrency problem where inserts to the table are serialised.
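The read consistency problem can be shown without a database. The sketch below is only a deterministic simulation: a list stands in for the table's id column, and both "sessions" compute max+1 from the same snapshot before either insert is visible to the other. All names are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class MaxPlusOneRace {

    /** Stands in for SELECT MAX(id) + 1 evaluated against a snapshot. */
    static int nextId(List<Integer> snapshot) {
        return snapshot.isEmpty() ? 1 : Collections.max(snapshot) + 1;
    }

    /** Two sessions read the same snapshot, then both insert their value. */
    static List<Integer> interleavedInserts(List<Integer> table) {
        int idA = nextId(table); // session A reads MAX(id) + 1
        int idB = nextId(table); // session B reads before A's insert is visible
        List<Integer> result = new ArrayList<>(table);
        result.add(idA);
        result.add(idB);
        return result;
    }
}
```

Starting from ids 1, 2, 3, both sessions compute 4. With a unique constraint on the column one of the inserts would fail; without one, the table ends up with a duplicate.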
I am working with Hibernate 3.6.4 and MySQL.
I have a table with a unique constraint on four columns, plus three other columns. When the UI application creates new instances of the corresponding object, it may create them with values for those four properties that are already in the table. The result, upon save, is of course a JDBC exception for a duplicate entry.
Is there a way to tell Hibernate not to insert a new entry but to update the other three columns instead, or do I need to manually query the DB on each save to see if the row exists and update accordingly?
Thanks.
The clean and database independent approach for this problem is to first check if such an instance exists and depending on that do an insert or update in your application logic.
That said, there might be a way to take advantage of the MySQL INSERT ... ON DUPLICATE KEY UPDATE feature documented here. In this case you must specify a custom SQL INSERT statement for your entity, as described in this related question. Whether this works depends on how your entity IDs are generated to begin with. Take a look at this blog article concerning this issue.
Generally, you must deal with every aspect of the problem that Hibernate thinks a transient instance is persisted, when in fact a persistent instance is updated. This might be an issue with generated entity IDs, other generated entity values, entity versions, concurrency, expected insert/update row count, 2nd level and query cache, etc.
So, I think while this would be a nice thing to experiment with I would definitely not use this feature in a production application.
You must indeed explicitly get the entity with the four unique values, and then update it if it exists or create a new one if it does not. There is no way around that.
BTW, note that even with such a mechanism, you might end up with exceptions if two transactions get the entity concurrently, find that it doesn't exist, and both try to create a new one.
Here is a chunk of the SQL I'm using for a Perl-based web application. I have a number of requests and each has a number of accessions, and each has a status. This chunk of code is there to update the table for every accession_analysis that shares all these fields for each accession in a request.
UPDATE accession_analysis
SET analysis_id = ?,
    reference_id = ?,
    status = ?,
    extra_parameters = ?
WHERE analysis_id = ?
  AND reference_id = ?
  AND status = ?
  AND extra_parameters = ?
  AND accession_id IN (
      SELECT accession_id
      FROM accessions
      WHERE request_id = ?
  )
I have changed the tables so that there's a status table for accession_analysis, so when I update, I update both accession_analysis and accession_analysis_status, which has status, status_text and the id of the accession_analysis, which is a not null auto_increment variable.
I have no strong idea about how to modify this code to allow this. My first pass grabbed all the accessions and looped through them, then filtered for all the fields, then updated. I didn't like that because I had many connections with short SQL commands, which I understood to be bad, but I can't help but think the only way to really do this is to go back to the loop in Perl holding two simpler SQL statements.
Is there a way to do this in SQL that, with my relative SQL inexperience, I'm just not seeing?
The answer depends on which DBMS you're using. The easiest way is to create a trigger on one table that provides the logic of updating the other table. (For any DB newbies: a trigger is procedural code attached to a table at the DBMS (not application) layer that runs in response to an insert, update or delete on the table.) A similar, slightly less desirable method is to put the logic in a stored procedure and execute that instead of the update statement you're now using.
If the DBMS you're using doesn't support either of these mechanisms, then there isn't a good way to do what you're after while guaranteeing transactional integrity. However, if the problem you're solving can tolerate a timing difference in the two tables' updates (i.e. the data in one of the tables is only used at predetermined times, like reporting or some type of batched operation), you could write to one table (live) and create a separate process that runs when needed (later) to update the second table using data from the first table. The correctness of allowing data to be updated at different times becomes a large and immovable design assumption, however.
If this is mostly about connection speed, then one option you have is to write a stored procedure that handles the "double update or insert" transparently. See the manual for stored procedures:
http://dev.mysql.com/doc/refman/5.5/en/create-procedure.html
Otherwise, you probably cannot do it in one statement; see the MySQL INSERT syntax:
http://dev.mysql.com/doc/refman/5.5/en/insert.html
The UPDATE syntax allows for multi-table updates (not in combination with INSERT, though):
http://dev.mysql.com/doc/refman/5.5/en/update.html
Each table needs its own INSERT / UPDATE in the query.
In fact, even if you create a view by JOINing multiple tables, when you INSERT into the view, you can only INSERT with fields belonging to one of the tables at a time.
The modifications made by the INSERT statement cannot affect more than one of the base tables referenced in the FROM clause of the view. For example, an INSERT into a multitable view must use a column_list that references only columns from one base table. For more information about updatable views, see CREATE VIEW.
Inserting data into multiple tables through an sql view (MySQL)
INSERT (SQL Server)
Same is true of UPDATE
The modifications made by the UPDATE statement cannot affect more than one of the base tables referenced in the FROM clause of the view. For more information on updatable views, see CREATE VIEW.
However, you can have multiple INSERTs or UPDATEs per query or stored procedure.
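One way to keep the two per-table statements consistent from application code is to run them in a single transaction. The sketch below uses plain JDBC. The table names come from the question, but the key column names (`id`, `accession_analysis_id`), the choice of updated columns, and the class and method names are assumptions for illustration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class TwoTableUpdate {

    // Column names in the WHERE clauses are assumed, not taken from the question.
    static final String UPDATE_ANALYSIS =
        "UPDATE accession_analysis SET reference_id = ? WHERE id = ?";
    static final String UPDATE_STATUS =
        "UPDATE accession_analysis_status SET status = ?, status_text = ? "
      + "WHERE accession_analysis_id = ?";

    /** Updates both tables, committing only if both statements succeed. */
    static void updateBoth(Connection conn, long analysisId, long referenceId,
                           String status, String statusText) throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try (PreparedStatement analysis = conn.prepareStatement(UPDATE_ANALYSIS);
             PreparedStatement statusStmt = conn.prepareStatement(UPDATE_STATUS)) {
            analysis.setLong(1, referenceId);
            analysis.setLong(2, analysisId);
            analysis.executeUpdate();

            statusStmt.setString(1, status);
            statusStmt.setString(2, statusText);
            statusStmt.setLong(3, analysisId);
            statusStmt.executeUpdate();

            conn.commit();
        } catch (SQLException e) {
            conn.rollback(); // undo the first update if the second one failed
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
    }
}
```

Both statements go over the same connection, so this does not open extra connections; it only adds one more round trip than a single statement would.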
I am working on a web app project with a rather large HTML form whose data needs to be stored in a table. The form and insert are already done, but my client wants to be able to load the saved data back into the HTML form and change it. Again, this is no problem, but I came across a question when going to do the update: would it be appropriate to just keep the insert query and then delete the old row if it was an edit?
Basically, what already happens is when the form is submitted all of the data is put into a table using INSERT, I also have a flag called edit that contains the primary key ID if the data is for an existing field being updated. I can handle the update function two ways:
a) Create an actual update query with all the fields/data set and use an if/else to decide whether to run the update or insert query.
b) Do the insert every time but add a single line to DELETE WHERE row=editID after the insert is successful.
Since the Delete would only happen if the INSERT was successful I don't run the risk of deleting the data without inserting, thus losing the data, but since INSERT/DELETE is two queries, would it be less efficient than just using an if/else to decide whether to run an insert or update?
There is a second table that uses the auto-increment id as a foreign key, but this table has to be updated every time the form is submitted, so if I delete the row in table A, I will also be deleting the associated rows from table b. This seems like it would be bad programming practice, so I am leaning towards option a) anyway, but it is very tempting just to use the single line option. The DELETE would basically be as follows. Would this in fact be bad programming practice? Aside from conventions, are there any reasons why this is a "never do that!" type of code?
if ($insertFormResults) {
    $formId = mysql_insert_id();
    echo "Your form was saved successfully.";
    if (isset($_POST['edit'])) {
        // Cast to int so the posted value cannot inject SQL.
        $editId = (int) $_POST['edit'];
        $query = "DELETE FROM registerForm WHERE id = $editId";
        $result = mysql_query($query);
    }
}
Whilst the INSERT/DELETE option would work perfectly well I'd recommend against it as:
Unless you bundle the INSERT/DELETE up into a single transaction, or better yet encapsulate the INSERT/DELETE in a stored procedure, you do run the theoretical risk of accumulating duplicates. If you use an SP or a transaction you're just effectively rewriting the UPDATE statement, which is obviously inefficient and moreover will give rise to a few WTF raised eyebrows later by anyone maintaining your code.
Although it doesn't sound like an issue in your case, you are potentially impacting referential integrity should you need that. Furthermore you are losing the rather useful ability to easily retrieve records in creation order.
Probably not a great consideration on a small application, but you are going to end up with a seriously fragmented database fairly quickly, which will slow data retrieval.
Update is only one round trip to the server, which is more efficient. Unless you have a reason that involves the possibility of bad data, always default to using an UPDATE.
It seems to me that doing the delete is pointless. If you run an update in MySQL, it will only update the record if it is different from what is stored already. Is there some reason why you would need to do a delete instead? I usually use a case (switch) to catch update/delete calls from the user:
<?php
// $action is assumed to hold the operation requested by the user.
switch ($action) {
    case "delete":
        // run the DELETE query for the record here
        break;
    case "edit":
        // run the UPDATE query for the record here
        break;
}
?>