MySQL UPDATE vs INSERT and DELETE - mysql

I am working on a web app project with a rather large HTML form whose data needs to be stored in a table. The form and the insert are already done, but my client wants to be able to load the saved data back into the HTML form and change it. That is no problem in itself, but a question came up when I went to write the update: would it be appropriate to just keep the insert query and then delete the old row if the submission was an edit?
Basically, when the form is submitted, all of the data is put into a table using INSERT. I also have a flag called edit that contains the primary key ID if the data is for an existing row being updated. I can handle the update in two ways:
a) Create an actual update query with all the fields/data set and use an if/else to decide whether to run the update or insert query.
b) Do the insert every time but add a single line to DELETE WHERE row=editID after the insert is successful.
Since the Delete would only happen if the INSERT was successful I don't run the risk of deleting the data without inserting, thus losing the data, but since INSERT/DELETE is two queries, would it be less efficient than just using an if/else to decide whether to run an insert or update?
There is a second table that uses the auto-increment id as a foreign key, but this table has to be updated every time the form is submitted, so if I delete the row in table A, I will also be deleting the associated rows from table b. This seems like it would be bad programming practice, so I am leaning towards option a) anyway, but it is very tempting just to use the single line option. The DELETE would basically be as follows. Would this in fact be bad programming practice? Aside from conventions, are there any reasons why this is a "never do that!" type of code?
if ($insertFormResults) {
    $formId = mysql_insert_id();
    echo "Your form was saved successfully.";
    if (isset($_POST['edit'])) {
        // Cast to int so that user input cannot inject SQL into the query.
        $editId = (int) $_POST['edit'];
        $query = "DELETE FROM registerForm WHERE id = $editId";
        $result = mysql_query($query);
    }
}

Whilst the INSERT/DELETE option would work perfectly well I'd recommend against it as:
Unless you bundle the INSERT/DELETE up into a single transaction, or better yet encapsulate the INSERT/DELETE in a stored procedure, you do run the theoretical risk of accumulating duplicates. If you use a stored procedure or a transaction, you are effectively just rewriting the UPDATE statement, which is obviously inefficient and, moreover, will raise a few WTF eyebrows later among anyone maintaining your code.
Although it doesn't sound like an issue in your case, you are potentially impacting referential integrity, should you need that.
Furthermore, you are losing the rather useful ability to easily retrieve records in creation order.
It is probably not a great consideration in a small application, but you are going to end up with a seriously fragmented database fairly quickly, which will slow data retrieval.

Update is only one round trip to the server, which is more efficient. Unless you have a reason that involves the possibility of bad data, always default to using an UPDATE.
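Option a) from the question could be sketched roughly as follows. This is only an illustration under stated assumptions, not the asker's code: it uses Python with the standard-library sqlite3 module as a stand-in for MySQL, and the table register_form, its columns, and the function save_form are hypothetical names.

```python
import sqlite3

def save_form(conn, name, email, edit_id=None):
    """Insert a new row, or update the existing one when edit_id is given."""
    with conn:  # commits on success, rolls back on error
        if edit_id is None:
            cur = conn.execute(
                "INSERT INTO register_form (name, email) VALUES (?, ?)",
                (name, email),
            )
            return cur.lastrowid
        # Editing: the primary key stays the same, so rows in other
        # tables that reference it remain valid.
        conn.execute(
            "UPDATE register_form SET name = ?, email = ? WHERE id = ?",
            (name, email, edit_id),
        )
        return edit_id

# Hypothetical schema, kept tiny for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE register_form (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
)
form_id = save_form(conn, "Ann", "ann@example.com")
same_id = save_form(conn, "Ann", "ann@new.example.com", edit_id=form_id)
```

Both branches use bound parameters, so the edit ID and the form values never end up inside the SQL text.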

It seems to me that doing the delete is pointless. If you run an UPDATE in MySQL, it will only change the record if the new values differ from what is already stored. Is there some reason why you would need to do a delete instead? I usually use a switch statement to catch update/delete calls from the user:
<?php
// $action is assumed to hold the operation the user requested.
switch ($action) {
    case "delete":
        // run the DELETE query for the selected row
        break;
    case "edit":
        // run the UPDATE query for the selected row
        break;
}
?>

Related

Update a specific column if it exists, without failing if it does not

I am working with an application which needs to function with any of 300+ different MySQL databases on the same server. The databases all have nearly identical table structures, with slight variations. For example, a particular column might be present in a table for only some of the databases.
I'm wondering if there is a way that, when performing an update on a table, I can update a specific column if it exists, but still successfully execute if the column does not exist.
For example, say I have a basic update statement like this:
UPDATE some_table
SET col1 = "some value",
col2 = "another value",
col3 = "a third value"
WHERE id = 567
What can I do to make it so that, if col3 doesn't actually exist when that query is run, the statement still executes and col1 and col2 are still updated with the new values?
I have tried using IF and CASE, but those seem to only allow changing the value based on some condition, not whether or not a column actually gets updated.
I know I can query the database for the existence of the column, then use a simple if condition in the application code to choose a different query. However, that requires me to query the database twice: once to see if the column exists, and again to actually update it. I'd prefer to do it with one SQL query if possible. I feel like the application code might get unwieldy with lots of extra code to check for the existence of this or that column and conditionally build queries, instead of just having one query that works regardless of which database the application happens to be running against at the time.
To clarify, any given instance of the application is ever only running against one database; there is a different application instance for each database, but the instances will all be running the same code. These are legacy databases that legacy code is also relying on, so I don't want to modify the actual structures in the database to make them more consistent, for fear of breaking the legacy code.
No, the syntax of your SQL query, including all column identifiers you reference, must be fixed at the time it is parsed, before it validates that the columns exist.
A given UPDATE will either succeed fully or fail fully. There is no way to update some of the columns if the query fails to update all of them.
You have two choices:
Query INFORMATION_SCHEMA.COLUMNS first, to check what columns exist in the table for a given schema. Then format your UPDATE query, including clauses to set each column only if the column exists in that instance of the table.
Or...
Run several UPDATE statements, one for each column you want to update. Each statement will succeed or fail independently, but you can catch the error and continue on to the remaining statements. You can put all these statements in a transaction, so the set of changes is committed atomically, regardless of how many succeed (a single failed statement does not roll back a transaction).
Either way, it requires you to write more code. That's the unavoidable cost of supporting such variable table structure.
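The first choice above (look up the columns, then build the UPDATE) might look something like this. It is a sketch under assumptions: Python's standard-library sqlite3 stands in for MySQL, so PRAGMA table_info takes the place of a query against INFORMATION_SCHEMA.COLUMNS, and the helper names existing_columns and update_known_columns are made up for the example.

```python
import sqlite3

def existing_columns(conn, table):
    """Return the set of column names of `table`.

    SQLite stand-in: on MySQL this lookup would be a SELECT against
    INFORMATION_SCHEMA.COLUMNS instead of PRAGMA table_info.
    """
    # PRAGMA cannot take bound parameters, so `table` must be a trusted name.
    return {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}

def update_known_columns(conn, table, row_id, values):
    """Update only those keys of `values` that are real columns of `table`."""
    present = existing_columns(conn, table)
    cols = [c for c in values if c in present]
    if not cols:
        return 0
    # Column names come from the schema lookup; values are bound parameters.
    assignments = ", ".join(f"{c} = ?" for c in cols)
    params = [values[c] for c in cols] + [row_id]
    with conn:
        cur = conn.execute(
            f"UPDATE {table} SET {assignments} WHERE id = ?", params
        )
    return cur.rowcount

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE some_table (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT)"
)
conn.execute("INSERT INTO some_table (id, col1, col2) VALUES (567, 'old', 'old')")
changed = update_known_columns(
    conn, "some_table", 567,
    {"col1": "some value", "col2": "another value", "col3": "a third value"},
)
```

Since each application instance only ever runs against one database, the column set could be looked up once at startup and cached, which would avoid the second round trip on every update.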

SQL Trigger on Update, Insert, Delete on non-specific row, column, or table

I have several databases that are used by several applications (one of which is our own, the others we have no control over in what they do).
Our software has to know when the database was last changed. For reasons I won't get into, to keep this short, we decided to add a new table to each database with a single field, last_changed_on, that takes GetDate() as its value. This way our own software can compare that value with the date it holds in memory for the database and act if the stored value is newer.
After doing some research we decided that working with Triggers was the way to go, but from what I could find online, triggers look at specific columns that you set for Updates.
What I'd like to know is whether there is a way to automate the process, or to have a single trigger that fires whenever anything is inserted, updated, or deleted.
So I am looking for something like this:
CREATE TRIGGER LastModifiedTrigger
ON [dbo].[anytable]
AFTER INSERT, UPDATE, DELETE
AS
INSERT INTO dbo.LastModifiedTable (last_modified_on) VALUES (CURRENT_TIMESTAMP)
I know that the above example isn't a correct trigger, I'm rather new to them so I was unsure on how to word it.
It might be interesting to note that I can have my own software generate and run the queries automatically for each table and each column, but I'd rather avoid doing that, as keeping track of all those triggers will be a pain in the long run.
I'd prefer to have as few triggers per database as possible, if only by not having to make a trigger for each individual column name.
Edit: To clarify: I am trying to avoid having to create an automated script that goes and scans every table, and sequentially every column of every table, to create a trigger to see if something is changed there. My biggest issue at the moment is the trigger behavior on updates, but I'm hoping to avoid having to specify tables as well for insert and delete
Edit 2: To avoid future confusion, I'm looking for a solution to this problem for both SQL Server (MS SQL/T SQL) and MySQL
Edit 3: It turns out that I misread the documentation: at least on MySQL, the trigger fires on an update to any column, without having to define a specific one. Regardless, I'm still wondering if there is a way to have fewer triggers than one for each table in a database (i.e. 1 for any type of update(), 1 for any type of insert(), and 1 for any type of delete()).
Edit 4: I forgot that overwriting a single field comes with performance issues. I've considered this and I'm now working with multiple rows. I've also handled creating the 3 triggers (insert(), update(), and delete()) for each table through my software's code. I really wish this could have been avoided, but it cannot.
Solution
After a lot more digging on the internet, repeatedly finding the opposite of what I was looking for, and a good deal of trial and error, I found a solution.
First and foremost: a trigger that is not bound to a table (that is, one that fires for every table) is impossible. That is too bad, as it would have been nice to keep this out of the program code, but there is nothing I can do about it.
Second: my belief that update triggers had to be column-specific was my own mistake. Searching for triggers that don't depend on specific columns only turned up examples of triggers that do.
The following solution works for MySQL. I have yet to test it on SQL Server, but I don't expect it to be very different.
-- MySQL allows one event per trigger, so create this once each for INSERT, UPDATE and DELETE.
CREATE TRIGGER [tablename]_last_modified_insert
AFTER INSERT ON [db].[tablename]
FOR EACH ROW
BEGIN
    INSERT INTO [db].last_modified(last_modified_on)
    VALUES(current_timestamp());
END
As for dynamically creating these triggers, the following shows how I got it to work:
First Query:
SHOW TABLES
I run the above query to get all the tables in the database, exclude the last_modified I made myself, and loop through all of them, creating 3 triggers for each.
A big thank you to Arvo and T2PS for their replies, their comments helped by pointing me in the right direction and writing up the solution.
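The loop described above (list the tables, skip the log table, create three triggers per table) could be sketched like this. It is not the asker's code. It assumes Python with the standard-library sqlite3 module as a stand-in, so the table list comes from sqlite_master rather than SHOW TABLES, and the trigger body uses SQLite syntax. The tables customers and orders and the function name are hypothetical.

```python
import sqlite3

EVENTS = ("INSERT", "UPDATE", "DELETE")

def create_last_modified_triggers(conn, log_table="last_modified"):
    """Create one trigger per table and per event that logs a timestamp."""
    # SQLite stand-in for MySQL's SHOW TABLES.
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
        if row[0] != log_table and not row[0].startswith("sqlite_")
    ]
    for table in tables:
        for event in EVENTS:
            # Table names come from the schema itself, not from user input.
            conn.execute(
                f"CREATE TRIGGER IF NOT EXISTS "
                f"{table}_last_modified_{event.lower()} "
                f"AFTER {event} ON {table} FOR EACH ROW "
                f"BEGIN INSERT INTO {log_table} (last_modified_on) "
                f"VALUES (CURRENT_TIMESTAMP); END"
            )
    return tables

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE last_modified (last_modified_on TEXT)")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
monitored = create_last_modified_triggers(conn)

# Each statement below touches one row, so each adds one log row.
conn.execute("INSERT INTO customers (name) VALUES ('Ann')")
conn.execute("UPDATE customers SET name = 'Bob'")
```

Because the triggers are created with IF NOT EXISTS, the function can be re-run after new tables are added without failing on the ones that already have triggers.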
You're slightly off in the assumption that SQL Server triggers are per-column; the CREATE TRIGGER syntax binds the trigger to the named table for the specified operations. The trigger will be called with two logical tables in scope (inserted & deleted) that contain the rows modified by the operation that caused the trigger to fire; if you wanted to check for specific columns' values or changes, then the trigger logic would need to operate against those logical tables.
If you take this approach, you will need to create a trigger for each table you wish to monitor in this fashion; we've had a similar need to track changes (at a more granular level), we didn't find a "pseudotable" that corresponds to all tables in a schema/database. You should also be aware that locking semantics will come into play by doing this, as you will have triggers from multiple tables all targeting the same row for an update as part of separate operations -- depending on the concurrency model in effect, you could be looking at performance consequences by doing so if you expect multiple DML queries to operate concurrently against your database.
I would suggest checking Arvo's commented link above for suitability instead; querying system views is more likely to avoid contention (and other performance-related) issues from using triggers in your scenario.
Perhaps you could use Audit for SQL Server:
CREATE SERVER AUDIT [ServerAuditName]
TO FILE
(
FILEPATH = N'C:\Program Files......'
)
ALTER SERVER AUDIT [ServerAuditName] WITH (STATE=ON)
GO
CREATE DATABASE AUDIT SPECIFICATION [mySpec]
FOR SERVER AUDIT [ServerAuditName]
ADD (INSERT, UPDATE, DELETE ON DATABASE::databasename BY [public])
WITH (STATE=ON)
GO
Then you can query for changes:
SELECT *
FROM sys.fn_get_audit_file ('C:\Program Files......',default,default);
GO

re-inserting a table record and updating an auto increment primary index

I'm running MariaDB 5.5.56.
I'm looking to copy an entire row in a database, change one column, then insert the entire row back into the original database (I don't want to have to specify the individual fields because there's a lot of them). The problem I'm running into is how to deal with an auto-increment/primary key column.
example:
create temporary table t_ownership like ownership;
insert into t_ownership (select * from ownership where name='x' LIMIT 1);
update t_ownership set id='something else';
insert into ownership (select * from t_ownership);
I have a column "recno" that is an auto-increment that will create a collision in the database when I try to re-insert the slightly changed record back into the original table.
Something like this seems to work but doesn't result in an insert:
insert into ownership (select * from t_ownership) ON DUPLICATE KEY UPDATE recno=LAST_INSERT_ID(ownership.recno);
The above statement executes without error but does not add a row to table ownership.
So I think I'm close but not quite there...
What would be the best way to do this? I'd like to avoid doing an insert where I manually specify field/values. I just need to regenerate a new A.I. recno column on the insert.
NULL values inserted into auto-incremented fields end up just getting the next auto-increment value, behaving equivalent to INSERTing without specifying the field; so you should be able to update the source (temp copy) to have NULL for that field.
However, one potential issue that could present itself in scenarios like yours is that the CREATE TEMPORARY TABLE ... LIKE could result in a table that would not allow you to set such fields to NULL; this would require you to either ALTER the temporary table, or create it in a more explicit manner. Either way, it now makes code/queries that do not specify columns even more reliant on knowing columns.
Personally, I would take this route in the first place.
INSERT INTO theTable([list all but the auto-inc column])
SELECT [list all but the auto-inc column, with any replacements or modifications desired]
FROM ...[original query]...
It accomplishes the task in one query, makes the queries more self documenting, and only at the cost of a little typing (most of which a decent database browser, or query builder, will do for you).
The only argument really in favor of your current approach is that the table involved can be changed without necessarily breaking your queries; but that begs the question of whether it would be better for such table changes to break the queries, forcing them to be re-examined. If it is not an issue, it is a minor revision; but the alternative is queries that continue to be valid that have the potential to cause unexpected behavior due to copying information they were never intended to.
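The explicit-column INSERT ... SELECT suggested above could look roughly like this. It is a sketch using Python's standard-library sqlite3 as a stand-in for MariaDB. The ownership table here has only the three columns named in the question (recno, id, name); the real table has many more.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ownership ("
    "recno INTEGER PRIMARY KEY AUTOINCREMENT, id TEXT, name TEXT)"
)
conn.execute("INSERT INTO ownership (id, name) VALUES ('original', 'x')")

# Copy the row in one statement: recno is left out of both column lists,
# so the database assigns a fresh auto-increment value, and id is replaced
# by a literal in the SELECT.
with conn:
    conn.execute(
        "INSERT INTO ownership (id, name) "
        "SELECT 'something else', name FROM ownership "
        "WHERE name = 'x' LIMIT 1"
    )

rows = conn.execute(
    "SELECT recno, id, name FROM ownership ORDER BY recno"
).fetchall()
```

No temporary table is needed, and the statement documents exactly which columns are copied and which one is changed.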

Check if rows exist in table

Consider this code. It inserts the row into the database if it is not found, and only updates it if it is found. The updateNode() method gives the entity some values based on the user input, so I call it in both cases.
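session.beginTransaction();
node = (Node) session.createQuery("from Node").uniqueResult();
if (node == null) {
    // No row yet: create the entity, fill it, and persist it.
    node = new Node();
    updateNode();
    session.save(node);
} else {
    // Row exists: changes to the managed entity are flushed on commit.
    updateNode();
}
session.getTransaction().commit();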
Is there a better way of checking if rows exist in the table aside from using queries?
Is the cat alive or dead? You don't know without checking it, so you'll have to execute the query in the database.
I assume your question is about avoiding writing such a query manually and letting Hibernate do it instead. In that case you may want to look at querying by example/prototype.
Regardless of the approach taken, keep concurrency issues in mind though; you may want to apply some unique constraints and/or optimistic/pessimistic locks.
The only way to find out if something already exists in the database is to query it. However, you do not need a separate query. One query instead of two is enough, thanks to MySQL's INSERT ... ON DUPLICATE KEY UPDATE feature, and it doesn't need any additional Java code either.
If you want to do this with hibernate it will have to be a custom query and you will need to return the inserted row id with LAST_INSERT_ID in your query.
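The single-statement insert-or-update idea can be illustrated as below. This is a sketch under assumptions, not MySQL or Hibernate code: it uses Python's standard-library sqlite3, where the closest equivalent of INSERT ... ON DUPLICATE KEY UPDATE is INSERT ... ON CONFLICT ... DO UPDATE (available from SQLite 3.24). The node table, its label column, and upsert_node are hypothetical.

```python
import sqlite3

def upsert_node(conn, node_id, label):
    """Insert the row, or update it if the primary key already exists."""
    with conn:
        conn.execute(
            # SQLite stand-in for MySQL's ON DUPLICATE KEY UPDATE.
            "INSERT INTO node (id, label) VALUES (?, ?) "
            "ON CONFLICT(id) DO UPDATE SET label = excluded.label",
            (node_id, label),
        )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (id INTEGER PRIMARY KEY, label TEXT)")
upsert_node(conn, 1, "first value")   # no row yet: inserts
upsert_node(conn, 1, "second value")  # row exists: updates in place
```

The existence check still happens, but inside the database as part of the one statement, which also avoids the race between a separate SELECT and the following INSERT.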

Can I INSERT/UPDATE into two tables with one query?

Here is a chunk of the SQL I'm using for a Perl-based web application. I have a number of requests and each has a number of accessions, and each has a status. This chunk of code is there to update the table for every accession_analysis that shares all these fields for each accession in a request.
UPDATE accession_analysis
SET analysis_id = ? ,
    reference_id = ? ,
    status = ? ,
    extra_parameters = ?
WHERE analysis_id = ?
  AND reference_id = ?
  AND status = ?
  AND extra_parameters = ?
  AND accession_id IN (
      SELECT accession_id
      FROM accessions
      WHERE request_id = ?
  )
I have changed the tables so that there's a status table for accession_analysis, so when I update, I update both accession_analysis and accession_analysis_status, which has status, status_text and the id of the accession_analysis, which is a not null auto_increment variable.
I have no strong idea about how to modify this code to allow this. My first pass grabbed all the accessions and looped through them, then filtered for all the fields, then updated. I didn't like that because I had many connections with short SQL commands, which I understood to be bad, but I can't help but think the only way to really do this is to go back to the loop in Perl holding two simpler SQL statements.
Is there a way to do this in SQL that, with my relative SQL inexperience, I'm just not seeing?
The answer depends on which DBMS you're using. The easiest way is to create a trigger on one table that provides the logic for updating the other table. (For any DB newbies: a trigger is procedural code attached to a table at the DBMS layer, not the application layer, that runs in response to an insert, update or delete on the table.) A similar, slightly less desirable method is to put the logic in a stored procedure and execute that instead of the update statement you're now using.
If the DBMS you're using doesn't support either of these mechanisms, then there isn't a good way to do what you're after while guaranteeing transactional integrity. However if the problem you're solving can tolerate a timing difference in the two tables' updates (i.e. The data in one of the tables is only used at predetermined times, like reporting or some type of batched operation) you could write to one table (live) and create a separate process that runs when needed (later) to update the second table using data from the first table. The correctness of allowing data to be updated at different times becomes a large and immovable design assumption, however.
If this is mostly about connection speed, then one option you have is to write a stored procedure that handles the "double update or insert" transparently. See the manual for stored procedures:
http://dev.mysql.com/doc/refman/5.5/en/create-procedure.html
Otherwise, you probably cannot do it in one statement; see the MySQL INSERT syntax:
http://dev.mysql.com/doc/refman/5.5/en/insert.html
The UPDATE syntax allows for multi-table updates (not in combination with INSERT, though):
http://dev.mysql.com/doc/refman/5.5/en/update.html
Each table needs its own INSERT / UPDATE in the query.
In fact, even if you create a view by JOINing multiple tables, when you INSERT into the view, you can only INSERT with fields belonging to one of the tables at a time.
The modifications made by the INSERT statement cannot affect more than one of the base tables referenced in the FROM clause of the view. For example, an INSERT into a multitable view must use a column_list that references only columns from one base table. For more information about updatable views, see CREATE VIEW.
Inserting data into multiple tables through an sql view (MySQL)
INSERT (SQL Server)
The same is true of UPDATE:
The modifications made by the UPDATE statement cannot affect more than one of the base tables referenced in the FROM clause of the view. For more information on updatable views, see CREATE VIEW.
However, you can have multiple INSERTs or UPDATEs per query or stored procedure.
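Running one statement per table inside a single transaction might look like the sketch below. It uses Python's standard-library sqlite3 as a stand-in for the asker's DBMS and Perl code. The two table names come from the question, while the reduced column set, the NOT NULL constraint, and the function update_analysis are assumptions made for the example.

```python
import sqlite3

def update_analysis(conn, analysis_row_id, extra_parameters, status, status_text):
    """Update both tables; either both changes are committed or neither is."""
    with conn:  # one transaction: commit on success, rollback on any error
        conn.execute(
            "UPDATE accession_analysis SET extra_parameters = ? WHERE id = ?",
            (extra_parameters, analysis_row_id),
        )
        conn.execute(
            "UPDATE accession_analysis_status "
            "SET status = ?, status_text = ? "
            "WHERE accession_analysis_id = ?",
            (status, status_text, analysis_row_id),
        )

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accession_analysis (
        id INTEGER PRIMARY KEY,
        extra_parameters TEXT
    );
    CREATE TABLE accession_analysis_status (
        accession_analysis_id INTEGER PRIMARY KEY,
        status TEXT NOT NULL,
        status_text TEXT
    );
    INSERT INTO accession_analysis VALUES (1, 'old');
    INSERT INTO accession_analysis_status VALUES (1, 'queued', '');
""")
conn.commit()

update_analysis(conn, 1, "new", "done", "finished")

try:
    # status=None violates NOT NULL, so the second UPDATE fails and the
    # first UPDATE of this call is rolled back with it.
    update_analysis(conn, 1, "broken", None, "should fail")
    rolled_back = False
except sqlite3.IntegrityError:
    rolled_back = True
```

The two statements still travel over one connection, so this avoids the many short-lived connections the asker was worried about while keeping the two tables consistent.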