I've looked around for a while, but I am having trouble finding the answer to this. I want to run a single ALTER TABLE to append values to an ENUM field, without hitting any race conditions. The best way I can think of is something similar to this:
ALTER TABLE `my_table` MODIFY COLUMN `my_enum` *results_from_subquery_here*
where the subquery is the following:
(SELECT CONCAT(TRIM(TRAILING ')' FROM column_type), ",'new_enum_value')")
 FROM information_schema.columns
 WHERE table_name = 'my_table'
   AND column_name = 'my_enum')
Which clearly can't just be appended to the first like that.
I've seen certain approaches that use PREPARE and EXECUTE, or do it trivially via DBI (in Perl), but I want to know if it is possible to do without them. That is, I want to know if it can be done in a single statement, avoiding race conditions.
Also, I know that ENUMs are "evil", in case you were about to mention that.
No, this operation cannot be performed in a single statement.
The ALTER TABLE statement doesn't have any support for running a SELECT subquery, which is why you've found what you found: a separate SELECT statement being run, and then a separate ALTER TABLE statement being run.
FOLLOWUP
To get this type of operation to be "atomic", you'd need to obtain an exclusive lock on the table. The ALTER TABLE does get an exclusive lock, but I think you're asking about two sessions...
session   operation
-------   -------------------------------------
one       get enum defn ('a','b') and add 'c'
two       get enum defn ('a','b') and add 'fee'
two       set enum defn ('a','b','fee')
one       set enum defn ('a','b','c')
To prevent one from "overwriting" the other, you'd need to establish some sort of locking mechanism to prevent two sessions from performing this operation concurrently.
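If you need to serialize it yourself, one option is an advisory lock around the read-and-alter. The following is a minimal sketch (the lock name 'my_enum_ddl' and the 10-second timeout are arbitrary choices, not anything required by MySQL):
SELECT GET_LOCK('my_enum_ddl', 10);
SELECT CONCAT('ALTER TABLE my_table MODIFY COLUMN my_enum ',
              TRIM(TRAILING ')' FROM column_type), ",'new_enum_value')")
  INTO @ddl
  FROM information_schema.columns
 WHERE table_schema = DATABASE()
   AND table_name = 'my_table'
   AND column_name = 'my_enum';
PREPARE stmt FROM @ddl;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
SELECT RELEASE_LOCK('my_enum_ddl');
As long as every session that alters this ENUM takes the same lock first, two sessions can no longer interleave their read and write steps.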
(I don't think ENUMs are evil; yes, there are some limitations, and we need to take care in using the ENUM datatype.)
Related
I have a table which was created as a SELECT * from a view (and then I added a PK).
I want to periodically update the table with all the data from the view.
I thought the best option is to do this using: INSERT INTO table_a SELECT * FROM view_a ON DUPLICATE KEY UPDATE non_key_col_1 = VALUES(non_key_col_1), non_key_col_2 = VALUES(non_key_col_2), ... ;
Since there are quite a lot of columns, and they might change in the future (I can re-create the table then, but I'd rather not have to edit the periodic insert as well), I was wondering if there is a way to avoid the explicit specification of all the columns?
There is no such syntax in MySQL, unfortunately. You'll have to list all the columns one by one.
You could go with a trigger on the insert operation: if the primary key exists, update the row; otherwise insert it. But that will definitely impact performance in the case of large data sets.
One thing I can think of is to get the column names from INFORMATION_SCHEMA.COLUMNS and use those to dynamically compose your query in your app.
SELECT * FROM information_schema.columns WHERE table_name = 'view_a';
Now you have the columns even if the view changes.
Do the same for the table and you have the column differences.
Use those differences to run ALTER TABLE statements or drop it and recreate it all together.
Of course, this is probably even more laborious than dropping and recreating the table manually.
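If you would rather keep it inside MySQL, the same idea works with a prepared statement. This is only a sketch, and it assumes the primary key column is named id (adjust for your schema):
SELECT CONCAT('INSERT INTO table_a SELECT * FROM view_a ',
              'ON DUPLICATE KEY UPDATE ',
              GROUP_CONCAT(CONCAT(column_name, ' = VALUES(', column_name, ')')))
  INTO @sql
  FROM information_schema.columns
 WHERE table_schema = DATABASE()
   AND table_name = 'view_a'
   AND column_name <> 'id';
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
Because the column list is read from INFORMATION_SCHEMA at run time, the statement keeps working when the view's columns change (provided the table is re-created to match).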
I'm running MariaDB 5.5.56.
I'm looking to copy an entire row in a database, change one column, then insert the entire row back into the original table (I don't want to have to specify the individual fields because there are a lot of them). The problem I'm running into is how to deal with an auto-increment/primary key column.
example:
create temporary table t_ownership like ownership;
insert into t_ownership (select * from ownership where name='x' LIMIT 1);
update t_ownership set id='something else';
insert into ownership (select * from t_ownership);
I have an auto-increment column, "recno", that will create a collision in the database when I try to re-insert the slightly changed record back into the original table.
Something like this seems to work but doesn't result in an insert:
insert into ownership (select * from t_ownership) ON DUPLICATE KEY UPDATE recno=LAST_INSERT_ID(ownership.recno);
The above statement executes without error but does not add a row to table ownership.
So I think I'm close but not quite there...
What would be the best way to do this? I'd like to avoid doing an insert where I manually specify field/values. I just need to regenerate a new A.I. recno column on the insert.
NULL values inserted into auto-increment fields just get the next auto-increment value, behaving equivalently to an INSERT that doesn't specify the field; so you should be able to update the source (the temp copy) to have NULL for that field.
However, one potential issue in scenarios like yours is that the CREATE TEMPORARY TABLE ... LIKE could produce a table that will not allow you to set such fields to NULL; this would require you to either ALTER the temporary table, or create it in a more explicit manner. Either way, it makes code/queries that do not specify columns even more reliant on knowing the columns.
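Putting that together, a minimal sketch of the NULL route (assuming recno is the AUTO_INCREMENT primary key of ownership; the combined MODIFY/DROP PRIMARY KEY may need adjusting for your exact definition):
CREATE TEMPORARY TABLE t_ownership LIKE ownership;
-- the copy inherits AUTO_INCREMENT NOT NULL on recno, so relax it:
ALTER TABLE t_ownership MODIFY recno INT NULL, DROP PRIMARY KEY;
INSERT INTO t_ownership SELECT * FROM ownership WHERE name = 'x' LIMIT 1;
UPDATE t_ownership SET recno = NULL;  -- plus whatever column you actually want to change
INSERT INTO ownership SELECT * FROM t_ownership;  -- NULL recno gets a fresh auto-increment value
DROP TEMPORARY TABLE t_ownership;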
Personally, I would take this route in the first place.
INSERT INTO theTable([list all but the auto-inc column])
SELECT [list all but the auto-inc column, with any replacements or modifications desired]
FROM ...[original query]...
It accomplishes the task in one query and makes the queries more self-documenting, at the cost of only a little typing (most of which a decent database browser, or query builder, will do for you).
The only real argument in favor of your current approach is that the table can be changed without necessarily breaking your queries; but that raises the question of whether it would be better for such table changes to break the queries, forcing them to be re-examined. If that is not an issue, it is a minor revision; the alternative is queries that remain valid but can cause unexpected behavior by copying information they were never intended to.
I have a fairly basic query:
UPDATE the_table SET col1=[something], col2=[something else] WHERE col1 IS NULL AND col2 IS NULL LIMIT 1;
Immediately after issuing the query, the caller does:
SELECT col3 FROM the_table WHERE col1=[something] AND col2=[something else];
Unfortunately, concurrent callers are claiming the same row.
I'd rather not do a SELECT FOR UPDATE, because the [select, update, select] sequence would mean three round trips to the database instead of two (which is bad enough).
I gather that some dialects of SQL allow UPDATE the_table WITH (UPDLOCK), but mine (Galera/MySQL) does not. I find it appalling that I'd have to go through this many DB hits to execute such a basic concept, and most of my searching ends on pages that discuss dialects that DO support UPDLOCK.
Where do I go from here?
Do you have autocommit=1?
Without transactional integrity, some other connection can slip in and change the row before you execute the SELECT.
Note that multiple rows could have NULLs in both columns, so which row the UPDATE (with its LIMIT 1) ends up changing is arbitrary.
Did you check the "rows affected" after the UPDATE? Maybe no rows were changed.
I think it would be better either to execute all the queries in a transaction or to use a stored procedure that is responsible for all the select and update work and then returns the data from the last SELECT statement. Having such a flow outside a transaction raises exactly the issue you describe: you need to lock the row so that other callers cannot retrieve stale (not up-to-date) data.
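Another trick worth sketching, assuming the table has a single-column primary key (called id below; that name is my assumption, not something from your schema): MySQL's LAST_INSERT_ID(expr) stashes a value per connection, so each caller can read back exactly the row its own UPDATE claimed, still in two round trips and with no shared state:
UPDATE the_table
   SET col1 = 'something', col2 = 'something else',
       id = LAST_INSERT_ID(id)  -- no-op assignment that records the claimed row
 WHERE col1 IS NULL AND col2 IS NULL
 LIMIT 1;
SELECT col3 FROM the_table WHERE id = LAST_INSERT_ID();
Because LAST_INSERT_ID() is scoped to the connection, two concurrent callers can no longer see each other's claimed rows.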
I am in the process of moving from Oracle to MySQL and would like some advice on whether the way I am implementing something similar to sequences in MySQL is a good one.
Essentially, my current plan is to have a separate table in MySQL for each Oracle sequence, with a single column representing the last_number, and to increment that column whenever I insert a new row. Another way I could go about it is a single table with several rows representing the sequences, incrementing each row separately whenever I do an insert.
Another, simpler way would be to just do a SELECT MAX()+1 on the relevant column when inserting data.
I'm basically thinking of switching to the SELECT MAX()+1 option as it seems simpler to implement, but I would like some advice on which of these options you think is best, and on any pitfalls of SELECT MAX()+1 that I am not currently aware of.
Also, the reason I am not using AUTO_INCREMENT and the LAST_INSERT_ID() function is that I want to follow the ANSI standard.
Thanks.
First of all: the MAX()+1 version is NOT guaranteed to give you a sequence if you use transactions at a high isolation level.
The way we typically use sequences (if we can't avoid them) is to create a table with an AUTO_INCREMENT value, INSERT INTO it, SELECT LAST_INSERT_ID(), then DELETE FROM the table WHERE field < $LASTINSERTID. This is of course done in a stored procedure.
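A minimal sketch of that procedure (table and routine names are made up for illustration):
CREATE TABLE seq_emulator (id BIGINT AUTO_INCREMENT PRIMARY KEY);
DELIMITER //
CREATE PROCEDURE next_seq_value(OUT next_id BIGINT)
BEGIN
  INSERT INTO seq_emulator VALUES (NULL);  -- NULL takes the next auto-increment value
  SET next_id = LAST_INSERT_ID();
  DELETE FROM seq_emulator WHERE id < next_id;  -- keep the table small
END //
DELIMITER ;
LAST_INSERT_ID() is per-connection, so concurrent callers each get their own value without any explicit locking.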
There is a read consistency problem, in that two sessions both running ...
insert into ... select max(..)+1 from ...
... at the same time both see the same value of max(...), hence they both try to insert the same new value.
You have the same problem with your table-of-maxima method: you have to use a locking mechanism to stop multiple sessions from reading the same value, which leads to a concurrency problem where inserts into the table are serialised.
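For the counter-table variant, the usual way to make the increment atomic without an explicit lock is a single UPDATE; as a sketch (sequence_table and last_number are illustrative names):
UPDATE sequence_table SET last_number = LAST_INSERT_ID(last_number + 1);
SELECT LAST_INSERT_ID();
The UPDATE holds its row lock only for its own duration, and LAST_INSERT_ID(expr) hands the new value back to this connection only, so two sessions cannot read the same value.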
Is there a way to add something like a WHERE clause as a 'global' parameter for a MySQL session?
For example, a company has multiple users, and you want to query for the users in this company; normally you would use a statement like:
SELECT * FROM users WHERE users.companyId = 2;
The issue is that adding the WHERE clauses would mean a huge impact on the code. Since we have defined the relations, I imagine (though I don't think it exists) that you could create a session with a 'global' constraint that all queries in that session must comply with.
You can create a view:
CREATE VIEW view2 AS SELECT * FROM table1 WHERE companyid = 2;
If slowness is your curse, there are a few things you can do:
put an index on the WHERE field(s), in this case companyid (see the example after the links below).
if you need more speed, you can partition the table by companyid.
make the table a MEMORY table, and use HASH indexes for = and IN fields.
use InnoDB instead of MyISAM: InnoDB has faster indexes.
Do not use SELECT *; explicitly select only the fields you need.
See: http://dev.mysql.com/doc/refman/5.0/en/create-view.html
http://dev.mysql.com/doc/refman/5.5/en/index-btree-hash.html
http://dev.mysql.com/doc/refman/5.0/en/memory-storage-engine.html
http://dev.mysql.com/doc/refman/5.1/en/partitioning.html
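For instance, the first suggestion is a one-liner (the index name is illustrative):
CREATE INDEX idx_users_companyid ON users (companyId);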
The answer is NO.
As said before, you should put an index on that column. And you can create a view.
Also, you could use a temporary table.
From the MySQL docs:
You can use the TEMPORARY keyword when creating a table. A TEMPORARY table is visible only to the current connection, and is dropped automatically when the connection is closed. This means that two different connections can use the same temporary table name without conflicting with each other or with an existing non-TEMPORARY table of the same name. (The existing table is hidden until the temporary table is dropped.)
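As a minimal sketch of that suggestion (the table name users_c2 is made up here), each connection sets up its own pre-filtered copy once and queries it for the rest of the session:
CREATE TEMPORARY TABLE users_c2 AS SELECT * FROM users WHERE companyId = 2;
SELECT * FROM users_c2;
Note that the copy is a snapshot: it will not reflect later changes to users.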
As a final thought, I can say that if you're performing the same query over and over again, you should rethink your model diagram, maybe do some denormalization.
What will happen when you perform a select from a table that has no companyId? :) You can create views, however, and select from them instead.