I need to add a new UUID column to an existing SQLAlchemy / MySQL table that already contains rows.
To do so, I added the following to my database model:
class MyTable(db.Model):
    uid = db.Column(db.BINARY(16), nullable=False, unique=True, default=uuid.uuid4)
Doing so generates the following Alembic upgrade code, which of course does not work because the default value of the new column is NULL:
op.add_column('my_table', sa.Column('uid', sa.BINARY(length=16), nullable=True))
op.create_unique_constraint(None, 'my_table', ['uid'])
I tried to extend the db.Column(db.BINARY(16), nullable=False, unique=True, default=uuid.uuid4) definition with an appropriate server_default=... parameter but wasn't able to find a parameter that generates for each row a new random UUID.
How can I add a new column and generate a new random, unique UUID value for every existing row?
A solution that uses sa.String instead of sa.BINARY would also be acceptable.
In the end I manually created the necessary UPDATE statements in the alembic update file so existing rows are assigned a unique UID.
For new entries default=uuid.uuid4 in the SQLAlchemy column definition is sufficient.
Note that the MySQL UUID() function generates timestamp based UUID values for the existing records, and new records created via Python/SQLAlchemy use uuid.uuid4 which generates a v4 (random) UUID. So both are UUIDs but you will see which UUIDs were generated by UUID() as they only differ in the first block when generating them using the UPDATE statement.
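To make the difference between the two kinds of UUID concrete, here is a small sketch that uses only Python's standard uuid module. It assumes, as described above, that MySQL's UUID() yields version 1 values, and uses uuid.uuid1() as a stand-in for it; the variable names are illustrative only.

```python
import uuid

# Stand-in for a value produced by MySQL's UUID(): version 1 (timestamp based).
time_based = uuid.uuid1()
# What default=uuid.uuid4 produces for new rows: version 4 (random).
random_based = uuid.uuid4()

versions = (time_based.version, random_based.version)

# The 36-character text form fits String(36); the 16 raw bytes fit BINARY(16).
as_text = str(random_based)
as_bytes = random_based.bytes

# Converting between the two forms is lossless.
round_trip = uuid.UUID(bytes=as_bytes)
```

Both forms describe the same 128-bit value, so the choice between BINARY(16) and String(36) is about storage size and readability rather than about the UUID itself.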
Using binary column type
class MyTable(db.Model):
    uid = db.Column(db.BINARY(16), nullable=False, unique=True, default=uuid.uuid4)

def upgrade():
    # Add the column as nullable first so the existing rows can be backfilled.
    op.add_column('my_table', sa.Column('uid', sa.BINARY(length=16), nullable=True))
    # Assign a UUID to every existing row.
    op.execute("UPDATE my_table SET uid = (SELECT(UUID_TO_BIN(UUID())))")
    # Only now enforce NOT NULL and uniqueness.
    op.alter_column('my_table', 'uid', existing_type=sa.BINARY(length=16), nullable=False)
    op.create_unique_constraint(None, 'my_table', ['uid'])
Using String/varchar column type
class MyTable(db.Model):
    uid = db.Column(db.String(36), nullable=False, unique=True, default=uuid.uuid4)

def upgrade():
    # Add the column as nullable first so the existing rows can be backfilled.
    op.add_column('my_table', sa.Column('uid', sa.String(length=36), nullable=True))
    # Assign a UUID to every existing row.
    op.execute("UPDATE my_table SET uid = (SELECT(UUID()))")
    # Only now enforce NOT NULL and uniqueness.
    op.alter_column('my_table', 'uid', existing_type=sa.String(length=36), nullable=False)
    op.create_unique_constraint(None, 'my_table', ['uid'])
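If the existing rows should also receive random (v4) values rather than the timestamp-based ones from UUID(), the backfill can be done from Python instead. The following is only a sketch of the three-step pattern (add nullable column, backfill per row, then add the constraint). It uses the standard sqlite3 module as a stand-in for MySQL and Alembic so that it runs on its own; the table contents and the index name uq_my_table_uid are made up for the example. In a real migration the same loop would presumably run against the connection returned by op.get_bind().

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO my_table (name) VALUES (?)", [("a",), ("b",), ("c",)])

# Step 1: add the column as nullable so the existing rows are accepted.
conn.execute("ALTER TABLE my_table ADD COLUMN uid BLOB")

# Step 2: backfill each existing row with its own random (v4) UUID.
ids = [row[0] for row in conn.execute("SELECT id FROM my_table")]
conn.executemany(
    "UPDATE my_table SET uid = ? WHERE id = ?",
    [(uuid.uuid4().bytes, row_id) for row_id in ids],
)

# Step 3: only now enforce uniqueness.
conn.execute("CREATE UNIQUE INDEX uq_my_table_uid ON my_table (uid)")

uids = [row[0] for row in conn.execute("SELECT uid FROM my_table")]
```

The trade-off is one UPDATE per row instead of a single statement, which may matter for very large tables.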
As per SQLAlchemy's documentation on server_default:
A text() expression will be rendered as-is, without quotes:
Column('y', DateTime, server_default = text('NOW()'))
y DATETIME DEFAULT NOW()
Based on this, your server_default definition should look like this:
server_default=text('(UUID_TO_BIN(UUID()))')
However, if your MySQL version is earlier than 8.0.13, you cannot use a server-side default like this. You need to either use default and set the UUID from Python, or use a trigger as described in the following SO question: MySQL set default id UUID
I am trying to write a migration that updates the value of the column has_bubble_in_countries based on the value of the has_bubble_v1 column.
Before upgrade(), I defined the table:
subscription_old_table = sa.Table(
    'Subscription',
    sa.MetaData(),
    sa.Column('id', sa.Unicode(255), primary_key=True, unique=True, nullable=False),
    sa.Column('has_bubble_v1', sa.Boolean, nullable=False, default=False),
    sa.Column('has_bubble_in_countries', MutableList.as_mutable(ARRAY(sa.Enum(Country))), nullable=False, default=[], server_default='{}')
)
And then the upgrade() method looks like:
def upgrade():
    connection = op.get_bind()
    for subscription in connection.execute(subscription_old_table.select()):
        if subscription.has_bubble_v1:
            connection.execute(
                subscription_old_table.update().where(
                    subscription_old_table.c.id == subscription.id
                ).values(
                    has_bubble_in_countries=subscription.has_bubble_in_countries.append(Country.NL),
                )
            )
    # Then drop the column after the data has been migrated
    op.drop_column('Subscription', 'has_bubble_v1')
All the rows in the database of has_bubble_in_countries column have this value {} when I check the database using pgadmin's interface.
When the upgrade() function gets to the update method it throws this error:
sqlalchemy.exc.IntegrityError: (psycopg2.errors.NotNullViolation) null value in column "has_bubble_in_countries" of relation "Subscription" violates not-null constraint
DETAIL: Failing row contains (keydsakwlkad, null, 2027-08-14 00:00:00+00,groot abonnement, big, {nl}, null, null, 2022-08-08 08:45:52.875931+00, 3482992, {}, f, null, null, null, t, 2011-05-23 08:55:20.538451+00, 2022-08-08 09:10:15.577283+00, ***null***).
[SQL: UPDATE "Subscription" SET has_bubble_in_countries=%(has_bubble_in_countries)s::country[] WHERE "Subscription".id = %(id_1)s]
[parameters: {'has_bubble_in_countries': None, 'id_1': '1pwohLmjftAZdIaJ'}]
The bolded value from the error is the value that is retrieved for the has_bubble_in_countries column even if it has a server_default='{}' and nullable=False.
Is there any way to configure Alembic to recognize the server default's value when it is retrieved from the database? Or how else can this be fixed?
I think the problem is actually that you are passing in the result of .append(), which is None. Unlike other languages where it is common to return the altered list, append changes the list in place. I'm not sure mutating a core query result like this is a great idea, but it seems to work. Also, as far as I know, passing in NULL doesn't trigger the default. The default is used only when you pass in no value at all, either when inserting or updating.
with Session(engine) as session, session.begin():
    for subscription in session.execute(subscription_old_table.select()):
        if subscription.has_bubble_v1:
            # Append here.
            subscription.has_bubble_in_countries.append(Country.NL)
            # Then set values:
            session.execute(
                subscription_old_table.update().where(
                    subscription_old_table.c.id == subscription.id
                ).values(
                    has_bubble_in_countries=subscription.has_bubble_in_countries,
                )
            )
Maybe cloning the list and then adding the element like this would be safer and clearer:
has_bubble_in_countries=subscription.has_bubble_in_countries[:] + [Country.NL]
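The behaviour behind the error can be reproduced with plain Python lists, independent of SQLAlchemy. This short sketch uses placeholder string values instead of the Country enum:

```python
countries = ["BE"]

# list.append mutates the list in place and returns None,
# which is the value that ended up in the UPDATE parameters.
result_of_append = countries.append("NL")

# Concatenation builds a new list and leaves the original untouched.
original = ["BE"]
extended = original + ["NL"]
```

Here result_of_append is None while countries itself has grown, which is why passing the return value of append() to values() sends NULL to the database.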
I have this kind of model:
class A(Base):
    id = Column(UUID(as_uuid=True), primary_key=True, server_default=text("uuid_generate_v4()"))
    name = Column(String, nullable=False, unique=True)
    property = Column(String)
    parent_id = Column(UUID(as_uuid=True), ForeignKey(id, ondelete="CASCADE"))
    children = relationship(
        "A", cascade="all,delete", backref=backref("parent", remote_side=[id])
    )
An id is created automatically by the server
I have a relationship from a model to itself (parent and children).
In the background I run a task that periodically receives a message with the id of a parent and a list of (name, property) pairs for its children. I would like to update the parent's children in the table (identified by name). Is there a way to do so without reading all children and checking which ones are missing (name not present in the message), need to be updated (name exists but property has changed), or are new (name not present in the db)?
Do I need to set name to be my primary key and get rid of the UUID?
Thanks
I'd do a single query and compare the result against the message you receive. That way it's easier to handle both additions, removals and updates.
from sqlalchemy import select

msg_parent_id = 5
msg_children = [('name', 'property'), ('name2', 'property2')]
stmt = select(A).where(A.parent_id == msg_parent_id)
children = session.execute(stmt).scalars()
# Example of determining what to change
name_map = {row.name: row for row in children}
for child_name, child_prop in msg_children:
    # Child exists
    if child_name in name_map:
        # Edit child
        if name_map[child_name].property != child_prop:
            print(child_name, 'has changed to', child_prop)
        # Whatever is left in name_map afterwards was not in the message
        del name_map[child_name]
    # Add child
    else:
        print(child_name, 'was added')
# Remove child
for child in name_map.values():
    print(child, 'was removed')
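The comparison itself does not depend on the ORM, so it can be isolated and tested separately. The following is a rough sketch under the assumption that the existing children have already been loaded into a plain dict of name to property; the helper name diff_children and the sample data are invented for illustration.

```python
def diff_children(existing, incoming):
    """Compare DB state with a message.

    existing: dict mapping child name -> property as currently stored.
    incoming: list of (name, property) pairs from the message.
    Returns (added, changed, removed).
    """
    incoming_map = dict(incoming)
    # Names in the message that the database does not know yet.
    added = {n: p for n, p in incoming_map.items() if n not in existing}
    # Names present on both sides whose property differs.
    changed = {
        n: p for n, p in incoming_map.items()
        if n in existing and existing[n] != p
    }
    # Names stored in the database but absent from the message.
    removed = sorted(set(existing) - set(incoming_map))
    return added, changed, removed


existing = {"a": "1", "b": "2", "c": "3"}
incoming = [("a", "1"), ("b", "20"), ("d", "4")]
added, changed, removed = diff_children(existing, incoming)
```

The three results can then be applied as inserts, updates and deletes within one session.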
Do I need to set name to be my primary key and get rid of the UUID?
Personally I'd add a unique constraint on the name, but still have a separate ID column for the sake of relationships.
Edit for a more ORM orientated way. I believe you can already use A.children = [val1, val2], which is really what you need.
In the past I have used this answer on how to intercept the call, parse the input data, and fetch the existing record from the database if it exists. As part of that call you could update the property of that record.
Finally use a cascade on the relationship to automatically delete records when parent_id is set to None.
I need to create a sequence, but in the generic case, without using the Sequence class.
USN = Column(Integer, nullable=False, default=nextusn, server_onupdate=nextusn)
The function nextusn needs to generate the func.max(table.USN) value over the rows of the model.
I tried using this:
class nextusn(expression.FunctionElement):
    type = Numeric()
    name = 'nextusn'

@compiles(nextusn)
def default_nextusn(element, compiler, **kw):
    return select(func.max(element.table.c.USN)).first()[0] + 1
but in this context the element does not know about element.table. Is there a way to resolve this?
this is a little tricky, for these reasons:
your SELECT MAX() will return NULL if the table is empty; you should use COALESCE to produce a default "seed" value. See below.
the whole approach of inserting the rows with SELECT MAX is entirely not safe for concurrent use - so you need to make sure only one INSERT statement at a time invokes on the table or you may get constraint violations (you should definitely have a constraint of some kind on this column).
from the SQLAlchemy perspective, you need your custom element to be aware of the actual Column element. We can achieve this either by assigning the "nextusn()" function to the Column after the fact, or below I'll show a more sophisticated approach using events.
I don't understand what you're going for with "server_onupdate=nextusn". "server_onupdate" in SQLAlchemy doesn't actually run any SQL for you, this is a placeholder if for example you created a trigger; but also the "SELECT MAX(id) FROM table" thing is an INSERT pattern, I'm not sure that you mean for anything to be happening here on an UPDATE.
The @compiles extension needs to return a string, running the select() there through compiler.process(). See below.
example:
from sqlalchemy import Column, Integer, create_engine, select, func, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.sql.expression import ColumnElement
from sqlalchemy.schema import ColumnDefault
from sqlalchemy.ext.compiler import compiles
from sqlalchemy import event

class nextusn_default(ColumnDefault):
    "Container for a nextusn() element."
    def __init__(self):
        super(nextusn_default, self).__init__(None)

@event.listens_for(nextusn_default, "after_parent_attach")
def set_nextusn_parent(default_element, parent_column):
    """Listen for when nextusn_default() is associated with a Column,
    assign a nextusn().
    """
    assert isinstance(parent_column, Column)
    default_element.arg = nextusn(parent_column)

class nextusn(ColumnElement):
    """Represent "SELECT MAX(col) + 1 FROM TABLE".
    """
    def __init__(self, column):
        self.column = column

@compiles(nextusn)
def compile_nextusn(element, compiler, **kw):
    return compiler.process(
        select([
            func.coalesce(func.max(element.column), 0) + 1
        ]).as_scalar()
    )

Base = declarative_base()

class A(Base):
    __tablename__ = 'a'
    id = Column(Integer, default=nextusn_default(), primary_key=True)
    data = Column(String)

e = create_engine("sqlite://", echo=True)
Base.metadata.create_all(e)
# will normally pre-execute the default so that we know the PK value
# result.inserted_primary_key will be available
e.execute(A.__table__.insert(), data='single row')
# will run the default expression inline within the INSERT
e.execute(A.__table__.insert(), [{"data": "multirow1"}, {"data": "multirow2"}])
# will also run the default expression inline within the INSERT,
# result.inserted_primary_key will not be available
e.execute(A.__table__.insert(inline=True), data='single inline row')
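The COALESCE seeding mentioned in the first point can also be checked in isolation, without any SQLAlchemy machinery. This is a minimal sketch using the standard sqlite3 module; the column name usn and the helper insert_with_next_usn are chosen for the example and are not part of the code above. As noted earlier, the pattern is not safe under concurrent inserts, so the column carries a UNIQUE constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (usn INTEGER NOT NULL UNIQUE, data TEXT)")


def insert_with_next_usn(conn, data):
    # COALESCE supplies the seed 0 when the table is empty and MAX() is NULL,
    # so the first row receives 1 instead of NULL.
    conn.execute(
        "INSERT INTO a (usn, data) "
        "SELECT COALESCE(MAX(usn), 0) + 1, ? FROM a",
        (data,),
    )


for value in ("first", "second", "third"):
    insert_with_next_usn(conn, value)

rows = conn.execute("SELECT usn, data FROM a ORDER BY usn").fetchall()
```

Because the aggregate without GROUP BY always yields exactly one row, the INSERT ... SELECT inserts one row even when the table starts out empty.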
This source details how to use association proxies to create views and objects with values of an ORM object.
However, when I append a value that matches an existing object in the database (and said value is either unique or a primary key), it creates a conflicting object, so I cannot commit.
So in my case is this only useful as a view, and I'll need to use ORM queries to retrieve the object to be appended.
Is this my only option or can I use merge (I may only be able to do this if it's a primary key and not a unique constraint), OR set up the constructor such that it will use an existing object in the database if it exists instead of creating a new object?
For example from the docs:
user.keywords.append('cheese inspector')
# Is translated by the association proxy into the operation:
user.kw.append(Keyword('cheese inspector'))
But I'd like it to be translated to something more like this (of course the query could fail):
keyword = session.query(Keyword).filter(Keyword.keyword == 'cheese inspector').one()
user.kw.append(keyword)
OR ideally
user.kw.append(Keyword('cheese inspector'))
session.merge() # retrieves identical object from the database, or keeps new one
session.commit() # success!
I suppose this may not even be a good idea, but it could be in certain use cases :)
The example shown on the documentation page you link to is a composition type of relationship (in OOP terms) and as such represents an "owns" relationship rather than a "uses" relationship. Therefore each owner would have its own copy of the same (in terms of value) keyword.
In fact, you can use exactly the suggestion from the documentation you link to in your question to create a custom creator method and hack it to reuse existing object for given key instead of just creating a new one. In this case the sample code of the User class and creator function will look like below:
def _keyword_find_or_create(kw):
    keyword = Keyword.query.filter_by(keyword=kw).first()
    if not(keyword):
        keyword = Keyword(keyword=kw)
        # if autoflush=False used in the session, then uncomment below
        #session.add(keyword)
        #session.flush()
    return keyword

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String(64))
    kw = relationship("Keyword", secondary=lambda: userkeywords_table)
    keywords = association_proxy('kw', 'keyword',
        creator=_keyword_find_or_create, # #note: this is the
    )
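The find-or-create idea behind the custom creator can be shown without the ORM. Below is a hedged sketch using the standard sqlite3 module; the table layout and the function find_or_create_keyword are assumptions made for the example, not part of SQLAlchemy or of the code above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE keyword (id INTEGER PRIMARY KEY, keyword TEXT NOT NULL UNIQUE)"
)


def find_or_create_keyword(conn, word):
    """Return the id of the row for `word`, inserting it only if missing."""
    row = conn.execute(
        "SELECT id FROM keyword WHERE keyword = ?", (word,)
    ).fetchone()
    if row is not None:
        # Reuse the existing row instead of creating a conflicting one.
        return row[0]
    cursor = conn.execute("INSERT INTO keyword (keyword) VALUES (?)", (word,))
    return cursor.lastrowid


first_id = find_or_create_keyword(conn, "cheese inspector")
second_id = find_or_create_keyword(conn, "cheese inspector")
other_id = find_or_create_keyword(conn, "snack ninja")
count = conn.execute("SELECT COUNT(*) FROM keyword").fetchone()[0]
```

Looking up before inserting is what keeps the unique constraint from being violated; with concurrent writers the lookup and insert would additionally need to happen in one transaction with appropriate conflict handling.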
I recently ran into the same problem. Mike Bayer, creator of SQLAlchemy, referred me to the "unique object" recipe but also showed me a variant that uses an event listener. The latter approach modifies the association proxy so that UserKeyword.keyword temporarily points to a plain string and only creates a new Keyword object if the keyword doesn't already exist.
from sqlalchemy import event

# Same User and Keyword classes from documentation

class UserKeyword(Base):
    __tablename__ = 'user_keywords'

    # Columns
    user_id = Column(Integer, ForeignKey(User.id), primary_key=True)
    keyword_id = Column(Integer, ForeignKey(Keyword.id), primary_key=True)
    special_key = Column(String(50))

    # Bidirectional attribute/collection of 'user'/'user_keywords'
    user = relationship(
        User,
        backref=backref(
            'user_keywords',
            cascade='all, delete-orphan'
        )
    )

    # Reference to the 'Keyword' object
    keyword = relationship(Keyword)

    def __init__(self, keyword=None, user=None, special_key=None):
        self._keyword_keyword = keyword  # temporary, will turn into a
                                         # Keyword when we attach to a
                                         # Session
        self.special_key = special_key

    @property
    def keyword_keyword(self):
        if self.keyword is not None:
            return self.keyword.keyword
        else:
            return self._keyword_keyword

@event.listens_for(Session, "after_attach")
def after_attach(session, instance):
    # when UserKeyword objects are attached to a Session, figure out what
    # Keyword in the database it should point to, or create a new one
    if isinstance(instance, UserKeyword):
        with session.no_autoflush:
            keyword = session.query(Keyword).\
                filter_by(keyword=instance._keyword_keyword).\
                first()
            if keyword is None:
                keyword = Keyword(keyword=instance._keyword_keyword)
            instance.keyword = keyword
I'm using Hibernate to access MySQL, and I have a table with an auto-increment primary key.
Every time I insert a row into the table, I don't need to specify the primary key. But after I insert a new row, how can I get its primary key immediately using Hibernate?
Or can I just use JDBC to do this?
When you save the hibernate entity, the id property will be populated for you. So if you have
MyThing thing = new MyThing();
...
// save the transient instance.
dao.save(thing);
// after the session flushes, thing.getId() should return the id.
I actually almost always do an assertNotNull on the id of a persisted entity in my tests to make sure the save worked.
Once you've persisted the object, you should be able to call getId() or whatever your @Id column is, so you could return that from your method. You could also invalidate the Hibernate first level cache and fetch it again.
However, for portability, you might want to look at using Hibernate with sequence style ID generation. This will ease the transition away from MySQL if you ever need to. Certainly, if you use this style of generator, you'll be able to get the ID immediately, because Hibernate needs to resolve the column value before it persists the object:
@Id
@GeneratedValue(generator = "MY_SEQ")
@GenericGenerator(name = "MY_SEQ",
    strategy = "org.hibernate.id.enhanced.SequenceStyleGenerator",
    parameters = {
        @Parameter(name = "sequence_name", value = "MY_SEQ"),
        @Parameter(name = "initial_value", value = "1"),
        @Parameter(name = "increment_size", value = "10") }
)
@Column(name = "id", nullable = false)
public Long getId() {
    return this.id;
}
It's a bit more complex, but it's the kind of thing you can cut and paste, apart from changing the SEQUENCE name.
When you call save() in Hibernate, the object doesn't get written to the database immediately. The write occurs either when you try to read from the database (from the same table?) or when you explicitly call flush(). Until the corresponding record is inserted into the database table, MySQL does not allocate an id for it.
So, the id is available, but not before Hibernate actually inserts the record into the MySQL table.
If you want, you can get the next primary key independently of an object using:
Session session = SessionFactoryUtil.getSessionFactory().getCurrentSession();
Query query = session.createSQLQuery( "select nextval('schemaName.keySequence')" );
Long key = (Long) query.list().get( 0 );
return key;
With the auto-increment generator class, the save() method returns the primary key (assuming it is the id). So you can do this:
Serializable id = session.save(modelClass);
and then return id.