How does SQLAlchemy handle unique constraints in table definitions?

I have a table with the following declarative definition:
class Type(Base):
    __tablename__ = 'Type'
    id = Column(Integer, primary_key=True)
    name = Column(String, unique=True)

    def __init__(self, name):
        self.name = name
The column "name" has a unique constraint, but I'm able to do
type1 = Type('name1')
session.add(type1)
type2 = Type(type1.name)
session.add(type2)
So, as you can see, the unique constraint is not checked at all: I have added two objects with the same name to the session.
When I do session.commit(), I get a MySQL error, since the constraint also exists in the MySQL table.
Can SQLAlchemy tell me in advance that this will fail, or detect the duplicate itself and refuse to insert two entries with the same "name" column?
If not, should I keep all existing names in memory, so I can check whether a name exists or not before creating the object?

SQLAlchemy doesn't handle uniqueness itself, because there is no good way to do it. Even if you keep track of created objects and/or check whether an object with that name already exists, there is a race condition: someone in another process can insert a new object with the name you just checked. The only watertight solution is to lock the whole table before the check and release the lock after the insertion (some databases support such locking).
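For illustration, here is a rough sketch of that lock-check-insert pattern on MySQL. Note that LOCK TABLES is MySQL-specific, implicitly commits any open transaction, and blocks all other writers, so treat this as an outline rather than a drop-in solution:
from sqlalchemy import text

# MySQL-specific: take an exclusive write lock on the Type table
session.execute(text("LOCK TABLES Type WRITE"))
try:
    if session.query(Type).filter_by(name='name1').first() is None:
        session.add(Type('name1'))
        session.commit()
finally:
    session.execute(text("UNLOCK TABLES"))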

AFAIK, SQLAlchemy does not enforce uniqueness constraints in Python-level behavior. Those "unique=True" declarations are only used to impose database-level table constraints, and only then if you create the table through SQLAlchemy, e.g.
Type.__table__.create(engine)
or some such. If you create an SA model against an existing table that does not actually have this constraint present, it will behave as if the constraint does not exist.
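If you prefer to emit every mapped table at once, the declarative metadata can do it in one call (this assumes engine is the Engine already configured for your database); this also creates the UNIQUE constraints:
# create all tables known to the declarative Base, constraints included
Base.metadata.create_all(engine)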
Depending on your specific use case, you'll probably have to use a pattern like
try:
    existing = session.query(Type).filter_by(name='name1').one()
    # do something with existing
except:
    newobj = Type('name1')
    session.add(newobj)
or a variant, or you'll just have to catch the MySQL exception and recover from there, roughly as sketched below.
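A minimal sketch of the catch-and-recover variant; SQLAlchemy wraps the driver's duplicate-key error in sqlalchemy.exc.IntegrityError:
from sqlalchemy.exc import IntegrityError

session.add(Type('name1'))
try:
    session.commit()
except IntegrityError:
    session.rollback()  # the name already existed; recover here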

From the docs:
class MyClass(Base):
    __tablename__ = 'sometable'
    __table_args__ = (
        ForeignKeyConstraint(['id'], ['remote_table.id']),
        UniqueConstraint('foo'),
        {'autoload': True}
    )

.one() throws two kinds of exceptions:
sqlalchemy.orm.exc.NoResultFound and sqlalchemy.orm.exc.MultipleResultsFound
You should create the object when the first exception occurs; if the second occurs, you're screwed anyway and shouldn't make it worse.
from sqlalchemy.orm.exc import NoResultFound

try:
    existing = session.query(Type).filter_by(name='name1').one()
    # do something with existing
except NoResultFound:
    newobj = Type('name1')
    session.add(newobj)

Related

Django MySQL UUID

I had a Django model field which was working in the default SQLite db:
uuid = models.TextField(default=uuid.uuid4, editable=False, unique=True)
However, when I tried to migrate to MySQL, I got the error:
django.db.utils.OperationalError: (1170, "BLOB/TEXT column 'uuid' used in key specification without a key length")
The first thing I tried was removing the unique=True, but I got the same error. Next, since I had another field (which successfully migrated):
id = models.UUIDField(default=uuid.uuid4, editable=False)
I tried changing uuid to UUIDField, but I still get the same error. Finally, I changed uuid to:
uuid = models.TextField(editable=False)
But I am still getting the same error when migrating (DROP all the tables, makemigrations, migrate --run-syncdb). Ideally, I want to have a UUIDField or TextField with default = uuid.uuid4, editable = False, and unique = True, but I am fine doing these tasks in the view when creating the object.
You need to give the column a bounded length so MySQL can build the unique index; note that max_length on a TextField is not enforced at the database level (the column is still TEXT), so use a CharField instead. Since a UUID v4 string is 36 characters, you can set max_length=36 (make sure you don't have longer values in the db already):
uuid = models.CharField(default=uuid.uuid4, editable=False, unique=True, max_length=36)

Is there a specific ordering needed for classes in Peewee models?

I'm currently trying to create an ORM model in Peewee for an application. However, I seem to be running into an issue when querying a specific model. After some debugging, I found that whatever model is defined below one specific model fails to query.
I've moved the models around (keeping the ForeignKeys consistent), and for some odd reason it's only whatever is below a specific class (User) that fails.
def get_user(user_id):
    user = User.select().where(User.id == user_id).get()
    return user

class BaseModel(pw.Model):
    """A base model that will use our MySQL database"""
    class Meta:
        database = db

class User(BaseModel):
    id = pw.AutoField()
    steam_id = pw.CharField(max_length=40, unique=True)
    name = pw.CharField(max_length=40)
    admin = pw.BooleanField(default=False)
    super_admin = pw.BooleanField()
#...
I expected to be able to query Season like every other model. However, this is the Peewee error I run into when I query for the User with id 1 (i.e. User.select().where(User.id==1).get() or get_user(1)); the error is returned without the value even making it into the WHERE clause:
UserDoesNotExist: <Model: User> instance matching query does not exist:
SQL: SELECT `t1`.`id`, `t1`.`steam_id`, `t1`.`name`, `t1`.`admin`, `t1`.`super_admin` FROM `user` AS `t1` WHERE %s LIMIT %s OFFSET %s
Params: [False, 1, 0]
Does anyone have a clue as to why I'm getting this error?
Read the error message. It is telling you that the user with the given ID does not exist.
Peewee raises an exception if the call to .get() does not match any rows. If you want "get, or None if not found", you can do a couple of things: wrap the call to .get() in a try / except, or use get_or_none(). Both are sketched below.
http://docs.peewee-orm.com/en/latest/peewee/api.html#Model.get_or_none
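A quick sketch of both options, assuming the User model and user_id from the question:
# returns None instead of raising when no row matches
user = User.get_or_none(User.id == user_id)

# or wrap .get() and handle the model-specific exception
try:
    user = User.select().where(User.id == user_id).get()
except User.DoesNotExist:
    user = None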
Well, I think I figured it out. Instead of querying directly for the server ID, I just did User.get(1), as that seems to do the trick. More reading shows there's a get-by-id as well.
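The get-by-id lookup mentioned there is presumably Model.get_by_id:
# shorthand for a primary-key lookup; raises User.DoesNotExist when missing
user = User.get_by_id(1)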

Updating object fields from separate processes? (kind of upsert)

I have Task objects with several attributes. These tasks are bounced between several processes (using Celery) and I'd like to update the task status in a database.
Every update should update only non-NULL attributes of the object. So far I have something like:
def del_empty_attrs(task):
    for name in (key for key, val in vars(task).iteritems() if val is None):
        delattr(task, name)

def update_task(session, id, **kw):
    task = session.query(Task).get(id)
    if task is None:
        task = Task(id=id)
    for key, value in kw.iteritems():
        if not hasattr(task, key):
            raise AttributeError('Task does not have {} attribute'.format(key))
        setattr(task, key, value)
    del_empty_attrs(task)  # Don't update empty fields
    session.merge(task)
However, I get either an IntegrityError or a StaleDataError. What's the right way to do this?
I think the problem is that every process has its own session, but I'm not sure.
A lot more detail would be needed to say for sure, but there is a race condition in this code:
def update_task(session, id, **kw):
    # 1.
    task = session.query(Task).get(id)
    if task is None:
        # 2.
        task = Task(id=id)
    for key, value in kw.iteritems():
        if not hasattr(task, key):
            raise AttributeError('Task does not have {} attribute'.format(key))
        setattr(task, key, value)
    del_empty_attrs(task)  # Don't update empty fields
    # 3.
    session.merge(task)
If two processes both encounter #1, and find the object for the given id to be None, they both proceed to create a new Task() object with the given primary key (assuming id here is the primary key attribute). Both processes then race down to the Session.merge() which will attempt to emit an INSERT for the row. One process gets the INSERT, the other one gets an IntegrityError as it did not INSERT the row before the other one did.
There's no simple answer for how to "fix" this, it depends on what you're trying to do. One approach might be to ensure that no two processes work on the same pool of primary key identifiers. Another would be to ensure that all INSERTs of non-existent rows are handled by a single process.
Edit: another option is an "optimistic" approach, where a SAVEPOINT (e.g. Session.begin_nested()) is used to intercept an IntegrityError on the INSERT, then continue on after it occurs.
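A minimal sketch of that optimistic pattern, assuming the Task model, session, and id from the question:
from sqlalchemy.exc import IntegrityError

try:
    with session.begin_nested():   # emits SAVEPOINT
        session.add(Task(id=id))
        session.flush()            # attempt the INSERT now
except IntegrityError:
    # another process won the race; load the row it inserted
    task = session.query(Task).get(id)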

Errors creating generic relations using content types (object_pk)

I am working to use Django's ContentType framework to create some generic relations for my models; after looking at how the Django developers do it in django.contrib.comments.models, I thought I would imitate their approach/conventions
(from django.contrib.comments.models, line 21):
content_type = models.ForeignKey(ContentType,
                                 verbose_name='content type',
                                 related_name="content_type_set_for_%(class)s")
object_pk = models.TextField('object ID')
content_object = generic.GenericForeignKey(ct_field="content_type", fk_field="object_pk")
That's taken from their source and, of course, their source works for me (I have comments with object_pk's stored just fine; integers, actually). However, I get an error during syncdb on table creation that ends:
_mysql_exceptions.OperationalError: (1170, "BLOB/TEXT column 'object_pk' used in key specification without a key length")
Any ideas why they can do it and I can't?
After looking around, I noticed that the docs actually state:
Give your model a field that can store a primary-key value from the models you'll be relating to. (For most models, this means an IntegerField or PositiveIntegerField.)
This field must be of the same type as the primary key of the models that will be involved in the generic relation. For example, if you use IntegerField, you won't be able to form a generic relation with a model that uses a CharField as a primary key.
But why can they do it and not me?!
Thanks.
PS: I even tried creating an AbstractBaseModel with these three fields, making it abstract=True and using that (in case that had something to do with it) ... same error.
After I typed out that really long question, I looked at the MySQL error again and realized it was stemming from:
class Meta:
    unique_together = (("content_type", "object_pk"),)
Apparently, I can't have it both ways. Which leaves me torn. I'll have to open a new question about whether it is better to leave my object_pk options open (suppose I use a textfield as a primary key?) or better to enforce the unique_togetherness...
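For what it's worth, a hypothetical sketch of the integer-keyed alternative the quoted docs recommend; it only works if every model in the generic relation uses an integer primary key, but it makes object_pk indexable so the unique_together constraint is accepted by MySQL:
# assumes all target models use integer primary keys
content_type = models.ForeignKey(ContentType, verbose_name='content type')
object_pk = models.PositiveIntegerField('object ID')
content_object = generic.GenericForeignKey(ct_field="content_type", fk_field="object_pk")

class Meta:
    unique_together = (("content_type", "object_pk"),)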

How do I use a Rails ActiveRecord migration to insert a primary key into a MySQL database?

I need to create an AR migration for a table of image files. The images are being checked into the source tree, and should act like attachment_fu files. That being the case, I'm creating a hierarchy for them under /public/system.
Because of the way attachment_fu generates links, I need to use the directory naming convention to insert primary key values. How do I override the auto-increment in MySQL as well as any Rails magic so that I can do something like this:
image = Image.create(:id => 42, :filename => "foo.jpg")
image.id #=> 42
Yikes, not a pleasant problem to have. The least-kludgy way I can think of to do it is to have some code in your migration that actually "uploads" all the files through attachment-fu, and therefore lets the plugin create the IDs and place the files.
Something like this:
Dir.glob("/images/to/import/*.{jpg,png,gif}").each do |path|
  # simulate uploading the image
  tempfile = Tempfile.new(path)
  tempfile.set_encoding(Encoding::BINARY) if tempfile.respond_to?(:set_encoding)
  tempfile.binmode
  FileUtils.copy_file(path, tempfile.path)

  # build as you do in the controller - may need other metadata here
  # (new rather than create, so we only save once and can check the result)
  image = Image.new(:uploaded_data => tempfile)
  unless image.save
    logger.info "Failed to save image #{path} in migration: #{image.errors.full_messages}"
  end
  tempfile.close!
end
A look at attachment-fu's tests might be useful.
Unlike, say, Sybase, MySQL lets you insert any valid, non-duplicate value into the id column, as long as you name the id column in the insert statement's column list. No need to do anything special.
I suspect the Rails magic is just to not let Rails know the id is auto-increment. If this is the only way you'll be inserting into this table, then don't make the id auto_increment. Just make it an int not null primary key.
Though frankly, this is using a key as data, and so it makes me uneasy. If attachment_fu is just looking for a column named "id", make a column named id that's really data, and make a column named "actual_id" the actual, synthetic, auto_incremented key.
image = Image.create(:filename => "foo.jpg") { |r| r.id = 42 }
Here's my kluge:
class AddImages < ActiveRecord::Migration
  def self.up
    Image.destroy_all
    execute("ALTER TABLE images AUTO_INCREMENT = 1")
    image = Image.create(:filename => "foo.jpg")
    image.id #=> 1
  end

  def self.down
  end
end
I'm not entirely sure I understand why you need to do this, but if you only need to do this a single time, for a migration, just use execute in the migration to set the ID (assuming it's not already taken, which I can't imagine it would be):
execute "INSERT INTO images (id, filename) VALUES (42, 'foo.jpg')"
I agree with AdminMyServer although I believe you can still perform this task on the object directly:
image = Image.new :filename => "foo.jpg"
image.id = 42
image.save
You'll also need to ensure your id auto-increment is updated at the end of the process to avoid clashes in the future.
newValue = Image.find(:first, :order => 'id DESC').id + 1
execute("ALTER TABLE images AUTO_INCREMENT = #{newValue}")
Hope this helps.