SQLAlchemy - Auto Lookup Foreign Key Relationship for Insert - mysql

I am trying to get an SQLAlchemy ORM class to automatically do one of two things:
either look up the foreign key id for a field,
or, for entries where the field isn't yet in the foreign key table, add the row to the foreign key table and use the auto-generated id in the original table.
To illustrate:
Class Definition
class EquityDB_Base(object):
    @declared_attr
    def __tablename__(cls):
        return cls.__name__.lower()

    __table_args__ = {'mysql_engine': 'InnoDB'}
    __mapper_args__ = {'always_refresh': True}
    id = Column(Integer, primary_key=True)

def fk(tablename, nullable=False):
    return Column("%s_id" % tablename, Integer,
                  ForeignKey("%s.id" % tablename),
                  nullable=nullable)

class Sector(EquityDB_Base, Base):
    name = Column(String(40))

class Industry(EquityDB_Base, Base):
    name = Column(String(40))
    sector_id = fk('sector')
    sector = relationship('Sector', backref='industries')

class Equity(EquityDB_Base, Base):
    symbol = Column(String(10), primary_key=True)
    name = Column(String(40))
    industry_id = fk('industry')
    industry = relationship('Industry', backref='industries')
Using the Class to Set Industry and Sector
for i in industry_record:
    industry = Industry(id=i.id,
                        name=i.name,
                        sector=Sector(name=i.sector_name))
    session.merge(industry)
Result
Unfortunately, when I run this, the database adds a new row to the sector table for every use of a sector name, duplicates included. For instance, if 10 industries use 'Technology' as their sector name, I get 10 separate sector rows, each with its own sector_id, one per industry.
What I want is for a sector name that is already in the database to resolve automatically to the existing sector_id.
I am clearly just learning SQLAlchemy, but I can't figure out how to enable this behavior.
Any help would be appreciated!

See the answer to a similar question, "create_or_get entry in a table".
Applying the same logic, you would have something like this:
def create_or_get_sector(sector_name):
    obj = session.query(Sector).filter(Sector.name == sector_name).first()
    if not obj:
        obj = Sector(name=sector_name)
        session.add(obj)
    return obj
and use it like below:
for i in industry_record[:]:
    industry = Industry(id=i.id,
                        name=i.name,
                        sector=create_or_get_sector(sector_name=i.sector_name))
    session.merge(industry)
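One thing you should be careful about is which session instance create_or_get_sector uses: it should be the same session that runs the merge, so that a Sector added earlier in the loop is visible to the next lookup instead of being created a second time.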
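For anyone who wants to try the pattern in isolation, here is a minimal sketch, assuming SQLAlchemy 1.4 or newer. It is not the asker's schema: the models are trimmed down, an in-memory SQLite database stands in for MySQL, the sample records are made up, and the session is passed in as an argument (my own variation, so that it is obvious which session is used).

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Sector(Base):
    __tablename__ = "sector"
    id = Column(Integer, primary_key=True)
    # unique=True is an assumption added here; it lets the database reject duplicates too.
    name = Column(String(40), unique=True)


class Industry(Base):
    __tablename__ = "industry"
    id = Column(Integer, primary_key=True)
    name = Column(String(40))
    sector_id = Column(Integer, ForeignKey("sector.id"), nullable=False)
    sector = relationship("Sector")


def create_or_get_sector(session, sector_name):
    """Return the existing Sector with this name, or add a new one to the session."""
    obj = session.query(Sector).filter(Sector.name == sector_name).first()
    if obj is None:
        obj = Sector(name=sector_name)
        session.add(obj)
    return obj


# In-memory SQLite stands in for MySQL (assumption, to keep the demo runnable).
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

# Hypothetical input: (industry id, industry name, sector name).
records = [
    (1, "Software", "Technology"),
    (2, "Hardware", "Technology"),
    (3, "Banks", "Financials"),
]

with Session(engine) as session:
    for ind_id, ind_name, sector_name in records:
        sector = create_or_get_sector(session, sector_name)
        session.merge(Industry(id=ind_id, name=ind_name, sector=sector))
    session.commit()
    sector_count = session.query(Sector).count()
    industry_count = session.query(Industry).count()

print(sector_count, industry_count)
```

Both "Technology" industries end up pointing at the same sector row. Note that this lookup-then-insert is not safe against two concurrent writers; a unique constraint on the name, as assumed above, is one way to catch that case.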

SQLAlchemy - get related objects with reflected tables

I am quite new to SQLAlchemy, and I guess I am missing just a little piece here.
This is the database (SQL):
create table CEO (
    id int not null auto_increment,
    name char(255) not null,
    primary key(id),
    unique(name)
);
create table Company (
    id int not null auto_increment,
    name char(255) not null,
    ceo int not null,
    primary key(id),
    foreign key(ceo) references CEO(id)
);
And this is the code:
from sqlalchemy import create_engine, Table, Column, Integer, String, ForeignKey
from sqlalchemy.orm import registry, relationship, Session

engine = create_engine(
    "mysql+pymysql:xxxxxxxx",
    echo=True,
    future=True
)
mapper_registry = registry()
Base = mapper_registry.generate_base()

#####################
## MAPPING CLASSES ##
#####################
class CEO(Base):
    __table__ = Table('CEO', mapper_registry.metadata, autoload_with=engine)
    companies = relationship('Company', lazy="joined")

class Company(Base):
    __table__ = Table('Company', mapper_registry.metadata, autoload_with=engine)

##########################
## FINALLY THE QUESTION ##
##########################
with Session(engine, future=True) as session:
    for row in session.query(CEO).all():
        for company in row.companies:
            ## Just the id of the Ceo is yielded here
            print(company.ceo)
So CEO.companies works as expected, but Company.ceo does not, even though the foreign key is defined.
What is the proper setup for the Company mapper class so that Company.ceo yields the related object?
I figured out that the automatic setup did not work because the column Company.ceo exists in the database and holds the id of the related row. To make everything work, I needed to map the column Company.ceo under the name Company.ceo_id and add the relationship manually, like so:
CompanyTable = Table('Company', Base.metadata, autoload_with=engine)

class Company(Base):
    __table__ = CompanyTable
    ceo_id = CompanyTable.c.ceo
    ceo = relationship('CEO')
I would like to know whether it is possible to rename the column within the Table(…) call, so that I can get rid of the extra CompanyTable variable.
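One possible direction, offered as a sketch and not as a confirmed answer: SQLAlchemy has a column_reflect event that can change the key a reflected column is stored under, and Table accepts a listeners argument so the hook is in place before reflection runs. The example below assumes SQLAlchemy 1.4 or newer and uses an in-memory SQLite database with made-up rows in place of the MySQL one; the function name rename_ceo_column is my own.

```python
from sqlalchemy import Table, create_engine, text
from sqlalchemy.orm import Session, registry, relationship

# In-memory SQLite stands in for the MySQL database (assumption for the demo).
engine = create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(text(
        "create table CEO (id integer primary key, name char(255) not null unique)"))
    conn.execute(text(
        "create table Company (id integer primary key, name char(255) not null, "
        "ceo int not null, foreign key(ceo) references CEO(id))"))
    conn.execute(text("insert into CEO (id, name) values (1, 'Ada')"))
    conn.execute(text("insert into Company (id, name, ceo) values (1, 'Acme', 1)"))

mapper_registry = registry()
Base = mapper_registry.generate_base()


def rename_ceo_column(inspector, table, column_info):
    # Called once per reflected column. Only the Python-side key changes;
    # the database column is still named "ceo".
    if column_info["name"] == "ceo":
        column_info["key"] = "ceo_id"


class CEO(Base):
    __table__ = Table("CEO", mapper_registry.metadata, autoload_with=engine)
    companies = relationship("Company", back_populates="ceo")


class Company(Base):
    __table__ = Table(
        "Company",
        mapper_registry.metadata,
        autoload_with=engine,
        listeners=[("column_reflect", rename_ceo_column)],
    )
    # The name "ceo" is free now, so it can hold the relationship.
    ceo = relationship("CEO", back_populates="companies")


with Session(engine) as session:
    company = session.query(Company).one()
    ceo_id = company.ceo_id      # the raw foreign key value
    ceo_name = company.ceo.name  # the related CEO object
```

With this setup the integer is available as Company.ceo_id and the related object as Company.ceo, without a separate CompanyTable variable.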

Modifying entry in django admin creates duplicate

I am using the Django admin to modify records in a table. The problem is that whenever I modify an entry and click save, the old entry is left unchanged and a new entry containing the modified details is added instead.
For example, if I have the following:
Aardvark | Orycteropus | Some description | aardvark | animals/images/aardvark.jpg
when I change the first field to Aardvarkon, I get the following:
Aardvark | Orycteropus | Some description | aardvark | animals/images/aardvark.jpg
Aardvarkon | Orycteropus | Some description | aardvark | animals/images/aardvark.jpg
I have the following django model:
def article_file_name(instance, filename):
    return ANIMAL_IMAGES_BASE_DIR[1:] + instance.ai_species_species_sanitized + '.jpg'

class ai_species(models.Model):
    ai_species_species = models.CharField('Species', max_length=100, primary_key=True, db_column='species')
    ai_species_genus = models.ForeignKey(ai_genera, max_length=50, blank=False, null=False, db_column='genus')
    ai_species_description = models.CharField('Description', max_length=65000, db_column='description')
    ai_species_species_sanitized = models.CharField(max_length=100, blank=False, null=False, db_column='species_sanitized')
    image_url = models.ImageField(max_length=100, storage=OverwriteStorage(), validators=[validate_jpg_extension], upload_to=article_file_name)

    class Meta:
        db_table = 'Species'
        verbose_name = 'Animal species'
        verbose_name_plural = 'Animal species'

    def __unicode__(self):  # Required, don't remove.
        return self.ai_species_species
And the following helpers:
def validate_jpg_extension(value):
    if not value.name.lower().endswith('.jpg') and not value.name.lower().endswith('.jpeg'):
        raise ValidationError(u'Invalid file format! Only jpg or jpeg files allowed!')

class OverwriteStorage(FileSystemStorage):
    def get_available_name(self, name):
        # If the filename already exists, remove it.
        if self.exists(name):
            os.remove(os.path.join(settings.MEDIA_ROOT, name))
        return name
This is the MySQL table schema for this table:
This is a very counter-intuitive behavior and I haven't found any other occurrences of this online. Any help on this would be greatly appreciated.
Here's the culprit:
ai_species_species = models.CharField('Species', max_length=100, primary_key=True, db_column='species')
Since you've defined the species as the primary key, changing this field in the admin creates a new record: on save, no existing row has the new primary key value, so a row is inserted and the old one is left untouched.
FYI, primary keys aren't supposed to be things that change for a given record, since changing the primary key will invalidate every foreign key (ForeignKey, OneToOneField, and ManyToManyField) that refers to the record.
BTW, you don't need to prefix the field names with ai_species_; it only adds clutter. Removing those prefixes would remove the need for the db_column parameters as well.
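The effect can be reproduced outside Django. The sketch below is a rough emulation using Python's sqlite3 module, not Django's actual implementation: a hypothetical save() helper updates by primary key and inserts when no row matched, which is enough to show why an edited key produces a second row. The table is cut down to two columns for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE species (species TEXT PRIMARY KEY, genus TEXT)")


def save(conn, pk, genus):
    """Emulated save: UPDATE by primary key, INSERT if no row matched."""
    cur = conn.execute(
        "UPDATE species SET genus = ? WHERE species = ?", (genus, pk))
    if cur.rowcount == 0:
        conn.execute(
            "INSERT INTO species (species, genus) VALUES (?, ?)", (pk, genus))


save(conn, "Aardvark", "Orycteropus")    # first save: nothing to update, so insert
save(conn, "Aardvarkon", "Orycteropus")  # edited key matches no row, so insert again

rows = [r[0] for r in conn.execute("SELECT species FROM species ORDER BY species")]
print(rows)
```

Both 'Aardvark' and 'Aardvarkon' are present afterwards. With a separate auto-incrementing id as the primary key, the second save would match the existing row and update it.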

During synchronization Foreign key null (error)

In my database I want to synchronize two tables. I use the auth_user table (the default table provided by Django) for registration, and there is another table, user-profile, that contains fields such as username, email, and age. How do I update the foreign key during synchronization?
def get_filename(instance, filename):
    return "upload_files/%s_%s" % (str(time()).replace('.', '_'), filename)

def create_profile(sender, **kwargs):
    if kwargs["created"]:
        p = profile(username = kwargs["instance"], email=kwargs["instance"])
        p.save()

models.signals.post_save.connect(create_profile, sender=User)

class profile(models.Model):
    username = models.CharField(max_length = 30)
    email = models.EmailField()
    age = models.PositiveIntegerField(default=15)  # integer default, not the string '15'
    picture = models.FileField(upload_to=get_filename)  # pass the function itself, not its name as a string
    auth_user_id = models.ForeignKey(User)
In the profile table, all columns are filled during synchronization except auth_user_id, and I get this error:
Exception Value:
(1048, "Column 'auth_user_id_id' cannot be null")
You have to alter your table so that the column auth_user_id_id allows NULL.
Something like this:
ALTER TABLE mytable MODIFY auth_user_id_id int;
This assumes auth_user_id_id has the int datatype. (Columns are nullable by default.)

Joining 2 Tables on Multiple Non Foreign Key Columns in Flask with SQLAlchemy and Retrieving All Columns

I have a few tables shown below that I would like to join on columns that are not foreign keys to each other's tables and then have access to the columns of both. Here are the classes:
class Yi(db.Model):
    year = db.Column(db.Integer(4), primary_key=True)
    industry_id = db.Column(db.String(5), primary_key=True)
    wage = db.Column(db.Float())
    complexity = db.Column(db.Float())

class Ygi(db.Model, AutoSerialize):
    year = db.Column(db.Integer(4), primary_key=True)
    geo_id = db.Column(db.String(8), primary_key=True)
    industry_id = db.Column(db.String(5), primary_key=True)
    wage = db.Column(db.Float())
So, what I would like to get are the columns of both tables joined by the IDs I specify, in this case Year and industry_id. Is this possible? Here is the SQL I've written to achieve this...
SELECT
    yi.complexity, ygi.*
FROM
    yi, ygi
WHERE
    yi.year = ygi.year and
    yi.industry_id = ygi.industry_id
One dirty way is:
q = session.query(Ygi, Yi.complexity).\
    filter(Yi.year == Ygi.year).\
    filter(Yi.industry_id == Ygi.industry_id)
Which gives you:
SELECT ygi.year AS ygi_year, ygi.geo_id AS ygi_geo_id,
ygi.industry_id AS ygi_industry_id, ygi.wage AS ygi_wage,
yi.complexity AS yi_complexity
FROM ygi, yi
WHERE yi.year = ygi.year
AND yi.industry_id = ygi.industry_id
I find this dirty because it does not use the join() method.
The SQLAlchemy documentation explains how to use join() for this.
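For reference, a join() version might look like the sketch below. It assumes plain SQLAlchemy 1.4 or newer instead of Flask-SQLAlchemy, an in-memory SQLite database, and made-up sample rows; the two models mirror the question's columns. Because no foreign key links the tables, the ON clause is given explicitly.

```python
from sqlalchemy import Column, Float, Integer, String, and_, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Yi(Base):
    __tablename__ = "yi"
    year = Column(Integer, primary_key=True)
    industry_id = Column(String(5), primary_key=True)
    wage = Column(Float)
    complexity = Column(Float)


class Ygi(Base):
    __tablename__ = "ygi"
    year = Column(Integer, primary_key=True)
    geo_id = Column(String(8), primary_key=True)
    industry_id = Column(String(5), primary_key=True)
    wage = Column(Float)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Made-up rows: only the 2010 Ygi row has a matching Yi row.
    session.add_all([
        Yi(year=2010, industry_id="a01", wage=10.0, complexity=0.5),
        Ygi(year=2010, geo_id="mg", industry_id="a01", wage=7.0),
        Ygi(year=2011, geo_id="mg", industry_id="a01", wage=8.0),
    ])
    session.commit()

    # Explicit left side and explicit ON clause on both shared columns.
    q = (
        session.query(Ygi, Yi.complexity)
        .select_from(Ygi)
        .join(Yi, and_(Yi.year == Ygi.year,
                       Yi.industry_id == Ygi.industry_id))
    )
    rows = [(ygi.year, ygi.geo_id, complexity) for ygi, complexity in q]

print(rows)
```

Each result row is a (Ygi instance, complexity) pair, so all Ygi columns stay reachable as attributes.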
Then, you can choose to use a virtual model: see TokenMacGuy's answer to the question "Mapping a 'fake' object in SQLAlchemy".
That would be a good solution.
Alternatively, you can write a plain YiYgi class that does not derive from the SQLAlchemy Base and is just an ordinary object. It is more of a hand-rolled way to do it.
The class has a classmethod that will:
run the query built above,
call __init__ with each returned row, building one instance per row.
Here is an example:
class YiYgi(object):
    def __init__(self, year, geo_id, industry_id, wage, complexity):
        # Initialize all your fields
        self.year = year
        self.geo_id = geo_id
        self.industry_id = industry_id
        self.wage = wage + 100  # You can even make some modifications to the values here
        self.complexity = complexity

    @classmethod
    def get_by_year_and_industry(cls, year, industry_id):
        """Return a list of YiYgi instances (an empty list if nothing matches)."""
        q = session.query(Ygi, Yi.complexity).\
            filter(Yi.year == Ygi.year).\
            filter(Yi.industry_id == Ygi.industry_id).\
            filter(Ygi.year == year).\
            filter(Ygi.industry_id == industry_id)
        results = q.all()
        yiygi_list = []
        for result in results:
            # result is a tuple of (Ygi instance, Yi.complexity value)
            ygi_result = result[0]
            yiygi = cls(ygi_result.year,  # the mapped attribute is "year"
                        ygi_result.geo_id,
                        ygi_result.industry_id,
                        ygi_result.wage,
                        result[1])
            yiygi_list.append(yiygi)
        return yiygi_list

sqlalchemy - select records by month in MySQL

When I have a table in MySQL:
create table t
(
    id integer primary key,
    time datetime not null,
    value integer not null
)
and a mapping class:
class T(Base):
    __tablename__ = 't'
    id = Column(INTEGER, primary_key=True, nullable=False, unique=True)
    time = Column(DATETIME, nullable=False)
    value = Column(INTEGER, nullable=False)
how can I select all values that fall in a given month from this table using SQLAlchemy?
MySQL has the month function: select value from t where month(time) = 4
but SQLAlchemy does not seem to have a month function.
Without loading all T objects into the session, you can use SQL functions to filter out non-April rows right away:
from sqlalchemy.sql import func
qry = session.query(T).filter(func.MONTH(T.time) == 4)
for t in qry:
    print(t.value)
A very old question, but here is a better answer:
from sqlalchemy import extract
session.query(T).filter(extract('month', T.time) == 7).all()
This returns every record in the table whose time falls in July.
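The extract() approach can be tried end to end with a sketch like the one below, assuming SQLAlchemy 1.4 or newer. It swaps MySQL for an in-memory SQLite database and uses made-up rows; extract() is rendered per dialect, so the same query works here even though SQLite has no MONTH() function.

```python
from datetime import datetime

from sqlalchemy import Column, DateTime, Integer, create_engine, extract
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class T(Base):
    __tablename__ = "t"
    id = Column(Integer, primary_key=True)
    time = Column(DateTime, nullable=False)
    value = Column(Integer, nullable=False)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Made-up rows: two in April (different years), one in July.
    session.add_all([
        T(id=1, time=datetime(2012, 4, 3), value=10),
        T(id=2, time=datetime(2013, 4, 20), value=20),
        T(id=3, time=datetime(2013, 7, 1), value=30),
    ])
    session.commit()

    # The month filter runs in the database, not in Python.
    april = (
        session.query(T)
        .filter(extract("month", T.time) == 4)
        .order_by(T.id)
    )
    april_values = [t.value for t in april]

print(april_values)
```

To restrict the result to a single year as well, a second filter on extract("year", T.time) can be added in the same way.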
If, for example, you want the records from every April irrespective of year or day, you can also filter in Python, at the cost of loading every row first:
for t in session.query(T):
    if t.time.month == 4:
        print(t.value)