I'm integrating Huey with a simple pyramid app. I'm not using a global SQLAlchemy session in the app (I'm making use of the latest alchemy scaffold). However, there seems to be no other straightforward way to provide a session to periodic tasks.
import os

from huey import RedisHuey, crontab
from sqlalchemy.orm import scoped_session, sessionmaker

huey = RedisHuey(password=os.environ.get('REDIS_PASSWORD', ''))
DBSession = scoped_session(sessionmaker())

@huey.periodic_task(crontab(minute='*/1'))
def notify_not_confirmed_assignments():
    # TODO: Use a non-global DB session
    assignments = DBSession.query(Assignment).filter_by(date=next_date).all()
Does Huey offer hooks to close the DB connection on task completion? What is the best way to provide a thread-safe connection to these tasks?
Thanks in advance!
You can build the session object with a factory inside the tasks:
factory = sessionmaker()
factory.configure(bind=engine)
session = factory()
There is no need to use a scoped session; just initialize the engine and pass it to the factory.
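A sketch of how that might look inside one of the tasks, reusing huey, crontab, Assignment, and next_date from the question; the database URL is a placeholder:
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

engine = create_engine('postgresql://user:password@localhost/mydb')  # placeholder URL
factory = sessionmaker(bind=engine)

@huey.periodic_task(crontab(minute='*/1'))
def notify_not_confirmed_assignments():
    session = factory()
    try:
        assignments = session.query(Assignment).filter_by(date=next_date).all()
        # ... work with assignments ...
    finally:
        session.close()  # return the connection to the engine's pool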
scoped_session provides you with a contextual/thread-local session (i.e. each thread gets its own session and thus its own DB connection), and it's also possible to configure a custom scope when you need a separate session per something that is not a thread.
So, basically, all you need to do is have a properly configured pseudo-global variable (similar to what you have now) and make sure you call DBSession.begin() at the start of the task and DBSession.commit() at the end. Doing that manually is probably a chore, but it can easily be abstracted into a context manager:
def my_task():
    with magically_start_session() as session:
        session.query(...)
or into a decorator:
@huey.periodic_task(crontab(minute='*/1'))
@start_session
def notify_not_confirmed_assignments(session):
    # TODO: Use a non-global DB session
    assignments = session.query(...)
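Neither magically_start_session nor start_session exists in Huey or SQLAlchemy; they are placeholder names for helpers you would write yourself. A minimal sketch, reusing the DBSession scoped session from the question and relying on the session's autobegin behavior rather than an explicit begin():
from contextlib import contextmanager
from functools import wraps

@contextmanager
def magically_start_session():
    """Yield the thread-local session; commit on success, roll back on error."""
    session = DBSession()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        DBSession.remove()  # close the session and reset the thread-local registry

def start_session(func):
    """Decorator variant: pass a managed session to the task as its first argument."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        with magically_start_session() as session:
            return func(session, *args, **kwargs)
    return wrapper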
I use FastAPI with SQLAlchemy as a context manager:
from contextlib import contextmanager

@contextmanager
def get_session():  # need to patch this all over the test files
    session_ = SessionLocal()
    try:
        yield session_
    except Exception:
        session_.rollback()
router.py
@router.get('/get_users')
def get_users():  # no dependencies
    with get_session() as session:
        users = session.query(Users).all()
        return users
I need to override get_session in all my tests (I use pytest).
I could do it with @patch and patch every test, but that's not the most effective way, because I would need to apply the decorator in each test file and specify the full path correctly.
I wonder, is there a quick way to do it in one place, or maybe with a fixture?
You could try the approach in this answer to a similar question: define a pytest fixture with scope="session" and autouse=True that patches the context manager. If you need to, you can also provide a replacement callable:
from contextlib import contextmanager
from unittest.mock import patch

import pytest

@contextmanager
def fake_session():
    yield ...  # place your replacement session here

@pytest.fixture(scope="session", autouse=True)
def default_session_fixture():
    # replace get_session with fake_session for the whole test session
    with patch("your_filename.get_session", new=fake_session):
        yield
As a side note, I would highly recommend using FastAPI's dependency injection for handling your SQLAlchemy session, especially since it has built in support for exactly this kind of patching.
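For illustration, a minimal sketch of that dependency-injection route, using an app object directly instead of the question's router for brevity; Users and SessionLocal are assumed to be the model and session factory from the question:
from fastapi import Depends, FastAPI

app = FastAPI()

def get_session():
    # yield-style dependency: FastAPI runs the cleanup after the response is sent
    session = SessionLocal()
    try:
        yield session
    finally:
        session.close()

@app.get('/get_users')
def get_users(session=Depends(get_session)):
    return session.query(Users).all()

# In conftest.py, tests can swap the session without any patching, e.g.:
# app.dependency_overrides[get_session] = lambda: FakeSession()  # FakeSession is hypothetical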
I'm trying to implement python-social-auth in Flask. I've ironed out tons of kinks whilst trying to interpret about 4 tutorials and a full Flask book at the same time, and feel I've reached sort of an impasse with Flask-Migrate.
I'm currently using the following code to create the tables necessary for python-social-auth to function in a flask-sqlalchemy environment.
from social.apps.flask_app.default import models
models.PSABase.metadata.create_all(db.engine)
Now, they're obviously using some form of their own Base, not related to my actual db object. This in turn causes Flask-Migrate to completely miss these tables and drop them in migrations. Obviously I could delete those drop statements from every migration, but I can imagine that being one of those things that at some point gets forgotten, and all of a sudden I have no OAuth ties anymore.
I've gotten this solution to work by using (and modifying) the manage.py command syncdb, as suggested by the python-social-auth Flask example.
Miguel Grinberg, the author of Flask-Migrate replies here to an issue that seems to very closely resemble mine.
The closest I could find on Stack Overflow was this, but it doesn't shed too much light on the entire thing for me, and the answer was never accepted (and I can't get it to work, even though I have tried a few times).
For reference, here is my manage.py:
#!/usr/bin/env python
from flask.ext.script import Server, Manager, Shell
from flask.ext.migrate import Migrate, MigrateCommand

from app import app, db

manager = Manager(app)
manager.add_command('runserver', Server())
manager.add_command('shell', Shell(make_context=lambda: {
    'app': app,
    'db_session': db.session
}))

migrate = Migrate(app, db)
manager.add_command('db', MigrateCommand)

@manager.command
def syncdb():
    from social.apps.flask_app.default import models
    models.PSABase.metadata.create_all(db.engine)
    db.create_all()

if __name__ == '__main__':
    manager.run()
And to clarify, the db init / migrate / upgrade commands only create my user table (and the migration one obviously), but not the social auth ones, while the syncdb command works for the python-social-auth tables.
I understand from the github response that this isn't supported by Flask-Migrate, but I'm wondering if there's a way to fiddle in the PSABase-tables so they are picked up by the db-object sent into Migrate.
Any suggestions welcome.
(Also, first-time poster. I feel I've done a lot of research and tried quite a few solutions before I finally came here to post. If I've missed something obvious in the guidelines of SO, don't hesitate to point that out to me in a private message and I'll happily oblige)
After the helpful answer from Miguel here, I got some new keywords to research. I ended up at a helpful GitHub page which had further references to, amongst others, the Alembic Bitbucket site, which helped immensely.
In the end I did this in my Alembic migration env.py file:
from sqlalchemy import engine_from_config, pool, MetaData

[...]

# add your model's MetaData object here
# for 'autogenerate' support
# from myapp import mymodel
# target_metadata = mymodel.Base.metadata
from flask import current_app

config.set_main_option('sqlalchemy.url',
                       current_app.config.get('SQLALCHEMY_DATABASE_URI'))

def combine_metadata(*args):
    m = MetaData()
    for metadata in args:
        for t in metadata.tables.values():
            t.tometadata(m)
    return m

from social.apps.flask_app.default import models

target_metadata = combine_metadata(
    current_app.extensions['migrate'].db.metadata,
    models.PSABase.metadata)
This seems to work absolutely perfectly.
The problem is that you have two sets of models, each with a different SQLAlchemy metadata object. The models from PSA were generated directly from SQLAlchemy, while your own models were generated through Flask-SQLAlchemy.
Flask-Migrate only sees the models that are defined via Flask-SQLAlchemy, because the db object that you give it only knows about the metadata for those models; it knows nothing about these other PSA models that bypassed Flask-SQLAlchemy.
So yeah, the end result is that each time you generate a migration, Flask-Migrate/Alembic finds these PSA tables in the db and decides to delete them, because it does not see any models for them.
I think the best solution for your problem is to configure Alembic to ignore certain tables. For this you can use the include_object configuration in the env.py module stored in the migrations directory. Basically you are going to write a function that Alembic will call every time it comes upon a new entity while generating a migration script. The function will return False when the object in question is one of these PSA tables, and True for every thing else.
Update: Another option, which you included in the response you wrote, is to merge the two metadata objects into one, then the models from your application and PSA are inspected by Alembic together.
I have nothing against the technique of merging multiple metadata objects into one, but I think it is not a good idea for an application to track migrations in models that aren't yours. Many times Alembic will not be able to capture a migration accurately, so you may need to make minor corrections on the generated script before you apply it. For models that are yours, you are capable of detecting these inaccuracies that sometimes show up in migration scripts, but when the models aren't yours I think you can miss stuff, because you will not be familiar enough with the changes that went into those models to do a good review of the Alembic generated script.
For this reason, I think it is a better idea to use my proposed include_object configuration to leave the third party models out of your migrations. Those models should be migrated according to the third party project's instructions instead.
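For reference, a sketch of what that include_object hook could look like; the "social_auth_" table-name prefix is an assumption here, so adjust it to whatever tables python-social-auth actually created in your database:
# in migrations/env.py
def include_object(object, name, type_, reflected, compare_to):
    # skip the third-party PSA tables when autogenerating migrations
    if type_ == "table" and name.startswith("social_auth_"):
        return False
    return True

# then pass it to context.configure() in run_migrations_offline()/run_migrations_online(),
# alongside the existing arguments:
# context.configure(connection=connection,
#                   target_metadata=target_metadata,
#                   include_object=include_object)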
I use two models as follows:
One uses db, as in:
db = SQLAlchemy()
app.config['SQLALCHEMY_DATABASE_URI'] = 'postgresql://postgres:' + POSTGRES_PASSWORD + '@localhost/Flask'
db.init_app(app)

class User(db.Model):
    pass
and the other uses Base, as in:
Base = declarative_base()
uri = 'postgresql://postgres:' + POSTGRES_PASSWORD + '@localhost/Flask'
engine = create_engine(uri)
metadata = MetaData(engine)
Session = sessionmaker(bind=engine)
session = Session()

class Address(Base):
    pass
Since you created User with db.Model, you can use Flask-Migrate on User, while the Address class uses Base, which handles fetching the pre-existing table from the database.
So my scenario drilled down to the essence is as follows:
Essentially, I have a config file containing a set of SQL queries whose result sets need to be exported as CSV files.
Since some queries may return billions of rows, and because something may interrupt the process (bug, crash, ...), I want to use a framework such as Spring Batch, which gives me restartability and job monitoring.
I am using a file based H2 database for persisting spring batch jobs.
So, here are my questions:
Upon creating a Job, I need to provide my RowMapper with some initial configuration. So what happens when a job needs to be restarted after, e.g., a crash? Concretely:
Is the state of the RowMapper automatically persisted, and upon restart will Spring Batch try to restore the object from its database, or
will the RowMapper object that is part of the original Spring Batch XML config file be used, or
do I have to maintain the RowMapper's state using the step's/job's ExecutionContext?
The above question is related to whether there is magic going on when using the Spring Batch XML configuration, or whether I could just as well create all these beans programmatically:
Since I need to parse my own config format into a Spring Batch job config, I would rather just use Spring Batch's Java classes (beans) and fill them out appropriately, rather than attempting to manually write out valid XML. However, if my Job crashes, I would create all the beans myself again. Does Spring Batch automagically restore the Job state from its database?
If I really need XML, is there a way to serialize a Spring Batch JobRepository (or one of these objects) as a Spring Batch XML config?
Right now, I tried to configure my Step with the following code - but I am unsure if this is the proper way to do this:
Is TaskletStep the way to go?
Is the way I create the chunked reader/writer correct, or is there some other object which I should use instead?
I would have assumed that opening the reader and writer would occur automatically as part of the JobExecution, but if I don't open these resources before running the Job, I get an exception telling me that I need to open them first. Maybe I need to create some other object that manages the resources (the JDBC connection and file handle)?
JdbcCursorItemReader<Foobar> itemReader = new JdbcCursorItemReader<Foobar>();
itemReader.setSql(sqlStr);
itemReader.setDataSource(dataSource);
itemReader.setRowMapper(rowMapper);
itemReader.afterPropertiesSet();
ExecutionContext executionContext = new ExecutionContext();
itemReader.open(executionContext);
FlatFileItemWriter<String> itemWriter = new FlatFileItemWriter<String>();
itemWriter.setLineAggregator(new PassThroughLineAggregator<String>());
itemWriter.setResource(outResource);
itemWriter.afterPropertiesSet();
itemWriter.open(executionContext);
int commitInterval = 50000;
CompletionPolicy completionPolicy = new SimpleCompletionPolicy(commitInterval);
RepeatTemplate repeatTemplate = new RepeatTemplate();
repeatTemplate.setCompletionPolicy(completionPolicy);
RepeatOperations repeatOperations = repeatTemplate;
ChunkProvider<Foobar> chunkProvider = new SimpleChunkProvider<Foobar>(itemReader, repeatOperations);
ItemProcessor<Foobar, String> itemProcessor = new ItemProcessor<Foobar, String>() {
    /* Custom implementation */ };
ChunkProcessor<Foobar> chunkProcessor = new SimpleChunkProcessor<Foobar, String>(itemProcessor, itemWriter);
Tasklet tasklet = new ChunkOrientedTasklet<Foobar>(chunkProvider, chunkProcessor); //new SplitFilesTasklet();
TaskletStep taskletStep = new TaskletStep();
taskletStep.setName(taskletName);
taskletStep.setJobRepository(jobRepository);
taskletStep.setTransactionManager(transactionManager);
taskletStep.setTasklet(tasklet);
taskletStep.afterPropertiesSet();
job.addStep(taskletStep);
Most of your questions are really complex, and it's difficult to give a good answer without writing a long paper.
I'm new to Spring Batch like you, and I found a lot of really useful info - and all the answers to your questions - by reading Spring Batch in Action: it's complete, well explained, full of examples and covers all aspects of the framework (reader/writer/processor, job/tasklet/chunk lifecycle/persistence, tx/resources management, job flow, integration with other services, partitioning, restarting/retry, failure management and a lot of interesting things).
Hope this helps.
I am using Flask and a SQLAlchemy extension. Also I am using the declarative way to write my models as described in the extension's documentation.
For one of my models, I have some code I need to run after a new row has been inserted, updated or deleted. I was wondering how to do it? Ideally I would just add functions to the model.
Look at SQLAlchemy's Mapper Events. You can bind a callback function to the after_insert, after_update, and after_delete events.
Example:
from sqlalchemy import event

def after_insert_listener(mapper, connection, target):
    # 'target' is the inserted object
    print(target.id_user)

event.listen(User, 'after_insert', after_insert_listener)
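The same listeners can also be registered with SQLAlchemy's event.listens_for decorator; a short sketch covering two of the three events mentioned above:
from sqlalchemy import event

@event.listens_for(User, 'after_insert')
def receive_after_insert(mapper, connection, target):
    print(target.id_user)  # the freshly inserted object

@event.listens_for(User, 'after_update')
def receive_after_update(mapper, connection, target):
    print(target.id_user)  # the updated object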
As far as I have understood, dependency injection separates the application wiring logic from the business logic. Additionally, I try to adhere to the law of Demeter by only injecting direct collaborators.
If I understand this article correctly, proper dependency injection means that collaborators should be fully initialized when they are injected, unless lazy instantiation is required. This would mean (and is actually mentioned in the article) that objects like database connections and file streams should be up and ready at injection time.
However, opening files and connections could result in an exception, which should be handled at some point. What is the best way to go about this?
I could handle the exception at 'wire time', like in the following snippet:
import sys

class Injector:
    def inject_MainHelper(self, args):
        return MainHelper(self.inject_Original(args))

    def inject_Original(self, args):
        return open(args[1], 'rb')

class MainHelper:
    def __init__(self, original):
        self.original = original

    def run(self):
        # Do stuff with the stream
        ...

if __name__ == '__main__':
    injector = Injector()
    try:
        helper = injector.inject_MainHelper(sys.argv)
    except Exception:
        print("FAILED!")
    else:
        helper.run()
This solution, however, starts to mix business logic with wiring logic.
Another solution is using a provider:
import sys

class FileProvider:
    def __init__(self, filename, load_func, mode):
        self._load = load_func
        self._filename = filename
        self._mode = mode

    def get(self):
        return self._load(self._filename, self._mode)

class Injector:
    def inject_MainHelper(self, args):
        return MainHelper(self.inject_Original(args))

    def inject_Original(self, args):
        return FileProvider(args[1], open, 'rb')

class MainHelper:
    def __init__(self, provider):
        self._provider = provider

    def run(self):
        try:
            original = self._provider.get()
        except Exception:
            print("FAILED!")
        finally:
            # Do stuff with the stream
            ...

if __name__ == '__main__':
    injector = Injector()
    helper = injector.inject_MainHelper(sys.argv)
    helper.run()
The drawback here is the added complexity of a provider and a violation of the law of Demeter.
What is the best way to deal with exceptions like this when using a dependency-injection framework as discussed in the article?
SOLUTION, based on the discussion with djna
First, as djna correctly points out, there is no actual mixing of business and wiring logic in my first solution. The wiring is happening in its own, separate class, isolated from other logic.
Secondly, there is the case of scopes. Instead of one, there are two smaller scopes:
The scope where the file is not verified yet. Here, the injection engine cannot assume anything about the file's state yet and cannot build objects that depend on it.
The scope where the file is successfully opened and verified. Here, the injection engine can create objects based on the extracted contents of the file, without the worry of blowing up on file errors.
After entering the first scope and obtaining enough information on opening and validating a file, the business logic tries to actually validate and open the file (harvesting the fruit, as djna puts it). Here, exceptions can be handled accordingly. When it is certain the file is loaded and parsed correctly, the application can enter the second scope.
Thirdly, not really related to the core problem, but still an issue: the first solution embeds business logic in the main loop, instead of the MainHelper. This makes testing harder.
import sys

class FileProvider:
    def __init__(self, filename, load_func):
        self._load = load_func
        self._filename = filename

    def load(self, mode):
        return self._load(self._filename, mode)

class Injector:
    def inject_MainHelper(self, args):
        return MainHelper(self.inject_Original(args))

    def inject_Original(self, args):
        return FileProvider(args[1], open)

    def inject_StreamEditor(self, stream):
        return StreamEditor(stream)

class MainHelper:
    def __init__(self, provider):
        self._provider = provider

    def run(self):
        # In the first scope
        try:
            original = self._provider.load('rb')
        except Exception:
            print("FAILED!")
            return
        # Entering the second scope
        editor = Injector().inject_StreamEditor(original)
        editor.do_work()

if __name__ == '__main__':
    injector = Injector()
    helper = injector.inject_MainHelper(sys.argv)
    helper.run()
Note that I have cut some corners in the last snippet. Refer to the mentioned article for more information on entering scopes.
I've had discussions about this in the context of Java EE, EJB 3 and resources.
My understanding is that we need to distinguish between injection of the Reference to a resource and the actual use of a resource.
Take the example of a database connection; we have some pseudo-code:
InjectedConnectionPool icp;

public void doWork(Stuff someData) throws Exception {
    Connection c = icp.getConnection();
    c.writeToDb(someData);
    c.close(); // return to pool
}
As I understand it:
1). That the injected resource can't be the connection itself, rather it must be a connection pool. We grab connections for a short duration and return them.
2). That any Db connection may be invalidated at any time by a failure in the DB or network. So the connection pooling resource must be able to deal with throwing away bad connections and getting new ones.
3). A failure of injection means that the component will not be started. This could happen if, for example, the injection is actually a JNDI lookup. If there's no JNDI entry we can't find the connection pool definition, can't create the pool and so can't start the component. This is not the same as actually opening a connection to the DB ...
4). ... at the time of initialisation we don't actually need to open any connections; a failure to do so just gives us an empty pool - i.e. exactly the same state as if we had been running for a while and the DB went away, in which case the pool would/could/should throw away the stale connections.
This model seems to nicely define a set of responsibilities that Demeter might accept. Injection has the responsibility to prepare the ground, to make sure that when the code needs to do something, it can. The code has the responsibility to harvest the fruit: try to use the prepared material and cope with actual resource failures, as opposed to failures to find out about resources.
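A rough Python sketch of that split of responsibilities, for illustration only; connect_fn, conn.ping() and conn.write_to_db() are hypothetical stand-ins for whatever your driver provides, since the point is the shape of the responsibilities rather than a real pool implementation:
class ConnectionProvider:
    def __init__(self, connect_fn):
        self._connect = connect_fn   # injected: how to open a connection, not an open connection
        self._pool = []

    def get(self):
        # hand out a pooled connection if it is still healthy, otherwise open a new one
        while self._pool:
            conn = self._pool.pop()
            try:
                conn.ping()          # hypothetical liveness check
                return conn
            except Exception:
                pass                 # stale connection: throw it away and keep looking
        return self._connect()       # empty pool is fine; connect lazily on first use

    def give_back(self, conn):
        self._pool.append(conn)      # return to the pool for reuse

def do_work(provider, some_data):
    # business code harvests the fruit: grab, use, return
    conn = provider.get()
    try:
        conn.write_to_db(some_data)  # hypothetical, mirrors writeToDb in the pseudo-code
    finally:
        provider.give_back(conn)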