I'd very much like to avoid binding session state to entities when querying, and to take advantage of class mapping without relying on:
session.query(SomeClass)
I have no need for transactions, eager/deferred loading, change tracking, or any of the other features offered. Essentially I want to manually bind a ResultProxy to the mapped class, and have a list of instances that do not have any references to SQLA (such as state).
I tried Query.instances, but it requires a session instance:
from sqlalchemy import create_engine, MetaData, Table, Column, Integer, String
from sqlalchemy.orm import mapper, Query

engine = create_engine('sqlite:///:memory:')
meta = MetaData()
meta.bind = engine

table = Table('table', meta,
    Column('id', Integer, primary_key=True),
    Column('field1', String(16), nullable=False),
    Column('field2', String(60)),
    Column('field3', String(20), nullable=False)
)

class Table(object):  # note: shadows sqlalchemy.Table from here on
    pass

meta.create_all(checkfirst=True)

for i in range(10):
    table.insert().execute({'field1': 'field1' + str(i),
                            'field2': 'field2' + str(i * 2),
                            'field3': 'field3' + str(i * 4)})

mapper(Table, table)
query = Query((Table,))
query.instances(engine.text("SELECT * FROM table").execute())
Results in:
Traceback (most recent call last):
File "sqlalchemy/orm/mapper.py", line 2507, in _instance_processor
session_identity_map = context.session.identity_map
AttributeError: 'NoneType' object has no attribute 'identity_map'
I'm stuck at this point. I've looked through Query.instances, and the manual setup to replicate session seems very extensive. It requires Query, QueryContext, _MapperEntity and an elaborate choreography that would make most ballet companies blush.
tl;dr Want to use SQLAlchemy's query generation (anything that returns ResultProxy instance) and have results mapped to their respective classes while skipping anything to do with session, identity_map & Unit of Work.
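For concreteness, the effect I'm after is roughly this hand-rolled sketch (with a hypothetical plain holder class); I'd like SQLAlchemy's class mapping to do this part for me:

class PlainRecord(object):
    # hypothetical plain class with no SQLAlchemy instrumentation at all
    def __init__(self, **columns):
        self.__dict__.update(columns)

rows = engine.execute(table.select())               # any Core query returning a ResultProxy
records = [PlainRecord(**dict(row)) for row in rows]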
You could expunge the object, so it does not maintain a session, but it would still have the _sa attributes.
Seeing as it's been a while since you asked, this probably won't serve you any more, but maybe it will help others.
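For what it's worth, a minimal sketch of the expunge route mentioned above (assuming a mapped class and engine as in the question):

from sqlalchemy.orm import sessionmaker

Session = sessionmaker(bind=engine)
session = Session()
objects = session.query(Table).all()
session.expunge_all()   # detach every loaded instance from the session
session.close()
# the instances are now detached, but still carry _sa_instance_state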
We are working on a top-down-RPG-like multiplayer game with some friends, for learning purposes (and fun!). We already have some entities in the game and inputs are working, but the network implementation gives us headaches :D
The Issues
When trying to convert with __dict__, some values will still contain a pygame.Surface, which I don't want to transfer and which causes errors when trying to jsonify them. Other objects that I would like to transfer in a simplified way, like Rectangle, cannot be converted automatically.
Already functional
Client-Server connection
Transfering JSON objects in both directions
Async networking and synchronized putting into a Queue
Situation
A new player connects to the server and wants to get the current game state with all objects.
Data-Structure
We use a "Entity-Component" based architecture, so we separated the game logic very strictly into "systems", while the data is stored in the "components" of each Entity. The Entity is a very simple container and has nothing more than a ID and a list of components. Example Entity (shorten for better readability):
Entity
|-- Component (Moveable)
|-- Component (Graphic)
| |- complex datatypes like pygame.Surface
| `- (...)
`- Component (Inventory)
We tried different approaches, but none of them seems to fit very well, or they feel "hacky".
pickle
Very Python-specific, so it would not be easy to implement clients in other languages in the future. And I've read about security risks when creating objects from network data in the dynamic way pickle offers. It does not even solve the Surface/Rectangle issue.
__dict__
Still contains references to the original objects, so a "cleanup" or "filter" for unwanted datatypes would also affect the originals. A deepcopy throws an exception:
...\Python\Python36\lib\copy.py", line 169, in deepcopy
rv = reductor(4)
TypeError: can't pickle pygame.Surface objects
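The core of the problem can be reproduced in a few lines (a minimal sketch, assuming pygame is importable):

import copy
import pygame

surface = pygame.Surface((10, 10))
copy.deepcopy({'image': surface})   # raises TypeError: can't pickle pygame.Surface objects
# json.dumps({'image': surface}) fails similarly - there is no JSON encoder for Surface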
Show some code
The method of the "EntityManager" class that should generate the snapshot of all entities, including their components. This snapshot should be converted to JSON without any errors - and, if possible, without much configuration in this core class.
class EntityManager:
    def generate_world_snapshot(self):
        """ Returns a dictionary with all Entities and their components to send
        this to the client. This function will probably generate a lot of data,
        but it's to send the whole current game state when a new player
        connects or when a complete refresh is required """
        # It should be possible to add more objects to the snapshot, so we
        # create our own Snapshot-Datastructure
        result = {'entities': {}}
        entities = self.get_all_entities()
        for e in entities:
            result['entities'][e.id] = deepcopy(e.__dict__)
            # Components are objects, but a dictionary is required for transfer
            cmp_obj_list = result['entities'][e.id]['components']
            # Empty the current list of components; it's going to be filled with
            # dictionaries of each cmp which are cleaned for the dump, because
            # of the errors when directly converting the whole datastructure to JSON
            result['entities'][e.id]['components'] = {}
            for cmp in cmp_obj_list:
                cmp_copy = deepcopy(cmp)
                cmp_dict = cmp_copy.__dict__
                # Only list, dict, int, str, float and None will stay, while
                # other types are simply deleted including their key.
                # Lists and dicts will be cleaned up recursively as well
                cmp_dict = self.clean_complex_recursive(cmp_dict)
                result['entities'][e.id]['components'][type(cmp_copy).__name__] \
                    = cmp_dict
        logging.debug("EntityMgr: Entity#3: %s" % result['entities'][3])
        return result
Expectation and actual results
We can find a way to manually override elements which we don't want. But as the list of components grows, we would have to put all the filter logic into this core class, which should not contain any component specializations.
Do we really have to put all the logic into the EntityManager for filtering the right objects? This does not feel good, as I would like to have all conversion to JSON done without any hardcoded configuration.
How can we convert all this complex data in the most generic way?
Thanks for reading so far and thank you very much for your help in advance!
Interesting articles which we were already working through, and which may be helpful for others with similar issues:
https://gafferongames.com/post/what_every_programmer_needs_to_know_about_game_networking/
http://code.activestate.com/recipes/408859/
https://docs.python.org/3/library/pickle.html
UPDATE: Solution - thanks to sloth
We used a combination of the following, which works really well so far and is also easy to maintain!
The EntityManager now calls the get_state() method of each entity:
class EntityManager:
    def generate_world_snapshot(self):
        """ Returns a dictionary with all Entities and their components to send
        this to the client. This function will probably generate a lot of data,
        but it's to send the whole current game state when a new player
        connects or when a complete refresh is required """
        # It should be possible to add more objects to the snapshot, so we
        # create our own Snapshot-Datastructure
        result = {'entities': {}}
        entities = self.get_all_entities()
        for e in entities:
            result['entities'][e.id] = e.get_state()
        return result
The Entity has only some basic attributes to add to the state and forwards the get_state() call to all the Components:
class Entity:
    def get_state(self):
        state = {'name': self.name, 'id': self.id, 'components': {}}
        for cmp in self.components:
            state['components'][type(cmp).__name__] = cmp.get_state()
        return state
The components themselves now inherit their get_state() method from their new superclass Component, which simply takes care of all the simple datatypes:
class Component:
    def __init__(self):
        logging.debug('generic component created')

    def get_state(self):
        state = {}
        for attr, value in self.__dict__.items():
            if value is None or isinstance(value, (str, int, float, bool)):
                state[attr] = value
            elif isinstance(value, (list, dict)):
                # logging.warn("Generating state: not supporting lists yet")
                pass
        return state


class GraphicComponent(Component):
    # (...)
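For illustration, a hypothetical component that holds only simple attributes is already covered by the inherited get_state() without any extra code:

class MovementComponent(Component):    # hypothetical example component
    def __init__(self):
        super().__init__()
        self.speed = 4.5
        self.direction = 'north'

mv = MovementComponent()
print(mv.get_state())   # -> {'speed': 4.5, 'direction': 'north'}
# attributes holding Surfaces, Rects, lists, etc. are simply left out of the state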
Now every developer has the opportunity to override this method to create a more detailed get_state() for complex types directly in the component classes (like Graphic, Movement, Inventory, etc.) if the state needs to be saved in a more accurate way - which is a huge win for maintaining the code in the future, since these code pieces live in one class.
The next step is to implement the static method for creating the items from the state in the same class (a sketch follows below). This makes everything work really smoothly.
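A rough sketch of what that factory could look like on the generic Component (hypothetical; written as a classmethod so each subclass recreates instances of its own type, and components whose __init__ takes required arguments would override it, just like they override get_state()):

class Component:
    # ... __init__ and get_state() as above ...

    @classmethod
    def from_state(cls, state):
        """Recreate a component of this class from a plain state dictionary."""
        cmp = cls()
        for attr, value in state.items():
            setattr(cmp, attr, value)
        return cmp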
Thank you so much sloth for your help.
Do we really have to put all the logic into the EntityManager for filtering the right objects?
No, you should use polymorphism.
You need a way to represent your game state in a form that can be shared between different systems; so maybe give your components a method that will return all of their state, and a factory method that allows you to create the component instances out of that very state.
(Python already has the __repr__ magic method, but you don't have to use it)
So instead of doing all the filtering in the entity manager, just let it call this new method on all components and let each component decide what the result will look like.
Something like this:
...
result = {'entities': {}}
entities = self.get_all_entities()
for e in entities:
    result['entities'][e.id] = {'components': {}}
    for cmp in e.components:
        result['entities'][e.id]['components'][type(cmp).__name__] = cmp.get_state()
...
And a component could implement it like this:
class GraphicComponent:
    def __init__(self, pos=...):
        self.image = ...
        self.rect = ...
        self.whatever = ...

    def get_state(self):
        return {'pos_x': self.rect.x, 'pos_y': self.rect.y, 'image': 'name_of_image.jpg'}

    @staticmethod
    def from_state(state):
        return GraphicComponent(pos=(state['pos_x'], state['pos_y']), ...)
And a client's EntityManager that receives the state from the server would iterate over the component list of each entity and call from_state to create the instances.
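A rough sketch of that receiving side (hypothetical names; assumes a registry that maps component class names back to their classes, and that Entity() starts with an empty component list):

COMPONENT_CLASSES = {cls.__name__: cls
                     for cls in (GraphicComponent, MovementComponent, InventoryComponent)}

def apply_world_snapshot(self, snapshot):
    # hypothetical client-side EntityManager method
    for entity_id, entity_state in snapshot['entities'].items():
        entity = Entity()
        entity.id = entity_id
        for cmp_name, cmp_state in entity_state['components'].items():
            entity.components.append(COMPONENT_CLASSES[cmp_name].from_state(cmp_state))
        self.entities[entity_id] = entity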
I have a simple example model where I would like to generate names for the objects of the Position rule that were not given a name via 'as <NAME>'. This is needed so that I can find them later with the built-in FQN scope provider.
My idea would be to do this in the position_name_generator object processor, but that will only be called after the whole model is parsed. I don't really understand the reason for that, since by the time I would need a Position object in the Project, the objects are already created; still, the object processor will not have been called.
Another idea would be to do this in a custom scope provider for Position.location, which would first do the name generation and then use the built-in FQN to find the Location object. Although this would work, I consider it hacky and I would prefer to avoid it.
What would be the textX way of solving this issue?
(Please take into account that this is only a small example. In reality a similar functionality is required for a rather big and complex model. To change this behaviour with the generated names is not possible since it is a requirement.)
import textx
MyLanguage = """
Model
: (locations+=Location)*
(employees+=Employee)*
(positions+=Position)*
(projects+=Project)*
;
Project
: 'project' name=ID
('{'
('use' use=[Position])*
'}')?
;
Position
: 'define' 'position' employee=[Employee|FQN] '->' location=[Location|FQN] ('as' name=ID)?
;
Employee
: 'employee' name=ID
;
Location
: 'location' name=ID
( '{'
(sub_location+=Location)+
'}')?
;
FQN
: ID('.' ID)*
;
Comment:
/\/\/.*$/
;
"""
MyCode = """
location Building
{
location Entrance
location Exit
}
employee Hans
employee Juergen
// Shall be referred to with the given name: "EntranceGuy"
define position Hans->Building.Entrance as EntranceGuy
// Shall be referred to with the autogenerated name: <Employee>"At"<LastLocation>
define position Juergen->Building.Exit
project SecurityProject
{
use EntranceGuy
use JuergenAtExit
}
"""
def position_name_generator(obj):
    if "" == obj.name:
        obj.name = obj.employee.name + "At" + obj.location.name


def main():
    meta_model = textx.metamodel_from_str(MyLanguage)
    meta_model.register_scope_providers({
        "Position.location": textx.scoping.providers.FQN(),
    })
    meta_model.register_obj_processors({
        "Position": position_name_generator,
    })

    model = meta_model.model_from_str(MyCode)
    assert model, "Could not create model..."


if "__main__" == __name__:
    main()
What is the textx way to solve this...
The use case you describe is to define the name of an object based on other model elements, including references to other model elements. This is currently not covered by any test or use case included in our test suite or the textX docs.
Object processors are executed at defined stages during model construction (see http://textx.github.io/textX/stable/scoping/#using-the-scope-provider-to-modify-a-model). In the described setup they are executed after reference resolution. Since the name to be defined/deduced is itself required for reference resolution, object processors cannot be used here (even if we allowed controlling when object processors are executed, before or after scope resolution, the described setup still would not work).
Given the dynamics of model loading (see http://textx.github.io/textX/stable/scoping/#using-the-scope-provider-to-modify-a-model), the solution is located within a scope provider (as you suggested). Here, we allow controlling the order of reference resolution, such that references to the object being named by a custom procedure are postponed until the references required to deduce/define that name are resolved.
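For illustration only, a rough sketch of the custom-scope-provider idea from the question (assuming the usual (obj, attr, obj_ref) provider protocol); note that a naive version like this still runs into the ordering problem described above whenever other references depend on the generated name, which is exactly what the Postponed mechanism in the PR addresses:

from textx.scoping.providers import FQN

class NamingFQN(FQN):
    # hypothetical provider for Position.location: resolve with FQN, then derive the name
    def __call__(self, obj, attr, obj_ref):
        resolved = super(NamingFQN, self).__call__(obj, attr, obj_ref)
        if resolved is not None and not obj.name:
            # obj.employee may itself still be unresolved at this point - the
            # kind of ordering issue that needs postponing
            obj.name = obj.employee.name + "At" + resolved.name
        return resolved

meta_model.register_scope_providers({"Position.location": NamingFQN()})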
Possible workaround
A preliminary sketch of how your use case can be solved is discussed in https://github.com/textX/textX/pull/194 (with the attached issue https://github.com/textX/textX/issues/193). This textX PR contains a version of scoping.py you could probably use for your project (just copy and rename the module). A full-fledged solution could be part of textX TEP-001, where we plan to make scoping more controllable for the end user.
Playing around with this absolutely interesting issue revealed new aspects of the textX framework to me:
Names can depend on model contents (involving unresolved references). This name resolution can, in turn, be postponed (see Postponed in the referenced PR) in terms of our reference-resolution logic.
Even more interesting are the consequences of that: what happens to references pointing to locations where unresolved names are found? Here, we must postpone the reference resolution process, because we cannot know whether the name might match once it is resolved...
Your example is included: https://github.com/textX/textX/blob/analysis/issue193/tests/functional/test_scoping/test_name_resolver/test_issue193_auto_name.py
I'm building a python app around an existing (mysql) database and am using automap to infer tables and relationships:
from sqlalchemy import create_engine
from sqlalchemy.ext.automap import automap_base

base = automap_base()
self.engine = create_engine(
    'mysql://%s:%s@%s/%s?charset=utf8mb4' % (
        config.DB_USER, config.DB_PASSWD, config.DB_HOST, config.DB_NAME
    ), echo=False
)
# reflect the tables
base.prepare(self.engine, reflect=True)

self.TableName = base.classes.table_name
Using this I can do things like session.query(TableName) etc...
However, I'm worried about performance, because every time the app runs it will do the whole inference again.
Is this a legitimate concern?
If so, is there a possibility to 'cache' the output of Automap?
I think that "reflecting" the structure of your database is not the way to go. Unless your app tries to "infer" things from the structure, like static code analysis would for source files, then it is unnecessary. The other reason for reflecting it at run-time would be the reduced time to begin "using" the database using SQLAlchemy. However:
Another option would be to use something like SQLACodegen (https://pypi.python.org/pypi/sqlacodegen):
It will "reflect" your database once and create a 99.5% accurate set of declarative SQLAlchemy models for you to work with. However, this does require that you keep the model subsequently in-sync with the structure of the database. I would assume that this is not a big concern seeing as the tables you're already-working with are stable-enough such that run-time reflection of their structure does not impact your program much.
Generating the declarative models is essentially a "cache" of the reflection. It's just that SQLACodegen saved it into a very readable set of classes + fields instead of data in-memory. Even with a changing structure, and my own "changes" to the generated declarative models, I still use SQLACodegen later-on in a project whenever I make database changes. It means that my models are generally consistent amongst one-another and that I don't have things such as typos and data-mismatches due to copy-pasting.
Performance can be a legitimate concern. If the database schema is not changing, it can be time consuming to reflect the database every time a script is run. This is more of an issue during development than when starting up a long-running application. It's also a significant time saver if your database is on a remote server (again, particularly during development).
I use code that is similar to the answer here (as noted by @ACV). The general plan is to perform the reflection the first time, then pickle the metadata object. The next time the script is run, it will look for the pickle file and use that. The file can be anywhere, but I place mine in ~/.sqlalchemy_cache. This is an example based on your code.
import os
import pickle

from sqlalchemy import create_engine
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.ext.declarative import declarative_base

self.engine = create_engine(
    'mysql://%s:%s@%s/%s?charset=utf8mb4' % (
        config.DB_USER, config.DB_PASSWD, config.DB_HOST, config.DB_NAME
    ), echo=False
)

metadata_pickle_filename = "mydb_metadata"
cache_path = os.path.join(os.path.expanduser("~"), ".sqlalchemy_cache")
cached_metadata = None
if os.path.exists(cache_path):
    try:
        with open(os.path.join(cache_path, metadata_pickle_filename), 'rb') as cache_file:
            cached_metadata = pickle.load(file=cache_file)
    except IOError:
        # cache file not found - no problem, reflect as usual
        pass

if cached_metadata:
    base = declarative_base(bind=self.engine, metadata=cached_metadata)
else:
    base = automap_base()
    base.prepare(self.engine, reflect=True)  # reflect the tables

    # save the metadata for future runs
    try:
        if not os.path.exists(cache_path):
            os.makedirs(cache_path)
        # make sure to open in binary mode - we're writing bytes, not str
        with open(os.path.join(cache_path, metadata_pickle_filename), 'wb') as cache_file:
            pickle.dump(base.metadata, cache_file)
    except:
        # couldn't write the file for some reason
        pass

self.TableName = base.classes.table_name
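Either way, once self.TableName is set, usage is the usual session query (a brief sketch):

from sqlalchemy.orm import sessionmaker

Session = sessionmaker(bind=self.engine)
session = Session()
first_row = session.query(self.TableName).first()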
For anyone using declarative table class definitions, assuming a Base object defined as e.g.
Base = declarative_base(bind=engine)
metadata_pickle_filename = "ModelClasses_trilliandb_trillian.pickle"

# ------------------------------------------
# Load the cached metadata if it's available
# ------------------------------------------
# NOTE: delete the cached file if the database schema changes!!
cache_path = os.path.join(os.path.expanduser("~"), ".sqlalchemy_cache")
cached_metadata = None
if os.path.exists(cache_path):
    try:
        with open(os.path.join(cache_path, metadata_pickle_filename), 'rb') as cache_file:
            cached_metadata = pickle.load(file=cache_file)
    except IOError:
        # cache file not found - no problem
        pass

# ------------------------------------------
# define all tables
#
class MyTable(Base):
    if cached_metadata:
        __table__ = cached_metadata.tables['my_schema.my_table']
    else:
        __tablename__ = 'my_table'
        __table_args__ = {'autoload': True, 'schema': 'my_schema'}

...

# ----------------------------------------
# If no cached metadata was found, save it
# ----------------------------------------
if cached_metadata is None:
    # cache the metadata for future loading
    # - MUST DELETE IF THE DATABASE SCHEMA HAS CHANGED
    try:
        if not os.path.exists(cache_path):
            os.makedirs(cache_path)
        # make sure to open in binary mode - we're writing bytes, not str
        with open(os.path.join(cache_path, metadata_pickle_filename), 'wb') as cache_file:
            pickle.dump(Base.metadata, cache_file)
    except:
        # couldn't write the file for some reason
        pass
Important note!! If the database schema changes, you must delete the cached file to force the code to autoload and create a new cache. If you don't, the changes will not be reflected in the code. It's an easy thing to forget.
The answer to your first question is largely subjective. You are adding database queries to fetch the reflection metadata to the application load time. Whether or not that overhead is significant depends on your project requirements.
For reference, I have an internal tool at work that uses a reflection pattern because the load time is acceptable for our team. That might not be the case if it were an externally facing product. My hunch is that for most applications the reflection overhead will not dominate the total application load time.
If you decide it is significant for your purposes, this question has an interesting answer where the user pickles the database metadata in order to locally cache it.
Adding to this: what @Demitri answered is close to correct, but (at least in SQLAlchemy 1.4.29) the example will fail on the last line self.TableName = base.classes.table_name when generating from the cached file. In this case declarative_base() has no attribute classes.
The fix is as simple as altering:
if cached_metadata:
    base = declarative_base(bind=self.engine, metadata=cached_metadata)
else:
    base = automap_base()
    base.prepare(self.engine, reflect=True)  # reflect the tables
to
if cached_metadata:
    base = automap_base(declarative_base(bind=self.engine, metadata=cached_metadata))
    base.prepare()
else:
    base = automap_base()
    base.prepare(self.engine, reflect=True)  # reflect the tables
This will create your automap object with the proper attributes.
Is there a way to mark tables as read-only when using Pyramid with SQLAlchemy?
Some brief background: I am working on an app with an existing database which is very complicated. My app adds a few new tables to this database, but updating (update, insert, delete) other tables is unsafe because the schema is not well understood (and isn't documented).
My basic aim is to fail-fast before any unsafe db updates so the offending code can be found and removed, so I and future developers don't cause any data issues with this app.
My naive approach: add a pyramid_tm commit veto hook that checks the objects to be updated (new, dirty, and deleted in the db session) and vetoes the commit if any of them belong to read-only tables.
Is there an existing mechanism to handle this? Or a better approach?
If you're using the "declarative" approach, you can try making the classes themselves "read-only":
class MyBase(object):
    """
    This class is a superclass of the SA-generated Base class,
    which in turn is the superclass of all db-aware classes,
    so we can define common functions here
    """
    def __setattr__(self, name, value):
        """
        Raise an exception if attempting to assign to an attribute of a "read-only" object.
        Transient attributes need to be prefixed with "_t_"
        """
        if (getattr(self, '__read_only__', False)
                and name != "_sa_instance_state"
                and not name.startswith("_t_")):
            raise ValueError("Trying to assign to %s of a read-only object %s" % (name, self))
        super(MyBase, self).__setattr__(name, value)


Base = declarative_base(cls=MyBase)
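A quick sketch of how a model class would then opt in, and what happens on assignment (hypothetical table, column, and session names):

from sqlalchemy import Column, Integer, String

class LegacyThing(Base):        # hypothetical existing table that must not be modified
    __tablename__ = 'legacy_thing'
    __read_only__ = True

    id = Column(Integer, primary_key=True)
    label = Column(String(50))

thing = session.query(LegacyThing).first()
thing.label = 'oops'   # raises ValueError: Trying to assign to label of a read-only object ...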
Similarly, you can try to override the __init__ method to prevent the objects from being instantiated.
However, solving this problem at the application level feels a bit fragile - I would just look at creating a special user in the database for your app to connect with, which would have only limited permissions on the tables you want to keep intact.
This is part of a project that involves working with TG2 against two databases; the one this model uses is MSSQL. Since the table I need to read from and write to is created and managed by a different application, I don't want TurboGears to overwrite or change the table - just work with the existing one - so I use SQLAlchemy's magical 'autoload' reflection (I also don't know every detail of this table's configuration in the MSSQL db).
Some of the reflection is done in model.__init__.py and not in the class (as some SQLAlchemy tutorials suggest) because of TG2's inner workings.
This is the error message I get (the table name in the db is SOMETABLE and in my app the class is Activities):
sqlalchemy.exc.ArgumentError: Mapper Mapper|Activities|SOMETABLE could not
assemble any primary key columns for mapped table 'SOMETABLE'
This is the Activities class:
class Activities(DeclarativeBase2):
    __tablename__ = 'SOMETABLE'
    # tried the classic way, which I used in another place without tg, but it didn't work here -
    # the reflection should be outside the class
    #__table_args__ = {'autoload': True,
    #                  'autoload_with': engine2
    #                  }

    def __init__(self, **kw):
        for k, v in kw.items():
            setattr(self, k, v)
And this is the init_model method in model.__init__.py (where the reflection is called):
def init_model(engine1, engine2):
    """Call me before using any of the tables or classes in the model."""
    DBSession.configure(bind=engine1)
    DBSession2.configure(bind=engine2)
    # If you are using reflection to introspect your database and create
    # table objects for you, your tables must be defined and mapped inside
    # the init_model function, so that the engine is available if you
    # use the model outside tg2, you need to make sure this is called before
    # you use the model.
    #
    # See the following example:
    metadata.bind = engine1
    metadata2.bind = engine2
    #metadata2 = MetaData(engine2)

    global t_reflected

    t_reflected = Table("SOMETABLE", metadata2,
                        autoload=True, autoload_with=engine2)

    mapper(Activities, t_reflected)
So I think I need to tell SQLAlchemy what the primary key is - but how do I do that while using reflection (I know which field is the primary key)?
EDIT - the working solution:
def init_model(engine1, engine2):
    """Call me before using any of the tables or classes in the model."""
    DBSession.configure(bind=engine1)
    DBSession2.configure(bind=engine2)
    # If you are using reflection to introspect your database and create
    # table objects for you, your tables must be defined and mapped inside
    # the init_model function, so that the engine is available if you
    # use the model outside tg2, you need to make sure this is called before
    # you use the model.
    #
    # See the following example:
    metadata.bind = engine1
    metadata2.bind = engine2
    #metadata2 = MetaData(engine2)

    global t_reflected

    t_reflected = Table("SOMETABLE", metadata2,
                        Column('EVENTCODE', String, primary_key=True),
                        autoload=True, autoload_with=engine2)  # adding the primary key column here didn't work

    # notice the needed non_primary - for some reason I can't do the whole mapping in one
    # file and I have to do part in init_model and part in the model - quite annoying
    mapper(Activities, t_reflected, non_primary=True)
Also, in the model I had to add the primary key column, making it:
class Activities(DeclarativeBase2):
    __tablename__ = 'SOMETABLE'
    # tried the classic way, which I used in another place without tg, but it didn't work here -
    # the reflection should be outside the class
    EVENTCODE = Column(String, primary_key=True)  # needed because the reflection couldn't find the primary key
Of course I also had to add various imports in model.__init__.py to make this work.
The strange thing is that it complained about not finding a primary key before it even connected to the db, while a standalone SQLAlchemy class (without TG2) doing the same thing didn't complain at all. Makes you wonder.
You can mix-and-match: see Overriding Reflected Columns in the documentation. In your case the code would look similar to this:
t_reflected = Table("SOMETABLE", metadata2,
                    Column('id', Integer, primary_key=True),  # override reflected '???' column to have primary key
                    autoload=True, autoload_with=engine2)
edit-1: Model version: I also think that a declarative-only version should work, in which case you should not define the table t_reflected and should not map it manually using mapper(...), because declarative classes are automatically mapped:
class Activities(DeclarativeBase2):
    __table__ = Table('SOMETABLE', metadata2,
                      Column('EVENTCODE', Unicode, primary_key=True),
                      autoload=True, autoload_with=engine2,
                      )

    def __init__(self, **kw):
        for k, v in kw.items():
            setattr(self, k, v)