Apache Airflow: How to template Volumes in DockerOperator using Jinja Templating - jinja2

I want to pass a list of volumes into DockerOperator using a Jinja template.
Hard-coded volumes work fine:
volumes=['first:/dest', 'second:/sec_destination']
However, the following Jinja template does not work:
volumes=[f"{{{{ ti.xcom_pull(task_ids='my_task', key='dockerVolumes') }}}}"]
It fails with:
500 Server Error: Internal Server Error ("invalid mode: /sec_destination')")
I found a workaround that is acceptable for me but not perfect: it only works when the list always has exactly two elements:
volumes=[f"{{{{ ti.xcom_pull(task_ids='my_task', key='dockerVolumes')[0] }}}}", f"{{{{ ti.xcom_pull(task_ids='my_task', key='dockerVolumes')[1] }}}}"]

For anyone who is using Airflow >= 2.0.0:
The volumes parameter was deprecated in favor of mounts, which is a list of docker.types.Mount. Fortunately, Airflow evaluates templates recursively, which means that every object with template_fields that is a value of any field in template_fields of the parent object will be evaluated as well. So in order to evaluate docker.types.Mount fields we need to do two things:
Add mounts to DockerOperator.template_fields
Add template_fields = (<field_name_1>, ..., <field_name_n>) to every docker.types.Mount.
So to template source, target, and type parameters in the DockerOperator subclass you can implement it the following way:
class DockerOperatorExtended(DockerOperator):
    template_fields = (*DockerOperator.template_fields, 'mounts')

    def __init__(self, **kwargs):
        mounts = kwargs.get('mounts', [])
        for mount in mounts:
            mount.template_fields = ('Source', 'Target', 'Type')
        kwargs['mounts'] = mounts
        super().__init__(**kwargs)

In order to provide a value of a field via template, that field must be part of template_fields.
DockerOperator does not have volumes in its template_fields; that is why you cannot set it via Jinja2.
The solution is to extend DockerOperator and include volumes in template_fields.
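A minimal sketch of such a subclass (assuming Airflow 1.x, where DockerOperator still accepts the volumes parameter; the class name is made up):
from airflow.operators.docker_operator import DockerOperator

class TemplatedVolumesDockerOperator(DockerOperator):
    # Behaves exactly like DockerOperator, except that Airflow will now
    # render 'volumes' with Jinja before the operator runs.
    template_fields = (*DockerOperator.template_fields, 'volumes')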

Another solution is to write your own Jinja filter (for splitting the string pulled from XCom) and add it to user_defined_filters in the DAG object initialization.
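For example, a sketch (the filter name and the comma-separated convention are assumptions; the DAG arguments are minimal):
import datetime
from airflow import DAG

def split_volumes(value, sep=','):
    # Turn the string pulled from XCom back into a list of volume specs.
    return [v.strip() for v in value.split(sep)]

dag = DAG(
    dag_id='my_dag',
    start_date=datetime.datetime(2019, 1, 1),
    user_defined_filters={'split_volumes': split_volumes},
)
The filter can then be applied inside the template, e.g. {{ ti.xcom_pull(task_ids='my_task', key='dockerVolumes') | split_volumes }}.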


Multiplayer game using pygame

We are working on a top-down-RPG-like multiplayer game for learning purposes (and fun!) with some friends. We already have some entities in the game and inputs are working, but the network implementation gives us headaches :D
The Issues
When trying to convert with __dict__, some values still contain a pygame.Surface, which I don't want to transfer and which causes errors when trying to JSON-ify them. Other objects that I would like to transfer in a simplified way, like Rectangle, cannot be converted automatically.
Already functional
Client-Server connection
Transferring JSON objects in both directions
Async networking and synchronized putting into a Queue
Situation
A new player connects to the server and wants to get the current game state with all objects.
Data-Structure
We use a "Entity-Component" based architecture, so we separated the game logic very strictly into "systems", while the data is stored in the "components" of each Entity. The Entity is a very simple container and has nothing more than a ID and a list of components. Example Entity (shorten for better readability):
Entity
|-- Component (Moveable)
|-- Component (Graphic)
| |- complex datatypes like pygame.Surface
| `- (...)
`- Component (Inventory)
We tried different approaches, but none of them seems to fit well; they all feel "hacky".
pickle
Very Python-specific, so it would not be easy to implement clients in other languages in the future. And I've read about security risks in creating objects from the network in the dynamic way pickle offers. It does not even solve the Surface/Rectangle issue.
__dict__
Still contains references to the original objects, so a "cleanup" or "filter" for unwanted datatypes also affects the originals. A deepcopy throws an exception:
...\Python\Python36\lib\copy.py", line 169, in deepcopy
rv = reductor(4)
TypeError: can't pickle pygame.Surface objects
Show some code
The method of the "EntityManager" class which should generate the snapshot of all entities, including their components. This snapshot should be converted to JSON without any errors - and, if possible, without much configuration in this core class.
class EntityManager:
    def generate_world_snapshot(self):
        """ Returns a dictionary with all entities and their components to send
        to the client. This function will probably generate a lot of data,
        but it is meant to send the whole current game state when a new player
        connects or when a complete refresh is required """
        # It should be possible to add more objects to the snapshot, so we
        # create our own snapshot data structure
        result = {'entities': {}}
        entities = self.get_all_entities()
        for e in entities:
            result['entities'][e.id] = deepcopy(e.__dict__)
            # Components are objects, but a dictionary is required for transfer
            cmp_obj_list = result['entities'][e.id]['components']
            # Empty the current list of components; it is going to be filled
            # with dictionaries of each component, cleaned for the dump because
            # of the errors when directly converting the whole structure to JSON
            result['entities'][e.id]['components'] = {}
            for cmp in cmp_obj_list:
                cmp_copy = deepcopy(cmp)
                cmp_dict = cmp_copy.__dict__
                # Only list, dict, int, str, float and None will stay, while
                # other types are simply deleted, including their key.
                # Lists and dictionaries are cleaned up recursively as well.
                cmp_dict = self.clean_complex_recursive(cmp_dict)
                result['entities'][e.id]['components'][type(cmp_copy).__name__] \
                    = cmp_dict
        logging.debug("EntityMgr: Entity#3: %s" % result['entities'][3])
        return result
Expectation and actual results
We can find a way to manually override elements which we don't want. But as the list of components grows, we would have to put all the filter logic into this core class, which should not contain any component specializations.
Do we really have to put all the logic into the EntityManager for filtering the right objects? This does not feel good, as I would like to have all conversion to JSON done without any hardcoded configuration.
How can we convert all this complex data in the most generic way?
Thanks for reading this far, and thank you very much for your help in advance!
Interesting articles we were already working through, maybe helpful for others with similar issues:
https://gafferongames.com/post/what_every_programmer_needs_to_know_about_game_networking/
http://code.activestate.com/recipes/408859/
https://docs.python.org/3/library/pickle.html
UPDATE: Solution - thanks to sloth
We used a combination of the following approaches, which works really well so far and is also easy to maintain!
Entity Manager now calls the get_state() function of the entity.
class EntityManager:
    def generate_world_snapshot(self):
        """ Returns a dictionary with all entities and their components to send
        to the client. This function will probably generate a lot of data,
        but it is meant to send the whole current game state when a new player
        connects or when a complete refresh is required """
        # It should be possible to add more objects to the snapshot, so we
        # create our own snapshot data structure
        result = {'entities': {}}
        entities = self.get_all_entities()
        for e in entities:
            result['entities'][e.id] = e.get_state()
        return result
The Entity has only some basic attributes to add to the state and forwards the get_state() call to all the Components:
class Entity:
    def get_state(self):
        state = {'name': self.name, 'id': self.id, 'components': {}}
        for cmp in self.components:
            state['components'][type(cmp).__name__] = cmp.get_state()
        return state
The components themselves now inherit their get_state() method from their new superclass Component, which simply takes care of all simple datatypes:
class Component:
    def __init__(self):
        logging.debug('generic component created')

    def get_state(self):
        state = {}
        for attr, value in self.__dict__.items():
            if value is None or isinstance(value, (str, int, float, bool)):
                state[attr] = value
            elif isinstance(value, (list, dict)):
                # logging.warn("Generating state: not supporting lists yet")
                pass
        return state


class GraphicComponent(Component):
    # (...)
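At this point the snapshot is JSON-serializable end to end. A minimal usage sketch (entity_manager stands in for however the manager instance is obtained):
import json

snapshot = entity_manager.generate_world_snapshot()
payload = json.dumps(snapshot)  # plain datatypes only, so no TypeError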
Now every developer has the opportunity to override this function and create a more detailed get_state() directly in the component classes (like Graphic, Movement, Inventory, etc.) if it is required to save the state in a more accurate way - which is a huge win for maintaining the code in the future, as these code pieces live in one class.
The next step is to implement a static method for creating the instances from the state in the same class. This makes the whole thing work really smoothly.
Thank you so much, sloth, for your help.
Do we really have to put all the logic into the EntityManager for filtering the right objects?
No, you should use polymorphism.
You need a way to represent your game state in a form that can be shared between different systems; so maybe give your components a method that will return all of their state, and a factory method that allows you to create the component instances out of that very state.
(Python already has the __repr__ magic method, but you don't have to use it)
So instead of doing all the filtering in the entity manager, just let it call this new method on all components and let each component decide what the result will look like.
Something like this:
...
result = {'entities': {}}
entities = self.get_all_entities()
for e in entities:
    result['entities'][e.id] = {'components': {}}
    for cmp in e.components:
        result['entities'][e.id]['components'][type(cmp).__name__] = cmp.get_state()
...
And a component could implement it like this:
class GraphicComponent:
    def __init__(self, pos=...):
        self.image = ...
        self.rect = ...
        self.whatever = ...

    def get_state(self):
        return {'pos_x': self.rect.x, 'pos_y': self.rect.y, 'image': 'name_of_image.jpg'}

    @staticmethod
    def from_state(state):
        return GraphicComponent(pos=(state['pos_x'], state['pos_y']), ...)
And a client's EntityManager that receives the state from the server would iterate over the component list of each entity and call from_state to create the instances.
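A sketch of that client-side reconstruction (component_registry is a made-up name for a lookup from class name to component class):
# Map component class names back to the classes (made-up registry)
component_registry = {
    'GraphicComponent': GraphicComponent,
    # ... one entry per component class
}

def rebuild_entities(snapshot):
    entities = {}
    for entity_id, entity_state in snapshot['entities'].items():
        components = [component_registry[name].from_state(cmp_state)
                      for name, cmp_state in entity_state['components'].items()]
        entities[entity_id] = components
    return entities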

Web2Py - generic view for csv?

For web2py there are generic views, e.g. for JSON.
I could not find a sample for CSV.
Looking at the web2py manual, chapters 10.1.2 and 10.1.6, it's written:
'... define a "generic.csv" file, but one would have to specify the name of the object to be serialized ("animals" in the example)'
Looking at the generic pdf view
{{
import os
from gluon.contrib.generics import pdf_from_html
filename = '%s/%s.html' % (request.controller,request.function)
if os.path.exists(os.path.join(request.folder,'views',filename)):
html=response.render(filename)
else:
html=BODY(BEAUTIFY(response._vars))
pass
=pdf_from_html(html)
}}
and also the specified csv (manual chapter 10.1.6):
{{
import cStringIO
stream=cStringIO.StringIO()
animals.export_to_csv_file(stream)
response.headers['Content-Type']='application/vnd.ms-excel'
response.write(stream.getvalue(), escape=False)
}}
Massimo writes: 'web2py does not provide a "generic.csv";'
He is not fully against it, but...
So let's try to get it working and deactivate it when necessary.
The generic view should look similar to the following
(well, we'd better call this pseudocode, as it is not working):
{{
import os
from gluon.contrib.generics export export_to_csv_file(stream)
filename = '%s/%s' % (request.controller,request.function)
if os.path.exists(os.path.join(request.folder,'views',filename)):
csv=response.render(filename)
else:
csv=BODY(BEAUTIFY(response._vars))
pass
= export_to_csv_file(stream)
}}
What's wrong?
Or is there a sample?
Is there a reason not to have a generic.csv?
Adapting the generic.pdf code so literally as above would not work for CSV output, as the generic.pdf code is first executing the standard HTML template and then simply converting the generated HTML to a PDF. This approach does not make sense for CSV, as CSV requires data of a particular structure.
As stated in the documentation:
Notice that one could also define a "generic.csv" file, but one would
have to specify the name of the object to be serialized ("animals" in
the example). This is why we do not provide a "generic.csv" file.
The execution of a view is triggered by a controller action returning a dictionary. The keys of the dictionary become available as variables in the view execution environment (the entire dictionary is also available as response._vars). If you want to create a generic.csv view, you therefore need to establish some conventions about what variables are in the returned dictionary as well as the possible structure(s) of the returned data.
For example, the controller could return something like dict(data=mydata). The code in generic.csv would then access the data variable and could convert it to CSV. In that case, there would have to be some convention about the structure of data -- perhaps it could be required to be a list of dictionaries or a DAL Rows object (or optionally either one).
Another possible convention is for the controller to return something like dict(columns=mycolumns, rows=myrows), where columns is a list of column names and rows is a list of lists containing the data for each row.
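A sketch of a generic.csv following that second convention might look like this (untested; it assumes the controller really returns columns and rows as just described):
{{
import cStringIO, csv
stream = cStringIO.StringIO()
writer = csv.writer(stream)
writer.writerow(columns)
for row in rows:
    writer.writerow(row)
pass
response.headers['Content-Type'] = 'application/vnd.ms-excel'
response.write(stream.getvalue(), escape=False)
}}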
The point is, there is no universal convention for what the controller might return and how that can be converted into CSV, so you first need to decide on some conventions and then write generic.csv accordingly.
For example, here is a very simple generic.csv that would work only if the controller returns dict(rows=myrows), where myrows is a DAL Rows object:
{{
import cStringIO
stream=cStringIO.StringIO()
rows.export_to_csv_file(stream)
response.headers['Content-Type']='application/vnd.ms-excel'
response.write(stream.getvalue(), escape=False)
}}
I tried:
# Sample from web2py manual 10.1.1, page 464
def count():
    session.counter = (session.counter or 0) + 1
    return dict(counter=session.counter, now=request.now)

# and my own creation from a SQL table (if possible used for json and csv):
def csv_rt_bat_c_x():
    battdat = db().select(db.csv_rt_bat_c.rec_time, db.csv_rt_bat_c.cellnr,
                          db.csv_rt_bat_c.volt_act, db.csv_rt_bat_c.id).as_list()
    return dict(battdat=battdat)
Both times I get an error when trying CSV. It works for /default/count.json but not for /default/count.csv.
I suppose the requirement:
dict(rows=myrows)
"where myrows is a DAL Rows object" is not met.

textX: How to generate object names with ObjectProcessors?

I have a simple example model where I would like to generate names for the objects of the Position rule that were not given a name via as <NAME>. This is needed so that I can find them later with the built-in FQN scope provider.
My idea would be to do this in the position_name_generator object processor, but that will only be called after the whole model is parsed. I don't really understand the reason for that, since by the time I would need a Position object in the Project, the objects are already created, yet the object processor will still not have been called.
Another idea would be to do this in a custom scope provider for Position.location, which would then first do the name generation and then use the built-in FQN to find the Location object. Although this would work, I consider it hacky and I would prefer to avoid it.
What would be the textX way of solving this issue?
(Please take into account that this is only a small example. In reality, similar functionality is required for a rather big and complex model. Changing this behaviour with the generated names is not possible, since it is a requirement.)
import textx
MyLanguage = """
Model
: (locations+=Location)*
(employees+=Employee)*
(positions+=Position)*
(projects+=Project)*
;
Project
: 'project' name=ID
('{'
('use' use=[Position])*
'}')?
;
Position
: 'define' 'position' employee=[Employee|FQN] '->' location=[Location|FQN] ('as' name=ID)?
;
Employee
: 'employee' name=ID
;
Location
: 'location' name=ID
( '{'
(sub_location+=Location)+
'}')?
;
FQN
: ID('.' ID)*
;
Comment:
/\/\/.*$/
;
"""
MyCode = """
location Building
{
location Entrance
location Exit
}
employee Hans
employee Juergen
// Shall be referred to with the given name: "EntranceGuy"
define position Hans->Building.Entrance as EntranceGuy
// Shall be referred to with the autogenerated name: <Employee>"At"<LastLocation>
define position Juergen->Building.Exit
project SecurityProject
{
use EntranceGuy
use JuergenAtExit
}
"""
def position_name_generator(obj):
    if "" == obj.name:
        obj.name = obj.employee.name + "At" + obj.location.name


def main():
    meta_model = textx.metamodel_from_str(MyLanguage)
    meta_model.register_scope_providers({
        "Position.location": textx.scoping.providers.FQN(),
    })
    meta_model.register_obj_processors({
        "Position": position_name_generator,
    })
    model = meta_model.model_from_str(MyCode)
    assert model, "Could not create model..."


if "__main__" == __name__:
    main()
What is the textx way to solve this...
The use case you describe is to define the name of an object based on other model elements, including references to other model elements. This is currently not covered by any of the tests and use cases included in our test suite or the textX docs.
Object processors are executed at defined stages during model construction (see http://textx.github.io/textX/stable/scoping/#using-the-scope-provider-to-modify-a-model). In the described setup they are executed after reference resolution. Since the name to be defined/deduced is itself required for reference resolution, object processors cannot be used here (even if we allowed controlling when object processors are executed, before or after scope resolution, the described setup still would not work).
Given the dynamics of model loading (see the same link), the solution is located within a scope provider (as you suggested). Here, we allow control over the order of reference resolution, such that references to the object being named by a custom procedure are postponed until the references required to deduce/define the name are resolved.
Possible workaround
A preliminary sketch of how your use case can be solved is discussed in https://github.com/textX/textX/pull/194 (with the attached issue https://github.com/textX/textX/issues/193). This textX PR contains a version of scoping.py you could probably use for your project (just copy and rename the module). A full-fledged solution could be part of textX TEP-001, where we plan to make scoping more controllable for the end user.
Playing around with this absolutely interesting issue revealed new aspects of the textX framework to me:
Names can depend on model contents (involving unresolved references). This name resolution can itself be postponed (see the referenced PR) in terms of our reference resolution logic.
Even more interesting are the consequences of that: what happens to references pointing to locations where unresolved names are found? Here, we must postpone the reference resolution process, because we cannot know if the name might match once resolved...
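To illustrate the idea, a rough, hypothetical sketch using the released textx.scoping.Postponed mechanism (not the actual code from the PR; it also assumes unresolved references still show up as None at this point):
import textx
from textx.scoping import Postponed
from textx.scoping.providers import FQN

def project_use_scope(obj, attr, obj_ref):
    # Before resolving a Project.use reference, try to generate the
    # missing Position names from their employee/location references.
    model = textx.get_model(obj)
    for pos in model.positions:
        if not pos.name:
            if pos.employee is None or pos.location is None:
                # The references needed to build the name are not
                # resolved yet -- ask textX to retry this one later.
                return Postponed()
            pos.name = pos.employee.name + "At" + pos.location.name
    return FQN()(obj, attr, obj_ref)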
Your example is included: https://github.com/textX/textX/blob/analysis/issue193/tests/functional/test_scoping/test_name_resolver/test_issue193_auto_name.py

Symfony4 Extension->processConfiguration merges Parameters strangely

I have created a Symfony Bundle -- using the DependencyInjection-mechanism that Symfony4 provides.
When my custom Extension is initialized, the provided parameters from my_extension.yaml get piped into my "load" method as expected, as described in the Symfony docs (https://symfony.com/doc/current/bundles/configuration.html).
However, if I add an additional yaml file in the designated package scope (e.g. config/packages/dev/my_extension.yaml), these parameters end up being merged by the built-in processConfiguration() method in a strange way:
Whereas the first config is parsed correctly and all parameter keys are kept intact, the second file is not merged as expected: all the contained values get transformed into numeric array keys, i.e. the original parameter keys get lost on the way.
Example:
Contents of config/packages/my_extension.yaml
my_extension:
    parameters:
        some_attribute: "original_value1"
        some_other_attribute: "original_value2"
Contents of config/packages/dev/my_extension.yaml
my_extension:
    parameters:
        some_attribute: "new_value1"
        specific_attribute: "new_value3"
Results in a merged configuration array that looks like this
parameters:
    some_attribute: "new_value1"
    some_other_attribute: "new_value2"
    0: "new_value1"
    1: "new_value2"
    2: "new_value3"
while I would expect the resulting configuration to be
parameters:
    some_attribute: "new_value1"
    some_other_attribute: "original_value2"
    specific_attribute: "new_value3"
The last (correct) result is what I get if I manually merge the configs in the "load"-method of my extension like this:
$mergedConfig = [];
foreach ($configs as $config) {
    $mergedConfig = array_replace_recursive($mergedConfig, $config);
}
$config = $this->processConfiguration($configuration, [$mergedConfig]);
However, why can't I rely on the built-in merging strategy that Symfony4 provides for this scenario? Is this a bug, or did I get something wrong about how Symfony is supposed to merge parameters from different config sources?

accessing a variable outside a RequestHandler

I'm using a CSS3 accordion effect, and I want to detect whether a hacker has made a script to send a parallel request; i.e.:
I have a login form and a registration form on the same page, but only one is visible because of CSS3; to access the page, the user agent must be HTML5 compatible.
The trick I use is:
class Register(tornado.web.RequestHandler):
    def post(self):
        tt = self.get_argument("_xsrf") + str(time.time())
        rtime = float(tt.replace(self.get_argument("_xsrf"), ""))
        print rtime

class LoginHandler(BaseHandler):
    def post(self):
        tt = self.get_argument("_xsrf") + str(time.time())
        ltime = float(tt.replace(self.get_argument("_xsrf"), ""))
        print ltime
I've used the _xsrf variable because it's unique for every user, to avoid making the server think that the request is coming from the same machine.
Now what I want: how do I compute the difference between the time values, abs(ltime - rtime)? That is, how do I access rtime outside the class? I only know how to access the value outside the method. I want to do this to detect whether the difference is small, which would mean the user is using a script to make parallel requests to kill the server!
In other words (for general Python users), if I have:
class Product:
    def info(self):
        self.price = 1000

    def show(self):
        print self.price

>>> car = Product()
>>> car.info()
>>> car.show()
1000
but what if I have another:
class User:
    pass
then how do I make a method that prints self.price? I've tried inheritance, but got the error: AttributeError: User instance has no attribute 'price'. So are only methods inherited, not attributes?
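Attributes live on instances, not on classes, and price only exists after info() has run; so rather than using inheritance, pass the instance to whoever needs it. A minimal sketch:
class User:
    def show_price(self, product):
        # The attribute belongs to the Product instance we are given.
        print(product.price)

car = Product()
car.info()              # creates car.price
User().show_price(car)  # prints 1000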
It sounds like you need to understand Model objects and patterns that use persistent storage of data. tornado.web.RequestHandler and any object that you subclass from it only exists for the duration of your request - from when the URL is received on the server to when data is sent back to the browser via self.write() or self.finish().
I would recommend you look at some of the Django or Flask tutorials for some basic ideas of how to build an MVC application in Python (there are no Tornado tutorials that cover this that I know of).
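If all you need is to compare the two timestamps across handlers, the state has to live outside the handlers, e.g. in a module-level dict shared by both. A minimal sketch (last_post_times and the 0.5-second threshold are made-up names/values; a real app would use a proper datastore):
import time
import tornado.web

last_post_times = {}      # _xsrf token -> time of last form POST
SUSPICIOUS_WINDOW = 0.5   # seconds; threshold chosen arbitrarily

class Register(tornado.web.RequestHandler):
    def post(self):
        token = self.get_argument("_xsrf")
        now = time.time()
        previous = last_post_times.get(token)
        if previous is not None and abs(now - previous) < SUSPICIOUS_WINDOW:
            # Two form posts from the same token almost simultaneously:
            # likely a script, not a human.
            raise tornado.web.HTTPError(403)
        last_post_times[token] = now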