Symfony4 Extension->processConfiguration merges parameters strangely

I have created a Symfony bundle using the DependencyInjection mechanism that Symfony4 provides.
When my custom Extension is initialized, the parameters from my_extension.yaml are passed into its load() method as expected, as described in the Symfony docs ( https://symfony.com/doc/current/bundles/configuration.html ).
However, if I add an additional YAML file in the environment-specific package scope (e.g. config/packages/dev/my_extension.yaml), the built-in processConfiguration() method merges these parameters in a strange way:
the first config is parsed correctly and all parameter keys are kept intact, but the second file is not merged as expected; all of its values end up under numeric array keys, i.e. the original parameter keys are lost along the way.
Example:
Contents of config/packages/my_extension.yaml
my_extension:
    parameters:
        some_attribute: "original_value1"
        some_other_attribute: "original_value2"
Contents of config/packages/dev/my_extension.yaml
my_extension:
    parameters:
        some_attribute: "new_value1"
        specific_attribute: "new_value3"
Results in a merged configuration array that looks like this
parameters:
    some_attribute: "original_value1"
    some_other_attribute: "original_value2"
    0: "new_value1"
    1: "new_value3"
while I would expect the resulting configuration to be
parameters:
    some_attribute: "new_value1"
    some_other_attribute: "original_value2"
    specific_attribute: "new_value3"
The last (correct) result is what I get if I manually merge the configs in the "load"-method of my extension like this:
$mergedConfig = [];
foreach ($configs as $config) {
    $mergedConfig = array_replace_recursive($mergedConfig, $config);
}
$config = $this->processConfiguration($configuration, [$mergedConfig]);
However, why can't I rely on the built-in merging strategy that Symfony4 provides for this scenario? Is this a bug, or did I get something wrong about how Symfony is supposed to merge parameters from different config sources?
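For what it's worth, appending under numeric keys is what the Config component does when a prototyped array node has no key attribute: without useAttributeAsKey(), each additional config source is appended rather than merged by key. Below is a minimal Configuration sketch of the usual fix (assuming Symfony 4.2+; the class and node names are illustrative):

namespace MyExtension\DependencyInjection;

use Symfony\Component\Config\Definition\Builder\TreeBuilder;
use Symfony\Component\Config\Definition\ConfigurationInterface;

class Configuration implements ConfigurationInterface
{
    public function getConfigTreeBuilder()
    {
        $treeBuilder = new TreeBuilder('my_extension');

        $treeBuilder->getRootNode()
            ->children()
                ->arrayNode('parameters')
                    // Merge entries by their key instead of appending them
                    // under numeric indices when several config files are merged.
                    ->useAttributeAsKey('name')
                    ->scalarPrototype()->end()
                ->end()
            ->end();

        return $treeBuilder;
    }
}

With a node definition like this, processConfiguration() should produce the expected deep merge without the manual array_replace_recursive() workaround.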


Web2Py - generic view for csv?

For web2py there are generic views, e.g. for JSON.
I could not find a sample for CSV.
Looking at the web2py manual, sections 10.1.2 and 10.1.6, it says:
'... define a "generic.csv" file, but one would have to specify the name of the object to be serialized ("animals" in the example)'
Looking at the generic.pdf view:
{{
import os
from gluon.contrib.generics import pdf_from_html
filename = '%s/%s.html' % (request.controller,request.function)
if os.path.exists(os.path.join(request.folder,'views',filename)):
html=response.render(filename)
else:
html=BODY(BEAUTIFY(response._vars))
pass
=pdf_from_html(html)
}}
and also the CSV snippet specified in the manual (chapter 10.1.6):
{{
import cStringIO
stream=cStringIO.StringIO()
animals.export_to_csv_file(stream)
response.headers['Content-Type']='application/vnd.ms-excel'
response.write(stream.getvalue(), escape=False)
}}
Massimo writes: 'web2py does not provide a "generic.csv";'
He is not entirely against it, but...
So let's try to get one, and deactivate it when necessary.
The generic view should look similar to the following non-working attempt (better called pseudocode, since it does not run):
{{
import os
from gluon.contrib.generics export export_to_csv_file(stream)
filename = '%s/%s' % (request.controller,request.function)
if os.path.exists(os.path.join(request.folder,'views',filename)):
csv=response.render(filename)
else:
csv=BODY(BEAUTIFY(response._vars))
pass
= export_to_csv_file(stream)
}}
What's wrong?
Or is there a sample?
Is there a reason not to have a generic.csv?
Adapting the generic.pdf code so literally as above would not work for CSV output, as the generic.pdf code is first executing the standard HTML template and then simply converting the generated HTML to a PDF. This approach does not make sense for CSV, as CSV requires data of a particular structure.
As stated in the documentation:
Notice that one could also define a "generic.csv" file, but one would
have to specify the name of the object to be serialized ("animals" in
the example). This is why we do not provide a "generic.csv" file.
The execution of a view is triggered by a controller action returning a dictionary. The keys of the dictionary become available as variables in the view execution environment (the entire dictionary is also available as response._vars). If you want to create a generic.csv view, you therefore need to establish some conventions about what variables are in the returned dictionary as well as the possible structure(s) of the returned data.
For example, the controller could return something like dict(data=mydata). The code in generic.csv would then access the data variable and could convert it to CSV. In that case, there would have to be some convention about the structure of data -- perhaps it could be required to be a list of dictionaries or a DAL Rows object (or optionally either one).
Another possible convention is for the controller to return something like dict(columns=mycolumns, rows=myrows), where columns is a list of column names and rows is a list of lists containing the data for each row.
The point is, there is no universal convention for what the controller might return and how that can be converted into CSV, so you first need to decide on some conventions and then write generic.csv accordingly.
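For the second convention, a hypothetical generic.csv could look like the sketch below (untested; it assumes the controller returns dict(columns=mycolumns, rows=myrows) and uses the same pass-delimited style as generic.pdf):
{{
import csv
import cStringIO
# 'columns' and 'rows' are assumed to come from the controller's dict
stream = cStringIO.StringIO()
writer = csv.writer(stream)
writer.writerow(columns)
for row in rows:
writer.writerow(row)
pass
response.headers['Content-Type'] = 'application/vnd.ms-excel'
response.write(stream.getvalue(), escape=False)
}}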
For example, here is a very simple generic.csv that would work only if the controller returns dict(rows=myrows), where myrows is a DAL Rows object:
{{
import cStringIO
stream=cStringIO.StringIO()
rows.export_to_csv_file(stream)
response.headers['Content-Type']='application/vnd.ms-excel'
response.write(stream.getvalue(), escape=False)
}}
I tried:
# Sample from Web2Py manual 10.1.1, page 464
def count():
    session.counter = (session.counter or 0) + 1
    return dict(counter=session.counter, now=request.now)
# and my own creation from a SQL table (if possible used for JSON and CSV):
def csv_rt_bat_c_x():
    battdat = db().select(db.csv_rt_bat_c.rec_time, db.csv_rt_bat_c.cellnr,
                          db.csv_rt_bat_c.volt_act, db.csv_rt_bat_c.id).as_list()
    return dict(battdat=battdat)
Both times I get an error when trying CSV. It works for /default/count.json but not for /default/count.csv.
I suppose the requirement:
dict(rows=myrows)
"where myrows is a DAL Rows object" is not met.

Saving JSON data to DB in Zotonic

I'm trying to write a small app that retrieves a JSON file (it contains a list of items, which all have some properties), saves its contents to the DB and then displays some of it later on. I have Zotonic up and running, and generating some HTML is no problem.
At the moment I'm stuck trying to figure out how to define a custom resource and how to get the data from the JSON into the DB. Once the data is there I should be fine; that part seems well covered by the documentation.
I wrote some standalone erlang scripts that fetch the data and I noticed that Zotonic has a library for decoding JSON so that part should be fine. Any tips on where to put which code or where to look further?
The z_db module allows for creating custom tables by using:
z_db:create_table(Table, Cols, Context).
The Table variable is your table name which can be either an atom or a list containing a single atom.
The Cols is a list of column definitions, which are defined by records. Currently the record definition (you can find this in include/zotonic.hrl) is:
-record(column_def, {name, type, length, is_nullable=true, default, primary_key}).
See Erlang docs on records for more info on records
Example code which I put in users/sites/[sitename]/models/m_[sitename].erl:
%% ?table is a macro holding the table name, e.g. -define(table, mytablename).
init(Context) ->
    case z_db:table_exists(?table, Context) of
        false ->
            z_db:create_table(?table,
                [
                    #column_def{name=id, type="serial"},
                    #column_def{name=gid, type="integer", is_nullable=false},
                    #column_def{name=magnitude, type="real"},
                    #column_def{name=depth, type="real"},
                    #column_def{name=location, type="character varying"},
                    #column_def{name=time, type="integer"},
                    #column_def{name=date, type="integer"}
                ], Context);
        true -> ok
    end,
    ok.
Pay attention to which options of the record you specify. Most of the errors I got came from, e.g., specifying a length on the integer fields.
The models/m_sitename:init/1 function does not get called on site start, but sitename:init/1 does, so I call the model's init from there to ensure the table exists. Example:
init(Context) ->
m_sitename:init(Context).
It is called by Zotonic with the Context variable of the site automatically. You can get this variable manually as well with z:c(sitename). So if you call m_sitename:init/1 from somewhere else you would do:
m_sitename:init(z:c(sitename)).
Next, insertion in the DB can be done with:
z_db:insert(Table, PropList, Context).
Where Table is again an atom or a list containing a single atom representing the table name. Context is the same as above.
PropList is a property list: a list of two-element tuples where the first element is an atom and the second is its associated value/property. Example:
PropList = [
    {row, Value},
    {anotherrow, AnotherValue}
].
Table = tablename.
Context = z:c(sitename).
z_db:insert(Table, PropList, Context).
See Erlang docs on Property Lists for more info on property lists.
=== The dependencies have been updated so if you build from source the step directly below is no longer needed ===
The JSON part is a bit more tricky. Included with Zotonic are mochijson2 and, as a secondary dependency, jiffy. The latest version of jiffy contains jiffy:decode/2, which allows you to specify maps as a return type. Much more readable than the standard {struct, {struct, <<"">>}} monster. To update to the latest version, edit the line in deps/twerl/rebar.config that says
{jiffy, ".*", {git, "https://github.com/davisp/jiffy.git", {tag, "0.8.3"}}},
to
{jiffy, ".*", {git, "https://github.com/davisp/jiffy.git", {tag, "0.14.3"}}},
Now run z:m(). in the Zotonic shell (you must do this after every change to your code).
Then check in the Zotonic shell that jiffy:decode/2 is available by typing jiffy: followed by <tab>; it will show a list of available functions and their arities.
To retrieve a JSON file from the internet run:
{ok, {{_, 200, _}, _, Body}} = httpc:request(get, {"url-to-JSON-here", []}, [], [])
Which will yield the variable Body with the contents. See Erlang docs on http client for more info on this call.
Next convert the contents of Body to Erlang terms with:
JsonData = jiffy:decode(Body, [return_maps]).
What you have to do next depends a lot on the structure of your JSON resource. Keep in mind that everything is now in binary UTF-8 encoded strings! If you print JsonData to the screen (just enter JsonData. in your Zotonic/Erlang shell) you will see a lot of #{<<"key">> => <<"value">>} structures.
My data was nested so I had to extract the needed data like this:
[{_, ItemList} | _] = JsonData.
This gave me a list of maps, and in order to deal with them as individual items I used the following function:
get_maps([]) ->
    done;
get_maps([First | Rest]) ->
    Map = maps:get(<<"properties">>, First),
    case is_map(Map) of
        true ->
            map_to_proplist(Map),
            get_maps(Rest);
        false -> done
    end,
    done;
get_maps(_) ->
    done.
As you might remember, the z_db:insert/3 function needs a property list to populate rows; that is what the call to map_to_proplist/1 is for. How this function looks depends entirely on how your data looks, but as an example here is what worked for me:
map_to_proplist(Map) ->
    case is_map(Map) of
        true ->
            {Value1, _} = string:to_integer(binary_to_list(maps:get(<<"key1">>, Map))),
            {Value2, _} = string:to_float(binary_to_list(maps:get(<<"key2">>, Map))),
            {Value3, _} = string:to_float(binary_to_list(maps:get(<<"key3">>, Map))),
            Value4 = binary_to_list(maps:get(<<"key4">>, Map)),
            {Value5, _} = string:to_integer(binary_to_list(maps:get(<<"key5">>, Map))),
            {Value6, _} = string:to_integer(binary_to_list(maps:get(<<"key6">>, Map))),
            PropList = [{rowname1, Value1}, {rowname2, Value2}, {rowname3, Value3},
                        {rowname4, Value4}, {rowname5, Value5}, {rowname6, Value6}],
            m_sitename:insert_items(PropList, z:c(sitename)),
            ok;
        false ->
            ok
    end.
See the documentation on string:to_integer/1 and string:to_float/1 as to why the tuples are needed when converting. The call to m_sitename:insert_items(PropList, z:c(sitename)) calls z_db:insert/3 in models/m_sitename.erl, but wrapped in a catch:
insert_items(PropList, Context) ->
    (catch z_db:insert(?table, PropList, Context)).
Ok, quite a long post but this should get you up and running if you were looking for this answer.
The above was done with Zotonic 0.13.2 on Erlang/OTP 18.
A repost (except the JSON part) of my post in the Zotonic Developers group.

How can I set an expression to the FileSpec property on Foreach File enumerator?

I'm trying to create an SSIS package to process files from a directory that contains many years worth of files. The files are all named numerically, so to save processing everything, I want to pass SSIS a minimum number, and only enumerate files whose name (converted to a number) is higher than my minimum.
I've tried letting the ForEach File loop enumerate everything and then exclude files in a Script Task, but when dealing with hundreds of thousands of files, this is way too slow to be suitable.
The FileSpec property lets you specify a file mask to dictate which files you want in the collection, but I can't quite see how to specify an expression to make that work, as it's essentially a string match.
If there's an expression within the component somewhere which basically says Should I Enumerate? - Yes / No, that would be perfect. I've been experimenting with the below expression, but can't find a property to which to apply it.
(DT_I4)REPLACE(SUBSTRING(@[User::ActiveFilePath], FINDSTRING(@[User::ActiveFilePath], "\\", 7) + 1, 100), ".txt", "") > @[User::MinIndexId] ? "True" : "False"
Here is one way you can achieve this: use an Expression Task combined with a Foreach Loop Container to match the numerical values of the file names. Here is an example that illustrates how to do this; the sample uses SSIS 2012.
This may not be very efficient, but it is one way of doing it.
Let's assume there is a folder with a bunch of files named in the format YYYYMMDD. The folder contains files for the first day of every month since 1921, like 19210101, 19210201, 19210301, ... all the way up to the current month, 20121101. That adds up to 1,103 files.
Let's say the requirement is to loop only through the files that were created since June 1948. That would mean the SSIS package has to loop through only the files greater than 19480601.
On the SSIS package, create the following three parameters. It is better to use parameters for these values because they can be configured per environment.
ExtensionToMatch - This parameter of String data type contains the extension that the package has to loop through. It supplements the FileSpec variable that will be used on the Foreach Loop container.
FolderToEnumerate - This parameter of String data type stores the folder path that contains the files to loop through.
MinIndexId - This parameter of Int32 data type contains the minimum numerical value above which the files should match the pattern.
Create the following four variables that will help us loop through the files.
ActiveFilePath - This variable of String data type will hold the file name as the Foreach Loop container loops through each file in the folder. This variable is used in the expression of another variable. To avoid error, set it to a non-empty value, say 1.
FileCount - This is a dummy variable of Int32 data type will be used for this sample to illustrate the number of files that the Foreach Loop container will loop through.
FileSpec - This variable of String data type will hold the file pattern to loop through. Set the expression of this variable to the value below. The expression uses the extension specified in the parameters; if no extension is given, it falls back to *.* to loop through all files.
"*" + (#[$Package::ExtensionToMatch] == "" ? ".*" : #[$Package::ExtensionToMatch])
ProcessThisFile - This variable of Boolean data type will evaluate whether a particular file matches the criteria or not.
Configure the package as shown below. Foreach loop container will loop through all the files matching the pattern specified on the FileSpec variable. An expression specified on the Expression Task will evaluate during runtime and will populate the variable ProcessThisFile. The variable will then be used on the Precedence constraint to determine whether to process the file or not.
The script task within the Foreach loop container will increment the counter of variable FileCount by 1 for each file that successfully matches the expression.
The script task outside the Foreach loop will simply display how many files were looped through by the Foreach loop container.
Configure the Foreach loop container to loop through the folder using the parameter and the files using the variable.
Store the file name in variable ActiveFilePath as the loop passes through each file.
On the Expression Task, set the expression to the following value. The expression converts the file name without the extension to a number and then checks whether it is greater than the value given in the parameter MinIndexId.
@[User::ProcessThisFile] = (DT_BOOL)((DT_I4)(REPLACE(@[User::ActiveFilePath], @[User::FileSpec], "")) > @[$Package::MinIndexId] ? 1 : 0)
Right-click on the Precedence constraint and configure it to use the variable ProcessThisFile on the expression. This tells the package to process the file only if it matches the condition set on the expression task.
@[User::ProcessThisFile]
On the first Script Task, I have the variable User::FileCount set in the ReadWriteVariables and the following C# code within the task. This increments the counter for each file that successfully matches the condition.
public void Main()
{
    Dts.Variables["User::FileCount"].Value = Convert.ToInt32(Dts.Variables["User::FileCount"].Value) + 1;
    Dts.TaskResult = (int)ScriptResults.Success;
}
On the second Script Task, I have the variable User::FileCount set in the ReadOnlyVariables and the following C# code within the task. This simply outputs the total number of files that were processed.
public void Main()
{
    MessageBox.Show(String.Format("Total files looped through: {0}", Dts.Variables["User::FileCount"].Value));
    Dts.TaskResult = (int)ScriptResults.Success;
}
When the package is executed with MinIndexId set to 19480601 (exclusive), it outputs the value 773.
When the package is executed with MinIndexId set to 20111201 (exclusive), it outputs the value 11.
Hope that helps.
From investigating how the ForEach loop works in SSIS (with a view to creating my own to solve the issue), it seems that it enumerates the file collection first, before any mask is applied. It's hard to tell exactly what's going on without seeing the underlying code for the ForEach loop, but it appears to work this way, resulting in slow performance when dealing with over 100k files.
While @Siva's solution is fantastically detailed and definitely an improvement over my initial approach, it is essentially the same process, except using an Expression Task to test the filename rather than a Script Task (this does seem to offer some improvement).
So I decided to take a totally different approach: rather than use a file-based ForEach loop, I enumerate the collection myself in a Script Task, apply my filtering logic, and then iterate over the remaining results. This is what I did:
In my Script Task, I use the DirectoryInfo.EnumerateFiles method, which is the recommended approach for large file collections: it enumerates lazily, so you can start processing results immediately instead of waiting for the entire collection to be built.
Here's the code:
public void Main()
{
    string sourceDir = Dts.Variables["SourceDirectory"].Value.ToString();
    int minJobId = (int)Dts.Variables["MinIndexId"].Value;

    // Enumerate the file collection (EnumerateFiles lets us start processing immediately)
    List<string> activeFiles = new List<string>();
    System.Threading.Tasks.Task listTask = System.Threading.Tasks.Task.Factory.StartNew(() =>
    {
        DirectoryInfo dir = new DirectoryInfo(sourceDir);
        foreach (FileInfo f in dir.EnumerateFiles("*.txt"))
        {
            FileInfo file = f;
            string filePath = file.FullName;
            string fileName = filePath.Substring(filePath.LastIndexOf("\\") + 1);
            int jobId = Convert.ToInt32(fileName.Substring(0, fileName.IndexOf(".txt")));
            if (jobId > minJobId)
                activeFiles.Add(filePath);
        }
    });

    // Wait here for completion
    System.Threading.Tasks.Task.WaitAll(new System.Threading.Tasks.Task[] { listTask });

    Dts.Variables["ActiveFilenames"].Value = activeFiles;
    Dts.TaskResult = (int)ScriptResults.Success;
}
So, I enumerate the collection, applying my logic as files are discovered and immediately adding the file path to my list for output. Once complete, I then assign this to an SSIS Object variable named ActiveFilenames which I'll use as the collection for my ForEach loop.
I configured the ForEach loop as a ForEach From Variable Enumerator, which now iterates over a much smaller collection (a post-filtered List<string>, compared to what I can only assume was an unfiltered List<FileInfo> or something similar in SSIS's built-in ForEach File Enumerator).
So the tasks inside my loop can be dedicated to processing the data, since it has already been filtered before hitting the loop. Although this doesn't do anything conceptually different from either my initial package or Siva's example, in production (for this particular case, anyway) filtering the collection and enumerating it lazily provides a massive boost over using the built-in ForEach File Enumerator.
I'm going to continue investigating the ForEach loop container and see if I can replicate this logic in a custom component. If I get this working I'll post a link in the comments.
The best you can do is use FileSpec to specify a mask, as you said. You could include at least some specifics in it, like files starting with "201" for 2010, 2011 and 2012. Then, in some other task, you could filter out the ones you don't want to process (for instance, 2010).
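For example, a property expression on the enumerator's FileSpec (a hypothetical sketch, assuming the files are named YYYYMMDD.txt) could at least narrow the mask to the current year:
(DT_WSTR, 4)YEAR(GETDATE()) + "*.txt"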

Is there an easy way for cfengine3 to copy different files based on the OS it's running?

I have two different versions of Linux/Unix, each running cfengine3. Is it possible to have one promises.cf file I can put on both machines that will copy different files based on which OS is on the client? I have been searching around the internet for a few hours now and have not found anything useful yet.
There are several ways of doing this. At the simplest, you can simply have different files: promises depending on the operating system, for example:
files:
  ubuntu_10::
    "/etc/hosts"
      copy_from => mycopy("$(repository)/etc.hosts.ubuntu_10");

  suse_9::
    "/etc/hosts"
      copy_from => mycopy("$(repository)/etc.hosts.suse_9");

  redhat_5::
    "/etc/hosts"
      copy_from => mycopy("$(repository)/etc.hosts.redhat_5");

  windows_7::
    "/etc/hosts"
      copy_from => mycopy("$(repository)/etc.hosts.windows_7");
This example can be easily simplified by realizing that the built-in CFEngine variable $(sys.flavor) contains the type and version of the operating system, so we could rewrite this example as follows:
"/etc/hosts"
copy_from => mycopy("$(repository)/etc.$(sys.flavor)");
A more flexible way to achieve this task is known in CFEngine terminology as "hierarchical copy." In this pattern, you specify an arbitrary list of variables by which you want files to be differentiated, and the order in which they should be considered, from most specific to most general. When the copy promise is executed, the most-specific file found will be copied.
This pattern is very simple to implement:
# Use single copy for all files
body agent control
{
  files_single_copy => { ".*" };
}

bundle agent test
{
  vars:
    "suffixes" slist => { ".$(sys.fqhost)", ".$(sys.uqhost)", ".$(sys.domain)",
                          ".$(sys.flavor)", ".$(sys.ostype)", "" };

  files:
    "/etc/hosts"
      copy_from => local_dcp("$(repository)/etc/hosts$(suffixes)");
}
As you can see, we are defining a list variable called $(suffixes) that contains the criteria by which we want to differentiate the files. All the variables contained in this list are automatically defined by CFEngine, although you could use any arbitrary CFEngine variables. Then we simply include that variable, as a scalar, in our copy_from parameter. Because CFEngine does automatic list expansion, it will try each value in turn, executing the copy promise multiple times (once for each value in the list) and copying the first file that exists. For example, for a Linux SuSE 11 machine called superman.justiceleague.com, the $(suffixes) variable will contain the following values:
{ ".superman.justiceleague.com", ".superman", ".justiceleague.com", ".suse_11",
".linux", "" }
When the file-copy promise is executed, implicit looping will cause these strings to be appended in sequence to "$(repository)/etc/hosts", so the following filenames will be attempted in sequence: hosts.superman.justiceleague.com, hosts.superman, hosts.justiceleague.com, hosts.suse_11, hosts.linux and hosts. The first one that exists will be copied over /etc/hosts on the client, and the rest will be skipped.
For this technique to work, we have to enable "single copy" on all the files you want to process. This is a configuration parameter that tells CFEngine to copy each file at most once, ignoring successive copy operations for the same destination file. The files_single_copy parameter in the agent control body specifies a list of regular expressions to match filenames to which single-copy should apply. By setting it to ".*" we match all filenames.
For hosts that don't match any of the existing files, the last item on the list (an empty string) causes the generic hosts file to be copied. Note that the leading dot for each of the suffixes is included in $(suffixes), except for the last element.
I hope this helps.
(p.s. and shameless plug: this is taken from my upcoming book, "Learning CFEngine 3", published by O'Reilly)

How can I access the information associated to an object from a Mercurial plugin?

I am trying to write a small Mercurial extension which, given the path to an object stored within the repository, will tell you the revision it's at. So far, I'm working from the code in the WritingExtensions article, and I have something like this:
cmdtable = {
    # cmd name     function call
    "whichrev": (whichrev, [], "hg whichrev FILE")
}
and the whichrev function has almost no code:
def whichrev(ui, repo, node, **opts):
    # node will be the file chosen at the command line
    pass
So, for example:
hg whichrev text_file.txt
will call the whichrev function with node set to text_file.txt. With the use of the debugger, I found that I can access a filelog object by using this:
repo.file("text_file.txt")
But I don't know what I should access in order to get to the sha1 of the file. I have a feeling I may not be working with the right function.
Given a path to a tracked file (the file may or may not appear as modified under hg status), how can I get its sha1 from my extension?
A filelog object is pretty low level, you probably want a filectx:
A filecontext object makes access to data related to a particular filerevision convenient.
You can get one through a changectx:
ctx = repo['.']
fooctx = ctx['foo']
print fooctx.filenode()
Or directly through the repo:
fooctx = repo.filectx('foo', '.')
Pass None instead of '.' to get the working-copy versions.
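Putting it together, a minimal whichrev body might look like this (an untested sketch against the Mercurial Python API of that era; error handling omitted):
def whichrev(ui, repo, file_, **opts):
    from mercurial import node as nodemod
    # '.' is the changeset the working copy is based on
    ctx = repo['.']
    # file context for the given path within that changeset
    fctx = ctx[file_]
    # filenode() is the binary file revision hash; hex() gives the sha1 string
    ui.write("%s\n" % nodemod.hex(fctx.filenode()))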