PowerShell module design - Export-ModuleMember - function

I am building a module that exports a cmdlet that I would like to make available through my profile. The implementation of this cmdlet is spread across multiple implementation files that contain implementation functions I don't want to make publicly available. So I use Export-ModuleMember to hide them.
File get_something.psm1
Import-Module .\get_something_impl.psm1

function Get-Something {
    [CmdletBinding()]
    param()

    Get-SomethingImplementation
}
Export-ModuleMember -Function Get-Something
I then add get_something.psm1 to my profile. By exporting only Get-Something, all of my implementation functions remain "private".
The issue I'm experiencing is that when using the Export-ModuleMember command, I have to import a module in my implementation files every time I need a function inside of it. For example, assume I have a module, person.psm1, with a function, Get-Person, that I need to call throughout all of my implementation files. Now I must import person.psm1 in every single file in which I need to call Get-Person. This is a result of using Export-ModuleMember -Function Get-Something. Without it, I would only need to import person.psm1 once and it would be available.
In essence, Export-ModuleMember is not only blocking my implementation to the outside. It's blocking it from my own implementation.
Is this expected and considered a normal aspect of designing PowerShell modules?

This was actually a bit of a debate during the development of modules. Originally, Export-ModuleMember was required to export any function. This became tedious and limiting, so by default all functions from a module are now visible (but variables and aliases are not), as long as you've never used Export-ModuleMember within the .PSM1.
If you use Export-ModuleMember, it begins to restrict that list. It may not be a bad idea to export a smaller number of functions, but you have to use it somewhat carefully.
You can either write:
Export-ModuleMember -Function a,b,c
which exports a few functions.
or
Export-ModuleMember -Function *
The latter one is equivalent to omitting Export-ModuleMember altogether.
You can use more restrictive wildcards if you'd like, but I find that 99% of the time, you don't need to bother with it at all.
The other thing you seem to be asking is how best to handle module dependencies. Nowadays, it's fairly common to import a module or two when writing a script, just like it's fairly common to include an assembly or two in a C# project. If you're doing this inside of a module, you can use the -Global flag on Import-Module, and avoid using -Force (which will reload the module). This makes it a notch more efficient to reuse the module in different functions. It also makes it less likely to have problems with "cycling" (unloading and reloading) the module, which, unfortunately, many modules do not do well.
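A minimal sketch of that arrangement, reusing the person.psm1 dependency from the question (the implementation file name is just an example, and $PSScriptRoot assumes PowerShell 3+):
# File: get_something_impl.psm1
# Import the shared dependency into the global session state so that
# other implementation files can call Get-Person without re-importing it.
Import-Module "$PSScriptRoot\person.psm1" -Global

function Get-SomethingImplementation {
    [CmdletBinding()]
    param()

    Get-Person
}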
The alternative to referencing the module in each function is using a module manifest (Get-Help New-ModuleManifest). Module manifests are very interesting, and required learning for many parts of module development. If you include a module in the RequiredModules list of the Module manifest, it will be automatically loaded before the module is imported (at least in PowerShell 3 and greater). If you include a module in the NestedModules list of the module manifest, it will be loaded as part of the module, and the commands exported by the module will be exported by your module instead.
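For illustration, a trimmed manifest along those lines (file and module names are made up, and a manifest generated by New-ModuleManifest contains many more keys):
# File: get_something.psd1
@{
    RootModule        = 'get_something.psm1'
    ModuleVersion     = '1.0.0'

    # Loaded before this module is imported (PowerShell 3+).
    RequiredModules   = @('person')

    # Loaded as part of this module; their exported commands surface through it.
    NestedModules     = @('get_something_impl.psm1')

    FunctionsToExport = @('Get-Something')
}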
Module design is a tricky beast, but it's very rewarding to do right. Best of luck.

Related

Clojurescript decoupled file & namespace?

I'm using reagent to build several alternate root components, only one of which will be mounted on any given page; definitely either/or. These have a degree of commonality in their makeup, hence it will be convenient to move what is common among them to a common namespace.
What would be ideal is if, in the file for each of these components, I had the option to switch namespace into common, add defs particular to the component, then switch back, thus avoiding circular dependencies without needing any kind of inheritance.
I recall this being possible in Common Lisp (and how wonderful it was), and it also seems possible in Clojure.
From the ClojureScript docs: ns must be the first form and can only be used once, and in-ns is only usable from the REPL.
I'm wondering if there's a way to achieve this kind of thing in clojurescript which is still eluding me.
If not I may need to reconsider my assumptions behind multiple alternate root components; the "many builds within one build" kind of idea, if that makes sense.
Update after some further experimentation and confusion:
another option might be to split a single namespace across multiple files (is this possible?). Not sure what direction to turn in here.
The fact that in reagent I am using atoms in the global namespace is what's creating the need for circular dependencies if I use a separate namespace for common. Hence I wonder about one global namespace, in which case multiple files might help. Or is the way forward one giant file and one namespace?
Update: I've realised there is a great tension between keeping all app state globally (in my current case, multiple atoms), and passing app state around. My pattern currently is everything global, don't pass any of it around. Passing the necessary state as parameters to fns in the common namespace would solve the problem here (duh!), but then there's the question of what principles are being followed here regarding app state. If I just added a param whenever I needed one, but started with the idea that everything was global, there'd be no real principle to it...
In ClojureScript, everything is pre-compiled into a single static JavaScript "executable", so there is nothing like the REPL you are used to in Clojure. Indeed, in CLJS the "Var" concept doesn't really exist after the compiler runs; vars become static (constant) variables and cannot be rebound.
Having said that, CLJS does emulate the behavior of Clojure dynamic variables via the binding form, so that may help you to reach your goal. As in CLJ, it creates what amounts to a (thread-local) global variable. This is a degenerate case in CLJS since there is only one thread. However, the source code looks identical to the CLJ case.
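A minimal sketch of that pattern (the namespace and var names are invented):
(ns my-app.common)

;; A dynamic var behaves like a re-bindable global within a binding scope.
(def ^:dynamic *current-theme* :light)

(defn themed-label [text]
  (str text " (" (name *current-theme*) ")"))

;; Elsewhere, e.g. in one root component's namespace:
(binding [my-app.common/*current-theme* :dark]
  (my-app.common/themed-label "Settings"))   ;; => "Settings (dark)"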
Another way to accomplish this is to just use a plain atom as a global variable so you don't have to pass a parameter around.
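For example (again with made-up names), matching the reagent convention of a shared state atom:
(ns my-app.state)

(defonce app-state (atom {:user nil :page :home}))

;; Any namespace can require my-app.state and read or update it directly,
;; without threading the state through function parameters.
(swap! app-state assoc :page :settings)
(:page @app-state)   ;; => :settings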
As always, a global variable reduces the number of parameters in function call trees, but it creates invisible dependencies between different parts of the code. Sometimes convenient, but usually a bad tradeoff.

tcl - when to use package+namespace vs interp?

I'm just starting with Tcl and trying to get my head around how best to define and integrate modules. There seems to be much effort put into the package+namespace concept, but from what I can tell interp is more powerful and leaner for every scenario I can think of, in particular when it comes to hiding and renaming procedures, but also the lack of creep into the global namespace. The only reason to use package+namespace seems to be because "once upon a time Sun said so".
When should I ever use package+namespace instead of interp?
Namespaces and packages work together. Interpreters are something else.
A namespace is a small scale naming context in Tcl. It can contain commands, variables and other namespaces. You can refer to entities in a namespace via either local names (foo) or via qualified names (bar::foo); if a qualified name starts with ::, it is relative to the (interpreter-)global namespace, and can be used to refer to its command or variable from anywhere in the interpreter. (FWIW, the TclOO object system builds extensively on top of namespaces; there is one namespace per object.)
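A small sketch of the naming behaviour (the names are invented):
namespace eval mylib {
    variable counter 0
    proc bump {} {
        variable counter
        incr counter
    }
}

mylib::bump            ;# qualified call, works anywhere in this interpreter
puts $mylib::counter   ;# prints 1
::mylib::bump          ;# a leading :: anchors the name at the global namespace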
A package is a high-level concept for a bunch of code supplied by some sort of library. Packages have abstract names (the name does not have to correspond to how the library's implementation is stored on disk) and a distinct version; you can ask for a particular version if necessary, though most of the time you don't bother. Packages can be implemented by multiple mechanisms, but they almost all come down to sourcing some number of Tcl scripts and loading some number of DLLs. Almost all packages declare commands, and they are conventionally encouraged to put those commands in a namespace with the same general name as the package. However, quite a few older packages do not do this for various reasons, mostly to do with compatibility with existing code.
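Typical usage looks like this (the package name is an example; it assumes the package's pkgIndex.tcl is somewhere on the auto_path):
package require mylib 1.2   ;# ask for version 1.2 (or a compatible later one)
mylib::bump                 ;# by convention the commands live in a matching namespace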
An interpreter is a security context in Tcl. By default, Tcl creates one interpreter (plus another if it sets up the console window in wish). Named entities in one interpreter are completely distinct from named entities in another interpreter with a few key exceptions:
Channels have common names across all interpreters. This means that an interpreter can talk about channels owned by another interpreter, but merely being able to mention its name doesn't give permission to access the channel. (The stdin, stdout and stderr channels are shared by default.)
The interp alias command can be used to make alias commands, which are such that invoking a command (the alias) in one interpreter can cause a command (the implementation) in another interpreter to be called, with all arguments safely passed over. This allows one interpreter to expose whatever special calls it wants another interpreter to access without losing control, but it is up to the implementation of those commands to act safely on those arguments.
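A minimal sketch of an alias between a parent and a child interpreter (the command names are made up):
set child [interp create]

# The implementation lives in the parent; the child only sees the alias.
proc logMessage {msg} {
    puts "child says: $msg"
}
interp alias $child log {} logMessage

$child eval {log "hello from the child"}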
A safe interpreter is one with the unsafe commands of Tcl profiled out by default. (That's things like open, socket, source, load, cd, etc.) The parent interpreter that created the safe child interpreter can use the alias mechanism to add in exactly the functionality desired; it's very much analogous to an OS system call except you can easily make your own application-specific ones.
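Creating a safe child and selectively restoring functionality looks like this (the exposed command is an invented example):
set safe [interp create -safe]

# The safe child cannot open files itself; the parent supplies a vetted reader.
proc readConfigValue {key} {
    # validate $key and read from a known location here
    return "value-for-$key"
}
interp alias $safe getConfig {} readConfigValue

$safe eval {getConfig timeout}    ;# allowed, goes through the parent's proc
# $safe eval {open /etc/passwd}   ;# would fail: open is hidden in a safe interp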
Tcl's threading package is designed to create one interpreter per thread (and the aliasing mechanism does not work across threads). That means that there's very little in the way of shared resources by default, and inter-thread communication is done via queued message passing.
In general, packages are required at most once per interpreter and are how you are recommended to access most third-party functionality. Namespaces are fairly lightweight and are used for all sorts of things, and interpreters are considered to be expensive; lots of quite thoroughly production-grade Tcl scripts only ever work with a single interpreter. (Threads are even more expensive than interpreters; it's good practice to match the number of threads you create to the hardware load that you wish to impose, probably through the use of suitable thread pools.)
The purpose of a module is to provide modular code, i.e. code that can easily be used by applications beyond the module writer's knowledge and control, and that encapsulates their own internals.
Package-namespace- and interpreter-based modules are probably equally good at encapsulation, but it's not as easy to make interpreter-based modules that play well with arbitrary applications (it is of course possible).
My own opinion is that interpreters are application level (I mostly use them for user input and for controlled evaluation), not module level. Both namespaces and packages have their warts, but in most cases they do what is expected of them with a minimum of fuss.
My recommendation is that if you are writing modules for your own benefit and interpreters serve you well, by all means use them. If you write modules that other people are to use, possibly including yourself in 18 months, you should stick with namespaces and packages.

How to encapsulate the autotools' macro definitions?

In the autoconf manual, it is noted that
AC_INIT (package, version, [bug-report], [tarname], [url])
defines multiple macro names such as AC_PACKAGE_NAME and PACKAGE_NAME.
Running configure also generates a config header with definitions like the following:
#define HAVE_LIBGMP 1
As I am writing C++ code, I find these macros annoying yet useful. In fact, it has happened many times that I needed to link with a library that uses the autotools and thus has these macros in its headers. So the situation is that there are conflicting macros in the headers, such as:
#define PACKAGE_NAME "library"
#define PACKAGE_NAME "mine"
So, I was wondering if there was a way to tell the autotools to define at least some of these macros inside some kind of structure as follows:
struct header_information {
    static std::string package_name;
    static std::string bug_report;
    // ...
};
and then initialize it with the right macro names.
This solution would keep this information encapsulated and would not pollute the global namespace.
It seems to me like you want to abuse a package-private, build-system-only configuration header file (config.h) that just so happens to define a convenient macro name that you'd like to use. I think the pretty obvious answer is "don't do that", or else you're on your own.
Unless I'm misunderstanding you?
Those defines are there so that the particular library can use them; config.h is not meant for other things to include. In fact, the majority of the things in config.h are completely useless outside of the particular package.
That doesn't mean that the library that config.h file belongs to couldn't provide what you're looking for, by defining a public struct in a header that uses those variables. Or perhaps a library that uses pkg-config (if you're just looking for package names) can provide some of information for you. But I don't think that autotools would or should provide that information to you.
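As a sketch of what such a library-provided header could look like (autotools does not generate this for you; all names here are invented):
// mylib/build_info.h -- shipped by the hypothetical library
#pragma once
#include <string>

namespace mylib {
struct build_info {
    static const std::string package_name;
    static const std::string version;
    static const std::string bug_report;
};
}

// mylib/build_info.cpp -- compiled inside the library, where its config.h is visible
// #include "config.h"
// const std::string mylib::build_info::package_name = PACKAGE_NAME;
// const std::string mylib::build_info::version      = PACKAGE_VERSION;
// const std::string mylib::build_info::bug_report   = PACKAGE_BUGREPORT;
This keeps the macros confined to the library's own translation units, so nothing in the installed headers collides with the PACKAGE_NAME of another project.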

How does one make a self-contained package / library of functions

I am writing some support code in the common subset of Matlab/Octave, which comes in the form of a bunch of functions. Let's call it a package.
I want to be able to organize the package, i.e.,
put all the relevant function files in a single place, where users are not supposed to store their code;
have some internal organization ('subpackages');
prevent namespace pollution;
have some mechanism for user code to 'import' parts of the package;
I don't necessarily want all functions I provide to be visible from user clients.
On the Matlab side of things, this functionality is pretty much provided by package directories and the 'import' mechanism. This functionality doesn't appear to be available in Octave though (as of 3.6.1).
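For reference, the Matlab-side mechanism referred to here looks roughly like this (package and function names are placeholders):
% Directory layout:
%   +mypkg/            package root (the + prefix makes it a package directory)
%   +mypkg/foo.m       callable as mypkg.foo
%   +mypkg/+sub/bar.m  callable as mypkg.sub.bar (a 'subpackage')

% User code:
import mypkg.*        % bring the package's functions into scope
foo(42);              % or, without the import: mypkg.foo(42)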
Given that, I wonder what options remain for organizing my support code package in Octave.
The option of putting everything in a directory and just having the user code do an ADDPATH feels rather unrefined, and doesn't give the level of control I want; it only addresses point #1 of the list above.
There is plenty of documentation here and examples in OctaveForge. Just browse the SVN.
Also, there are personal packages all around; for example, this one.
Happy coding!

Why make global Lua functions local?

I've been looking at some Lua source code, and I often see things like this at the beginning of the file:
local setmetatable, getmetatable, etc.. = setmetatable, getmetatable, etc..
Do they only make the functions local to let Lua access them faster when often used?
Local data are on the stack, and therefore they are accessed faster. However, I seriously doubt that the lookup time for a function like setmetatable is actually a significant issue for any program.
Here are the possible explanations for this:
1. Preventing pollution of the global environment (see the sketch at the end of this answer). The modern Lua convention for modules is not to have them register themselves directly into the global table. They should build a local table of functions and return it; thus, the only way to access them is with a local variable. This forces a number of things:
One module cannot accidentally overwrite another module's functions.
If a module does accidentally do this, the original functions in the table returned by the module will still be accessible. Only by using local modname = require "modname" will you be guaranteed to get exactly and only what that module exposed.
Modules that include other modules can't interfere with one another. The table you get back from require is always what the module stores.
2. A premature optimization by someone who read "local variables are accessed faster" and then decided to make everything local.
In general, this is good practice. Well, unless it's because of #2.
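A minimal sketch of that module style, combined with the local-aliasing idiom from the question (the module and function names are invented):
-- mymodule.lua
-- Cache the globals this module needs as locals, up front.
local setmetatable, pairs = setmetatable, pairs

local M = {}   -- the module's own table; nothing is written to _G

function M.new(fields)
    local obj = {}
    for k, v in pairs(fields or {}) do
        obj[k] = v
    end
    return setmetatable(obj, { __index = M })
end

return M

-- user code:
-- local mymodule = require "mymodule"
-- local thing = mymodule.new{ name = "example" }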
In addition to Nicol Bolas's answer, I'd add on to the 3rd point:
It allows your code to be run from within a sandbox after it's been loaded.
If the functions have been excluded from the sandbox and the code is loaded from within the sandbox, then it won't work. But if the code is loaded first, the sandbox can then call the loaded code and be able to exclude setmetatable, etc, from the sandbox.
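For instance, a rough sketch (environment handling differs between Lua 5.1's setfenv and the 5.2+ load shown here; mymodule is the invented module from the earlier sketch):
-- Load the module normally first, with full access to setmetatable etc.
local mymodule = require "mymodule"

-- Build a restricted environment that deliberately omits setmetatable.
local sandbox_env = { print = print, mymodule = mymodule }

-- Compile untrusted code against that environment (Lua 5.2+).
local untrusted = load([[
    local thing = mymodule.new{ name = "sandboxed" }
    print(thing.name)
    -- calling setmetatable here would fail: it is nil in this environment
]], "untrusted", "t", sandbox_env)

untrusted()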
I do it because it allows me to see, at the beginning of the file, the functions used by each of my modules.
Additionally it protects you from others changing the functions in global environment.
That it is a free (premature) optimisation is a bonus.
Another subtle benefit: It clearly documents which variables (functions, modules) are imported by the module. And if you are using the module statement, it enforces such declarations, because the global environment is replaced (so globals are not available).