What exactly is global data?
This may seem like a very rudimentary question but the reason I'm asking is because I'm wondering has the term become stretched over time - i.e it dosn't just apply to data in the "Global" namespace (in c++) or a variable that is available in every scope.
So, what do you consider to be global data?
I agree David, global can quite often mean different things to different people in different languages!
Personally I hate globals which are truly global, i.e. available to everything, everywhere. The greater the restriction of a scopes' variable, generally the better.
The scope of the information often has to be open to lots of functions within a module, this is OK, but should be limited where required. These I would define a module globals or local globals.
Variables which are shared between modules through a defined interface and only included as required are OK-ish, but data passed back and forth (or pointers to data) from/to functions are the best.
Of course, this is all my personal opinion in my native language (C) and might not agree with everyone's opinion!
Global data is a variable that can be put in any local scope (i.e. of a function) without passing it as a parameter or class attribute.
In some languages you need a global or extern keyword to import it, in others it enters the function's scope automatically.
Related
I'm using reagent to build several alternate root components, only one of which will be mounted on any given page; definitely either/or. These have a degree of commonality in their makeup, hence it will be convenient to move what is common among them to a common namespace.
What would be ideal is if in the file for each of these components I had the option to switch namespace into common, and add defs particular to the component, then switch back, thus avoiding circular dependencies nor needing any kind of inheritance.
I recalled this being possible in common lisp, how wonderful it was, and it also seems possible in clojure.
From Clojurescript docs: ns must be the first form and can only be used once, and in-ns is only usable from the repl.
I'm wondering if there's a way to achieve this kind of thing in clojurescript which is still eluding me.
If not I may need to reconsider my assumptions behind multiple alternate root components; the "many builds within one build" kind of idea, if that makes sense.
Update after some futher experimentation and confusion:
another option might be to split a single namespace across multiple files (is this possible?). Not sure what direction to turn in here.
The fact that in reagent I am using atoms in the global namespace is what's creating the need for circular dependencies if I use a separate namespace for common. Hence, wonder about one global namespace, in which case multiple files might help. Or is the way forward one giant file and one namespace??
Update: I've realised there is a great tension between keeping all app state globally (in my current case, multiple atoms), and passing app state around. My pattern currently is everything global, don't pass any of it around. Passing the necessary state as parameters to fns in the common namespace would solve the problem here (duh!), but then there's the question of what principles are being followed here regarding app state. If I just added a param whenever I needed one, but started with the idea that everything was global, there'd be no real principle to it...
In ClojureScript, everything is pre-compiled into a single static JavaScript "executable", so there is nothing like the repl you are used to in Clojure. Indeed, in CLJS the "Var" concept doesn't really after the compiler, they are just static (constant) variables and cannot be rebound.
Having said that, CLJS does emulate the behavior of Clojure dynamic variables via the binding form, so that may help you to reach your goal. As in CLJ, it creates what amounts to a (thread-local) global variable. This is a degenerate case in CLJS since there is only one thread. However, the source code looks identical to the CLJ case.
Another way to accomplish this is to just use a plain atom as a global variable so you don't have to pass a parameter around.
As always, when using a global variable, it reduces the number of parameters in function call trees, but it creates invisible dependencies between different parts of the code. Somethimes convenient, but usually a bad tradeoff.
Let's I have a class foo, with a variable bar. Now... I want that if there is a class moo, which has class foo as a superclass, I want that any attempts to write to, or better yet, even refer directly to bar will be errored out. This could save situations when someone is using my code (which could be compiled to byte-code), to not override, by having one's own variable with the same name
TclOO simply does not support the concept. Classes are not security boundaries in TclOO, just as namespaces are not security boundaries in plain Tcl (TclOO objects are really just fancy namespaces). Tcl's security boundaries are between interpreters, and between the Tcl script level and the (usually) C implementation level. We're considering adding “private” instance variables for Tcl 8.7, but even those won't be truly private; their names will still be predictable if you know how (and they will be accessible from outside the class; that's important for when using the variable with third-party code such as Tk). To reiterate: classes are not security boundaries.
If you have something that must be locked out of sight, it is easiest to implement it in C. You can plug in methods implemented in C into TclOO (applying whatever controls you can think of) and those methods can use the (C level only) metadata mechanism to create instance-attached storage that they can use. All the callbacks are in place to do deletion correctly at the right time. Methods in C are not much more complicated than commands in C; the function callback signature is a little different and the usage is a bit more complicated (because there are other standard operations on methods such as copying them) but if you can do one, you can figure out how to do the other too.
I've been looking at some Lua source code, and I often see things like this at the beginning of the file:
local setmetatable, getmetatable, etc.. = setmetatable, getmetatable, etc..
Do they only make the functions local to let Lua access them faster when often used?
Local data are on the stack, and therefore they do access them faster. However, I seriously doubt that the function call time to setmetatable is actually a significant issue for some program.
Here are the possible explanations for this:
Prevention from polluting the global environment. Modern Lua convention for modules is to not have them register themselves directly into the global table. They should build a local table of functions and return them. Thus, the only way to access them is with a local variable. This forces a number of things:
One module cannot accidentally overwrite another module's functions.
If a module does accidentally do this, the original functions in the table returned by the module will still be accessible. Only by using local modname = require "modname" will you be guaranteed to get exactly and only what that module exposed.
Modules that include other modules can't interfere with one another. The table you get back from require is always what the module stores.
A premature optimization by someone who read "local variables are accessed faster" and then decided to make everything local.
In general, this is good practice. Well, unless it's because of #2.
In addition to Nicol Bolas's answer, I'd add on to the 3rd point:
It allows your code to be run from within a sandbox after it's been loaded.
If the functions have been excluded from the sandbox and the code is loaded from within the sandbox, then it won't work. But if the code is loaded first, the sandbox can then call the loaded code and be able to exclude setmetatable, etc, from the sandbox.
I do it because it allows me to see the functions used by each of my modules
Additionally it protects you from others changing the functions in global environment.
That it is a free (premature) optimisation is a bonus.
Another subtle benefit: It clearly documents which variables (functions, modules) are imported by the module. And if you are using the module statement, it enforces such declarations, because the global environment is replaced (so globals are not available).
What are some examples of a dynamically scoped language? And what are the reasons for choosing that design? Is it because it is easy to implement?
Mathematica is another language that is dynamically scoped, via the Block construct. This is actually quite useful when working with formulas. It allows you to write things like
In[1]:= expr = a*t^2 + b*t+ c;
In[2]:= Block[{a = 1, b = -1, c = 2}, Table[expr, {t, 5}]]
Out[2]= {2, 4, 8, 14, 22}
which wouldn't work at all if variables like a and t were scoped lexically. It works particularly nicely with Mathematica's rule-rewriting system, which will, among other things, leave variables unevaluated (as symbolic expressions) if it doesn't have an existing definition for them.
Mathematica can fake lexical scoping with the Module construct, but what this really does is rewrite the expression in terms of new, allegedly unique symbol (you can cause clashes if you predict what the next unique symbol will be, which is easy in most cases). This means
Module[{x = 4},
Table[x * t, {t, 5}]]
will be turned into something like this:
Block[{x$134 = 4},
Table[x$134 * t, {t, 5}]
Emacs Lisp, in one of its libraries, has a construct (really a Lisp macro) called lexical-let that pulls exactly the same trick to fake lexical scoping.
There are performance advantages to real lexical scoping when you're compiling your language which you don't get with the fake lexicals of ELisp or Mathematica, since you need some mapping between the dynamic variable and its current value, which means doing lookups (through a hash table or property list or something) and additional layers of indirection.
EDIT: If you have only lexical variables, you can fake dynamic scoping by storing the original value of a global, lexical variable on entering the scope and guaranteeing that the old value is restored upon exiting the scope. In order to ensure that, you'll need something like Lisp's UNWIND-PROTECT or a finally block. I've seen this done using C++ destructors as well, mostly as an exercise.
Dynamically scoped languages are much easier to implement. To access variables which is not in the current activaiton record / stack frame, one just follows the control links. Static/lexical access links are then not needed, making stack frames smaller.
Dynamic variables can be "unpredictable" at runtime, because one needs to know in which order the actual stackframes are to know which variable will be used. This information is not available by just looking at the static structure of the code. One could quite easily get caught out if the actual call graph of the program is not easy to predict at implementation time. Thats why most languages today have static scoping (most Exception systems however, are dynamic as this is the most practical).
However in some cases, dynamically scoped variables are very useful. For example when redirecting output, you could using dynamic variables set standard output for local code and all code called from there on.
(let ((*standard-output* *some-other-stream*))
(stuff))
In this common-lisp example (from Seibel), standard output is bound to another stream for the duration of the let form, (inside its enclosing parens). When execution leaves the let, it goes back to whatever it was beforehand. See http://gigamonkeys.com/book/variables.html Peter Seibels free and excellent book, Practical Common Lisp, for a good discussion. In Seibels own words:
Dynamic bindings make global variables much more manageable, but it's important to notice they still allow action at a distance. Binding a global variable has two at a distance effects--it can change the behavior of downstream code, and it also opens the possibility that downstream code will assign a new value to a binding established higher up on the stack. You should use dynamic variables only when you need to take advantage of one or both of these characteristics.
Well, there's a bunch of websites that discuss the pro's and con's, so I'm not going there.
One interesting language that has some features that faintly resemble dynamic scope is XSLT; although XSLT's templates and variables and the like are lexically scoped, XSLT is of course all about XML - and the current position in the xml tree is "dynamically scoped" in the sense that the context node is global and thus that XPath expressions are evaluated not according to XSLT's lexical scope but according to it's dynamic evaluation.
Dynamic scope is/was easier to implement with interpreters. Most early Lisp interpreters were using dynamic scope. After several years lexical scope was found to have an advantage, but was first mostly implemented in Lisp compilers. Several implementations appeared that implemented dynamic scope in interpreted code and lexical scope in compiled code. Some provided a special language construct to provide closures. Lisp dialects like Scheme and Common Lisp required then that there is no difference between interpreted and compiled code and thus interpreted based implementations had to implement lexical scope, too.
Early Smalltalk implementations implemented dynamic scope. All kinds of Lisp dialect implementations implemented dynamic scope (Interlisp, UCI Lisp, Lisp Machine Lisp, MacLisp, ...).
Almost all new Lisp dialects from the last 20 years use lexical scope by default or even exclusively. Several publications have described in detail how to implement Lisp with lexical scope - so there is no excuse not to use lexical scope.
All the shell languages (bash, ksh, etc) use dynamic scoping.
I've read many times and agree with avoiding the use of globals to keep code orthogonal. Does the use of the config file to keep read only information that your program uses similar to using Globals?
If you're using config files in place of globals, then yes, they are similar.
Config files should only be used in cases where the end-user (presumably a computer-savvy user, like a developer) needs to declare settings for an application or piece of code, while keeping their hands out of the code itself.
My first reaction would be that it is not the same. I think the problem with globals is the read+write scenario. Config-files are readonly (at least in terms of execution).
In the same way constants are not considered bad programming behaviour. Config-files, at least in the way I use them, are just easy-changable constants.
Well, since a config file and a global variable can both have the effect of propagating changes throughout a system - they are roughly similar.
But... in the case of a configuration file that change is usually going to take place in a single, highly-visible (to the developer) location, and global variables can affect change in very sneaky and hard to track down ways -- so in this way the two concepts are not similar.
Having a configuration file ususally helps with DRY concepts, and it shouldn't hurt the orthogonality of the system, either.
Bonus points for using the $25 word 'orthogonal'. I had to look that one up in Wikipedia to find out the non-Euclidean definition.
Configuration files are really meant to be easily editable by the end user as a way of telling the program how to run.
A more specialized form of configuration files, user preferences, are used to remember things between program executions.
Global is related to a unique instance for an object which will never change, whereas config file is used as container for reference values, for objects within the application that can change.
One "global" object will never change during runtime, the other object is initialized through config file, but can change later on.
Actually, those objects not only can change during the lifetime of the application, they can also monitor the config file in order to realize "hot-change" (modification of their value without stopping/restarting the application), if that config file is modified.
They are absolutely not the same or replacements for eachother. A config file, or object can be used non-globally, ie passed explicitly.
You can of course have a global variable that refers to a config object, and that would be defeating the purpose.