Can a scope be made private in standard Tcl?

I tried to ask this in a different question but did a poor job of it. This first block of code is from a JS example in David Flanagan's book. The main point of the example is that after getCounter() returns, the scope that is shared by the methods next and reset is private and no longer accessible by the rest of the code. In other words, n cannot be changed except through the two methods.
The second block of code is from an example from John Ousterhout's book on Tcl and refers to namespaces. I am not implying that his example had to do with scope; I'd just like to know if it can be made to do so. It works about the same as the JS example, as far as having the procedures change the value of num, but the scope is not private. The outer code can simply set counter::num 10, which cannot be done in the JS example. At least, there appears to be no way to access n.
My question is: is there a way to make the scope private in Tcl, as in the JS example, apart from using an OOP Tcl library?
Thank you.
"use strict";
function getCounter () {
var n = 0;
return {
next: function () { return ++n; },
reset: function () { n = 0; }
};
}
var counter = getCounter();
console.log(counter.next()); // 1
console.log(counter.next()); // 2
console.log(counter.next()); // 3
counter.reset();
console.log(counter.next()); // 1
console.log(counter.next()); // 2
console.log(counter.next()); // 3
// Cannot access n from here because
// private scope.
namespace eval counter {
variable num 0
proc next {} {
variable num
return [incr num]
}
proc reset {} {
variable num
set num 0
}
}
chan puts [counter::next]; # 1
chan puts [counter::next]; # 2
chan puts [counter::next]; # 3
counter::reset
chan puts [counter::next]; # 1
chan puts [counter::next]; # 2
chan puts [counter::next]; # 3
set counter::num 10
chan puts [counter::next]; # 11
chan puts [counter::next]; # 12

Not really - mainly because Tcl does not really have references/pointers (everything is a value). The JS mechanism you refer to is closures, and you can't really have closures without references (multiple variables referring to the same object). See https://wiki.tcl-lang.org/page/Closures for a long discussion on this.
But Tcl does have one mechanism that most other languages don't have that can be used to restrict access - only it won't be the kind of logic you are thinking about. Tcl has interp - the ability to create, delete, manipulate and execute entire Tcl interpreters from Tcl itself. There is a concept in Tcl called safe interpreters (lots of articles on the subject). A safe interpreter is basically a Tcl interpreter where you remove all the capabilities you don't want scripts to access.
This is possible because Tcl allows you to redefine or rename functions/procs, including built-in ones! If you don't want scripts to access networking, for example, you can simply remove the socket command. In your case you can create an interp that has no set function, which makes it much harder to create or modify variables. Of course you yourself will need the set function for your own code to run, so instead of deleting it you can just rename it to something 3rd parties won't guess, like the smiley face emoji (😊 - which is a perfectly legal name for a proc!) or 3 NUL characters in a row (\0\0\0). Anyone trying to directly use the set command in your interp will just trigger an error.
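As a rough sketch of the renaming idea (Tcl 8.6 assumed; the name `__hidden_set` is a made-up placeholder, not a convention), a safe child interpreter can have its `set` command renamed so that scripts evaluated there cannot call it directly. Other commands such as `incr` or `lappend` can still create variables, so this on its own is not a complete lock-down.

```tcl
# Create a safe child interpreter (no file, socket, or exec commands).
set sandbox [interp create -safe]

# Rename 'set' inside the sandbox; '__hidden_set' is a hypothetical name.
interp eval $sandbox {rename set __hidden_set}

# Code that knows the new name can still assign variables...
interp eval $sandbox {__hidden_set num 0}

# ...but a plain 'set' now fails inside the sandbox.
set failed [catch {interp eval $sandbox {set num 10}} message]
puts "failed=$failed message=$message"
# failed=1 message=invalid command name "set"

# The value was not changed by the failed attempt.
set value [interp eval $sandbox {__hidden_set num}]
puts $value   ;# 0

interp delete $sandbox
```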

The following requires Tcl 8.7.
You can create private variables in a TclOO class, which gives them a real name that is "hard to predict" and prevents most tampering.
oo::class create Counter {
private variable num
constructor {} { my reset }
method next {} { incr num }
method reset {} { set num 0 }
}
Counter create counter
chan puts [counter next]; # 1
chan puts [counter next]; # 2
chan puts [counter next]; # 3
counter reset
chan puts [counter next]; # 1
chan puts [counter next]; # 2
chan puts [counter next]; # 3
# Can't do these; they don't work
# set counter::num 10
# chan puts [counter next]; # 11
# chan puts [counter next]; # 12
The mechanism is about as safe as dunder "private" fields in Python, but actually a touch harder to guess (it uses an internal ID that's otherwise used to track object identity across renaming and deletion). But there is a name, which means that the variable can participate in, say, vwait (usually if the object tells it to).
Another approach is to use a coroutine:
coroutine counter apply {{} {
set num 0
while true {
switch [yield $num] {
next { incr num }
reset { set num 0 }
stop { return }
default { yieldto error "I won't do that..." }
}
}
}}
In this case, the variable is a local variable, and so shielded from most external access. (But in 8.7 you can use coroprobe to mess with it; that's designed for debugging this sort of scenario.)
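For illustration, here is how the coroutine above might be driven (Tcl 8.6 or later assumed; the definition is repeated only so the sketch runs on its own, and the `results` list is just a helper for display).

```tcl
# Same coroutine as above, repeated so this sketch is self-contained.
coroutine counter apply {{} {
    set num 0
    while true {
        switch [yield $num] {
            next { incr num }
            reset { set num 0 }
            stop { return }
            default { yieldto error "I won't do that..." }
        }
    }
}}

set results {}
lappend results [counter next]    ;# 1
lappend results [counter next]    ;# 2
lappend results [counter reset]   ;# 0 (the coroutine yields the current value)
lappend results [counter next]    ;# 1
puts $results                     ;# 1 2 0 1

# num is local to the coroutine body; no global of that name was created.
set leaked [info exists ::num]
puts $leaked                      ;# 0

# 'stop' makes the body return, which also deletes the counter command.
counter stop
set remaining [llength [info commands counter]]
puts $remaining                   ;# 0
```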
If a value absolutely must be kept from user code, it should either be held in a different interpreter or in a C extension. Interpreters are not designed to have actual security boundaries within; they go at a different level. But for something as simple as a counter, it is not normally important to bother. (Using multiple interpreters makes a lot of sense when the code is outright untrusted, or in an application server scenario.)
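A minimal sketch of the "different interpreter" option, assuming Tcl 8.6; the names `vault`, `counterNext` and `counterReset` are invented for this example. The state lives in a child interpreter and only two aliases are exposed. Whoever holds the `$vault` handle can still reach in with `interp eval`, so in practice the untrusted code would be the one running in a separate (safe) interpreter that is given nothing but the aliases.

```tcl
# Child interpreter that owns the counter state.
set vault [interp create]
interp eval $vault {
    set num 0
    proc next {} { global num; return [incr num] }
    proc reset {} { global num; set num 0 }
}

# Expose only the two operations to the current interpreter.
interp alias {} counterNext $vault next
interp alias {} counterReset $vault reset

set seen {}
lappend seen [counterNext]   ;# 1
lappend seen [counterNext]   ;# 2
counterReset
lappend seen [counterNext]   ;# 1
puts $seen                   ;# 1 2 1

# The current interpreter has no variable called num of its own.
set visible [info exists ::num]
puts $visible                ;# 0
```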

Related

Is there any Tcl package/add-on that handles named arguments?

In Python, Ruby 2.0, Perl 6, and some hardware description languages, one can use named arguments. See this example. This makes the code more readable, easier to maintain, etc. Is there a way of doing this (perhaps via an extension) in Tcl 8.6, other than using a dictionary as a workaround?
In 8.6, use a dictionary parsed from args. The dict merge command can help:
proc example args {
if {[llength $args] % 2} {
return -code error "wrong # args: should be \"example ?-abc abc? ?-def def?\""
}
set defaults {
-abc 123
-def 456
}
set args [dict merge $defaults $args]
set abc [dict get $args -abc]
set def [dict get $args -def]
puts "abc=$abc, def=$def"
}
example; # abc=123, def=456
example -abc 789; # abc=789, def=456
example -def 789; # abc=123, def=789
example -def 246 -abc 135; # abc=135, def=246
You can go further than that with verifying (the tcl::prefix command can help) but it's a lot more work and doesn't buy you a lot more in production code. Not that that has stopped people from trying.
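As a hedged sketch of what that extra verification could look like (Tcl 8.6 assumed; this is one possible shape, not the approach the answer endorses), `tcl::prefix match` can resolve unique abbreviations and reject unknown option names:

```tcl
proc example args {
    if {[llength $args] % 2} {
        return -code error "wrong # args: should be \"example ?-abc abc? ?-def def?\""
    }
    set opts {-abc 123 -def 456}
    foreach {name value} $args {
        # Resolve unique abbreviations (e.g. -a) and raise an error for
        # names that match no known option.
        set full [tcl::prefix match [dict keys $opts] $name]
        dict set opts $full $value
    }
    return "abc=[dict get $opts -abc], def=[dict get $opts -def]"
}

set abbreviated [example -a 789]
puts $abbreviated                 ;# abc=789, def=456

set rejected [catch {example -xyz 1} msg]
puts $rejected                    ;# 1
puts $msg
```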
There are two proposals to add full named argument handling (TIP #457, TIP #479) to 8.7 at the moment, but I'm not sure that either has really gained traction. (The problem from my perspective is the extra runtime cost that has to be borne by code that doesn't volunteer to support named arguments. There might be other issues too, such as disagreement over preferred syntax; I've not paid so much attention to that as I'm still fretting over the performance implications in a pretty hot piece of code.)
There is an entire page on the tcler's wiki that discusses named arguments: http://wiki.tcl.tk/10702
You can do it yourself with a little creativity. There are several mechanisms that allow you to do this:
procs can define other procs
proc itself behaves just like a proc (defining a function is not special syntax, it is a command call)
procs can use the args argument instead of positional parameter and manually process the list of arguments
you can execute code in any parent stack frame using uplevel
you can pull variables from any parent stack frame using upvar
everything is a string!
etc.
I'm not sure I've listed all the possible mechanisms.
My personal implementation of this idea is optproc: http://wiki.tcl.tk/20066
The implementation itself is quite simple:
proc optproc {name args script} {
proc $name args [
string map [list ARGS $args SCRIPT $script] {
foreach var {ARGS} {
set [lindex $var 0] [lindex $var 1]
}
foreach {var val} $args {
set [string trim $var -] $val
}
SCRIPT
}
]
}
I basically used string manipulation (string map) to directly insert the function body ($script) into the defined function without any substitutions etc. This is to avoid any $ or [] from being processed. There are many ways to do this but my go-to tool is string map.
This is similar to Donald's answer except for two things:
I don't transform args into a dict; instead I manually process args and declare each local variable in a loop.
This is a meta solution. Instead of processing args I created another function (syntax) to create a function that processes args.
Usage (stolen from Donald's answer) would be:
optproc example {abc def} {
puts "abc=$abc, def=$def"
}
But note that my solution is not the only solution. There are many ways to do this limited only by creativity.
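To show what a call looks like, here is a small sketch that repeats optproc from above (so it runs on its own) and defines a hypothetical `describe` command with default values. Each entry in the argument spec is read as a `{name default}` pair by the `lindex` calls in the generated body.

```tcl
# optproc as given above, repeated so this sketch is self-contained.
proc optproc {name args script} {
    proc $name args [
        string map [list ARGS $args SCRIPT $script] {
            foreach var {ARGS} {
                set [lindex $var 0] [lindex $var 1]
            }
            foreach {var val} $args {
                set [string trim $var -] $val
            }
            SCRIPT
        }
    ]
}

# 'describe' is an invented example command; each spec entry is {name default}.
optproc describe {{abc 123} {def 456}} {
    return "abc=$abc, def=$def"
}

puts [describe]                     ;# abc=123, def=456
puts [describe -def 789]            ;# abc=123, def=789
puts [describe -def 246 -abc 135]   ;# abc=135, def=246
```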

perform certain action when variable changes in TCL

In Tcl, do we have any mechanism that will keep polling for a variable change and perform a certain action after that?
I have read about vwait but it is pausing the script. I want the script to be running and in between if the variable value changes, perform certain action.
Kind of asynchronous mode of vwait.
You can attach a trace to a variable so that you can do something immediately whenever the variable is changed (or, depending on flags, read from or deleted). Try out this example:
set abc 123
proc exampleCallback args {
global abc
puts "The variable abc is now $abc"
}
trace add variable abc write exampleCallback
incr abc
incr abc
incr abc
It's possible to trace local variables, but not recommended. Also, internally, the vwait command sets a trace that just trips a flag when the variable is written to; that flag signals the wait to end when the event loop is returned to. It just happens that that trace is set using Tcl's C API, not its script-level API…
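To connect this to the "asynchronous" part of the question, here is a small sketch (the variable and proc names are invented) in which timers change a variable while the event loop runs, and the write trace fires each time. In tclsh the event loop only runs inside something like vwait; in wish it runs automatically.

```tcl
set ::status idle
set ::log {}

# Trace callbacks receive the variable name, an array index (empty for
# scalars), and the operation that triggered the trace.
proc onStatusChange {name1 name2 op} {
    upvar #0 $name1 value
    lappend ::log "status is now $value"
}
trace add variable ::status write onStatusChange

# Simulate other code changing the variable later on.
after 10 {set ::status running}
after 20 {set ::status done}
after 30 {set ::finished 1}

# Run the event loop until the last timer fires.
vwait ::finished
puts [join $::log \n]
# status is now running
# status is now done
```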
You could also use a procedure that reschedules itself with after to keep polling the current value of a variable at a particular interval, and stops rescheduling once a specific condition for the variable is met.
For example:
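set x 1
# Note: the timers only fire while the event loop is running
# (automatically in wish; via vwait in tclsh).
proc CheckVariableValue {} {
    global x
    if {$x >= 5} {
        puts "end"
        return 1
    } else {
        incr x
        puts $x
        # Check again in one second
        after 1000 CheckVariableValue
    }
}
CheckVariableValue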

What purpose does upvar serve?

In the Tcl code that I currently work on, the arguments in each procedure are upvar'ed to a local variable, so to speak, and then used. Something like this:
proc configure_XXXX { params_name_abc params_name_xyz} {
upvar $params_name_abc abc
upvar $params_name_xyz xyz
}
From here on, abc and xyz will be used to do whatever. I read the upvar Tcl wiki but could not understand the advantages. I mean, why can't we just use the variables that have been received as the arguments in the procedure? Could anybody please elaborate?
I mean, why can't we just use the variables that have been received as the arguments in the procedure?
You can. It just gets annoying.
Typically, when you pass the name of a variable to a command, it is so that command can modify that variable. The classic examples of this are the set and incr commands, both of which take the name of a variable as their first argument.
set thisVariable $thisValue
You can do this with procedures too, but then you need to access the variable from the context of the procedure when it is a variable that is defined in the context of the caller of the procedure, which might be a namespace or might be a different local variable frame. To do that, we usually use upvar, which makes an alias from a local variable to a variable in the other context.
For example, here's a reimplementation of incr:
proc myIncr {variable {increment 1}} {
upvar 1 $variable v
set v [expr {$v + $increment}]
}
Why does writing to the local variable v cause the variable in the caller's context to be updated? Because we've aliased it (internally, it is set up via a pointer to the other variable's storage structure; it's very fast once the upvar has been done). The same underlying mechanism is used for global and variable; they're all boiled down to fast variable aliases.
You could do it without, provided you use uplevel instead, but that gets rather more annoying:
proc myIncr {variable {increment 1}} {
set v [uplevel 1 [list set $variable]]
set v [expr {$v + $increment}]
uplevel 1 [list set $variable $v]
}
That's pretty nasty!
Alternatively, supposing we didn't do this at all. Then we'd need to pass the variable in by its value and then assign the result afterwards:
proc myIncr {v {increment 1}} {
set v [expr {$v + $increment}]
return $v
}
# Called like this
set foo [myIncr $foo]
Sometimes the right thing, but a totally different way of working!
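Going back to the remark that global and variable use the same aliasing as upvar, a short sketch (proc and variable names invented, Tcl 8.5 or later for `namespace upvar`) shows the equivalent spellings side by side:

```tcl
set ::total 0
namespace eval ::stats { variable hits 0 }

# Two ways to alias a global variable inside a proc.
proc addWithGlobal {n} { global total; incr total $n }
proc addWithUpvar {n} { upvar #0 total t; incr t $n }

# Two ways to alias a namespace variable inside a proc.
proc hitWithVariable {} { variable ::stats::hits; incr hits }
proc hitWithNsUpvar {} { namespace upvar ::stats hits h; incr h }

addWithGlobal 5
addWithUpvar 3
puts $::total          ;# 8

hitWithVariable
hitWithNsUpvar
puts $::stats::hits    ;# 2
```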
One of the core principles of Tcl is that pretty much anything you can do with a standard library command (such as if or puts or incr) could also be done with a command that you wrote yourself. There are no keywords. Naturally there might be some efficiency concerns and some of the commands might need to be done in another language such as C to work right, but the semantics don't make any command special. They are all just plain commands.
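As a small illustration of that principle, here is a sketch of a home-made control structure; `repeat` is an invented name, and this minimal version does not try to handle break or continue inside the body.

```tcl
# Run a script a fixed number of times in the caller's context.
proc repeat {count body} {
    for {set i 0} {$i < $count} {incr i} {
        uplevel 1 $body
    }
}

set total 0
repeat 3 {incr total 10}
puts $total   ;# 30
```

The body runs via uplevel, so it sees and modifies the caller's variables just as the body of a built-in loop would.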
The upvar command will allow you to modify a variable in a block and make this modification visible from parent block.
Try this:
# a function that will modify the variable passed
proc set_upvar { varname } {
upvar 1 $varname var
puts "var was $var\n"
set var 5
puts "var is now $var\n"
}
# a function that will use the variable but that will not change it
proc set_no_upvar { var } {
puts "var was $var\n"
set var 6
puts "var is now $var\n"
}
set foo 10
# note the lack of '$' here
set_upvar foo
puts "foo is $foo\n"
set_no_upvar $foo
puts "foo is $foo\n"
As mentioned in the comment above, it is often used for passing function arguments by reference (call by reference). A picture is worth a thousand words:
proc f1 {x} {
upvar $x value
set value 0
}
proc f2 {x} {
set x 0
}
set x 1
f1 x
puts $x
set x 1
f2 x
puts $x
will result in:
$ ./call-by-ref.tcl
0
1
With upvar we changed variable x outside of function (from 1 to 0), without upvar we didn't.

About passing around Tcl arrays holding lists

First off: I could fix my problem by myself, but I don't understand why my original solution did not work, and this is what I am interested in. I tried to make a compact example here:
I am dynamically building arrays, each array value being a list. Let's start with the following program:
# 'collector' is a callback function, expecting a container array, and some
# data used to populate the array.
proc generate { collector arr_name } {
eval $collector $arr_name first XXX YYY
eval $collector $arr_name second UUU VVV
}
# This is the callback function used in our example
proc collect { container_name key valuex valuey } {
upvar $container_name container
lappend container($key) [list $valuex $valuey]
}
# Procedure to write out an array
proc dump { arr_name } {
upvar $arr_name arr
puts $arr_name:
foreach key [array names arr] {
puts "$key : $arr($key)"
}
}
# Main program
array set containerA {}
generate [namespace code { collect }] containerA
dump containerA
Up to this point, nothing spectacular. Running this program produces the output
containerA:
second : {UUU VVV}
first : {XXX YYY}
But now let's extend this program somewhat
# Wrapper function to call 'generate' using a fixed collector function
# ("Currying" the first argument to generate)
proc coll_gen { container_name } {
upvar $container_name container
generate [namespace code { collect }] $container_name ; # This works
# This would not work:
#generate [namespace code { collect }] container
}
array set containerB {}
coll_gen containerB
dump containerB
As written here, this would work too, and we get the output
containerB:
second : {UUU VVV}
first : {XXX YYY}
Now to my question: As you already can guess from the comments in the code, I had first written coll_gen as
proc coll_gen { container_name } {
upvar $container_name container
generate [namespace code { collect }] container
}
My reasoning was that, since container is an alias to the array, the name of which was passed via the parameter list, I could equally well pass on the name of this alias to the 'generate' function. However, when I run the code (Tcl 8.5), it turns out that containerB is empty.
Why is it that it didn't work this way too?
The issue is one of evaluation scope.
Let's write out the call stack at the point where you're inside collect in the case where things don't work:
::
coll_gen containerB
generate {namespace inscope :: { collect }} container
namespace inscope :: { collect } container first XXX YYY
collect container first XXX YYY
Whoops! What's that namespace inscope? Where are the inner layers upvaring to? The result of namespace code is a wrapping with namespace inscope (which you shouldn't write directly; use namespace code or namespace eval) that arranges for the script formed by appending the other arguments (with appropriate metacharacter protection) to be run in the given namespace (:: in your case, I assume). This “run in the given namespace” requires adding another stack frame, and that's what the upvar is then poking into (it's probably created a global array called container, since the namespace inscope frame is a namespace-coupled one, not a “procedure local” stack frame).
You could use upvar 2 or maybe even upvar 3 (I'm not quite sure which) inside collect to work around this, but that's horrific and fragile.
You're better off writing your code like this:
proc coll_gen { container_name } {
upvar $container_name container
generate [namespace which collect] container
}
proc generate { collector arr_name } {
upvar 1 $arr_name collectorVar
eval $collector collectorVar first XXX YYY
eval $collector collectorVar second UUU VVV
}
With that, the call stack will become this:
::
coll_gen containerB
generate ::collect container
::collect collectorVar first XXX YYY
Annotating with what the array is called inside each level…
:: ### containerB
coll_gen containerB ### container (→ containerB)
generate ::collect container ### collectorVar (→ container → containerB)
::collect collectorVar first XXX YYY ### container (→ collectorVar → container → containerB)
Tcl is very literal, and I find it helps to think in terms of strings as far as possible, similar to how you think in terms of symbols when using Lisp but even more pervasive. When you use upvar, what you get isn't anything like a reference variable in some other languages. You just get to refer to a Tcl_Obj that was originally referenced in another stack frame (or the same stack frame if you upvar 0) using a local name. In the invocation
generate [namespace code { collect }] container
the second argument to generate doesn't carry over any kind of reference to the Tcl_Obj that container referred to inside coll_gen: the argument is just a Tcl_Obj containing the string "container". If that string is equal to a valid name in one of the stack frames, you can upvar the name to get/be able to set a value in the associated object (and if you've managed the stack frames correctly, it will even be the object you wanted to access).
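The effect of an extra stack frame on a name that is passed along as a plain string can be sketched with ordinary procs (all names here are invented; the relay proc only stands in for whatever adds the extra frame):

```tcl
proc setViaName {name} {
    # Resolves $name one level up from wherever setViaName was called.
    upvar 1 $name v
    set v "changed"
}

proc direct {} {
    set data "original"
    setViaName data      ;# one level up is this frame
    return $data
}

proc relay {name} {
    setViaName $name     ;# one level up is now relay's frame
}

proc indirect {} {
    set data "original"
    relay data           ;# the string "data" names nothing in relay's frame
    return $data
}

puts [direct]     ;# changed
puts [indirect]   ;# original
```

In the second case upvar quietly creates a new local variable called data inside relay, which disappears when relay returns.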
The commands upvar and uplevel have important uses, but you really don't need them here. If you just go with names and don't try to drag your objects with you through each stack frame, your code becomes easier to read and easier to maintain:
proc generate args {
# use eval $args first XXX YYY if you have Tcl 8.4 or earlier
{*}$args first XXX YYY
{*}$args second UUU VVV
}
proc collect {container_name key args} {
lappend ${container_name}($key) $args
}
proc dump arr_name {
puts $arr_name:
dict for {key val} [array get $arr_name] {
puts "$key : $val"
}
}
proc coll_gen container_name {
generate [namespace code collect] $container_name
}
array set containerB {}
set container_name [namespace which -variable containerB]
foreach cmd {coll_gen dump} {$cmd $container_name}
A variable created (by assignment or the variable command) in the global scope will be a namespace variable that exists independent of stack frames: every proc in the program will be able to reach it using an absolute reference (such as created by namespace which or simply prepending the namespace to the variable name).
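A brief sketch of reaching a namespace variable by its absolute name from any depth of the call stack (the namespace `::app` and the proc names are made up for this example):

```tcl
namespace eval ::app {
    variable settings
    array set settings {}
}

# Both helpers work purely with the fully qualified array name.
proc store {arrName key value} {
    set ${arrName}($key) $value
}
proc fetch {arrName key} {
    return [set ${arrName}($key)]
}

# An intermediate proc adds a stack frame, but the name still resolves.
proc viaHelper {arrName} {
    store $arrName color blue
    return [fetch $arrName color]
}

puts [viaHelper ::app::settings]   ;# blue
puts $::app::settings(color)       ;# blue
```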
Local variables, OTOH, are disambiguated by name and stack frame. Within a stack frame, every use of a certain variable name will reference the same object. In the simple case, a proc will execute in one stack frame only, but the uplevel command may cause some piece of code to execute in another stack frame. In that case, the same name may be used to refer to different objects in the same code body. There is no ambiguity, though: the level of execution determines what object a name refers to.
When using the upvar command, two different name + stack frame permutations can be used to reference the same object residing on some stack level, or the same name can be used to reference objects from different stack levels:
proc foo {} {set abc foo ; bar}
proc bar {} {set abc bar ; baz}
proc baz {} {set abc baz ; qux}
proc qux {} {
set abc qux
foreach n {3 2 1 0} {
upvar $n abc var
lappend res $var
}
puts [join $res { }]
}
foo
# => foo bar baz qux
Again, there is never any ambiguity, since the name + stack level designation makes the identity of the object clear.
The uplevel and upvar commands can be wonderfully convenient as long as you can keep the stack frames straight, and I for one use them all the time. As you saw in Donal's answer, though, even a Tcl ace can't always keep the stack frames straight, and in those cases namespace variables are much simpler and safer.
Documentation: array, dict, foreach, lappend, namespace, proc, puts, set, {*}, uplevel, upvar

SWIG + TCL flags

Will the ownership of a pointer last only in the block in which we set the -acquire flag for it?
Eg.:
{
{
$xyz -acquire
}
}
Firstly, Tcl doesn't define blocks with {/}. The scope is defined by the procedure call or namespace.
Secondly, Tcl commands are always defined to have lifetime that corresponds to the namespace that owns them; they are never† scoped to a procedure call. They must be manually disposed one way or another; there are two ways to do this manual disposal: calling $xyz -delete or rename $xyz "" (or to anything else that is the empty string). Frankly, I prefer the first method.
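To make the lifetime point concrete, here is a sketch using an ordinary proc as a stand-in for an object command (`::_demo_handle` is a made-up name and has nothing to do with what SWIG generates):

```tcl
proc makeHandle {} {
    # Create a command from inside a procedure call.
    proc ::_demo_handle {} { return "alive" }
    return ::_demo_handle
}

set h [makeHandle]
set answer [$h]
puts $answer      ;# alive (makeHandle has returned, the command remains)

# Explicit disposal: renaming a command to the empty string deletes it.
rename $h ""
set remaining [llength [info commands $h]]
puts $remaining   ;# 0
```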
But if you do want the lifespan to be tied to a procedure call, that's actually quite possible to do. It just requires some extra code:
proc tieLifespan args {
upvar 1 "____lifespan handle" v
if {[info exists v]} {
trace remove variable v unset $v
set args [concat [lindex $v 1] $args]
}
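# Build the callback as a two-element list so that all handles arrive
# as the single 'handles' argument of Tie-Garbage-Collect
set v [list Tie-Garbage-Collect $args]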
trace add variable v unset $v
}
proc Tie-Garbage-Collect {handles var dummy1 dummy2} {
upvar 1 $var v
foreach handle $handles {
# According to SWIG docs, this is how to do explicit destruction
$handle -delete
# Alternatively: rename $handle ""
}
}
That you'd use like this in the scope that you want to tie $xyz's life to:
tieLifespan $xyz
# You can register multiple objects at once too
And that's it. When the procedure (or procedure-like entity if you're using Tcl 8.5 or later) exits, the tied object will be deleted. It's up to you to decide if that's what you really want; if you later disown the handle, you probably ought to not use tying.
† Well, hardly ever; some extensions do nasty things. Discount this statement as it doesn't apply to SWIG-generated code!