About passing around Tcl arrays holding lists - tcl

First off: I could fix my problem by myself, but I don't understand why my original solution did not work, and this is what I am interested in. I tried to make a compact example here:
I am dynamically building arrays, each array value being a list. Let's start with the following program:
# 'collector' is a callback function, expecting a container array, and some
# data used to populate the array.
proc generate { collector arr_name } {
eval $collector $arr_name first XXX YYY
eval $collector $arr_name second UUU VVV
}
# This is the callback function used in our example
proc collect { container_name key valuex valuey } {
upvar $container_name container
lappend container($key) [list $valuex $valuey]
}
# Procedure to write out an array
proc dump { arr_name } {
upvar $arr_name arr
puts $arr_name:
foreach key [array names arr] {
puts "$key : $arr($key)"
}
}
# Main program
array set containerA {}
generate [namespace code { collect }] containerA
dump containerA
Up to this point, nothing spectacular. Running this program produces the output
containerA:
second : {UUU VVV}
first : {XXX YYY}
But now let's extend this program somewhat
# Wrapper function to call 'generate' using a fixed collector function
# ("Currying" the first argument to generate)
proc coll_gen { container_name } {
upvar $container_name container
generate [namespace code { collect }] $container_name ; # This works
# This would not work:
#generate [namespace code { collect }] container
}
array set containerB {}
coll_gen containerB
dump containerB
As written here, this would work too, and we get the output
containerB:
second : {UUU VVV}
first : {XXX YYY}
Now to my question: As you already can guess from the comments in the code, I had first written coll_gen as
proc coll_gen { container_name } {
upvar $container_name container
generate [namespace code { collect }] container
}
My reasoning was that, since container is an alias to the array, the name of which was passed via the parameter list, I could equally well pass on the name of this alias to the 'generate' function. However, when I run the code (Tcl 8.5), it turns out that containerB is empty.
Why is it that it didn't work this way too?

The issue is one of evaluation scope.
Let's write out the call stack at the point where you're inside collect in the case where things don't work:
::
coll_gen containerB
generate {namespace inscope :: { collect }} container
namespace inscope :: { collect } container first XXX YYY
collect container first XXX YYY
Whoops! What's that namespace inscope? Where are the inner layers upvaring to? The result of namespace code is a wrapping with namespace inscope (which you shouldn't write directly; use namespace code or namespace eval) that arranges for the script formed by appending the other arguments (with appropriate metacharacter protection) to be run in the given namespace (:: in your case, I assume). This “run in the given namespace” requires adding another stack frame, and that's what the upvar is then poking into (it's probably created a global array called container, since the namespace inscope frame is a namespace-coupled one, not a “procedure local” stack frame).
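To see the extra frame in action, here is a minimal sketch (Tcl 8.5 or later). The names try_local and local_store are made up for the illustration, and collect is a simplified stand-in for the question's callback. The array that was meant to be local to try_local ends up in the global namespace instead:

```tcl
# Simplified stand-in for the question's callback.
proc collect {container_name key args} {
    upvar 1 $container_name container
    lappend container($key) $args
}

# Hypothetical caller that wants to fill a procedure-local array.
proc try_local {} {
    # [namespace code collect] wraps the callback in "namespace inscope".
    {*}[namespace code collect] local_store first XXX YYY
    # upvar 1 inside collect saw the namespace frame, not this frame.
    return [info exists local_store]
}

puts [namespace code collect]      ;# shows the namespace inscope wrapper
puts [try_local]                   ;# 0: nothing local was created
puts [info exists ::local_store]   ;# 1: the data went to the global namespace
```

This is also why the original containerA example worked: containerA happened to be a global array, so landing in the global namespace was harmless.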
You could use upvar 2 or maybe even upvar 3 (I'm not quite sure which) inside collect to work around this, but that's horrific and fragile.
You're better off writing your code like this:
proc coll_gen { container_name } {
upvar $container_name container
generate [namespace which collect] container
}
proc generate { collector arr_name } {
upvar 1 $arr_name collectorVar
eval $collector collectorVar first XXX YYY
eval $collector collectorVar second UUU VVV
}
With that, the call stack will become this:
::
coll_gen containerB
generate ::collect container
::collect collectorVar first XXX YYY
Annotating with what the array is called inside each level…
:: ### containerB
coll_gen containerB ### container (→ containerB)
generate ::collect container ### collectorVar (→ container → containerB)
::collect collectorVar first XXX YYY ### container (→ collectorVar → container → containerB)
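As a quick check that this chain of aliases no longer depends on the array being global, here is a sketch that runs the same three procedures against a procedure-local array. The names demo and store are invented for the test, and {*} (Tcl 8.5+) is used in place of eval:

```tcl
proc collect {container_name key valuex valuey} {
    upvar 1 $container_name container
    lappend container($key) [list $valuex $valuey]
}
proc generate {collector arr_name} {
    upvar 1 $arr_name collectorVar
    {*}$collector collectorVar first XXX YYY
    {*}$collector collectorVar second UUU VVV
}
proc coll_gen {container_name} {
    upvar 1 $container_name container
    generate [namespace which collect] container
}

# Hypothetical driver: the array lives only in this procedure's frame.
proc demo {} {
    array set store {}
    coll_gen store
    return [lsort [array names store]]
}

puts [demo]                  ;# first second
puts [info exists ::store]   ;# 0: nothing leaked into the global namespace
```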

Tcl is very literal, and I find it helps to think in terms of strings as far as possible, similar to how you think in terms of symbols when using Lisp but even more pervasive. When you use upvar, what you get isn't anything like a reference variable in some other languages. You just get to refer to a Tcl_Obj that was originally referenced in another stack frame (or the same stack frame if you upvar 0) using a local name. In the invocation
generate [namespace code { collect }] container
the second argument to generate doesn't carry over any kind of reference to the Tcl_Obj that container referred to inside coll_gen: the argument is just a Tcl_Obj containing the string "container". If that string is equal to a valid name in one of the stack frames, you can upvar the name to get/be able to set a value in the associated object (and if you've managed the stack frames correctly, it will even be the object you wanted to access).
The commands upvar and uplevel have important uses, but you really don't need them here. If you just go with names and don't try to drag your objects with you through each stack frame, your code becomes easier to read and easier to maintain:
proc generate args {
# use eval $args first XXX YYY if you have Tcl 8.4 or earlier
{*}$args first XXX YYY
{*}$args second UUU VVV
}
proc collect {container_name key args} {
lappend ${container_name}($key) $args
}
proc dump arr_name {
puts $arr_name:
dict for {key val} [array get $arr_name] {
puts "$key : $val"
}
}
proc coll_gen container_name {
generate [namespace code collect] $container_name
}
array set containerB {}
set container_name [namespace which -variable containerB]
foreach cmd {coll_gen dump} {$cmd $container_name}
A variable created (by assignment or the variable command) in the global scope will be a namespace variable that exists independent of stack frames: every proc in the program will be able to reach it using an absolute reference (such as created by namespace which or simply prepending the namespace to the variable name).
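A small sketch of that point, with invented names (counter, bump, indirect): once you hold the absolute name, it doesn't matter how many stack frames sit between the variable and the code that uses it.

```tcl
set ::counter 0   ;# hypothetical namespace (global) variable

proc bump {varname} {
    # No upvar needed: an absolute name resolves the same way in every frame.
    incr $varname
}
proc indirect {varname} {
    bump $varname   ;# one more stack level changes nothing
}

set name [namespace which -variable counter]
indirect $name
puts "$name = $::counter"   ;# ::counter = 1
```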
Local variables, OTOH, are disambiguated by name and stack frame. Within a stack frame, every use of a certain variable name will reference the same object. In the simple case, a proc will execute in one stack frame only, but the uplevel command may cause some piece of code to execute in another stack frame. In that case, the same name may be used to refer to different objects in the same code body. There is no ambiguity, though: the level of execution determines what object a name refers to.
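For instance, in this sketch (inner and outer are made-up names) the same name x inside one procedure body refers to two different variables, depending on the level at which the code runs:

```tcl
proc inner {} {
    set x inner-x
    set mine   $x                    ;# runs in inner's frame
    set theirs [uplevel 1 {set x}]   ;# runs in the caller's frame
    return [list $mine $theirs]
}
proc outer {} {
    set x outer-x
    inner
}

puts [outer]   ;# inner-x outer-x
```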
When using the upvar command, two different name + stack frame permutations can be used to reference the same object residing on some stack level, or the same name can be used to reference objects from different stack levels:
proc foo {} {set abc foo ; bar}
proc bar {} {set abc bar ; baz}
proc baz {} {set abc baz ; qux}
proc qux {} {
set abc qux
foreach n {3 2 1 0} {
upvar $n abc var
lappend res $var
}
puts [join $res { }]
}
foo
# => foo bar baz qux
Again, there is never any ambiguity, since the name + stack level designation makes the identity of the object clear.
The uplevel and upvar commands can be wonderfully convenient as long as you can keep the stack frames straight, and I for one use them all the time. As you saw in Donal's answer, though, even a Tcl ace can't always keep the stack frames straight, and in those cases namespace variables are much simpler and safer.
Documentation: array, dict, foreach, lappend, namespace, proc, puts, set, {*}, uplevel, upvar

Related

Is there any Tcl package/add-on that handles named arguments?

In Python, Ruby 2.0, Perl 6, and some hardware description languages, one can use named arguments. See this example. This makes the code more readable, easier to maintain, etc. Is there a way to get this in TCL 8.6, perhaps through an extension, other than using a dictionary as a workaround?
In 8.6, use a dictionary parsed from args. The dict merge command can help:
proc example args {
if {[llength $args] % 2} {
return -code error "wrong # args: should be \"example ?-abc abc? ?-def def?\""
}
set defaults {
-abc 123
-def 456
}
set args [dict merge $defaults $args]
set abc [dict get $args -abc]
set def [dict get $args -def]
puts "abc=$abc, def=$def"
}
example; # abc=123, def=456
example -abc 789; # abc=789, def=456
example -def 789; # abc=123, def=789
example -def 246 -abc 135; # abc=135, def=246
You can go further than that with verifying (the tcl::prefix command can help) but it's a lot more work and doesn't buy you a lot more in production code. Not that that has stopped people from trying.
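As a rough idea of what that verification could look like, here is a sketch that uses tcl::prefix match (Tcl 8.6) to reject unknown option names and accept unambiguous abbreviations. The procedure name example_checked is invented, and it returns the string rather than printing it so that it is easier to test:

```tcl
proc example_checked args {
    if {[llength $args] % 2} {
        return -code error "wrong # args: should be \"example_checked ?-abc abc? ?-def def?\""
    }
    set opts {-abc 123 -def 456}
    set known [dict keys $opts]
    foreach {key value} $args {
        # Errors out on unknown or ambiguous names; returns the full option name.
        set full [tcl::prefix match -message "option" $known $key]
        dict set opts $full $value
    }
    return "abc=[dict get $opts -abc], def=[dict get $opts -def]"
}

puts [example_checked -d 789]               ;# abc=123, def=789
puts [catch {example_checked -xyz 1} msg]   ;# 1
puts $msg                                   ;# the error raised by tcl::prefix
```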
There are two proposals to add full named argument handling (TIP #457, TIP #479) to 8.7 at the moment, but I'm not sure that either has really gained traction. (The problem from my perspective is the extra runtime cost that has to be borne by code that doesn't volunteer to support named arguments. There might be other issues too, such as disagreement over preferred syntax; I've not paid so much attention to that as I'm still fretting over the performance implications in a pretty hot piece of code.)
There is an entire page on the tcler's wiki that discusses named arguments: http://wiki.tcl.tk/10702
You can do it yourself with a little creativity. There are several mechanisms that allow you to do this:
procs can define other procs
proc behaves just like any other command (the function definition system is not a syntax, it is a function call)
procs can use the args argument instead of positional parameter and manually process the list of arguments
you can execute code in any parent stack frame using uplevel
you can pull variables from any parent stack frame using upvar
everything is a string!
etc.
I'm not sure I've listed all the possible mechanisms.
My personal implementation of this idea is optproc: http://wiki.tcl.tk/20066
The implementation itself is quite simple:
proc optproc {name args script} {
proc $name args [
string map [list ARGS $args SCRIPT $script] {
foreach var {ARGS} {
set [lindex $var 0] [lindex $var 1]
}
foreach {var val} $args {
set [string trim $var -] $val
}
SCRIPT
}
]
}
I basically used string manipulation (string map) to directly insert the function body ($script) into the defined function without any substitutions etc. This is to avoid any $ or [] from being processed. There are many ways to do this but my go-to tool is string map.
This is similar to Donal's answer except for two things:
I don't transform args into a dict; instead I manually process args and declare each local variable in a loop.
This is a meta solution. Instead of processing args I created another function (syntax) to create a function that processes args.
Usage (stolen from Donal's answer) would be:
optproc example {abc def} {
puts "abc=$abc, def=$def"
}
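Going by the implementation above, each entry in the argument list is read with lindex 0 and lindex 1, so an entry may also be a name/default pair. Here's a sketch of that usage with the definition repeated so the block runs on its own. The procedure greet and its options are made up for the illustration:

```tcl
proc optproc {name args script} {
    proc $name args [
        string map [list ARGS $args SCRIPT $script] {
            foreach var {ARGS} {
                set [lindex $var 0] [lindex $var 1]
            }
            foreach {var val} $args {
                set [string trim $var -] $val
            }
            SCRIPT
        }
    ]
}

# Hypothetical example: two options, each with a default value.
optproc greet {{greeting Hello} {name World}} {
    return "$greeting, $name!"
}

puts [greet]                             ;# Hello, World!
puts [greet -name Tcl]                   ;# Hello, Tcl!
puts [greet -greeting Hi -name there]    ;# Hi, there!
```

A bare name such as abc, as in the usage above, simply gets the empty string as its default.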
But note that my solution is not the only solution. There are many ways to do this limited only by creativity.

How to effectively override a procedure-local variable in TCL

So I have the following situation:
$ ls -l
-r--r----- 1.tcl
-rw-rw---- 2.tcl
$ cat 1.tcl
proc foo {args} {
puts "$bar"
}
and I need to make 1.tcl print something other than "can't read \"bar\"". In a good programming language, the obvious solution would be
$ cat > 2.tcl
set -global bar "hello, world"
foo
What would be a reasonable workaround in TCL? Unfortunately the real foo is a long function that I can't really make a copy of or sed to a temporary file at runtime.
You can do this for your specific example
$ cat 2.tcl
source 1.tcl
set bar "Hello, bar!"
# add a "global bar" command to the foo procedure
proc foo [info args foo] "global bar; [info body foo]"
foo
$ tclsh 2.tcl
Hello, bar!
Clearly this doesn't scale very well.
If the variable is simply undefined, the easiest way would be to patch the procedure with a definition:
proc foo [info args foo] "set bar \"hello, world\" ; [info body foo]"
You can also accomplish this using a read trace and a helper command. This avoids a problem with the patch above: if the procedure assigns to the variable itself, that local assignment destroys the value you wanted to inject.
The original procedure, with an added command that sets the local variable to a value which is later printed.
proc foo args {
set bar foobar
puts "$bar"
}
% foo
foobar
Create a global variable (it doesn't matter if the name is the same or not).
set bar "hello, world"
Create a helper command that gets the name of the local variable, links to it, and assigns the value of the global variable to it. Since we already know the name we could hardcode it in the procedure, but this is more flexible.
proc readbar {name args} {
upvar 1 $name var
global bar
set var $bar
}
Add the trace to the body of the foo procedure. The trace will fire whenever the local variable bar is read, i.e. something attempts to retrieve its value. When the trace fires, the command readbar is called: it overwrites the current value of the variable with the globally set value.
proc foo [info args foo] "trace add variable bar read readbar; [info body foo]"
% foo
hello, world
If one doesn't want to pollute the namespace with the helper command, one can use an anonymous function instead:
proc foo [info args foo] [format {trace add variable bar read {apply {{name args} {
upvar 1 $name var
global bar
set var $bar
}}} ; %s} [info body foo]]
Documentation: apply, format, global, info, proc, puts, set, trace, upvar, Syntax of Tcl regular expressions
source 1.tcl
try {
foo
} on error {err res} {
set einfo [dict get $res -errorinfo]
if { [regexp {no such variable} $einfo] } {
puts "hello, world"
return -code 0
} else {
puts $einfo
return -code [dict get $res -code]
}
}
Tcl's procedures do not resolve variables to anything other than local variables by default. You have to explicitly ask for them to refer to something else (e.g., with global, variable or upvar). This means that it's always possible to see at a glance whether non-local things are happening, and it also means that your script won't work as written.
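A minimal sketch of that rule, using two invented procedures:

```tcl
set ::bar "hello, world"

proc no_declaration {} {
    # Unqualified 'bar' is a local name here, and no local has been set.
    return [info exists bar]
}
proc with_declaration {} {
    global bar   ;# explicitly link the local name to the global variable
    return $bar
}

puts [no_declaration]     ;# 0
puts [with_declaration]   ;# hello, world
```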
It's possible to override this behaviour with a variable resolver, but Tcl doesn't really expose that API in its script interface. Some extensions do more. For example, it might work to use [incr Tcl] (i.e., itcl) as that does that sort of thing for variables in its objects. I can't remember if Expect also does this, or if that uses special-cased code for handling its variables.
Of course, you could get really sneaky and override the behaviour of proc.
rename proc real_proc
real_proc proc {name arguments body} {
uplevel 1 [list real_proc $name $arguments "global bar;$body"]
}
That's rather nasty though.

What purpose does upvar serve?

In the TCL code that I currently work on, the arguments in each procedure are upvar'ed to a local variable, so to speak, and then used. Something like this:
proc configure_XXXX { params_name_abc params_name_xyz} {
upvar $params_name_abc abc
upvar $params_name_xyz xyz
}
From here on, abc and xyz will be used to do whatever. I read the upvar TCL wiki but could not understand the advantages. I mean why can't we just use the variables that have been received as the arguments in the procedure. Could anybody please elaborate?
I mean why can't we just use the variables that have been received as the arguments in the procedure.
You can. It just gets annoying.
Typically, when you pass the name of a variable to a command, it is so that command can modify that variable. The classic examples of this are the set and incr commands, both of which take the name of a variable as their first argument.
set thisVariable $thisValue
You can do this with procedures too, but then you need to access the variable from the context of the procedure when it is a variable that is defined in the context of the caller of the procedure, which might be a namespace or might be a different local variable frame. To do that, we usually use upvar, which makes an alias from a local variable to a variable in the other context.
For example, here's a reimplementation of incr:
proc myIncr {variable {increment 1}} {
upvar 1 $variable v
set v [expr {$v + $increment}]
}
Why does writing to the local variable v cause the variable in the caller's context to be updated? Because we've aliased it (internally, it's set up via a pointer to the other variable's storage structure; it's very fast once the upvar has been done). The same underlying mechanism is used for global and variable; they're all boiled down to fast variable aliases.
You could do it without, provided you use uplevel instead, but that gets rather more annoying:
proc myIncr {variable {increment 1}} {
set v [uplevel 1 [list set $variable]]
set v [expr {$v + $increment}]
uplevel 1 [list set $variable $v]
}
That's pretty nasty!
Alternatively, supposing we didn't do this at all. Then we'd need to pass the variable in by its value and then assign the result afterwards:
proc myIncr {v {increment 1}} {
set v [expr {$v + $increment}]
return $v
}
# Called like this
set foo [myIncr $foo]
Sometimes the right thing, but a totally different way of working!
One of the core principles of Tcl is that pretty much anything you can do with a standard library command (such as if or puts or incr) could also be done with a command that you wrote yourself. There are no keywords. Naturally there might be some efficiency concerns and some of the commands might need to be done in another language such as C to work right, but the semantics don't make any command special. They're all just plain commands.
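To illustrate, here is a minimal sketch of a home-made control structure. The command name repeat is invented, and this is only a bare-bones version: it just runs its body in the caller's scope with uplevel, the way a built-in loop would.

```tcl
proc repeat {count body} {
    for {set i 0} {$i < $count} {incr i} {
        # Evaluate the body where the caller's variables live.
        uplevel 1 $body
    }
}

set total 0
repeat 3 { incr total 5 }
puts $total   ;# 15
```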
The upvar command lets a procedure modify a variable and make that modification visible in the caller's scope.
Try this:
# a function that will modify the variable passed
proc set_upvar { varname } {
upvar 1 $varname var
puts "var was $var\n"
set var 5
puts "var is now $var\n"
}
# a function that will use the variable but that will not change it
proc set_no_upvar { var } {
puts "var was $var\n"
set var 6
puts "var is now $var\n"
}
set foo 10
# note the lack of '$' here
set_upvar foo
puts "foo is $foo\n"
set_no_upvar $foo
puts "foo is $foo\n"
As mentioned in a comment above, it is often used for passing function arguments by reference (call by reference). A picture is worth a thousand words:
proc f1 {x} {
upvar $x value
set value 0
}
proc f2 {x} {
set x 0
}
set x 1
f1 x
puts $x
set x 1
f2 x
puts $x
will result in:
$ ./call-by-ref.tcl
0
1
With upvar we changed variable x outside of function (from 1 to 0), without upvar we didn't.

How to find a procedure by using the code inside the proc?

Is it possible to find the procedure name by using the content of that procedure?
For example,
proc test {args} {
set varA "exam"
puts "test program"
}
Using the statement set varA, is it possible to find its procedure name test?
I ask because I need to find a procedure whose output I know: it prints something, and I need to find the procedure from that.
I tried many approaches, such as info frame and command, but nothing helped.
Is it possible to find the procedure name by using the content of that procedure?
Yes. You use info level 0 to get the argument words to the current procedure (or info level -1 to get its caller's argument words). The first word is the command name, as resolved in the caller's context. That might be enough, but if not, you can use namespace which inside an uplevel 1 to get the fully-qualified name.
proc foo {args} {
set name [lindex [info level 0] 0]
set FQname [uplevel 1 [list namespace which $name]]
# ...
}
Note that this does not give you the main name in all circumstances. If you're using aliases or imported commands, the name you'll get will vary. Mostly that doesn't matter too much.
With info procs and info body, we can list the procedures and get the content of each one, which may help with what you want.
The following procedure searches for the given word in the global namespace and in its direct child namespaces (nested namespaces are not visited). You can change it to search in a particular namespace as well. The search word is used as a regular expression, so the match can be made case insensitive by adding -nocase to regexp. It returns the list of procedure names whose bodies contain the search word.
proc getProcNameByContent {searchWord} {
set resultProcList {}
set nslist [namespace children ::]; # Direct children of the global namespace
lappend nslist ::; # Add the global namespace as well
foreach ns $nslist {
if {$ns eq "::"} {
set currentScopeProcs [info procs $ns*]
} else {
set currentScopeProcs [info procs ${ns}::*]
}
foreach myProc $currentScopeProcs {
if {[regexp $searchWord [info body $myProc]]} {
puts "found in $myProc"
lappend resultProcList $myProc
}
}
}
return $resultProcList
}
Example
% proc x {} {
puts hai
}
% proc y {} {
puts hello
}
% proc z {} {
puts world
}
% namespace eval dinesh {
proc test {} {
puts "world is amazing"
}
}
%
% getProcNameByContent world
found in ::dinesh::test
found in ::z
::dinesh::test ::z
%
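If the procedure you're looking for might live in a nested namespace, one possible variation is to walk the namespace tree recursively. This is only a sketch: the names allNamespaces and findProcsByContent are invented here, and it uses -nocase to show the case-insensitive variant mentioned above.

```tcl
# Return the given namespace and all of its descendants.
proc allNamespaces {{ns ::}} {
    set result [list $ns]
    foreach child [namespace children $ns] {
        lappend result {*}[allNamespaces $child]
    }
    return $result
}

# Return the fully qualified names of all procs whose body matches pattern.
proc findProcsByContent {pattern} {
    set found {}
    foreach ns [allNamespaces] {
        # "::" becomes "::*", "::a::b" becomes "::a::b::*"
        foreach p [info procs [string trimright $ns :]::*] {
            if {[regexp -nocase -- $pattern [info body $p]]} {
                lappend found $p
            }
        }
    }
    return $found
}

namespace eval ::outer::inner {
    proc deep {} { puts "World is amazing" }
}
puts [findProcsByContent {world is amazing}]   ;# includes ::outer::inner::deep
```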

TCL/Expect - $argv VS $::argv VS {*}$argv

What is difference between following variables:
$argv
$::argv
{*}$argv
The first two can be printed with the puts command, and they return the following output:
param0 param1 {param 2} param3
param0 param1 {param 2} param3
The arguments that were passed to the script were:
param0 param1 "param 2" param3
The last one ends with an error:
wrong # args: should be "puts ?-nonewline? ?channelId? string"
while executing
"puts {*}$argv"
I've done some research in this area using following code:
if {[array exists $argv]} {
puts "\$argv IS ARRAY"
} else {
puts "\$argv IS NOT AN ARRAY"
}
if {[string is list $argv]} {
puts "\$argv IS LIST"
} else {
puts "\$argv IS NOT LIST"
}
if {[array exists $::argv]} {
puts "\$::argv IS ARRAY"
} else {
puts "\$::argv IS NOT AN ARRAY"
}
if {[string is list $::argv]} {
puts "\$::argv IS LIST"
} else {
puts "\$::argv IS NOT LIST"
}
if {[array exists {*}$argv]} {
puts "{*}\$::argv IS ARRAY"
} else {
puts "{*}\$::argv IS NOT AN ARRAY"
}
if {[string is list {*}$argv]} {
puts "{*}\$::argv IS LIST"
} else {
puts "{*}\$::argv IS NOT LIST"
}
The last two if-else statements, which contain {*}$argv, end with the following error:
wrong # args: should be "array exists arrayName"
while executing
"array exists {*}$argv"
invoked from within
"if {[array exists {*}$argv]} {
puts "{*}\$::argv IS ARRAY"
} else {
puts "{*}\$::argv IS NOT AN ARRAY"
}"
Commenting out those two statements shows that $argv and $::argv are lists:
$argv IS NOT AN ARRAY
$argv IS LIST
$::argv IS NOT AN ARRAY
$::argv IS LIST
Both those lists can be traversed as standard list e.g.:
foreach item $argv {
puts $item
}
or
foreach item $::argv {
puts $item
}
Attempt to traverse {*}$argv the same way leads to following error again:
wrong # args: should be "foreach varList list ?varList list ...? command"
while executing
"foreach item {*}$argv {
puts $item
}"
I am using TCL version 8.5
What is difference between following variables:
$argv
$::argv
{*}$argv
There are two types of difference here.
Unqualified and Qualified Variables
In Tcl, unqualified and qualified variables can be a bit different, but it depends on the context (in a pretty simple way though). Firstly, a qualified variable name is one that contains at least one :: within it. If the variable name (the thing after the $ — in Tcl, $ just means “read this variable now and use its contents here”) starts with ::, it is an absolute variable name, otherwise a qualified variable name is a relative variable name and is resolved with respect to the current namespace (which you can find out with namespace current if you're uncertain). Absolute variable names always refer to the same thing, in all contexts. Thus, ::argv is an absolute variable name, and indeed it refers to a variable called argv in the top-level, global namespace. That happens to be a variable that tclsh and wish write their arguments into.
But if there is no ::, it is an unqualified variable name. If you are not in a procedure (or procedure-like thing, which includes a lambda term such as you'd use with apply or the methods defined by various OO systems) then the variable is (mostly) treated as if it was a relative variable name and resolved with respect to the current namespace. namespace eval and namespace code are two of the things that can change the current namespace (the others are more obscure). All this is provided you use variable to declare all your namespace variables. Otherwise, you can hit some weird problems with variable resolution which are really nasty. So do use variable. Really.
If you are in a procedure(-like entity) though, that unqualified name refers to a local variable, whose life is coupled to that of the stack frame pushed on the stack when the procedure is entered. That can be linked to variables in other scopes (including the global namespace) through various commands: global, upvar, variable, and namespace upvar. However, the actual resolution of the variable is to something local.
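Here's a small sketch of those rules. The namespace nsdemo and its contents are invented for the illustration:

```tcl
namespace eval nsdemo {
    variable setting "namespace value"

    proc unqualified {} {
        # Inside a procedure, 'setting' is a local name, and no local is set.
        return [info exists setting]
    }
    proc linked {} {
        variable setting   ;# link the local name to ::nsdemo::setting
        return $setting
    }
}

puts [nsdemo::unqualified]   ;# 0
puts [nsdemo::linked]        ;# namespace value
puts $::nsdemo::setting      ;# namespace value (absolute name works anywhere)
```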
Finally, there might also be a custom variable resolver in place. Since you're using Tcl 8.5, the place where you're most likely to see this in use is if you're using Incr Tcl, an object system for Tcl. Custom variable resolvers can do some complex stuff. (If you were using Tcl 8.6, the most likely place to see a custom variable resolver at work is in TclOO. The variable resolver there is very conservative and cautious, but allows local variables to be bound to object variables without having to explicitly declare this in each method).
Normal and Expanding Substitution
The difference between $argv and {*}$argv is totally different.
$argv is a normal substitution. It says “read this variable here and use the contents of it instead”. It can be used in the middle of a word, so $argv$argv$argv is a thing, consisting of the concatenation of the contents of the argv variable three times.
{*}, when placed at the start of a word (it's not special elsewhere), marks that word for expansion. When a word is expanded, it's parsed as a Tcl list after all other normal substitutions have been done, and the words of that list are used as words in the resulting command being built up. {*}$argv is a degenerate case where the remainder of the word is just a read from a variable; the words that are used in the command are the elements of the list in the argv variable. Since that's normally a list, this is all hunky-dory.
Here's an example:
set abc {a b c}
set dabcf [list d $abc f]
puts $dabcf; # ===> “d {a b c} f”
set eabcg [list e {*}$abc g]
puts $eabcg; # ===> “e a b c g”
See the difference? One produces three elements in the list, the other produces five. It makes even more sense with something somewhat longer:
set options {
-foreground blue
-background yellow
-text "This is eye-watering stuff!"
}
button .b1 {*}$options -command {puts "Ouch 1"}
button .b2 {*}$options -command {puts "Ouch 2"}
button .b3 {*}$options -command {puts "Ouch 3"}
pack .b1 .b2 .b3
With expansion, that all Just Works™. Without, you'd have to do something horrific with eval:
eval [list button .b1] [lrange $options 0 end] [list -command {puts "Ouch 1"}]
# etc.
This was difficult to get right, and tedious, so it caused lots of people (including Tcl and Tk maintainers!) many problems because they tended to take shortcuts and get it wrong. It was to address this that expansion syntax was created in Tcl 8.5 to make this all less error prone. (The prototypical example in plain Tcl tends to involve things with exec, and meant that quite a few people actually had security holes because of this.)
As a bonus, using {*} is much faster than using eval since expansion can guarantee that it is never doing complicated reparsing of things. In Tcl, faster virtually always correlates with safer.
Note that this is independent of whether the variable is qualified. Yes, that means that you can also have {*}$::argv if you want.
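This also explains the errors in the question: with expansion, puts, array exists and foreach each received four separate words instead of one list. A quick sketch that counts the words a command actually sees (countWords and the variable words are made up, with words standing in for argv):

```tcl
proc countWords args {
    return [llength $args]
}

set ::words {param0 param1 {param 2} param3}

puts [countWords $::words]       ;# 1: the whole list arrives as a single word
puts [countWords {*}$::words]    ;# 4: each list element becomes its own word
```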
You confuse the effects of substitution with the effects of argument expansion.
Please study the Dodekalogue http://wiki.tcl.tk/10259.
You mix the Rule #5: Argument Expansion (the {*} thing) with Variable Substitution (Rule #8).
The three forms you listed above are equivalent to the following:
$argv -> [set argv]
Get the value of a simple variable in the currently active scope.
$::argv -> [namespace eval :: { set argv }] -> [set ::argv]
Get the value of the variable in the namespace :: (the global namespace)
{*}$argv -> [eval [set argv]]
Expand the variables content to multiple arguments.