Exception to the "brace your expr expressions" rule. What's going on? - tcl

I have a number, say 10, in one variable and a string like +1 or -2 in another. I need to evaluate 10+1 or 10-2 in the above cases.
So, I have
set foo 10
set garp -1
If I do
expr $foo $garp
all is well (I get 9).
Ah! But in general, you should put braces around the expr expression.
expr {$foo $garp}
fails with missing operator at _#_ in expression $foo _#_$garp.
Similarly,
expr [concat $foo $garp]
works nicely but
expr {[concat $foo $garp]}
returns 10 -1.
I don't want to leave the expression unbraced without really understanding what's going on as I'm afraid that otherwise I, or someone else, is going to put braces around the expression and the code will stop working.
What's the "correct" way to do this?

In general, expr involves two rounds of substitution.
The first round of substitution is performed by the command parser on those arguments of the expr command that are not enclosed in braces. The resulting strings are concatenated (by adding separator spaces between them) into a single expression string, which is then parsed (and later evaluated) by the expression processor.
During parsing, the expression is decomposed into operators and operands. Operands must be delimited with operators. Assuming focus on mathematical expressions (i.e. discarding string operations), an operand may be one of the following:
1. a numeric value
2. a Tcl variable, using standard $ notation. The variable's value will be used as the operand.
3. a Tcl command enclosed in brackets. The command will be executed and its result will be used as the operand.
4. a parenthesized sub-expression, which is parsed using the same rules.
5. a mathematical function whose arguments are sub-expressions, parsed using the same rules.
Items 2 and 3 correspond to the second round of substitution, which is performed by the expression processor during evaluation. Each substitution performed at this step is expected to yield a numeric value that is directly usable in further evaluation, without needing to re-parse and re-evaluate it.
Having all this said, let's look at your examples:
expr $foo $garp
The command processor expands this during the 1st round of substitution to expr 10 -1, the expression string after concatenation of the arguments is {10 -1}, and the expression processor parses it into a valid expression 10 subtract 1.
expr [concat $foo $garp]
During the 1st round of substitution the command processor expands this to expr {10 -1}, effectively producing the same expression string as in the previous case.
expr {$foo $garp}
The command processor leaves this intact, and the expression processor sees two consecutive operands (corresponding to clause 2 above), without any operator between them.
expr {[concat $foo $garp]}
Again, the 1st round of substitution is not performed. Parsing this expression extracts a single operand [concat $foo $garp] corresponding to clause 3. The expression processor evaluates the command and uses its result (i.e. the string "10 -1") as the result of the whole expression, without re-parsing it.
So the correctly braced version of your expression must read:
expr {$foo + $garp}
which will be parsed as $foo add $garp.
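The four cases above can be reproduced in one short script. This is only a sketch using the question's own variables; the result names r1, r2, r3 and failed are made up for illustration.

```tcl
set foo 10
set garp -1

# Unbraced: the Tcl parser substitutes first, so expr sees the string "10 -1".
set r1 [expr $foo $garp]
puts $r1                        ;# 9

# Braced with an explicit operator: expr performs the substitution itself.
set r2 [expr {$foo + $garp}]
puts $r2                        ;# 9

# Braced without an operator: two operands in a row, which is a syntax error.
set failed [catch {expr {$foo $garp}} msg]
puts "failed=$failed: $msg"

# A command substitution inside braces is a single operand. Its result is
# used as-is and is not re-parsed, so the string "10 -1" comes back.
set r3 [expr {[concat $foo $garp]}]
puts $r3                        ;# 10 -1
```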

In this case,
expr {$foo + $garp}
The rule "always brace your expressions" stems from the fact that it is a good idea to bypass the argument evaluation step and leave the evaluation of the expression string completely to expr (because it is more secure, see note 1, and results in more efficient bytecode, see note 2).
For this to work, the string passed to expr needs to be legal according to the rules laid out in the expr docs (an unbraced expression doesn't have to be legal as long as the argument evaluation step makes it legal). It follows that any time you need the argument evaluation to help you create a legal expression string, you have an exception to the "always brace" rule (and possibly a hint that you need to rethink the structure of your code, see note 3).
The string {$foo $garp} is illegal because variable substitutions can only be operands in an expression, meaning that we have two operands without an operator. The string "$foo $garp" is transformed by the argument evaluation into the legal expression 10 -1, in which the minus sign that belonged to the value -1 is reinterpreted as a subtraction operator.
If you have a bunch of values that you are getting in pairs, a and b, and you want to add those, expr $a $b might work if you are sure that they always have a sign. That's brittle, though. It's better to use one of
expr {$a + $b}
tcl::mathop::+ $a $b
expr [join [list $a $b] +]
(The first one is the solution we've discussed above. The second one avoids double substitution by using the + operator outside of expr: the variables are evaluated by the argument evaluator but not by the command. The third variant has all the problems of double substitution and is mentioned mostly for completeness. It's still better than just expr $a $b, though.)
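To see why the sign-dependent form is brittle, here is a small sketch with an unsigned second value. The variable names for the results are illustrative only.

```tcl
set a 10
set b 2     ;# no leading sign this time

# The three recommended forms all give the same sum.
set viaExpr   [expr {$a + $b}]
set viaMathop [tcl::mathop::+ $a $b]
set viaJoin   [expr [join [list $a $b] +]]
puts "$viaExpr $viaMathop $viaJoin"   ;# 12 12 12

# The bare form hands expr the string "10 2", which has no operator.
set failed [catch {expr $a $b} msg]
puts "failed=$failed: $msg"
```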
Documentation:
+ (operator),
expr,
join,
list,
Mathematical operators as Tcl commands
1) The argument evaluator, given hostile arguments, could for instance replace $foo in the expression with [exec rm -rf *] or whatever you crazy Linux kids call it, and then the command substitution will be performed inside expr. This is less likely to happen if you disallow double substitution by bracing the expression.
2) The byte compiler can analyze a braced expression and replace the call to expr with more efficient inlined calculations. For an unbraced string, the compiler has no other option than to set up a call to expr whatever the expression is.
3) Seemingly paradoxically, it is not a problem to construct an expression by some trusted method and pass it unbraced via a variable (set myexpr [...] ; expr $myexpr), because this way you are still in full control of the content of the expression, and you are certainly not depending on the argument evaluator to patch it up for you. You won't get the bytecode optimization, though.
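Note 1 can be illustrated without deleting anything. In the sketch below, sideEffect is a hypothetical, harmless stand-in for whatever a hostile value might try to run, and the counter only records whether it was called.

```tcl
set calls 0
proc sideEffect {} {
    incr ::calls
    return 1
}

# Stand-in for hostile input: the braces keep it as literal text here.
set hostile {[sideEffect]}

# Braced: expr substitutes $hostile once and treats the value as a plain
# string operand, so the command inside it never runs.
set safe [expr {$hostile eq "x"}]
set callsAfterBraced $calls

# Unbraced: the parser substitutes $hostile, then expr parses the resulting
# text "[sideEffect]" as an expression and runs the command.
set unsafe [expr $hostile]
set callsAfterUnbraced $calls

puts "braced: result=$safe calls=$callsAfterBraced"
puts "unbraced: result=$unsafe calls=$callsAfterUnbraced"
```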

Related

Use of brackets and space character in Tcl

I am still confused about the use of the brackets (), [] and {} in Tcl. I always get caught out using the wrong bracket, missing brackets where they are required, or using too many of them. Besides this, I am also confused by Tcl giving me different results depending on the presence or absence of a space character (in a math expression), and on whether I have used more than one space character in succession.
Can someone please give me the basic rules that I must keep in mind to get out of this mess. Brackets have always been simple to use in C and some other languages but here they are totally different.
At the level you're looking at, Tcl is very different to any other language you've ever worked with. The heart of Tcl is defined by the Tcl(n) manual page, which states that (among other things):
Whitespace separates words. Every command takes its arguments as a sequence of words. Newlines and semicolons separate command calls; they're totally equivalent, but good style is to use a newline instead of a semicolon.
{braces} are used mainly for quoting text so that it is passed to commands with no substitutions or word separation performed on it. They nest properly. Braces are also used after $ to do variable substitution in a few cases: that's a rare use.
"double quotes" are used for quoting text so that it is passed to commands with substitutions applied, but no word separation.
[brackets] are a command substitution. They are replaced with the result of running the script inside the bracket. The script is usually a single command.
(parentheses) only have one base language use: for (associative) array elements. Thus, $a(b) is a variable substitution that will use the value of the b element in the a array.
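A compact way to see these rules side by side is the sketch below. The variable and array names are invented for the example.

```tcl
set name World
array set a {b 42}

set braced  {Hello $name}           ;# braces: no substitution
set quoted  "Hello $name"           ;# double quotes: substitution, one word
set length  [string length $name]   ;# brackets: command substitution
set element $a(b)                   ;# parentheses: array element lookup
set plural  ${name}s                ;# braces after $: delimit the variable name

puts $braced    ;# Hello $name
puts $quoted    ;# Hello World
puts $length    ;# 5
puts $element   ;# 42
puts $plural    ;# Worlds
```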
The rest of what people call Tcl is really just a standard library, a set of commands to get you started. Some are fundamental. For example:
if is a conditional command, evaluating a branch (a script) if a condition is true. In order for this to be meaningful, the branch has to be not evaluated until the condition has been evaluated and tested; that pretty much requires putting it in braces.
while is a looping command, and not only do you want to brace its body (that's probably going to be evaluated over and over) but you also want to put the condition expression in braces as well as you definitely want that to be reevaluated each time round the loop.
proc is a command that makes your own custom commands. The body of the procedure definitely is something you want to evaluate later; it goes in braces.
expr is a general expression evaluation command. Under all normal circumstances, you'll want to put its expression in braces so that the code can be compiled and won't have double substitution problems. Note that expressions often make heavy use of parentheses: they have additional meanings in expression syntax. In particular, apart from being array element lookups, they're also used for function calls and grouping.
Note that if and while also use that same expression evaluation engine. They just use the result of the expression to decide what to do.
Scoping is a matter for commands to decide. The usual commands for dealing with introducing a scope are proc and namespace eval. This is nothing like C, C++, Java, C#, or Javascript; they have different rules. Variables are local to their procedure unless you explicitly say otherwise.
The community practice is to do calls like this:
if { $foo(bar) > (17 + $grill) * 7 } {
# This is a comment; it lasts to the end of the line
puts "the foobar $foo(bar) is too large"
set foo(bar) [ComputeSmallerValue $grill]
}
That is, barewords (if and puts) are unquoted, expressions and inner scripts are brace-quoted, parentheses are used where meaningful but mostly for arrays and expressions, whitespace separates all words, inner scripts are indented (usually by 4) for clarity (it doesn't have semantic meaning, but it sure helps with reading), and “blocks” use Egyptian braces so that you don't have to add backslashes all over the place.
You don't have to follow these rules (they're guidelines, not the law) but they make your life easier if you do. Sometimes you do need to break the rules, but then you should know to be careful.
You cannot compare Tcl to C. In C, {} defines scope. In Tcl, {} is a grouping operator.
In Tcl, {} may group a string:
{hello world}
Or a list:
{a b c d e f g h}
Or a script:
{
puts -nonewline {hello }
puts world\n
}
Every command is simply a series of groups (which may be a word, a list,
an expression or a script):
{if} {true} { puts "hello\n" }
Of course, you don't need to put braces around every word,
but you do need braces to enclose a script:
if true { puts hello\n }
Generally, for the if statement, not bracing the expression is a bad idea,
so this is better:
if { true } { puts hello\n }
This simple rule creates Tcl's remarkably simple syntax. Every command is simply
a series of groups, whether a word, an expression, a list or script:
if expr script
while expr script
proc name argument-list script
puts string
for initialization condition nextloop script
The one important thing to remember is whenever an expression is wanted, it
should be enclosed within braces in order to prevent early substitution. e.g.:
set i 0
while { $i < 10 } {
incr i
}
The square brackets, [], are replaced with the result of the command enclosed
by the square brackets:
set output [expr {2**5}]
Parentheses are used within expressions as usual:
set output [expr {(2**5)+2}]
And for arrays:
set i 0
while { $i < 5 } {
set output($i) [expr {2**$i}]
incr i
}
parray output

How do I use the ternary operator to add an optional part to an anonymous string in tcl?

I am new to tcl (sorry if the answer is obvious, but reading tutorials and documentation did not help). I have a statement in tcl that says:
startupitem.start "foo
\tbar"
What I would like to do is have the "foo" part become optional, depending on the outcome of
[variant_isset "alice"] using the ternary operator and without using variables.
I've tried several things along the lines
startupitem.start "[variant_isset """alice"""?"""foo\n\t""":""""""] bar"
(of course with all kinds of escapes and combo's or the use of the double quotes inside the double quotes) but I haven't succeeded.
The outcome if the variant_isset expression returns true is that it is equivalent to
startupitem.start "bar"
You might prefer to use the if command (which is very much the command version of the ternary operator), whose result is the result of the body script it evaluates. If there isn't an else clause, the result is the empty string if nothing else is chosen to do:
startupitem.start "[if {[variant_isset alice]} {string cat "foo\n\t"}] bar"
Or you can build a list and then join it:
set items {}
if {[variant_isset alice]} {
lappend items "foo"
}
lappend items bar
startupitem.start [join $items "\n\t"]
This second approach tends to work particularly well when things get complicated.
You want to check out Tcl's expr command, which introduces Tcl's expression sub-language incl. what you call the "ternary" operator ?:
startupitem.start "[expr {[variant_isset "alice"] ? "foo\n\t" : ""}]bar"
If your Tcl is recent enough to have string cat (8.6.2 or later), you may prefer to assemble the string as separate arguments to string cat rather than by substitution inside a quoted string:
string cat [expr {[variant_isset "alice"] ? "foo\n\t" : ""}] "bar"
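To try the ternary form outside MacPorts, the two port commands can be replaced by stubs. Both procs below are hypothetical stand-ins written for this sketch, not the real variant_isset or startupitem.start.

```tcl
# Hypothetical stand-ins so the snippet runs in a plain tclsh.
proc variant_isset {name} {
    expr {$name eq "alice"}
}
proc startupitem.start {text} {
    return $text
}

# Variant set: the optional "foo" part is included.
set withFoo [startupitem.start "[expr {[variant_isset "alice"] ? "foo\n\t" : ""}]bar"]

# Variant not set: only "bar" remains.
set without [startupitem.start "[expr {[variant_isset "bob"] ? "foo\n\t" : ""}]bar"]

puts [list $withFoo $without]
```

The quotes nested inside the brackets do not end the outer quoted word, because the parser reads a command substitution as a unit.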

TCL error: extra characters after close-quote

I'm trying to evaluate a certain expression. I have "pqr" && "xyz" in a command. On evaluating the command, it gives an error: extra characters after close-quote.
I think Tcl is not able to parse && after a double quote. If this is the reason, how should I deal with the double quote and &&?
You're not giving us enough information.
A wild guess is that you were writing something like this:
expr "pqr"&& "xyz"
which does give the error message "extra characters after close-quote". This is because the interpreter tries to parse the command according to Tcl language rules, and one of those rules is that a word that starts with a double quote must end with a matching double quote. In this case, there are two & characters following the matching double quote.
Now,
expr "pqr" && "xyz"
(with a space between the double quote and the ampersand) is no good either. This is because the interpreter removes any characters that have a syntactic function as it prepares the arguments for the command. This means that the expression expr ends up with is the string pqr && xyz. When the expr command executes, it tries to interpret its argument as a string in a special expression language that isn't Tcl. In particular, unlike in Tcl, strings that aren't boolean values or the names of functions must always be enclosed in braces or double quotes, like this: "pqr" && "xyz". So how do you get that? You always* brace the argument to expr, that's how.
expr {"pqr" && "xyz"}
means that expr gets the legal string "pqr" && "xyz".
But the string "pqr" && "xyz" is still not valid, since the && (logical and) operation isn't defined for strings other than strings that are equal to the string representation of boolean values, such as expr {"true" && "false"}
So, again we're stuck, because what you seem to be trying to do makes no sense. If you show us what you're doing we might be able to help you.
*) except when you shouldn't. Rare, expert level.
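The difference between boolean-like strings and arbitrary strings can be checked directly. The last line is only a guess at one thing the asker might have wanted (testing that both strings are non-empty); that intent is an assumption, not something stated in the question.

```tcl
# Strings that spell boolean values are accepted by &&.
set ok [expr {"true" && "false"}]
puts $ok                            ;# 0

# Arbitrary strings are not valid operands for &&.
set failed [catch {expr {"pqr" && "xyz"}} msg]
puts "failed=$failed: $msg"

# Assumed intent: both strings are non-empty.
set bothNonEmpty [expr {"pqr" ne "" && "xyz" ne ""}]
puts $bothNonEmpty                  ;# 1
```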
Documentation:
expr,
Mathematical operators as Tcl commands,
Summary of Tcl language syntax
Brace your expressions
The expr command (and by extension, the commands for, if, and while, which use the same mechanism to evaluate their conditions) interprets an expression string that is constructed from its arguments. Note that the language of the expression string isn't Tcl, but specific to the expr command's expression evaluator: the languages share many syntactic forms, but are fundamentally different with infix, operator-based structure for the expr language and prefix, command-based structure for Tcl.
Letting the Tcl interpreter evaluate the arguments before passing them to expr can lead to
double substitution, which has security problems similar to SQL injection attacks.
iterative commands (for, while) getting constant-valued condition arguments, leading to infinite loops.
all substitutions (and thus their side-effects) always occurring while expr can selectively suppress some of them.
Therefore, it is almost always better to provide the expression string as a braced (escaped) string, which will not be evaluated by the Tcl interpreter, only the expr interpreter.
Note that while unbraced arguments to expr are allowed to be an invalid expression string as long as the argument evaluation transforms them into a valid one, braced expressions must be valid as they are (e.g. variable or command substitutions must be simple operands and not operators or complex expressions).
Another benefit from using braced expression strings is that the byte compiler usually can generate more efficient code (5 - 10x faster) from them.
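The second and third problems in the list above can be observed with a short sketch. The proc sideEffect and the guard counter are hypothetical helpers; the guard exists only so the unbraced loop terminates.

```tcl
set calls 0
proc sideEffect {} {
    incr ::calls
    return 1
}

# Braced: && short-circuits, so the command substitution is never performed.
set r1 [expr {0 && [sideEffect]}]
set callsBraced $calls

# Unbraced: the Tcl parser runs [sideEffect] before expr is even called.
set r2 [expr 0 && [sideEffect]]
set callsUnbraced $calls

# Unbraced loop condition: while receives the constant string "0 < 3",
# which stays true forever. Only the guard stops the loop.
set i 0
set guard 0
while "$i < 3" {
    incr i
    if {[incr guard] >= 100} break
}
set unbracedI $i

# Braced loop condition: $i is re-read on every iteration.
set i 0
while {$i < 3} {
    incr i
}
set bracedI $i

puts "calls: braced=$callsBraced unbraced=$callsUnbraced"
puts "loop end: unbraced i=$unbracedI braced i=$bracedI"
```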

tcl scripts, struggling with [...] and [expr ...]

I can't understand how assignments and use of variables work in Tcl.
Namely:
If I do something like
set a 5
set b 10
and I do
set c [$a + $b]
Following what internet says:
You obtain the results of a command by placing the command in square
brackets ([]). This is the functional equivalent of the back single
quote (`) in sh programming, or using the return value of a function
in C.
So my statement should set c to 15, right?
If yes, what's the difference with
set c [expr $a + $b]
?
If no, what does that statement do?
Tcl's a really strict language at its core; it always follows the rules. For your case, we can therefore analyse it like this:
set c [$a + $b]
That's three words, set (i.e., the standard “write to a variable” command), c, and what we get from evaluating the contents of the brackets in [$a + $b]. That in turn is a script formed by a single command invocation with another three words, the contents of the a variable (5), +, and the contents of the b variable (10). That the values look like numbers is irrelevant: the rules are the same in all cases.
Since you probably haven't got a command called 5, that will give you an error. On the other hand, if you did this beforehand:
proc 5 {x y} {
return "flarblegarble fleek"
}
then your script would “work”, writing some (clearly defined) utter nonsense words into the c variable. If you want to evaluate a somewhat mathematical expression, you use the expr command; that's its one job in life, to concatenate all its arguments (with a space between them) and evaluate the result as an expression using the documented little expression language that it understands.
You virtually always want to put braces around the expression, FWIW.
There are other ways to make what you wrote do what you expect, but don't do them. They're slow. OTOH, if you are willing to put the + first, you can make stuff go fast with minimum interference:
# Get extra commands available for Lisp-like math...
namespace path ::tcl::mathop
set c [+ $a $b]
If you're not a fan of Lisp-style prefix math, use expr. It's what most Tcl programmers do, after all.
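Putting the three variants next to each other, as a sketch with the question's values (the names failed, msg and d are added for illustration):

```tcl
set a 5
set b 10

# [$a + $b] tries to run a command named "5".
set failed [catch {set c [$a + $b]} msg]
puts $msg                       ;# invalid command name "5"

# Infix math goes through expr, braced.
set c [expr {$a + $b}]

# Prefix math via the operator commands in ::tcl::mathop.
namespace path ::tcl::mathop
set d [+ $a $b]

puts "$c $d"                    ;# 15 15
```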
set c [$a + $b]
Running the above command, you will get invalid command name "5" error message.
For mathematical operations, use expr: Tcl treats everything as a string, so arithmetic only happens when a command such as expr interprets the values as numbers.
set c [expr $a + $b]
In this case, the value of a and b is passed and addition is performed.
Here, it is always safe and recommended to brace the expressions as,
set c [expr {$a+$b}]
To avoid any possible surprises in the evaluation.
Update 1 :
In Tcl, everything is based on commands. A command can be a user-defined proc or an existing built-in command such as lindex. The first word of a command line is always treated as a command name and triggers a command call. Similarly, the contents of [ and ] are run as a command.
In your case, $a is replaced with the value of the variable a, and since the words are enclosed within square brackets, this triggers a command call. Since there is no command with the name 5, you get the error.

TCL expression parsing - why brace and bracket are escaped differently

I just did the following experiment in TCL 8.6:
% expr \"\{" ne \"x\"
1
% expr \"\[" ne \"x\"
extra characters after close-quote
in expression ""[" ne "x""
The first command makes sense to me:
Because the argument is not braced, first round parsing is script level parsing, backslash escapes are removed: expr "{" ne "x"
expr command continues the parsing, "{" and "x" are 2 quoted literals and the execution goes well.
The error in the 2nd command does not make sense to me. The only difference is replacing the brace with a bracket, so why does it fail?
I know bracing the arguments is expected for expression, this question is mostly to understand TCL parsing.
The problem with the second command is that the expr command processes […] sequences inside double quotes as command substitutions. This is independent of whether Tcl does and is part of why it is a really good idea to always brace your overall expressions. Had you instead used:
expr \{\[\} ne \"x\"
then it would have worked; just as with the base Tcl language, expr does not expand command substitutions in brace-quoted terms.
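The experiment and the fix can be replayed in a script. This is a sketch; the last line is an additional braced variant not taken from the question, in which expr itself handles the backslash inside the quoted operand.

```tcl
# expr receives: "{" ne "x"
set r1 [expr \"\{\" ne \"x\"]

# expr receives: "[" ne "x"  and treats the [ inside the double quotes
# as the start of a command substitution, which fails.
set failed [catch {expr \"\[\" ne \"x\"} msg]

# expr receives: {[} ne "x"  and brace-quoted operands are not substituted.
set r2 [expr \{\[\} ne \"x\"]

# Braced at the Tcl level: expr receives "\[" ne "x" and resolves the
# backslash escape itself.
set r3 [expr {"\[" ne "x"}]

puts "r1=$r1 failed=$failed r2=$r2 r3=$r3"
```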