On Tcl tests - how to interpret tcltest::test

I am looking at the tests in TCL source tree and I see this one in compExpr-old.test:
test compExpr-old-14.17 {CompilePrimaryExpr: string primary that looks like var ref} {
expr $
} $
It looks wrong to me: the test runs the script expr $ and expects the return value to be "$". Is my interpretation right?
It cannot be right, because expr $ is syntactically wrong.
I checked tcltest.tcl, but the definition of tcltest::test is so long; I wish someone could help me out here.

I don't know what version of the test suite you are looking at (probably some 8.4 variant?), but when I look at the whole of that test in the Tcl trunk, I see this:
test compExpr-old-14.17 {CompilePrimaryExpr: string primary that looks like var ref} -body {
expr $
} -returnCodes error -match glob -result *
In this case, it is checking that the result is an error and that the value of the result (i.e., the content of the error message) glob matches with *, i.e., is anything (effectively ignoring it). That is, the test checks that an error is obtained from expr $ and otherwise doesn't care.
The test you posted (which uses an older syntax for tcltest) won't pass on modern versions of Tcl. But in 8.4, it did pass; this was an area where Tcl's semantics changed between 8.4 and 8.5:
dkf$ tclsh8.4
% expr $
$
% exit
dkf$ tclsh8.5
% expr $
invalid character "$"
in expression "$"
% exit
Quick guide to Tcltest test cases: The -body describes a script to run, the -returnCodes option can be used to select whether normal results or errors are expected, the -result can be used to say what string to expect as a result of the body script, and -match can be used to pick an alternate matching scheme than the default (exact string equality). There's also -constraints for specifying preconditions on a test, -setup for setup code, and -cleanup for cleanup code. The two leading mandatory arguments are the name of the test (which must be unique within a test suite, for your own sanity) and a short description of the test, used when reporting failures.
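To make that concrete, here is a minimal sketch of a test written in the modern option syntax; the test name, the list being manipulated, and the expected result are all made up for illustration and do not come from Tcl's own suite:
test example-1.1 {lappend adds an element to a list} -setup {
    set lst {a b}
} -body {
    lappend lst c
} -cleanup {
    unset lst
} -result {a b c}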
In the old syntax (used in much of Tcl's test suite because updating it is a ton of boring work), you instead had the same two mandatory arguments, then an optional list of constraints (as for -constraints), then a mandatory body (as for -body), then the mandatory string to match for equality (as for -result). Less flexible, but not too hard to understand.
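For comparison, the same made-up test in the old positional syntax (the optional constraints list is omitted, and there is no separate setup or cleanup, so the helper variable leaks) would look like this:
test example-1.1 {lappend adds an element to a list} {
    set lst {a b}
    lappend lst c
} {a b c}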


How to determine whether a function exists in a POSIX shell?

This is basically the same question as Determine if a function exists in bash, except that this time it's not aiming at Bash, but at a POSIX shell:
How to determine whether a shell function with a given name exists?
It seems that none of the typical built-ins like type are mandated by POSIX, so the matter is more difficult or maybe even impossible.
POSIX (more precisely, the X/Open Portability Guide) specifies the type command. It doesn't state exactly what type should output to identify what its argument is. However, the standard says it typically identifies the operand, so it would be very unlikely for a type implementation not to include the string "function" in its reply when passed a function name.
This should then work with most, if not all, POSIX compliant shells:
isFunction()
{
type "$1" | sed "s/$1//" | grep -qwi function
}
You might also run command -V instead of type here, with the same caveat about the unspecified output format. I never do, given that the latter is shorter to type and easier to remember. Using command -V would, however, be mandatory if you run a shell that decided not to include XSI (likely posh), i.e., a shell that breaks portability with many existing scripts by limiting the utilities it tries to comply with to the strict POSIX set.
You can use command for this in shells that implement the 2013 version of POSIX, or the User Portability Utilities option of the older spec:
isFunction() {
command -V "$1" 2>/dev/null | grep -qwi function
}
However, note that the spec doesn't actually dictate the form of the command's output. It requires that functions be identified as such, so it is highly likely that the output will include the word function if and only if the requested name is a function, but it is not strictly guaranteed. The above solution can be fooled pretty easily (see @jiliagre's comment).
A different part of the spec mandates a type command that does much the same thing (with the same caveats about unspecified output format). Oddly, it's not listed as one of the commands required to be a shell builtin, but as the informational notes say, it pretty much has to be one in order to work as specified.
For the sake of completeness: it is possible to use type or command -V without spawning any extra subprocesses like sed or grep (although you still have to spawn one for $(type ...)):
is_function() {
case "$(type -- "$1" 2>/dev/null)" in
*function*) return 0 ;;
esac
return 1
}
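Any of these helpers is then used like an ordinary test in a script. A quick, made-up usage example (myfunc is just a name chosen for illustration):
myfunc() { echo hello; }

if is_function myfunc; then
    echo "myfunc is a shell function"
fi

if is_function ls; then
    echo "unexpected: ls reported as a function"
else
    echo "ls is not a shell function"
fi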

Expect: extract specific string from output

I am navigating a Java-based CLI menu on a remote machine with expect inside a bash script and I am trying to extract something from the output without leaving the expect session.
Expect command in my script is:
expect -c "
spawn ssh user@host
expect \"#\"
send \"java cli menu command here\r\"
expect \"java cli prompt\"
send \"java menu command\"
"
###I want to extract a specific string from the above output###
Expect output is:
Id Name
-------------------
abcd 12 John Smith
I want to extract abcd 12 from the above output into another expect variable for further use within the expect script. So that's the 3rd line, first field, using a double-space delimiter. The awk equivalent would be: awk -F '  ' 'NR==3 {print $1}'
The big issue is that the environment through which I am navigating with Expect is, as I stated above, a Java CLI based menu so I can't just use awk or anything else that would be available from a bash shell.
Getting out from the Java menu, processing the output and then getting in again is not an option as the login process lasts for 15 seconds so I need to remain inside and extract what I need from the output using expect internal commands only.
You can use a regexp in expect directly with the -re flag. Thanks to Donal for pointing out the single-quote and double-quote issues. I have given the solution both ways.
I have created a file with the content as follows,
Id Name
-------------------
abcd 12 John Smith
This stands in for your Java program's console output. I tested it on my system, i.e., I simply simulated your program's output with cat. Just replace the cat command with your Java program's commands. Simple. :)
Double Quotes :
#!/bin/bash
expect -c "
spawn ssh user@domain
expect \"password\"
send \"mypassword\r\"
expect {\\\$} { puts matched_literal_dollar_sign}
send \"cat input_file\r\"; # Replace this code with your java program commands
expect -re {-\r\n(.*?)\s\s}
set output \$expect_out(1,string)
#puts \$expect_out(1,string)
puts \"Result : \$output\"
"
Single Quotes :
#!/bin/bash
expect -c '
spawn ssh user@domain
expect "password"
send "mypasswordhere\r"
expect "\\\$" { puts matched_literal_dollar_sign}
send "cat input_file\r"; # Replace this code with your java program commands
expect -re {-\r\n(.*?)\s\s}
set output $expect_out(1,string)
#puts $expect_out(1,string)
puts "Result : $output"
'
As you can see, I have used {-\r\n(.*?)\s\s}. The braces prevent any substitutions inside the pattern. In your output, the 2nd line is full of hyphens, then comes a newline, then your 3rd line's content. Let's decode the regex used.
-\r\n matches one literal hyphen and a newline together. This matches the last hyphen of the 2nd line plus the newline, which brings us to the 3rd line. Then .*? matches the required output (i.e. abcd 12) until it encounters a double space, which is matched by \s\s.
You might be wondering why the parentheses are needed: they are what capture the sub-match.
In general, expect saves the whole matched string in expect_out(0,string) and all of the matched/unmatched input in expect_out(buffer). Each sub-match is saved under the next index, such as expect_out(1,string), expect_out(2,string) and so on.
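If you want to play with the capture group outside of Expect, the same regular expression behaves the same with Tcl's own regexp command. A small sketch with the sample data hard-coded (note the \r\n line endings and the double space before John, to mimic what comes over the spawned connection):
set data "Id Name\r\n-------------------\r\nabcd 12  John Smith\r\n"
if {[regexp {-\r\n(.*?)\s\s} $data -> captured]} {
    puts "capture group 1: $captured"   ;# prints: abcd 12
}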
As Donal pointed out, it is better to use the single-quote approach since it looks less messy. :)
Note that it is not necessary to escape the \r with an extra backslash in the double-quote version.
Update :
I have changed the regexp from -\r\n(\w+\s+\w+)\s\s to -\r\n(.*?)\s\s.
This way it matches your requirement: any number of characters and single spaces, up to the first occurrence of a double space in the output.
Now, let's come to your question. You mentioned that you tried -\r\n(\w+)\s\s. The problem there is \w+: remember that \w+ does not match a space character, and your output contains single spaces before the double space.
Which regexp to use depends on the input string you are trying to match; you can customize the regular expression to your needs.
Update version 2 :
What is the significance of .*? here? To repeat what was discussed in the comments: in regular expressions, * is a greedy operator and the trailing ? is our life saver. Let us consider the string
Stackoverflow is already overflowing with number of users.
Now see the effect of the regular expression .*flow below.
* matches any number of characters; more precisely, it matches the longest string possible while still allowing the pattern itself to match. So .* in the pattern matches the characters Stackoverflow is already over, and flow in the pattern matches the flow inside overflowing.
Now, to make the .* match only up to the first occurrence of the string flow, we add the ? to it. This makes the pattern behave in a non-greedy manner.
Coming back to your question: if we had used .*\s\s, it would have matched essentially the whole line (up to the last pair of whitespace characters), since it tries to match as much as possible. This is the default, greedy behavior of regular expressions.
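You can see the difference directly in an interactive tclsh session; a quick illustrative transcript (regexp -inline returns the text that was matched):
% set s "Stackoverflow is already overflowing with number of users."
Stackoverflow is already overflowing with number of users.
% regexp -inline {.*flow} $s
{Stackoverflow is already overflow}
% regexp -inline {.*?flow} $s
Stackoverflow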
Update version 3:
Structure your code in the following way.
x=$(expect -c "
spawn ssh user@host
expect \"password\"
send \"password\r\"
expect {\\\$} { puts matched_literal_dollar_sign}
send \"cat input\r\"
expect -re {-\r\n(.*?)\s\s}
if {![info exists expect_out(1,string)]} {
puts \"Match did not happen :(\"
exit 1
}
set output \$expect_out(1,string)
#puts \$expect_out(1,string)
puts \"Result : \$output\"
")
y=$?
# $x now contains the output from the 'expect' command, and $y contains the
# exit status
echo $x
echo $y;
If the flow happened properly, the exit code will be 0; otherwise it will be 1. This way you can check the return value in the bash script.
Have a look here to learn about the info exists command.
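For a quick feel of what info exists reports, here is an illustrative tclsh transcript (the array element name is just the one used above):
% info exists expect_out(1,string)
0
% set expect_out(1,string) "abcd 12"
abcd 12
% info exists expect_out(1,string)
1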

Tcl: Is parameter evaluation guaranteed to be left-to-right?

I have a Tcl program where I often find expressions of the following kind:
proc func {} {...}
...
lappend arr([set v [func]]) $v
The intended meaning of the last line is
set v [func]
lappend arr($v) $v
It obviously works. What I would like to know: does it work "by accident", or does Tcl guarantee that the first parameter passed to lappend is evaluated before the second?
Tcl always evaluates from left to right, as you can read in the documentation; I quote the relevant part:
Substitutions take place from left to right, and each substitution is evaluated completely before attempting to evaluate the next. Thus, a sequence like:
set y [set x 0][incr x][incr x]
will always set the variable y to the value, 012.
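Applied to your exact pattern, a small sketch (func here is a stand-in that returns a fixed value, purely for illustration):
proc func {} { return foo }
lappend arr([set v [func]]) $v
puts $arr(foo)    ;# prints "foo": v was already set when $v was substituted
The command substitution in the first argument runs while the words are parsed left to right, so by the time Tcl reaches $v, the variable already holds the value returned by func.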
Agreed with Jerry. Adding some flavor to it.
Tcl commands are evaluated in two steps : parsing & execution.
First the Tcl interpreter parses the command string into words, performing substitutions along the way.
Then a command procedure processes the words to produce a result string. Each command has a separate command procedure.
Let us consider the following code.
%set input "The cat in the hat"
The cat in the hat
%string match "*at in*" $input
1
In the parsing step the Tcl interpreter applies the rules described in this chapter to divide the command up into words and perform substitutions.
Parsing is done in exactly the same way for every command. During the parsing step the Tcl interpreter does not apply any meaning to the values of the words. Tcl just performs a set of simple string operations such as replacing the characters $a with the string stored in variable a. Tcl does not know or care whether a or the resulting word is a number or the name of a widget or anything else.
In the execution step meaning is applied to the words of the command. Tcl treats the first word as a command name, checking to see if the command is defined and locating a command procedure to carry out its function. If the command is defined then the Tcl interpreter invokes its command procedure, passing all of the words of the command to the command procedure. The command procedure is free to interpret the words in any way that it pleases, and different commands apply very different meanings to their arguments.
Major rule to remember here:
Tcl parses a command and makes substitutions in a single pass from left to right. Each character is scanned exactly once. At most a single layer of substitution occurs for each character; the result of one substitution is not scanned for further substitutions.
Reference : Tcl and the Tk Toolkit
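A tiny interactive illustration of the single-pass rule quoted above (the variable name a is arbitrary):
% set a {[expr 1+1]}
[expr 1+1]
% puts $a
[expr 1+1]
The $a substitution inserts the text [expr 1+1], but that result is not scanned again, so no command substitution takes place.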

Break on namespace function in gdb (llvm)

I'm trying to step through llvm's opt program (for an assignment) and the instructor suggested setting a breakpoint at runOnFunction. I see this in one of the files:
bool InstCombiner::runOnFunction(Function &F) { /* (Code removed for SO) */ }
but gdb does not seem to find the runOnFunction breakpoint. It occurred to me that the problem might be namespaces. I tried this, but gdb never breaks; it just creates the fooOpt.s file:
(gdb) b runOnFunction
Function "runOnFunction" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (runOnFunction) pending.
(gdb) r -S -instcombine -debug -o ~/Desktop/fooOpt.s ~/Desktop/foo.s
I'm on a Mac, so I don't have objdump, but otool produces 5.6 million lines; wading through that for a starting point does not seem reasonable, as runOnFunction appears more than once there.
GDB has several built-in commands for finding the names of such functions. The first is info functions, which can be used with an optional regexp argument to grep all available functions: https://sourceware.org/gdb/current/onlinedocs/gdb/Symbols.html
info functions regexp
Print the names and data types of all defined functions whose names contain a match for regular expression regexp. Thus, ‘info fun step’ finds all functions whose names include step; ‘info fun ^step’ finds those whose names start with step. If a function name contains characters that conflict with the regular expression language (e.g. ‘operator*()’), they may be quoted with a backslash.
So, you can try info functions runOnFunction to get the full name. Sometimes it can be useful to add quotes around the name when issuing the break command.
The other way is to use the rbreak command instead of break (b). rbreak does a regexp search over function names and may define several breakpoints: https://sourceware.org/gdb/current/onlinedocs/gdb/Set-Breaks.html#Set-Breaks
rbreak regex
Set breakpoints on all functions matching the regular expression regex. This command sets an unconditional breakpoint on all matches, printing a list of all breakpoints it set. ...
The syntax of the regular expression is the standard one used with tools like grep. Note that this is different from the syntax used by shells, so for instance foo* matches all functions that include an fo followed by zero or more os. There is an implicit .* leading and trailing the regular expression you supply, so to match only functions that begin with foo, use ^foo.
(or even rbreak file:regex to limit search to single source file)
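Putting this together for your opt session, something along these lines should work; treat it as a sketch, since the exact qualified name and signature depend on how your LLVM build was compiled (InstCombiner is taken from the snippet you posted):
(gdb) info functions InstCombiner::runOnFunction
(gdb) rbreak InstCombiner::runOnFunction
(gdb) r -S -instcombine -debug -o ~/Desktop/fooOpt.s ~/Desktop/foo.s
If info functions prints a fully qualified signature, you can also paste that into a plain break command, quoted with single quotes, with the argument list exactly as shown in its output.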
PS: if you want, you can turn C++ function name demangling on or off with set print demangle on or off (https://sourceware.org/gdb/current/onlinedocs/gdb/Debugging-C-Plus-Plus.html#Debugging-C-Plus-Plus). With demangling turned off, it is easier to copy a function name into a break command.

sed: modify function arguments

I'm trying to write a sed command that will allow me to modify a function's arguments. The number of arguments can be variable.
If this is my function:
int myFunction(int arg1, int arg2, Dog arg3) {
// function implementation
}
I would like to be able to perform addition operations on int arg1, int arg2, ...
Here's what I have that does not work:
sed -e '/^[a-zA-Z0-9_]\+\s\+[a-zA-Z0-9_]\+(/ , /)[\n\s]*{/ {
# arguments should be listed here
}'
Any help is appreciated. Go easy on me, it's my first attempt at sed / shell scripting.
Thanks.
Ultimately, sed is not the correct tool for this job, because of your comment that 'the number of arguments can be variable'. If you are dealing with a fixed number of arguments of a fixed type, you can kludge your way around it, but any more general processing requires a more general processor than sed.
I suggest trying a different task as your introduction to shell scripting and sed.
If you must do it, then maybe:
sed '/^[A-Za-z_][A-Za-z0-9_]* *[A-Za-z_][A-Za-z0-9]* *([A-Za-z_][A-Za-z0-9_]* *\([A-Za-z_][A-Za-z0-9_]*\) *, *[A-Za-z_][A-Za-z0-9_]* *\([A-Za-z_][A-Za-z0-9_]*\)[ ,)].*{/{p;a\
return \1 + \2;
}' $file
That horror of a match contains the sequence [A-Za-z_][A-Za-z0-9_]* 6 times; it matches an identifier each time. The segment [ ,)].*{ matches any third or subsequent arguments up to the opening brace. The spaces in the pattern should, perhaps, be [<blank><tab>] character classes, but they're a pain to enter on Stack Overflow. The regex thus matches a function definition and captures (in the \(<identifier>\) parts) the names of the first two argument variables (arg1 and arg2 in your example). The actions when this is recognized are:
p - print the line that was recognized.
a - append the following line(s) to the output; in this case, one line containing a return statement that is the sum of the two remembered argument names. The backslash indicates that there is another line of output to append. The braces group the operations together.
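If the p and a\ mechanics are new to you, here is a tiny standalone illustration, separate from the function-matching problem (GNU sed shown; BSD sed is slightly pickier about the a\ text):
$ printf 'alpha\nbeta\n' | sed -n '/alpha/{p;a\
appended after the match
}'
alpha
appended after the match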
Some versions of sed support more powerful regular expressions than others; I'm not sure that even GNU sed supports PCRE (Perl-Compatible Regular Expressions), and it would take something like PCRE to significantly reduce the regex.
Note that this script leaves the comment line '// function implementation' untouched. It's your call what you do with that.
Finally, remember that if you write more than one function to add two integers together, you are wasting code. Therefore, this is not a plausible transformation. Each function should do something different, somehow. Granted, if the types are different each time, then maybe it has its uses, but even so, it would be easier to write a generator than to parse the skeleton and fill in the bits. And that might be a good scripting exercise.