Here is my test.cpp program. It exits abnormally via an assert(0).
#include <cassert>
int main() {
assert(0);
}
When I run this program directly I get the expected output including a non-zero exit status:
$ ./test
...
$ echo $?
134
But when I try to detect the abnormal exit in tcl/expect I don't seem to be able to:
#!/usr/bin/expect
spawn ./test
expect eof
lassign [wait] pid spawnid os_error_flag value
if {$os_error_flag != 0} {
puts "OS error"
exit 1
}
if {$value != 0} {
puts "Application error"
exit 1
}
puts "No error"
When I run that script:
$ ./test.expect
No error
If I use exit(1) instead of assert(0) then the tcl script is able to detect the abnormal exit. Why doesn't tcl/expect provide an OS- or application-returned error code for assertion failures and how can I uniformly detect all abnormal program exits by checking the exit code?
Not an answer, but an extended comment:
Running that code, I see:
$ ./a.out; echo $?
Assertion failed: (0), function main, file x.c, line 4.
Abort trap: 6
134
And in expect, I see:
$ expect
expect1.1> spawn ./a.out
spawn ./a.out
47429
expect1.2> expect eof
Assertion failed: (0), function main, file x.c, line 4.
expect1.3> wait
47429 exp6 0 0 CHILDKILLED SIGABRT SIGABRT
Looks like you need to look at the elements returned by wait beyond the 4th.
When I look up what wait does, I see that:
Additional elements may appear at the end of the return value from wait. An optional fifth element identifies a class of information. Currently, the only possible value for this element is CHILDKILLED in which case the next two values are the C-style signal name and a short textual description.
The assert() call uses abort() to terminate the process if the assertion fails, and that shows up as an exit via SIGABRT.
set result [wait]
if {[lindex $result 4] eq "CHILDKILLED"} {
if {[lindex $result 5] eq "SIGABRT"} {
# assertion failed, or other abort
} else {
# other signal
}
} else {
# normal exit, may be with conventional error
}
Error handling can definitely be fiddly!
Related
I'm looking for a builtin or standard package that provides functionality that is similar or equivalent to stdlib's atexit() and bash' trap "..." EXIT.
It should catch termination due to any programatic way of ending execution, including all of the following:
naturally reached end of script execution
explicitly invoked exit
uncaught error
In most cases, all you need to do to intercept such terminations is to intercept the exit command.
rename exit real_exit
proc exit args {
puts "EXITING"; flush stdout; # Flush because I'm paranoid...
tailcall real_exit {*}$args
}
That will obviously work if you call it explicitly, but it also gets called if you just drop off the end of the script, signal end of file in an interactive session, or if your script has an error in it later and terminates with an error message. This is because the Tcl C API call, Tcl_Exit(), works by calling exit and, if that doesn't exit the process, directly does the exit itself.
Be careful with exit scripts BTW; errors in them are harder to debug than normal.
The cases where it doesn't work? Mainly where the interpreter itself has become unable to execute commands (perhaps because it has been deleted out from under itself) or where some signal closes down the interpreter (e.g., SIGINT isn't handled by default for various reasons).
A more-or-less complete atexit based on #Donal's answer:
proc atexit { procbody } {
if { [catch {set oldbody [info body exit]}] } {
rename exit builtin_exit
set oldbody { builtin_exit $returnCode }
}
proc exit { {returnCode 0} } [subst -nocommands {
apply [list [list {returnCode 0}] [list $procbody]] \$returnCode
tailcall apply [list [list {returnCode 0}] [list $oldbody]] \$returnCode
}]
}
Sample code for atexit-test.tcl:
#!/usr/bin/tclsh8.6
source atexit.tcl
atexit {
puts "EXITING($returnCode)"; flush stdout; # Flush because I'm paranoid...
}
atexit {
puts "done($returnCode)..."; flush stdout; # Flush because I'm paranoid...
}
atexit {
puts "almost($returnCode)..."; flush stdout; # Flush because I'm paranoid...
}
{*}$argv
puts "fell through argv for implicit exit..."
... and terminal session:
$ ./atexit-test.tcl
fell through argv for implicit exit...
almost(0)...
done(0)...
EXITING(0)
$ ./atexit-test.tcl exit
almost(0)...
done(0)...
EXITING(0)
$ ./atexit-test.tcl exit 5
almost(5)...
done(5)...
EXITING(5)
$ ./atexit-test.tcl error "unhandled exception"
unhandled exception
while executing
"{*}$argv"
(file "./atexit-test.tcl" line 17)
almost(1)...
done(1)...
EXITING(1)
$
Why does the following code:
#!/usr/bin/env tclsh
package require cmdline;
set options {{d.arg "" "destination directory"}}
set usage ": $::argv0 \[options] filename ...\noptions:"
set params [::cmdline::getoptions ::argv $options $usage]
throw the following error upon execution of ./main.tcl -help?
main : ./main.tcl [options] filename ...
options:
-d value destination directory <>
-help Print this message
-? Print this message
while executing
"error [usage $optlist $usage]"
(procedure "::cmdline::getoptions" line 15)
invoked from within
"::cmdline::getoptions ::argv $options $usage"
invoked from within
"set params [::cmdline::getoptions ::argv $options $usage]"
(file "./main.tcl" line 8)
It should display the usage information, but I didn't expect the error afterwards. Did I do something wrong?
From what I understand from the docs (emphasis mine):
The options -?, -help, and -- are implicitly understood. The first two abort option processing by throwing an error and force the generation of the usage message, whereas the the last aborts option processing without an error, leaving all arguments coming after for regular processing, even if starting with a dash.
using -help or -? will always throw an error.
Further down in the docs you can see an example where try { ... } trap { ... } is being used in conjunction with ::cmdline::getoptions, which might be how you might want to do it:
try {
array set params [::cmdline::getoptions ::argv $options $usage]
} trap {CMDLINE USAGE} {msg o} {
# Trap the usage signal, print the message, and exit the application.
# Note: Other errors are not caught and passed through to higher levels!
puts $msg
exit 1
}
I have expect (tcl) script for automated task working properly - configuring network devices via telnet/ssh. Most of the cases there is 1,2 or 3 command lines to execute, BUT now I have more then 100 command lines to send via expect. How can I achieved this in smart and good scripting way :)
Because I can join all command lines over 100 to a variable "commandAll" with "\n" and "send" them one after another, but I think it's pretty ugly :) Is there a way without stacking them together to be readable in code or external file ?
#!/usr/bin/expect -f
set timeout 20
set ip_address "[lrange $argv 0 0]"
set hostname "[lrange $argv 1 1]"
set is_ok ""
# Commands
set command1 "configure snmp info 1"
set command2 "configure ntp info 2"
set command3 "configure cdp info 3"
#... more then 100 dif commands like this !
#... more then 100 dif commands like this !
#... more then 100 dif commands like this !
spawn telnet $ip_address
# login & Password & Get enable prompt
#-- snnipped--#
# Commands execution
# command1
expect "$enableprompt" { send "$command1\r# endCmd1\r" ; set is_ok "command1" }
if {$is_ok != "command1"} {
send_user "\n### 9 Exit before executing command1\n" ; exit
}
# command2
expect "#endCmd1" { send "$command2\r# endCmd2\r" ; set is_ok "command2" }
if {$is_ok != "command2"} {
send_user "\n### 9 Exit before executing command2\n" ; exit
}
# command3
expect "#endCmd2" { send "$command3\r\r\r# endCmd3\r" ; set is_ok "command3" }
if {$is_ok != "command3"} {
send_user "\n### 9 Exit before executing command3\n" ; exit
}
p.s. I'm using one approach for cheeking is given cmd line is executed successfully but I'm not certain that is perfect way :D
don't use numbered variables, use a list
set commands {
"configure snmp info 1"
"configure ntp info 2"
"configure cdp info 3"
...
}
If the commands are already in a file, you can read them into a list:
set fh [open commands.file]
set commands [split [read $fh] \n]
close $fh
Then, iterate over them:
expect $prompt
set n 0
foreach cmd $commands {
send "$cmd\r"
expect {
"some error string" {
send_user "command failed: ($n) $cmd"
exit 1
}
timeout {
send_user "command timed out: ($n) $cmd"
exit 1
}
$prompt
}
incr n
}
While yes, you can send long sequences of commands that way, it's usually a bad idea as it makes the overall script very brittle; if anything unexpected happens, the script just keeps on forcing the rest of the script over. Instead, it is better to have a sequence of sends interspersed with expects to check that what you've sent has been accepted. The only real case for sending a very long string over is when you're creating a function or file on the other side that will act as a subprogram that you call; in that case, there's no really meaningful place to stop and check for a prompt half way. But that's the exception.
Note that you can expect two things at once; that's often very helpful as it lets you check for errors directly. I mention this because it is a technique often neglected, yet it allows you to make your script far more robust.
...
send "please run step 41\r"
expect {
-re {error: (.*)} {
puts stderr "a problem happened: $expect_out(1,string)"
exit 1
}
"# " {
# Got the prompt; continue with the next step below
}
}
send "please run step 42\n"
...
I have a case when the Tcl script runs a process, which does fork(), leaves the forked process to run, and then the main process exits. You can try it out simply by running any program that forks to background, for example gvim, provided that it is configured to run in background after execution: set res [exec gvim].
The main process theoretically exits immediately, the child process runs in background, but somehow the main process hangs up, doesn't exit, stays in zombie state (reports as <defunct> in ps output).
In my case the process I'm starting prints something, I want that something and I want that the process exit and I state it done. The problem is that if I spawn the process using open "|gvim" r, then I cannot also recognize the moment when the process has finished. The fd returned by [open] never reports [eof], even when the program turns into zombie. When I try to [read], just to read everything that the process might print, it hangs up completely.
What is more interesting, is that occasionally both the main process and the forked process print something and when I'm trying to read it using [gets], I get both. If I close the descriptor too early, then [close] throws an exception due to broken pipe. Probably that's why [read] never ends.
I need some method to recognize the moment when the main process exits, while this process could have spawned another child process, but this child process may be completely detached and I'm not interested what it does. I want something that the main process prints before exitting and the script should continue its work while the process running in background is also running and I'm not interested what happens to it.
I have a control over the sources of the process I'm starting. Yes, I did signal(SIGCLD, SIG_IGN) before fork() - didn't help.
Tcl clears up zombies from background process calls the next time it calls exec. Since a zombie really doesn't use much resources (just an entry in the process table; there's nothing else there really) there isn't a particular hurry to clean them up.
The problem you were having with the pipeline was that you'd not put it in non-blocking mode. To detect exit of a pipeline, you're best off using a fileevent which will fire when either there's a byte (or more) to read from the pipe or when the other end of the pipe is closed. To distinguish these cases, you have to actually try to read, and that can block if you over-read and you're not in non-blocking mode. However, Tcl makes working with non-blocking mode easy.
set pipeline [open |… "r"]
fileevent $pipeline readable [list handlePipeReadable $pipeline]
fconfigure $pipeline -blocking false
proc handlePipeReadable {pipe} {
if {[gets $pipe line] >= 0} {
# Managed to actually read a line; stored in $line now
} elseif {[eof $pipe]} {
# Pipeline was closed; get exit code, etc.
if {[catch {close $pipe} msg opt]} {
set exitinfo [dict get $opt -errorcode]
} else {
# Successful termination
set exitinfo ""
}
# Stop the waiting in [vwait], below
set ::donepipe $pipeline
} else {
# Partial read; things will be properly buffered up for now...
}
}
vwait ::donepipe
Be aware that using gvim in a pipeline is… rather more complex than usual, as it is an application that users interact with.
You might find it easier to run a simple exec in a separate thread, provided your version of Tcl is thread-enabled and the Thread package is installed. (That ought to be the case if you're using 8.6, but I don't know if that's true.)
package require Thread
set runner [thread::create {
proc run {caller targetVariable args} {
set res [catch {
exec {*}$args
} msg opt]
set callback [list set $targetVariable [list $res $msg $opt]]
thread::send -async $caller $callback
}
thread::wait
}]
proc runInBackground {completionVariable args} {
global runner
thread::send -async $runner [list run [thread::id] $completionVariable {*}$args]
}
runInBackground resultsVar gvim …
# You can do other things at this point
# Wait until the variable is set (by callback); alternatively, use a variable trace
vwait resultsVar
# Process the results to extract the sense
lassign $resultsVar res msg opt
puts "code: $res"
puts "output: $msg"
puts "status dictionary: $opt"
For all that, for an editor like gvim I'd actually expect it to be run in the foreground (which doesn't require anything like as much complexity) since only one of them can really interact with a particular terminal at once.
Your daemon can also call setsid() and setpgrp() to start a new session and to detach from the process group. But these don't help with your problem either.
You will have to do some process management:
#!/usr/bin/tclsh
proc waitpid {pid} {
set rc [catch {exec -- kill -0 $pid}]
while { $rc == 0 } {
set ::waitflag 0
after 100 [list set ::waitflag 1]
vwait ::waitflag
set rc [catch {exec -- kill -0 $pid}]
}
}
set pid [exec ./t1 &]
waitpid $pid
puts "exit tcl"
exit
Edit: Another unreasonable answer
If the forked child process closes the open channels, Tcl will not wait on it.
Test program:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
int
main (int argc, char *argv [])
{
int pid;
FILE *o;
signal (SIGCHLD, SIG_IGN);
pid = fork ();
if (pid == 0) {
/* should also call setsid() and setpgrp() to daemonize */
printf ("child\n");
fclose (stdout);
fclose (stderr);
sleep (10);
o = fopen ("/dev/tty", "w");
fprintf (o, "child exit\n");
fclose (o);
} else {
printf ("parent\n");
sleep (2);
}
printf ("t1 exit %d\n", pid);
return 0;
}
Test Tcl program:
#!/usr/bin/tclsh
puts [exec ./t1]
puts "exit tcl"
At first you say:
I need some method to recognize the moment when the main process exits, while this process could have spawned another child process, but this child process may be completely detached and I'm not interested what it does.
later on you say:
If the forked child process closes the open channels, Tcl will not wait on it.
these are two contradictory statements. One one hand you are only interested in the parent process and on the other whether or not the child has finished even thought you also state you aren't interested in child processes that have detached. Last I heard forking and closing the childs copies of the parents stdin,stdout and stderr is detaching (i.e.daemonizing the child process ). I wrote this quick program to run the above included simple c program and as expected tcl knows nothing of the child process. I called the compiled version of the program /tmp/compile/chuck. I did not have gvim so I used emacs but as emacs does not generate text I wrap the exec in its own tcl script and exec that. In both cases, the parent process is waited for and eof is detected. When the parent exits the Runner::getData runs and the clean up is evaluated.
#!/bin/sh
exec /opt/usr8.6.3/bin/tclsh8.6 "$0" ${1+"$#"}
namespace eval Runner {
variable close
variable watch
variable lastpid ""
array set close {}
array set watch {}
proc run { program { message "" } } {
variable watch
variable close
variable lastpid
if { $message ne "" } {
set fname "/tmp/[lindex $program 0 ]-[pid].tcl"
set out [ open $fname "w" ]
puts $out "#![info nameofexecutable]"
puts $out " catch { exec $program } err "
puts $out "puts \"\$err\n$message\""
close $out
file attributes $fname -permissions 00777
set fd [ open "|$fname " "r" ]
set close([pid $fd]) "file delete -force $fname "
} else {
set fd [ open "|$program" "r" ]
set close([pid $fd]) "puts \"cleanup\""
}
fconfigure $fd -blocking 0 -buffering none
fileevent $fd readable [ list Runner::getData [ pid $fd ] $fd ]
}
proc getData { pid chan } {
variable watch
variable close
variable lastpid
set data [read $chan]
append watch($pid) "$data"
if {[eof $chan]} {
catch { close $chan }
eval $close($pid) ; # cleanup
set lastpid $pid
}
}
}
Runner::run /tmp/compile/chuck ""
Runner::run emacs " Emacs complete"
while { 1 } {
vwait Runner::lastpid
set p $Runner::lastpid
catch { exec ps -ef | grep chuck } output
puts "program with pid $p just ended"
puts "$Runner::watch($p)"
puts " processes that match chuck "
puts "$output"
}
Output :
note I exited out of emacs after the child reported that it was exiting.
[user1#linuxrocks workspace]$ ./test.tcl
cleanup
program with pid 27667 just ended
child
parent
t1 exit 27670
processes that match chuck avahi 936 1 0 2016 ?
00:04:35 avahi-daemon: running [linuxrocks.local] admin 27992 1 0
19:37 pts/0 00:00:00 /tmp/compile/chuck admin 28006 27988 0
19:37 pts/0 00:00:00 grep chuck
child exit
program with pid 27669 just ended
Emacs complete
Ok, I found the solution after a long discussion here:
https://groups.google.com/forum/#!topic/comp.lang.tcl/rtaTOC95NJ0
The below script demonstrates how this problem can be solved:
#!/usr/bin/tclsh
lassign [chan pipe] input output
chan configure $input -blocking no -buffering line ;# just for a case :)
puts "Running $argv..."
set ret [exec {*}$argv 2>#stderr >#$output]
puts "Waiting for finished process..."
set line [gets $input]
puts "FIRST LINE: $line"
puts "DONE. PROCESSES:"
puts [exec ps -ef | grep [lindex $argv 0]]
puts "EXITING."
The only problem that remains is that there's still no possibility to know that the process has exited, however the next [exec] (in this particular case probably the [exec ps...] command did this) cleans up the zombie (No universal method for that - the best you can do on POSIX systems is [exec /bin/true]). In my case it was enough that I get one line that the parent process had to print, after which I can simply "let it go".
Still, it would be nice if [exec] can return me somehow the PID of the first process and there's a standard [wait] command that can block until the process exits or check its running state (this command is currently available in TclX).
Note that [chan pipe] is available only in Tcl 8.6, you can use [pipe] from TclX alternatively.
Consider this code:
set status [catch {eval exec $Executable $options | grep "(ERROR_|WARNING)*" ># stdout} errorMessage]
if { $status != 0 } {
return -code error ""
}
In case of errors in the child process, they are outputted in the stdout. But if there are no errors in the child process, the status value still non-zero. How avoid this?
Also is there are some ways to use fileutil::grep instead of bash grep?
In case of errors in the child process, they are outputted in the stdout. But if there are no errors in the child process, the status value still non zero. How avoid this?
There's no connection between writing something to any file descriptor (including the one connected to the "standadrd error stream") and returning a non-zero exit code as these concepts are completely separate as far as an OS is concerned. A process is free to perform no I/O at all and return a non-zero exit code (a somewhat common case for Unix daemons, which log everything, including errors, through syslog), or to write something to its standard error stream and return zero when exiting — a common case for software which write certain valuable data to their stdout and provide diagnostic messages, when requested, to their stderr.
So, first verify your process writes nothing to its standard error and still exits with non-zero exit code using plain shell
$ that_process --its --command-line-options and arguments if any >/dev/null
$ echo $?
(the process should print nothing, and echo $? should print a non-zero number).
If the case is true, and you're sure the process does not think something is wrong, you'll have to work around it using catch and processing the extended error information it returns — ignoring the case of the process exiting with the known exit code and propagating every other error.
Basically:
set rc [catch {exec ...} out]
if {$rc != 0} {
global errorCode errorInfo
if {[lindex $errorCode 0] ne "CHILDSTATUS"} {
# The error has nothing to do with non-zero process exit code
# (for instance, the program wasn't found or the current user
# did not have the necessary permissions etc), so pass it up:
return -code $rc -errorcode $errorCode -errorinfo $errorInfo $out
}
set exitcode [lindex $errorCode 2]
if {$exitcode != $known_exit_code} {
# Unknown exit code, propagate the error:
return -code $rc -errorcode $errorCode -errorinfo $errorInfo $out
}
# OK, do nothing as it has been a known exit code...
}
CHILDSTATUS (and the errorCode global variable in general) is described here.