Consider this code:
set status [catch {eval exec $Executable $options | grep "(ERROR_|WARNING)*" ># stdout} errorMessage]
if { $status != 0 } {
return -code error ""
}
In case of errors in the child process, they are outputted in the stdout. But if there are no errors in the child process, the status value still non-zero. How avoid this?
Also is there are some ways to use fileutil::grep instead of bash grep?
In case of errors in the child process, they are outputted in the stdout. But if there are no errors in the child process, the status value still non zero. How avoid this?
There's no connection between writing something to any file descriptor (including the one connected to the "standadrd error stream") and returning a non-zero exit code as these concepts are completely separate as far as an OS is concerned. A process is free to perform no I/O at all and return a non-zero exit code (a somewhat common case for Unix daemons, which log everything, including errors, through syslog), or to write something to its standard error stream and return zero when exiting — a common case for software which write certain valuable data to their stdout and provide diagnostic messages, when requested, to their stderr.
So, first verify your process writes nothing to its standard error and still exits with non-zero exit code using plain shell
$ that_process --its --command-line-options and arguments if any >/dev/null
$ echo $?
(the process should print nothing, and echo $? should print a non-zero number).
If the case is true, and you're sure the process does not think something is wrong, you'll have to work around it using catch and processing the extended error information it returns — ignoring the case of the process exiting with the known exit code and propagating every other error.
Basically:
set rc [catch {exec ...} out]
if {$rc != 0} {
global errorCode errorInfo
if {[lindex $errorCode 0] ne "CHILDSTATUS"} {
# The error has nothing to do with non-zero process exit code
# (for instance, the program wasn't found or the current user
# did not have the necessary permissions etc), so pass it up:
return -code $rc -errorcode $errorCode -errorinfo $errorInfo $out
}
set exitcode [lindex $errorCode 2]
if {$exitcode != $known_exit_code} {
# Unknown exit code, propagate the error:
return -code $rc -errorcode $errorCode -errorinfo $errorInfo $out
}
# OK, do nothing as it has been a known exit code...
}
CHILDSTATUS (and the errorCode global variable in general) is described here.
Related
Here is my test.cpp program. It exits abnormally via an assert(0).
#include <cassert>
int main() {
assert(0);
}
When I run this program directly I get the expected output including a non-zero exit status:
$ ./test
...
$ echo $?
134
But when I try to detect the abnormal exit in tcl/expect I don't seem to be able to:
#!/usr/bin/expect
spawn ./test
expect eof
lassign [wait] pid spawnid os_error_flag value
if {$os_error_flag != 0} {
puts "OS error"
exit 1
}
if {$value != 0} {
puts "Application error"
exit 1
}
puts "No error"
When I run that script:
$ ./test.expect
No error
If I use exit(1) instead of assert(0) then the tcl script is able to detect the abnormal exit. Why doesn't tcl/expect provide an OS- or application-returned error code for assertion failures and how can I uniformly detect all abnormal program exits by checking the exit code?
Not an answer, but an extended comment:
Running that code, I see:
$ ./a.out; echo $?
Assertion failed: (0), function main, file x.c, line 4.
Abort trap: 6
134
And in expect, I see:
$ expect
expect1.1> spawn ./a.out
spawn ./a.out
47429
expect1.2> expect eof
Assertion failed: (0), function main, file x.c, line 4.
expect1.3> wait
47429 exp6 0 0 CHILDKILLED SIGABRT SIGABRT
Looks like you need to look at the elements returned by wait beyond the 4th.
When I look up what wait does, I see that:
Additional elements may appear at the end of the return value from wait. An optional fifth element identifies a class of information. Currently, the only possible value for this element is CHILDKILLED in which case the next two values are the C-style signal name and a short textual description.
The assert() call uses abort() to terminate the process if the assertion fails, and that shows up as an exit via SIGABRT.
set result [wait]
if {[lindex $result 4] eq "CHILDKILLED"} {
if {[lindex $result 5] eq "SIGABRT"} {
# assertion failed, or other abort
} else {
# other signal
}
} else {
# normal exit, may be with conventional error
}
Error handling can definitely be fiddly!
Why does the following code:
#!/usr/bin/env tclsh
package require cmdline;
set options {{d.arg "" "destination directory"}}
set usage ": $::argv0 \[options] filename ...\noptions:"
set params [::cmdline::getoptions ::argv $options $usage]
throw the following error upon execution of ./main.tcl -help?
main : ./main.tcl [options] filename ...
options:
-d value destination directory <>
-help Print this message
-? Print this message
while executing
"error [usage $optlist $usage]"
(procedure "::cmdline::getoptions" line 15)
invoked from within
"::cmdline::getoptions ::argv $options $usage"
invoked from within
"set params [::cmdline::getoptions ::argv $options $usage]"
(file "./main.tcl" line 8)
It should display the usage information, but I didn't expect the error afterwards. Did I do something wrong?
From what I understand from the docs (emphasis mine):
The options -?, -help, and -- are implicitly understood. The first two abort option processing by throwing an error and force the generation of the usage message, whereas the the last aborts option processing without an error, leaving all arguments coming after for regular processing, even if starting with a dash.
using -help or -? will always throw an error.
Further down in the docs you can see an example where try { ... } trap { ... } is being used in conjunction with ::cmdline::getoptions, which might be how you might want to do it:
try {
array set params [::cmdline::getoptions ::argv $options $usage]
} trap {CMDLINE USAGE} {msg o} {
# Trap the usage signal, print the message, and exit the application.
# Note: Other errors are not caught and passed through to higher levels!
puts $msg
exit 1
}
I have a case when the Tcl script runs a process, which does fork(), leaves the forked process to run, and then the main process exits. You can try it out simply by running any program that forks to background, for example gvim, provided that it is configured to run in background after execution: set res [exec gvim].
The main process theoretically exits immediately, the child process runs in background, but somehow the main process hangs up, doesn't exit, stays in zombie state (reports as <defunct> in ps output).
In my case the process I'm starting prints something, I want that something and I want that the process exit and I state it done. The problem is that if I spawn the process using open "|gvim" r, then I cannot also recognize the moment when the process has finished. The fd returned by [open] never reports [eof], even when the program turns into zombie. When I try to [read], just to read everything that the process might print, it hangs up completely.
What is more interesting, is that occasionally both the main process and the forked process print something and when I'm trying to read it using [gets], I get both. If I close the descriptor too early, then [close] throws an exception due to broken pipe. Probably that's why [read] never ends.
I need some method to recognize the moment when the main process exits, while this process could have spawned another child process, but this child process may be completely detached and I'm not interested what it does. I want something that the main process prints before exitting and the script should continue its work while the process running in background is also running and I'm not interested what happens to it.
I have a control over the sources of the process I'm starting. Yes, I did signal(SIGCLD, SIG_IGN) before fork() - didn't help.
Tcl clears up zombies from background process calls the next time it calls exec. Since a zombie really doesn't use much resources (just an entry in the process table; there's nothing else there really) there isn't a particular hurry to clean them up.
The problem you were having with the pipeline was that you'd not put it in non-blocking mode. To detect exit of a pipeline, you're best off using a fileevent which will fire when either there's a byte (or more) to read from the pipe or when the other end of the pipe is closed. To distinguish these cases, you have to actually try to read, and that can block if you over-read and you're not in non-blocking mode. However, Tcl makes working with non-blocking mode easy.
set pipeline [open |… "r"]
fileevent $pipeline readable [list handlePipeReadable $pipeline]
fconfigure $pipeline -blocking false
proc handlePipeReadable {pipe} {
if {[gets $pipe line] >= 0} {
# Managed to actually read a line; stored in $line now
} elseif {[eof $pipe]} {
# Pipeline was closed; get exit code, etc.
if {[catch {close $pipe} msg opt]} {
set exitinfo [dict get $opt -errorcode]
} else {
# Successful termination
set exitinfo ""
}
# Stop the waiting in [vwait], below
set ::donepipe $pipeline
} else {
# Partial read; things will be properly buffered up for now...
}
}
vwait ::donepipe
Be aware that using gvim in a pipeline is… rather more complex than usual, as it is an application that users interact with.
You might find it easier to run a simple exec in a separate thread, provided your version of Tcl is thread-enabled and the Thread package is installed. (That ought to be the case if you're using 8.6, but I don't know if that's true.)
package require Thread
set runner [thread::create {
proc run {caller targetVariable args} {
set res [catch {
exec {*}$args
} msg opt]
set callback [list set $targetVariable [list $res $msg $opt]]
thread::send -async $caller $callback
}
thread::wait
}]
proc runInBackground {completionVariable args} {
global runner
thread::send -async $runner [list run [thread::id] $completionVariable {*}$args]
}
runInBackground resultsVar gvim …
# You can do other things at this point
# Wait until the variable is set (by callback); alternatively, use a variable trace
vwait resultsVar
# Process the results to extract the sense
lassign $resultsVar res msg opt
puts "code: $res"
puts "output: $msg"
puts "status dictionary: $opt"
For all that, for an editor like gvim I'd actually expect it to be run in the foreground (which doesn't require anything like as much complexity) since only one of them can really interact with a particular terminal at once.
Your daemon can also call setsid() and setpgrp() to start a new session and to detach from the process group. But these don't help with your problem either.
You will have to do some process management:
#!/usr/bin/tclsh
proc waitpid {pid} {
set rc [catch {exec -- kill -0 $pid}]
while { $rc == 0 } {
set ::waitflag 0
after 100 [list set ::waitflag 1]
vwait ::waitflag
set rc [catch {exec -- kill -0 $pid}]
}
}
set pid [exec ./t1 &]
waitpid $pid
puts "exit tcl"
exit
Edit: Another unreasonable answer
If the forked child process closes the open channels, Tcl will not wait on it.
Test program:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
int
main (int argc, char *argv [])
{
int pid;
FILE *o;
signal (SIGCHLD, SIG_IGN);
pid = fork ();
if (pid == 0) {
/* should also call setsid() and setpgrp() to daemonize */
printf ("child\n");
fclose (stdout);
fclose (stderr);
sleep (10);
o = fopen ("/dev/tty", "w");
fprintf (o, "child exit\n");
fclose (o);
} else {
printf ("parent\n");
sleep (2);
}
printf ("t1 exit %d\n", pid);
return 0;
}
Test Tcl program:
#!/usr/bin/tclsh
puts [exec ./t1]
puts "exit tcl"
At first you say:
I need some method to recognize the moment when the main process exits, while this process could have spawned another child process, but this child process may be completely detached and I'm not interested what it does.
later on you say:
If the forked child process closes the open channels, Tcl will not wait on it.
these are two contradictory statements. One one hand you are only interested in the parent process and on the other whether or not the child has finished even thought you also state you aren't interested in child processes that have detached. Last I heard forking and closing the childs copies of the parents stdin,stdout and stderr is detaching (i.e.daemonizing the child process ). I wrote this quick program to run the above included simple c program and as expected tcl knows nothing of the child process. I called the compiled version of the program /tmp/compile/chuck. I did not have gvim so I used emacs but as emacs does not generate text I wrap the exec in its own tcl script and exec that. In both cases, the parent process is waited for and eof is detected. When the parent exits the Runner::getData runs and the clean up is evaluated.
#!/bin/sh
exec /opt/usr8.6.3/bin/tclsh8.6 "$0" ${1+"$#"}
namespace eval Runner {
variable close
variable watch
variable lastpid ""
array set close {}
array set watch {}
proc run { program { message "" } } {
variable watch
variable close
variable lastpid
if { $message ne "" } {
set fname "/tmp/[lindex $program 0 ]-[pid].tcl"
set out [ open $fname "w" ]
puts $out "#![info nameofexecutable]"
puts $out " catch { exec $program } err "
puts $out "puts \"\$err\n$message\""
close $out
file attributes $fname -permissions 00777
set fd [ open "|$fname " "r" ]
set close([pid $fd]) "file delete -force $fname "
} else {
set fd [ open "|$program" "r" ]
set close([pid $fd]) "puts \"cleanup\""
}
fconfigure $fd -blocking 0 -buffering none
fileevent $fd readable [ list Runner::getData [ pid $fd ] $fd ]
}
proc getData { pid chan } {
variable watch
variable close
variable lastpid
set data [read $chan]
append watch($pid) "$data"
if {[eof $chan]} {
catch { close $chan }
eval $close($pid) ; # cleanup
set lastpid $pid
}
}
}
Runner::run /tmp/compile/chuck ""
Runner::run emacs " Emacs complete"
while { 1 } {
vwait Runner::lastpid
set p $Runner::lastpid
catch { exec ps -ef | grep chuck } output
puts "program with pid $p just ended"
puts "$Runner::watch($p)"
puts " processes that match chuck "
puts "$output"
}
Output :
note I exited out of emacs after the child reported that it was exiting.
[user1#linuxrocks workspace]$ ./test.tcl
cleanup
program with pid 27667 just ended
child
parent
t1 exit 27670
processes that match chuck avahi 936 1 0 2016 ?
00:04:35 avahi-daemon: running [linuxrocks.local] admin 27992 1 0
19:37 pts/0 00:00:00 /tmp/compile/chuck admin 28006 27988 0
19:37 pts/0 00:00:00 grep chuck
child exit
program with pid 27669 just ended
Emacs complete
Ok, I found the solution after a long discussion here:
https://groups.google.com/forum/#!topic/comp.lang.tcl/rtaTOC95NJ0
The below script demonstrates how this problem can be solved:
#!/usr/bin/tclsh
lassign [chan pipe] input output
chan configure $input -blocking no -buffering line ;# just for a case :)
puts "Running $argv..."
set ret [exec {*}$argv 2>#stderr >#$output]
puts "Waiting for finished process..."
set line [gets $input]
puts "FIRST LINE: $line"
puts "DONE. PROCESSES:"
puts [exec ps -ef | grep [lindex $argv 0]]
puts "EXITING."
The only problem that remains is that there's still no possibility to know that the process has exited, however the next [exec] (in this particular case probably the [exec ps...] command did this) cleans up the zombie (No universal method for that - the best you can do on POSIX systems is [exec /bin/true]). In my case it was enough that I get one line that the parent process had to print, after which I can simply "let it go".
Still, it would be nice if [exec] can return me somehow the PID of the first process and there's a standard [wait] command that can block until the process exits or check its running state (this command is currently available in TclX).
Note that [chan pipe] is available only in Tcl 8.6, you can use [pipe] from TclX alternatively.
In my Tcl/Tk script, there is one step to remove some txt file.
I use:
exec rm file1.txt
But if the file dose not exist, then error message will come up which will block the script usage.
What I want to do is remove the file if it exist, and if it does not exist, to skip the error.
Is there a good way of doing this?
Ok, I find the answer: file exists filename works well for this case.
You could use
file delete file1.txt
where trying to delete a non-existent file is not considered an error.
How to avoid having an error stop your program.
The "0th" solution is to use commands that don't raise errors. such as glob -nocomplain instead of just glob, or in this case file delete file1.txt as suggested by timrau.
In some cases it's impossible to prevent errors from being raised. In those cases you can choose from several strategies. Assume you need to call mycmd, and it might raise errors.
# Tcl 8.6
try mycmd on error {} {}
# Tcl 8.4 or later
catch mycmd
This invocation quietly intercepts the error and lets your program continue. This is perfectly acceptable if the error isn't important, e.g. when you attempt to discard a variable that might not exist (catch {unset myvar}).
You might want to take some action when an error is raised, such as reporting it to yourself (as an error message on stderr or in a message box, or in a log of some kind) or by dealing with the error somehow.
try mycmd on error msg {puts stderr "There was a problem: $msg"}
if {[catch mycmd msg]} {
puts stderr "There was a problem: $msg"
}
You might want to take some action only if there was no error:
try {
mycmd
} on ok res {
puts "mycmd returned $res"
} on error msg {
puts stderr "There was a problem: $msg"
}
if {[catch mycmd res]} {
puts stderr "There was a problem: $res"
} else {
puts "mycmd returned $res"
}
For instance, this invocation returns the contents of a file, or the empty string if the file doesn't exist. It makes sure that the channel is closed and the variable holding the channel identifier are destroyed in either case:
set txt [try {
open $filename
} on ok f {
chan read $f
} on error msg {
puts stderr $msg
} finally {
catch {chan close $f}
catch {unset f}
}]
Documentation: catch, chan, file, glob, if, open, puts, set, try
What is actually the difference between raising an exception in TCL via return -code error ...and error ...? When would one be used instead of the other?
The error command produces an error right at the current point; it's great for the cases where you're throwing a problem due to a procedure's internal state.
The return -code error command makes the procedure it is placed in produce an error (as if the procedure was error); it's great for the case where there's a problem with the arguments passed to the procedure (i.e., the caller did something wrong).
The difference really comes when you look at the stack trace.
Here's a (contrived!) example:
proc getNumberFromFile {filename} {
if {![file readable $filename]} {
return -code error "could not read $filename"
}
set f [open $filename]
set content [read $f]
close $f
if {![regexp -- {-?\d+} $content number]} {
error "no number present in $filename"
}
return $number
}
catch {getNumberFromFile no.such.file}
puts $::errorInfo
#could not read no.such.file
# while executing
#"getNumberFromFile no.such.file"
catch {getNumberFromFile /dev/null}
puts $::errorInfo
#no number present in /dev/null
# while executing
#"error "no number present in $filename""
# (procedure "getNumberFromFile" line 9)
# invoked from within
#"getNumberFromFile /dev/null"