How to carry out synchronous processes in TCL

I am trying to carry out two processes in parallel. Help me write code in Tcl that carries out two processes simultaneously.

In Tcl, there are two ways to run a pair of subprocesses “at the same time”.
Simplest: Without control
If you just want to fire off two processes at once without keeping any control over them, put an ampersand (&) as the last argument to exec:
exec process1 "foo.txt" &
exec process2 "bar.txt" &
Note that, apart from the process IDs (returned by exec), you've got no control over these subprocesses at all. Once you set them going, you'll essentially never hear from them again (using appropriate redirections to/from standard in/out may well be advisable!)
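For fire-and-forget jobs, redirecting the standard channels is usually worthwhile so the children cannot scribble on your terminal; `process1` and `process2` here are just the placeholder commands from above:

```tcl
# Launch both in the background; exec with & returns the process ID(s).
# Output and errors are discarded via redirection.
set pid1 [exec process1 "foo.txt" > /dev/null 2> /dev/null &]
set pid2 [exec process2 "bar.txt" > /dev/null 2> /dev/null &]
puts "started $pid1 and $pid2"
```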
More complex: With control
To keep control over a subprocess while running it in the background, make it run in a pipeline created with open. The syntax for doing so is rather odd; be sure to follow it exactly (except as noted below):
set pipelineChannel1 [open |[list process1 "foo.txt"] "r"]
set pipelineChannel2 [open |[list process2 "bar.txt"] "r"]
These are reader pipelines where you're consuming the output of the subprocesses; that's what the (optional) r means. To get a pipeline that you write to (i.e., that you provide input to) you use w instead, and if you want to both read and write, use r+. The pipelines are then just normal channels that you use with puts, gets, read, fconfigure, etc. Just close when you are done.
The | must come outside and immediately before the [list …]. This matters especially if the name of the command (possibly a full pathname) has any Tcl metacharacters in it, and is because the specification of open says this:
If the first character of fileName is “|” then the remaining characters of fileName are treated as a list of arguments that describe a command pipeline to invoke, in the same style as the arguments for exec.
The main things to beware of when working with a pipeline are these:
The processing of the subprocesses really is asynchronous. You need to take care to avoid forcing too much output through at once, though turning on non-blocking IO with fconfigure $channel -blocking 0 is usually enough there.
The other processes can (and frequently do) buffer their output differently when outputting to a pipeline than when they're writing to a terminal. If this is a problem, you'll have to consider whether to use a package like Expect (which can also run multiple interactions at once, though that should be used much more sparingly as virtual terminals are a much more expensive and limited system resource than pipelines).
If you're doing truly complex asynchronous interactions with the subprocesses, consider using Tcl 8.6 where there are Tcllib packages built on top of the base coroutine feature that make keeping track of what's going on much easier.
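Putting the pieces above together, a minimal event-driven reader might look like this (assuming a long-running `process1`, the placeholder command from above, that writes lines to stdout):

```tcl
# Open a reader pipeline and switch it to non-blocking mode.
set pipelineChannel [open |[list process1 "foo.txt"] "r"]
fconfigure $pipelineChannel -blocking 0

proc onOutput {channel} {
    if {[gets $channel line] >= 0} {
        puts "got: $line"
    } elseif {[eof $channel]} {
        # Subprocess finished; a blocking close reaps it and
        # reports any errors it exited with.
        fconfigure $channel -blocking 1
        close $channel
        set ::done 1
    }
}
fileevent $pipelineChannel readable [list onOutput $pipelineChannel]
vwait ::done
```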

Related

Tcl_DoOneEvent is blocked if tkwait / vwait is called

There is an external C++ function that is called from Tcl/Tk and does some stuff in a noticeable amount of time. Tcl caller has to get the result of that function so it waits until it's finished. To avoid blocking of GUI, that C++ function has some kind of event loop implemented in its body:
while (m_curSyncProc.isRunning()) {
    const clock_t tm = clock();
    while (Tcl_DoOneEvent(TCL_ALL_EVENTS | TCL_DONT_WAIT) > 0) {} // <- stuck here in case of tkwait/vwait
    // Pause for 10 ms to avoid 100% CPU usage
    if (double(clock() - tm) / CLOCKS_PER_SEC < 0.005) {
        nanosleep(10000);
    }
}
Everything works great unless tkwait/vwait is in action in Tcl code.
For example, for dialogs, tkwait variable someVariable is used to wait until the Ok/Close/<whatever> button is pressed. I see that even the standard Tk bgerror uses the same method (it uses vwait).
The problem is that, once called, Tcl_DoOneEvent does not return while the Tcl code is waiting at a tkwait/vwait line; otherwise it works well. Is it possible to fix this in that event loop without totally redesigning the C++ code? That code is rather old and complicated, and its author is not accessible anymore.
Beware! This is a complex topic!
The Tcl_DoOneEvent() call is essentially what vwait, tkwait and update are thin wrappers around (passing different flags and setting up different callbacks). Nested calls to any of them create nested event loops; you don't really want those unless you're supremely careful. An event loop only terminates when it is not processing any active event callbacks, and if those event callbacks create inner event loops, the outer event loop will not get to do anything at all until the inner one has finished.
As you're taking control of the outer event loop (in a very inefficient way, but oh well) you really want the inner event loops to not run at all. There are three possible ways to deal with this; I suspect that the third (coroutines) will be most suitable for you and that the first is what you're really trying to avoid, but that's definitely your call.
1. Continuation Passing
You can rewrite the inner code into continuation-passing style — a big pile of procedures that hands off from step to step through a state machine/workflow — so that it doesn't actually call vwait (and friends). The only one of the family that tends to be vaguely safe is update idletasks (which is really just Tcl_DoOneEvent(TCL_IDLE_EVENTS | TCL_DONT_WAIT)) to process Tk internally-generated alterations.
This option was your main choice up to Tcl 8.5, and it was a lot of work.
2. Threads
You can move to a multi-threaded application. This can be easy… or very difficult; the details depend on an examination of what you're doing throughout the application.
If going this route, remember that Tcl interpreters and Tcl values are totally thread-bound; they internally use thread-specific data so that they can avoid big global locks. This means that threads in Tcl are comparatively expensive to set up, but actually use multiple CPUs very efficiently afterwards; thread pooling is a very common approach.
3. Coroutines
Starting in 8.6, you can put the inner code in a coroutine. Almost everything in 8.6 is coroutine-aware (“non-recursive” in our internal lingo) by default (including commands you wouldn't normally think of, such as source) and once you've done that, you can replace the vwait calls with equivalents from the Tcllib coroutine package and things will typically “just work”. (For example, vwait var becomes coroutine::vwait var, and after 123 becomes coroutine::after 123.)
The only things that don't have direct replacements are tkwait window and tkwait visibility; you'll need to simulate those with waiting for a <Destroy> or <Visibility> event (the latter is uncommon as it is unsupported on some platforms), which you do by binding a trivial callback on those that just sets a variable that you can coroutine::vwait on (which is essentially all that tkwait does internally anyway).
Coroutines can become messy in a few cases, such as when you've got C code that is not coroutine-aware. The main places in Tcl where these come into play are in trace callbacks, inter-interpreter calls, and the scripted implementations of channels; the issue there is that the internal APIs these sit behind are rather complicated already (especially channels) and nobody's felt up to wading in and enabling a non-recursive implementation.
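As a concrete sketch of the tkwait window simulation described above (the window path and variable name are illustrative, and coroutine::vwait is the Tcllib replacement mentioned earlier):

```tcl
# Set a variable when the dialog itself (not a child) is destroyed...
bind .dlg <Destroy> {+if {"%W" eq ".dlg"} {set ::dlgGone 1}}

# ...then, inside the coroutine, park on that variable instead of
# calling tkwait window .dlg (which would start a nested event loop).
coroutine::vwait ::dlgGone
```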

Is there a way in tcl to replace the tcl process with the process of a command?

I would like to do the equivalent of exec in sh, or execve(2), just transfer control from tcl to a called command and never return, so that any signals sent to what used to be the tcl process go directly to the called command, stdout of the called command is the stdout of what used to be the tcl process, etc.
I have looked around in the manuals, but haven't found an obvious way to do this in tcl.
I'm not talking about tcl's exec, which keeps tcl running, creates a subprocess and captures its output, then resumes control flow in the tcl process.
There's nothing built into Tcl itself at the moment, but the TclX extension includes a command, execl, which can do exactly what you want.
package require TclX
execl /bin/bash [list -c somescript.sh]
Note that doing an execl that fails may put the process in an inconsistent state, as some libraries use hooks to detect the underlying system call and close their resources. It depends on exactly what you're doing of course, but since the X11 library is a notable example of this, Tk applications are unsafe to continue the current process with if execl fails. (The fork/execl idiom — or Tcl's standard exec — doesn't have the problem, as the problem syscall always happens in a subprocess.)
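Since TclX also provides fork, the fork/execl idiom the answer alludes to can be sketched like this; a failed execl then only damages the (doomed) child, never the parent:

```tcl
package require TclX

set pid [fork]
if {$pid == 0} {
    # Child: replace this process with the command.
    execl /bin/bash [list -c somescript.sh]
    exit 1   ;# only reached if execl itself failed
}
wait $pid    ;# parent: reap the child when it finishes
```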

When is a command's compile function called?

My understanding of Tcl execution is that, if a command's compile function is defined, it is called first when the command comes to be executed, before its execution function is called.
Take command append as example, here is its definition in tclBasic.c:
static CONST CmdInfo builtInCmds[] = {
    {"append", (Tcl_CmdProc *) NULL, Tcl_AppendObjCmd,
        TclCompileAppendCmd, 1},
Here is my testing script:
$ cat t.tcl
set l [list 1 2 3]
append l 4
I add gdb breakpoints at both functions, TclCompileAppendCmd and Tcl_AppendObjCmd. My expectation is TclCompileAppendCmd is hit before Tcl_AppendObjCmd.
Gdb's target is tclsh8.4 and argument is t.tcl.
What I see is interesting:
TclCompileAppendCmd does get hit first, but it is not from t.tcl; rather, it is from init.tcl.
TclCompileAppendCmd gets hit several times, and all of them are from init.tcl.
The first time t.tcl executes, it is Tcl_AppendObjCmd that gets hit, not TclCompileAppendCmd.
I cannot make sense of it:
Why is the compile function called for init.tcl but not for t.tcl?
Each script should be independently compiled, i.e. the object with the compiled command append in init.tcl is not reused for later scripts, is it?
[UPDATE]
Thanks Brad for the tip; after I moved the script into a proc, I can see that TclCompileAppendCmd is hit.
The compilation function (TclCompileAppendCmd in your example) is called by the bytecode compiler when it wants to issue bytecode for a particular instance of that particular command. The bytecode compiler also has a fallback if there is no compilation function for a command: it issues instructions to invoke the standard implementation (which would be Tcl_AppendObjCmd in this case; the NULL in the other field causes Tcl to generate a thunk in case someone really insists on using a particular API but you can ignore that). That's a useful behaviour, because it is how operations like I/O are handled; the overhead of calling a standard command implementation is pretty small by comparison with the overhead of doing disk or network I/O.
But when does the bytecode compiler run?
On one level, it runs whenever the rest of Tcl asks for it to be run. Simple! But that's not really helpful to you. More to the point, it runs whenever Tcl evaluates a script value in a Tcl_Obj that doesn't already have bytecode type (or if the saved bytecode indicates that it is for a different resolution context or different compilation epoch) except if the evaluation has asked to not be bytecode compiled by the flag TCL_EVAL_DIRECT to Tcl_EvalObjEx or Tcl_EvalEx (which is a convenient wrapper for Tcl_EvalObjEx). It's that flag which is causing you problems.
When is that flag used?
It's actually pretty simple: it's used when some code is believed to be going to be run only once because then the cost of compilation is larger than the cost of using the interpretation path. It's particularly used by Tk's bind command for running substituted script callbacks, but it is also used by source and the main code of tclsh (essentially anything using Tcl_FSEvalFileEx or its predecessors/wrappers Tcl_FSEvalFile and Tcl_EvalFile). I'm not 100% sure whether that's the right choice for a sourced context, but it is what happens now. However, there is a workaround that is (highly!) worthwhile if you're handling looping: you can put the code in a compiled context within that source using a procedure that you call immediately or use an apply (I recommend the latter these days). init.tcl uses these tricks, which is why you were seeing it compile things.
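To see the effect in a sourced file, compare the same (purely illustrative) loop at top level and inside an apply; only the latter runs in a compiled context, so only it reaches TclCompileAppendCmd:

```tcl
# Top level of a sourced script: evaluated with TCL_EVAL_DIRECT,
# so command compile functions are not used.
for {set i 0} {$i < 100000} {incr i} {append s x}

# Wrapped in apply: a procedure-like compiled context,
# so append gets bytecode-compiled.
apply {{} {
    for {set i 0} {$i < 100000} {incr i} {append s x}
}}
```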
And no, we don't normally save compiled scripts between runs of Tcl. Our compiler is fast enough that that's not really worthwhile; the cost of verifying that the loaded compiled code is correct for the current interpreter is high enough that it's actually faster to recompile from the source code. Our current compiler is fast (I'm working on a slower one that generates enormously better code). There's a commercial tool suite from ActiveState (the Tcl Dev Kit) which includes an ahead-of-time compiler, but that's focused around shrouding code for the purposes of commercial deployment and not speed.

Does Tcl eval command prevent byte coding?

I know that in some dynamic, interpreted languages, using eval can slow things down, as it stops byte-coding. Is it so in Tcl 8.5?
Thanks
It doesn't prevent bytecode compilation, but it can slow things down anyway. The key issue is that it can prevent the bytecode compiler from having access to the local variable table (LVT) during compilation, forcing variable accesses to go via a hash lookup. Tcl's got an ultra-fast hash algorithm (we've benchmarked it a lot and tried a lot of alternatives; it's very hot code) but the LVT has it beat as that's just a simple C array lookup when the bytes hit the road. The LVT is only known properly when compiling a whole procedure (or other procedure-like thing, such as a lambda term or TclOO method).
Now, I have tried making this specific case:
eval {
    # Do stuff in here...
}
be fully bytecode-compiled and it mostly works (apart from a few weird things that are currently observable but perhaps shouldn't be) yet for the amount that we use that, it's just plain not worth it. In any other case, the fact that the script can't be known too precisely at the point where the compiler is running forces the LVT-less operation mode.
On the other hand, it's not all doom and gloom. Provided the actual script being run inside the eval doesn't change (and that includes not being regenerated through internal concat — multi-argument eval never gets this benefit) Tcl can cache the compilation of the code in the internal representation of the script value, LVT-less though it is, and so there's still quite a bit of performance gain there. This means that this isn't too bad, performance wise:
set script {
    foo bar $boo
}
for {set i 0} {$i < 10} {incr i} {
    eval $script
}
If you have real performance-sensitive code, write it without eval. Expansion syntax — {*} — can help here, as can helper procedures. Or write the critical bits in C or C++ or Fortran or … (see the critcl and ffidl extension packages for details of cool ways to do this, or just load the DLL as needed if it has a suitable *_Init function).
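As an illustration of the expansion alternative (callWith and its arguments are hypothetical), this keeps everything inside a fully compiled procedure body with LVT access:

```tcl
proc callWith {cmd args} {
    # Instead of: eval $cmd $args   (multi-argument eval, re-concatenated
    # every time and never cached as bytecode)
    $cmd {*}$args
}
callWith lappend ::mylist a b c
```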

Handling DoS from untrusted sockets (and other streams)

This TIP confused me. It seems to be saying that -buffering line makes the input buffer infinitely large, when I thought line buffering only affected flushing of output? Can't I use -buffersize 5000 together with -buffering line to protect me from people sending long lines? If I can, then what good is chan pending? To discover when the buffer is full without a line break in it?
Or are there two different buffers? One that's just for pre-reading data to save time, and one internal that commands like gets and read use?
EDIT: Or is the problem created only when you use gets, because it doesn't return partial lines? Does gets put the stream into an infinitely large buffer mode, because otherwise, if the buffer filled up without a line break, gets could never return it? Is this the "line buffer mode" that the TIP talks about?
First off, the -buffersize option is for output, not input. I've never needed to set it in the past few years; Tcl's buffer management is pretty good.
Secondly, the -buffering option is also for output.
Thirdly, you're vulnerable to someone sending you a vastly long line if you're using blocking channels. You just have no opportunity to do anything other than wait for the end of the line (or the end of the file) to come.
But in non-blocking mode, things are more subtle. You get a readable fileevent for the channel (not relevant for files, but you can check their size is sane more easily, and they're not normally a problem in any case) and do a gets $theChannel line, which returns -1. (If it returns 0 or more, you've got a complete line.)
So what does the -1 mean? Well, it means that either the line is incomplete or you've got to the end of the stream. You can distinguish the cases with fblocked/chan blocked (or eof to detect the reverse case) and you find that the line isn't there yet. What now? Check to see how much data has been buffered with chan pending input; if there's a silly amount (where “silly” is tunable) then it's time to give up on the channel as the other side isn't being nice (i.e., just close it).
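A sketch of that receive loop (the 64 KiB threshold and handleLine are placeholders for your own policy and processing; $sock is an already-open socket):

```tcl
proc onReadable {ch} {
    if {[gets $ch line] >= 0} {
        handleLine $line
    } elseif {[eof $ch]} {
        close $ch
    } elseif {[chan pending input $ch] > 65536} {
        # Lots buffered but still no newline: the other side
        # isn't being nice, so give up on the channel.
        close $ch
    }
}
chan configure $sock -blocking 0
chan event $sock readable [list onReadable $sock]
```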
I've yet to see a real use for chan pending output that isn't happier with writable fileevents, but it's not usually a big problem: just using fcopy/chan copy to spool data from large sources to the (slow) output channel works fine without bloating buffers a lot.