Redirecting output of tcl proc to file and output (like tee) Part 2 - tcl

I am using tee from https://wiki.tcl-lang.org/page/Tee to redirect file output from my procedures. I need to redirect both stdout and stderr to the file.
Using the input from Redirecting output of tcl proc to file and output (like tee) I arrived at doing the following:
set LogFile [open ${LogFileName} w]
tee channel stderr $LogFile
tee channel stdout $LogFile
set BuildErrorCode [catch {LocalBuild $BuildName $Path_Or_File} BuildErrMsg]
set BuildErrorInfo $::errorInfo
# Restore stdout and stderr
chan pop stdout
chan pop stderr
# Handle errors from Build ...
I am testing this on three different EDA tools and I have three different issues.
When I run from tclsh (on MSYS2 running on Windows 10) and run either the open source simulator GHDL, ModelSim, or QuestaSim, all the even characters are the NUL character.
If I run ModelSim or QuestaSim from the GUI, I miss the output of each command. Shouldn't that be going to either stdout or stderr?
In Riviera-PRO, I am getting extraneous characters that were previously printed. They are generally the second half of a word.
Am I doing something wrong? I tested out the above code using:
set LogFile [open test_tee.log w]
tee channel stderr $LogFile
tee channel stdout $LogFile
puts "Hello, World!"
puts stderr "Error Channel"
puts stdout "Output Channel"
chan pop stdout
chan pop stderr
And this works well.
I am hoping to find something that works in the general case for all tools rather than having to write a different handler for each tool.
============ Update =============
For #1 above, with #Shawn's suggestion, I tried the following and it did not work.
set LogFile [open ${LogFileName} w]
chan configure $LogFile -encoding ascii
. . .
I also tried the following and it did not work.
set LogFile [open ${LogFileName} w]
fconfigure $LogFile -encoding ascii
. . .
Then I tried updating the write in tee to the following and it did not work:
proc tee::write {fd handle buffer} {
puts -nonewline $fd [encoding convertto ascii $buffer]
return $buffer
}
Any other hints solutions appreciated
============ Update2 =============
I have successfully removed the nul characters by doing the following, except now I have an extra newline. Still not a solution.
proc tee::write {fd handle buffer} {
puts -nonewline $fd [regsub -all \x00 $buffer ""]
return $buffer
}

The extra NUL bytes are probably because the stdout ahd steer channels are being written in UTF-16 (the main use for that encoding is the console on Windows). The tee interceptors you are using come after the data being written is encoded. There's a few ways to fix it, but the easiest is to open the file with the right encoding when reading it.
The output of the commands is not necessarily written to those channels. Code written in C or C++ is entirely free to write directly, and Tcl code cannot see that; it's all happening behind our back. Command results can be intercepted using execution traces, but that cannot see anything that the commands internally print that aren't routed via the Tcl library somehow. (There are a few more options on Unix due to the different ways that the OS handles I/O.)
Don't know what's happening with the extra characters. I can tell you that you are getting what goes through the channel, but there are too many tricks (especially in interactive use!) for a useful guess on that front.

Related

How is a channel's output-buffer content deleted without writing it to the channel?

I don't know much about PHP or Tcl; but I am trying to learn both concurrently.
In PHP, I read that every script should start with ob_start and, therefore, have been using the following.
ob_start(NULL, 0, PHP_OUTPUT_HANDLER_STDFLAGS);
echo header('Content-Length: '.ob_get_length());
ob_end_flush();
ob_end_clean();
In Tcl channels, I see that the options of -buffering full and -buffersize take care of ob_start() and chan flush is analogous to ob_end_flush() and chan pending output returns the number of bytes written to the output buffer but not yet written out.
I've been looking at my two texts on Tcl and the Tcl manual web page for channels and I can't find a method of just clearing the channel output buffer without writing it.
If data is being written to a channel set to -buffering full and an error is caught/trapped is it possible to empty the buffer and not write it to the channel?
It though perhaps that could use chan seek to set the position back to start similar to setting a pointer back to the beginning of a segment of RAM but the pipe example doesn't appear to create a channel that supports seeking.
lassign [chan pipe] rchan wchan
chan configure $rchan -buffering line -blocking 0 -translation crlf
chan configure $wchan -buffering full -blocking 0 -translation crlf
chan puts $wchan "This is the full messsage which shall attempt to truncate."
chan puts stdout "wchan pending: [chan pending output $wchan]"
chan puts stdout "wchan tell: [chan tell $wchan]"
# => -1 Thus, channel does not support seeking.
#chan seek $wchan 5 start
# => Errors invalid seek
chan flush $wchan
chan puts stdout [chan gets $rchan]
Thank you.
Sounds like you want to only output text written to a channel if no error happens in the middle of writing?
One way is to use a variable channel from tcllib; everything written to the channel is stored in a variable, which can then be written out to the real target on successful completion of whatever you're trying to do.
Example:
#!/usr/bin/env tclsh
package require tcl::chan::variable
proc main {} {
variable output
set output ""
set outputchan [::tcl::chan::variable output]
try {
puts $outputchan "Some text"
error "This is an error"
# Won't get written if an error is raised
chan flush $outputchan
puts -nonewline $output
} on error {errMsg errOptions} {
# Report error if you want
} finally {
chan close $outputchan
}
}
main
I don't think Tcl provides the functionality you are looking for. It's assumed that if you send something to a channel then it should always be written out.

Redirecting output of tcl proc to file and output (like tee)

I need redirect that I found in one of my searches:
redirect -tee LogFile.log {include ../RunDemoTests.tcl}
Where include is a TCL proc and ../RunDemoTests.tcl is a parameter to the proc. Is there a library I need to be able to use redirect or is this not general tcl?
I am working in an EDA tool environment that runs under both Windows and Linux, so I need a solution that is just TCL and does not rely on something from the OS.
I have tried numerous variations of:
set ch [open |[list include "../OsvvmLibraries/UART/RunDemoTests.tcl"] r]
set lf [open build.log w]
puts "Starting"
puts $lf "Starting"
while {[gets $ch line]>=0} {
puts $line
puts $lf $line
}
close $lf
However, this only seems to work when the command is something from the OS environment, such as:
set ch [open |[list cat ../tests.pro] r]
Printing from this can be a significant number of lines, buffering is ok, but not collecting the whole file and then printing as the files can be long (180K lines).
In response to a question on comp.lang.tcl a while ago, I created a small Tcl module to provide tee-like functionality in Tcl. I have now published the code on the Tcl wiki.
You would use it like this:
package require tee
tee stdout build.log
try {
puts "Starting"
include ../OsvvmLibraries/UART/RunDemoTests.tcl
} finally {
# Make sure the tee filter is always popped from the filter stack
chan pop stdout
}
This assumes the include RunDemoTests.tcl command produces output to stdout.

process only the top n rows of a gzipped file in Tcl

I need to extract some string values out of a text file from the top n rows (around 50 rows). In some cases the files are gzipped and in some cases they are not.
currently, I'm using the following to read the file, but this reads the entire file and is slow to process very large files.
set f [open "| zcat -f $filename" r]
if the file is not gzipped, then this seems to work OK and seems pretty fast.
set f [open "| head -n 50 $filename" r]
but when it is gzipped, I can't seem to zcat just the top n rows. I've tried this but I get an error at the tclsh
set f [open "| zcat -f $filename | head -n 50" r]
set data [read $f]
close $f
%child killed: write on pipe with no readers
I can just try to catch the error and move on since it does seem to push the data into the $data variable, but I'm wondering if I'm doing something illegal here.
Alternatively, is there a pure Tcl way to accomplish this?
The error comes out of close and is because zcat was stopped (by a signal, SIGPIPE) before it had written all the lines out. It's expected, and you can safely catch it and ignore it. Put the catch around the close.
In Tcl 8.6 (but not any previous version, nor in the free-standing zlib package for Tcl; this was functionality we added when we imported the package into Tcl), you can do this in pure Tcl.
set f [open $filename]
zlib push gunzip $f
# Read those lines! This oneliner is a hack!
set lines [lmap - [lrepeat 50 -] {gets $f}]
# NB: We don't need to put a catch around this now
close $f
The zlib command provides compression and decompression; zlib push is used to add in compression or decompression to a channel, such as in this case where the gunzip channel filter has been applied.

taking file content as file name itself in tcl

I am running one tcl script who is taking file as a input by "stdin".The problem is that its taking the file content as a filename and throwing error while running the script on command line processor.
tcl script is
#!/bin/sh
# SystemInfo_2.tcl \
exec tclsh "$0" ${1+"$#"}
set traps [read stdin];
#set traps "snmp trap test"
set timetrap [clock format [clock seconds]];
set trapout [open Database_traps_event.txt a+];
set javaout [open JavaTrapOutput.txt a+];
puts $trapout $timetrap;
puts $trapout $traps;
puts $trapout "Before executing java program";
set javaprogargs "open {|java -cp mysql-connector-java-5.1.10.jar;. EventAlarmHandling \"$traps\"} r";
puts $trapout $javaprogargs;
set javaprogram [eval $javaprogargs];
puts $trapout "Execution of java is over"
while { [gets $javaprogram line] != -1 } {
puts $javaout $line;
}
close $javaprogram;
puts $trapout "After excution of java program\r\n\r\n\r\n\r\n\r\n";
close $trapout;
close $javaout;
exit;
input file content is -
<UNKNOWN>
UDP: [192.168.1.19]:60572->[0.0.0.0]:0
.iso.org.dod.internet.mgmt.mib-2.system.sysUpTime.sysUpTimeInstance 1:9:58:56.61
.iso.org.dod.internet.snmpV2.snmpModules.snmpMIB.snmpMIBObjects.snmpTrap.snmpTrapOID.0 .iso.org.dod.internet.snmpV2.snmpModules.snmpMIB.snmpMIBObjects.snmpTraps.linkDown
.iso.org.dod.internet.mgmt.mib-2.interfaces.ifTable.ifEntry.ifIndex.1 8
.iso.org.dod.internet.mgmt.mib-2.interfaces.ifTable.ifEntry.ifAdminStatus.8 up
.iso.org.dod.internet.mgmt.mib-2.interfaces.ifTable.ifEntry.ifOperStatus.8 down
From command line it ran like below
E:\eventAlarmHandling>tclsh TclTempFile.tcl < traps.txt
couldn't read file "UNKNOWN>
UDP: [192.168.1.19]:60572->[0.0.0.0]:0
.iso.org.dod.internet.mgmt.mib-2.system.sysUpTime.sysUpTimeInstance 1:9:58:56.61
.iso.org.dod.internet.snmpV2.snmpModules.snmpMIB.snmpMIBObjects.snmpTrap.snmpTrapOID.0 .iso.org.dod.intern
et.snmpV2.snmpModules.snmpMIB.snmpMIBObjects.snmpTraps.linkDown
.iso.org.dod.internet.mgmt.mib-2.interfaces.ifTable.ifEntry.ifIndex.1 8
.iso.org.dod.internet.mgmt.mib-2.interfaces.ifTable.ifEntry.ifAdminStatus.8 up
.iso.org.dod.internet.mgmt.mib-2.interfaces.ifTable.ifEntry.ifOperStatus.8 down": No error
while executing
"open {|java -cp mysql-connector-java-5.1.10.jar;. EventAlarmHandling "<UNKNOWN>
UDP: [192.168.1.19]:60572->[0.0.0.0]:0
.iso.org.dod.internet.mgmt.mib..."
("eval" body line 1)
invoked from within
"eval $javaprogargs"
invoked from within
"set javaprogram [eval $javaprogargs]"
(file "TclTempFile.tcl" line 26)
So clearly in command line its showing that "couldn't read file UNKNOWN> ......"
So please explain it that whats happening here in command line.I am new to tcl.So hoping that someone help me out.
Thanks
You're having problems with one of the trickier bits of how pipelines work in Tcl. If we look at the documentation carefully, we see:
If the first character of fileName is “|” then the remaining characters of fileName are treated as a list of arguments that describe a command pipeline to invoke, in the same style as the arguments for exec.
That means you have to have the first character be | and the rest, after stripping that first character, be a proper list. In your case, you've not got that. Instead, you're doing:
set javaprogargs "open {|java -cp mysql-connector-java-5.1.10.jar;. EventAlarmHandling \"$traps\"} r";
That's pretty complicated anyway. Let's build this in the idiomatic fashion instead:
set CPsep ";"
set classpath [list mysql-connector-java-5.1.10.jar .]
set javaprogargs [list open |[list \
java -cp [join $classpath $CPsep] EventAlarmHandling $traps]]
It helps to split the classpath out; it's got a ; character in it (on Windows; you'll need to change that if you port to Linux or OSX) and it's nicer to use list in Tcl to build things and then join to convert into what Java expects.
We also no longer need any backslash-quoted substrings in there (except the one I put in to keep lines short and readable); the pattern of list commands there will add everything that is required. Note the |[list …] there: that's non-idiomatic everywhere in Tcl except when creating a pipeline when it is recommended practice as it is doing in reverse what open expects to parse.
The other thing you're running into is this:
If an arg (or pair of args) has one of the forms described below then it is used by exec to control the flow of input and output among the subprocess(es). Such arguments will not be passed to the subprocess(es).
[…]
< fileName
The file named by fileName is opened and used as the standard input for the first command in the pipeline.
Your argument from $traps starts with a < and so it triggers this rule.
Unfortunately, there's no simple workaround for this and this is a severe, known, and very annoying limitation of the pipeline creation code. The only known techniques for dealing with this are to move to transferring that data by either a file or via the subprocess's standard input, both of which require modifying the subprocess's implementation. If you can make that Java program read from System.in (a good idea anyway, so you don't hit Windows's command line length limitations!) then you can pass the value like this:
set CPsep ";"
set classpath [list mysql-connector-java-5.1.10.jar .]
set javaprogargs [list open |[list \
java -cp [join $classpath $CPsep] EventAlarmHandling << $traps]]
That is just by adding a << in there immediately before the value.

How to mask the sensitive information contained in a file using tcl?

I'm trying to implement a tcl script which reads a text file, and masks all the sensitive information (such as passwords, ip addresses etc) contained it and writes the output to another file.
As of now I'm just substituting this data with ** or ##### and searching the entire file with regexp to find the stuff which I need to mask. But since my text file can be 100K lines of text or more, this is turning out to be incredibly inefficient.
Are there any built in tcl functions/commands I can make use of to do this faster? Do any of the add on packages provide extra options which can help get this done?
Note: I'm using tcl 8.4 (But if there are ways to do this in newer versions of tcl, please do point me to them)
Generally speaking, you should put your code in a procedure to get best performance out of Tcl. (You have got a few more related options in 8.5 and 8.6, such as lambda terms and class methods, but they're closely related to procedures.) You should also be careful with a number of other things:
Put your expressions in braces (expr {$a + $b} instead of expr $a + $b) as that enables a much more efficient compilation strategy.
Pick your channel encodings carefully. (If you do fconfigure $chan -translation binary, that channel will transfer bytes and not characters. However, gets is not be very efficient on byte-oriented channels in 8.4. Using -encoding iso8859-1 -translation lf will give most of the benefits there.)
Tcl does channel buffering quite well.
It might be worth benchmarking your code with different versions of Tcl to see which works best. Try using a tclkit build for testing if you don't want to go to the (minor) hassle of having multiple Tcl interpreters installed just for testing.
The idiomatic way to do line-oriented transformations would be:
proc transformFile {sourceFile targetFile RE replacement} {
# Open for reading
set fin [open $sourceFile]
fconfigure $fin -encoding iso8859-1 -translation lf
# Open for writing
set fout [open $targetFile w]
fconfigure $fout -encoding iso8859-1 -translation lf
# Iterate over the lines, applying the replacement
while {[gets $fin line] >= 0} {
regsub -- $RE $line $replacement line
puts $fout $line
}
# All done
close $fin
close $fout
}
If the file is small enough that it can all fit in memory easily, this is more efficient because the entire match-replace loop is hoisted into the C level:
proc transformFile {sourceFile targetFile RE replacement} {
# Open for reading
set fin [open $sourceFile]
fconfigure $fin -encoding iso8859-1 -translation lf
# Open for writing
set fout [open $targetFile w]
fconfigure $fout -encoding iso8859-1 -translation lf
# Apply the replacement over all lines
regsub -all -line -- $RE [read $fin] $replacement outputlines
puts $fout $outputlines
# All done
close $fin
close $fout
}
Finally, regular expressions aren't necessarily the fastest way to do matching of strings (for example, string match is much faster, but accepts a far more restricted type of pattern). Transforming one style of replacement code to another and getting it to go really fast is not 100% trivial (REs are really flexible).
Especially for very large files - as mentioned - it's not the best way to read the whole file into a variable. As soon as your system runs out of memory you can't prevent your app crashes. For data that is separated by line breaks, the easiest solution is to buffer one line and process it.
Just to give you an example:
# Open old and new file
set old [open "input.txt" r]
set new [open "output.txt" w]
# Configure input channel to provide data separated by line breaks
fconfigure $old -buffering line
# Until the end of the file is reached:
while {[gets $old ln] != -1} {
# Mask sensitive information on variable ln
...
# Write back line to new file
puts $new $ln
}
# Close channels
close $old
close $new
I can't think of any better way to process large files in Tcl - please feel free to tell me any better solution. But Tcl was not made to process large data files. For real performance you may use a compiled instead of a scripted programming language.
Edit: Replaced ![eof $old] in while loop.
A file with 100K lines is not that much (unless every line is 1K chars long :) so I'd suggest you read the entire file into a var and make the substitution on that var:
set fd [open file r+]
set buf [read $fd]
set buf [regsub -all $(the-passwd-pattern) $buf ****]
# write it back
seek $fd 0; # This is not safe! See potrzebie's comment for details.
puts -nonewline $fd $buf
close $fd