TCL: Check file existence by SHELL environment variable (another one) - tcl

I have a file containing lines with paths to files. Sometimes a path contains a SHELL environment variable, and I want to check that the file exists.
The following is my solution:
set fh [open "the_file_contain_path" "r"]
while {![eof $fh]} {
set line [gets $fh]
if {[regexp -- {\$\S+} $line]} {
catch {exec /usr/local/bin/tcsh -c "echo $line" } line
if {![file exists $line]} {
puts "ERROR: the file $line does not exist"
}
}
}
I'm sure there is a more elegant solution without using
/usr/local/bin/tcsh -c

You can capture the variable name in the regexp command and do a lookup in Tcl's global env array. Also, your use of eof as the while condition means your loop will iterate one time too many (see http://phaseit.net/claird/comp.lang.tcl/fmm.html#eof).
set fh [open "the_file_contain_path" "r"]
while {[gets $fh line] != -1} {
# this can handle "$FOO/bar/$BAZ"
if {[string first {$} $line] != -1} {
regsub -all {(\$)(\w+)} $line {\1::env(\2)} new
set line [subst -nocommand -nobackslashes $new]
}
if {![file exists $line]} {
puts "ERROR: the file $line does not exist"
}
}

First off, it's usually easier (for small files, say of no more than 1–2MB) to read in the whole file and split it into lines instead of using gets and eof in a while loop. (The split command is very fast.)
Secondly, to do the replacement you need the place in the string to replace, so you use regexp -indices. That does mean that you need to take a little more complex approach to doing the replacement, with string range and string replace to do some of the work. Assuming you're using Tcl 8.5…
set fh [open "the_file_contain_path" "r"]
foreach line [split [read $fh] "\n"] {
# Find a replacement while there are any to do
while {[regexp -indices {\$(\w+)} $line matchRange nameRange]} {
# Get what to replace with (without any errors, just like tcsh)
set replacement {}
catch {set replacement $::env([string range $line {*}$nameRange])}
# Do the replacement
set line [string replace $line {*}$matchRange $replacement]
}
# Your test on the result
if {![file exists $line]} {
puts "ERROR: the file $line does not exist"
}
}

TCL programs can read environment variables using the built-in global variable env. Read the line, look for $ followed by a name, look up $::env($name), and substitute it for the variable.
Using the shell for this is very bad if the file is supplied by untrusted users. What if they put ; rm * in the file? And if you're going to use a shell, you should at least use sh or bash, not tcsh.
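A minimal sketch of the lookup described above, done entirely in Tcl with no shell involved. The proc name expandEnvVars is made up for illustration, and expanding an undefined variable to an empty string is an assumption about the desired behaviour (a shell may report an error instead):

```tcl
# Hypothetical helper: replace each $NAME in a path with the value of
# ::env(NAME). Undefined variables become "" (an assumption).
proc expandEnvVars {path} {
    set result ""
    set pos 0
    # Each element is a {first last} index pair covering one "$NAME"
    foreach range [regexp -all -inline -indices {\$\w+} $path] {
        lassign $range first last
        # Copy the literal text that precedes this variable reference
        append result [string range $path $pos [expr {$first - 1}]]
        # Skip the "$" to get the bare variable name
        set name [string range $path [expr {$first + 1}] $last]
        if {[info exists ::env($name)]} {
            append result $::env($name)
        }
        set pos [expr {$last + 1}]
    }
    # Copy whatever follows the last variable reference
    append result [string range $path $pos end]
    return $result
}

set ::env(DEMO_ROOT) /tmp
puts [expandEnvVars {$DEMO_ROOT/some/file.txt}]   ;# prints /tmp/some/file.txt
```

Because the string is scanned once from left to right, a value that itself contains a $ is not expanded again.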

Related

Tcl, if not not working

I'm trying to do an "if not" on a string match with Tcl. However, when I expect it not to match, it seems to match anyway, because execution continues to "I don't want it to do this". Hope this makes sense. The log.text file should contain "This is a String."
set var1 "String"
set file [open "log.text" r]
while {[gets $file data] != -1} {
if {![string match *[string toupper $var1]* [string toupper $data]]} {
# I don't want it to do this
}
}
Your code appears to work fine:
$ cat log.text
This is a String
this line does not match
$ tclsh <<'END'
set var1 "String"
set file [open "log.text" r]
while {[gets $file data] != -1} {
if {![string match -nocase *$var1* $data]} {
puts "$data: does not match $var1"
}
}
END
outputs
this line does not match: does not match String
Ah, now you have clearly stated what you want: does the string exist in the file, yes or no. Here are some ways to accomplish that:
read the entire file, and string match against that.
set file [open log.text r]
set contents [read -nonewline $file]
close $file
set pattern_exists [string match -nocase *$var1* $contents]
if {$pattern_exists} {puts "$var1 found in file"}
read the file line-by-line until the pattern is found
set pattern_exists false
set file [open log.text r]
while {[gets $file line] != -1} {
if {[string match -nocase *$var1* $line]} {
set pattern_exists true
break
}
}
close $file
if {$pattern_exists} {puts "$var1 found in file"}
call out to grep to do the heavy lifting: grep exits with non-zero status when the pattern is not found, and exec thinks a non-zero exit status is an exception (see https://tcl.tk/man/tcl8.6/TclCmd/exec.htm#M27)
try {
exec grep -qi $var1 log.text
set pattern_exists true
} on error {e} {
set pattern_exists false
}
if {$pattern_exists} {puts "$var1 found in file"}
The code as you wrote it works… but I'm guessing it is a proxy for something else. If you are looking to see if an arbitrary string exists as a substring of a line, you are better off using string first instead of string match, since the latter has a few metacharacters (especially [ and ], which denote a set of characters) that can cause problems if you're not expecting them.
Try:
if {[string first [string toupper $var1] [string toupper $data]] >= 0} {
# The substring was there...
}
Alternatively, apply relevant backslash quoting when building your search pattern (possibly with string map) or use regexp, which has a useful find-a-literal mode:
if {[regexp -nocase ***=$var1 $data]} {
# The substring was there...
}
The ***= means “the rest of this pattern is a literal string to match” and we can pass -nocase as an option to allow us to not need to use string toupper.
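A short demonstration of the metacharacter problem and the three workarounds mentioned above. The needle and haystack values are invented for illustration:

```tcl
# Invented sample data: the needle contains glob metacharacters
set needle   {a[1]}
set haystack {the value of a[1] is unknown}

# Glob matching treats [1] as a character set, so this looks for "a1"
puts [string match *$needle* $haystack]                 ;# prints 0

# string first does a plain substring search
puts [expr {[string first $needle $haystack] >= 0}]     ;# prints 1

# Backslash-quote the glob metacharacters before matching
set quoted [string map {* \\* ? \\? [ \\[ ] \\] \\ \\\\} $needle]
puts [string match *$quoted* $haystack]                 ;# prints 1

# regexp in literal mode, using the ***= prefix
puts [regexp ***=$needle $haystack]                     ;# prints 1
```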

Search in file for number, increment and replace

I have a VHDL file which has a line like this:
constant version_nr :integer := 47;
I want to increment the number in this line in the file. Is there a way to accomplish this with TCL?
This is principally a string operation. The tricky bit is finding the line to operate on and picking the number out of it. This can be occasionally awkward, but it is mainly a matter of choosing a suitable regular expression (as this is the kind of parsing task that they excel at). A raw RE to do the matching would be this:
^\s*constant\s+version_nr\s*:integer\s*:=\s*\d+\s*;\s*$
This is essentially converting all possible places for a whitespace sequence into \s* (except where whitespace is mandatory, which becomes \s+) and matching the number with \d+, i.e., a digit sequence. We then add in parentheses to capture the interesting substrings, which are the prefix, the number, and the suffix:
^(\s*constant\s+version_nr\s*:integer\s*:=\s*)(\d+)(\s*;\s*)$
Now we have enough to make the line transform (which we'll do as a procedure so we can give it a nice name):
proc lineTransform {line} {
set RE {^(\s*constant\s+version_nr\s*:integer\s*:=\s*)(\d+)(\s*;\s*)$}
if {[regexp $RE $line -> prefix number suffix]} {
# If we match, we increment the number...
incr number
# And reconcatenate it with the prefix and suffix to make the new line
set line $prefix$number$suffix
}
return $line
}
In Tcl 8.7 (which you won't be using yet) you can write this as this more succinct form:
proc lineTransform {line} {
# Yes, this version can be a single (long) line if you want
set RE {^(\s*constant\s+version_nr\s*:integer\s*:=\s*)(\d+)(\s*;\s*)$}
regsub -command $RE $line {apply {{- prefix number suffix} {
# Apply the increment when the RE matches and build the resulting line
string cat $prefix [incr number] $suffix
}}}
}
Now that we have a line transform, we've just got to apply that to all the lines of the file. This is easily done with a file that fits in memory (up to a few hundred MB) but requires additional measures for larger files as you need to stream from one file to another:
proc transformSmallFile {filename} {
# Read data into memory first
set f [open $filename]
# -nonewline drops the final newline so split does not yield an extra empty line
set data [read -nonewline $f]
close $f
# Then write it back out, applying the transform as we go
set f [open $filename w]
foreach line [split $data "\n"] {
puts $f [lineTransform $line]
}
close $f
}
proc transformLargeFile {filename} {
set fin [open $filename]
# The [file tempfile] command makes working with temporary files easier
set fout [file tempfile tmp [file normalize $filename]]
# A streaming transform; requires that input and output files be different
while {[gets $fin line] >= 0} {
puts $fout [lineTransform $line]
}
# Close both channels; flushes everything to disk too
close $fin
close $fout
# Rename our temporary over the original input file, replacing it
file rename -force $tmp $filename
}

Regarding got to command in tcl

I want to print a character from the next line.
Say: when the variable dum=183 exists in the file, print the very next character from the next line.
Note: I am using Tcl.
Thanks,
This should help you get started.
The typical idioms for working with a file one line at a time are:
1) linewise reading:
set f [open thefile.txt]
while {[gets $f line] >= 0} {
# work with the line of text in "line"
}
close $f
2) block reading with line splitting:
set f [open thefile.txt]
set text [read $f]
close $f
set lines [split [string trim $text] \n]
foreach line $lines {
# work with the line of text in "line"
}
This can be simplified by using a package:
package require fileutil
::fileutil::foreachLine line thefile.txt {
# work with the line of text in "line"
}
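One way the line-splitting idiom could be applied to the question is sketched here. The proc name charAfterMatch is hypothetical, and it reads "the very next character from the next line" as the first character of the line following the match, which is an assumption:

```tcl
# Hypothetical helper: return the first character of the line that
# follows the first line containing $needle, or "" if there is none.
proc charAfterMatch {text needle} {
    set found 0
    foreach line [split $text \n] {
        if {$found} {
            # This is the line right after the match
            return [string index $line 0]
        }
        if {[string first $needle $line] >= 0} {
            set found 1
        }
    }
    return ""
}

# Invented sample text standing in for the file contents
set text "alpha\ndum=183\nXYZ\nomega\n"
puts [charAfterMatch $text "dum=183"]   ;# prints X
```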
Another way is to search and extract using a regular expression. This is the worst method as it is inflexible and very likely to be buggy in use.
set f [open thefile.txt]
set text [read $f]
close $f
# this regular expression is an example
if {[regexp {\ydum\y[^\n]*.(.)} $text -> thecharacter]} {
# the character you wanted should be in "thecharacter"
}
Documentation: >= (operator), close, fileutil (package), foreach, gets, if, open, package, read, regexp, set, split, string, while, Syntax of Tcl regular expressions

TCL - find a regular pattern in a file and return the occurrence and number of occurrences

I am writing code to grep a regular expression pattern from a file and output the match and the number of times it has occurred.
Here is the code: I am trying to find the pattern "grep" in my file hello.txt:
set file1 [open "hello.txt" r]
set file2 [read $file1]
regexp {grep} $file2 matched
puts $matched
while {[eof $file2] != 1} {
set number 0
if {[regexp {grep} $file2 matched] >= 0} {
incr number
}
puts $number
}
Output that I got:
grep
--------
can not find channel named "qwerty
iiiiiii
wxseddtt
lsakdfhaiowehf'
jbsdcfiweg
kajsbndimm s
grep
afnQWFH
ACV;SKDJNCV;
qw qde
kI UQWG
grep
grep"
while executing
"eof $file2"
It's usually a mistake to check for eof in a while loop -- check the return code from gets instead:
set filename "hello.txt"
set pattern {grep}
set count 0
set fid [open $filename r]
while {[gets $fid line] != -1} {
incr count [regexp -all -- $pattern $line]
}
close $fid
puts "$count occurrences of $pattern in $filename"
Another thought: if you're just counting pattern matches, assuming your file is not too large:
set fid [open $filename r]
set count [regexp -all -- $pattern [read $fid [file size $filename]]]
close $fid
The error message is caused by the command eof $file2. The reason is that $file2 is not a file handle (that is, a channel) but contains the content of the file hello.txt itself. You read this file content with set file2 [read $file1].
If you want to do it like that, I would suggest renaming file2 to something like filecontent and looping over every line it contains:
foreach line [split $filecontent "\n"] {
# ... do something with $line ...
}
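Glenn is spot on. Here is another solution: Tcllib includes the fileutil package, which has a grep command. Note that it returns the matching lines, so this counts the lines containing the pattern rather than individual occurrences: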
package require fileutil
set pattern {grep}
set filename hello.txt
puts "[llength [fileutil::grep $pattern $filename]] occurrences found"
If you care about performance, go with Glenn's solution.

Parsing a file with Tcl

I have a file here which has multiple set statements. However, I want to extract only the lines of interest. Can the following code help?
set in [open filename r]
seek $in 0 start
while{ [gets $in line ] != -1} {
regexp (line to be extracted)
}
Other solution:
Instead of using gets, I prefer using the read command to read the whole contents of the file and then process it line by line. That way we have complete control over the file's contents as a list of lines.
set fileName [lindex $argv 0]
if {[catch {set fptr [open $fileName r]} err]} {
puts stderr "Cannot open $fileName: $err" ;#Report the failure and stop
exit 1
}
set contents [read -nonewline $fptr] ;#Read the file contents
close $fptr ;#Close the file since it has been read now
set splitCont [split $contents "\n"] ;#Split the files contents on new line
foreach ele $splitCont {
if {[regexp {^set +(\S+) +(.*)} $ele -> name value]} {
puts "The name \"$name\" maps to the value \"$value\""
}
}
How to run this code:
say above code is saved in test.tcl
Then
tclsh test.tcl FileName
FileName is full path of file unless the file is in the same directory where the program is.
First, you don't need to seek to the beginning straight after opening a file for reading; that's where it starts.
Second, the pattern for reading a file is this:
set f [open $filename]
while {[gets $f line] > -1} {
# Process lines
if {[regexp {^set +(\S+) +(.*)} $line -> name value]} {
puts "The name \"$name\" maps to the value \"$value\""
}
}
close $f
OK, that's a very simple RE in the middle there (and for more complicated files you'll need several) but that's the general pattern. Note that, as usual for Tcl, the space after the while command word is important, as is the space between the while expression and the while body. For specific help with what RE to use for particular types of input data, ask further questions here on Stack Overflow.
Yet another solution:
as it looks like the source is a TCL script, create a new safe interpreter using interp which only has the set command exposed (and any others you need), hide all other commands and replace unknown to just skip anything unrecognised. source the input in this interpreter
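A rough sketch of the safe-interpreter idea, under several assumptions: the input is held in a string here rather than sourced from a file, the names script, child, builtin, result and ignoreUnknown are invented for illustration, and only scalar variables are collected:

```tcl
# Invented input; in practice this would be the contents of the file
set script {
    set width 10
    set name "demo"
    puts "this command is not wanted"
    set height 20
}

# Runs in the parent for any command the child does not recognise
proc ignoreUnknown {args} { return "" }

set child [interp create -safe]

# Remember the variables a fresh interpreter starts with
set builtin [$child eval {info vars}]

# Hide every visible command except set
foreach cmd [$child eval {info commands}] {
    if {$cmd ne "set"} {
        $child hide $cmd
    }
}

# Route unrecognised commands to the parent's no-op
$child alias unknown ignoreUnknown

$child eval $script

# info is hidden now, so call it through invokehidden
set result [dict create]
foreach var [$child invokehidden info vars] {
    if {$var ni $builtin} {
        dict set result $var [$child eval [list set $var]]
    }
}
interp delete $child
puts $result
```

The resulting dictionary holds width, name and height. The puts line in the input is skipped silently because puts is hidden in the child and the lookup falls through to the no-op handler.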
Here is yet another solution: use the file scanning feature of Tclx. Please look up Tclx for more info. I like this solution because you can have several scanmatch blocks.
package require Tclx
# Open a file, skip error checking for simplicity
set inputFile [open sample.tcl r]
# Scan the file
set scanHandle [scancontext create]
scanmatch $scanHandle {^\s*set} {
lassign $matchInfo(line) setCmd varName varValue; # parse the line
puts "$varName = $varValue"
}
scanfile $scanHandle $inputFile
close $inputFile
Yet another solution: use the grep command from the fileutil package:
package require fileutil
puts [lindex $argv 0]
set matchedLines [fileutil::grep {^\s*set} [lindex $argv 0]]
foreach line $matchedLines {
# Each element has the form filename:lineNumber:line, for example
# sample.tcl:3:set foo bar
set varName [lindex $line 1]
set varValue [lindex $line 2]
puts "$varName = $varValue"
}
I've read your comments so far, and if I understand you correctly your input data file has 6 (or 9, depending which comment) data fields per line, separated by spaces. You want to use a regexp to parse them into 6 (or 9) arrays or lists, one per data field.
If so, I'd try something like this (using lists):
set f [open $filename]
while {[gets $f line] > -1} {
# Process lines
if {[regexp {(\S+) (\S+) (\S+) (\S+) (\S+) (\S+)} $line -> name source drain gate bulk inst]} {
lappend nameL $name
lappend sourceL $source
lappend drainL $drain
lappend gateL $gate
lappend bulkL $bulk
lappend instL $inst
}
}
close $f
Now you should have a set of 6 lists, one per field, with one entry in the list for each item in your input file. To access the i-th name, for example, use [lindex $nameL $i].
If (as I suspect) your main goal is to get the parameters of the device whose name is "foo", you'd use a structure like this:
set name "foo"
set i [lsearch $nameL $name]
if {$i != -1} {
set source [lindex $sourceL $i]
} else {
puts "item $name not found."
set source ""
# or set to 0, or whatever "not found" marker you like
}
set File [open $fileName r]
while {[gets $File line] >= 0} {
if {[regexp {(set) ([a-zA-Z0-9]+) (.*)} $line str1 str2 str3 str4]} {
# str1 contains the whole matched text
# str2 contains "set"
# str3 contains the variable to be set
# str4 contains the value to be set
}
}
# Close the file only after the loop has finished
close $File