I have a text file with thousands of values and some alphanumeric characters, like this:
\Test1
+3.00000E-04
+5.00000E-04
+4.00000E-04
Now I want to scan this file and write the values into variables.
set path "C:/test.txt"
set in [open $path r]
while {[gets $in line] != -1} {
set Cache [gets $in line]
if { $Cache < $Cache } {
set lowest "$Cache"
}
}
Does anybody have an idea? I'm getting an alert which tells me the directory couldn't be deleted?!
br
You could use the core math function tcl::mathfunc::min. If there is "junk" (i.e., lines containing text that isn't a number), you can filter those lines out first:
set numbers {}
set f [open test.txt]
while {[gets $f line] >= 0} {
if {[string is double -strict $line]} {
lappend numbers [string trim $line]
}
}
close $f
tcl::mathfunc::min {*}$numbers
# => +3.00000E-04
If every line is a valid double precision floating point number, you can dispense with the filtering:
set f [open test.txt]
set numbers [split [string trim [read $f]]]
close $f
tcl::mathfunc::min {*}$numbers
# => +3.00000E-04
If you can use the Tcllib module fileutil, which is easy to pick up from the Tcllib site if not available on your installation (it is included in the ActiveTcl installation already), you can simplify the code somewhat:
package require fileutil
set numbers {}
::fileutil::foreachLine line test.txt {
if {[string is double -strict $line]} {
lappend ::numbers [string trim $line]
}
}
tcl::mathfunc::min {*}$numbers
or
package require fileutil
tcl::mathfunc::min {*}[split [string trim [::fileutil::cat test.txt]]]
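One caveat with the approaches above: as far as I know, tcl::mathfunc::min raises an error when it is called with no arguments, which is what happens if the file contains no numeric lines at all. Here is a minimal sketch that guards against that case and also returns the maximum. The helper name numericRange and the sample data are made up for illustration; they are not part of Tcl or Tcllib. It assumes Tcl 8.6 (for try and file tempfile).

```tcl
# Hypothetical helper: returns {lowest highest} for the numeric lines of a
# file, or an empty list if no line is a valid number.
proc numericRange {path} {
    set numbers {}
    set f [open $path r]
    try {
        while {[gets $f line] >= 0} {
            set value [string trim $line]
            # Skip labels such as "\Test1" and blank lines
            if {[string is double -strict $value]} {
                lappend numbers $value
            }
        }
    } finally {
        close $f
    }
    if {[llength $numbers] == 0} {
        return {}
    }
    return [list [tcl::mathfunc::min {*}$numbers] \
                 [tcl::mathfunc::max {*}$numbers]]
}

# Usage: write a small sample file, then store the results in variables
set f [file tempfile samplePath]
puts $f "\\Test1\n+3.00000E-04\n+5.00000E-04\n+4.00000E-04"
close $f
lassign [numericRange $samplePath] lowest highest
puts "lowest=$lowest highest=$highest"
file delete $samplePath
```

Trimming before the string is double check means lines with stray surrounding whitespace are tested in the same form in which they are stored.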
Documentation: >= (operator), close, fileutil (package), gets, if, lappend, namespace, open, package, read, set, split, string, while, {*} (syntax), Mathematical functions for Tcl expressions
I'm trying to do an "if not" on a string match in Tcl. However, when I expect it not to match, it seems to match anyway, because execution continues into the "I don't want it to do this" branch. I hope this makes sense. The log.text file should contain "This is a String".
set var1 "String"
set file [open "log.text" r]
while {[gets $file data] != -1} {
if {![string match *[string toupper $var1]* [string toupper $data]]} {
*I don't want it to do this
}
}
Your code appears to work fine:
$ cat log.text
This is a String
this line does not match
$ tclsh <<'END'
set var1 "String"
set file [open "log.text" r]
while {[gets $file data] != -1} {
if {![string match -nocase *$var1* $data]} {
puts "$data: does not match $var1"
}
}
END
outputs
this line does not match: does not match String
Ah, now you have clearly stated what you want: does the string exist in the file, yes or no. Here are some ways to accomplish that:
read the entire file, and string match against that.
set file [open log.text r]
set contents [read -nonewline $file]
close $file
set pattern_exists [string match -nocase *$var1* $contents]
if {$pattern_exists} {puts "$var1 found in file"}
read the file line-by-line until the pattern is found
set pattern_exists false
set file [open log.text r]
while {[gets $file line] != -1} {
if {[string match -nocase *$var1* $line]} {
set pattern_exists true
break
}
}
close $file
if {$pattern_exists} {puts "$var1 found in file"}
call out to grep to do the heavy lifting: grep exits with non-zero status when the pattern is not found, and exec thinks a non-zero exit status is an exception (see https://tcl.tk/man/tcl8.6/TclCmd/exec.htm#M27)
try {
exec grep -qi $var1 log.text
set pattern_exists true
} on error {e} {
set pattern_exists false
}
if {$pattern_exists} {puts "$var1 found in file"}
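The line-by-line variant above can be wrapped in a procedure so that the channel is closed even when the search returns early. This is only a sketch: the name fileContains is made up, it uses string first on lower-cased text instead of string match (so glob metacharacters in the search term are taken literally), and it assumes Tcl 8.6 for try/finally.

```tcl
# Hypothetical helper: returns 1 if $needle occurs anywhere in the file
# (case-insensitively), 0 otherwise. Stops reading at the first hit.
proc fileContains {path needle} {
    set needle [string tolower $needle]
    set f [open $path r]
    try {
        while {[gets $f line] >= 0} {
            if {[string first $needle [string tolower $line]] >= 0} {
                return 1
            }
        }
    } finally {
        # Runs on normal exit, early return, and errors alike
        close $f
    }
    return 0
}

# Usage with a throwaway file
set f [file tempfile logPath]
puts $f "first line\nThis is a String\nlast line"
close $f
set found [fileContains $logPath "string"]
set missing [fileContains $logPath "absent"]
file delete $logPath
puts "found=$found missing=$missing"
```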
The code as you wrote it works… but I'm guessing it is a proxy for something else. If you are looking to see if an arbitrary string exists as a substring of a line, you are better off using string first instead of string match, since the latter has a few metacharacters (especially [ and ], which denote a set of characters) that can cause problems if you're not expecting them.
Try:
if {[string first [string toupper $var1] [string toupper $data]] >= 0} {
# The substring was there...
}
Alternatively, apply relevant backslash quoting when building your search pattern (possibly with string map) or use regexp, which has a useful find-a-literal mode:
if {[regexp -nocase ***=$var1 $data]} {
# The substring was there...
}
The ***= means “the rest of this pattern is a literal string to match” and we can pass -nocase as an option to allow us to not need to use string toupper.
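To make the metacharacter problem concrete, here is a small sketch with a made-up search term that contains brackets. It compares string match, string first, the ***= form of regexp, and a string map that backslash-quotes the glob metacharacters (one possible way to do the quoting mentioned above).

```tcl
set var1 {a[1]}
set data {value a[1] here}

# As a glob pattern, [1] is a character set, so this searches for "a1"
set globResult [string match *$var1* $data]

# string first treats the search term as plain text and returns its index
set firstResult [string first $var1 $data]

# ***= makes the rest of the regular expression a literal string
set regexpResult [regexp ***=$var1 $data]

# Backslash-quote the glob metacharacters, then string match works again
set escaped [string map {* \\* ? \\? [ \\[ ] \\] \\ \\\\} $var1]
set escapedResult [string match *$escaped* $data]

puts "glob=$globResult first=$firstResult regexp=$regexpResult escaped=$escapedResult"
```

Only the plain string match misses the substring, because it is looking for the two characters "a1" rather than the literal text.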
set filePointer [open "fileName" "r"]
set fileWritePointer [open "fileNameWrite" "w"]
set lines [split [read $filePointer] "\n"]
close $filePointer
set length [llength $lines]
for {set i 0} {$i<$length} {incr i} {
set line [lindex $lines $i]
if {[regexp "Matching1" $line]} {
puts $fileWritePointer $line
}
if {[regexp "Matching" $line]} {
puts $fileWritePointer $line
}
}
close $fileWritePointer
I am reading all the lines of the file at once, splitting them on the newline character, and processing one line at a time inside the for loop.
After some syntax checks on each line using regexp, I write only the selected lines to a new file using the syntax below.
puts $fileWritePointer $line
My file has around 2 million lines.
There are many regexp checks like these, roughly around 1.5.
Without knowing why the code is slow (or what exactly you're using as a baseline to measure against), it's hard to be sure what to do to accelerate it. However, you can try switching to streaming processing:
set fin [open "fileName"]
set fout [open "fileNameWrite" "w"]
while {[gets $fin line] >= 0} {
if {[regexp "Matching1" $line]} {
puts $fout $line
}
if {[regexp "Matching" $line]} {
puts $fout $line
}
}
close $fout
close $fin
You should make sure that your regular expressions are constant values for the duration of the processing to avoid recompiling them for every line (which would be very slow!) though those constant values can be stored in variables, so long as those variables are used without anything being added to them:
set RE1 "Matching1"
set RE2 "Matching"
# Note: these variables are NOT assigned to below! They are just used!
set fin [open "fileName"]
set fout [open "fileNameWrite" "w"]
while {[gets $fin line] >= 0} {
# Added “--” to make sure that the REs are never interpreted as anything else
if {[regexp -- $RE1 $line]} {
puts $fout $line
}
if {[regexp -- $RE2 $line]} {
puts $fout $line
}
}
close $fout
close $fin
You might also get extra speed by choosing the right encodings, putting all this code in a procedure, etc. As noted, it's hard to be sure what is the best thing to try without knowing why the code is actually slow, and that in part depends on the system on which it is being run.
Do you actually need regular expression matching? String matching is likely to be faster.
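Two of the suggestions above, putting the code in a procedure and using plain string matching, could be combined roughly as follows. This is a sketch under assumptions: the name copyMatchingLines is made up, the search terms are treated as literal substrings (so it only applies if you do not need real regular expressions), and each line is written at most once even if several terms match. Whether it is actually faster depends on your data and system, so measure it.

```tcl
# Hypothetical helper: copies every line of $inName that contains any of the
# literal strings in $needles to $outName; returns the number of lines written.
proc copyMatchingLines {inName outName needles} {
    set fin [open $inName r]
    set fout [open $outName w]
    set written 0
    try {
        while {[gets $fin line] >= 0} {
            foreach needle $needles {
                if {[string first $needle $line] >= 0} {
                    puts $fout $line
                    incr written
                    break   ;# write each line at most once
                }
            }
        }
    } finally {
        close $fout
        close $fin
    }
    return $written
}

# Usage with throwaway files
set f [file tempfile inPath]
puts $f "has Matching1 here\nnothing\nhas Matching only"
close $f
close [file tempfile outPath]
set written [copyMatchingLines $inPath $outPath {Matching1 Matching}]
set f [open $outPath r]
set copied [read -nonewline $f]
close $f
file delete $inPath $outPath
puts "$written lines written"
```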
Can more than one match be made against the same line, and in that case do you really need the line to be written once for each match? If not, you can speed things up by skipping the rest of the matching attempts once one has succeeded:
if {[regexp -- $RE1 $line]} {
puts $fout $line
} elseif {[regexp -- $RE2 $line]} {
puts $fout $line
} elseif { ... } {
or
if {
[regexp -- $RE1 $line] ||
[regexp -- $RE2 $line] ||
...
} then {
puts $fout $line
}
or
switch -regexp -- $line \
$RE1 - \
$RE2 - \
... {
puts $fout $line
}
I have tried the code below, but it checks line by line, and I want to check the whole file. Please help me write the correct code: once the pattern is found, it should break and report that the pattern was found; otherwise it should report that the pattern was not found.
set search "Severity Level: Critical"
set file [open "outputfile.txt" r]
while {[gets $file data] != -1} {
if {[string match *[string toupper $search]* [string toupper $data]] } {
puts "Found '$search' in the line '$data'"
} else {
puts "Not Found '$search' in the line '$data'"
}
}
If the file is “small” with respect to available memory (e.g., no more than a few hundred megabytes) then the easiest way to find if the string is present is to load it all in with read.
set search "Severity Level: Critical"
set f [open "thefilename.txt"]
set data [read $f]
close $f
set idx [string first $search $data]
if {$idx >= 0} {
puts "Found the search term at character $idx"
# Not quite sure what you'd do with this info...
} else {
puts "Search term not present"
}
If you want to know what line it is in, you might split the data up and then use lsearch with the right options to find it.
set search "Severity Level: Critical"
set f [open "thefilename.txt"]
set data [split [read $f] "\n"]
close $f
set lineidx [lsearch -regexp -- $data ***=$search]
if {$lineidx >= 0} {
puts "Found the search term at line $lineidx : [lindex $data $lineidx]"
} else {
puts "Search term not present"
}
The ***= is a special escape to say “treat the rest of the RE as literal characters” and it's ideal for the case where you can't be sure that the search term is free of RE metacharacters.
The string first command is very simple, so it's easy to use correctly and to work out whether it can do what you want. The lsearch command is not simple at all, and neither are regular expressions; determining when and how to use them is correspondingly trickier.
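As a small illustration of why the ***= prefix matters with lsearch -regexp, here is a sketch using made-up in-memory lines and a search term containing an unbalanced parenthesis, which would not be a valid regular expression on its own.

```tcl
# Made-up sample data standing in for the lines of a file
set lines [list "Status: OK" "Severity Level: Critical (disk)" "Done"]
set search "Critical (disk"

# With ***= the "(" is matched literally instead of opening a group
set lineidx [lsearch -regexp $lines ***=$search]

# -all returns the index of every matching line, not just the first
set allIdx [lsearch -all -regexp $lines ***=$search]

if {$lineidx >= 0} {
    # List indices start at 0, so add 1 for a human-readable line number
    puts "Found at line [expr {$lineidx + 1}]: [lindex $lines $lineidx]"
} else {
    puts "Search term not present"
}
```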
I have a file containing lines with paths to files. Sometimes a path contains a shell environment variable, and I want to check whether the file exists.
The following is my solution:
set fh [open "the_file_contain_path" "r"]
while {![eof $fh]} {
set line [gets $fh]
if {[regexp -- {\$\S+} $line]} {
catch {exec /usr/local/bin/tcsh -c "echo $line" } line
if {![file exists $line]} {
puts "ERROR: the file $line is not exists"
}
}
}
I'm sure there is a more elegant solution that doesn't use
/usr/local/bin/tcsh -c
You can capture the variable name in the regexp command and do a lookup in Tcl's global env array. Also, your use of eof as the while condition means your loop will iterate one time too many (see http://phaseit.net/claird/comp.lang.tcl/fmm.html#eof).
set fh [open "the_file_contain_path" "r"]
while {[gets $fh line] != -1} {
# this can handle "$FOO/bar/$BAZ"
if {[string first {$} $line] != -1} {
regsub -all {(\$)(\w+)} $line {\1::env(\2)} new
set line [subst -nocommands -nobackslashes $new]
}
if {![file exists $line]} {
puts "ERROR: the file $line does not exist"
}
}
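The point about eof can be checked with a small experiment. The sketch below writes a two-line temporary file (with a trailing newline, as puts produces) and counts how often each loop body runs. It assumes Tcl 8.6 for file tempfile.

```tcl
set f [file tempfile demoPath]
puts $f "first\nsecond"
close $f

# Loop controlled by eof: end of file is only noticed after a read fails,
# so the body runs one extra time with an empty line
set fh [open $demoPath r]
set eofCount 0
while {![eof $fh]} {
    set line [gets $fh]
    incr eofCount
}
close $fh

# Loop controlled by the return value of gets
set fh [open $demoPath r]
set getsCount 0
while {[gets $fh line] != -1} {
    incr getsCount
}
close $fh
file delete $demoPath

puts "eof loop: $eofCount iterations, gets loop: $getsCount iterations"
```

In the original code the extra iteration is harmless only because an empty line does not match the regexp; with other loop bodies it can produce a spurious "file does not exist" message for an empty path.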
First off, it's usually easier (for small files, say of no more than 1–2MB) to read in the whole file and split it into lines instead of using gets and eof in a while loop. (The split command is very fast.)
Secondly, to do the replacement you need the place in the string to replace, so you use regexp -indices. That does mean that you need to take a little more complex approach to doing the replacement, with string range and string replace to do some of the work. Assuming you're using Tcl 8.5…
set fh [open "the_file_contain_path" "r"]
foreach line [split [read $fh] "\n"] {
# Find a replacement while there are any to do
while {[regexp -indices {\$(\w+)} $line matchRange nameRange]} {
# Get what to replace with (without any errors, just like tcsh)
set replacement {}
catch {set replacement $::env([string range $line {*}$nameRange])}
# Do the replacement
set line [string replace $line {*}$matchRange $replacement]
}
# Your test on the result
if {![file exists $line]} {
puts "ERROR: the file $line does not exist"
}
}
Tcl programs can read environment variables using the built-in global variable env. Read the line, look for $ followed by a name, look up $::env($name), and substitute it for the variable.
Using the shell for this is very bad if the file is supplied by untrusted users. What if they put ; rm * in the file? And if you're going to use a shell, you should at least use sh or bash, not tcsh.
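A minimal sketch of that pure-Tcl lookup follows. The helper name expandEnvVars and the environment variable names are made up for the example. It makes a single pass over the text, so a value that itself contains a $ is not expanded again, and unset variables expand to the empty string. Because nothing is passed to a shell, text such as "; rm *" stays ordinary data.

```tcl
# Hypothetical helper: replaces each $NAME in $text with the value of the
# environment variable NAME (or with nothing if it is not set).
proc expandEnvVars {text} {
    set result ""
    set pos 0
    # Each match yields two index pairs: the whole "$NAME" and the name
    foreach {match name} [regexp -all -inline -indices {\$(\w+)} $text] {
        lassign $match mStart mEnd
        # Copy the text between the previous match and this one
        append result [string range $text $pos [expr {$mStart - 1}]]
        set varName [string range $text {*}$name]
        if {[info exists ::env($varName)]} {
            append result $::env($varName)
        }
        set pos [expr {$mEnd + 1}]
    }
    # Copy whatever follows the last match
    append result [string range $text $pos end]
    return $result
}

# Usage: GR_DEMO_DIR and GR_DEMO_UNSET are made-up names for this example
set ::env(GR_DEMO_DIR) /tmp/demo
set expanded [expandEnvVars {$GR_DEMO_DIR/bar/$GR_DEMO_DIR.txt}]
set harmless [expandEnvVars {$GR_DEMO_UNSET/a; rm *}]
puts $expanded
puts $harmless
```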
I have a file which has multiple set statements, but I want to extract only the lines of interest. Can the following code help?
set in [open filename r]
seek $in 0 start
while{ [gets $in line ] != -1} {
regexp (line to be extracted)
}
Other solution:
Instead of using gets, I prefer using the read command to read the whole contents of the file and then process them line by line. That gives us complete control over the file's contents as a list of lines.
set fileName [lindex $argv 0]
set fptr [open $fileName r] ;#Open the file; an error here should stop the script
set contents [read -nonewline $fptr] ;#Read the file contents
close $fptr ;#Close the file since it has been read now
set splitCont [split $contents "\n"] ;#Split the files contents on new line
foreach ele $splitCont {
if {[regexp {^set +(\S+) +(.*)} $ele -> name value]} {
puts "The name \"$name\" maps to the value \"$value\""
}
}
How to run this code:
Say the above code is saved in test.tcl.
Then run:
tclsh test.tcl FileName
FileName is the full path of the file, unless the file is in the current working directory.
First, you don't need to seek to the beginning straight after opening a file for reading; that's where it starts.
Second, the pattern for reading a file is this:
set f [open $filename]
while {[gets $f line] > -1} {
# Process lines
if {[regexp {^set +(\S+) +(.*)} $line -> name value]} {
puts "The name \"$name\" maps to the value \"$value\""
}
}
close $f
OK, that's a very simple RE in the middle there (and for more complicated files you'll need several) but that's the general pattern. Note that, as usual for Tcl, the space after the while command word is important, as is the space between the while expression and the while body. For specific help with what RE to use for particular types of input data, ask further questions here on Stack Overflow.
Yet another solution:
As it looks like the source is a Tcl script, create a new safe interpreter using interp which only has the set command exposed (and any others you need), hide all other commands, and replace unknown so that it just skips anything unrecognised. Then evaluate the input in this interpreter.
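A rough sketch of the safe-interpreter idea is shown below. The names skipUnknown and extractSets are made up, and this is only one way to arrange it: every visible command except set is hidden, unknown is aliased to a no-op in the parent interpreter, and the new global variables are collected afterwards through the hidden info command. It assumes Tcl 8.6 and passes the script as a string rather than using source, since source is not available in a safe interpreter.

```tcl
# No-op target for unknown commands inside the restricted interpreter
proc skipUnknown {args} {
    return ""
}

# Hypothetical helper: evaluates $script with only "set" available and
# returns a dict of the scalar global variables it created.
proc extractSets {script} {
    set i [interp create -safe]
    try {
        # Hide every visible command except set
        foreach cmd [$i eval {info commands}] {
            if {$cmd ne "set"} {
                $i hide $cmd
            }
        }
        # Anything unrecognised is silently skipped
        $i alias unknown skipUnknown
        set before [$i invokehidden info globals]
        $i eval $script
        set result [dict create]
        foreach var [$i invokehidden info globals] {
            if {$var in $before} continue
            # Arrays cannot be read with a plain "set name"; skip them
            if {![catch {$i eval [list set $var]} value]} {
                dict set result $var $value
            }
        }
        return $result
    } finally {
        interp delete $i
    }
}

# Usage: the exec and puts lines are skipped, the set lines are kept
set vars [extractSets "set width 10\nset name demo\nexec rm -rf /nonexistent\nputs ignored"]
dict for {k v} $vars {
    puts "$k = $v"
}
```

Note that with this setup a command substitution inside a value, such as a bracketed expr call, evaluates to an empty string, because expr is hidden as well. Expose the commands you actually need.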
Here is yet another solution: use the file scanning feature of TclX. Please look up TclX for more info. I like this solution because you can have several scanmatch blocks.
package require Tclx
# Open a file, skip error checking for simplicity
set inputFile [open sample.tcl r]
# Scan the file
set scanHandle [scancontext create]
scanmatch $scanHandle {^\s*set} {
lassign $matchInfo(line) setCmd varName varValue; # parse the line
puts "$varName = $varValue"
}
scanfile $scanHandle $inputFile
close $inputFile
Yet another solution: use the grep command from the fileutil package:
package require fileutil
puts [lindex $argv 0]
set matchedLines [fileutil::grep {^\s*set} [lindex $argv 0]]
foreach line $matchedLines {
# Each match is in the format filename:linenumber:line, for example
# sample.tcl:1:set foo bar
set varName [lindex $line 1]
set varValue [lindex $line 2]
puts "$varName = $varValue"
}
I've read your comments so far, and if I understand you correctly your input data file has 6 (or 9, depending which comment) data fields per line, separated by spaces. You want to use a regexp to parse them into 6 (or 9) arrays or lists, one per data field.
If so, I'd try something like this (using lists):
set f [open $filename]
while {[gets $f line] > -1} {
# Process lines
if {[regexp {(\S+) (\S+) (\S+) (\S+) (\S+) (\S+)} $line -> name source drain gate bulk inst]} {
lappend nameL $name
lappend sourceL $source
lappend drainL $drain
lappend gateL $gate
lappend bulkL $bulk
lappend instL $inst
}
}
close $f
Now you should have a set of 6 lists, one per field, with one entry in each list for each item in your input file. To access the i-th name, for example, you use [lindex $nameL $i].
If (as I suspect) your main goal is to get the parameters of the device whose name is "foo", you'd use a structure like this:
set name "foo"
set i [lsearch $nameL $name]
if {$i != -1} {
set source [lindex $sourceL $i]
} else {
puts "item $name not found."
set source ""
# or set to 0, or whatever "not found" marker you like
}
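If looking devices up by name is the main goal, a dict keyed by name may be more convenient than six parallel lists. The sketch below uses the same regexp, anchored at both ends, on made-up sample lines; the field values are invented for illustration.

```tcl
# Made-up sample input in the order: name source drain gate bulk inst
set sampleLines {
    "M1 n1 n2 n3 gnd nmos_a"
    "foo vdd out in vdd pmos_b"
}

# Build a dict that maps each device name to a dict of its fields
set devices [dict create]
foreach line $sampleLines {
    if {[regexp {^(\S+) (\S+) (\S+) (\S+) (\S+) (\S+)$} $line -> \
            name source drain gate bulk inst]} {
        dict set devices $name [dict create source $source drain $drain \
            gate $gate bulk $bulk inst $inst]
    }
}

# Look up one device by name
set name "foo"
if {[dict exists $devices $name]} {
    set source [dict get $devices $name source]
    puts "source of $name: $source"
} else {
    puts "item $name not found."
    set source ""
}
```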
set File [ open $fileName r ]
while { [ gets $File line ] >= 0 } {
regexp {(set) ([a-zA-Z0-9]+) (.*)} $line str1 str2 str3 str4
#str1 contains the whole matched text;
#str2 contains "set";
#str3 contains variable to be set;
#str4 contains the value to be set;
}
close $File