regsub in Tcl to replace pattern with the empty string - tcl

I need some help in replacing a pattern in TCL.
set msg "<sytem>
<log>
<test>
<ggg>yyy</ggg>
</test>
</log>
</sytem>
one
two
three"
set res [regsub -all "<sytem(\>)+" $msg " " test]
puts $test
test should contain
one
two
three

I'm not quite sure what you're trying to remove here, but to strip out everything up to the last > in the string, you use this:
# Greedily matches everything up to the *last* ‘>’ and replaces it with the empty string
# No special quoting required; ‘>’ is not an RE metacharacter
set res [regsub {.*>} $msg ""]
Then you can normalize the result in various ways. This might be good enough, if what follows is some simple words:
set res [join $res "\n"]
Otherwise, you'd need to do this more complicated version:
set res [join [lmap x [split [string trim $res] "\n"] {string trim $x}] "\n"]
There are other ways of writing it too that are even trickier:
set res [regexp -inline -all -line {\S(?:.*\S)?} $res]
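As a quick check of the approach above, here is a minimal sketch that runs the greedy regsub on the sample from the question and then trims each line. The variable names stripped and cleaned are only illustrative, and lmap assumes Tcl 8.6.

```tcl
package require Tcl 8.6

# Sample input from the question (tag spelling kept as posted)
set msg "<sytem>
<log>
<test>
<ggg>yyy</ggg>
</test>
</log>
</sytem>
one
two
three"

# '.' also matches newlines by default in Tcl, so the greedy .*>
# consumes everything up to the last '>' in the whole string
set stripped [regsub {.*>} $msg ""]

# Drop the leftover leading newline, then trim each line individually
set cleaned [join [lmap x [split [string trim $stripped] "\n"] {string trim $x}] "\n"]
puts $cleaned
```

This should print the three words one, two and three on separate lines.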

Related

Include line nr in search result

I'm trying to figure out how to search a log file for "word" and also include the number of the line where "word" was found.
Also, is it possible to read the file from bottom to top?
Any ideas?
thanks!
set seen_trigger2 ".foo"
bind pub -|- $seen_trigger2 seen2:main
proc number {list} {
lmap item $list {list [incr number] $item}
}
proc seen2:main {nick uhost hand chan text} {
set f [open /home/mydir/eggdrop/logs/mylog.txt]
set lines [split [read $f] "\n"]
close $f
set pattern $text
set reverseNumberedLines [lreverse [number $lines]]
foreach lineInfo [lsearch -all -inline -index 1 $reverseNumberedLines $pattern] {
lassign $lineInfo lineNumber lineContent
putlog "$lineNumber : $lineContent"
}
}
...
Thank you glenn jackman!
Thank you Donal Fellows!
Reading from bottom to top is fairly expensive except in the degenerate case where all lines are exactly the same length. It's easier to read the whole lot in, split into lines, and reverse.
# Some features used aren't in 8.5 and before
package require Tcl 8.6
# Generates “line” numbers
proc number {list} {
lmap item $list {list [incr number] $item}
}
# Classic get-all-the-lines code snippet
set f [open theFile.txt]
set lines [split [read $f] "\n"]
close $f
# Number and reverse
set reverseNumberedLines [lreverse [number $lines]]
# Find the matching lines; $pattern is a glob pattern such as *word*
foreach lineInfo [lsearch -all -inline -index 1 $reverseNumberedLines $pattern] {
lassign $lineInfo lineNumber lineContent
puts "$lineNumber : $lineContent"
}
Note that I'm using the -index 1 option, which basically applies lindex $item 1 to each item in the list (the first sub-item is the line number, the second is the line text) before doing the search. Since we already have all the line numbers applied, we don't care about the actual indices we found them at, and can -inline the results.
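To make the -index 1 behaviour concrete, here is a small sketch that runs the same steps on an in-memory list instead of a file. The sample lines and the *error* pattern are made up for illustration; lsearch matches with glob rules by default, so the surrounding * characters are what let the word appear anywhere in the line.

```tcl
package require Tcl 8.6

proc number {list} {
    lmap item $list {list [incr number] $item}
}

# Hypothetical log lines standing in for the file contents
set lines [list "service started" "error: disk full" "heartbeat ok" "error: timeout"]

# Number first, then reverse, so each pair keeps its original line number
set reverseNumberedLines [lreverse [number $lines]]

# -index 1 makes lsearch compare the pattern against the line text only
set hits [lsearch -all -inline -index 1 $reverseNumberedLines *error*]
foreach hit $hits {
    lassign $hit lineNumber lineContent
    puts "$lineNumber : $lineContent"
}
```

The matches come out last line first, each still carrying the line number it had in the original order.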

How to match a string and print the next word after that?

Let's say I have the following file and have to look for .model and print the next two words before the (. The following is the contents of the file that I need to read.
.model Q2N2222 NPN(Is=14.34f Xti=3 Eg=1.11 Vaf=74.03 Bf=255.9 Ne=1.307
Ise=14.34f Ikf=.2847 Xtb=1.5 Br=6.092 Nc=2 Isc=0 Ikr=0 Rc=1
+ Cjc=7.306p Mjc=.3416 Vjc=.75 Fc=.5 Cje=22.01p Mje=.377 Vje=.75
+ Tr=46.91n Tf=411.1p Itf=.6 Vtf=1.7 Xtf=3 Rb=10)
* National pid=19 case=TO18
* 88-09-07 bam creation
*$
.model Q2N3904 NPN(Is=6.734f Xti=3 Eg=1.11 Vaf=74.03 Bf=416.4 Ne=1.259
.model Q2N3906 PNP(Is=1.41f Xti=3 Eg=1.11 Vaf=18.7 Bf=180.7 Ne=1.5 Ise=0
Here is the code I have written so far, but I couldn't get any results. I need some help.
proc find_lib_parts {f_name} {
set value [string first ".lib" $f_name]
if {$value != -1} {
#open the file
set fid [ open $f_name "r"]
#read the fid and split it in to lines
set infos [split [read $fid] "\n"]
close $fid
set res {}
append res "MODEL FOUND:\n"
if {[llength $line] > 2 && [lindex $line 0] eq {model}} {
#lappend res [lindex $data 2] \n
lappend res [split $line "("]\n
}
if {[llength $line] > 2 && [lindex $line 0] eq {MODEL}} {
#lappend res [lindex $data 2] \n
lappend res [split $line "("]\n
}
}
return $res
In this case, a regular expression is by far the simplest way of doing such a search. Assuming the words are always on the same line, it's easy:
proc find_lib_parts {f_name} {
set fid [open $f_name]
set infos [split [read $fid] "\n"]
close $fid
set found {}
foreach line $infos {
if {[regexp {\.model\s+(\w+\s+\w+)\(} $line -> twoWords]} {
lappend found $twoWords
}
}
return $found
}
For your input data sample, that'll produce a result like this:
{Q2N2222 NPN} {Q2N3904 NPN} {Q2N3906 PNP}
If there's nothing to find, you'll get an empty list. (I assume you pass filenames correctly anyway, so I omitted that check.)
The regular expression, which should virtually always be enclosed in {braces} in Tcl, is this:
\.model\s+(\w+\s+\w+)\(
It's relatively simple. The pieces of it are:
\.model — literal “.model” (with an escape of the . because it is a RE metacharacter)
\s+ — some whitespace
( — start a capturing group (the bit we put into the twoWords variable)
\w+ — a “word”, one or more alphanumeric (or underscore) characters
\s+ — some whitespace
\w+ — a “word”, one or more alphanumeric (or underscore) characters
) — end of the capturing group
\( — literal “(”, escaped
The regexp command matches this, returning whether or not it matched (effectively boolean without the -all option, which we're not using here), and assigning the various groups to the variables named afterwards, -> for the whole matched string (yes, that's a legal variable name; I like to use it for regexp variables that dump info I don't want) and twoWords for the interesting substring.
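A minimal sketch of that variable assignment on a single line of the sample data may help. The variable matched is only there for illustration, and ${->} is just the brace form needed to read a variable with that unusual name.

```tcl
set line ".model Q2N2222 NPN(Is=14.34f Xti=3 Eg=1.11"

# regexp returns 1 on a match; -> receives the whole match,
# twoWords receives the text captured by the parenthesised group
set matched [regexp {\.model\s+(\w+\s+\w+)\(} $line -> twoWords]

puts "matched:  $matched"
puts "twoWords: $twoWords"
puts "whole:    ${->}"
```

If the line does not match, regexp returns 0 and leaves the variables unset, which is why the result is tested in an if before twoWords is used.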

TCL Expect: how to remove a trailing newline

I am executing this expect script:
#!/usr/bin/expect
#####config file content :: user1|server1
set lx [exec awk {-F|} {/server1/ {print $1}} config]
puts "Values is $lx of server1."
The output is:
Values is user1
of server1.
There is a trailing new line in $lx that i am unable to remove.
I have tried :
set x [string trimright $lx \n]
set x [string trimright $lx "\n"]
set x [string trim $lx \n]
But there is always a \n left on the end of $x. How can this newline be removed?
By default string trim will remove any whitespace-type characters from either end of the string, so you could just set x [string trim $lx].
I would be inclined to do the whole processing in Expect like this:
set x [string trim [lindex [split [read [open config]] | ] 0]]
Used regsub -all {[\r\n]} $lx "" lx to fix this.
The canonical way of fixing this is string trimright $variable "\r\n", since Expect doesn't hide the details of CR/NL processing from you (unlike normal Tcl I/O operations). This is intentional, of course — Expect is often used in situations where being able to see the real details of terminal handling is useful — but means you have to deal with it.
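As a small sketch of that fix, assuming the value really does end in a carriage return and newline (the literal below is a stand-in for what the script received):

```tcl
# Hypothetical value with the CR/LF ending that terminal-style output can carry
set lx "user1\r\n"

# Trimming only "\n" would leave the "\r" behind; list both characters
set x [string trimright $lx "\r\n"]

puts "Values is $x of server1."
```

The second argument of string trimright is a set of characters to strip, not a substring, so the order of \r and \n in it does not matter.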

Read lines from file exactly as they appear

I am reading from a file and need to find the exact line $(eval $(call CreateTest KEYWORD and everything following after the line (as the rest is all random). This is how I am currently trying to find it but it always reports back as nothing found to match.
proc listFromFile {$path1} {
set find {$(eval $(call CreateTest, KEYWORD}
upvar path1 path1
set f [open $path1 r]
set data [split [string trim [read $f]] \n]
close $f
# return [lsearch -all -inline $data *KEYWORD*]
return [lsearch -exact -all -inline $data $find*]
}
The commented-out line is the closest I can get to a working version, but it pulls anything with KEYWORD anywhere in the file. KEYWORD could appear in lines I do not want to read, so I need to pull the exact line as stated above.
EDIT
I should have mentioned that the file is formatted like so;
$(eval $(call CreateTest, KEYWORD ...
$(eval $(call CreateTest, NOT_KEYWORD ...
$(eval $(call CreateTest, KEYWORD ...
$(eval $(call CreateTest, KEYWORD ...
$(eval $(call CreateTest, NOT_KEYWORD ...
$(eval $(call CreateTest, KEYWORD ...
which means I only want to pull the lines containing the exact string and the keyword. But there are lines between what I am looking for that I do not want to display
I think you should just apply your match to each line as you read them.
proc getMatchingLines {filename find} {
set result {}
set f [open $filename r]
while {[gets $f line] != -1} {
if {[string match ${find}* $line]} {
lappend result $line
}
}
close $f
return $result
}
set find {$(eval $(call CreateTest, KEYWORD}
set matching [getMatchingLines $filename $find]
foreach line $matching {
# do something with the matching line
}
You could build up a list of results or do something immediately for each matching line, as appropriate for your application. The main difference is that string match has far fewer metacharacters than regexp: only *, ?, [...] and \ are special, so it is simple to match a line that starts with your string followed by anything, i.e. ${find}*.
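The search string in this question contains none of the glob metacharacters, so it can be used as a pattern directly. As a hedged illustration of what happens when a prefix does contain one, the sketch below uses a made-up prefix with square brackets and compares three ways of testing it; the variable names are hypothetical.

```tcl
set prefix {a[1]}
set line {a[1] foo}

# Used as a glob pattern, [1] is a character class that matches "1"
set asPattern [string match ${prefix}* $line]

# Backslash-escaping the brackets makes them literal again
set escaped [string match {a\[1\]*} $line]

# string equal -length compares a literal prefix with no pattern rules at all
set literal [string equal -length [string length $prefix] $prefix $line]

puts "as pattern: $asPattern, escaped: $escaped, literal prefix: $literal"
```

When the text to find comes from user input, the string equal -length form avoids having to escape anything.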
Use string first and string range instead:
# foo.tcl
set f [open "data.txt" r]
set body [read $f]
puts -nonewline [string range $body [string first "ccc" $body] [string length $body]]
close $f
Test:
$ cat data.txt
aaa
bbb
ccc
ddd
eee
$ tclsh foo.tcl
ccc
ddd
eee
I think in your code you have used * as a glob pattern.
return [lsearch -exact -all -inline $data $find*]
When the -exact flag is used, lsearch treats that * as a literal * and compares each whole line for equality, so nothing matches. Dropping -exact (glob matching is the default) while keeping the trailing * will solve the problem.
proc listFromFile {$path1} {
set find {$(eval $(call CreateTest, KEYWORD }
upvar path1 path1
set f [open $path1 r]
set data [split [string trim [read $f]] \n]
close $f
return [lsearch -all -inline $data $find*]
}
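The difference between the two matching modes can be seen on a tiny list. This is only a sketch with made-up data; the names exactHits and globHits are illustrative.

```tcl
# Hypothetical lines, shortened from the format shown in the question
set data [list "KEYWORD one" "NOT_KEYWORD two" "KEYWORD three"]

# -exact compares each element for equality with the literal text "KEYWORD*"
set exactHits [lsearch -exact -all -inline $data KEYWORD*]

# -glob (the default) treats * as a wildcard; the pattern is anchored at the
# start of the element, so NOT_KEYWORD is not picked up
set globHits [lsearch -glob -all -inline $data KEYWORD*]

puts "exact: [llength $exactHits] match(es)"
puts "glob:  [llength $globHits] match(es)"
```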
This should work:
proc listFromFile path {
set f [open $path r]
set data [split [string trim [read $f]] \n]
close $f
return [lsearch -glob -all -inline $data {* KEYWORD*}]
}
In my answer to your earlier question, I suggested lsearch (without -exact) and KEYWORD* as a pattern because that seemed to be what you were after. Considering the lines you show here, searching for a space character followed by the string KEYWORD seems more likely to work.
Another thing: your problem with the parameter (which you tried to solve with upvar) was that you had a dollar sign attached to the parameter name. If you leave out the dollar sign you get a usable parameter name like in the code above (it is possible to use it even with the dollar sign, but it's a lot harder).
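A short sketch of the parameter-name point, using two hypothetical procs named good and awkward: the second one shows why a dollar sign in the parameter list makes the variable much harder to read back.

```tcl
# Plain parameter name: read it with ordinary $path substitution
proc good path {
    return "path is $path"
}

# Here the parameter is literally named "$path" (dollar sign included),
# so the body has to use the ${...} form to reach it
proc awkward {$path} {
    return "path is ${$path}"
}

puts [good /tmp/a.txt]
puts [awkward /tmp/a.txt]
```

Both calls return the same text, but only the first form is something you would want to maintain.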
Documentation: close, lsearch, open, proc, read, return, set, split, string

TCL String Manipulation and Extraction

I have a string xxxxxxx-s12345ab7_0_0_xx2.log and need to have an output like AB700_xx2 in TCL.
ab will be the delimiter and need to extract from ab to . (including ab) and also have to remove only the first two underscores.
Tried string trim, string trimleft and string trimright, but not much use. Is there anything like string split in TCL?
The first stage is to extract the basic relevant substring; the easiest way to do that is actually with a regular expression:
set inputString "xxxxxxx-s12345ab7_0_0_xx2.log"
if {![regexp {ab[^.]+} $inputString extracted]} {
error "didn't match!"
}
puts "got $extracted"
# ===> got ab7_0_0_xx2
Then, we want to get rid of those nasty underscores with string map:
set final [string map {"_" ""} $extracted]
puts "got $final"
# ===> got ab700xx2
Hmm, not quite what we wanted! We wanted to keep the last underscore and to up-case the first part.
set pieces [split $extracted "_"]
set final [string toupper [join [lrange $pieces 0 2] ""]]_[join [lrange $pieces 3 end] "_"]
puts "got $final"
# ===> got AB700_xx2
(The split command divides a string up into “records” by an optional record specifier — which defaults to any whitespace character — that we can then manipulate easily with list operations. The join command does the reverse, but here I'm using an empty record specifier on one half which makes everything be concatenated. I think you can guess what the string toupper and lrange commands do…)
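To show why the tail is joined with "_" rather than simply taking the last element, here is a sketch on a slightly longer, made-up input whose tail contains an extra underscore. The names head and tail are illustrative.

```tcl
# Hypothetical variant of the extracted string with one more underscore
set extracted "ab7_0_0_xx2_v3"

set pieces [split $extracted "_"]

# First three pieces are glued together with nothing between them
set head [join [lrange $pieces 0 2] ""]

# Everything after that keeps its underscores
set tail [join [lrange $pieces 3 end] "_"]

set final "[string toupper $head]_$tail"
puts $final
```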
set a "xxxxxxx-s12345ab7_0_0_xx2.log"
set a [split $a ""]
set trig 0
set extract ""
for {set i 0} {$i < [llength $a]} {incr i} {
if {"ab" eq "[lindex $a $i][lindex $a [expr {$i+1}]]"} {
set trig 1
}
if {$trig == 1} {
append extract [lindex $a $i]
}
}
set extract "[string toupper [join [lrange [split [lindex [split $extract .] 0] _] 0 end-1] ""]]_[lindex [split [lindex [split $extract .] 0] _] end]"
puts $extract
Only regexp is enough to do the trick.
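set string "xxxxxxx-s12345ab7_0_0_xx2.log"
# Command names are case-sensitive (set, puts); \. matches the literal dot
regexp {(ab)(.*)_(.*)_(.*)_(.*)\.} $string -> s1 s2 s3 s4 s5
# Upper-case the joined first part and keep the last underscore
set rstring "[string toupper $s1$s2$s3$s4]_$s5"
puts $rstring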