catch multiple empty lines in file in tcl - tcl

There are four empty lines in my file, which is open on the channel wr_fp. I want to detect those four empty lines in my code, but the code below is not working.
while {[gets $wr_fp line3] >= 0} {
    if {[regexp "\n\s+\n\s+\n\s+\n" $line3]} { puts "found 4 empty lines"}
}

tl;dr: Don't put REs in "quotes", put them in {braces}.
The problem is that you've put your RE in quotes, so that it is actually this:
s+
s+
s+
Because of Tcl's general substitution rules, \n becomes a newline and \s becomes a simple s. Putting the RE in braces inhibits this (unwanted in this case) behaviour.
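A minimal sketch of the difference, using only built-in commands (the variable names quoted, braced, and text are illustrative, not from the question):

```tcl
# In double quotes, Tcl substitutes backslash sequences before regexp sees them.
set quoted "\n\s+\n"   ;# becomes: newline, s, +, newline
# In braces, the backslashes reach the regexp engine untouched.
set braced {\n\s+\n}

puts [string length $quoted]   ;# 4
puts [string length $braced]   ;# 7

# A line holding only a blank, a tab, and a blank, between two other lines.
set text "abc\n \t \ndef"
puts [regexp $quoted $text]    ;# 0
puts [regexp $braced $text]    ;# 1
```

The quoted version only matches text that literally contains one or more s characters between two newlines.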

This is my answer; it does what I want.
while {[gets $rd_fp line] >= 0} {
    if {[string match "" $line]} {
        if {[expr $count % 4] == 1} {puts "found 4 space"}
        incr count
    }
}

The gets / chan gets command reads one line at a time and discards the newline character from each line, so your test will never succeed. You need to read in the full contents of the file at once:
set txt [chan read $wr_fp]
if {[regexp {\n\s+\n\s+\n\s+\n} $txt]} { puts "found 4 empty lines"}
Note that you need to use braces around the regular expression as Donal explains.
Some typical pitfalls of RE formulation are worth noting.
Do you really intend to specify that there must be at least one whitespace character on each 'empty' line? If you want to allow lines with no characters at all between the newlines, use \s* instead of \s+.
Also note that this regular expression will match ranges with more than four newlines: the extra newlines will be consumed by one of the \s+ groups. If you want to disallow extra newlines, match with (e.g.) [ \t\f\r] (or any other combination of whitespace you want) instead of \s. Note that this means the expression will match exactly three lines with nothing but blanks, tabs, form feeds, and returns, the lines surrounded and separated by newlines: you might want to extend it with one more subgroup to match the fourth line.
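As a sketch of that last suggestion, the pattern below allows only blanks, tabs, form feeds, and returns on each empty line and uses a bound to require four such lines. The variable names are illustrative, and the pattern still matches when more than four blank lines occur in a row.

```tcl
# One newline ends the preceding line; then four lines follow that contain
# only horizontal whitespace (possibly nothing), each ended by a newline.
set re {\n(?:[ \t\f\r]*\n){4}}

set four  "abc\n\n \t\n\n\ndef"   ;# four blank lines between abc and def
set three "abc\n\n\n\ndef"        ;# only three blank lines

puts [regexp $re $four]    ;# 1
puts [regexp $re $three]   ;# 0
```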
I'm a bit mystified by your solution as described in your own answer, since it doesn't do what was specified in the question. With the following text file:
abc
def
ghi
jkl
mno
pqr
stu
vwx
yz.
(where there is a tab character in the second line after "pqr")
and assuming count has the value 0 when the code is called, your code outputs "found 4 space" after reading the blank lines after "def", "pqr", and "vwx", but not after the line before "stu", where your question indicated it should be.
This code
set count 0
while {[gets $rd_fp line] >= 0} {
    if {[string is space $line]} {
        incr count
        if {$count == 4} {puts "found 4 space"}
    } else {
        set count 0
    }
}
does do what you asked for (nearly): it accepts lines containing whitespace as empty, and it prints its message only after finding four consecutive empty lines. The major difference from the specification in your question is that it also accepts lines without any characters as empty. To match your specification, string is space -strict $line should be used instead.
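The two forms of the test differ only for the empty string, as this short check illustrates (the sample strings are made up for the demonstration):

```tcl
puts [string is space ""]              ;# 1: the empty string passes the non-strict test
puts [string is space -strict ""]      ;# 0: -strict rejects the empty string
puts [string is space " \t"]           ;# 1
puts [string is space -strict " \t"]   ;# 1
puts [string is space "a b"]           ;# 0
```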
Documentation: chan, gets, if, incr, puts, regexp, set, string, while

Related

How to delete a part of the text file if a pattern is found matching using tcl?

How can I remove a part of a text file if the pattern I am searching for is matched?
For example:
pg_pin (VSS) {
direction : inout;
pg_type : primary_ground;
related_bias_pin : "VBN";
voltage_name : "VSS";
}
leakage_power () {
value : 0;
when : "A1&A2&X";
**related_pg_pin** : VBN;
}
My pattern is related_pg_pin. If this pattern is found, I want to remove the section that contains it (starting from leakage_power () { up to and including the closing brace }).
proc getSection f {
    set section ""
    set inSection false
    while {[gets $f line] >= 0} {
        if {$inSection} {
            append section $line\n
            # find the end of the section (a single right brace, #x7d)
            if {[string match \x7d [string trim $line]]} {
                return $section
            }
        } else {
            # find the beginning of the section, with a left brace (#x7b) at the end
            if {[string match *\x7b [string trim $line]]} {
                append section $line\n
                set inSection true
            }
        }
    }
    return
}
set f [open data.txt]
set g [open output.txt w]
set section [getSection $f]
while {$section ne {}} {
    if {![regexp related_pg_pin $section]} {
        puts $g $section
    }
    set section [getSection $f]
}
close $f
close $g
Starting with the last paragraph of the code, we open a file for reading (through the channel $f) and then get a section. (The procedure to get a section is a little bit convoluted, so it goes into a command procedure to be out of the way.) As long as non-empty sections keep coming, we check if the pattern occurs: if not, we print the section to the output file through the channel $g. Then we get the next section and go to the next iteration.
To get a section, first assume we haven't yet seen any part of a section. Then we keep reading lines until the end of the file is found. If a line ending with a left brace is found, we add it to the section and take a note that we are now in a section. From then on, we add every line to the section. If a line consisting of a single right brace is found, we quit the procedure and deliver the section to the caller.
Documentation:
! (operator),
>= (operator),
append,
close,
gets,
if,
ne (operator),
open,
proc,
puts,
regexp,
return,
set,
string,
while,
Syntax of Tcl regular expressions
Syntax of Tcl string matching:
* matches a sequence of zero or more characters
? matches a single character
[chars] matches a single character in the set given by chars (^ does not negate; a range can be given as a-z)
\x matches the character x, even if that character is special (one of *?[]\)
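A few illustrative calls to string match show these rules at work (the sample strings are made up for the demonstration):

```tcl
puts [string match *\x7b "leakage_power () \{"]   ;# 1: * then a literal left brace
puts [string match a?c abc]                        ;# 1: ? matches exactly one character
puts [string match a?c abbc]                       ;# 0: the whole string must match
puts [string match {[a-c]x} bx]                    ;# 1: b is in the range a-c
puts [string match {\*} *]                         ;# 1: \* matches a literal asterisk
puts [string match {\*} x]                         ;# 0
```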
Here's a "clever" way to do it:
proc unknown args {
    set body [lindex $args end]
    if {[string first "related_pg_pin" $body] == -1} {puts $args}
}
source file.txt
Your data file appears to be Tcl-syntax-compatible, so execute it like a Tcl file, and for unknown commands, check to see if the last argument of the "command" contains the string you want to avoid.
This is clearly insanely risky, but it's fun.

Tcl script to do Indentation

I want to write a Tcl script that re-indents my Tcl scripts properly. For example, if I have code like this:
proc calc { } {
set a 5
set b 10
if {a < b} {
puts "b Greater"
}
}
I need to change it to:
proc calc { } {
    set a 5
    set b 10
    if {a < b} {
        puts "b Greater"
    }
}
Could you help with this?
Writing an indenter that handles your example is trivial. A full indenter that can handle most Tcl scripts is going to be very big and quite complicated. An indenter that can handle any Tcl script will have to incorporate a full Tcl interpreter.
This is because Tcl source code is very dynamic: for one thing you can't always just look at the code and know which parts are executing code and which parts are data. Another thing is user-defined control structures, which might change how the code is to be viewed. The example below works by counting braces, but it makes no attempt to distinguish between quoting braces that should increase indentation and quoted braces that should not.
This example is a very simple indenter. It is severely limited and should not be used for serious implementations.
proc indent code {
    set res {}
    set ind 0
    foreach line [split [string trim $code] \n] {
        set line [string trim $line]
        # find out if the line starts with a closing brace
        set clb [string match \}* $line]
        # indent the line, with one less level if it starts with a closing brace
        append res [string repeat {    } [expr {$ind - $clb}]]$line\n
        # look through the line to find out the indentation level of the next line
        foreach c [split $line {}] {
            if {$c eq "\{"} {incr ind}
            if {$c eq "\}"} {incr ind -1}
        }
    }
    return $res
}
This will convert your first code example to your second one. Add even a single brace as data somewhere in the code to be indented, though, and the indentation will be off.
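To illustrate that limitation, the sketch below counts braces on a single line in the same naive way. The helper name braceDelta is hypothetical and is not part of the indenter itself.

```tcl
# Hypothetical helper: net change in brace depth for one line,
# counted character by character without regard to quoting.
proc braceDelta line {
    set delta 0
    foreach c [split $line {}] {
        if {$c eq "\{"} {incr delta}
        if {$c eq "\}"} {incr delta -1}
    }
    return $delta
}

# A real block opener: the depth goes up by one, as it should.
puts [braceDelta "if {\$a < \$b} \{"]   ;# 1

# A brace that is only data inside a string: the depth still goes up by one,
# so every following line would be indented one level too deep.
puts [braceDelta {puts "\{"}]           ;# 1
```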
Documentation: append, expr, foreach, if, incr, proc, return, set, split, string

How to match a colon after a close bracket

Why does the following not match the colon?
expect {
    timeout {puts timedout\n}
    \[1\]: {puts matched\n}
}
> expect test.tcl
[1]:
timedout
If I change it and remove the colon the match works:
expect {
    timeout {puts timedout\n}
    \[1\] {puts matched\n}
}
$ expect test.tcl
[1]
matched
Or if I get rid of the 1st bracket
expect {
    timeout {puts timedout\n}
    1\]: {puts matched\n}
}
then it matches:
$ expect test.tcl
1]:
matched
The problem is not with the :, but with the [.
The [ is special to both Tcl and the Expect pattern matcher so it is particularly messy. To match a literal [, you have to backslash once from Tcl and then again so that it is not treated as a range during pattern matching. The first backslash, of course, has to be backslashed to prevent it from turning the next backslash into a literal backslash!
expect "\\\[" ; #matches literal '['
So, your code should be,
expect {
    timeout {puts timedout\n}
    \\\[1]: {puts matched\n}
}
You can prefix the ] with a backslash if it makes you feel good, but it is not necessary. Since there is no matching left-hand bracket to be matched within the double-quoted string, nothing special happens with the right-hand bracket. It stands for itself and is passed on to the Expect command, where it is then interpreted as the end of the range.
The next set of examples shows the behavior of [ as a pattern preceded by differing numbers of backslashes. If the [ is not prefixed by a backslash, Tcl interprets whatever follows as a command. For these examples, imagine that there is a procedure named XY that returns the string n*w.
expect "[XY]" ; # matches n followed by anything
expect "\[XY]" ; # matches X or Y
expect "\\[XY]" ; # matches n followed by anything followed by w
expect "\\\[XY]" ; # matches [XY]
expect "\\\\[XY]" ; # matches \ followed by n followed ...
expect "\\\\\[XY]" ; # matches sequence of \ and X or Y
The \\[XY] case deserves close scrutiny. Tcl interprets the first backslash to mean that the second is a literal character. Tcl then produces n*w as the result of the XY command. The pattern matcher ultimately sees the four-character string \n*w. The pattern matcher interprets this in the usual way. The backslash indicates that the n is to be matched literally (which it would even without the backslash since the n is not special to the pattern matcher).
Source : Exploring Expect
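The two layers can be observed in plain Tcl without Expect. This sketch assumes that string match, with * added on both sides, is a fair stand-in for Expect's unanchored glob matching; the variable names are illustrative.

```tcl
# Layer 1: Tcl's own substitution reduces the three backslashes and the
# left bracket to a single backslash followed by a left bracket.
set pattern "\\\[1]:"
puts $pattern                             ;# \[1]:

# Layer 2: the glob matcher treats backslash-bracket as a literal left bracket.
set output {[1]:}
puts [string match *$pattern* $output]    ;# 1

# With a single backslash, Tcl passes on [1]: and the matcher reads
# the bracketed part as a character set containing only the digit 1.
set weak "\[1]:"
puts [string match *$weak* $output]       ;# 0: it would need 1 directly before the colon
puts [string match *$weak* "1:"]          ;# 1
```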
The patterns that worked for me:
-exact {[1]:}
-exact "\[1]:"
{\[1]:}
"\\\[1]:"

Using backslash-newline sequence in Tcl

In Tcl, the backslash is used both for escaping special characters and for spreading long commands across multiple lines.
For example, a typical if statement can be written as
set some_Variable_here 1
if { $some_Variable_here == 1 } {
    puts "it is equal to 1"
} else {
    puts "it is not equal to 1"
}
With the help of a backslash, it can also be written as follows:
set some_Variable_here 1
if { $some_Variable_here == 1 } \
{
    puts "it is equal to 1"
} \
else {
    puts "it is not equal to 1"
}
So, with a backslash we can make the statements be treated as if they were on the same line.
Let's consider the set statement.
I can write something like this:
set x Albert\ Einstein;# This works
puts $x
#This one is not working
set y Albert\
Einstein
If I try with double quotes or braces, then the above one will work. So, is it possible to escape the newline with backslashes without double quotes or braces?
A backslash-newline-whitespace* sequence (i.e., following whitespace is skipped over) is always replaced with a single space. To get a backslash followed by a newline in the resulting string, use \\ followed by \n instead.
set y Albert\\\nEinstein
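A short sketch of both behaviours (the variable names are illustrative). Inside double quotes the backslash-newline and the indentation that follows it collapse into one space, while \\ followed by \n produces a real backslash and a real newline:

```tcl
# Backslash-newline plus the leading whitespace of the next line
# is replaced by a single space inside the quoted word.
set x "Albert\
       Einstein"
puts $x                    ;# Albert Einstein

# An escaped backslash followed by the newline escape.
set y Albert\\\nEinstein
puts [string length $y]    ;# 16: 6 letters, backslash, newline, 8 letters
```

Without the quotes, the single space produced by the first form acts as a word separator, so set receives one argument too many.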

How to ensure my regular expression does not match too much

A file has a few words with numbers at the beginning of them. I want to extract the line with a particular number, but when given 1, my script extracts line 1 and also lines such as 11 and 21.
FILE.txt has contents:
1.sample
lines of
2.sentences
present in
...
...
10.the
11.file
Executing pro 1 file.txt returns lines 1, 10, and 11, because all three contain a 1.
Output of the script:
1.sample
10.the
11.file
I expect only the contents of line 1, not those of line 10 or line 11.
Expected output:
1.sample
My current code:
proc pro { pattern args} {
    set file [open $args r]
    set lnum 0
    set occ 0
    while {[gets $file line] >=0} {
        incr lnum
        if {[regexp $pattern $line]} {
            incr occ
            puts "The pattern is present in line: $lnum"
            puts "$line"
        } else {
            puts "not found"
        }
    }
    puts "total number of occurrences : $occ"
    close $file
}
The program works, but it also retrieves lines that I don't want along with the expected line. Because the number I want to retrieve (1) is also present in other strings such as 11, 21, and 14, those lines are printed as well.
Please bear with my unclear way of explaining the question.
You can solve the problem using word boundaries as suggested by glen but you can also consider the following things:
If every line number is followed by a . then you can use it as a delimiter in the regular expression:
regexp "^$lineNo\\." $a
I would also suggest using ^ (match at the beginning of the line) so that a number that appears elsewhere in the line is not counted.
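Applied to the sample lines from the question, the anchored pattern behaves as follows. This is a sketch; lineNo and the list of lines are set up here only for the demonstration.

```tcl
set lineNo 1
foreach a {1.sample 10.the 11.file {lines of 1.}} {
    # ^ anchors at the start of the line; the doubled backslash
    # reaches regexp as a single one, making the dot literal
    puts "$a -> [regexp "^$lineNo\\." $a]"
}
# Only 1.sample reports 1; the other three report 0.
```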
Tcl word boundaries are well explained at http://www.regular-expressions.info/wordboundaries.html
You have to ensure your pattern matches only between word boundaries:
if {[regexp "\\m$pattern\\M" $line]} { ...
See the documentation for regular expression syntax.
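A quick check of the word-boundary version against the same kind of input (a sketch; the sample strings are illustrative). Note that \m and \M constrain only the edges of the pattern, so a standalone 1 elsewhere in a line still matches:

```tcl
set pattern 1
foreach line {1.sample 10.the 11.file {line 1 again}} {
    puts "$line -> [regexp "\\m$pattern\\M" $line]"
}
# 1.sample and "line 1 again" report 1; 10.the and 11.file report 0.
```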
If what you're looking to do is as constrained as what you're describing, why not just use something like
if { [string range $line 0 [string length $pattern]] eq "${pattern}." } {
...
}