How to delete part of a text file when a pattern is found, using Tcl?

How can I remove part of a text file when the pattern I am searching for is matched?
For example:
pg_pin (VSS) {
    direction : inout;
    pg_type : primary_ground;
    related_bias_pin : "VBN";
    voltage_name : "VSS";
}
leakage_power () {
    value : 0;
    when : "A1&A2&X";
    related_pg_pin : VBN;
}
My pattern is related_pg_pin. If this pattern is found, I want to remove that particular section (starting from leakage_power () { up to and including the closing brace }).

proc getSection f {
    set section ""
    set inSection false
    while {[gets $f line] >= 0} {
        if {$inSection} {
            append section $line\n
            # find the end of the section (a single right brace, #x7d)
            if {[string match \x7d [string trim $line]]} {
                return $section
            }
        } else {
            # find the beginning of the section, with a left brace (#x7b) at the end
            if {[string match *\x7b [string trim $line]]} {
                append section $line\n
                set inSection true
            }
        }
    }
    return
}
set f [open data.txt]
set g [open output.txt w]
set section [getSection $f]
while {$section ne {}} {
    if {![regexp related_pg_pin $section]} {
        puts $g $section
    }
    set section [getSection $f]
}
close $f
close $g
Starting with the last paragraph of the code, we open a file for reading (through the channel $f) and then get a section. (The procedure to get a section is a little bit convoluted, so it goes into a command procedure to be out of the way.) As long as non-empty sections keep coming, we check if the pattern occurs: if not, we print the section to the output file through the channel $g. Then we get the next section and go to the next iteration.
To get a section, first assume we haven't yet seen any part of a section. Then we keep reading lines until the end of the file is found. If a line ending with a left brace is found, we add it to the section and take a note that we are now in a section. From then on, we add every line to the section. If a line consisting of a single right brace is found, we quit the procedure and deliver the section to the caller.
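One consequence of the procedure above is that lines lying outside any section are never copied to the output. If those lines should survive, a minimal sketch of an in-memory variant could look like the following. The name dropSections and its two parameters are hypothetical, and it assumes the whole file fits in memory and that sections do not nest.

```tcl
# Hypothetical helper (not part of the answer above): remove every section
# that contains $pattern, but keep all other lines, including lines that
# sit outside any section. Sections are assumed not to nest.
proc dropSections {text pattern} {
    set out {}
    set section {}
    set inSection false
    foreach line [split $text \n] {
        set trimmed [string trim $line]
        # a line ending with a left brace (#x7b) opens a section
        if {!$inSection && [string match *\x7b $trimmed]} {
            set inSection true
        }
        if {!$inSection} {
            lappend out $line
            continue
        }
        lappend section $line
        # a line consisting of a single right brace (#x7d) closes it
        if {$trimmed eq "\x7d"} {
            if {[string first $pattern [join $section \n]] < 0} {
                lappend out {*}$section
            }
            set section {}
            set inSection false
        }
    }
    # keep an unterminated trailing section rather than silently losing it
    lappend out {*}$section
    return [join $out \n]
}

set data "/* header */\npg_pin (VSS) {\n  voltage_name : \"VSS\";\n}\nleakage_power () {\n  value : 0;\n  related_pg_pin : VBN;\n}"
puts [dropSections $data related_pg_pin]
```

The {*} expansion syntax needs Tcl 8.5 or later.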
Documentation: ! (operator), >= (operator), append, close, gets, if, ne (operator), open, proc, puts, regexp, return, set, string, while, Syntax of Tcl regular expressions
Syntax of Tcl string matching:
* matches a sequence of zero or more characters
? matches a single character
[chars] matches a single character in the set given by chars (^ does not negate; a range can be given as a-z)
\x matches the character x, even if that character is special (one of *?[]\)

Here's a "clever" way to do it:
proc unknown args {
    set body [lindex $args end]
    if {[string first "related_pg_pin" $body] == -1} {puts $args}
}
source file.txt
Your data file appears to be Tcl-syntax-compatible, so execute it like a Tcl file, and for unknown commands, check to see if the last argument of the "command" contains the string you want to avoid.
This is clearly insanely risky (source will run any real Tcl command that happens to appear in the data file), but it's fun.

Related

How to remove a single letter/number

I have single letters and numbers in a variable that I would like to remove
example inputs:
USA-2019-1-aoiwer
USA-A-jowerasf
BB-a_owierlasdf-2019
flsfwer_5_2015-asfdlwer
desired outputs:
USA-2019--aoiwer
USA--jowerasf
BB-_owierlasdf-2019
flsfwer__2015-asfdlwer
my code:
bind pub "-|-" !aa proc:aa
proc proc:aa { nick host handle channel arg } {
set line [lindex $arg 0]
set line [string map {[a-z] """} $line]
set line [string map {[0-9] """} $line]
putnow "PRIVMSG $channel :$line"
}
Unfortunately that does not work, and I have no other ideas.
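string map replaces literal substrings, not patterns, so [a-z] is treated as the literal text "[a-z]". Even if it did accept patterns, it would remove all the lowercase letters and numbers, not just the isolated ones. You also have unbalanced quotes, which cause a syntax error when the proc is evaluated.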
I would recommend using regsub. The hard part, however, would be to get a proper expression to do the task. I will suggest the following:
bind pub "-|-" !aa proc:aa
proc proc:aa { nick host handle channel arg } {
    set line [lindex $arg 0]
    regsub -nocase -all {([^a-z0-9]|\y)[a-z0-9]([^a-z0-9]|\y)} $line {\1\2} line
    putnow "PRIVMSG $channel :$line"
}
Basically ([^a-z0-9]|\y) matches a character that is non alphanumeric, or a word boundary (which will match at the beginning of a sentence for example if it can, or at the end of a sentence), and stores it (this is the purpose of the parens).
The matched groups are stored in order starting with 1, so in the replace portion of regsub, I'm placing the parts that shouldn't be replaced back where they were.
The above should work fine.
You could technically go a little fancier with a slightly different expression:
regsub -nocase -all {([^a-z0-9]|\y)[a-z0-9](?![a-z0-9])} $line {\1} line
This uses a negative lookahead ((?! ... )), which checks the character after the single letter or digit without consuming it.
Anyway, if you do want to get more in depth, I recommend reading the manual on regular expression syntax.
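To try both expressions outside of the bot, they can be wrapped in plain procedures and run over the example inputs from the question. This is only a sketch: the procedure names removeSingles and removeSinglesLookahead are made up for the check, and the Eggdrop-specific bind and putnow calls are left out.

```tcl
# Hypothetical wrappers around the two regsub calls from the answer above.
proc removeSingles {line} {
    # keep the delimiters on both sides (\1 and \2), drop the single character
    regsub -nocase -all {([^a-z0-9]|\y)[a-z0-9]([^a-z0-9]|\y)} $line {\1\2} line
    return $line
}

proc removeSinglesLookahead {line} {
    # the lookahead only inspects the following character, so only \1 is restored
    regsub -nocase -all {([^a-z0-9]|\y)[a-z0-9](?![a-z0-9])} $line {\1} line
    return $line
}

foreach input {USA-2019-1-aoiwer USA-A-jowerasf BB-a_owierlasdf-2019 flsfwer_5_2015-asfdlwer} {
    puts "$input -> [removeSingles $input] / [removeSinglesLookahead $input]"
}
```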

Tcl script to do Indentation

I want to write a Tcl script that re-indents my Tcl code properly. For example, if I have code like this:
proc calc { } {
set a 5
set b 10
if {$a < $b} {
puts "b Greater"
}
}
I need to change it to:
proc calc { } {
    set a 5
    set b 10
    if {$a < $b} {
        puts "b Greater"
    }
}
Could you help with this?
Writing an indenter that handles your example is trivial. A full indenter that can handle most Tcl scripts is going to be very big and quite complicated. An indenter that can handle any Tcl script will have to incorporate a full Tcl interpreter.
This is because Tcl source code is very dynamic: for one thing you can't always just look at the code and know which parts are executing code and which parts are data. Another thing is user-defined control structures, which might change how the code is to be viewed. The example below works by counting braces, but it makes no attempt to distinguish between quoting braces that should increase indentation and quoted braces that should not.
This example is a very simple indenter. It is severely limited and should not be used for serious implementations.
proc indent code {
    set res {}
    set ind 0
    foreach line [split [string trim $code] \n] {
        set line [string trim $line]
        # find out if the line starts with a closing brace
        set clb [string match \}* $line]
        # indent the line (four spaces per level), with one less level if it starts with a closing brace
        append res [string repeat {    } [expr {$ind - $clb}]]$line\n
        # look through the line to find out the indentation level of the next line
        foreach c [split $line {}] {
            if {$c eq "\{"} {incr ind}
            if {$c eq "\}"} {incr ind -1}
        }
    }
    return $res
}
This will convert your first code example to your second one. Add even a single brace as data somewhere in the code to be indented, though, and the indentation will be off.
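The limitation is easy to reproduce. The sketch below repeats the indent procedure from above (assuming four spaces per level) and feeds it a hypothetical three-line body in which one brace is data inside a quoted string.

```tcl
# Same brace-counting indenter as above, repeated so the example is self-contained.
proc indent code {
    set res {}
    set ind 0
    foreach line [split [string trim $code] \n] {
        set line [string trim $line]
        set clb [string match \}* $line]
        append res [string repeat {    } [expr {$ind - $clb}]]$line\n
        foreach c [split $line {}] {
            if {$c eq "\{"} {incr ind}
            if {$c eq "\}"} {incr ind -1}
        }
    }
    return $res
}

# Hypothetical input: the second line prints a left brace as plain data.
set sample "proc demo {} {\nputs \"\{\"\nputs done\n}"
puts [indent $sample]
```

Because the quoted brace is counted like a structural one, "puts done" ends up one level too deep and the final closing brace never returns to column zero.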
Documentation: append, expr, foreach, if, incr, proc, return, set, split, string

How search for list's each element existence in file

How can I write a loop in Tcl that checks whether each element of a list exists in a file or in another list, and returns the elements that have no match?
If the number of things that you are checking for is significantly smaller than the number of lines/tokens in the file, it is probably best to use the power of associative arrays to do the check as this can be done with linear scans (associative arrays are fast).
proc checkForAllPresent {tokens tokenList} {
    foreach token $tokens {
        set t($token) "dummy value"
    }
    foreach token $tokenList {
        unset -nocomplain t($token)
    }
    # If the array is empty, all were found
    return [expr {[array size t] == 0}]
}
Then, all we need to do is a little standard stanza to get the lines/tokens from the file and run them through the checker. Assuming we're dealing with lines:
proc getFileLines {filename} {
    set f [open $filename]
    set data [read $f]
    close $f
    return [split $data "\n"]
}
set shortList [getFileLines file1.txt]
set longList [getFileLines file2.txt]
if {[checkForAllPresent $shortList $longList]} {
    puts "All were there"
} else {
    puts "Some were absent"
}
It's probably better to return the list of absent lines (with return [array names t]) instead of whether everything is present (with the general check of “is everything there” being done with llength) as that gives more useful information. (With more work, you can produce even more information about what is present, but that's a bit more code and makes things less clear.)
(When searching, be aware that leading and trailing whitespace on lines matters. This is all exact matching here. Or use string trim.)
Working with words instead of lines is really just as easy. You just end up with slightly different code to extract the tokens from the read-in contents of the files.
return [regexp -all -inline {\w+} $data]
Everything else is the same.
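Returning the absent entries, as suggested above, takes only a small change to the checker. A minimal sketch follows; the name findAbsent is hypothetical, and the result is sorted only to make the output order predictable.

```tcl
# Hypothetical variant of checkForAllPresent that reports what is missing.
proc findAbsent {tokens tokenList} {
    # mark every token we are looking for
    foreach token $tokens {
        set t($token) 1
    }
    # cross off each one that turns up in the other list
    foreach token $tokenList {
        unset -nocomplain t($token)
    }
    # whatever is left was never seen
    return [lsort [array names t]]
}

set absent [findAbsent {alpha beta gamma} {gamma delta alpha}]
if {[llength $absent] == 0} {
    puts "All were there"
} else {
    puts "Absent: $absent"
}
```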

tcl error : extra characters after close-brace

I am having trouble debugging this 'extra characters after close-brace' error. The error message points to my proc line, and I have not been able to spot the problem for 2 days!
# {{{ MAIN PROGRAM
proc MAIN_PROGRAM { INPUT_GDS_OASIS_FILE L CELL_LIST_FILE } {
if { [file exists $CELL_LIST_FILE] == 0 } {
set celllist [$L cells]
} else {
set fp [open $CELL_LIST_FILE r]
set file_data [read $fp]
close $fp
set celllist [split $file_data "\n"]
set totalcells [expr [llength $celllist] - 1]
}
set counter 0
foreach cell $celllist {
set counter [expr {$counter + 1}]
set value [string length $cell]
set value3 [regexp {\$} $cell]
if { $value > 0 && $value2 == 0 && $value3 == 0 } {
# EXTRACT BOUNDRARY SIZE FIRST
puts "INFO -- READING Num : $counter/$totalcells -- $cell ..."
ONEIP_EXTRACT_BOUNDARY_SIZE $cell $L "IP_SIZE/$cell.txt"
exec gzip -f "IP_SIZE/$cell.txt"
}
}
# }}}
}
# }}}
This seems to be an unfortunate case of using braces in comments. The Tcl parser looks at braces before comments (http://tcl.tk/man/tcl8.5/TclCmd/Tcl.htm). It is a problem if putting braces in comments causes a mismatched number of open/close braces.
Try using a different commenting style, and remove the "{{{" and "}}}" from your comments.
I'm pretty sure that this is down to braces in comments within the proc body.
The wiki page here has a good explanation. In short, a Tcl comment isn't like a comment in most other languages, and having unmatched braces in one leads to all sorts of issues.
So the braces in the #}}} just before the end of the proc are probably the problem.
Tcl requires procedure bodies to be brace-balanced, even within comments.
OK, that's a total lie. Tcl really requires brace-quoted strings to be brace-balanced (Tcl's brace-quoted strings are just like single-quoted strings in bash, except they nest). The proc command just interprets its third argument as a script (used to define the procedure body) and it's very common to use brace-quoted strings for that sort of thing. This is a feature of Tcl's general syntax, and is why Tcl is very good indeed at handling things like DSLs.
You could instead do this:
proc brace-demo args "puts hi; # {{{"
brace-demo do it yeah
and that will work fine. Totally legal Tcl, and has a comment in a procedure body with unbalanced braces. It just happens that for virtually any real procedure, putting in all the required backslashes to stop interpretation of variable and command substitutions too soon is a total bear. Everyone uses braces for simplicity, and so has to balance them.
It's hardly ever a problem except occasionally for comments.
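The failure from the question can be reproduced in a few lines. This is a sketch only: the procedure name demo and the variable names are made up, and the scripts are built as double-quoted strings (with \} escapes) purely so that the example itself stays brace-balanced.

```tcl
# The evaluated script contains a plain "# }}}" comment inside a braced body.
set broken "proc demo {} {\n    puts hi\n    # \}\}\}\n}"
set brokenCode [catch {eval $broken} brokenMsg]

# Same procedure with the fold marker removed from the comment.
set fixed "proc demo {} {\n    puts hi\n    # fold marker removed\n}"
set fixedCode [catch {eval $fixed} fixedMsg]

puts "broken: code=$brokenCode message=$brokenMsg"
puts "fixed:  code=$fixedCode"
```

In the broken script the first brace of the comment closes the procedure body, so the parser finds further characters directly after that close-brace and stops before proc is ever called.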

How to ensure my regular expression does not match too much

A file has a few lines that begin with a number. I want to extract the line with a particular number, but when given 1, my script extracts line 1 and also lines such as 11 and 21.
FILE.txt has these contents:
1.sample
lines of
2.sentences
present in
...
...
10.the
11.file
When executed as pro 1 file.txt, it gives results from lines 1, 10 and 11, because all three have a 1 in their string.
Output of the script:
1.sample
10.the
11.file
Expected output: only the contents of line 1, not those of line 10 or line 11, i.e.:
1.sample
My current code:
proc pro { pattern args} {
    set file [open $args r]
    set lnum 0
    set occ 0
    while {[gets $file line] >=0} {
        incr lnum
        if {[regexp $pattern $line]} {
            incr occ
            puts "The pattern is present in line: $lnum"
            puts "$line"
        } else {
            puts "not found"
        }
    }
    puts "total number of occurencese : $occ"
    close $file
}
The program works, but it retrieves lines that I don't want along with the expected line. Because the number I want to retrieve (1) is also present in other strings such as 11, 21, 14, etc., those lines are printed as well.
Kindly tolerate my unclear way of explaining the question.
You can solve the problem using word boundaries as suggested by glen but you can also consider the following things:
If after every line number there is a . then you can use it as delimiter in regular expression
regexp "^$lineNo\\." $a
I would also suggest using ^ (match at the beginning of the line), so that the number is not counted if it appears elsewhere in the line.
Tcl word boundaries are well explained at http://www.regular-expressions.info/wordboundaries.html
You have to ensure your pattern matches only between word boundaries:
if {[regexp "\\m$pattern\\M" $line]} { ...
See the documentation for regular expression syntax.
If what you're looking to do is as constrained as what you're describing, why not just use something like
if { [string range $line 0 [string length $pattern]] eq "${pattern}." } {
...
}
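The three suggestions above can be compared side by side on the sample data. The sketch below wraps each one in a small procedure; the procedure names are hypothetical, and the number is assumed to consist of digits only, since it is interpolated into the regular expressions without escaping.

```tcl
# Anchored match: the number must start the line and be followed by a dot.
proc startsWithNumber {lineNo line} {
    return [regexp "^$lineNo\\." $line]
}

# Word-boundary match: the number may appear anywhere, but only as a whole word.
proc containsWord {pattern line} {
    return [regexp "\\m$pattern\\M" $line]
}

# Plain string comparison of the line prefix against "<number>.".
proc hasPrefix {pattern line} {
    return [expr {[string range $line 0 [string length $pattern]] eq "${pattern}."}]
}

set lines {1.sample {lines of} 2.sentences {present in} 10.the 11.file}
set found {}
foreach line $lines {
    if {[startsWithNumber 1 $line]} {
        lappend found $line
    }
}
puts $found
```

The word-boundary form is the only one of the three that would also accept a standalone 1 in the middle of a line, so the anchored or prefix forms fit better when the number is always at the start.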