Tcl: Removing the pound sign commented line - tcl

Why can't I remove the pound sign commented line?
#!/usr/bin/tclsh
set lines [list file1.bmp { # file2.bmp} file3.bmp ]
# Now we apply the substitution to get a subst-string that
# will perform the computational parts of the conversion.
set out [regsub -all -line {^\s*#.*$} $lines {}]
puts $out
Output:
file1.bmp { # file2.bmp} file3.bmp
-UPDATE-
Expected output:
file1.bmp {} file3.bmp
{} means empty string.
In fact, this is only my first step. My ultimate goal is to eliminate all comment lines and all empty lines. The question above only turns comment lines into empty lines. For example, if the input is:
set lines [list file1.bmp { # file2.bmp} {} file3.bmp ]
I want my ultimate results to be
file1.bmp file3.bmp
Note: Stack Overflow mistakenly dims everything from the pound (#) sign onwards, treating it as a comment. In Tcl syntax, however, these are not comments.
#Tensibai:
I also want to remove empty lines, so I match any number of spaces before '#' (once everything from the '#' onwards is removed, what is left is an empty line). In my data, a comment always appears as a full line by itself. However, the '#' sign may not be the first character: spaces can precede it on a comment line.

Edit to answer after edit:
#!/usr/bin/tclsh
set lines [list file1.bmp { # file2.bmp } file3.bmp #test ]
puts $lines
# Now we apply the substitution to get a subst-string that
# will perform the computational parts of the conversion.
set out [lsearch -regexp -all -inline -not $lines {^\s*(#.*)?$}]
puts $out
Output:
file1.bmp file3.bmp
You're working on a list. The string representation of a list is plain text, so you can regsub it, but it's a single line.
If you want to check the elements of the list, you have to use list-related commands.
Here lsearch will do what you wish: it checks each item against the regex, -not tells it to return the elements that do not match, and -all -inline makes it return all of those elements rather than the index of the first one.
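As a minimal sketch of the difference (the variable names matches and kept are only illustrative, and the input is the one from the question), the following first shows that the line-anchored pattern finds nothing in the string form of the list, and then tests each element on its own with a plain foreach loop:

```tcl
set lines [list file1.bmp { # file2.bmp} {} file3.bmp]

# The string form of the list is a single line of text, so ^ can only
# anchor in front of "file1.bmp" and the comment pattern never matches.
set matches [regexp -all -line {^\s*#} $lines]
puts $matches   ;# 0

# Testing each element separately drops comments and empty elements.
set kept {}
foreach item $lines {
    if {![regexp {^\s*(#.*)?$} $item]} {
        lappend kept $item
    }
}
puts $kept      ;# file1.bmp file3.bmp
```

The foreach loop does the same job as lsearch -regexp -all -inline -not, just spelled out step by step.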
Old answer:
Why: because your regex matches a pound sign only when it is preceded by nothing but whitespace from the start of the line. Thus it will only match full comment lines and not inline comments.
Have a look at http://regex101.com to test regexes.
A working regex would be:
#!/usr/bin/tclsh
set lines [list file1.bmp { # file2.bmp} file3.bmp ]
# Now we apply the substitution to get a subst-string that
# will perform the computational parts of the conversion.
set out [regsub -all -line {^(.*?)#.*$} $lines {\1}]
puts $out
For the regex (complete details here):
^ Matches start of line
(.*?)# Matches and captures as few characters as possible before the # (the non-greedy operator ? limits the match)
.*$ Matches any number of characters until the end of the line
And we replace with \1 which is the first capture group (and the only one in this case).
Output:
file1.bmp {
This will also remove full line comments but may leave spaces or tabs if there's some before the pound sign and so leave blank lines.

Related

change a number in txt file

I have the next expression: jj_ftfll h\\h\ -0.8898:0.006656 0.998:0.99999 h&j\hhh in a txt file,
and I need to add 0.005 to the 0.006656 number. I want to use Tcl and I can't think of any good idea.
There are several aspects that are tricky:
1. The file needs to be edited in place even though the addition might change the length of the line. (Such an addition could either lengthen or shorten the line.)
2. There needs to be a way of robustly recognising that this is the line to modify, and not some other line in the file. (This is actually the hardest of these problems in reality; it's extremely application-specific.)
3. The number needs to be extracted from the line, modified, and written back.
4. The values you are dealing with are potentially (well, actually) not represented precisely in IEEE binary floating point, which is what Tcl will use to do the calculations.
Bearing all that in mind, these are the sorts of solutions we are talking about, in the same order:
1. We'll read the whole file in, split it into a list of strings, one string per line (henceforth referred to as the lines), update the lines of interest, and then write the whole lot back.
2. We'll use regexp to decide if a line is of interest. That's by far the most common command for this sort of task.
3. Extracting, modifying and writing back the number is messy in Tcl 8.6 and before. It has a much better solution in Tcl 8.7.
4. There's really not all that much you can do about the floating-point imprecision. If you know the range of the numbers, you can use format to help… but it's messy. But maybe you'll get lucky.
set filename "foobar.txt"
# Get the lines of the file; this is GREAT if the file isn't too large
set f [open $filename]
set lines [split [read $f] "\n"]
close $f
# Now THAT'S what I call a horrible regular expression!
set RE {^(jj_ftfll\s+h\\\\h\\\s+-?[\d.]+:)(-?[\d.]+)(\s+-?[\d.]+:-?[\d.]+\s+h&j\\hhh)$}
set newLines {}
foreach line $lines {
    if {[regexp $RE $line -> prefix number suffix]} {
        set line $prefix[expr {$number + 0.005}]$suffix
    }
    lappend newLines $line
}
# Write back over the file; the -nonewline prevents the number of lines from growing
set f [open $filename w]
puts -nonewline $f [join $newLines "\n"]
close $f
The trick with the regexp is that I am matching three pieces: the bit of the line before the part to replace (saved in the variable prefix), the number to replace itself (number), and the bit after the part to replace (suffix); the regexp command returns the number of times it matches (1 if the RE is found, 0 if it isn't). It's a scary RE mostly because it has -?[\d.]+ to match those floating point numbers, and I've changed spaces to \s+ (i.e., “at least one whitespace character”).
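To make the three-piece capture easier to follow, here is a minimal sketch on a much simpler, made-up line (the text "key 1.5:2.25 tail" and the delta 0.25 are assumptions chosen only so that the arithmetic is exact in binary floating point):

```tcl
# A simplified, hypothetical line with the same shape: text, a:b pair, text
set line "key 1.5:2.25 tail"
set RE {^(key\s+-?[\d.]+:)(-?[\d.]+)(\s+tail)$}

# -> is just a variable name that swallows the overall match
if {[regexp $RE $line -> prefix number suffix]} {
    set updated $prefix[expr {$number + 0.25}]$suffix
}
puts $updated   ;# key 1.5:2.5 tail
```

Only the middle capture is changed; the prefix and suffix are glued back on untouched, which is why the rest of the line survives the edit.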
The version for 8.7 is this:
set filename "foobar.txt"
# Get the lines of the file; this is GREAT if the file isn't too large
set f [open $filename]
set lines [split [read $f] "\n"]
close $f
# Now THAT'S what I call a horrible regular expression!
set RE {^(jj_ftfll\s+h\\\\h\\\s+-?[\d.]+:)(-?[\d.]+)(\s+-?[\d.]+:-?[\d.]+\s+h&j\\hhh)$}
# regsub -command appends the whole match first, then each capture group
proc addDeltaInLine {delta wholeMatch prefix number suffix} {
    set number [expr {$number + $delta}]
    return [string cat $prefix $number $suffix]
}
set newLines [lmap line $lines {
    regsub -command $RE $line {addDeltaInLine 0.005}
}]
# Write back over the file; the -nonewline prevents the number of lines from growing
set f [open $filename w]
puts -nonewline $f [join $newLines "\n"]
close $f
The combination of lmap and regsub -command clean things up quite a bit. The RE is still scary though…
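On the floating-point concern mentioned earlier, one possible way to keep the written value tidy is to round it with format. This is only a sketch, and it assumes you know that six decimal places are enough for your data:

```tcl
set number 0.006656
set sum [expr {$number + 0.005}]

# Round to a fixed number of decimals before writing the value back;
# the choice of 6 decimals is an assumption about the data.
set text [format %.6f $sum]
puts $text   ;# 0.011656
```

In either version of the script, the formatted text would take the place of the raw expr result when the line is reassembled.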

To find line index and word index by reading a text file

I have just started learning Tcl, can someone help me how to find line index and word index for a particular word by reading a text file using Tcl.
Thank you
As mentioned in the comments, there are a lot of basic commands you can use to solve your problem. To read a file into a list of lines you could use the open, split, read and close commands as follows:
set file_name "x.txt"
# Open a file in a read mode
set handle [open $file_name r]
# Create a list of lines
set lines [split [read $handle] "\n"]
close $handle
Finding a certain word in a list of lines can be achieved with a for loop, incr and a set of list-related commands like llength, lindex and lsearch. Most strings in Tcl can be interpreted and processed as a list. The implementation might look like this:
# Searching for a word "word"
set needle "word"
set w -1
# For each line (you can use the `foreach` command here)
for {set l 0} {$l < [llength $lines]} {incr l} {
    # Treat a line as a list and search for a word
    if {[set w [lsearch [lindex $lines $l] $needle]] != -1} {
        # Exit the loop if you found the word
        break
    }
}
if {$w != -1} {
    puts "Word '$needle' found. Line index is $l. Word index is $w."
} else {
    puts "Word '$needle' not found."
}
Here, the script iterates over the lines and searches each one for a given word as if it were a list. Executing a list command on a string splits it on whitespace by default. The loop stops when the word is found in a line (when lsearch returns a non-negative index).
Also note that the list commands treat multiple spaces as a single separator. In this case that seems to be the desired behavior. Using the split command on a string with a double space would effectively create a "zero-length word", which might yield an incorrect word index.
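A small sketch of that difference, using a made-up line with a double space after the first word (the variable names are only illustrative). The third variant, which collects runs of non-whitespace with regexp, is an extra option that also copes with lines that are not well-formed Tcl lists:

```tcl
set line "alpha  beta gamma"   ;# two spaces after "alpha"

# Treating the string as a list collapses the double space
set asList [lsearch -exact $line beta]
puts $asList    ;# 1

# split creates an empty element between the two spaces
set asSplit [lsearch -exact [split $line " "] beta]
puts $asSplit   ;# 2

# Collecting runs of non-whitespace gives the same index as the list view
set asWords [lsearch -exact [regexp -all -inline {\S+} $line] beta]
puts $asWords   ;# 1
```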

how to get specific parameters in a square bracket and store it in to a specific variable in tcl

set_dont_use [get_lib_cells */*CKGT*0P*] -power
set_dont_use [get_lib_cells */*CKTT*0P*] -setup
The above is a text file.
I want to store */*CKGT*0P* and */*CKTT*0P* in a variable. This is the program that someone helped me with:
set f [open theScript.tcl]
# Even with 10 million lines, modern computers will chew through it rapidly
set lines [split [read $f] "\n"]
close $f
# This RE will match the sample lines you've told us about; it might need tuning
# for other inputs (and knowing what's best is part of the art of RE writing)
set RE {^set_dont_use \[get_lib_cells ([\w*/]+)\] -\w+$}
foreach line $lines {
    if {[regexp $RE $line -> term]} {
        # At this point, the part you want is assigned to $term
        puts "FOUND: $term"
    }
}
My question is what to do if there is more than one cell, for example:
set_dont_use [get_lib_cells */*CKGT*0P* */*CKOU*TR* /*....] -power
set_dont_use [get_lib_cells */*CKGT*WP* */*CKOU*LR* /*....] -setup
In that case the script above does not help me store these "n" cells in the variable term.
Could anyone help me?
Thanks in advance.
I would go with
proc get_lib_cells args {
    global term
    lappend term {*}$args
}
proc unknown args {}
and then just
source theScript.tcl
in a shell that doesn't have the module you are using loaded, and thus doesn't know any of these non-standard commands.
By setting unknown to do nothing, other commands in the script will just be passed over.
Note that redefining unknown impairs Tcl's ability to automatically load some procedures, so don't keep using that interpreter after this.
Documentation:
global,
lappend,
proc,
unknown,
{*} (syntax)
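One possible way to try this out without damaging the interpreter you are working in is to run the script in a throwaway child interpreter. The sketch below assumes the script text is held in a variable (here called script, with two made-up sample lines) rather than sourced from theScript.tcl, and the variable name child is likewise only illustrative:

```tcl
# Sample input, standing in for the contents of the real script file
set script {
set_dont_use [get_lib_cells */*CKGT*0P* */*CKOU*TR*] -power
set_dont_use [get_lib_cells */*CKGT*WP* */*CKOU*LR*] -setup
}

# Do the redefinitions in a child interpreter so the main one stays intact
set child [interp create]
$child eval {
    set term {}
    proc get_lib_cells args {
        global term
        lappend term {*}$args
    }
    # Silently ignore every command the child does not know
    proc unknown args {}
}
$child eval $script
set term [$child eval {set term}]
interp delete $child

puts $term   ;# */*CKGT*0P* */*CKOU*TR* */*CKGT*WP* */*CKOU*LR*
```

Because the child is deleted afterwards, the redefined unknown never affects the interpreter that keeps running.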
Your code looks like Synopsys syntax, which means it shouldn't work the way you wrote it; I'd expect curly braces:
set_dont_use [get_lib_cells {*/*CKGT*0P* */*CKOU*TR* /*....}] -power
Moreover, \w on its own doesn't catch the * and / characters (see this).
If I were you, I'd go for set RE {^set_dont_use \[get_lib_cells \{?([^\]\}]+)\}?\] -\w+$} and treat the captured text as a list.
Edit:
see this:
% regexp {^set_dont_use \[get_lib_cells \{?(\S+) ?\}?\]} $line -> match
1
% echo $match
*/*CKGT*0P*
If you have more than one item in your line, add another parenthesized group inside the curly braces:
regexp {^set_dont_use \[get_lib_cells \{?(\S+) ?(\S+)?\}?\]} $l -> m1 m2
etc.
Another Edit
Take a look at this, just in case you want multiple matches with the same single pattern; but then, instead of \S+, you should try something that looks like this: [A-Za-z\/\*]
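As a rough sketch of the "capture once, then split" idea for any number of cells, the following first grabs everything between the square brackets and then collects the whitespace-separated items. The sample line is made up, and the bracket-excluding character class is an assumption about what cell patterns may contain:

```tcl
set line {set_dont_use [get_lib_cells */*CKGT*0P* */*CKOU*TR*] -power}

# Capture everything between the brackets (optional curly braces allowed)
set RE {^set_dont_use \[get_lib_cells \{?([^\]\}]*)\}?\] -\w+$}

set term {}
if {[regexp $RE $line -> cells]} {
    # Each run of non-whitespace characters is one cell pattern
    set term [regexp -all -inline {\S+} $cells]
}
puts $term            ;# */*CKGT*0P* */*CKOU*TR*
puts [llength $term]  ;# 2
```

The same pattern also accepts the braced form, get_lib_cells {…}, since the curly braces are optional in the RE.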

regular expression to treat unbalanced braces as a word

I am getting an error message in this regex when line contains unbalanced braces.
set line "a b { c{}"
set lst [regexp -all -inline {^(\s*(\S*)\s*)*(\{(.*)\})?(\s*(\S*)\s*)*$} $line]
set lst [lindex $lst 0]
set firstelement [lindex $lst 0]
How to avoid such cases and treat unbalanced braces as a word?
When you have a string from an arbitrary source (like a user) there's no guarantee at all that it is a well-formed list. Now regexp -inline returns a list of what it matched, but the elements of that list are strings (unless you use the -indices option, of course) and that means that you can't safely use lindex on them to pick out the pieces.
The safe way to get the first “word”, assuming you define “word” to be “sequence of non-whitespace characters” (the usual user definition), is to do this:
set firstWord [lindex [regexp -all -inline {\S+} $item] 0]
It's a bit ugly, but it's totally safe. (In fact, for the first word only, use regexp -inline {\S+} $item on its own, but that won't let you get later words.)
Using split to break a string into words is also possible, but that strongly assumes that the word separator is a single (whitespace-by-default) character and does something that you might not expect if you have multi-whitespace separators, or leading and trailing whitespace. Frankly, it's more useful for dividing up non-whitespace separated strings (e.g., a file into lines, an /etc/passwd record into fields) or for turning a string into the list of its characters (with an empty second argument).
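A short sketch of the difference, using the line from the question (the variable names failed and words are only illustrative): lindex has to parse the whole string as a list and fails on the unbalanced brace, while collecting non-whitespace runs with regexp always works.

```tcl
# Same content as the line in the question; \{ yields a literal open brace
set line "a b \{ c{}"

# The string is not a well-formed list, so lindex raises an error
set failed [catch {lindex $line 0}]
puts $failed              ;# 1

# Splitting on runs of non-whitespace never fails
set words [regexp -all -inline {\S+} $line]
puts [lindex $words 0]    ;# a
puts [llength $words]     ;# 4
```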
The regexp command returns a list. You then take the first element of the list. But in the final line you then treat that element as a list - but it is not guaranteed to be so - hence the actual string content matters. Instead, if you want to deal with this item as a list you need to use split and convert it into words:
% split "a b {" " "
a b \{
In your case:
set lst [lindex $lst 0]
set firstelement [lindex [split $lst " "] 0]
You may also want to look into subst. It looks like you are trying to read poorly specified tcl lists as input and doing some parsing to get them as a proper tcl list. In which case, subst -nocommands [lindex $lst 0] might be more helpful to you. For example:
% lindex [subst -nocommands [lindex $lst 0]] 2
c{}
Note that this is the content of the braced part of $line.

Grep the word inside double quote

How can I extract a word inside a double quote inside a file?
e.g.
variable "xxx"
Reading a text file into Tcl is just this:
set fd [open $filename]
set data [read $fd] ;# Now $data is the entire contents of the file
close $fd
To get the first quoted string (under some assumptions, notably a lack of backslashed double quote characters inside the double quotes), use this:
if {[regexp {"([^""]*)"} $data -> substring]} {
    # We found one, it's now in $substring
}
(Doubling up the quote in the brackets is totally unnecessary — only one is needed — but it does mean that the highlighter does the right thing here.)
The simplest method of finding all the quoted strings is this:
foreach {- substring} [regexp -inline -all {"([^""]*)"} $data] {
    # One of the substrings is $substring at this point
}
Notice that I'm using the same regular expression in each case. Indeed, it's actually good practice to factor such REs (especially if repeatedly used) into a variable of their own so that you can “name” them.
Combining all that stuff above:
set FindQuoted {"([^""]*)"}
set fd [open $filename]
foreach {- substring} [regexp -inline -all $FindQuoted [read $fd]] {
    puts "I have found $substring for you"
}
close $fd
Internal Matching
If you're just looking for a regular expression, then you can use Tcl's capture groups. For example:
set string {variable "xxx"}
regexp {"(.*)"} $string match group1
puts $group1
This will return xxx, discarding the quotes.
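One caveat worth checking against your own data: .* is greedy, so with more than one quoted string on a line it runs from the first quote to the last. A minimal sketch with a made-up two-string input, comparing it with the [^"]* form:

```tcl
set data {variable "xxx" other "yyy"}

# Greedy: captures from the first quote to the last one
regexp {"(.*)"} $data -> greedy
puts $greedy   ;# xxx" other "yyy

# Excluding quotes from the match stops at the first closing quote
regexp {"([^"]*)"} $data -> first
puts $first    ;# xxx

# Collect every quoted string
set all {}
foreach {- s} [regexp -inline -all {"([^"]*)"} $data] {
    lappend all $s
}
puts $all      ;# xxx yyy
```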
External Matching
If you want to match data in a file without having to handle reading the file into Tcl directly, you can do that too. For example:
set match [exec sed {s/^variable "\(...\)"/\1/} /tmp/foo]
This will call sed to find just the parts of the match you want, and assign them to a Tcl variable for further processing. In this example, the match variable is set to xxx as above, but the command operates on an external file rather than a stored string.
When you just want to find with grep all words in quotes in a file and do something with the words, you do something like this (in a shell):
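# /tmp/foo is the example file name used above; substitute your own file
grep -o '"[^"]*"' /tmp/foo | while read -r word
do
    # do something with $word (it still includes the surrounding quotes)
    echo "extracted: $word"
done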