Is there a simple way to parse a line of Tcl into its command and its arguments (not just splitting by whitespace)

Suppose I have a string which is also a Tcl command.
set line {lsort -unique [list a b c a]}
How can I convert this string into a list equivalent to this?
{
{lsort}
{-unique}
{[list a b c a]}
}
Because of whitespace inside the square brackets, I can't just use lindex.
For example:
> lindex $line 2
--> [list
The reason I'm asking is because I have a large Tcl script that I want to parse and re-write. I would like certain lines in the re-written script to have swapped argument order or some numerical arguments scaled by a factor.
I know I could parse the string character by character, keeping track of {}, [], and " characters, but this feels like re-inventing something that might already exist. I've been looking at the info and interp commands but couldn't find anything there.

I used info complete successfully in this proc.
proc command_to_list {command} {
    # split by whitespace
    set words [regexp -all -inline {\S+} $command]
    set spaces [regexp -all -inline {\s+} $command]
    set output_list [list]
    set buffer ""
    foreach word $words space $spaces {
        append buffer $word
        if {[info complete $buffer]} {
            lappend output_list $buffer
            set buffer ""
        } else {
            append buffer $space
        }
    }
    return $output_list
}
This proc will group whitespace separated 'words' until they have no unmatched curlies, double quotes, or square brackets. Whitespace is preserved inside of matching pairs of curlies, double quotes or square brackets.
> set command {foreach {k v} [list k1 v1 k2 v2] {puts "$k $v"}}
> foreach word [command_to_list $command] {puts $word}
foreach
{k v}
[list k1 v1 k2 v2]
{puts "$k $v"}
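For the rewriting task described in the question, the resulting words can be edited and joined back into a command line. The following is only a sketch: scale_numeric_args and the set_delay/get_ports input are hypothetical names used for illustration, and it assumes that every bare numeric word should be scaled.

```tcl
# command_to_list as defined above, repeated so the example runs on its own
proc command_to_list {command} {
    set words [regexp -all -inline {\S+} $command]
    set spaces [regexp -all -inline {\s+} $command]
    set output_list [list]
    set buffer ""
    foreach word $words space $spaces {
        append buffer $word
        if {[info complete $buffer]} {
            lappend output_list $buffer
            set buffer ""
        } else {
            append buffer $space
        }
    }
    return $output_list
}

# Hypothetical helper: multiply every word that is a plain number by $factor
proc scale_numeric_args {command factor} {
    set out [list]
    foreach word [command_to_list $command] {
        if {[string is double -strict $word]} {
            set word [expr {$word * $factor}]
        }
        lappend out $word
    }
    # join, not list, so that brackets and braces are emitted unquoted
    return [join $out " "]
}

set rewritten [scale_numeric_args {set_delay 10 [get_ports {a b}]} 2]
puts $rewritten
```

With this input the script should print set_delay 20 [get_ports {a b}]. The bracketed word is left alone because it is not a number as a whole.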

Related

Using tcl split to solve `list element in braces followed by "]" instead of space` causes `unmatched open brace in list`

(This is from a nagelfar plugin -- it's a tcl analyzer written in tcl, which is why $x contains tcl code.)
In tcl shell:
% set x {proc {::$p} args {[subst { foo }]} }
proc {::$p} args {[subst { foo }]}
%
% lindex $x 3 0
list element in braces followed by "]" instead of space
According to http://forum.egghelp.org/viewtopic.php?t=2603 the solution is to use split, however:
% lindex [split $x] 3 0
unmatched open brace in list
What's the correct way to use lindex on a variable whose content is like the above $x?
The original string $x can be treated like a simple list...
% set x {proc {::$p} args {[subst { foo }]} }
% foreach item $x { puts $item}
proc
::$p
args
[subst { foo }]
...but not as a list of lists. [subst { foo }] is not formatted properly to be a list.
% lindex $x 3 0
Error: list element in braces followed by "]" instead of space
Use error_info for more info. (CMD-013)
A string can be treated like a list in Tcl if it can be separated by whitespace, optionally with grouping by quotes or braces. The last whitespace-separated part of the string [subst { foo }] is }], so the closing brace is immediately followed by ] instead of whitespace. This prevents the first { character from forming a valid group.
For this specific string, you could first get the 4th item from $x and then use split to get the 1st item.
% lindex [split [lindex $x 3]] 0
[subst
In the general case, you could insert a space after every occurrence of }.
% set y [regsub -all "\}" $x "\} "]
proc {::$p} args {[subst { foo } ]}
% lindex $y 3 0
[subst
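To avoid the error in the first place, a value can be tested for list validity before it is indexed. This is a minimal sketch that relies on llength raising an error for malformed lists; is_list is a hypothetical helper name, not a built-in command.

```tcl
# Hypothetical helper: 1 if $value parses as a Tcl list, 0 otherwise
proc is_list {value} {
    expr {![catch {llength $value}]}
}

set x {proc {::$p} args {[subst { foo }]} }
set body [lindex $x 3]

puts [is_list $x]      ;# the outer string is a well-formed list
puts [is_list $body]   ;# the proc body is not
```

This should print 1 and then 0, which matches the behaviour shown above: foreach over $x works, but lindex $x 3 0 does not.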

How to match a string and print the next words after that?

Let's say I have the following file and need to look for .model and print the next two words, up to the (. These are the contents of the file I need to read:
.model Q2N2222 NPN(Is=14.34f Xti=3 Eg=1.11 Vaf=74.03 Bf=255.9 Ne=1.307
Ise=14.34f Ikf=.2847 Xtb=1.5 Br=6.092 Nc=2 Isc=0 Ikr=0 Rc=1
+ Cjc=7.306p Mjc=.3416 Vjc=.75 Fc=.5 Cje=22.01p Mje=.377 Vje=.75
+ Tr=46.91n Tf=411.1p Itf=.6 Vtf=1.7 Xtf=3 Rb=10)
* National pid=19 case=TO18
* 88-09-07 bam creation
*$
.model Q2N3904 NPN(Is=6.734f Xti=3 Eg=1.11 Vaf=74.03 Bf=416.4 Ne=1.259
.model Q2N3906 PNP(Is=1.41f Xti=3 Eg=1.11 Vaf=18.7 Bf=180.7 Ne=1.5 Ise=0
Here is the code I have written so far, but I couldn't get any results. I need help.
proc find_lib_parts {f_name} {
set value [string first ".lib" $f_name]
if {$value != -1} {
#open the file
set fid [ open $f_name "r"]
#read the fid and split it in to lines
set infos [split [read $fid] "\n"]
close $fid
set res {}
append res "MODEL FOUND:\n"
if {[llength $line] > 2 && [lindex $line 0] eq {model}} {
#lappend res [lindex $data 2] \n
lappend res [split $line "("]\n
}
if {[llength $line] > 2 && [lindex $line 0] eq {MODEL}} {
#lappend res [lindex $data 2] \n
lappend res [split $line "("]\n
}
}
return $res
In this case, a regular expression is by far the simplest way of doing such a search. Assuming the words are always on the same line, it's easy:
proc find_lib_parts {f_name} {
    set fid [open $f_name]
    set infos [split [read $fid] "\n"]
    close $fid
    set found {}
    foreach line $infos {
        if {[regexp {\.model\s+(\w+\s+\w+)\(} $line -> twoWords]} {
            lappend found $twoWords
        }
    }
    return $found
}
For your input data sample, that'll produce a result like this:
{Q2N2222 NPN} {Q2N3904 NPN} {Q2N3906 PNP}
If there's nothing to find, you'll get an empty list. (I assume you pass filenames correctly anyway, so I omitted that check.)
The regular expression, which should virtually always be enclosed in {braces} in Tcl, is this:
\.model\s+(\w+\s+\w+)\(
It's relatively simple. The pieces of it are:
\.model — literal “.model” (with an escape of the . because it is a RE metacharacter)
\s+ — some whitespace
( — start a capturing group (the bit we put into the twoWords variable)
\w+ — a “word”, one or more alphanumeric (or underscore) characters
\s+ — some whitespace
\w+ — a “word”, one or more alphanumeric (or underscore) characters
) — end of the capturing group
\( — literal “(”, escaped
The regexp command matches this, returning whether or not it matched (effectively boolean without the -all option, which we're not using here), and assigning the various groups to the variables named afterwards, -> for the whole matched string (yes, that's a legal variable name; I like to use it for regexp variables that dump info I don't want) and twoWords for the interesting substring.
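The code in the question checks for both model and MODEL, so a case-insensitive match may be what is wanted. The following sketch is a variation, not the answer's exact code: it adds -nocase and uses two capturing groups so the part name and type land in separate variables. The upper-case sample line and the variable names name and type are assumptions for illustration.

```tcl
set line {.MODEL Q2N2222 NPN(Is=14.34f Xti=3 Eg=1.11}

# -nocase lets the same pattern match .model and .MODEL
if {[regexp -nocase {\.model\s+(\w+)\s+(\w+)\(} $line -> name type]} {
    puts "name=$name type=$type"
}
```

For this sample line it should print name=Q2N2222 type=NPN.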

Replace identical substrings with a different replacement for each occurrence?

To manipulate strings in Tcl, we use the string command.
If you need to replace a dot with a comma:
set value { 10.00 }
puts [string map -nocase { . , } $value]
# Return: 10,00
We can replace several strings:
set text "This is a replacement test text"
puts [string map -nocase { e E s S a A } $text]
# Returns: ThiS iS A rEplAcEmEnt tESt tExt
Of course, we can replace words:
set text "This is a replacement test text"
puts [string map -nocase {test TEST a {second}} $text]
# Returns: This is second replsecondcement TEST text
So far so good!
But one question remains: how do I replace several identical occurrences in a string, giving a DIFFERENT substitution for each of them?
For example:
set time {10:02:12}
puts [string map -nocase { { : +} {: =} } $time]
I would like this result: 10 + 02 = 12
proc seqmap {str match args} {
    set rc $str
    foreach l [lreverse [regexp -all -indices -inline ***=$match $str]] \
            replacement [lreverse $args] {
        set rc [string replace $rc {*}$l $replacement]
    }
    return $rc
}
seqmap 10:02:12 : { + } { = }
=> 10 + 02 = 12
I'm using lreverse in case the replacement has a different length than the string it replaces. The indices would be off if the replacements were done from left to right.
The ***= prefix tells regexp to treat the rest of the pattern as a literal string, so regular expression metacharacters in the match string get no special treatment.
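The effect of the replacement order can be seen by running both directions on the same input. This is a small sketch using the example values from the question; the variable names ltr and rtl are only for illustration.

```tcl
set str 10:02:12
set indices [regexp -all -indices -inline {:} $str]
set replacements [list " + " " = "]

# Left to right: the first replacement lengthens the string,
# so the second pair of indices no longer points at a colon
set ltr $str
foreach l $indices r $replacements {
    set ltr [string replace $ltr {*}$l $r]
}

# Right to left: earlier indices are unaffected by later replacements
set rtl $str
foreach l [lreverse $indices] r [lreverse $replacements] {
    set rtl [string replace $rtl {*}$l $r]
}

puts "left to right: '$ltr'"
puts "right to left: '$rtl'"
```

The left-to-right pass should give '10 +  = 2:12', because the second replacement overwrites the 0 of 02 instead of the second colon, while the right-to-left pass gives '10 + 02 = 12'.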
Of course, things get a lot more complicated if you want to handle the case where the number of occurrences doesn't match the number of provided substitutions. And even more if you want to replace several different strings.
This version handles the complications mentioned above:
proc seqmap {map str} {
    # Transform the map into a dict with each key containing a list of replacements
    set mapdict {}
    foreach {s r} $map {dict lappend mapdict $s $r}
    # Build a map where each key maps to a unique tag
    # At the same time build a dict that maps our tags to the replacements
    # First map the chosen tag character in case it is present in the string
    set newmap {# #00}
    set mapdict [dict map {s r} $mapdict {
        lappend newmap $s [set s [format #%02d [incr num]]]
        set r
    }]
    # Add the tag character to the dict so it can be mapped back
    dict set mapdict #00 #
    # Map the tags into the string
    set rc [string map $newmap $str]
    # Locate the positions where the tags ended up
    set match [regexp -all -indices -inline {#\d\d} $rc]
    # Create a list of replacements matching the tags
    set replace [lmap l $match {
        # Extract the tag
        set t [string range $rc {*}$l]
        # Obtain a replacement for this tag
        set s [lassign [dict get $mapdict $t] r]
        # Return the used replacement to the end of the list
        dict set mapdict $t [linsert $s end $r]
        # Add the replacement to the list
        set r
    }]
    # Walk the two lists in reverse order, replacing the tags with the selected replacements
    foreach l [lreverse $match] r [lreverse $replace] {
        set rc [string replace $rc {*}$l $r]
    }
    # Done
    return $rc
}
You call it just like you would string map, so with a key-value mapping and the string to perform the replacements on. Any duplicated keys specify the subsequent values to be substituted for each occurrence of the key. When the list is exhausted it starts over from the beginning.
So puts [seqmap {: + : = : *} 10:02:12] => 10+02=12
And puts [seqmap {: + : =} 10:02:12:04:16] => 10+02=12+04=16
As presented, the command can handle up to 99 unique keys. But it can easily be updated if more are needed.

Reading a file containing "[" and manipulating each line in Tcl

I have file with the below lines (file.list):
insert_buffer [get_ports { port }] BUFF1 -new_net net -new_cell cell
I'm reading the file with the below script (read.tcl):
#! /usr/local/bin/tclsh
foreach arg $argv {
set file [open $arg r]
set data [ read $file ]
foreach line [ split $data "\n" ] {
puts $line
set name [lindex $line [expr [lsearch -all $line "-new_cell"]+1]]
puts $name
}
close $file
}
When running the above script (read.tcl file.list) I get an error, because file.list contains "[" and the script thinks it is the beginning of a Tcl command.
list element in braces followed by "]" instead of space
while executing
"lsearch -all $line "-new_cell""
("foreach" body line 5)
invoked from within
"foreach line [ split $data "\n" ] {
How can I read the file correctly and overcome the "[" symbol?
I don't really understand why you are doing what you are doing (processing one Tcl script by another), but you have to make sure that each line is a valid Tcl list before submitting it to lsearch.
lsearch -all [split $line] "-new_cell"
Only split will turn an arbitrary string (containing characters special to Tcl) into a valid Tcl list.
This is one of the few times in Tcl that you need to worry about what type of data you have. $line holds a string. Don't use list commands on strings because there's no guarantee that an arbitrary string is a well-formed list.
Do this:
set fields [split $line]
# don't use "-all" here: you want a single index, not a list of indices.
set idx [lsearch -exact $fields "-new_cell"]
if {$idx == -1} {
    # handle the case where the line has no -new_cell
} else {
    set name [lindex $fields $idx+1]
}
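One caveat with split: it breaks the string at every single whitespace character, so a run of several spaces produces empty list elements and the word after -new_cell may not be at index+1. The sketch below shows the difference on a hypothetical input line with extra spaces, and uses regexp -all -inline {\S+} as one possible alternative for collecting the words; the variable names name_split and name_words are illustrative.

```tcl
# Hypothetical input with runs of spaces between words
set line {insert_buffer [get_ports { port }]   BUFF1 -new_cell    cell}

# split produces an empty element for each extra space
set fields [split $line]
set idx [lsearch -exact $fields "-new_cell"]
set name_split [lindex $fields $idx+1]
puts "split:  '$name_split'"

# collecting runs of non-whitespace avoids the empty elements
set words [regexp -all -inline {\S+} $line]
set idx [lsearch -exact $words "-new_cell"]
set name_words [lindex $words $idx+1]
puts "regexp: '$name_words'"
```

Here the split version should yield an empty string and the regexp version should yield cell.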
In order to apply a list operation on the variable, it has to be a valid list. The variable $line is not a valid list.
It is better to use regexp rather than lsearch
regexp -- {-new_cell\s+(\S+)} $line match value
puts $value
Output :
cell

Replace several lines of commands with a single variable in tcl

I know I have been asking a lot of questions, but I'm still learning Tcl and I haven't found anything similar to this issue anywhere so far. Is it at all possible to replace a set of commands in Tcl with one variable, function0 for example?
I want to be able to replace the following code;
set f [listFromFile $path1]
set f [lsort -unique $f]
set f [lsearch -all -inline $f "test_*"]
set f [regsub -all {,} $f "" ]
set len [llength $f]
set cnt 0
with a variable function0, because this same code appears numerous times within the script. I should mention that it appears both inside and outside procs.
The above code is used together with a loop like this:
while {$cnt < $len} {
    puts [lindex $f $cnt]
    incr cnt
    after 25; #not needed, but for viewing purposes
}
Variables are for storing values. To hide away (encapsulate) some lines of code you need a command procedure, which you define using the proc command.
You wanted to hide away the following lines
set f [listFromFile $path1]
set f [lsort -unique $f]
set f [lsearch -all -inline $f "test_*"]
set f [regsub -all {,} $f "" ]
set len [llength $f]
set cnt 0
to be able to just invoke for instance function0 $path1 and have all those calculations made in one fell swoop. Further, you wanted to use the result of calling the procedure in code like this:
while {$cnt < $len} {
puts [lindex $f $cnt]
# ...
Which means you want function0 to produce three different values, stored in cnt, len, and f. There are several ways to have a command procedure return multiple values, but the cleanest solution here is to make it return a single value; the list that you want to print. The value in len can be calculated from that list with a single command, and the initialization of cnt is better performed outside the command procedure. What you get is this:
proc function0 path {
    set f [listFromFile $path]
    set f [lsort -unique $f]
    set f [lsearch -all -inline $f test_*]
    set f [regsub -all , $f {}]
    return $f
}
which you can use like this:
set f [function0 $path1]
set len [llength $f]
set cnt 0
while {$cnt < $len} {
    puts [lindex $f $cnt]
    incr cnt
    after 25; #not needed, but for viewing purposes
}
or like this:
set f [function0 $path1]
set len [llength $f]
for {set cnt 0} {$cnt < $len} {incr cnt} {
    puts [lindex $f $cnt]
    after 25; #not needed, but for viewing purposes
}
or like this:
set f [function0 $path1]
foreach item $f {
    puts $item
    after 25; #not needed, but for viewing purposes
}
This is why I didn't bother to create a procedure returning three values: you only really needed one.
glenn jackman makes a very good point (or two points, actually) in another answer about the use of regsub. For completeness, I will repeat it here.
Tcl is a bit confusing because it usually allows string operations (like string substitution) on data structures that aren't formally strings. This makes the language very powerful and expressive, but also means that newbies do not always get the kick in the shins that a regular type system would give them.
In this case you created a list structure inside listFromFile by reading a string from a file and then using split on it. From that point on it's a list and you should only perform list operations on it. If you wanted to take out all commas in your data you should either perform that operation on each item in the list, or else perform the operation inside listFromFile, before splitting the text.
String operations on lists will work, but sometimes the result will be garbled, so mixing them should be avoided. The other good point was that in this case string map is preferable to regsub, if nothing else it makes the code a bit clearer.
Documentation: for, foreach, lindex, llength, lsearch, lsort, proc, puts, regsub, set, split, string, while
(more of a comment than an answer, but I want the formatting)
One thing to be aware of: $f holds a list, then you use the string command regsub on it, then you treat the result of regsub as a list again.
Use list commands with list values. I'd replace the regsub command with
set f [lmap elem $f {string map {"," ""} $elem} ]
For Tcl version 8.5 or earlier, which lacks lmap, you could do this:
for {set i 0} {$i < [llength $f]} {incr i} {
    lset f $i [string map {, ""} [lindex $f $i]]
}
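A small sketch of how the two approaches can differ. The input list is a contrived assumption, chosen so that one element consists only of a comma, and the variable names garbled and clean are illustrative. lmap requires Tcl 8.6.

```tcl
set f [list "a b" ","]

# String operation on the whole list: the second element vanishes
set garbled [regsub -all {,} $f ""]
puts [llength $garbled]

# List operation on each element: both elements survive (Tcl 8.6+ for lmap)
set clean [lmap elem $f {string map {"," ""} $elem}]
puts [llength $clean]
```

This should print 1 and then 2: after regsub the list has silently lost an element, while the per-element version keeps an empty string in its place.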