Replace identical substrings with a different substitution for each occurrence? - tcl

To manipulate strings in Tcl, we use the string command.
If you need to replace a dot with a comma:
set value { 10.00 }
puts [string map -nocase { . , } $value]
# Return: 10,00
We can replace several strings:
set text "This is a replacement test text"
puts [string map -nocase { e E s S a A } $text]
# Returns: ThiS iS A rEplAcEmEnt tESt tExt
Of course, we can replace words:
set text "This is a replacement test text"
puts [string map -nocase {test TEST { a } { the second }} $text]
# Returns: This is the second replacement TEST text
So far so good!
But one question remains: how do I replace several identical occurrences in a string, giving a DIFFERENT substitution for each of them?
For example:
set time {10:02:12}
puts [string map -nocase { { : +} {: =} } $time]
I would like this result: 10 + 02 = 12

proc seqmap {str match args} {
    set rc $str
    foreach l [lreverse [regexp -all -indices -inline ***=$match $str]] \
            replacement [lreverse $args] {
        set rc [string replace $rc {*}$l $replacement]
    }
    return $rc
}
seqmap 10:02:12 : { + } { = }
=> 10 + 02 = 12
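I'm using lreverse in case a replacement has a different length than the string it replaces. Replacing from right to left keeps the indices of the earlier matches valid; they would be off if the replacements were done from left to right.
The ***= prefix makes regexp treat the rest of the pattern as a literal string, so regular expression metacharacters in the match string get no special treatment.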
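The effect of the ***= prefix is easiest to see with a match string that is itself a regular expression metacharacter. A minimal sketch (the sample string a.b.c is made up for illustration):

```tcl
# Without the prefix, "." is a metacharacter and matches every character.
set plain [regexp -all -indices -inline {.} a.b.c]
puts [llength $plain]   ;# 5 matches, one per character

# With ***=, the rest of the pattern is taken literally, so only the dots match.
set literal [regexp -all -indices -inline {***=.} a.b.c]
puts $literal           ;# {1 1} {3 3}
```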
Of course, things get a lot more complicated if you want to handle the case where the number of occurrences doesn't match the number of provided substitutions. And even more if you want to replace several different strings.
This version handles the complications mentioned above:
proc seqmap {map str} {
    # Transform the map into a dict with each key containing a list of replacements
    set mapdict {}
    foreach {s r} $map {dict lappend mapdict $s $r}
    # Build a map where each key maps to a unique tag
    # At the same time build a dict that maps our tags to the replacements
    # First map the chosen tag character in case it is present in the string
    set newmap {# #00}
    set mapdict [dict map {s r} $mapdict {
        lappend newmap $s [set s [format #%02d [incr num]]]
        set r
    }]
    # Add the tag character to the dict so it can be mapped back
    dict set mapdict #00 #
    # Map the tags into the string
    set rc [string map $newmap $str]
    # Locate the positions where the tags ended up
    set match [regexp -all -indices -inline {#\d\d} $rc]
    # Create a list of replacements matching the tags
    set replace [lmap l $match {
        # Extract the tag
        set t [string range $rc {*}$l]
        # Obtain a replacement for this tag
        set s [lassign [dict get $mapdict $t] r]
        # Return the used replacement to the end of the list
        dict set mapdict $t [linsert $s end $r]
        # Add the replacement to the list
        set r
    }]
    # Walk the two lists in reverse order, replacing the tags with the selected replacements
    foreach l [lreverse $match] r [lreverse $replace] {
        set rc [string replace $rc {*}$l $r]
    }
    # Done
    return $rc
}
You call it just like you would string map, so with a key-value mapping and the string to perform the replacements on. Any duplicated keys specify the subsequent values to be substituted for each occurrence of the key. When the list is exhausted it starts over from the beginning.
So puts [seqmap {: + : = : *} 10:02:12] => 10+02=12
And puts [seqmap {: + : =} 10:02:12:04:16] => 10+02=12+04=16
As presented, the command can handle up to 99 unique keys. But it can easily be updated if more are needed.
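For the simpler case of a single match string, the same cycling behaviour can also be sketched with a left-to-right scan that builds a new string instead of editing in place, so no index adjustment is needed. This is an alternative illustration, not the method used above; the name cyclemap and its argument order are made up here:

```tcl
proc cyclemap {str match replacements} {
    set n [llength $replacements]
    set mlen [string length $match]
    # Nothing to do for an empty match string or an empty replacement list
    if {$n == 0 || $mlen == 0} {
        return $str
    }
    set out ""
    set pos 0
    set i 0
    while {[set idx [string first $match $str $pos]] >= 0} {
        # Copy the text before the match, then the next replacement in the cycle
        append out [string range $str $pos [expr {$idx - 1}]]
        append out [lindex $replacements [expr {$i % $n}]]
        incr i
        set pos [expr {$idx + $mlen}]
    }
    # Copy whatever follows the last match
    append out [string range $str $pos end]
    return $out
}

puts [cyclemap 10:02:12 : {{ + } { = }}]   ;# 10 + 02 = 12
puts [cyclemap 10:02:12:04:16 : {+ =}]     ;# 10+02=12+04=16
```

Unlike the tag-based version, this sketch handles only one key, but it needs no reserved tag character.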

Related

Is there a simple way to parse a line of Tcl into its command and its arguments (not just splitting by whitespace)

Suppose I have a string which is also a Tcl command.
set line {lsort -unique [list a b c a]}
How can I convert this string into a list equivalent to this?
{
{lsort}
{-unique}
{[list a b c a]}
}
Because of whitespace inside the square brackets, I can't just use lindex.
For example:
> lindex $line 2
--> [list
The reason I'm asking is because I have a large Tcl script that I want to parse and re-write. I would like certain lines in the re-written script to have swapped argument order or some numerical arguments scaled by a factor.
I know I could parse the string character by character, keeping track of {}, [], and " characters, but this feels like re-inventing something that might already exist. I've been looking at the info and interp commands but couldn't find anything there.
I used info complete successfully in this proc.
proc command_to_list {command} {
    # split by whitespace
    set words [regexp -all -inline {\S+} $command]
    set spaces [regexp -all -inline {\s+} $command]
    set output_list [list]
    set buffer ""
    foreach word $words space $spaces {
        append buffer $word
        if {[info complete $buffer]} {
            lappend output_list $buffer
            set buffer ""
        } else {
            append buffer $space
        }
    }
    return $output_list
}
This proc will group whitespace separated 'words' until they have no unmatched curlies, double quotes, or square brackets. Whitespace is preserved inside of matching pairs of curlies, double quotes or square brackets.
> set command {foreach {k v} [list k1 v1 k2 v2] {puts "$k $v"}}
> foreach word [command_to_list $command] {puts $word}
foreach
{k v}
[list k1 v1 k2 v2]
{puts "$k $v"}
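The grouping relies on info complete, which reports whether a string has no unclosed braces, double quotes, or square brackets. A small sketch of that behaviour on fragments of the original example (the variable names are illustrative):

```tcl
# An open square bracket makes the fragment incomplete
set partial {lsort -unique [list}
puts [info complete $partial]    ;# 0

# Once the bracket is closed, the fragment is complete
set whole {lsort -unique [list a b c a]}
puts [info complete $whole]      ;# 1

# An unterminated double quote is also reported as incomplete
set quoted {puts "hello}
puts [info complete $quoted]     ;# 0
```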

How to match a string and print the next word after that?

Let's say I have the following file and have to look for .model and print the next two words before the (. The following is the contents of the file that I need to read.
.model Q2N2222 NPN(Is=14.34f Xti=3 Eg=1.11 Vaf=74.03 Bf=255.9 Ne=1.307
Ise=14.34f Ikf=.2847 Xtb=1.5 Br=6.092 Nc=2 Isc=0 Ikr=0 Rc=1
+ Cjc=7.306p Mjc=.3416 Vjc=.75 Fc=.5 Cje=22.01p Mje=.377 Vje=.75
+ Tr=46.91n Tf=411.1p Itf=.6 Vtf=1.7 Xtf=3 Rb=10)
* National pid=19 case=TO18
* 88-09-07 bam creation
*$
.model Q2N3904 NPN(Is=6.734f Xti=3 Eg=1.11 Vaf=74.03 Bf=416.4 Ne=1.259
.model Q2N3906 PNP(Is=1.41f Xti=3 Eg=1.11 Vaf=18.7 Bf=180.7 Ne=1.5 Ise=0
Here is the code I have written so far, but I couldn't get any results. I need some help.
proc find_lib_parts {f_name} {
set value [string first ".lib" $f_name]
if {$value != -1} {
#open the file
set fid [ open $f_name "r"]
#read the fid and split it in to lines
set infos [split [read $fid] "\n"]
close $fid
set res {}
append res "MODEL FOUND:\n"
if {[llength $line] > 2 && [lindex $line 0] eq {model}} {
#lappend res [lindex $data 2] \n
lappend res [split $line "("]\n
}
if {[llength $line] > 2 && [lindex $line 0] eq {MODEL}} {
#lappend res [lindex $data 2] \n
lappend res [split $line "("]\n
}
}
return $res
In this case, a regular expression is by far the simplest way of doing such a search. Assuming the words are always on the same line, it's easy:
proc find_lib_parts {f_name} {
    set fid [open $f_name]
    set infos [split [read $fid] "\n"]
    close $fid
    set found {}
    foreach line $infos {
        if {[regexp {\.model\s+(\w+\s+\w+)\(} $line -> twoWords]} {
            lappend found $twoWords
        }
    }
    return $found
}
For your input data sample, that'll produce a result like this:
{Q2N2222 NPN} {Q2N3904 NPN} {Q2N3906 PNP}
If there's nothing to find, you'll get an empty list. (I assume you pass filenames correctly anyway, so I omitted that check.)
The regular expression, which should virtually always be enclosed in {braces} in Tcl, is this:
\.model\s+(\w+\s+\w+)\(
It's relatively simple. The pieces of it are:
\.model — literal “.model” (with an escape of the . because it is a RE metacharacter)
\s+ — some whitespace
( — start a capturing group (the bit we put into the twoWords variable)
\w+ — a “word”, one or more alphanumeric (or underscore) characters
\s+ — some whitespace
\w+ — a “word”, one or more alphanumeric (or underscore) characters
) — end of the capturing group
\( — literal “(”, escaped
The regexp command matches this, returning whether or not it matched (effectively boolean without the -all option, which we're not using here), and assigning the various groups to the variables named afterwards, -> for the whole matched string (yes, that's a legal variable name; I like to use it for regexp variables that dump info I don't want) and twoWords for the interesting substring.
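The asker's attempt also checked for an upper-case MODEL. If the files may use either case, the -nocase option of regexp covers both. A minimal sketch on a single hard-coded line (the upper-case sample line is adapted from the question's data, not taken from it):

```tcl
set line {.MODEL Q2N3904 NPN(Is=6.734f Xti=3 Eg=1.11}
if {[regexp -nocase {\.model\s+(\w+\s+\w+)\(} $line -> twoWords]} {
    # Split the captured pair into its two words
    lassign $twoWords name type
    puts "$name $type"   ;# Q2N3904 NPN
}
```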

How can I get the unmatched part of a string using Tcl?

I am comparing two strings. How can I get the parts that do not match between them?
This is an interesting problem that requires a longest common subsequence algorithm. Tcl's got one of those already in Tcllib, but it's for lists. Fortunately, we can convert a string into a list of characters with split:
package require struct::list
set a "the quick brown fox"
set b "the slow green fox"
set listA [split $a ""]; set lenA [llength $listA]
set listB [split $b ""]; set lenB [llength $listB]
set correspondences [struct::list longestCommonSubsequence $listA $listB]
set differences [struct::list lcsInvertMerge $correspondences $lenA $lenB]
Now we can get the parts that didn't match up by picking the parts from the differences that are added, changed or deleted:
set common {}
set unmatchedA {}
set unmatchedB {}
foreach diff $differences {
    lassign $diff type rangeA rangeB
    switch $type {
        unchanged {
            lappend common [join [lrange $listA {*}$rangeA] ""]
        }
        added {
            lappend unmatchedB [join [lrange $listB {*}$rangeB] ""]
        }
        changed {
            lappend unmatchedA [join [lrange $listA {*}$rangeA] ""]
            lappend unmatchedB [join [lrange $listB {*}$rangeB] ""]
        }
        deleted {
            lappend unmatchedA [join [lrange $listA {*}$rangeA] ""]
        }
    }
}
puts common->$common
# common->{the } ow {n fox}
puts A->$unmatchedA
# A->{quick br}
puts B->$unmatchedB
# B->sl { gree}
In this case, we see the following correspondences (. is a spacer I've inserted to help line things up):
the quick br..ow.....n fox
the ........slow green fox
Whether this is exactly what you want, I don't know (and there's more detail in the computed differences; they're just a bit hard to read). You can easily switch to doing a word-by-word correspondence instead if that's more to your taste. It's pretty much just removing the split and join…
If you have a string and you want to remove a fixed substring, for example
set str "this is a larger? string"
set substr "a larger?"
Then you can do this:
set parts [split [string map [list $substr \uffff] $str] \uffff]
# returns the list: {this is } { string}
That globally replaces the substring within the larger string with a single character, then splits the result on that same character.
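This trick assumes that the placeholder character \uffff never occurs in the input. A hedged sketch that wraps it in a proc and checks that assumption first (the name split_on_substring is made up for illustration):

```tcl
proc split_on_substring {str substr} {
    set sentinel \uffff
    # The trick only works if the sentinel is not already part of the input
    if {[string first $sentinel $str] >= 0} {
        error "input already contains the sentinel character"
    }
    return [split [string map [list $substr $sentinel] $str] $sentinel]
}

set parts [split_on_substring "this is a larger? string" "a larger?"]
puts [llength $parts]    ;# 2
puts [lindex $parts 0]   ;# "this is " (with a trailing space)
```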

How to find ',' in a string in TCL

I am new to Tcl and would like to know how to search for "," in a string and get the substrings before and after it.
Example: tampa,florida
If the string contains a ",", it should return tampa and florida. I could use string replace, but that will not work in my case because I need to map tampa and florida to different variables, and I don't know in advance what the inbound string will look like, so I can't use string range with fixed indices.
Unless there is some further condition, you could do it this way:
split tampa,florida ,
This command gives as result a list containing the two strings "tampa" and "florida".
Documentation: split
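Since the goal is to put the two parts into separate variables, the list returned by split can be unpacked with lassign (Tcl 8.5 or later). A short sketch; the variable names city and state are illustrative:

```tcl
set input tampa,florida
# split returns a two-element list; lassign stores each element in its own variable
lassign [split $input ,] city state
puts $city    ;# tampa
puts $state   ;# florida
```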
The shortest piece of code to do this would be using regular expressions:
if {[regexp {(.+),(.+)} $string a b c]} {
    # $a is the complete match. But we don't care
    # about that so we ignore it
    puts $b; #tampa
    puts $c; #florida
}
The regular expression (.+),(.+) means:
(
. any character
+ one or more of the above
) save it in a capture group
, comma character
(
. any character
+ one or more of the above
) save it in a capture group
See the documentation of regular expression syntax in tcl for more about regular expressions: https://www.tcl.tk/man/tcl8.6/TclCmd/re_syntax.htm
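One thing to be aware of: .+ is greedy, so if the input has more than one comma, the split happens at the last one. A sketch with a made-up three-part input, plus an alternative pattern that splits at the first comma instead:

```tcl
set string tampa,florida,usa

# Greedy: the first group grabs as much as it can
regexp {(.+),(.+)} $string -> b c
puts $b   ;# tampa,florida
puts $c   ;# usa

# [^,]* cannot cross a comma, so this splits at the first one
regexp {^([^,]*),(.*)$} $string -> first rest
puts $first   ;# tampa
puts $rest    ;# florida,usa
```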
But if you're not familiar with regular expressions and want an alternative way of doing this you can use the various string commands. This is one way to do it:
set comma_location [string first "," $string]
if {$comma_location > -1} {
    set a [string range $string 0 [expr {$comma_location - 1}]]
    set b [string range $string [expr {$comma_location + 1}] end]
    puts $a; #tampa
    puts $b; #florida
}
A variant of slebetman's last answer.
proc before_after {value find {start 0}} {
    set index [string first $find $value $start]
    # Brace the expressions so they are compiled and not substituted twice
    set left_side [string range $value $start [expr {$index - 1}]]
    set right_side [string range $value [expr {$index + 1}] end]
    return [list $left_side $right_side]
}
puts [before_after "tampa,fl" ","]
output:
tampa fl
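The variant above assumes that the separator is present and is one character long. A hedged sketch of a version that returns the whole value and an empty string when the separator is missing, and that skips the full length of a longer separator (the name before_after_safe is made up):

```tcl
proc before_after_safe {value find} {
    set index [string first $find $value]
    if {$index < 0} {
        # Separator not found: everything counts as the left side
        return [list $value ""]
    }
    set left_side [string range $value 0 [expr {$index - 1}]]
    set right_side [string range $value [expr {$index + [string length $find]}] end]
    return [list $left_side $right_side]
}

puts [before_after_safe "tampa,fl" ","]     ;# tampa fl
puts [before_after_safe "tampa" ","]        ;# tampa {}
puts [before_after_safe "tampa::fl" "::"]   ;# tampa fl
```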

TCL string match from file

I am trying to find a string in a file in TCL. Using the wish console, I get a successful match between two strings. When I read a string from a file and match it to its exact copy, it fails. I can see in Eclipse that the variables contain exactly the same string...that is unless there are invisible characters trailing. The following code never returns 1, even when the variables contain exactly the same strings.
set fileId [open $::InputFile "r"]
set file_data [read $fileId]
# Process data file
set data [split $file_data "\n"]
#search for string
foreach line $data {
set x $::StringToFind
set y $line
set z [string match x y]
puts $z
if [ string match $::StringToFind line ] {
return 1
}
}
You need to use the dollar sign on the line variable to get its value (the same applies to x and y in the set z line, which compares the literal strings "x" and "y"):
if [ string match $::StringToFind $line ] {
Also, it is good practice to brace the condition of the if command:
if {[string match $::StringToFind $line]} {
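Two further things can make an apparently identical line fail to match. First, string match treats its first argument as a glob pattern, so *, ?, [ and \ are special. Second, depending on the channel's translation settings, lines read from a file with Windows line endings can keep a trailing carriage return. A sketch of an exact comparison that trims trailing whitespace first; the proc name line_equals and the sample strings are hypothetical:

```tcl
proc line_equals {wanted line} {
    # Drop trailing whitespace such as a stray \r, then compare literally
    return [string equal $wanted [string trimright $line]]
}

# Literal comparison succeeds even with a trailing carriage return
puts [line_equals {value[1]} "value\[1\]\r"]   ;# 1

# As a glob pattern, [1] is a character class, so the same text does not match itself
puts [string match {value[1]} {value[1]}]      ;# 0
```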