Properly manipulating strings in TCL - tcl

Seems like a basic task, but I want to take a string that ends in something and replace the last 3 letters with something else. Here's my attempt:
set v "this is bob"
set index2 [string length $v]
set index1 [expr $index2 - 3]
set result [string replace $v index1 index2 dog]
puts $result; # I want it to now say "this is dog"
EDIT: Forgot to mention, the error message it gives me is:
bad index "index1": must be integer?[+-]integer? or end?[+-]integer?
while executing
"string replace $v index1 index2 .hdr"
invoked from within
"set result [string replace $v index1 index2 .hdr]"
(file "string_manipulation.tcl" line 7)

Here is one way to do it. Tcl recognizes tokens such as end, end-1, end-2, ... What you want is to replace from end-2 to end:
set v "this is bob"
set result [string replace $v end-2 end "dog"]
# Result now is "this is dog"
Update
If all you want to do is replace the file extension, then use file rootname to remove the extension and append your own:
set v filename.arm
set result [file rootname $v].hdr; # Replaces .arm with .hdr
This solution has the advantage of working with extensions of various lengths, not just 3.
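As a quick check of that claim, here is a small sketch that runs file rootname over a few made-up names (the names are only for illustration and are not from the question). Note that only the last extension is stripped, and a name with no extension is left alone:

```tcl
# Hypothetical file names, chosen only to show different extension lengths.
set results {}
foreach name {filename.arm archive.tar.gz notes} {
    # file rootname strips only the final extension, if there is one
    lappend results "[file rootname $name].hdr"
}
puts $results
# prints: filename.hdr archive.tar.hdr notes.hdr
```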

Easy, you forgot you were in TCL right near the end. Instead of passing in the values of index1 and index2, you passed in the literal strings "index1" and "index2". So just add the missing dollar signs:
set result [string replace $v $index1 $index2 dog]
Voila! :-)
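One more detail worth knowing: the last index given to string replace is inclusive, so the final character sits at length minus one. The original arithmetic still happens to work, because an index past the end is treated as end, but a sketch with the indices computed exactly might look like this (the variable names first and last are mine):

```tcl
set v "this is bob"
# Index of the final character; string indices start at 0.
set last [expr {[string length $v] - 1}]
# Two positions earlier gives a three-character range.
set first [expr {$last - 2}]
set result [string replace $v $first $last dog]
puts $result
# prints: this is dog
```

Bracing the expr argument, as above, is the usual recommendation: it avoids a second round of substitution and lets Tcl compile the expression.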

Related

TCL regsub uses RegEx match as index in associative array

I'd like to automatically convert URLs, e.g.
"https://sc-uat.ct.example.com/sc/" into "https://invbeta.example.com/sc/"
"https://sc-dev.ct.example.com/sc/" into "https://invtest.example.com/sc/"
"https://sc-qa.ct.example.com/sc/" into "https://invdemo.example.com/sc/"
I've tried following code snippet in TCL
set loc "https://sc-uat.ct.example.com/sc/"
set envs(dev) "test"
set envs(uat) "beta"
set envs(qa) "demo"
puts $envs(uat)
regsub -nocase {://.+-(.+).ct.example.com} $loc {://inv[$envs(\1)].example.com} hostname
puts "new location = $hostname"
But the result is: new location = https://inv[$envs(uat)].example.com/sc/
It seems that [$envs(uat)] is NOT evaluated and substituted further with the real value. Any hints will be appreciated. Thanks in advance
You wrote: "But the result is: new location = https://inv[$envs(uat)].example.com/sc/ It seems that [$envs(uat)] is eval-ed further."
I take it you meant to say that [$envs(uat)] is not evaluated further?
This is because of the curly braces in {://inv[$envs(\1)].example.com}: the replacement string is taken literally and is not subjected to variable or command substitution. Besides, you don't want both command and variable substitution ([$envs(\1)]), just one of them: $envs(\1) or [set envs(\1)].
To overcome this, you must treat the regsub-processed string further via subst:
set hostname [subst -nocommands -nobackslashes [regsub -nocase {://.+-(.+).ct.example.com} $loc {://inv$envs(\1).example.com}]]
Suggestions for improvement
I advise avoiding subst in this context, because even when restricted, you might run into conflicts with characters special to Tcl in your hostnames (e.g., brackets in IPv6 authority parts). Either sanitize the loc string beforehand or, better, work on string ranges like so:
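# The outer group covers everything to be replaced ("sc-uat.ct"); the inner group is the lookup key ("uat").
if {[regexp -indices {://(.+-(.+)\.ct)\.example\.com} $loc _ replaceRange keyRange]} {
    set key [string range $loc {*}$keyRange]
    set sub [string cat "inv" $envs($key)]
    set hostname [string replace $loc {*}$replaceRange $sub]
}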
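Another way to stay clear of subst is to capture the pieces with regexp and look the key up in a table. The following is only a sketch: the proc name rewriteHost, the dict argument, and the decision to return the input unchanged for unknown environments are my assumptions, not part of the question.

```tcl
# Hypothetical helper: rewrite "sc-<env>.ct.example.com" to "inv<mapped>.example.com".
proc rewriteHost {loc envMap} {
    # Capture scheme, environment key, domain, and the rest of the URL.
    set re {^(\w+://)[^/]*-([^./]+)\.ct\.(example\.com)(.*)$}
    if {![regexp -nocase $re $loc -> scheme key domain rest]} {
        return $loc   ;# not a URL of the expected shape
    }
    set key [string tolower $key]
    if {![dict exists $envMap $key]} {
        return $loc   ;# unknown environment: leave the URL alone
    }
    return "${scheme}inv[dict get $envMap $key].$domain$rest"
}

set envMap {dev test uat beta qa demo}
puts [rewriteHost "https://sc-uat.ct.example.com/sc/" $envMap]
# prints: https://invbeta.example.com/sc/
```

Because the replacement text is assembled with ordinary string operations, nothing in the URL is ever re-parsed as Tcl code.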

How to split string by numerics

I have tried to split the string, but it still fails.
set strdata "34a64323R6662w0332665323020346t534r66662v43037333444533053534a64323R6662w0332665323020346t534r66662v430373334445330535"
puts [split $strdata "3334445330535"] ;#<---- this command does not work
The result needed as below:
{34a64323R6662w0332665323020346t534r66662v43037} {34a64323R6662w0332665323020346t534r66662v43037}
The split command's optional second argument is interpreted as a set of characters to split on, so it really isn't going to do what you want. However, there are other approaches. One of the simpler methods of doing what you want is to use string map to convert the character sequence into a character that isn't in the input data (Unicode is full of those!) and then split on that:
set strdata "34a64323R6662w0332665323020346t534r66662v43037333444533053534a64323R6662w0332665323020346t534r66662v430373334445330535"
set splitterm "3334445330535"
set items [split [string map [list $splitterm "\uFFFF"] $strdata] "\uFFFF"]
foreach i $items {
    puts "==> $i"
}
# ==> 34a64323R6662w0332665323020346t534r66662v43037
# ==> 34a64323R6662w0332665323020346t534r66662v43037
# ==>      (an empty string: the third list element)
Note that the list ends with an empty-string element (shown as {} when the list itself is printed) because that's the string that came after the last split element. If you don't want that, add a string trimright between the string map and the split:
# Doing this in steps because the line is a bit long otherwise
set mapped [string map [list $splitterm "\uFFFF"] $strdata]
set trimmed [string trimright $mapped "\uFFFF"]
set items [split $trimmed "\uFFFF"]
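If you would rather not rely on a sentinel character being absent from the data, a loop over string first does the same job. This is a sketch under my own naming (splitOnSubstring is not a built-in command); like the string map version, it keeps the empty element after a trailing separator.

```tcl
# Hypothetical helper: split str on every occurrence of the substring sep.
proc splitOnSubstring {str sep} {
    if {$sep eq ""} {
        error "separator must not be empty"
    }
    set result {}
    set start 0
    set sepLen [string length $sep]
    # string first returns -1 once no further occurrence is found.
    while {[set idx [string first $sep $str $start]] >= 0} {
        lappend result [string range $str $start [expr {$idx - 1}]]
        set start [expr {$idx + $sepLen}]
    }
    # Whatever follows the last separator (possibly an empty string).
    lappend result [string range $str $start end]
    return $result
}

puts [splitOnSubstring "a--b--" "--"]
# prints: a b {}
```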
The split command doesn't work like that, see the documentation.
Try making the data string into a list like this:
regsub -all 3334445330535 $strdata " "
i.e. replacing the delimiter with a space.
Documentation: regsub, split
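To make the difference concrete, here is a short sketch with made-up data (the strings are mine, not the asker's). The first part shows split treating its second argument as a set of characters; the second shows the regsub idea, which only yields a clean list when the data contains no whitespace, braces, or other characters special to list parsing.

```tcl
# split: every character of ".," is a separator in its own right.
set parts [split "a.b,c" ".,"]
puts $parts
# prints: a b c

# regsub: turn the multi-character delimiter into a space, then treat
# the result as a list (assumes the data has no list-special characters).
set strdata "11AB22AB33"
set items [regsub -all {AB} $strdata " "]
puts [llength $items]
# prints: 3
```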

Remove prefix substring from string

I have a string abc.def.ghi.j and I want to remove abc. from that, so that I have def.ghi.j.
1) What would be the best approach to remove such a prefix which has a specific pattern?
2) Since in this case, abc is coincidentally the prefix, that probably makes things easier. What if we wanted abc.ghi.j as the output?
I tried it with the split method like this
set name abc.def.ghi.j
set splitVar [split $name {{abc.}} ]
The problem is that it splits on each of a, b, c and . separately instead of on abc. as a whole.
Well, there are a few ways, but the main ones are using string replace, regsub, string map, or split-lreplace-join.
We probably ought to be a bit careful because we must first check if the prefix really is a prefix. Fortunately, string equal has a -length option that makes that easy:
if {[string equal -length [string length $prefix] $prefix $string]} {
    # Do the replacement
}
Personally, I'd probably use regsub but then I'm happy with using RE engine tricks.
Using string replace
set string [string replace $string 0 [string length $prefix]-1]
# Older versions require this instead:
# set string [string replace $string 0 [expr {[string length $prefix]-1}]]
Using regsub
# ***= is magical and says "rest of RE is simple plain text, no escapes"
regsub ***=$prefix $string "" string
Using string map
# Requires cunning to anchor to the front; \uffff is unlikely in any real string
set string [string map [list \uffff$prefix ""] \uffff$string]
Using split…join
This is about what you were trying to do. It depends on the . being a sort of separator.
set string [join [lrange [split $string "."] 1 end] "."]
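Putting the prefix check and the removal together, a small helper could look like the sketch below. The proc name removePrefix is my own, and I use string range rather than string replace simply because it reads a little more directly. The last lines are one possible take on the second part of the question (getting abc.ghi.j), assuming the dots really are separators.

```tcl
# Hypothetical helper: strip prefix from str only if str really starts with it.
proc removePrefix {str prefix} {
    set n [string length $prefix]
    if {$n > 0 && [string equal -length $n $prefix $str]} {
        return [string range $str $n end]
    }
    return $str
}

puts [removePrefix abc.def.ghi.j abc.]
# prints: def.ghi.j
puts [removePrefix xyz.def abc.]
# prints: xyz.def

# Dropping the second dot-separated field instead of the first:
set name abc.def.ghi.j
set withoutSecond [join [lreplace [split $name "."] 1 1] "."]
puts $withoutSecond
# prints: abc.ghi.j
```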

How to find and replace the second occurrence of a string using regsub

I am new to Tcl and trying to learn, and I need help with the following.
My string is stored in ConfigFileBuf (shown below). I am trying to replace the second occurrence of the pattern, the one containing <ConfENB:local-udp-port>31001, with XYZ, but the regsub command I tried always replaces the first occurrence (37896). How can I replace the second occurrence with XYZ?
set ConfigFileBuf "<ConfENB:virtual-phy>
</ConfENB:local-ip-addr>
<ConfENB:local-udp-port>37896</ConfENB:local-udp-port>
</ConfENB:local-ip-addr>
<ConfENB:local-udp-port>31001</ConfENB:local-udp-port>
</ConfENB:virtual-phy>"
regsub -start 1 "</ConfENB:local-ip-addr>\[ \n\t\]+<ConfENB:local-udp-port>\[0-9 \]+</ConfENB:local-udp-port>" $ConfigFileBuf "XYZ" ConfigFileBuf
puts $ConfigFileBuf
You have to use regexp -indices to find where to start the replacement, and only then regsub. It's not too bad if you put the regular expression in its own variable.
set RE "</ConfENB:local-ip-addr>\[ \n\t\]+<ConfENB:local-udp-port>\[0-9 \]+</ConfENB:local-udp-port>"
set start [lindex [regexp -all -indices -inline $RE $ConfigFileBuf] 1 0]
regsub -start $start $RE $ConfigFileBuf "XYZ" ConfigFileBuf
The 1 is the number of submatches in the RE (zero in this case) plus 1. You can compute it with the help of regexp -about, giving this piece of trickiness:
set RE "</ConfENB:local-ip-addr>\[ \n\t\]+<ConfENB:local-udp-port>\[0-9 \]+</ConfENB:local-udp-port>"
set relen [expr {1 + [lindex [regexp -about $RE] 0]}]
set start [lindex [regexp -all -indices -inline $RE $ConfigFileBuf] $relen 0]
regsub -start $start $RE $ConfigFileBuf "XYZ" ConfigFileBuf
If your string was well-formed XML I'd suggest something like tDOM to manipulate it. DOM-style manipulation is almost always better than regular expression-based manipulation on XML markup. (I mention this on the off chance that it's actually supposed to be XML and you just quoted it wrong.)
It looks like you're trying to use -start 1 to tell regsub to skip the first match. The starting index is actually a character index, so in this invocation regsub will just skip the first character in the string. You could set -start further into your string, but that's fragile unless you use regexp to calculate where the first match ends.
I think the best solution would be to get a list of indices to matches by invoking regexp with -all -inline -indices, pick out the second index pair using lindex and finally use string replace to perform the substitution, like this:
set pattern {</ConfENB:local-ip-addr>[ \n\t]+<ConfENB:local-udp-port>[0-9 ]+</ConfENB:local-udp-port>}
set matches [regexp -all -inline -indices -- $pattern $ConfigFileBuf]
set match [lindex $matches 1]
set ConfigFileBuf [string replace $ConfigFileBuf {*}$match XYZ]
The variable match contains a pair of indices (start and end, respectively) for the range of characters you want to replace. As string replace expects those indices to be in different arguments you need to expand $match with the {*} prefix. If you have an earlier version of Tcl than 8.5, you need a slight change to the above code:
foreach {start end} $match break
set ConfigFileBuf [string replace $ConfigFileBuf $start $end XYZ]
In passing, note that you can avoid escaping e.g. character sets in a regular expression if you quote it with braces instead of double quotes.
Documentation links: regexp, lindex, string
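The same idea generalizes to the n-th match. The sketch below is not taken from either answer: replaceNthMatch is a hypothetical helper that walks forward with regexp -indices -start, so it does not have to account for capture groups in the flat list that -all -inline returns. It assumes Tcl 8.5 or later (lassign, max()).

```tcl
# Hypothetical helper: replace only the n-th (1-based) match of re in str.
proc replaceNthMatch {re str n replacement} {
    if {$n < 1} {
        error "n must be at least 1"
    }
    set start 0
    for {set i 1} {$i <= $n} {incr i} {
        if {![regexp -indices -start $start -- $re $str range]} {
            return $str   ;# fewer than n matches: leave the input unchanged
        }
        lassign $range first last
        # Continue after this match; max() also steps past an empty match.
        set start [expr {max($first, $last) + 1}]
    }
    return [string replace $str $first $last $replacement]
}

puts [replaceNthMatch {[0-9]+} "a1 b22 c333" 2 X]
# prints: a1 bX c333
```

With the question's data, the call would be along the lines of replaceNthMatch $pattern $ConfigFileBuf 2 XYZ, using the braced pattern shown above.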

TCL command - string trim

I was using the command 'string trimright' to trim my string but I found that this command trims more than required.
My string is "dcssss.dcsss". If I use the string trimright command to trim the last few characters, ".dcsss", it trims the entire string. How can I deal with this?
Command:
set a [string trimright "dcssss.dcsss" ".dcsss"]
puts $a
Intended output:
dcssss
Actual output
""
The string trimright command treats its (optional) last argument as a set of characters to remove (and so .dcsss is the same as sdc. to it), just like string trim and string trimleft do; indeed, string trim is just like using both string trimright and string trimleft in succession. This makes it unsuitable for what you are trying to do; to remove a suffix if it is present, you can use several techniques:
# It looks like we're stripping a filename extension...
puts [file rootname "dcssss.dcsss"]
# Can use a regular expression if we're careful...
puts [regsub {\.dcsss$} "dcssss.dcsss" {}]
# Do everything by hand...
set str "dcssss.dcsss"
if {[string match "*.dcsss" $str]} {
    set str [string range $str 0 end-6]
}
puts $str
If what you're doing really is filename manipulation, as it appears to be, do use the first of these options. The file command has some really useful subcommands for working with filenames in a cross-platform manner.
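For suffixes that are not filename extensions, the by-hand approach can be wrapped up so the length is not hard-coded. This is a sketch only: removeSuffix is a name I made up, and it compares the tail of the string literally rather than as a glob pattern, so characters such as * or ? in the suffix need no escaping.

```tcl
# Hypothetical helper: strip suffix from str only if str really ends with it.
proc removeSuffix {str suffix} {
    set n [string length $suffix]
    if {$n > 0 && $n <= [string length $str]
            && [string range $str end-[expr {$n - 1}] end] eq $suffix} {
        return [string range $str 0 end-$n]
    }
    return $str
}

puts [removeSuffix "dcssss.dcsss" ".dcsss"]
# prints: dcssss

# For comparison: trimright removes any trailing run of the characters
# ., d, c and s, which here is the whole string.
puts "<[string trimright "dcssss.dcsss" ".dcsss"]>"
# prints: <>
```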