Remove prefix substring from string - tcl

I have a string abc.def.ghi.j and I want to remove abc. from that, so that I have def.ghi.j.
1) What would be the best approach to remove such a prefix which has a specific pattern?
2) Since in this case, abc is coincidentally the prefix, that probably makes things easier. What if we wanted abc.ghi.j as the output?
I tried it with the split method like this
set name abc.def.ghi.j
set splitVar [split $name {{abc.}} ]
The problem is that it splits across each of a, b, c and . seperately instead of as a whole.

Well, there's a few ways, but the main ones are using string replace, regsub, string map, or split-lreplace-join.
We probably ought to be a bit careful because we must first check if the prefix really is a prefix. Fortunately, string equal has a -length operation that makes that easy:
if {[string equal -length [string length $prefix] $prefix $string]} {
# Do the replacement
}
Personally, I'd probably use regsub but then I'm happy with using RE engine tricks.
Using string replace
set string [string replace $string 0 [string length $prefix]-1]
# Older versions require this instead:
# set string [string replace $string 0 [expr {[string length $prefix]-1}]]
Using regsub
# ***= is magical and says "rest of RE is simple plain text, no escapes"
regsub ***=$prefix $string "" string
Using string map
# Requires cunning to anchor to the front; \uffff is unlikely in any real string
set string [string map [list \uffff$prefix ""] \uffff$string]
Using split…join
This is about what you were trying to do. It depends on the . being a sort of separator.
set string [join [lrange [split $string "."] 1 end] "."]

Related

Removing everything before a certain character in TCL

Input string : 4567-ABC
I want to remove everything before "-" in the string so that Output will be ABC.
Output: ABC
If you want to avoid regular expressions:
set string 4567-ABC
set output [lindex [split $string "-"] 1]
The split command takes a string and split characters as the arguments and returns a list.
string last is useful here:
set string 4567-ABC
set idx [string last "-" $string]
set wanted [string range $string $idx+1 end]
Or without the intermediate variable
set wanted [string range $string [string last "-" $string]+1 end]
That even works if the original string does not contain any hyphens.

TCL regsub uses RegEx match as index in associate array

I'd like to automatically convert URLs, i.e
"https://sc-uat.ct.example.com/sc/" into "https://invbeta.example.com/sc/"
"https://sc-dev.ct.example.com/sc/" into "https://invtest.example.com/sc/"
"https://sc-qa.ct.example.com/sc/" into "https://invdemo.example.com/sc/"
I've tried following code snippet in TCL
set loc "https://sc-uat.ct.example.com/sc/"
set envs(dev) "test"
set envs(uat) "beta"
set envs(qa) "demo"
puts $envs(uat)
regsub -nocase {://.+-(.+).ct.example.com} $loc {://inv[$envs(\1)].example.com} hostname
puts "new location = $hostname"
But the result is: new location = https://inv[$envs(uat)].example.com/sc/
It seems that [$envs(uat)] is NOT evaluated and substituted further with the real value. Any hints will be appreciated. Thanks in advance
But the result is: new location =
https://inv[$envs(uat)].example.com/sc/ It seems that [$envs(uat)] is eval-ed further.
You meant to say: [$envs(uat)] is not evaluated further?
This is because due to the curly braces in {://inv[$envs(\1)].example.com}, the drop-in string is taken literally, and not subjected to variable or command substitution. Besides, you don't want command and variable substitution ([$envs(\1)]), just one of them: $envs(\1) or [set envs(\1)].
To overcome this, you must treat the regsub-processed string further via subst:
set hostname [subst -nocommands -nobackslashes [regsub -nocase {://.+-(.+).ct.example.com} $loc {://inv$envs(\1).example.com}]]
Suggestions for improvement
I advise to avoid the use of subst in this context, because even when restricted, you might run into conflicts with characters special to Tcl in your hostnames (e.g., brackets in the IPv6 authority parts). Either you have to sanitize the loc string before, or, better work on string ranges like so:
if {[regexp -indices {://(.+-(.+)).ct.example.com} $loc _ replaceRange keyRange]} {
set key [string range $loc {*}$keyRange]
set sub [string cat "inv" $envs($key)]
set hostname [string replace $loc {*}$replaceRange $sub]
}

How to split string by numerics

I havetried to split but still failed.
set strdata "34a64323R6662w0332665323020346t534r66662v43037333444533053534a64323R6662w0332665323020346t534r66662v430373334445330535"
puts [split $strdata "3334445330535"] ;#<---- this command does not work
The result needed as below:
{34a64323R6662w0332665323020346t534r66662v43037} {34a64323R6662w0332665323020346t534r66662v43037}
The split command's optional second argument is interpreted as a set of characters to split on, so it really isn't going to do what you want. However, there are other approaches. One of the simpler methods of doing what you want is to use string map to convert the character sequence into a character that isn't in the input data (Unicode is full of those!) and then split on that:
set strdata "34a64323R6662w0332665323020346t534r66662v43037333444533053534a64323R6662w0332665323020346t534r66662v430373334445330535"
set splitterm "3334445330535"
set items [split [string map [list $splitterm "\uFFFF"] $strdata] "\uFFFF"]
foreach i $items {
puts "==> $i"
}
# ==> 34a64323R6662w0332665323020346t534r66662v43037
# ==> 34a64323R6662w0332665323020346t534r66662v43037
# ==> {}
Note that there is a {} (i.e., an empty-string list element) at the end because that's the string that came after the last split element. If you don't want that, add a string trimright between the string map and the split:
# Doing this in steps because the line is a bit long otherwise
set mapped [string map [list $splitterm "\uFFFF"] $strdata]
set trimmed [string trimright $mapped "\uFFFF"]
set items [split $trimmed "\uFFFF"]
The split command doesn't work like that, see the documentation.
Try making the data string into a list like this:
regsub -all 3334445330535 $strdata " "
i.e. replacing the delimiter with a space.
Documentation:
regsub,
split

how to find and replace sencond occurance of string using regsub

I am new to tcl, trying to learn, need a help for below.
My string looks like in configFileBuf and trying to replace second occurance of ConfENB:local-udp-port>31001" with XYZ, but below regsub cmd i was tried is always replacing with first occurance (37896). Plz help how to replace second occurance with xyz.
set ConfigFileBuf "<ConfENB:virtual-phy>
</ConfENB:local-ip-addr>
<ConfENB:local-udp-port>37896</ConfENB:local-udp-port>
</ConfENB:local-ip-addr>
<ConfENB:local-udp-port>31001</ConfENB:local-udp-port>
</ConfENB:virtual-phy>"
regsub -start 1 "</ConfENB:local-ip-addr>\[ \n\t\]+<ConfENB:local-udp-port>\[0-9 \]+</ConfENB:local-udp-port>" $ConfigFileBuf "XYZ" ConfigFileBuf
puts $ConfigFileBuf
You have to use regexp -indices to find where to start the replacement, and only then regsub. It's not too bad if you put the regular expression in its own variable.
set RE "</ConfENB:local-ip-addr>\[ \n\t\]+<ConfENB:local-udp-port>\[0-9 \]+</ConfENB:local-udp-port>"
set start [lindex [regexp -all -indices -inline $RE $ConfigFileBuf] 1 0]
regsub -start $start RE $ConfigFileBuf "XYZ" ConfigFileBuf
The 1 is the number of submatches in the RE (zero in this case) plus 1. You can compute it with the help of regexp -about, giving this piece of trickiness:
set RE "</ConfENB:local-ip-addr>\[ \n\t\]+<ConfENB:local-udp-port>\[0-9 \]+</ConfENB:local-udp-port>"
set relen [expr {1 + [lindex [regexp -about $RE] 0]}]
set start [lindex [regexp -all -indices -inline $RE $ConfigFileBuf] $relen 0]
regsub -start $start RE $ConfigFileBuf "XYZ" ConfigFileBuf
If your string was well-formed XML I'd suggest something like tDOM to manipulate it. DOM-style manipulation is almost always better than regular expression-based manipulation on XML markup. (I mention this on the off chance that it's actually supposed to be XML and you just quoted it wrong.)
It looks like you're trying to use -start 1 to tell regsub to skip the first match. The starting index is actually a character index, so in this invocation regsub will just skip the first character in the string. You could set -start further into your string, but that's fragile unless you use regexp to calculate where the first match ends.
I think the best solution would be to get a list of indices to matches by invoking regexp with -all -inline -indices, pick out the second index pair using lindex and finally use string replace to perform the substitution, like this:
set pattern {</ConfENB:local-ip-addr>[ \n\t]+<ConfENB:local-udp-port>[0-9 ]+</ConfENB:local-udp-port>}
set matches [regexp -all -inline -indices -- $pattern $ConfigFileBuf]
set match [lindex $matches 1]
set ConfigFileBuf [string replace $ConfigFileBuf {*}$match XYZ]
The variable match contains a pair of indices (start and end, respectively) for the range of characters you want to replace. As string replace expects those indices to be in different arguments you need to expand $match with the {*} prefix. If you have an earlier version of Tcl than 8.5, you need a slight change to the above code:
foreach {start end} $match break
set ConfigFileBuf [string replace $ConfigFileBuf $start $end XYZ]
In passing, note that you can avoid escaping e.g. character sets in a regular expression if you quote it with braces instead of double quotes.
Documentation links: regexp, lindex, string

TCL command - string trim

I was using the command 'string trimright' to trim my string but I found that this command trims more than required.
My expression is "dssss.dcsss" If I use string trim command to trim the last few characters ".dcsss", it trims the entire string. How can I deal with this?
Command:
set a [string trimright "dcssss.dcsss" ".dcsss"]
puts $a
Intended output:
dcsss
Actual output
""
The string trimright command treats its (optional) last argument as a set of characters to remove (and so .dcsss is the same as sdc. to it), just like string trim and string trimleft do; indeed, string trim is just like using both string trimright and string trimleft in succession. This makes it unsuitable for what you are trying to do; to remove a suffix if it is present, you can use several techniques:
# It looks like we're stripping a filename extension...
puts [file rootname "dcssss.dcsss"]
# Can use a regular expression if we're careful...
puts [regsub {\.dcsss$} "dcssss.dcsss" {}]
# Do everything by hand...
set str "dcssss.dcsss"
if {[string match "*.dcsss" $str]} {
set str [string range $str 0 end-6]
}
puts $str
If what you're doing really is filename manipulation, like it looks like, do use the first of these options. The file command has some really useful commands for working with filenames in a cross-platform manner in it.