Tcl: replace string in a specific column

I have the below line:
^ 1 0.02199 0.03188 0.03667 0.00136 0.04155 0.00000 1.07223 1.07223 -0.47462 0.00335 -0.46457 buf_63733/Z DCKBD1BWP240H11P57PDULVT -
I want to replace column 3 with a different value while keeping the rest of the line, including its spacing, exactly as is.
I tried lreplace, but it removed the spaces.
string map can replace a word, but I couldn't find a way to make it target one exact column.
Can someone advise?

Assuming the columns are separated by at least 2 spaces, you could use something like:
set indices [regexp -all -indices -inline {\S+(?:\s\S+)?\s{2,}} $line]
set colCount 1
set newValue 0.01234
foreach pair $indices {
    if {$colCount == 3} {
        lassign $pair start end
        set column [string range $line $start $end]
        set value [string trimright $column]
        set valueEnd [expr {$end-[string length $column]+[string length $value]}]
        set newLine [string replace $line $start $valueEnd $newValue]
    } elseif {$colCount > 3} {
        break
    }
    incr colCount
}
Set newValue to whatever replacement you need, and assign the result to line instead of newLine if you don't need to keep the original line.

Another method uses regsub to inject a command into the replacement string, and then subst to evaluate it. This is like Perl's s/pattern/code/e:
set newline [subst [regsub {^((?:\s+\S+){2})(\s+\S+)} $line \
{\1[format "%*s" [string length "\2"] $newvalue]}]]
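Note that subst processes the whole string returned by regsub, so any brackets, dollar signs, or backslashes elsewhere in the line would be evaluated too. A minimal sketch that avoids evaluation is shown here. It assumes that a column is any run of non-whitespace characters, counted from 1, and replaceColumn is a hypothetical helper name, not something from the answers above.

```tcl
# Hypothetical helper: replace the n-th whitespace-delimited column (1-based)
# of $line with $newValue, leaving every other character untouched.
proc replaceColumn {line n newValue} {
    # Each element is a {start end} index pair for one run of non-whitespace.
    set spans [regexp -all -indices -inline {\S+} $line]
    if {$n < 1 || $n > [llength $spans]} {
        return $line   ;# no such column: return the line unchanged
    }
    lassign [lindex $spans [expr {$n - 1}]] start end
    return [string replace $line $start $end $newValue]
}

# Usage with made-up spacing between the columns:
set line "^  1   0.02199  0.03188   0.03667"
puts [replaceColumn $line 3 0.01234]
```

If the new value is longer or shorter than the old one, the columns after it shift by the difference. Padding the value with format, as the regsub answer does, keeps them aligned.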

Related

Split a string into words which are enclosed in single quotes

I want to split a string into the separate words that are enclosed in single quotes.
For example:
set str {'Name' 'Karna Mayer' ''}
I want to split this into 3 separate words. How can this be done in Tcl?

For this sort of task, I'd use regexp -all -inline and lmap (to drop the unwanted bits from the results of that).
set input "'Name' 'Karna Mayer' ''"
set output [lmap {- bit} [regexp -all -inline {'([^'']*)'} $input] {set bit}]
The good thing about this is that if you have a way of escaping a single quote in that, you can use a more complex regular expression and match that too.
set output [lmap {- bit} [regexp -all -inline {'((?:\\.|[^''])*)'} $input] {
string map {\\ {}} $bit
}]

You can use string map to convert the single quotes to double quotes and escape any existing double quotes:
set str [string map {{"} {\"} ' {"}} $str]
# "Name" "Karna Mayer" ""
You can then use list and argument expansion to convert it to a list:
set l [list {*}$str]
# Name {Karna Mayer} {}
Full program:
set str {'Name' 'Karna Mayer' ''}
set str [string map {{"} {\"} ' {"}} $str]
set l [list {*}$str]

If you use single quote as a separator, then you'll take every second element:
% set input "'Name' 'Karna Mayer' ''"
'Name' 'Karna Mayer' ''
% split $input {'}
{} Name { } {Karna Mayer} { } {} {}
We see: the empty string before the first quote; the first field; the space between the 1st and 2nd; the 2nd field; the next space; the (empty) 3rd field; and then the empty string after the last quote. We want to ignore this last element.
% set fields [lmap {_ field} [lrange [split $input {'}] 0 end-1] {set field}]
Name {Karna Mayer} {}
No thanks to the Tcl syntax highlighter.

Fetch a last occurrence of row containing a substring

I have a string a which is
set a "I have a blah blah
xyz who r u
I have a car
xyz j r u"
I have a blah blah
xyz who r u // Line 2 which contains substring xyz
I have a car
xyz j r u //Line 4 which contains substring xyz
I am using a foreach loop over the lines of variable a, after splitting it on newlines.
set substring "xyz"
set b [split $a "\n"]
foreach eachLine $b {
    if {[string first $substring $eachLine] != -1} {
        puts "$eachLine"
    }
}
I want the output to be:
xyz j r u //Line 4 which contains substring xyz
Currently, this prints both line 2 and line 4.
In the above code, I am trying to fetch the last line that contains the substring "xyz".
Can you please suggest a good way to do this?

You could store $eachLine in a variable and then only print it after the loop ends.
set lastSeen ""
foreach eachLine $b {
    if {[string first $substring $eachLine] != -1} {
        set lastSeen $eachLine
    }
}
puts $lastSeen
You could reverse the list and print the first time you see it:
foreach line [lreverse $b] {
    if {[string first $substring $line] != -1} {
        puts $line
        break
    }
}

The built-in way to search a list is the lsearch command. You can extract only the last occurrence using the lindex command:
puts [lindex [lsearch -all -inline -regexp $b (?q)$substring] end]
This uses the -regexp option so the search pattern is not anchored (i.e.: It may occur anywhere within the list element). Then the (?q) embedded option suppresses interpreting any character in $substring as regular expression syntax, resulting in a search on the literal text stored in $substring.
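To illustrate why the (?q) prefix matters, here is a small sketch with made-up data in which the substring contains a regular expression metacharacter. The list contents and variable names are assumptions chosen for the demonstration.

```tcl
# Made-up lines: only the second one contains the literal text "a.b".
set lines [list "axb first" "a.b second" "axb third"]
set substring "a.b"

# Without (?q), the dot matches any character, so all three lines match
# and the last of them is returned.
set anyMatch [lindex [lsearch -all -inline -regexp $lines $substring] end]
puts $anyMatch       ;# axb third

# With (?q), the pattern is treated as literal text.
set literalMatch [lindex [lsearch -all -inline -regexp $lines (?q)$substring] end]
puts $literalMatch   ;# a.b second
```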

Walk the list of lines from its end toward its start, and stop at the first match:
set i [expr {[llength $b] - 1}]
while {$i >= 0} {
    set eachLine [lindex $b $i]
    if {[string first $substring $eachLine] != -1} {
        puts "$eachLine"
        break
    }
    incr i -1
}
This way you neither copy the list (lreverse) nor scan all of it (lsearch -all) just to retrieve one match, if there is one at all.
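The answers above all work on the list of lines. As a possible alternative, the unsplit text can be searched directly with string last. This is a sketch rather than anything proposed in the answers: lastLineContaining is a hypothetical helper name, and it assumes the substring itself contains no newline.

```tcl
# Hypothetical helper: return the last line of $text that contains $substring,
# or an empty string if there is none. Assumes $substring has no newline in it.
proc lastLineContaining {text substring} {
    set pos [string last $substring $text]
    if {$pos < 0} {
        return ""
    }
    # The line starts just after the previous newline (or at index 0).
    set lineStart [expr {[string last "\n" $text $pos] + 1}]
    # The line ends just before the next newline (or at the end of the text).
    set lineEnd [string first "\n" $text $pos]
    if {$lineEnd < 0} {
        set lineEnd [string length $text]
    }
    return [string range $text $lineStart [expr {$lineEnd - 1}]]
}

set a "I have a blah blah\nxyz who r u\nI have a car\nxyz j r u"
puts [lastLineContaining $a "xyz"]   ;# xyz j r u
```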

Return string after specific character

I have a question about getting the part of a string after a specific character in Tcl.
What I mean is:
Input:
abcdefgh = hgfedcba
Output:
hgfedcba
(return everything after "=", without any surrounding whitespace)
This is what I was using:
regexp {abcdefgh=\s+"(.*)"} $text_var all variable
It works in some cases (with spaces), but it does not work when there is no whitespace.

Assuming
% set s {abcdefgh = hgfedcba}
# => abcdefgh = hgfedcba
(or the same thing without one or both of the blanks) you could do one of these:
% scan $s {%*[^=]= %s}
# => hgfedcba
(Scan the string for a substring not containing "=", then advance past the equals sign and optional whitespace, then return the next run of non-whitespace characters.)
string trim [lindex [split $s =] 1]
(Split the string at the equals sign, return the (whitespace-trimmed) second resulting element.)
string trim [string range $s [string first = $s]+1 end]
(Return the (whitespace-trimmed) substring starting after the equals sign.)
string trim [lindex [regexp -inline {[^=]+$} $s] 0]
(Return the (whitespace-trimmed) first match of one or more characters, not including the equals sign, anchored on the end of the string.)
lindex [regexp -inline -all {[a-h]+} $s] 1
(Return the second match of consecutive characters from the set "a" to "h".)
string trimleft [string trimleft $s {abcdefgh }] {= }
(Remove all characters from the start of the string that occur in the set "a" to "h" and blank, then remove from start of the resulting string any characters that are equals sign or blank.)
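The string first variant can be wrapped in a small procedure that also covers the case where no separator is present. This is only a sketch: valueAfter is a hypothetical name, and returning an empty string when the separator is missing is an assumption about the desired behavior.

```tcl
# Hypothetical helper: return the whitespace-trimmed text that follows the
# first occurrence of $sep in $s, or an empty string if $sep does not occur.
proc valueAfter {s {sep =}} {
    set pos [string first $sep $s]
    if {$pos < 0} {
        return ""
    }
    set from [expr {$pos + [string length $sep]}]
    return [string trim [string range $s $from end]]
}

puts [valueAfter "abcdefgh = hgfedcba"]   ;# hgfedcba
puts [valueAfter "abcdefgh=hgfedcba"]     ;# hgfedcba
```

Unlike the lindex/split variant, this keeps everything after the first equals sign, so a value that itself contains "=" is not cut short.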

% regexp {abcdefgh\s*=\s*(\S+)} "abcdefgh = hgfedcba" all variable
1
% set variable
hgfedcba
% regexp {abcdefgh\s*=\s*(\S+)} "abcdefgh=hgfedcba" all variable
1
% set variable
hgfedcba
%

How to find and replace the second occurrence of a string using regsub

I am new to Tcl and trying to learn it, and I need help with the following.
My string is stored in ConfigFileBuf, shown below. I am trying to replace the second occurrence (<ConfENB:local-udp-port>31001) with XYZ, but the regsub command I tried always replaces the first occurrence (37896). Please help: how can I replace the second occurrence with XYZ?
set ConfigFileBuf "<ConfENB:virtual-phy>
</ConfENB:local-ip-addr>
<ConfENB:local-udp-port>37896</ConfENB:local-udp-port>
</ConfENB:local-ip-addr>
<ConfENB:local-udp-port>31001</ConfENB:local-udp-port>
</ConfENB:virtual-phy>"
regsub -start 1 "</ConfENB:local-ip-addr>\[ \n\t\]+<ConfENB:local-udp-port>\[0-9 \]+</ConfENB:local-udp-port>" $ConfigFileBuf "XYZ" ConfigFileBuf
puts $ConfigFileBuf

You have to use regexp -indices to find where to start the replacement, and only then regsub. It's not too bad if you put the regular expression in its own variable.
set RE "</ConfENB:local-ip-addr>\[ \n\t\]+<ConfENB:local-udp-port>\[0-9 \]+</ConfENB:local-udp-port>"
set start [lindex [regexp -all -indices -inline $RE $ConfigFileBuf] 1 0]
regsub -start $start $RE $ConfigFileBuf "XYZ" ConfigFileBuf
The 1 passed to lindex is the number of capturing subexpressions in the RE (zero in this case) plus 1, because regexp -all -inline returns one entry per subexpression in addition to the whole match. You can compute it with the help of regexp -about, giving this piece of trickiness:
set RE "</ConfENB:local-ip-addr>\[ \n\t\]+<ConfENB:local-udp-port>\[0-9 \]+</ConfENB:local-udp-port>"
set relen [expr {1 + [lindex [regexp -about $RE] 0]}]
set start [lindex [regexp -all -indices -inline $RE $ConfigFileBuf] $relen 0]
regsub -start $start $RE $ConfigFileBuf "XYZ" ConfigFileBuf

If your string was well-formed XML I'd suggest something like tDOM to manipulate it. DOM-style manipulation is almost always better than regular expression-based manipulation on XML markup. (I mention this on the off chance that it's actually supposed to be XML and you just quoted it wrong.)
It looks like you're trying to use -start 1 to tell regsub to skip the first match. The starting index is actually a character index, so in this invocation regsub will just skip the first character in the string. You could set -start further into your string, but that's fragile unless you use regexp to calculate where the first match ends.
I think the best solution would be to get a list of indices to matches by invoking regexp with -all -inline -indices, pick out the second index pair using lindex and finally use string replace to perform the substitution, like this:
set pattern {</ConfENB:local-ip-addr>[ \n\t]+<ConfENB:local-udp-port>[0-9 ]+</ConfENB:local-udp-port>}
set matches [regexp -all -inline -indices -- $pattern $ConfigFileBuf]
set match [lindex $matches 1]
set ConfigFileBuf [string replace $ConfigFileBuf {*}$match XYZ]
The variable match contains a pair of indices (start and end, respectively) for the range of characters you want to replace. As string replace expects those indices to be in different arguments you need to expand $match with the {*} prefix. If you have an earlier version of Tcl than 8.5, you need a slight change to the above code:
foreach {start end} $match break
set ConfigFileBuf [string replace $ConfigFileBuf $start $end XYZ]
In passing, note that you can avoid escaping e.g. character sets in a regular expression if you quote it with braces instead of double quotes.
Documentation links: regexp, lindex, string
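The regexp -all -inline -indices plus string replace approach generalizes to the n-th match. A minimal sketch follows, assuming the pattern contains no capturing groups, so that each list element is one whole match. replaceNth is a hypothetical helper name, and lassign requires Tcl 8.5 or later.

```tcl
# Hypothetical helper: replace the n-th match (1-based) of $pattern in $text.
# Assumes $pattern has no capturing groups; with groups, regexp would return
# extra index pairs per match and the position arithmetic would need adjusting.
proc replaceNth {pattern text n replacement} {
    set matches [regexp -all -inline -indices -- $pattern $text]
    if {$n < 1 || $n > [llength $matches]} {
        return $text   ;# fewer than n matches: return the text unchanged
    }
    lassign [lindex $matches [expr {$n - 1}]] start end
    return [string replace $text $start $end $replacement]
}

puts [replaceNth {[0-9]+} "port 37896 then port 31001" 2 XYZ]
```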

How can I cut the substring from a string in tcl

I have a string like NYMEX UTBPI. I want to find the index of the whitespace between NYMEX and UTBPI, and then cut out the substring from that index to the end. In this case the substring will be UTBPI.
I'm using the code below:
set part1 [substr $line [string index $line " "] [string index $line end-1]]
I'm getting the error below:
wrong # args: should be "string index string charIndex"
while executing
"string index $line "
("foreach" body line 2)
invoked from within
"foreach line $pollerName {
set part1 [substr $line [string index $line ] [string index $line end-1]]
puts $part1
puts $line
}"
(file "Config.tcl" line 9)
Can you give me an idea of how to do other string manipulation as well? Is there a good link for this?

I would just use string range and pass it the index of the whitespace (that you can find using string first or whatever).
% set s "NYMEX UTBPI"
NYMEX UTBPI
% string range $s 6 end
UTBPI
Or using string first to dynamically find the whitespace:
% set output [string range $s [expr {[string first " " $s] + 1}] end]
UTBPI

If processor time isn't a problem, split it into a list and take the 2nd element:
set part1 [lindex [split $line] 1]
If the string can have an arbitrary number of words,
set new [join [lrange [split $line] 1 end]]
However, I'd use Donal's suggestion and stick with string operations.
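One caveat with split is worth a quick illustration: it treats every single whitespace character as a separator, so repeated spaces produce empty elements. The sketch below uses a made-up line with three spaces and shows one possible workaround, collecting runs of non-whitespace with regexp.

```tcl
# Made-up input with three spaces between the two words.
set line "NYMEX   UTBPI"

# split yields empty elements between consecutive spaces.
puts [split $line]   ;# NYMEX {} {} UTBPI

# Collecting runs of non-whitespace gives just the words.
set words [regexp -all -inline {\S+} $line]
puts [lindex $words 1]   ;# UTBPI
```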

I think the best way to do it in Tcl is:
set s "NYMEX UTBPI"
regexp -indices " " $s index;
puts [lindex $index 0]
The variable index will contain the first and last index of the match. Here, as you are looking for a single character, the two are the same, so you can use
puts [lindex $index 0]
or
puts [lindex $index 1]
For more info, this is the official doc: http://www.tcl.tk/man/tcl8.5/TclCmd/regexp.htm#M7
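The snippet above only prints the position of the space. To get the substring the question asks for, the index can be fed to string range. A minimal sketch, where the variable rest is a name introduced for illustration:

```tcl
set s "NYMEX UTBPI"
set rest ""
# regexp returns 1 on a match and stores the {start end} pair in index.
if {[regexp -indices { } $s index]} {
    # Take everything after the matched space.
    set rest [string range $s [expr {[lindex $index 1] + 1}] end]
}
puts $rest   ;# UTBPI
```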
