How can I cut a substring from a string in Tcl?

I have a string like NYMEX UTBPI. I want to find the index of the whitespace between NYMEX and UTBPI, and then cut the substring that runs from that index to the end of the string. In this case the substring will be UTBPI.
I'm using the following:
set part1 [substr $line [string index $line " "] [string index $line end-1]]
I'm getting the error below.
wrong # args: should be "string index string charIndex"
while executing
"string index $line "
("foreach" body line 2)
invoked from within
"foreach line $pollerName {
set part1 [substr $line [string index $line ] [string index $line end-1]]
puts $part1
puts $line
}"
(file "Config.tcl" line 9)
Can you give me an idea of how to do other string manipulation as well? Is there a good link for this?

I would just use string range and pass it the index of the whitespace (that you can find using string first or whatever).
% set s "NYMEX UTBPI"
NYMEX UTBPI
% string range $s 6 end
UTBPI
Or using string first to dynamically find the whitespace:
% set output [string range $s [expr {[string first " " $s] + 1}] end]
UTBPI
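One detail worth noting: string first returns -1 when the search string is not found, so the one-liner above would then compute index 0 and return the whole string. A minimal sketch that makes the "no space" case explicit (the proc name afterFirstSpace is hypothetical, not part of Tcl):

```tcl
# Hypothetical helper: return the text after the first space,
# or an empty string when the input contains no space at all.
proc afterFirstSpace {s} {
    set idx [string first " " $s]
    if {$idx < 0} {
        return ""
    }
    return [string range $s [expr {$idx + 1}] end]
}

puts [afterFirstSpace "NYMEX UTBPI"]   ;# prints UTBPI
puts [afterFirstSpace "NYMEX"]         ;# prints an empty line
```

Whether an empty string is the right answer for input without a space depends on the caller; returning the original string or raising an error would be equally reasonable choices.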

If processor time isn't a problem, split it into a list and take the 2nd element:
set part1 [lindex [split $line] 1]
If the string can have an arbitrary number of words,
set new [join [lrange [split $line] 1 end]]
However, I'd use Donal's suggestion and stick with string operations.
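One caveat with the split approach, shown here as a small sketch with made-up input: split treats every single whitespace character as a separator, so runs of spaces produce empty list elements. Collecting the words with regexp -all -inline avoids that.

```tcl
set line "NYMEX   UTBPI"          ;# three spaces between the words

# split yields an empty element for each extra space
set naive [split $line]
puts [llength $naive]             ;# prints 4

# \S+ matches each run of non-whitespace characters, so no empty elements
set words [regexp -all -inline {\S+} $line]
puts [lindex $words 1]            ;# prints UTBPI
```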

I think the best way to do it in Tcl is:
set s "NYMEX UTBPI"
regexp -indices " " $s index;
puts [lindex $index 0]
The variable index will contain the first and last indices of the matched pattern. Since you are looking for a single character here, the first and last indices are the same, so you can use
puts [lindex $index 0]
or
puts [lindex $index 1]
For more info, this is the official doc: http://www.tcl.tk/man/tcl8.5/TclCmd/regexp.htm#M7
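To get from the indices to the substring the question asks for, the result of regexp -indices can be fed to string range. This is a sketch rather than the answer's own code; it uses \s+ instead of a single space, on the assumption that the words may be separated by more than one whitespace character.

```tcl
set s "NYMEX UTBPI"

# span receives a two-element list: start and end index of the match
if {[regexp -indices {\s+} $s span]} {
    lassign $span first last
    # everything after the matched whitespace
    set tail [string range $s [expr {$last + 1}] end]
    puts $tail   ;# prints UTBPI
}
```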

Related

How to split the string and save the last word in Tcl

I have a string like this : abc0__remote_contr_major_abc__remote_hjk_klo_hcf_uio__apple_b_0_t_boo_dfs
I need to extract apple followed by everything up to t_ and use that as a variable.
For example, if the string goes through the code, I am expecting apple_b_0_t as my output. I tried split and lindex, but that didn't work out.
set s "abc0__remote_contr_major_abc__remote_hjk_klo_hcf_uio__apple_b_0_t_boo_dfs"
set prefix [split $s "__"]
set c [lindex $prefix 4]
So I ended up doing this and it worked, but I am wondering if there is an easier or more generic solution:
set prefix [join [lrange [split $tile_dfx_fclk "__"] 12 15] _]
I'd use a regex:
set s abc0__remote_contr_major_abc__remote_hjk_klo_hcf_uio__apple_b_0_t_boo_dfS
regexp {.*__(.*t)_.*} $s _ t
puts $t ;# => apple_b_0_t
The problem with split $s "__" is that the 2nd argument to split is not a substring: it's a set of characters, so it's just the same as split $s "_".
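A short illustration of that point, using a made-up string: both calls split on every single underscore, and the doubled underscore simply produces an empty element.

```tcl
set demo "a__b_c"

puts [split $demo "__"]   ;# prints: a {} b c
puts [split $demo "_"]    ;# prints: a {} b c
```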
tcllib has a textutil::split package containing a splitx proc that splits a string on a regular expression:
package require textutil::split
namespace import textutil::split::splitx
set last [lindex [splitx $s "__"] end] ;# => apple_b_0_t_boo_dfS
# and then
set wanted [regsub {[^t]*$} $last ""] ;# => apple_b_0_t
Another approach is to find the last place that __ occurs in the string:
set idx [string last "__" $s] ;# => 52
# and then
set last [string range $s $idx+2 end] ;# => apple_b_0_t_boo_dfS
This could also be done:
set s "abc0__remote_contr_major_abc__remote_hjk_klo_hcf_uio__apple_b_0_t_boo_dfs"
set c [string range $s [string first "apple_" $s] [string last "t_" $s]]
puts $c
-> apple_b_0_t

How to match a string and print the next word after that?

Let's say I have the following file and have to look for .model and print the next two words before the (. The following are the contents of the file that I need to read.
.model Q2N2222 NPN(Is=14.34f Xti=3 Eg=1.11 Vaf=74.03 Bf=255.9 Ne=1.307
Ise=14.34f Ikf=.2847 Xtb=1.5 Br=6.092 Nc=2 Isc=0 Ikr=0 Rc=1
+ Cjc=7.306p Mjc=.3416 Vjc=.75 Fc=.5 Cje=22.01p Mje=.377 Vje=.75
+ Tr=46.91n Tf=411.1p Itf=.6 Vtf=1.7 Xtf=3 Rb=10)
* National pid=19 case=TO18
* 88-09-07 bam creation
*$
.model Q2N3904 NPN(Is=6.734f Xti=3 Eg=1.11 Vaf=74.03 Bf=416.4 Ne=1.259
.model Q2N3906 PNP(Is=1.41f Xti=3 Eg=1.11 Vaf=18.7 Bf=180.7 Ne=1.5 Ise=0
Here is the code I have written so far, but I couldn't get any results. I need help.
proc find_lib_parts {f_name} {
set value [string first ".lib" $f_name]
if {$value != -1} {
#open the file
set fid [ open $f_name "r"]
#read the fid and split it in to lines
set infos [split [read $fid] "\n"]
close $fid
set res {}
append res "MODEL FOUND:\n"
if {[llength $line] > 2 && [lindex $line 0] eq {model}} {
#lappend res [lindex $data 2] \n
lappend res [split $line "("]\n
}
if {[llength $line] > 2 && [lindex $line 0] eq {MODEL}} {
#lappend res [lindex $data 2] \n
lappend res [split $line "("]\n
}
}
return $res
In this case, a regular expression is by far the simplest way of doing such a search. Assuming the words are always on the same line, it's easy:
proc find_lib_parts {f_name} {
set fid [open $f_name]
set infos [split [read $fid] "\n"]
close $fid
set found {}
foreach line $infos {
if {[regexp {\.model\s+(\w+\s+\w+)\(} $line -> twoWords]} {
lappend found $twoWords
}
}
return $found
}
For your input data sample, that'll produce a result like this:
{Q2N2222 NPN} {Q2N3904 NPN} {Q2N3906 PNP}
If there's nothing to find, you'll get an empty list. (I assume you pass filenames correctly anyway, so I omitted that check.)
The regular expression, which should virtually always be enclosed in {braces} in Tcl, is this:
\.model\s+(\w+\s+\w+)\(
It's relatively simple. The pieces of it are:
\.model — literal “.model” (with an escape of the . because it is a RE metacharacter)
\s+ — some whitespace
( — start a capturing group (the bit we put into the twoWords variable)
\w+ — a “word”, one or more alphanumeric (or underscore) characters
\s+ — some whitespace
\w+ — a “word”, one or more alphanumeric (or underscore) characters
) — end of the capturing group
\( — literal “(”, escaped
The regexp command matches this, returning whether or not it matched (effectively boolean without the -all option, which we're not using here), and assigning the various groups to the variables named afterwards, -> for the whole matched string (yes, that's a legal variable name; I like to use it for regexp variables that dump info I don't want) and twoWords for the interesting substring.
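As a small variation on the pattern above (a sketch, not the answer's own code), each word can get its own capturing group so the part name and the device type land in separate variables. The variable names name and type are chosen here for illustration.

```tcl
set line ".model Q2N2222 NPN(Is=14.34f Xti=3 Eg=1.11 Vaf=74.03"

# -> receives the whole match; name and type receive the two groups
if {[regexp {\.model\s+(\w+)\s+(\w+)\(} $line -> name type]} {
    puts "name=$name type=$type"   ;# prints: name=Q2N2222 type=NPN
}
```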

Tcl: replace string in a specific column

I have the below line:
^ 1 0.02199 0.03188 0.03667 0.00136 0.04155 0.00000 1.07223 1.07223 -0.47462 0.00335 -0.46457 buf_63733/Z DCKBD1BWP240H11P57PDULVT -
I want to replace column 3 with a different value and to keep the entire line with spaces as is.
I tried lreplace, but the spaces were deleted.
string map can only replace a word; I didn't find a way to replace an exact column.
Can someone advise?
Assuming the columns are separated by at least 2 spaces, you could use something like:
set indices [regexp -all -indices -inline {\S+(?:\s\S+)?\s{2,}} $line]
set colCount 1
set newValue 0.01234
foreach pair $indices {
if {$colCount == 3} {
lassign $pair start end
set column [string range $line $start $end]
set value [string trimright $column]
set valueEnd [expr {$end-[string length $column]+[string length $value]}]
set newLine [string replace $line $start $valueEnd $newValue]
} elseif {$colCount > 3} {
break
}
incr colCount
}
You can change newValue to something else, or change newLine to line if you don't need the old line.
Another method uses regsub to inject a command into the replacement string, and then subst to evaluate it. This is like perl's s/pattern/code/e
set newline [subst [regsub {^((?:\s+\S+){2})(\s+\S+)} $line \
{\1[format "%*s" [string length "\2"] $newvalue]}]]
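If every whitespace-delimited token counts as a column (an assumption that differs from the two-space rule in the first answer), a simpler sketch is to ask regexp for the index pairs of all tokens and replace just the one of interest. The proc name replaceField and the sample line are hypothetical.

```tcl
# Hypothetical helper: replace the n-th (1-based) whitespace-delimited
# token in line, leaving all other characters, including spaces, untouched.
proc replaceField {line n newValue} {
    # one {start end} pair per run of non-whitespace characters
    set spans [regexp -all -inline -indices {\S+} $line]
    if {$n < 1 || $n > [llength $spans]} {
        return $line
    }
    lassign [lindex $spans [expr {$n - 1}]] start end
    return [string replace $line $start $end $newValue]
}

set sample "^  1   0.02199   0.03188"
puts [replaceField $sample 3 0.01234]   ;# prints: ^  1   0.01234   0.03188
```

The columns stay aligned only when the new value has the same length as the old one; padding with format would be needed otherwise.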

TCL String Manipulation and Extraction

I have a string xxxxxxx-s12345ab7_0_0_xx2.log and need to have an output like AB700_xx2 in TCL.
ab will be the delimiter and need to extract from ab to . (including ab) and also have to remove only the first two underscores.
Tried string trim, string trimleft and string trimright, but not much use. Is there anything like string split in TCL?
The first stage is to extract the basic relevant substring; the easiest way to do that is actually with a regular expression:
set inputString "xxxxxxx-s12345ab7_0_0_xx2.log"
if {![regexp {ab[^.]+} $inputString extracted]} {
error "didn't match!"
}
puts "got $extracted"
# ===> got ab7_0_0_xx2
Then, we want to get rid of those nasty underscores with string map:
set final [string map {"_" ""} $extracted]
puts "got $final"
# ===> got ab700xx2
Hmm, not quite what we wanted! We wanted to keep the last underscore and to up-case the first part.
set pieces [split $extracted "_"]
set final [string toupper [join [lrange $pieces 0 2] ""]]_[join [lrange $pieces 3 end] "_"]
puts "got $final"
# ===> got AB700_xx2
(The split command divides a string up into “records” by an optional record specifier — which defaults to any whitespace character — that we can then manipulate easily with list operations. The join command does the reverse, but here I'm using an empty record specifier on one half which makes everything be concatenated. I think you can guess what the string toupper and lrange commands do…)
set a "xxxxxxx-s12345ab7_0_0_xx2.log"
set a [split $a ""]
set trig 0
set extract ""
for {set i 0} {$i < [llength $a]} {incr i} {
if {"ab" eq "[lindex $a $i][lindex $a [expr {$i+1}]]"} {
set trig 1
}
if {$trig == 1} {
append extract [lindex $a $i]
}
}
set extract "[string toupper [join [lrange [split [lindex [split $extract .] 0] _] 0 end-1] ""]]_[lindex [split [lindex [split $extract .] 0] _] end]"
puts $extract
A single regexp is enough to do the trick.
set string "xxxxxxx-s12345ab7_0_0_xx2.log"
regexp {(ab)(.*)_(.*)_(.*)_(.*)\.} $string -> s1 s2 s3 s4 s5
set rstring "[string toupper $s1$s2$s3$s4]_$s5"
puts $rstring

splitting input line with varying formats in tcl with

Good afternoon,
I am attempting to write a tcl script which given the input file
input hreadyin;
input wire htrans;
input wire [7:0] haddr;
output logic [31:0] hrdata;
output hreadyout;
will produce
hreadyin(hreadyin),
htrans(htrans),
haddr(haddr[7:0]),
hrdata(hrdata[31:0]),
hready(hreadyout)
In other words, the format is:
<input/output> <wire/logic optional> <width, optional> <paramName>;
with the number of whitespaces unrestricted between each of them.
I have no problem reading from the input file and was able to put each line in a $line element. Now I have been trying things like:
set param0 [split $line "input"]
set param1 [lindex $param0 1]
But since not all lines contain "input", I am unable to get the elements I want (the name and the width, if it exists).
Is there another command in Tcl capable of doing this kind of parsing?
The regexp command is useful to find words separated by arbitrary whitespace:
while {[gets $fh line] != -1} {
# get all whitespace-separated words in the line, ignoring the semi-colon
set i [string first ";" $line]
set fields [regexp -inline -all {\S+} [string range $line 0 $i-1]]
switch -exact -- [llength $fields] {
2 - 3 {
set name [lindex $fields end]
puts [format "%s(%s)," $name $name]
}
4 {
lassign $fields - - width name
puts [format "%s(%s%s)," $name $name $width]
}
}
}
I think you should look at something like
# Compress all multiple spaces to single spaces
set compressedLine [regsub -all { +} [string trim $line] " "]
set items [split [string range $compressedLine 0 end-1] " "]
switch [llength $items] {
2 {
# Handle case where neither wire/logic nor width is specified
set inputOutput [lindex $items 0]
set paramName [lindex $items 1]
.
.
.
}
4 {
# Handle case where both wire/logic and width are specified
set inputOutput [lindex $items 0]
set wireLogic [lindex $items 1]
set width [lindex $items 2]
set paramName [lindex $items 3]
.
.
.
}
default {
# Don't know how to handle other cases - add them in if you know
puts stderr "Can't handle $line"
}
}
I hope it's not legal to have exactly one of wire/logic and width specified - you'd need to work hard to determine which is which.
(Note the [string range...] fiddle to discard the semicolon at the end of the line)
Or if you can write up a regex that catches the right data, you can do this with this:
set data [open "file.txt" r]
set output [open "output.txt" w]
while {[gets $data line] != -1} {
regexp -- {(\[\d+:\d+\])?\s*(\w+);} $line - width params
puts $output "$params\($params$width\),"
}
close $data
close $output
This one will also print the comma you have inserted in your expected output, but will insert it in the last line as well so you get:
hreadyin(hreadyin),
htrans(htrans),
haddr(haddr[7:0]),
hrdata(hrdata[31:0]),
hreadyout(hreadyout),
If you don't want that trailing comma and the file is not too large (apparently the limit is 2147483672 bytes for a list, which is what I'm going to use), you could collect the lines in a list and join them at the end:
set data [open "file.txt" r]
set output [open "output.txt" w]
set listing {} ;# Empty list
while {[gets $data line] != -1} {
    regexp -- {(\[\d+:\d+\])?\s*(\w+);} $line - width params
    lappend listing "$params\($params$width\)" ;# Append to the list instead of printing
}
puts $output [join $listing ",\n"] ;# Write everything in a single go
close $data
close $output
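The same idea can be sketched as a proc that works on a list of lines, which keeps the parsing separate from the file handling. The proc name formatPorts and the stricter pattern (which only accepts lines starting with input or output, followed by an optional wire or logic) are assumptions made for this illustration, not part of the answers above.

```tcl
# Hypothetical helper: turn port declarations into "name(name[width])"
# entries joined by ",\n", with no comma after the last entry.
proc formatPorts {lines} {
    # direction, optional wire/logic, optional [msb:lsb], name, semicolon
    set pattern {^\s*(?:input|output)\s+(?:(?:wire|logic)\s+)?(\[\d+:\d+\])?\s*(\w+)\s*;}
    set out {}
    foreach line $lines {
        if {[regexp $pattern $line -> width name]} {
            # ${name} is braced so the "(" is not read as an array index
            lappend out "${name}(${name}${width})"
        }
    }
    return [join $out ",\n"]
}

# Braces keep [7:0] from being treated as a command substitution
set decls [list {input hreadyin;} {input   wire [7:0]   haddr;} {output hreadyout;}]
puts [formatPorts $decls]
```

Lines that do not match the pattern are silently skipped here; reporting them on stderr, as one of the answers above does, may be preferable in a real script.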