TCL String Manipulation and Extraction - tcl

I have a string xxxxxxx-s12345ab7_0_0_xx2.log and need to produce AB700_xx2 from it in Tcl.
ab is the delimiter: I need to extract everything from ab (inclusive) up to the ., and remove only the first two underscores.
I tried string trim, string trimleft and string trimright, but they were not much use. Is there anything like a string split in Tcl?

The first stage is to extract the basic relevant substring; the easiest way to do that is actually with a regular expression:
set inputString "xxxxxxx-s12345ab7_0_0_xx2.log"
if {![regexp {ab[^.]+} $inputString extracted]} {
error "didn't match!"
}
puts "got $extracted"
# ===> got ab7_0_0_xx2
Then, we want to get rid of those nasty underscores with string map:
set final [string map {"_" ""} $extracted]
puts "got $final"
# ===> got ab700xx2
Hmm, not quite what we wanted! We wanted to keep the last underscore and to up-case the first part.
set pieces [split $extracted "_"]
set final [string toupper [join [lrange $pieces 0 2] ""]]_[join [lrange $pieces 3 end] "_"]
puts "got $final"
# ===> got AB700_xx2
(The split command divides a string up into “records” by an optional record specifier — which defaults to any whitespace character — that we can then manipulate easily with list operations. The join command does the reverse, but here I'm using an empty record specifier on one half which makes everything be concatenated. I think you can guess what the string toupper and lrange commands do…)
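
For reuse, the steps above could be wrapped in a small procedure. This is only a sketch: the name extractTag and its marker and n parameters are made up for illustration, and it assumes the marker contains no regular-expression metacharacters.

```tcl
# Hypothetical helper: take everything from $marker up to the first dot,
# drop the first $n underscores, and upper-case the part before the
# underscore that is kept.
proc extractTag {input {marker ab} {n 2}} {
    if {![regexp "$marker\[^.\]+" $input extracted]} {
        error "no \"$marker\" found in \"$input\""
    }
    set pieces [split $extracted "_"]
    set head [string toupper [join [lrange $pieces 0 $n] ""]]
    set tail [join [lrange $pieces [expr {$n + 1}] end] "_"]
    return ${head}_$tail
}

puts [extractTag "xxxxxxx-s12345ab7_0_0_xx2.log"]   ;# prints AB700_xx2
```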

set a "xxxxxxx-s12345ab7_0_0_xx2.log"
set a [split $a ""]
set trig 0
set extract ""
for {set i 0} {$i < [llength $a]} {incr i} {
if {"ab" eq "[lindex $a $i][lindex $a [expr {$i+1}]]"} {
set trig 1
}
if {$trig == 1} {
append extract [lindex $a $i]
}
}
set extract "[string toupper [join [lrange [split [lindex [split $extract .] 0] _] 0 end-1] ""]]_[lindex [split [lindex [split $extract .] 0] _] end]"
puts $extract

regexp alone is enough for the extraction (plus string toupper for the upper-casing):
set string "xxxxxxx-s12345ab7_0_0_xx2.log"
regexp {(ab)(.*)_(.*)_(.*)_(.*)\.} $string -> s1 s2 s3 s4 s5
set rstring "[string toupper $s1$s2$s3$s4]_$s5"
puts $rstring

Related

How to split the string and save the last word in Tcl

I have a string like this : abc0__remote_contr_major_abc__remote_hjk_klo_hcf_uio__apple_b_0_t_boo_dfs
I need to extract apple followed by everything until t_ and use that as a variable.
For example, if the string goes through the code, I expect apple_b_0_t as the output. I tried split and lindex, but it didn't work out.
set s "abc0__remote_contr_major_abc__remote_hjk_klo_hcf_uio__apple_b_0_t_boo_dfs"
set prefix [split $s "__"]
set c [lindex $prefix 4]
So I ended up doing this, and it worked, but I am wondering if there is an easier or more generic solution:
set prefix [join [lrange [split $tile_dfx_fclk "__"] 12 15] _]
I'd use a regex:
set s abc0__remote_contr_major_abc__remote_hjk_klo_hcf_uio__apple_b_0_t_boo_dfS
regexp {.*__(.*t)_.*} $s _ t
puts $t ;# => apple_b_0_t
The problem with split $s "__" is that the 2nd argument to split is not a substring: it's a set of characters, so it's just the same as split $s "_".
tcllib has a textutil::split package containing a splitx proc that splits a string on a regular expression:
package require textutil::split
namespace import textutil::split::splitx
set last [lindex [splitx $s "__"] end] ;# => apple_b_0_t_boo_dfS
# and then
set wanted [regsub {[^t]*$} $last ""] ;# => apple_b_0_t
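If tcllib is not available, the character-set behaviour of split can be checked directly and worked around with string map. The snippet below is a sketch on a shortened sample string; it assumes the ASCII unit separator (\x1f) never occurs in the data.

```tcl
set s "abc0__remote_contr__apple_b_0_t_boo_dfs"

# The separator argument is a set of characters, so these two calls
# produce the same list.
puts [expr {[split $s "__"] eq [split $s "_"]}]   ;# prints 1

# Workaround sketch: map the two-character separator onto one character
# that is assumed not to appear in the data, then split on that.
set sep \x1f
set parts [split [string map [list "__" $sep] $s] $sep]
puts [lindex $parts end]   ;# prints apple_b_0_t_boo_dfs
```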
Another approach is to find the last place that __ occurs in the string:
set idx [string last "__" $s] ;# => 52
# and then
set last [string range $s $idx+2 end] ;# => apple_b_0_t_boo_dfS
This could also be done:
set s "abc0__remote_contr_major_abc__remote_hjk_klo_hcf_uio__apple_b_0_t_boo_dfs"
set c [string range $s [string first "apple_" $s] [string last "t_" $s]]
puts $c
-> apple_b_0_t

how to split a file to list of lists TCL

I'm coding in Tcl and would like to split a file into two lists of lists.
The file contains:
(1,2) (3,4) (5,6)
(7,8) (9,10) (11,12)
and I would like to get two lists,
one for each line, each containing sub-lists of two numbers.
For example:
puts $list1 ;#-> {1 2} {3 4} {5 6}
puts [lindex $list1 0] ;#-> 1 2
puts [lindex $list2 2] ;#-> 11 12
I tried to use regexp and split, but with no success.
The idea of using regexp is good, but you'll need to do some post-processing on its output.
# This is what you'd read from a file
set inputdata "(1,2) (3,4) (5,6)\n(7,8) (9,10) (11,12)\n"
foreach line [split $inputdata "\n"] {
# Skip empty lines.
# (I often put a comment format in my data files too; this is where I'd handle it.)
if {$line eq ""} continue
# Parse the line.
set bits [regexp -all -inline {\(\s*(\d+)\s*,\s*(\d+)\s*\)} $line]
# Example results of regexp:
# (1,2) 1 2 (3,4) 3 4 (5,6) 5 6
# Post-process to build the lists you really want
set list([incr idx]) [lmap {- a b} $bits {list $a $b}]
}
Note that this is building up an array; long experience says that calling variables list1, list2, …, when you're building them in a loop is a bad idea, and that an array should be used, effectively giving variables like list(1), list(2), …, as that yields a much lower bug rate.
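One practical benefit of the array is that all lines can be walked in order without knowing how many there are. A minimal sketch, using plain foreach so it does not depend on lmap, and a slightly stricter pattern that assumes no spaces inside the parentheses:

```tcl
set inputdata "(1,2) (3,4) (5,6)\n(7,8) (9,10) (11,12)\n"
set idx 0
foreach line [split $inputdata "\n"] {
    if {$line eq ""} continue
    set pairs {}
    # Each match yields three values: the whole match, then the two numbers.
    foreach {- a b} [regexp -all -inline {\((\d+),(\d+)\)} $line] {
        lappend pairs [list $a $b]
    }
    set list([incr idx]) $pairs
}

# Visit the array elements in numeric order of their keys.
foreach n [lsort -integer [array names list]] {
    puts "line $n: $list($n)"
}
# prints:
# line 1: {1 2} {3 4} {5 6}
# line 2: {7 8} {9 10} {11 12}
```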
An alternate approach is to use a simpler regexp and then have scan parse the results. This can be more effective when the numbers aren't just digit strings.
foreach line [split $inputdata "\n"] {
if {$line eq ""} continue
set bits [regexp -all -inline {\([^()]+\)} $line]
set list([incr idx]) [lmap substr $bits {scan $substr "(%d,%d)"}]
}
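To illustrate the point about numbers that aren't plain digit strings, the same pattern can be paired with %f conversions. The sample line below is invented for the example.

```tcl
set line "(-1.5,2e3) (3,0.25)"
set pairs {}
foreach substr [regexp -all -inline {\([^()]+\)} $line] {
    # With no output variables, scan returns the converted values as a list.
    lappend pairs [scan $substr "(%f,%f)"]
}
puts $pairs
```

The \d+ pattern from the first version would not match the sign, decimal point or exponent here, which is why the looser pattern plus scan is the more convenient combination.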
If you're not using Tcl 8.6, you won't have lmap yet. In that case you'd do something like this instead:
foreach line [split $inputdata "\n"] {
if {$line eq ""} continue
set bits [regexp -all -inline {\(\s*(\d+)\s*,\s*(\d+)\s*\)} $line]
set list([incr idx]) {}
foreach {- a b} $bits {
lappend list($idx) [list $a $b]
}
}
Or, for the scan-based variant:
foreach line [split $inputdata "\n"] {
if {$line eq ""} continue
set bits [regexp -all -inline {\([^()]+\)} $line]
set list([incr idx]) {}
foreach substr $bits {
lappend list($idx) [scan $substr "(%d,%d)"]
# In *very* old Tcl you'd need this:
# scan $substr "(%d,%d)" a b
# lappend list($idx) [list $a $b]
}
}
You have an answer already, but it can actually be done a little bit simpler (or at least without regexp, which is usually a good thing).
Like Donal, I'll assume this to be the text read from a file:
set lines "(1,2) (3,4) (5,6)\n(7,8) (9,10) (11,12)\n"
Clean it up a bit, removing the parentheses and any white space before and after the data:
% set lines [string map {( {} ) {}} [string trim $lines]]
1,2 3,4 5,6
7,8 9,10 11,12
One way to do it with good old-fashioned Tcl, resulting in a cluster of variables named lineN, where N is an integer 1, 2, 3...:
set idx 0
foreach lin [split $lines \n] {
set res {}
foreach li [split $lin] {
lappend res [split $li ,]
}
set line[incr idx] $res
}
A doubly iterative structure like this (a number of lines, each having a number of pairs of numbers separated by a single comma) is easy to process using one foreach within the other. The variable res is used for storing result lines as they are assembled. At the innermost level, the pairs are split and list-appended to the result. For each completed line, a variable is created to store the result: its name consists of the string "line" and an increasing index.
As Donal says, it's not a good idea to use clusters of variables. It's much better to collect them into an array (same code, except for how the result variable is named):
set idx 0
foreach lin [split $lines \n] {
set res {}
foreach li [split $lin] {
lappend res [split $li ,]
}
set line([incr idx]) $res
}
If you have the results in an array, you can use the parray utility command to list them in one fell swoop:
% parray line
line(1) = {1 2} {3 4} {5 6}
line(2) = {7 8} {9 10} {11 12}
(Note that this is printed output, not a function return value.)
You can get whole lines from this result:
% set line(1)
{1 2} {3 4} {5 6}
Or you can access pairs:
% lindex $line(1) 0
1 2
% lindex $line(2) 2
11 12
If you have the lmap command (or the replacement linked to below), you can simplify the solution somewhat (you don't need the res variable):
set idx 0
foreach lin [split $lines \n] {
set line([incr idx]) [lmap li [split $lin] {
split $li ,
}]
}
Still simpler is to let the result be a nested list:
set lineList [lmap lin [split $lines \n] {
lmap li [split $lin] {
split $li ,
}
}]
You can access parts of the result similar to above:
% lindex $lineList 0
{1 2} {3 4} {5 6}
% lindex $lineList 0 0
1 2
% lindex $lineList 1 2
11 12
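Putting the pieces together with actual file I/O might look like the sketch below. The proc name readPairs is hypothetical, and it assumes the pairs on a line are separated by single spaces, as in the sample data. It uses foreach rather than lmap so that it also runs on Tcl 8.5.

```tcl
# Hypothetical helper: read a file of "(a,b) (c,d) ..." lines and return
# a nested list, one sub-list of pairs per line.
proc readPairs {path} {
    set fh [open $path r]
    set text [string trim [read $fh]]
    close $fh
    set result {}
    foreach lin [split [string map {( {} ) {}} $text] \n] {
        set row {}
        foreach li [split $lin] {
            lappend row [split $li ,]
        }
        lappend result $row
    }
    return $result
}
```

With a file holding the two sample lines, lindex [readPairs $path] 1 2 would give the pair 11 12.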
Documentation: array, foreach, incr, lappend, lindex, lmap (for Tcl 8.5), lmap, parray, set, split, string
This code works on Windows.
The Tcl file:
proc captureImage {} {
#open the image config file.
set configFile [open "C:/main/image_config.txt" r]
#To retrive the values from the config file.
# Loop on the result of gets so the final end-of-file read is not processed as a line.
while {[gets $configFile line] >= 0} {
set part [split $line "="]
set props([string trimright [lindex $part 0]]) [string trimleft [lindex $part 1]]
}
close $configFile
set time [clock format [clock seconds] -format %Y%m%d_%H%M%S]
set date [clock format [clock seconds] -format %Y%m%d]
#create the folder with the current date
set folderPath $props(folderPath)
append folderDate $folderPath "" $date "/"
set FolderCreation [file mkdir $folderDate]
#camera selection to capture image.
set camera "video"
append cctv $camera "=" $props(cctv)
#set the image resolution (XxY).
set resolutionX $props(resolutionX)
set resolutionY $props(resolutionY)
append resolution $resolutionX "x" $resolutionY
#set the name to the save image
set imagePrefix $props(imagePrefix)
set imageFormat $props(imageFormat)
append filename $folderDate "" $imagePrefix "_" $time "." $imageFormat
set logPrefix "Image_log"
append logFile $folderDate "" $logPrefix "" $date ".txt"
#ffmpeg command to capture image in background
exec ffmpeg -f dshow -benchmark -i $cctv -s $resolution $filename >& $logFile &
after 3000
}
captureImage
The text file contents are:
cctv=Integrated Webcam
resolutionX=1920
resolutionY=1080
imagePrefix=ImageCapture
imageFormat=jpg
folderPath=c:/test/
//camera=video=Integrated Webcam,Logitech HD Webcam C525
This code works for me; it takes its list of parameters from the text file.
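Note that split $line "=" cuts a value at every = sign, so a value that itself contains = would be truncated. A sketch of a parser that splits only on the first = follows; the name parseConfig is hypothetical, treating lines that start with // as comments is an assumption based on the last line of the sample file, and the filter key in the demo is invented.

```tcl
# Hypothetical helper: parse "key=value" lines into a dict, splitting on
# the first "=" only and skipping blank lines and "//" lines.
proc parseConfig {text} {
    set props [dict create]
    foreach line [split $text \n] {
        set line [string trim $line]
        if {$line eq "" || [string match "//*" $line]} continue
        set eq [string first "=" $line]
        if {$eq < 0} continue
        set key [string trim [string range $line 0 [expr {$eq - 1}]]]
        set value [string trim [string range $line [expr {$eq + 1}] end]]
        dict set props $key $value
    }
    return $props
}

set cfg [parseConfig "cctv=Integrated Webcam\n//camera=video=x\nfilter=scale=640:480"]
puts [dict get $cfg cctv]     ;# prints Integrated Webcam
puts [dict get $cfg filter]   ;# prints scale=640:480
```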

Remove double quotes from a 'string with comma' inside csv

I'm converting XLS to CSV. Because some columns contain commas, I'm getting CSV like this:
AMP FAN,Yes,Shichi,PON Seal,,"Brass, Silver"
AMP FAN,Yes,Shichi,PON Seal,,"Platinum, Gel"
As you can see, the last column is wrapped in double quotes because it contains a comma. I'm reading this CSV in a Tcl script and sending it to my target system, where the value gets saved with the double quotes (exactly like "Brass, Silver"). The user doesn't want the double quotes, so I want to store it as Brass, Silver. Is there any way to strip those double quotes? Below is the script I'm currently using.
while {[gets $fileIn sLine] >= 0} {
#using regex to handle multiple commas in a single column
set matches [regexp -all -inline -- {("[^\"]+"|[^,]*)(?:$|,)} $sLine]
set lsLine {}
foreach {a b} $matches {lappend lsLine $b}
set sType [lindex $lsLine 0]
set sIsOk [lindex $lsLine 1]
set sMaterial [lindex $lsLine 5]
#later i'm setting sMaterial to some attribute
}
Kindly help me.
Note: I can't use the csv package, as the user doesn't have it in their environment and I can't add it there myself.
You can remove them from the token after getting each element, like this:
while {[gets $fileIn sLine] >= 0} {
#using regex to handle multiple commas in a single column
set matches [regexp -all -inline -- {("[^\"]+"|[^,]*)(?:$|,)} $sLine]
set lsLine {}
foreach {a b} $matches {
# Remove the quotes here
lappend lsLine [string map {\" {}} $b]
}
set sType [lindex $lsLine 0]
set sIsOk [lindex $lsLine 1]
set sMaterial [lindex $lsLine 5]
#later i'm setting sMaterial to some attribute
}
% set input {AMP FAN,Yes,Shichi,PON Seal,,"Brass, Silver"}
AMP FAN,Yes,Shichi,PON Seal,,"Brass, Silver"
% regsub -all \" $input {}
AMP FAN,Yes,Shichi,PON Seal,,Brass, Silver
%
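Stripping every quote character also removes quotes that are part of the data. If that matters, a small character-by-character splitter can stand in for the csv package. This is a sketch under assumptions: the name splitCsvLine is made up, each record sits on one line, and a doubled quote inside a quoted field stands for a literal quote.

```tcl
# Hypothetical helper: split one CSV line into fields, honouring quoted
# fields and treating "" inside quotes as a literal quote character.
proc splitCsvLine {line} {
    set fields {}
    set field ""
    set inQuotes 0
    set n [string length $line]
    for {set i 0} {$i < $n} {incr i} {
        set ch [string index $line $i]
        if {$inQuotes} {
            if {$ch ne "\""} {
                append field $ch
            } elseif {[string index $line [expr {$i + 1}]] eq "\""} {
                # Doubled quote: keep one and skip the second.
                append field "\""
                incr i
            } else {
                set inQuotes 0
            }
        } elseif {$ch eq "\""} {
            set inQuotes 1
        } elseif {$ch eq ","} {
            lappend fields $field
            set field ""
        } else {
            append field $ch
        }
    }
    lappend fields $field
    return $fields
}

set lsLine [splitCsvLine {AMP FAN,Yes,Shichi,PON Seal,,"Brass, Silver"}]
puts [lindex $lsLine 5]   ;# prints Brass, Silver
```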

splitting input line with varying formats in tcl with

Good afternoon,
I am attempting to write a tcl script which given the input file
input hreadyin;
input wire htrans;
input wire [7:0] haddr;
output logic [31:0] hrdata;
output hreadyout;
will produce
hreadyin(hreadyin),
htrans(htrans),
haddr(haddr[7:0]),
hrdata(hrdata[31:0]),
hready(hreadyout)
In other words, the format is:
<input/output> <wire/logic optional> <width, optional> <paramName>;
with the number of whitespaces unrestricted between each of them.
I have no problem reading from the input file and was able to put each line in a $line element. Now I have been trying things like:
set param0 [split $line "input"]
set param1 [lindex $param0 1]
But since not all lines contain "input", I am unable to get the elements I want (the name and the width, if it exists).
Is there another command in tcl capable for doing this kind of parsing?
The regexp command is useful to find words separated by arbitrary whitespace:
while {[gets $fh line] != -1} {
# get all whitespace-separated words in the line, ignoring the semi-colon
set i [string first ";" $line]
set fields [regexp -inline -all {\S+} [string range $line 0 $i-1]]
switch -exact -- [llength $fields] {
2 - 3 {
set name [lindex $fields end]
puts [format "%s(%s)," $name $name]
}
4 {
lassign $fields - - width name
puts [format "%s(%s%s)," $name $name $width]
}
}
}
I think you should look at something like
# Compress all multiple spaces to single spaces
set compressedLine [regsub -all { +} $line " "]
set items [split [string range $compressedLine 0 end-1] " "]
switch [llength $items] {
2 {
# Handle case where neither wire/logic nor width is specified
set inputOutput [lindex $items 0]
set paramName [lindex $items 1]
.
.
.
}
4 {
# Handle case where both wire/logic and width are specified
set inputOutput [lindex $items 0]
set wireLogic [lindex $items 1]
set width [lindex $items 2]
set paramName [lindex $items 3]
.
.
.
}
default {
# Don't know how to handle other cases - add them in if you know
puts stderr "Can't handle $line"
}
}
I hope it's not legal to have exactly one of wire/logic and width specified - you'd need to work hard to determine which is which.
(Note the [string range...] fiddle to discard the semicolon at the end of the line)
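If lines with exactly one of wire/logic and width do turn up, one way to tell them apart is to look for a field that starts with [ and ends with ]. The sketch below does that; the name portConnection is hypothetical, and it assumes the width is written without internal spaces.

```tcl
# Hypothetical helper: turn one declaration line into "name(name[width])".
# The last field is taken as the name; any middle field that looks like
# "[...]" is taken as the width.
proc portConnection {line} {
    set line [string trimright [string trim $line] ";"]
    set fields [regexp -inline -all {\S+} $line]
    set name [lindex $fields end]
    set width ""
    foreach f [lrange $fields 1 end-1] {
        if {[string match {\[*\]} $f]} {
            set width $f
        }
    }
    return "$name\($name$width\)"
}

set out {}
foreach line {
    "input hreadyin;"
    "input  wire htrans;"
    "input [7:0] haddr;"
    "output logic [31:0] hrdata;"
} {
    lappend out [portConnection $line]
}
puts [join $out ",\n"]
# prints:
# hreadyin(hreadyin),
# htrans(htrans),
# haddr(haddr[7:0]),
# hrdata(hrdata[31:0])
```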
Or, if you can write a regex that captures the right data, you can do it like this:
set data [open "file.txt" r]
set output [open "output.txt" w]
while {[gets $data line] != -1} {
regexp -- {(\[\d+:\d+\])?\s*(\w+);} $line - width params
puts $output "$params\($params$width\),"
}
close $data
close $output
This one will also print the comma you have inserted in your expected output, but will insert it in the last line as well so you get:
hreadyin(hreadyin),
htrans(htrans),
haddr(haddr[7:0]),
hrdata(hrdata[31:0]),
hready(hreadyout),
If you don't want it and the file is not too large (apparently the limit is 2147483672 bytes for a list, which I'm gonna use), you could use a group like this:
set data [open "file.txt" r]
set output [open "output.txt" w]
set listing "" ;# Empty list
while {[gets $data line] != -1} {
regexp -- {(\[\d+:\d+\])?\s*(\w+);} $line - width params
lappend listing "$params\($params$width\)" ;# Appending to list instead
}
puts $output [join $listing ",\n"] ;# Join all in a single go
close $data
close $output

TCL Program that Compare String

I'm trying to create a program in which the first and last characters are compared, then the second and second-to-last, then the third and third-to-last, and so on; if any pair of characters match, both are converted to upper case.
Example:
Please enter a text: Hello Philippines
finals: HEllo PhIlippinEs
I can't create any piece of code, I'm stuck with
puts "Please enter text:"
set myText [gets stdin]
string index $myText 4
Can someone help me please?
This procedure will also capitalize the first i in Phillipines because it's equidistant from the start and the end of the string.
proc compare_chars {str} {
set letters [split $str ""]
for {set i [expr {[llength $letters] / 2}]} {$i >= 0} {incr i -1} {
set a [lindex $letters $i]
set b [lindex $letters end-$i]
if {$a eq $b} {
lset letters $i [set L [string toupper $a]]
lset letters end-$i $L
}
}
join $letters ""
}
puts [compare_chars "Hello Phillipines"]
# outputs => HEllo PhIllipinEs
The simplest way to code this is to use foreach over the split-up characters. (It's formally not the most efficient, but it's very easy to code correctly.)
puts "Please enter text:"
set myText [gets stdin]
set chars [split $myText ""]
set idx 0
foreach a $chars b [lreverse $chars] {
if {[string equal -nocase $a $b]} {
lset chars $idx [string toupper $a]
}
incr idx
}
set output [join $chars ""]
puts $output
Note that the foreach is iterating over a copy of the list; there are no problems with concurrent modification. In fact, the only vaguely-tricky part from a coding perspective is actually that we need to keep track of the index to modify, in the idx variable above.
With Tcl 8.6 you could write:
set chars [split $myText ""]
set output [join [lmap a $chars b [lreverse $chars] {
expr {[string equal -nocase $a $b] ? [string toupper $a] : $a}
}] ""]
That does depend on having the new lmap command though.
If you're really stuck with 8.3 (it's unsupported and has been so for years, so you should be prioritizing upgrading to something more recent) then try this:
set chars [split $myText ""]
set idx [llength $chars]
set output {}
foreach ch $chars {
if {[string equal -nocase $ch [lindex $chars [incr idx -1]]]} {
append output [string toupper $ch]
} else {
append output $ch
}
}
All the features this uses were present in 8.3 (though some were considerably slower than in later versions).
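For comparison, the same idea can be written with string index alone, without splitting into a list. This is a sketch: the proc name mirrorUpper is made up, and characters that do not match their mirror are left unchanged, which is what the expected output in the question shows.

```tcl
# Hypothetical helper: upper-case every character that equals (ignoring
# case) the character at the mirrored position in the string.
proc mirrorUpper {text} {
    set last [expr {[string length $text] - 1}]
    set out ""
    for {set i 0} {$i <= $last} {incr i} {
        set a [string index $text $i]
        set b [string index $text [expr {$last - $i}]]
        if {[string equal -nocase $a $b]} {
            append out [string toupper $a]
        } else {
            append out $a
        }
    }
    return $out
}

puts [mirrorUpper "Hello Philippines"]   ;# prints HEllo PhIlippinEs
```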