regular expression issue in TCL for checking - tcl

In TCL, I have declared an array sstr with some patterns and I would like to match that patterns with the cryplist. If I found that match, I am displaying with array key and the matched list member. But the below program is not working. Hope I did some mistake in the declaration of regular expression.
#!/bin/tclsh
set cryplist [list "$:adzctg-cm20decadt/sr" "$:yyzpty-cm23febadt/sr" "dc*aed1740.0*gbp" "dc*ars1*usd" "dc*gbp10.00*/r" "d|t|lbb/den" "d|t|ordphx"]
array set sstr {
z "dc*[a-z]{3}*"
dl "d\$*[0-9]"
fd "\$:[a-z]{6}"
md "d|t|[a-z]{3}\/[a-z]{3}"
ms "d|t|[a-z]{6}"
}
foreach i $cryplist {
puts "------------- $i --------------"
foreach {n str} [array get sstr] {
puts "$n -> $str"
if { [regexp {$str} $i ] } {
puts "============= $n -> $i ================"
break
}
}
}

The problem is that you're using regexp {$str} $i, which makes the regular expression be the literal $str and not the contents of the str variable. Change to regexp -- $str $i and it should work; the -- says “no further options” (just for safety) and the unquoted $str reads from the variable for that argument (what you want).

Related

Is there a simple way to parse a line of Tcl into its command and its arguments (not just splitting by whitespace)

Suppose I have a string which is also a Tcl command.
set line {lsort -unique [list a b c a]}
How can I convert this string into a list equivalent to this?
{
{lsort}
{-unique}
{[list a b c a]}
}
Because of whitespace inside the square brackets, I can't just use lindex.
For example:
> lindex $line 2
--> [list
The reason I'm asking is because I have a large Tcl script that I want to parse and re-write. I would like certain lines in the re-written script to have swapped argument order or some numerical arguments scaled by a factor.
I know I could parse the string character by character, keeping track of {}, [], and " characters, but this feels like re-inventing something that might already exist. I've been looking at the info and interp commands but couldn't find anything there.
I used info complete successfully in this proc.
proc command_to_list {command} {
# split by whitespace
set words [regexp -all -inline {\S+} $command]
set spaces [regexp -all -inline {\s+} $command]
set output_list [list]
set buffer ""
foreach word $words space $spaces {
append buffer $word
if {[info complete $buffer]} {
lappend output_list $buffer
set buffer ""
} else {
append buffer $space
}
}
return $output_list
}
This proc will group whitespace separated 'words' until they have no unmatched curlies, double quotes, or square brackets. Whitespace is preserved inside of matching pairs of curlies, double quotes or square brackets.
> set command {foreach {k v} [list k1 v1 k2 v2] {puts "$k $v"}}
> foreach word [command_to_list $command] {puts $word}
foreach
{k v}
[list k1 v1 k2 v2]
{puts "$k $v"}

Replace same strings with swap difference?

To manipulate Strings in Tcl, we use the string command.
If you need to replace comma:
set value { 10.00 }
puts [string map -nocase { . , } $value]
# Return: 10,00
We can replace several strings:
set text "This is a replacement test text"
puts [string map -nocase { e E s S a A } $text]
# Returns: THIS IS A TEXT OF REPLACEMENT TEST
Of course, we can replace words:
set text "This is a replacement test text"
puts [string map -nocase {test TEST a {second}} $text]
# Returns: This is the second replacement TEST text.
So far so good!
But one question that does not want to be silent is .. How to replace more than one identical occurrence in the sentence, giving a DIFFERENT substitution for each of them?
For example:
set time {10:02:12}
puts [string map -nocase { { : +} {: =} } $time]
I would like this result: 10 + 02 = 12
proc seqmap {str match args} {
set rc $str
foreach l [lreverse [regexp -all -indices -inline ***=$match $str]] \
replacement [lreverse $args] {
set rc [string replace $rc {*}$l $replacement]
}
return $rc
}
seqmap 10:02:12 : { + } { = }
=> 10 + 02 = 12
I'm using lreverse in case the replacement has a different length than the string it replaces. The indices would be off if the replacements were done from left to right.
The ***= is used to avoid special treatment of wildcard characters in the match string.
Of course, things get a lot more complicated if you want to handle the case where the number of occurrences doesn't match the number of provided substitutions. And even more if you want to replace several different strings.
This version handles the complications mentioned above:
proc seqmap {map str} {
# Transform the map into a dict with each key containing a list of replacements
set mapdict {}
foreach {s r} $map {dict lappend mapdict $s $r}
# Build a map where each key maps to a unique tag
# At the same time build a dict that maps our tags to the replacements
# First map the chosen tag character in case it is present in the string
set newmap {# #00}
set mapdict [dict map {s r} $mapdict {
lappend newmap $s [set s [format #%02d [incr num]]]
set r
}]
# Add the tag character to the dict so it can be mapped back
dict set mapdict #00 #
# Map the tags into the string
set rc [string map $newmap $str]
# Locate the positions where the tags ended up
set match [regexp -all -indices -inline {#\d\d} $rc]
# Create a list of replacements matching the tags
set replace [lmap l $match {
# Extract the tag
set t [string range $rc {*}$l]
# Obtain a replacement for this tag
set s [lassign [dict get $mapdict $t] r]
# Return the used replacement to the end of the list
dict set mapdict $t [linsert $s end $r]
# Add the replacement to the list
set r
}]
# Walk the two lists in reverse order, replacing the tags with the selected replacements
foreach l [lreverse $match] r [lreverse $replace] {
set rc [string replace $rc {*}$l $r]
}
# Done
return $rc
}
You call it just like you would string map, so with a key-value mapping and the string to perform the replacements on. Any duplicated keys specify the subsequent values to be substituted for each occurrence of the key. When the list is exhausted it starts over from the beginning.
So puts [seqmap {: + : = : *} 10:02:12] => 10+02=12
And puts [seqmap {: + : =} 10:02:12:04:16] => 10+02=12+04=16
As presented, the command can handle up to 99 unique keys. But it can easily be updated if more are needed.

How to find multiple sub string patterns in a string in TCL

I am trying to find multiple string patterns in a string in TCL. I cannot get the correct and optimized way to do that.
I have tried some code and it is not working
I have to find -h ,-he,-hel ,-help in the string -help
set args "-help"
set res1 [string first "-h" $args]
set res2 [ string first -he $args]
set res3 [string first -hel $args]
set res4 [string first "-help" $args"]
if { $res1 == -1 || $res2 || $res3 || $res4 } {
puts "\n string not found"
} else {
puts "\n string found"
}
how to use regexp here I am not sure , so need some inputs.
The expected output is
This is a case where using regexp is easier. (Asking if a string is a prefix of -help is a separate problem.) The trick here is to use ? and (…) (or rather (?:…) which is the non-capturing version) in the RE and you must use the -- option because the RE begins with a -:
if {[regexp -- {-h(?:e(?:lp?)?)?} $string]} {
puts "Found the string"
} else {
puts "Did not find the string"
}
If you want to know what string you actually found, add in a variable to pick up the overall match:
if {[regexp -- {-h(?:e(?:lp?)?)?} $string matched]} {
puts "Found the string '$matched'"
} else {
puts "Did not find the string"
}
If you instead want the indices where it matched, you need an extra option:
if {[regexp -indices -- {-h(?:e(?:lp?)?)?} $string match]} {
puts "Found the string at $match"
} else {
puts "Did not find the string"
}
If you were instead interested in whether the string was a prefix of -help, you instead should do:
if {[string equal -length [string length $string] $string "-help"]} {
puts "Found the string"
} else {
puts "Did not find the string"
}
Many uses of this sort of thing are actually doing command line parsing. In that case, the tcl::prefix command is very useful. For example, tcl::prefix match finds the entry in a list of options that a string is a unique prefix of and generates an error message when things are ambiguous or simply don't match; the result can be switched on easily:
set MY_OPTIONS {
-help
-someOtherOpt
}
switch [tcl::prefix match $MY_OPTIONS $string] {
-help {
puts "I have -help"
}
-someOtherOpt {
puts "I have -someOtherOpt"
}
}

TCL character class

According to the TCL man page for "string match" I should be able to do a character class. A simple example:
set rev revA
if { [string match "revA*" $rev } {
puts "revA"
} elseif {[string match "revB" $rev]} {
puts "revB"
} elseif {[string match "rev[CD]" $rev]} {
puts "revC or D"
} else {
puts "none"
}
Without the "rev[CD]" section it works fine but as shown above I get:
% tclsh tmp.tcl
missing close-bracket
What am I missing?
The real error is that you are actually missing a close bracket in the first if
if { [string match "revA*" $rev } {
# .............................^
Don't forget that [brackets] also signify Tcl's command substitution, and command substitution happens within double quotes. You need to either escape the open bracket or use different quotes:
} elseif {[string match "rev\[CD]" $rev]} {
} elseif {[string match {rev[CD]} $rev]} {
However, it's more efficient in this case, instead of cascading if-elseifs, to use switch
switch -glob -- $rev {
"revA*" {puts "revA"}
"revB" {puts "revB"}
{rev[CD]} {puts "revC or D"}
default {puts "none"}
}

how to order a tcl procedure to use variables?

I'm trying to have a very generic function and ordering it to use variables it come across from the outside.
I've tried the following (simplified code), but with no use:
set line "found \$find1 at \$find2"
do_search $line
proc do_search {line} {
...
if {[regexp $exp $string match find1 find2} {
puts "$line"
}
However all I get is: found $find1 at $find2 or, if I don't use the \ before the $find, the value of find before I call the function.
Given that this regexp is part of a while loop while parsing a file, I can't use the values after the proc is called.
Any ideas how can it be done?
For your exact style, you want subst:
if {[regexp $exp $string match find1 find2} {
puts [subst $line]
}
But you might consider using format too:
set fmt "found %s at %s"
do_search $fmt
proc do_search {fmt} {
...
if {[regexp $exp $string match find1 find2} {
puts [format $fmt $find1 $find2]
}