How to split string by numerics - tcl

I have tried to split the string, but it still fails.
set strdata "34a64323R6662w0332665323020346t534r66662v43037333444533053534a64323R6662w0332665323020346t534r66662v430373334445330535"
puts [split $strdata "3334445330535"] ;#<---- this command does not work
The result I need is:
{34a64323R6662w0332665323020346t534r66662v43037} {34a64323R6662w0332665323020346t534r66662v43037}

The split command's optional second argument is interpreted as a set of characters to split on, so it really isn't going to do what you want. However, there are other approaches. One of the simpler methods of doing what you want is to use string map to convert the character sequence into a character that isn't in the input data (Unicode is full of those!) and then split on that:
set strdata "34a64323R6662w0332665323020346t534r66662v43037333444533053534a64323R6662w0332665323020346t534r66662v430373334445330535"
set splitterm "3334445330535"
set items [split [string map [list $splitterm "\uFFFF"] $strdata] "\uFFFF"]
foreach i $items {
    puts "==> $i"
}
# ==> 34a64323R6662w0332665323020346t534r66662v43037
# ==> 34a64323R6662w0332665323020346t534r66662v43037
# ==>
Note that there is an empty-string list element at the end (it shows up as {} if you print the whole list) because that's the string that came after the last split element. If you don't want that, add a string trimright between the string map and the split:
# Doing this in steps because the line is a bit long otherwise
set mapped [string map [list $splitterm "\uFFFF"] $strdata]
set trimmed [string trimright $mapped "\uFFFF"]
set items [split $trimmed "\uFFFF"]
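If you would rather not rely on a sentinel character being absent from the data, here is a minimal sketch of the same split done with string first in a loop. The proc name splitOnString is my own invention, not a built-in command, and it assumes the separator is a literal string (no pattern matching):

```tcl
# Hypothetical helper: split $str on every occurrence of the literal string $sep.
proc splitOnString {str sep} {
    set sepLen [string length $sep]
    # An empty separator cannot be searched for; return the input as one element
    if {$sepLen == 0} {
        return [list $str]
    }
    set result {}
    set start 0
    # Walk through the string, collecting the text between separators
    while {[set idx [string first $sep $str $start]] >= 0} {
        lappend result [string range $str $start [expr {$idx - 1}]]
        set start [expr {$idx + $sepLen}]
    }
    # Whatever follows the last separator (possibly the empty string)
    lappend result [string range $str $start end]
    return $result
}

puts [splitOnString "a--b--c" "--"]
```

Like the plain string map version, this keeps a trailing empty element when the input ends with the separator, so use lrange $items 0 end-1 if you want to drop it.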

The split command doesn't work like that; see the documentation.
Try making the data string into a list like this:
regsub -all 3334445330535 $strdata " "
i.e. replacing the delimiter with a space.
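To make the list idea concrete, here is a small sketch using the data from the question. It leans on two assumptions that hold for this input but not in general: the data contains no whitespace, and it contains no characters that are special in Tcl list syntax (braces, quotes, backslashes).

```tcl
set strdata "34a64323R6662w0332665323020346t534r66662v43037333444533053534a64323R6662w0332665323020346t534r66662v430373334445330535"

# Without a result variable, regsub returns the substituted string
set spaced [regsub -all 3334445330535 $strdata " "]

# List commands ignore the trailing space left by the final delimiter
puts [llength $spaced]   ;# 2
puts [lindex $spaced 0]  ;# 34a64323R6662w0332665323020346t534r66662v43037
```

Because surrounding whitespace is ignored when a string is read as a list, this variant does not produce the empty final element that split gives you.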
Documentation: regsub, split

Related

Removing everything before a certain character in TCL

Input string : 4567-ABC
I want to remove everything before "-" in the string so that Output will be ABC.
Output: ABC
If you want to avoid regular expressions:
set string 4567-ABC
set output [lindex [split $string "-"] 1]
The split command takes a string and a set of split characters as its arguments and returns a list; lindex then picks out the second element.
string last is useful here:
set string 4567-ABC
set idx [string last "-" $string]
set wanted [string range $string $idx+1 end]
Or without the intermediate variable:
set wanted [string range $string [string last "-" $string]+1 end]
That even works if the original string does not contain any hyphens.
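As a rough sketch of how the choice between string last and string first matters when the input has more than one hyphen, here are two small wrappers. The proc names afterLast and afterFirst are made up for this example, they assume a single-character delimiter, and the index arithmetic needs Tcl 8.5 or later:

```tcl
# Hypothetical helpers; $ch is assumed to be a single character.
proc afterLast {str ch} {
    # string last returns -1 when $ch is absent, so the range starts at 0
    return [string range $str [string last $ch $str]+1 end]
}

proc afterFirst {str ch} {
    return [string range $str [string first $ch $str]+1 end]
}

puts [afterLast  "4567-ABC-DEF" "-"]   ;# DEF
puts [afterFirst "4567-ABC-DEF" "-"]   ;# ABC-DEF
puts [afterLast  "ABC" "-"]            ;# ABC
```

Which one you want depends on whether "everything before the character" means the first or the last occurrence.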

list searching to find exact matches using TCL lsearch

I have a list and need to search some strings in this list. My list is like following:
list1 = {slt0_reg_11.CK slt0_reg_11.Q slt0_reg_12.CK slt0_reg_12.Q}
I am trying to use lsearch to check if above list includes some strings or not. Strings are like:
string1 = {slt0_reg_1 slt0_reg_1}
I am doing the following to check this:
set listInd [lsearch -all -exact -nocase -regexp $list1 $string1]
This command gives the indexes of the elements of list1 that match $string1, which is what I want. However, the problem is that if I search for a string like slt0_reg_1, the above command also identifies the first two elements of the list (slt0_reg_11.CK slt0_reg_11.Q), because they contain the string I am searching for.
How can I make exact search?
It sounds like you want to add word-boundary constraints (\y) to your RE. (Don't use -exact and -regexp at the same time; only one of those modes can be used on any run because they change the comparison engine used.) A little care must be taken because we can't enclose the RE in braces, as we want variable substitution within it.
set list1 {slt0_reg_11.CK slt0_reg_11.Q slt0_reg_12.CK slt0_reg_12.Q}
foreach str {slt0_reg_11 slt0_reg_1} {
    set matches [lsearch -all -regexp $list1 "\\y$str\\y"]
    puts "$str: $matches"
}
Prints:
slt0_reg_11: 0 1
slt0_reg_1:
If you want to compare your list for an exact match of the part before the dot against another list, you may be better off using lmap:
set index -1
set listInd [lmap str $list1 {
    incr index
    if {[lindex [split $str .] 0] ni $string1} continue
    set index
}]
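Here is the same lmap idea wrapped in a proc so it can be tried directly. This is only a sketch: the proc name exactPrefixIndices is invented for the example, it assumes the part to compare is everything before the first dot, and lmap needs Tcl 8.6 or later.

```tcl
# Hypothetical helper: return the indices of the elements of $items whose
# part before the first "." is exactly one of the strings in $names.
proc exactPrefixIndices {items names} {
    set index -1
    # The value of lmap (the list of collected indices) is the proc's result
    lmap item $items {
        incr index
        # Skip elements whose prefix is not in the list of wanted names
        if {[lindex [split $item .] 0] ni $names} continue
        set index
    }
}

set list1 {slt0_reg_11.CK slt0_reg_11.Q slt0_reg_12.CK slt0_reg_12.Q}
puts [exactPrefixIndices $list1 {slt0_reg_12}]   ;# 2 3
puts [exactPrefixIndices $list1 {slt0_reg_1}]    ;# (empty: no exact match)
```

Since the comparison is a plain string test with ni, there is no need to worry about regular-expression metacharacters in the names being searched for.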

How to remove a single letter/number

I have single letters and numbers in a variable that I would like to remove.
example inputs:
USA-2019-1-aoiwer
USA-A-jowerasf
BB-a_owierlasdf-2019
flsfwer_5_2015-asfdlwer
desired outputs:
USA-2019--aoiwer
USA--jowerasf
BB-_owierlasdf-2019
flsfwer__2015-asfdlwer
my code:
bind pub "-|-" !aa proc:aa
proc proc:aa { nick host handle channel arg } {
set line [lindex $arg 0]
set line [string map {[a-z] """} $line]
set line [string map {[0-9] """} $line]
putnow "PRIVMSG $channel :$line"
}
Unfortunately, that does not work and I have no other ideas.
Regards
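string map performs literal substitution, not pattern matching, so [a-z] and [0-9] are treated as literal text rather than character classes; and if it did accept patterns, it would remove all the lowercase letters and digits, not just the isolated ones. You also have unbalanced quotes, which cause an error when the proc runs.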
I would recommend using regsub. The hard part, however, would be to get a proper expression to do the task. I will suggest the following:
bind pub "-|-" !aa proc:aa
proc proc:aa { nick host handle channel arg } {
    set line [lindex $arg 0]
    regsub -nocase -all {([^a-z0-9]|\y)[a-z0-9]([^a-z0-9]|\y)} $line {\1\2} line
    putnow "PRIVMSG $channel :$line"
}
Basically ([^a-z0-9]|\y) matches a character that is non alphanumeric, or a word boundary (which will match at the beginning of a sentence for example if it can, or at the end of a sentence), and stores it (this is the purpose of the parens).
The matched groups are stored in order starting with 1, so in the replace portion of regsub, I'm placing the parts that shouldn't be replaced back where they were.
The above should work fine.
You could technically go a little fancier with a slightly different expression:
regsub -nocase -all {([^a-z0-9]|\y)[a-z0-9](?![a-z0-9])} $line {\1} line
Which uses a negative lookahead ((?! ... )).
Anyway, if you do want to get more in depth, I recommend reading the manual on regular expression syntax.
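If you want to try the first expression outside the bot, here is a minimal sketch that applies it to the example inputs from the question. The proc name stripSingles is made up for this example and is not part of eggdrop or Tcl:

```tcl
# Hypothetical helper: remove isolated single letters/digits from $line,
# keeping the surrounding separators.
proc stripSingles {line} {
    regsub -nocase -all {([^a-z0-9]|\y)[a-z0-9]([^a-z0-9]|\y)} $line {\1\2} line
    return $line
}

foreach input {
    USA-2019-1-aoiwer
    USA-A-jowerasf
    BB-a_owierlasdf-2019
    flsfwer_5_2015-asfdlwer
} {
    puts [stripSingles $input]
}
```

Run against those four inputs, this should print the desired outputs listed in the question.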

Remove prefix substring from string

I have a string abc.def.ghi.j and I want to remove abc. from that, so that I have def.ghi.j.
1) What would be the best approach to remove such a prefix which has a specific pattern?
2) Since in this case, abc is coincidentally the prefix, that probably makes things easier. What if we wanted abc.ghi.j as the output?
I tried it with the split method like this
set name abc.def.ghi.j
set splitVar [split $name {{abc.}} ]
The problem is that it splits on each of a, b, c and . separately instead of on abc. as a whole.
Well, there are a few ways, but the main ones are using string replace, regsub, string map, or split-lreplace-join.
We probably ought to be a bit careful, because we must first check that the prefix really is a prefix. Fortunately, string equal has a -length option that makes that easy:
if {[string equal -length [string length $prefix] $prefix $string]} {
    # Do the replacement
}
Personally, I'd probably use regsub but then I'm happy with using RE engine tricks.
Using string replace
set string [string replace $string 0 [string length $prefix]-1]
# Older versions require this instead:
# set string [string replace $string 0 [expr {[string length $prefix]-1}]]
Using regsub
# ***= is magical and says "rest of RE is simple plain text, no escapes"
regsub ***=$prefix $string "" string
Using string map
# Requires cunning to anchor to the front; \uffff is unlikely in any real string
set string [string map [list \uffff$prefix ""] \uffff$string]
Using split…join
This is about what you were trying to do. It depends on the . being a sort of separator.
set string [join [lrange [split $string "."] 1 end] "."]
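Putting the prefix check and the removal together, a minimal sketch might look like the following. The proc names removePrefix and removeField are invented for this example; removeField covers the second part of the question (dropping def to get abc.ghi.j) and assumes the string really is a separator-delimited sequence of fields.

```tcl
# Hypothetical helper: strip $prefix from the front of $str, but only if
# $str really starts with it.
proc removePrefix {str prefix} {
    set len [string length $prefix]
    if {[string equal -length $len $prefix $str]} {
        return [string range $str $len end]
    }
    return $str
}

# Hypothetical helper: drop the field at position $index (0-based) from a
# $sep-separated string, using split, lreplace and join.
proc removeField {str index {sep .}} {
    return [join [lreplace [split $str $sep] $index $index] $sep]
}

puts [removePrefix abc.def.ghi.j abc.]   ;# def.ghi.j
puts [removeField abc.def.ghi.j 1]       ;# abc.ghi.j
```

Using string range from the prefix length onward avoids the index arithmetic that string replace needs on older Tcl versions.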

TCL command - string trim

I was using the command 'string trimright' to trim my string but I found that this command trims more than required.
My string is "dcssss.dcsss". If I use the string trimright command to trim the last few characters, ".dcsss", it trims the entire string. How can I deal with this?
Command:
set a [string trimright "dcssss.dcsss" ".dcsss"]
puts $a
Intended output:
dcssss
Actual output
""
The string trimright command treats its (optional) last argument as a set of characters to remove (and so .dcsss is the same as sdc. to it), just like string trim and string trimleft do; indeed, string trim is just like using both string trimright and string trimleft in succession. This makes it unsuitable for what you are trying to do; to remove a suffix if it is present, you can use several techniques:
# It looks like we're stripping a filename extension...
puts [file rootname "dcssss.dcsss"]
# Can use a regular expression if we're careful...
puts [regsub {\.dcsss$} "dcssss.dcsss" {}]
# Do everything by hand...
set str "dcssss.dcsss"
if {[string match "*.dcsss" $str]} {
    set str [string range $str 0 end-6]
}
puts $str
If what you're doing really is filename manipulation, as it appears to be, do use the first of these options. The file command has some really useful subcommands for working with filenames in a cross-platform manner.
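For the general (non-filename) case, the by-hand option can be wrapped up so the suffix length is not hard-coded. This is just a sketch, and the proc name removeSuffix is made up for the example; it compares the tail of the string literally, so glob characters in the suffix are not a concern.

```tcl
# Hypothetical helper: strip $suffix from the end of $str, but only if
# $str really ends with it.
proc removeSuffix {str suffix} {
    set len [string length $suffix]
    # Compare the last $len characters of $str against the suffix
    if {$len > 0 && [string range $str end-[expr {$len - 1}] end] eq $suffix} {
        return [string range $str 0 end-$len]
    }
    return $str
}

puts [removeSuffix "dcssss.dcsss" ".dcsss"]   ;# dcssss
puts [removeSuffix "dcssss.txt" ".dcsss"]     ;# dcssss.txt
```

Contrast this with string trimright "dcssss.dcsss" ".dcsss", which returns the empty string because every character of the input is in the set of characters to trim.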