I'm trying to append two strings in Tcl. I'm reading from a CSV file and setting the values to variables, which I will then assign in my application. I tried the following:
set vMyvalue [lindex $lsLine 17]
append vMyvalue " [lindex $lsLine 18]"
It gives me the expected result. For example, if I have the values 250 and km in the 17th and 18th positions in the CSV, I get
250 km
But the problem is that when the 17th and 18th fields are empty, it still adds the space, and my application won't allow me to assign a space for that value. How can I resolve this? I just started working in Tcl and am not aware of many functions.
I think the most intuitive way to handle cases like this one, if you don't know a function that does it (for example, when you are joining two strings with some character but want something different done if either of them is an empty string), is to use if. In this case:
if {$vMyvalue eq " "} {set vMyvalue ""}
If you want to make your code somewhat shorter, you can make use of the functions lrange (list range), join and string:
set vMyvalue [string trim [join [lrange $lsLine 17 18] " "]]
lrange returns a list of the elements of $lsLine from index 17 to index 18 inclusive, then join joins those elements with a space, and finally string trim removes any leading and trailing spaces (removing the space completely if it is the only character in the string).
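To see how that one-liner behaves when fields are empty, here is a minimal sketch. The helper name joinFields and the sample rows are made up for illustration, and indices 1 and 2 stand in for 17 and 18.

```tcl
# Hypothetical helper: join a range of fields with spaces and trim the result,
# so that all-empty fields yield an empty string instead of a lone space.
proc joinFields {fields first last} {
    return [string trim [join [lrange $fields $first $last] " "]]
}

# Made-up sample rows; indices 1 and 2 play the role of 17 and 18.
set full  [list id 250 km]
set empty [list id {} {}]
set half  [list id 250 {}]

puts "<[joinFields $full 1 2]>"   ;# prints <250 km>
puts "<[joinFields $empty 1 2]>"  ;# prints <>
puts "<[joinFields $half 1 2]>"   ;# prints <250>
```

Note that the trim also removes the trailing space when only the second field is empty, which the plain if test on " " would not catch.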
There are several ways to do this. The minimum modification to the code you already have is probably to trim the result. Trim removes leading and trailing whitespace, and if the string is only whitespace it trims it to an empty string. So:
set vMyvalue [string trim $vMyvalue]
I have this TCL expression:
[string toupper [join [lrange [file split [value [topnode].file]] 1 1]]]
This retrieves companyName value from c:/companyName... and I need to split that value before the first capital letter into Company Name. Any ideas?
Thanks in advance.
That's rather more in one word than I would consider a good idea. It makes the whole thing quite opaque! Let's split it up.
Firstly, I would expect the base company name to be better retrieved with lindex from the split filename.
set companyName [lindex [file split [value [topnode].file]] 1]
Now, we need to process that to get the human-readable version out of it. Alas, that's going to be a bit difficult without knowing what's been done to it, but if we use as our example fooBarBoo_grill then we can see what we can do. First, we get the pieces with some regular expressions (this part might need tweaking if there are non-ASCII characters involved, or if certain critical characters need special treatment):
# set companyName "fooBarBoo_grill"
set pieces [regexp -all -inline {[a-z]+|[A-Z][a-z]*} $companyName]
# pieces = foo Bar Boo grill
Next, we need to capitalise. I'll assume you're using Tcl 8.6 and so have lmap as it is perfect for this task. The string totitle command has been around for a very long time.
set pieces [lmap word $pieces {string totitle $word}]
# pieces = Foo Bar Boo Grill
That list might need a bit more tweaking, or it might be OK as it is. An example of tweaking that might be necessary is if you've got an Irish name like O'Hanrahan, or if you need to insert a comma before and period after Inc.
Finally, we properly ought to set companyName [join $pieces] to get back a true string, but that doesn't have a noticeable effect with a list of words made purely out of letters. Also, more complex joins with regular expressions might be needed if you've done insertion of prefixing punctuation (the , Inc. case).
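Put together, the steps above might look like the following sketch. The proc name humanizeName is hypothetical, and it assumes Tcl 8.6 (for lmap) and ASCII-only names.

```tcl
# Hypothetical helper combining the steps above; needs Tcl 8.6 for lmap.
proc humanizeName {name} {
    # Runs of lowercase letters, or one capital followed by lowercase letters.
    set pieces [regexp -all -inline {[a-z]+|[A-Z][a-z]*} $name]
    # Capitalise the first letter of every piece.
    set pieces [lmap word $pieces {string totitle $word}]
    # Turn the list back into a plain string.
    return [join $pieces]
}

puts [humanizeName fooBarBoo_grill]   ;# prints: Foo Bar Boo Grill
puts [humanizeName companyName]       ;# prints: Company Name
```

Any special cases (the O'Hanrahan or Inc. tweaks mentioned above) would still need their own handling between the lmap and the join.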
If I was doing this for real, I'd try to have the proper company name expressed directly elsewhere rather than relying on the filename. Much simpler to get right!
To begin with, try using
lindex [file split [value [topnode].file]] 1
The lrange command will return a list, which might cause problems with some directory names. The join command should be pointless if you don't use lrange, and string toupper removes the information you need to do the operation you want to do.
To split before uppercase letters, you can use repetitive matches of either (?:[a-z]+|[A-Z][a-z]+) (ASCII / English alphabet letters only) or (?:[[:lower:]]+|[[:upper:]][[:lower:]]+) (any Unicode letters).
% regexp -all -inline {(?:[a-z]+|[A-Z][a-z]+)} camelCaseWord
camel Case Word
Use string totitle to change the first letter of the first word to upper case.
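As a rough sketch of those two steps together, using the Unicode-aware pattern and lset so that it does not depend on lmap (the variable name words is just for illustration):

```tcl
# Split before each uppercase letter (Unicode-aware character classes).
set words [regexp -all -inline {(?:[[:lower:]]+|[[:upper:]][[:lower:]]+)} companyName]

# Capitalise only the first word; the others already start with a capital.
lset words 0 [string totitle [lindex $words 0]]

puts [join $words]   ;# prints: Company Name
```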
Documentation: file, lindex, regexp, string, Syntax of Tcl regular expressions
I want 10576.53012.46344.35174 from string
"CompositionClassification|CC000003|01|10576.53012.46344.35174"
I have the index of the last occurrence of |. How do I get the complete 10576.53012.46344.35174 substring after the last |?
I'm not familiar with Tcl, so please suggest a solution :)
If you know the index of the first character you want, and you want from there to the end, you use:
set theSubstring [string range $theString $idx end]
However in this case I'd use split and lindex, since it looks like a simple delimited list:
set theSubstring [lindex [split $theString "|"] end]
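Here is a small sketch of both approaches side by side. It assumes the index you have is that of the | itself, in which case the range has to start one character later; the variable names viaRange and viaSplit are only for illustration.

```tcl
set theString "CompositionClassification|CC000003|01|10576.53012.46344.35174"

# Index of the last "|"; string last returns -1 when there is none.
set idx [string last "|" $theString]

# Start one character past the delimiter so that it is not included.
set viaRange [string range $theString [expr {$idx + 1}] end]

# Same result without tracking an index at all.
set viaSplit [lindex [split $theString "|"] end]

puts $viaRange   ;# prints 10576.53012.46344.35174
puts $viaSplit   ;# prints 10576.53012.46344.35174
```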
In TCL, I need to split an ipv6 address and port combination in the format [fec1::10]:80 to fec1::10 and 80.
Please suggest a way to do it.
Thanks!
(In the examples below I assume that the address will be subjected to further processing (expansion, etc) because there are a lot of forms that it can take: hence, in this preliminary stage I treat it simply as a string of any character rather than groups of hex digits separated by colons. The ip package mentioned by kostix is excellent for processing the address, just not for separating the address from the port number.)
Given the variable
set addrport {[fec1::10]:80}
There are several possible ways, including brute-force regular expression matching:
regexp -- {\[(.+)\]:(\d+)} $addrport -> addr port
(which means "capture a non-empty sequence of any character that is inside literal brackets, then skip a colon and thereafter capture a non-empty sequence of any digit"; the three variables at the end of the invocation get the whole match, the first captured submatch, and the second captured submatch, respectively)
(note 1: American usage of the word 'brackets' here: for British speakers I mean square brackets, not round brackets/parentheses)
(note 2: I'm using the code fragment -> in two ways: as a variable name in the above example, and as a commenting symbol denoting return value in some of the following examples. I hope you're not confused by it. Both usages are kind of a convention and are seen a lot in Tcl examples.)
regexp -inline -- {\[(.+)\]:(\d+)} $addrport
# -> {[fec1::10]:80} fec1::10 80
will instead give you a list with three elements (again, the whole match, the address, and the port).
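Since regexp returns 1 on a match and 0 otherwise, the call can double as a validity check. A minimal sketch follows; the ^ and $ anchors are my addition here (they are not in the pattern above) so that leading or trailing junk is rejected.

```tcl
set addrport {[fec1::10]:80}

# Anchored variant of the pattern; regexp returns 1 only if the whole string fits.
if {[regexp -- {^\[(.+)\]:(\d+)$} $addrport -> addr port]} {
    puts "address=$addr port=$port"   ;# prints address=fec1::10 port=80
} else {
    puts "not in \[address\]:port form"
}
```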
Many programmers will stop looking for possible solutions here, but you're still with me, aren't you? Because there are more, possibly better, methods.
Another alternative is to convert the string to a two-element list (where the first element is the address and the second the port number):
split [string map {[ {} ]: { }} $addrport]
# -> fec1::10 80
(which means "replace any left brackets with empty strings (i.e. remove them) and any substrings that consist of a right bracket and a colon with a single space; then split the resulting string into a list")
It can be used to assign to variables like so:
lassign [split [string map {[ {} ]: { }} $addrport]] addr port
(which performs a sequential assign from the resulting list into two variables).
The scan command will also work:
scan $addrport {[%[^]]]:%d} addr port
(which means "after a left bracket, take a sequence of characters that does not include a right bracket, then skip a right bracket and a colon and then take a decimal number")
Want the result as a list instead?
scan $addrport {[%[^]]]:%d}
# -> fec1::10 80
Even split works, in a slightly roundabout way:
set list [split $addrport {[]:}]
# -> {} fec1 {} 10 {} 80
set addr [lindex $list 1]::[lindex $list 3]
set port [lindex $list 5]
(note: this will have to be rewritten for addresses that are expanded to more than two groups).
Take your pick, but remember to be wary of regular expressions. Quicker, easier, more seductive they are, but always bite you in the ass in the end, they will.
(Note: the 'Hoodiecrow' mentioned in the comments is me, I used that nick earlier. Also note that at the time this question appeared I was still sceptical towards the ip module: today I swear by it. One is never too old to learn, hopefully.)
The ip package from the Tcl standard library can do that, and more.
One of the simplest ways to parse these sorts of things is with scan. It's the command that many Tclers forget!
set toParse {[fec1::10]:80}
scan $toParse {[%[a-f0-9:]]:%d} ip port
puts "host is $ip and port is $port"
The trick is that you want “scan characters from a limited set”. And in production code you want to check the result of scan, which should be the number of groups matched (2 in this case).
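Checking the result of scan could look something like this sketch. The proc name parseHostPort and the error message are hypothetical; note also that the character set as written only accepts lowercase hex digits.

```tcl
# Hypothetical helper: returns a two-element list {host port} or raises an error.
proc parseHostPort {s} {
    # With variable names given, scan returns how many conversions succeeded;
    # anything other than 2 means the input did not fully match the format.
    set n [scan $s {[%[a-f0-9:]]:%d} host port]
    if {$n != 2} {
        error "cannot parse \"$s\" as \[host\]:port"
    }
    return [list $host $port]
}

lassign [parseHostPort {[fec1::10]:80}] host port
puts "host is $host and port is $port"   ;# prints: host is fec1::10 and port is 80
```

Wrapping the call in catch (or try, in Tcl 8.6) lets the caller decide what to do with malformed input.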
I have a file in which I have to search for each "if" statement and its corresponding "end if" statement. Currently I am doing it using lsearch (separately for "if" and "end if", and then using lappend to combine the two). The problem arises when there are nested if statements, which make it difficult to identify the related "if" and "end if" pairs. If there is no assignment between the two statements, then I use lreplace to delete the lines between the if and end if pair. This has to run in a loop because there are multiple such pairs. Every time lreplace is used, lsearch is used again to calculate the new indexes. I am finding that this is a very inefficient implementation. Can anyone suggest some pointers to improve it?
This is not a simple thing to do. The issue is that you're really needing a pushdown automaton rather than a simple finite automaton. Simple searching won't cut it.
What you can do though is this: go through and replace each if and end if keyword with characters otherwise unused (\u0080 and \u0081 are good candidates; the C1 controls are really obscure). Then you can use a simple match in a loop to pick off each inner pair while requiring there to be no unmatched \u0080/\u0081 inside. With each match, you can swap the characters back to the tokens and do the other processing you want at the same time. Once there are no more matches, you're done.
set txt [string map {"end if" "\u0081" "if" "\u0080"} $txt]
while {[regexp -indices {\u0080[^\u0080\u0081]*\u0081} $txt span]} {
    # Restore the keywords inside the innermost matched pair
    set bit [string map {"\u0081" "end if" "\u0080" "if"} [string range $txt {*}$span]]
    puts "matched $bit"
    # ...
    # string replace takes the index range first, then the new text
    set txt [string replace $txt {*}$span $bit]
}
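To try the placeholder technique on something concrete, here is a self-contained sketch with a made-up nested sample. It assumes the input never contains \u0080 or \u0081 and that "if" does not occur inside other words; the matches list is only there to show the order in which pairs are found.

```tcl
# Made-up sample input with one pair nested inside another.
set txt "if a\n  if b\n    x = 1\n  end if\nend if"

# Swap the keywords for placeholder characters ("end if" is listed first so
# that its "if" is not replaced on its own).
set txt [string map [list "end if" \u0081 "if" \u0080] $txt]

set matches {}
while {[regexp -indices {\u0080[^\u0080\u0081]*\u0081} $txt span]} {
    # Restore the keywords inside the innermost pair that was just found.
    set bit [string map [list \u0081 "end if" \u0080 "if"] \
        [string range $txt {*}$span]]
    lappend matches $bit
    # Write the restored text back so that the enclosing pair can match next.
    set txt [string replace $txt {*}$span $bit]
}

puts "found [llength $matches] pairs"   ;# prints: found 2 pairs
```

The inner pair comes out first and the enclosing pair second, and once the loop ends $txt is back to its original content. Deleting or rewriting a pair would happen where $bit is written back.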
I am trying to split a Tcl string by TABS (\t).
Please consider the following sampleString:
I . am -> a . programmer # let "." be spaces and "->" be tabs
If I try to do the following:
set myVar [split $sampleString "\t"]
Tcl will split by spaces as well and not just the tabs.
How can I split only by tabs?
Thanks
I suspect you're just a little confused as to which output you are looking at.
% set s "I am\ta programmer"
I am a programmer
% split $s
I am a programmer
% split $s "\t"
{I am} {a programmer}
The only difference between the two splits is that without the optional second argument, the split-set is “all whitespace” (for a reasonable definition of “all”), and neither split affects the value in the variable as there is no explicit write-back here.
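Counting the elements makes the difference easier to see than reading the printed lists. A quick sketch (the variable name fields is just for illustration):

```tcl
set s "I am\ta programmer"

puts [llength [split $s]]        ;# prints 4: split on all whitespace
puts [llength [split $s "\t"]]   ;# prints 2: split on tabs only

# split returns a new list; $s itself is unchanged until the result is stored.
set fields [split $s "\t"]
puts [lindex $fields 0]          ;# prints: I am
```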