How to add many selections to one variable in Tcl

After doing some other things in my script I end up with a series of Tcl variables ($sel1, $sel2, $sel3, ...) that I need to add to the following line:
set all [::TopoTools::selections2mol "$box $sel1 $sel2 $sel3 $sel4"]
Now, if I only had four, this would be fine by hand, but the final version will have hundreds, which is untenable to do manually. I'm sure the answer is some kind of loop, but I've given it some thought and I can't quite figure it out. If I had $sel1, $sel2, and so on up to a given number, how would I add them to that line in the format shown, for any count I want, with $box at the beginning as shown? Thanks very much for your help.
It may or may not be relevant, but I define the variables in a loop as follows:
set sel$i [atomselect $id all]

I'm not familiar with the software you are using, but it should be possible to fix this without too much hassle.
If you put this inside the loop instead:
set sel$i [atomselect $id all]
append valueStr " " [set sel$i]
(or perhaps this, even if it is a little cryptic:)
append valueStr " " [set sel$i [atomselect $id all]]
you will get in valueStr the string that "$sel1 $sel2 $sel3 $sel4" would substitute to (remember to put $box in as well, e.g. by initializing valueStr to $box before the loop).
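Here is the whole loop as a runnable sketch, with placeholder strings standing in for $box and the atomselect results (atomselect is a VMD command, not plain Tcl, so it is stubbed out here):

```tcl
# Build up a space-separated value string in a loop.
set box "BOX"                          ;# stand-in for the real $box
set valueStr $box
for {set i 1} {$i <= 4} {incr i} {
    set sel$i "SEL$i"                  ;# stand-in for [atomselect $id all]
    append valueStr " " [set sel$i]
}
puts $valueStr                         ;# -> BOX SEL1 SEL2 SEL3 SEL4
```

In the real script, valueStr would then be passed to ::TopoTools::selections2mol.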
With Tcl 8.5 or later, you can do
dict set values $i [atomselect $id all]
inside the loop, which gives you a dictionary structure containing all values, and then create the sequence of values with:
set all [::TopoTools::selections2mol [concat $box [dict values $values]]]
Depending on the output and input formats of atomselect and selections2mol, the latter might not actually work without a little fine-tuning, but it should be worth a try.
With the dictionary you don't get individual variables; instead, each value is available as
dict get $values $i
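A minimal, runnable sketch of the dict approach, again with placeholder strings standing in for the atomselect results:

```tcl
# Collect values under integer keys, then flatten them into one list.
set box "BOX"
for {set i 1} {$i <= 3} {incr i} {
    dict set values $i "SEL$i"         ;# stand-in for [atomselect $id all]
}
set all [concat $box [dict values $values]]
puts $all                              ;# -> BOX SEL1 SEL2 SEL3
```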
You can do this with an array also:
set values($i) [atomselect $id all]
but then you need to sort the keys before collecting the values, like this
set valueStr [list $box]
foreach key [lsort -integer [array names values]] {
    append valueStr " " $values($key)
}
Documentation:
append,
array,
concat,
dict,
foreach,
list,
lsort,
set

Related

apparent inconsistency read/write variable

I'm just learning Tcl. I've only seen a little of it so far; for instance, to create a variable (and initialize it) you can do
set varname value
I am getting used to the fact that basically everything is a string, such as "value" above, but "varname" gets special treatment, I guess because of the set built-in, so it is interpreted not as a string but as a name.
I can later access the value with $varname, and this makes sense to me: the $ specifies that varname is not to be treated as a plain string.
I'm now reading about lists, and a couple of commands confuse me a bit:
set colors {"aqua" "maroon" "cyan"}
puts "list length is [llength $colors]"
lappend colors "purple"
So clearly lappend is another such command, like set, that interprets its first argument as a name rather than a string. But then why didn't they make llength work the same way (with no need for $)?
I'm thinking that it's just a convention that, in general, when you "read" a variable you need the $ while you don't for "writing".
A different look at the question: what Tcl commands are appropriate for list literals?
It's valid to count the elements of a list literal:
llength {my dog has fleas}
But it doesn't make sense to append a new element to a literal
lappend {my dog has fleas} and ticks
(That is actually valid Tcl, but it sets the odd variable ${my dog has fleas})
this is more sensible:
set mydog {my dog has fleas}
lappend mydog and ticks
Names are strings. Or rather a string is a name because it is used as a name. And $ in Tcl means “read this variable right now”, unlike in some other languages where it really means “here is a variable name”.
The $blah syntax for reading from a variable is convenient syntax that approximately stands in for doing [set blah] (with just one argument). For simple names, they become the same bytecode, but the $… form doesn't handle all the weird edge cases (usually with generated names) that the other one does.
If a command (such as set, lappend, unset or incr) takes a variable name, it's because it is going to write to that variable, and it will typically be documented to take a varName (variable name, of course) or something like that. Things that just read the value (e.g., llength or lindex) will take the value directly and not the name of a variable, and it is up to the caller to provide the value using whatever they want, perhaps $blah or [call something].
In particular, if you have:
proc ListRangeBy {from to {by 1}} {
    set result {}
    for {set x $from} {$x <= $to} {incr x $by} {
        lappend result $x
    }
    return $result
}
then you can do:
llength [ListRangeBy 3 77 8]
and
set listVar [ListRangeBy 3 77 8]
llength $listVar
and get exactly the same value out of the llength. The llength doesn't need to know anything special about what is going on.
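The stand-in relationship between $blah and [set blah] described above is easy to check directly:

```tcl
set blah 42
puts $blah        ;# the $ form reads the variable -> 42
puts [set blah]   ;# set with one argument performs the same read -> 42
```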

Tcl: use a variable to generate a variable, and use a variable for file open/close

As an easy example, I just want to loop through opening/closing files and use a variable to create another variable. In Perl this is pretty easy, but I can't figure it out in Tcl.
set gsrs ""
lappend gsrs "sir"
lappend gsrs "dir"
foreach gsr $gsrs {
    set file "sdrv/icc/instance_toggle_overwrite.$gsr.txt"
    puts "*** I : Generating $file"
    set tempGSR gsr
    puts "$$tempGSR"            ;# would like output to be the value of $gsr
    set $gsr [open $file "w"]   ;# normally you would not use a variable here for filename setting
    close $$gsr
}
Double-dereferencing is usually not recommended, as it leads to complex code that is quite hard to maintain. However, if you insist on doing it then use set with one argument to do it:
puts [set $tempGSR]
Usually, thinking about using this sort of thing is a sign that either upvar (possibly upvar 0) or an array should be used instead.
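A sketch of the loop from the question using an array keyed by $gsr instead of generated variable names (the sdrv/icc/ path prefix is dropped here so the sketch can run anywhere):

```tcl
# One file handle per key; no double-dereferencing needed.
foreach gsr {sir dir} {
    set file "instance_toggle_overwrite.$gsr.txt"
    puts "*** I : Generating $file"
    set fh($gsr) [open $file "w"]   ;# array element instead of $$gsr
    close $fh($gsr)
}
```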

Speed up Tcl lsearch for a long list of long strings

I am globbing files with wildcards in the file system. To avoid double counting, I keep a list of files that I have already captured and then use lsearch to check. Now, the full file paths are pretty long and I am getting into thousands of files, so the lsearch lookup is getting really slow.
In simplified version it looks like this.
foreach fn [glob $pattern] {
    if {[lsearch $done $fn] == -1} {
        lappend done $fn
        # Do something with $fn
    } else {
        # puts "Duplicate fn not processed."
    }
}
Over time, lsearch has to look up pretty long strings in a longer and longer list. What can be done to improve this? I was thinking of making the strings shorter by using some sort of CRC and putting that into the done list, but the fingerprint computation shouldn't take longer than the search.
There are two options, providing you are only ever interested in whether a literal string is present or not (that seems likely to me since the patterns are coming from glob):
If you can ensure that the list that you are searching against is alphabetically sorted, lsearch -sorted is much faster (O(log n) in the size of the data rather than O(n); it does a binary search). The one-time cost of sorting the list might be worthwhile.
If you only really care whether the value is present or not, you can load the list entries into a dictionary or array as keys; checking for presence of a value then (dict exists or info exists) is a very cheap operation, even with a lot of data. Under the covers, dicts and arrays are hash tables and so are highly suited to this sort of thing.
If you're building the list piecemeal as a check against repeating work (sounds like you are) then option 2 is absolutely the best one.
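A minimal sketch of option 2, using an array as a set (stand-in filenames replace the glob results):

```tcl
# An array used as a set gives cheap hash-table membership tests.
array set done {}
set unique {}
foreach fn {a.txt b.txt a.txt c.txt b.txt} {
    if {![info exists done($fn)]} {
        set done($fn) 1
        lappend unique $fn     ;# first sighting: process it
    }
}
puts $unique                   ;# -> a.txt b.txt c.txt
```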
I found it better to append everything to the list first and then uniquify at the end, so only one pass through the list is needed.
set all_fn {}
foreach fn $files {
    regsub {stuff} $fn {stuff_with_wildcards} pattern
    set all_fn [concat $all_fn [glob $pattern]]
}
set all_fn_u [lsort -unique $all_fn]
foreach fn $all_fn_u {
    # Do something with $fn
}
To follow up on Donal's suggestion:
you can load the list entries into a dictionary or array as keys
set d [dict create {*}[string cat [join [glob {*}$patterns] " _ "] " _"]]
foreach fn [dict keys $d] {
    puts $fn
}
glob can work on multiple patterns at a time, so there is no need to feed it one pattern at a time.
Create an even-sized list of elements (its string representation) from glob's results using join.
Load the resulting string rep of a Tcl list into a dictionary using dict create.
Use dict keys to obtain the (unique) list of keys.

Tcl output with spaces

I have 5 different variables coming from different if and loop statements. When I use puts to write the output to a text file, all the characters and digits run together, like this: alphaclass112098voip, where the
variables: name = alpha
category = class1
number = 12098
service = voip
I want the output in the file like this, with spaces, on the same line:
Alpha class1 12098 voip
Beta class1 12093 DHCP SIP
Also, at a certain point I want to add delimiters for future purposes.
The easiest way to deal with this is to construct a Tcl list that represents the record. You can do this piecemeal with lappend, or all at once with list. Or mix them.
foreach name $names category $categories number $numbers service $services {
    set record [list $name $category]
    lappend record $number
    lappend record $service
    puts $record
}
This shows the record for each line in a format that Tcl finds easy to parse (you'll see what I mean if you have a name with a space in it). To use a delimiter to separate the values instead, the join command is very useful:
puts [join $record $delimiter]
The default delimiter is space, but try a : instead to see how it works.
If you're generating a CSV file, do use the csv package in Tcllib. It handles the tricky edge-cases (e.g., embedded commas) for you.
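A minimal sketch of join with the default and an explicit delimiter (the record values are taken from the question; note that plain join does no quoting, which is what the csv package adds):

```tcl
set record {Alpha class1 12098 voip}
puts [join $record]        ;# -> Alpha class1 12098 voip  (space is the default)
set joined [join $record ":"]
puts $joined               ;# -> Alpha:class1:12098:voip
```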

Editing a file at multiple places using Tcl/Tk patterns

I have a file in which I have to search for "if" statements and the corresponding "end if" statements. Currently I am doing it using lsearch (separately for "if" and "end if", then using lappend to combine the two). The problem arises when there are nested if statements, which make it difficult to identify the related "if"/"end if" pairs. If there is no assignment between the two statements, I use lreplace to delete the lines between the if and end if pair. This has to run in a loop because there are multiple such pairs, and every time lreplace is used, lsearch is used again to recalculate the indexes. I am finding this a very inefficient implementation. Can anyone suggest some pointers to improve it?
This is not a simple thing to do. The issue is that you're really needing a pushdown automaton rather than a simple finite automaton. Simple searching won't cut it.
What you can do, though, is this: go through and replace each if and end if keyword with characters otherwise unused (\u0080 and \u0081 are good candidates; the C1 controls are really obscure). Then you can use a simple match in a loop to pick off each innermost pair, requiring that there be no unmatched \u0080/\u0081 inside. With each match, you swap the characters back to the tokens and do the other processing you want at the same time. Once there are no more matches, you're done.
set txt [string map [list "end if" \u0081 "if" \u0080] $txt]
while {[regexp -indices {\u0080[^\u0080\u0081]*\u0081} $txt span]} {
    set bit [string map [list \u0081 "end if" \u0080 "if"] [string range $txt {*}$span]]
    puts "matched $bit"
    # ...
    set txt [string replace $txt {*}$span $bit]
}
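Run on a small nested input, the token-swap loop peels off the innermost pair first. A self-contained sketch of the same technique (the input string is made up for illustration):

```tcl
set txt "if a if b end if end if"
set txt [string map [list "end if" \u0081 "if" \u0080] $txt]
set matches {}
while {[regexp -indices {\u0080[^\u0080\u0081]*\u0081} $txt span]} {
    set bit [string map [list \u0081 "end if" \u0080 "if"] [string range $txt {*}$span]]
    lappend matches $bit                   ;# innermost pair is found first
    set txt [string replace $txt {*}$span $bit]
}
puts [lindex $matches 0]                   ;# -> if b end if
```

The second match is then the restored outer pair, "if a if b end if end if".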