Getting the next key in an array - tcl

My goal is to find the next key in an array. My data is below:
# Index increment may change, there is not necessarily continuity like this example.
# My $index can be 1,2,3,4,8,12,25,32...
# but the size of my array is about 100,000 elements.
for {set index 1} {$index < 100000} {incr index} {
set refdata($index,Pt,X) [expr {10 + $index}]
}
I need to know the next key to be able to build a geometric line. I could not find a command in the documentation that returns the next key of an array, so I wrote my own function:
proc SearchNextKeyArrayElement {dataarray mykey} {
upvar $dataarray myarray
set mydata [lsort -dictionary [array names myarray]]
set index [lsearch $mydata $mykey]
if {$index > -1} {
return [lindex $mydata [expr {$index + 1}]]
}
return ""
}
foreach k [lsort -dictionary [array names refdata]] {
if {[string match "*,Pt,*" $k]} {
set nextkey [SearchNextKeyArrayElement refdata $k]
}
}
It takes a long time. The array nextelement command may be the solution, but I do not understand how to use it.

Here's an example:
start a search with array startsearch
loop while array anymore is true
get the next key with array nextelement
tidy up with array donesearch
use try ... finally (with trap or on handlers) for safety
# array foreach
# to be subsumed in Tcl 8.7 by `array for`
# https://core.tcl.tk/tips/doc/trunk/tip/421.md
#
# example:
# array set A {foo bar baz qux}
# array foreach {key val} A {puts "name=$key, value=$val"}
#
# A note on performance: we're not saving any time with this approach.
# This is essentially `foreach name [array names ary] {...}`
# We are saving memory: iterating over the names versus extracting
# them all at the beginning.
#
proc array_foreach {vars arrayName body} {
if {[llength $vars] != 2} {
error {array foreach: "vars" must be a 2 element list}
}
lassign $vars keyVar valueVar
# Using the complicated `upvar 1 $arrayName $arrayName` so that any
# error messages propagate up with the user's array name
upvar 1 $arrayName $arrayName \
$keyVar key \
$valueVar value
set sid [array startsearch $arrayName]
# If the array is modified while a search is ongoing, the searchID will
# be invalidated: wrap the commands that use $sid in a try block.
try {
while {[array anymore $arrayName $sid]} {
set key [array nextelement $arrayName $sid]
set value [set "${arrayName}($key)"]
uplevel 1 $body
}
} trap {TCL LOOKUP ARRAYSEARCH} {"" e} {
puts stderr [list $e]
dict set e -errorinfo "detected attempt to add/delete array keys while iterating"
return -options $e
} finally {
array donesearch $arrayName $sid
}
return
}

Generally speaking, Tcl arrays have no order at all; they can change their order on any modification to the array or any of its elements. The commands that iterate over the array (array for, array get, array names, and the iteration commands array startsearch/array nextelement/array anymore) only work with the current order. However, you can use array names to get the element names into a Tcl list (which is order preserving), sort those to get the order that you're going to iterate over, and then use foreach over that. As long as you're not adding or removing elements, it'll be fine. (Adding elements is sort-of OK too; you'll just not see them in your iteration.)
foreach key [lsort -dictionary [array names myarray]] {
ProcessElement $key $myarray($key)
}
By contrast, trying to just go from one element to the next will hurt a lot; that operation is not exposed.
Using the iteration commands is done like this:
set s [array startsearch myarray]
while {[array anymore myarray $s]} {
set key [array nextelement myarray $s]
ProcessElement $key $myarray($key)
}
Note that you don't get an option to sort the search. You won't see these used much in production code; doing array names or array get is usually better. And now (well, 8.7 is still in alpha) you've also got array for:
array for {key value} myarray {
ProcessElement $key $value
}
Efficient for large arrays, but still doesn't permit sorting; supporting direct sorting would require a different sort of storage engine on the back of the array.

This is why it's slow: you're sorting the array names once for the foreach command and then again for each element. Sort once and cache the result; then you can iterate over it much more efficiently:
set sorted_names [lsort -dictionary [array names refdata -glob {*,Pt,*}]]
set len [llength $sorted_names]
for {set i 0; set j 1} {$i < $len} {incr i; incr j} {
set this_name [lindex $sorted_names $i]
set next_name [lindex $sorted_names $j]
# ...
}
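If the next key is needed in several places, the sorted list can be turned into a lookup table once. The following is a minimal sketch, not part of the original answer: the proc name build_next_key_map is hypothetical, and it assumes the array is not modified between building the map and using it.

```tcl
# Hypothetical helper: map every matching key to its successor in
# -dictionary sort order. The last key maps to the empty string.
proc build_next_key_map {arrayName pattern} {
    upvar 1 $arrayName arr
    set sorted [lsort -dictionary [array names arr -glob $pattern]]
    set next [dict create]
    # Walk the list pairwise: element i together with element i+1.
    foreach this [lrange $sorted 0 end-1] that [lrange $sorted 1 end] {
        dict set next $this $that
    }
    if {[llength $sorted] > 0} {
        dict set next [lindex $sorted end] ""
    }
    return $next
}

# Small data set with gaps in the index, as described in the question.
foreach i {1 2 3 10} {
    set refdata($i,Pt,X) [expr {10 + $i}]
}
set next_key [build_next_key_map refdata {*,Pt,*}]
puts [dict get $next_key 3,Pt,X]
```

Each later lookup is a single dict get, so the sort cost is paid only once.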

Related

How to create an efficient permutation algorithm in Tcl?

I have written the following proc in tcl which gives a permutation of the set {1, 2, ..., n} for some positive integer n:
proc permu {n} {
set list {}
while {[llength $list] < $n} {
set z [expr 1 + int(rand() * $n)]
if {[lsearch $list $z] == -1} {
lappend list $z
}
}
return $list
}
I have used some code snippets from tcl-codes which I found on other web sites in order to write the above one.
The following part of the code is problematic:
[lsearch $list $z] == -1
This makes the code quite inefficient. For example, if n=10000 it takes a few seconds until the result is displayed, and if n=100000 it takes several minutes. On the other hand, this part is required, as I need to check whether a newly generated number is already in my list.
I need an efficient code to permute the set {1, 2, ..., n}. How can this be solved in tcl?
Thank you in advance!
Looking up a value in a list with lsearch scans the list, so its runtime grows as the list gets larger. A faster way is to look up a key in a dictionary: dictionaries are hash-based, so key lookup time stays roughly constant as the dictionary grows.
Taking advantage of the fact that Tcl dictionary keys keep their insertion order (oldest first):
proc permu {n} {
set my_dict [dict create]
while {[dict size $my_dict] < $n} {
set z [expr {1 + int(rand() * $n)}]
if {![dict exists $my_dict $z]} {
dict set my_dict $z 1
}
}
return [dict keys $my_dict]
}
This fixes the problem of slow list lookup, but the random number z is now the limiting factor. As the dict size approaches $n you need to wait longer and longer for a new value of z to be a unique value.
A different, faster approach is to first assign the numbers 1 to n as values to random keys in a dict, and then read the values back in sorted key order.
proc permu2 {n} {
# Add each number in sequence as a value to a dict for a random key.
set random_key_dict [dict create]
for {set i 1} {$i <= $n} {incr i} {
while {1} {
set random_key [expr {int(rand() * $n * 100000)}]
if {![dict exists $random_key_dict $random_key]} {
dict set random_key_dict $random_key $i
break
}
}
}
# Sort the random keys to shuffle the values.
set permuted_list [list]
foreach key [lsort -integer [dict keys $random_key_dict]] {
lappend permuted_list [dict get $random_key_dict $key]
}
return $permuted_list
}
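For comparison, here is a sketch of a different technique that neither answer above uses: an in-place Fisher-Yates shuffle. The proc name permu_shuffle is hypothetical. It avoids both the membership test and the retry loop, since every random draw is used exactly once.

```tcl
# Hypothetical alternative: Fisher-Yates shuffle of the list 1..n.
proc permu_shuffle {n} {
    set items {}
    for {set i 1} {$i <= $n} {incr i} {
        lappend items $i
    }
    # Walk from the last position down to 1, swapping each element
    # with a randomly chosen element at or before it.
    for {set i [expr {$n - 1}]} {$i > 0} {incr i -1} {
        set j [expr {int(rand() * ($i + 1))}]
        set tmp [lindex $items $i]
        lset items $i [lindex $items $j]
        lset items $j $tmp
    }
    return $items
}

puts [permu_shuffle 10]
```

The number of steps is proportional to n, so the runtime does not degrade as the permutation fills up.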

How to find duplicated strings which appear more than once in a file

I have the following code to print strings that appear more than once in a list:
set a [list str1/str2 str3/str4 str3/str4 str5/str6]
foreach x $a {
set search_return [lsearch -all $a $x]
if {[llength $search_return] > 1} {
puts "search_return : $search_return"
}
}
I need to print str3/str4, which appears more than once in the list.
The canonical methods of doing this are with arrays or dictionaries, both of which are associative maps. Here's a version with a single loop over the data using a dictionary (it doesn't know the total number of times an item appears when it prints, but sometimes just knowing you've got a multiple is enough).
set a [list str1/str2 str3/str4 str3/str4 str5/str6]
# Make sure that the dictionary doesn't exist ahead of time!
unset -nocomplain counters
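foreach item $a {
# dict incr returns the whole updated dictionary, so read the count separately
dict incr counters $item
if {[dict get $counters $item] == 2} {
puts "$item appears several times"
}
}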
I guess you could use an array to do something like that, since arrays have unique keys:
set a [list str1/str2 str3/str4 str3/str4 str5/str6]
foreach x $a {
incr arr($x) ;# basically counting each occurrence
}
foreach {key val} [array get arr] {
if {$val > 1} {puts "$key appears $val times"}
}
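The counting idea from both answers can also be wrapped in a proc that returns the duplicated items instead of printing them. This is a minimal sketch; the name find_duplicates is hypothetical. Because dict keys keep insertion order, the duplicates come back in the order in which they first appeared.

```tcl
# Hypothetical helper: return every item that occurs more than once.
proc find_duplicates {items} {
    set counts [dict create]
    foreach item $items {
        dict incr counts $item
    }
    set dups {}
    dict for {item count} $counts {
        if {$count > 1} {
            lappend dups $item
        }
    }
    return $dups
}

set a [list str1/str2 str3/str4 str3/str4 str5/str6]
puts [find_duplicates $a]
```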

Get result of TCL exec command into array

How to get the result of a tcl exec command into an array of strings where each item is a line of my exec output?
Example:
exec ls -la
How to capture that result into an array and print it in a foreach?
May I suggest using a list instead of an array? If so:
set output [exec ls]
set output_list [split $output \n]
foreach line $output_list {
puts $line
}
A list is a much more useful collection in this situation, because all you need is to store the lines one by one. An array in Tcl, on the other hand, is meant for storing a named (unordered) collection.
I can do it with an array, but it would be ugly.
set output [exec ls]
set output_list [split $output \n]
set i 0
foreach line $output_list {
set arr($i) $line
incr i
}
foreach index [array names arr] {
puts $arr($index)
}
As you can see, foreach over array names can't guarantee the order of the records. For example, I got this:
% foreach index [array names arr] {
puts arr($index)
}
arr(8)
arr(4)
arr(0)
arr(10)
arr(9)
arr(5)
arr(1)
arr(6)
arr(2)
arr(7)
arr(3)
So if you want to work with an array as an ordered collection, you need to use a counter.
for {set i 0} {$i < [array size arr]} {incr i} {
puts $arr($i)
}
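The counter loop assumes the indices run from 0 upward without gaps. When that is not guaranteed, one option is to sort the index names numerically before iterating. This is a small sketch with made-up sample data; the array name demo is hypothetical.

```tcl
# Sample array with non-contiguous integer indices (illustrative data).
array set demo {0 zero 2 two 10 ten 1 one}

set ordered {}
foreach index [lsort -integer [array names demo]] {
    lappend ordered $demo($index)
}
puts $ordered
```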

Remove duplicate elements from list in Tcl

How do I remove duplicate elements from a Tcl list? Say the list is like [this,that,when,what,when,how].
I have Googled and found lsort -unique, but it is not working for me. I want to remove the duplicate when from the list.
The following works for me:
set myList [list this that when what when how]
lsort -unique $myList
This returns
how that this what when
which you could store in a new list:
set uniqueList [lsort -unique $myList]
You could also use a dictionary, where the keys must be unique:
set l {this that when what when how}
foreach element $l {dict set tmp $element 1}
set unique [dict keys $tmp]
puts $unique
this that when what how
That will preserve the order of the elements.
glenn jackman's answer works perfectly on Tcl 8.5 and above, where the dict command is built in.
For Tcl 8.4 and below (no dict command), you can use:
proc list_unique {list} {
array set included_arr [list]
set unique_list [list]
foreach item $list {
if { ![info exists included_arr($item)] } {
set included_arr($item) ""
lappend unique_list $item
}
}
unset included_arr
return $unique_list
}
set list [list this that when what when how]
set unique [list_unique $list]
This also preserves the order of the elements, and this is the result:
this that when what how
Another way, if you do not want to use the native lsort command. This is what the interviewer asks :)
set a "this that when what when how"
for {set i 0} {$i < [llength $a]} {incr i} {
# -exact avoids glob matching; -all returns every matching index
set indices [lsearch -all -exact $a [lindex $a $i]]
# Delete from the back so the remaining indices stay valid
foreach index [lreverse $indices] {
if {$index != $i} {
set a [lreplace $a $index $index]
}
}
}

tcl set list of arrays produce duplicates

I'm producing a TCL procedure that will return a list of arrays of devices under a switch. The definition is an XML file that is read. The resulting lists of XML entries are parsed using a recursive procedure and the device attributes are placed in an array.
Each array is then placed in a list and reflected back to the caller. My problem is that when I print out the list of devices, the last device added to the list is printed out each time. The contents of the list is all duplicates.
Note: I'm using the excellent proc, 'xml2list' that was found here. I'm sorry, I forgot who submitted this.
The following code illustrates the problem:
source C:/src/tcl/xml2list.tcl
# Read and parse XML file
set fh [open C:/data/tcl/testfile.xml r]
set myxml [read $fh]
set mylist [xml2list $myxml]
array set mydevice {}
proc devicesByName { name thelist list_to_fill} {
global mydevice
global set found_sw 0
upvar $list_to_fill device_arr
foreach switch [lindex $thelist 2] {
set atts [lindex $switch 1]
if { [lindex $switch 0] == "Switch" } {
if { $name == [lindex $atts 3] } {
set found_sw 1
puts "==== Found Switch: $name ===="
} else {
set found_sw 0
}
} elseif { $found_sw == 1 && [string length [lindex $atts 3]] > 0 } {
set mydevice(hdr) [lindex $switch 0]
set mydevice(port) [lindex $atts 1]
set mydevice(name) [lindex $atts 3]
set mydevice(type) [lindex $atts 5]
puts "Device Found: $mydevice(name)"
set text [lindex $switch 2]
set mydevice(ip) [lindex [lindex $text 0] 1]
lappend device_arr mydevice
}
devicesByName $name $switch device_arr
}
}
#--- Call proc here
# set a local array var and send to the proc
set device_arr {}
devicesByName "Switch1" $mylist device_arr
# read out the contents of the list of arrays
for {set i 0} {$i<[llength $device_arr]} {incr i} {
upvar #0 [lindex $device_arr $i] temp
if {[array exists temp]} {
puts "\[$i\] Device: $temp(name)-$temp(ip)"
}
}
The XML file is here:
<Topology>
<Switch ports="48" name="Switch1" ip="10.1.1.3">
<Device port="1" name="RHEL53-Complete1" type="host">10.1.1.10</Device>
<Device port="2" name="Windows-Complete1" type="host">10.1.2.11</Device>
<Device port="3" name="Solaris-Complete1" type="host">10.1.2.12</Device>
</Switch>
<Switch ports="36" name="Switch2" ip="10.1.1.4">
<Device port="1" name="Windows-Complete2" type="host">10.1.3.10</Device>
</Switch>
<Router ports="24" name="Router1" ip="10.1.1.2">
<Device port="1" name="Switch1" type="switch">10.1.1.3</Device>
<Device port="2" name="Switch2" type="switch">10.1.1.4</Device>
</Router>
</Topology>
If my code blocks look bad, please excuse that. I followed the directions as I read them, but it didn't look correct. I could not fix it, so just posted anyway.
Thanks in advance...
Arrays in tcl are not values. Therefore they don't behave like regular variables. They are in fact something special like filehandles or sockets.
You cannot assign an array to a list like that. Doing:
lappend device_arr mydevice
simply appends the string "mydevice" to the list device_arr. That string happens to be the name of a global variable so that string may be used later to access that global variable.
To build up a key-value data structure what you want is a dict. You can think of a dict as a special list that has even numbers of elements in the format: {key value key value}. In fact, this data structure works even on very old versions of tcl before the introduction of dicts because the foreach loop in tcl can be used to process key-value pairs.
So what you want is to create a new $mydevice dict each loop and use [dict set] to assign the values.
Alternatively you can keep most of your code and change your lappend to:
lappend device_arr [array get mydevice]
This works because [array get] returns a key-value list which can be treated as a dict. You can later access the data using the dict command.
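As a minimal sketch of the dict-per-device approach described above: the sample values are taken from the XML in the question, while the variable names devices and device are assumptions made for illustration.

```tcl
# Build one dict per device and append the dict VALUE to the list.
set devices {}
foreach {port name type ip} {
    1 RHEL53-Complete1  host 10.1.1.10
    2 Windows-Complete1 host 10.1.2.11
} {
    set device [dict create port $port name $name type $type ip $ip]
    lappend devices $device
}

# Each list element is an independent copy, so nothing is overwritten.
foreach device $devices {
    puts "[dict get $device name]-[dict get $device ip]"
}
```

Because each dict is a value, appending it stores the data itself rather than the name of a shared variable, which is what caused the duplicates.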
Array variables can't be used as values. To put the contents of one into a list element, send it to a proc, write it to a file etc, convert it to list form (key, value, key, value...) with array get.
lappend device_arr [array get mydevice]
To use it later, write the list back to an array with array set.
foreach device_l $device_arr {
#array unset device
array set device $device_l
puts "$device(name)-$device(ip)"
}
Note that array set doesn't erase the old keys in the destination array, so if you use it in a loop and the key names aren't always the same, you need to clear the array every iteration.
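The stale-key effect can be seen with two records that do not share the same keys. This is an illustrative sketch with made-up records; device_list and seen_types are hypothetical names.

```tcl
# The second record has no "type" key.
set device_list {
    {name RHEL53-Complete1 ip 10.1.1.10 type host}
    {name Windows-Complete1 ip 10.1.2.11}
}

set seen_types {}
foreach device_l $device_list {
    # Without this line, device(type) from the first record would
    # still be present while processing the second record.
    array unset device
    array set device $device_l
    if {[info exists device(type)]} {
        lappend seen_types $device(type)
    } else {
        lappend seen_types <none>
    }
}
puts $seen_types
```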
You can store this information in two ways using arrays. The first is a multi-dimensional array, in this case a three-dimensional one. The second is a one-dimensional array whose elements are lists that can easily be converted to an array later, when you need to access the data.
For the 3D array the key would be switch name,device port,data name. You would change your erroneous temporary mydevice and lappend code to:
# attr is a list of { attributename1 value1 ... attributenameN valueN}
array set temp $attr
set port $temp(port)
set text [lindex $switch 2]
set ip [lindex [lindex $text 0] 1]
# name already set to "Switch1" etc
foreach f [array names temp ] {
set device_arr($name,$port,$f) $temp($f)
}
set device_arr($name,$port,ip) $ip
array unset temp
This code results in the following when you run parray device_arr:
parray device_arr
device_arr(Switch1,1,name) "Switch1"
device_arr(Switch1,1,port) 1
device_arr(Switch1,1,type) "RedHat .."
device_arr(Switch1,1,ip) 10..
device_arr(Switch1,2,name) "Switch1"
device_arr(Switch1,2,port) 1
device_arr(Switch1,2,type) "RedHat .."
device_arr(Switch1,2,ip) 10..
...
device_arr(Switch2,1,name) "Switch1"
device_arr(Switch2,1,port) 1
device_arr(Switch2,1,type) "Windows Complete"
device_arr(Switch2,1,ip) 10..
....
To find the ip of Switch1 port 2 you would:
puts "the ip of Switch1 port 2 is $device_arr(Switch1,2,ip)"
Note that there is a lot of data duplication, but you can access all the data directly, without the intermediate step that the next scheme requires:
# attr is a list of { attributename1 value1 ... attributenameN valueN}
set data $attr
array set temp $attr
set text [lindex $switch 2]
set ip [lindex [lindex $text 0] 1]
lappend data ip $ip
# name already set to "Switch1" etc
set key "$name,$temp(port)"
set device_arr($key) $data
array unset temp
Doing a parray device_arr gives:
device_arr(Switch1,1) { port "1" name "RHEL53-Complete1" type "host" ip 10.1.1.10 }
device_arr(Switch1,2) { port "2" name "Windows-Complete1" type "host" ip 10.1.2.11}
....
To find the ip of Switch1 port 2 you would:
array set temp $device_arr(Switch1,2)
puts "ip of device 2 is $temp(ip)"
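On Tcl 8.5 or later, where dict is available, the stored key-value list can also be read directly, without copying it into a temporary array. This is a small sketch using the sample element shown above; it assumes the element is a well-formed key-value list.

```tcl
# One element of the second scheme, as a key-value list.
set device_arr(Switch1,2) {port 2 name Windows-Complete1 type host ip 10.1.2.11}

# Read a single field straight from the stored list.
set ip [dict get $device_arr(Switch1,2) ip]
puts "ip of device 2 is $ip"
```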