How to create an efficient permutation algorithm in Tcl? - tcl

I have written the following proc in tcl which gives a permutation of the set {1, 2, ..., n} for some positive integer n:
proc permu {n} {
set list {}
while {[llength $list] < $n} {
set z [expr 1 + int(rand() * $n)]
if {[lsearch $list $z] == -1} {
lappend list $z
}
}
return $list
}
I have used some code snippets from tcl-codes which I found on other web sites in order to write the above one.
The following part of the code is problematic:
[lsearch $list $z] == -1
This makes the code quite inefficient. For example, if n=10000 then it takes a few seconds
until the result is displayed and if n=100000 then it takes several minutes. On the other hand, this part is required as I need to check whether a newly generated number is already in my list.
I need an efficient code to permute the set {1, 2, ..., n}. How can this be solved in tcl?
Thank you in advance!

Looking up a value in a list is a problem that grows in runtime as the list gets larger. A faster way is to look up a key in a dictionary. Key lookup time does not increase as the size of the dictionary increases.
Taking advantage of the fact the Tcl dictionary keys are ordered by oldest to most recent:
proc permu {n} {
set my_dict [dict create]
while {[dict size $my_dict] < $n} {
set z [expr 1 + int(rand() * $n)]
if {![dict exists $my_dict $z]} {
dict set my_dict $z 1
}
}
return [dict keys $my_dict]
}
This fixes the problem of slow list lookup, but the random number z is now the limiting factor. As the dict size approaches $n you need to wait longer and longer for a new value of z to be a unique value.
A different faster approach is to first assign the numbers 1 to n as value to randomized keys in a dict. Next, you can get values of each sorted key.
proc permu2 {n} {
# Add each number in sequence as a value to a dict for a random key.
set random_key_dict [dict create]
for {set i 1} {$i <= $n} {incr i} {
while {1} {
set random_key [expr int(rand() * $n * 100000)]
if {![dict exists $random_key_dict $random_key]} {
dict set random_key_dict $random_key $i
break
}
}
}
# Sort the random keys to shuffle the values.
set permuted_list [list]
foreach key [lsort -integer [dict keys $random_key_dict]] {
lappend permuted_list [dict get $random_key_dict $key]
}
return $permuted_list
}

Related

Getting the next key in an array

My goal is to find the next key in an array... below my data :
# Index increment may change, there is not necessarily continuity like this example.
# My $index can be 1,2,3,4,8,12,25,32...
# but the size of my array is about 100,000 elements.
for {set index 1} {$index < 100000} {incr index} {
set refdata($index,Pt,X) [expr {10 + $index}]
}
I need to know the next key to be able to build a geometric line... I did not find in the help a command that allows me to find the next key of my array so I created my own function below :
proc SearchNextKeyArrayElement {dataarray mykey} {
upvar $dataarray myarray
set mydata [lsort -dictionary [array names myarray]]
set index [lsearch $mydata $mykey]
if {$index > -1} {
return [lindex $mydata [expr {$index + 1}]]
}
return ""
}
foreach k [lsort -dictionary [array names refdata]] {
if {[string match "*,Pt,*" $k]} {
set nextkey [SearchNextKeyArrayElement refdata $k]
}
}
And it takes a long time...array nextelement command is maybe the solution...But I do not understand how to use it ?
Here's an example:
start a search with array startsearch
loop while array anymore is true
get the next key with array nextelement
tidy up with array donesearch
use try {} catch {} finally for safety
# array foreach
# to be subsumed in Tcl 8.7 by `array for`
# https://core.tcl.tk/tips/doc/trunk/tip/421.md
#
# example:
# array set A {foo bar baz qux}
# array foreach {key val} A {puts "name=$key, value=$val"}
#
# A note on performance: we're not saving any time with this approach.
# This is essentially `foreach name [array names ary] {...}
# We are saving memory: iterating over the names versus extracting
# them all at the beginning.
#
proc array_foreach {vars arrayName body} {
if {[llength $vars] != 2} {
error {array foreach: "vars" must be a 2 element list}
}
lassign $vars keyVar valueVar
# Using the complicated `upvar 1 $arrayName $arrayName` so that any
# error messages propagate up with the user's array name
upvar 1 $arrayName $arrayName \
$keyVar key \
$valueVar value
set sid [array startsearch $arrayName]
# If the array is modified while a search is ongoing, the searchID will
# be invalidated: wrap the commands that use $sid in a try block.
try {
while {[array anymore $arrayName $sid]} {
set key [array nextelement $arrayName $sid]
set value [set "${arrayName}($key)"]
uplevel 1 $body
}
} trap {TCL LOOKUP ARRAYSEARCH} {"" e} {
puts stderr [list $e]
dict set e -errorinfo "detected attempt to add/delete array keys while iterating"
return -options $e
} finally {
array donesearch $arrayName $sid
}
return
}
Generally speaking, Tcl arrays have no order at all; they can change their order on any modification to the array or any of its elements. The commands that iterate over the array (array for, array get, array names, and the iteration commands array startsearch/array nextelement/array anymore) only work with the current order. However, you can use array names to get the element names into a Tcl list (which is order preserving), sort those to get the order that you're going to iterate over, and then use foreach over that. As long as you're not adding or removing elements, it'll be fine. (Adding elements is sort-of OK too; you'll just not see them in your iteration.)
foreach key [lsort -dictionary [array names myarray]] {
ProcessElement $key $myarray($key)
}
By contrast, trying to just go from one element to the next will hurt a lot; that operation is not exposed.
Using the iteration commands is done like this:
set s [array startsearch myarray]
while {[array anymore myarray $s]} {
set key [array nextelement myarray $s]
ProcessElement $key $myarray($key)
}
Note that you don't get an option to sort the search. You won't see these used much in production code; doing array names or array get is usually better. And now (well, 8.7 is still in alpha) you've also got array for:
array for {key value} myarray {
ProcessElement $key $value
}
Efficient for large arrays, but still doesn't permit sorting; supporting direct sorting would require a different sort of storage engine on the back of the array.
This is why it's slow: You're sorting the array names once for the foreach command and then again for each element. Sort once and cache it, then you can iterate over it much more efficiently
set sorted_names [lsort -dictionary [array names refdata -glob {*,Pt,*}]]
set len [llength $sorted_names]
for {set i 0; set j 1} {$i < $len} {incr i; incr j} {
set this_name [lindex $sorted_names $i]
set next_name [lindex $sorted_names $j]
# ...
}

To add preceding digits in a list in Tcl

I am trying to incorporate the preceding digits in a numerical list, say 1-14, before the list 15-43.
I am using TCL for writing my code.
I am expecting the list should be 1,2,3,....43, instead the list is coming to be 15,16,17,...43.
I have tried to incorporate the missing numbers as follows:
set nres2 ""
set x 0
set y [lindex $nres $x]
while {$x < [llength $nres]} {
set i [lindex $nres $x]
while {$y < $i} {
lappend nres2 $y
incr y
}
incr x
incr y
}
This will incorporate missing numbers within a list like, if a list 15,16,17...,43 do not have numbers like 18, 22, 34, etc., it will incorporate these numbers in a separate list named as nres2.
But, I cannot include the preceding numbers of a corresponding list.
Any comments/suggestions will be greatly helpful.
Thanks in advance...
Prathit Chatterjee
I would change the y at the start:
set nres2 ""
set x 0
set y 1
while {$x < [llength $nres]} {
set i [lindex $nres $x]
while {$y < $i} {
lappend nres2 $y
incr y
}
incr x
incr y
}
With nres as 4 6 8 9, nres2 becomes 1 2 3 5 7, which is what I think you are trying to get.
Jerry posted already the immediate fix, but you may think about general improvements over your implementation using pairwise comparisons:
proc printMissings1 {lst} {
set b [lrepeat [lindex $lst end] 0]
foreach i $lst {
lset b $i 1
}
lsearch -exact -integer -all $b 0
}
printMissings1 $nres
lrepeat is used to create a Tcl list (array) of [llength $nres] elements.
lset is used to mark the elements whose indices correspond to values in the given range.
lsearch is used to return the indices of the elements left unmarked, which correspond to the missing values.

Finding Median and average of list in tcl

I am having trouble finding a way to calculate the median and average of a list of numbers and the resources online seem to be really limited with Tcl. So far I managed to only print the numbers of the list.
Your help would be greatly appreciated.
proc ladd {l} {
set total 0
set counter 0
foreach nxt $l {
incr total $nxt
incr counter 1
}
puts "$total"
puts "$counter"
set average ($total/$counter)
puts "$average"
}
set a [list 4 3 2 1 15 6 29]
ladd $a
To get the average (i.e., the arithmetic mean) of a list, you can just do:
proc average {list} {
expr {[tcl::mathop::+ {*}$list 0.0] / max(1, [llength $list])}
}
That sums the values in the list (the trailiing 0.0 forces the result to be a floating point value, even if all the added numbers are integers) and divides by the number of elements (or 1 if the list is empty so an empty list gets a mean of 0.0 instead of an error).
To get the median of a list, you have to sort it and pick the middle element.
proc median {list {mode -real}} {
set list [lsort $mode $list]
set len [llength $list]
if {$len & 1} {
# Odd number of elements, unique middle element
return [lindex $list [expr {$len >> 1}]]
} else {
# Even number of elements, average the middle two
return [average [lrange $list [expr {($len >> 1) - 1] [expr {$len >> 1}]]]
}
}
To complete the set, here's how to get the mode of the list if there is a unique one (relevant for some applications where values are selected from a fairly small set):
proc mode {list} {
# Compute a histogram
foreach val $list {dict incr h $val}
# Sort the histogram in descending order of frequency; type-puns the dict as a list
set h [lsort -stride 2 -index 1 -descending -integer $h]
# The mode is now the first element
return [lindex $h 0]
}
I'll leave handling the empty and non-unique cases as an exercise.

Combinations of all charcaters and all lengths with using less number of loops?

Brain Teaser: I self originated this question, but stuck completely.
I want to create all possible combination of all characters, but of all possible lengths. Suppose, [a-z] combination of 1 length, then [a-z] combination of 2 length, and so on till the maximum length achieved.
this could be very easily done by iterative looping.
Example for 3 length:
proc triples list {
foreach i $list {
foreach j $list {
foreach k $list {
puts [list $i $j $k]
}
}
}
}
But, it should solve using less loops (looping needs to be dynamic)
set chars "abcdefghijklmnopqrstuvwxyz"
set chars [split $chars ""]
set complete_length [llength $chars]
set start 0
set maximum_length 15
while {1} {
if {$start > $maximum_length} {
break
}
for {set i [expr $maximum_length-$start]} {$i >= 0} {incr i -1} {
# dump combinations
}
incr start
}
In this chunk, what algorithm or method i should apply? Any kind of suggestions/help/code will be appreciated.
Sry, this is not an answer, but hopefully some interesting discussion anyway:
The word "combinations" is often used way too generally, so it can be interpreted in many different ways. Let's say that you have a source list of 26 different elements, the english letters, and you want to pick 3 of them and combine in a 3 element destination list:
Can you always pick any letter from the source list, or do the elements disappear from it as you pick them? Either define "pick" (are the elements copied or moved during a pick), or define the set of source values (is there 1 of each of A-Z or an infinite amount of A-Z).
Does the order in the destination list matter? Is AHM considered to be the same combination as HAM? Define "combine".
If you have a list where not all elements are different, e.g. {2 10 10 64 100}, you have even more possibilities. Define your set of values.
Your first example prints permutations, not combinations. If that's what you want, the easiset way is a recursive procedure. Combinations are more complicated to generate.
EDIT:
I wrote this procedure for a Project Euler program. It picks all the elements, but maybe you can modify it to pick n. It takes a command prefix as argument, so you don't have to store all permutations.
package require Tcl 8.5.0
proc forEachPerm {list cmdPrefix} {
_forEachPerm {} $list $cmdPrefix
}
proc _forEachPerm {head list cmdPrefix} {
if {![llength $list]} {
{*}$cmdPrefix $head
} else {
for {set i 0} {$i < [llength $list]} {incr i} {
_forEachPerm [concat $head [lrange $list $i $i]] [lreplace $list $i $i] $cmdPrefix
}
}
}
# example use:
forEachPerm {a b c} {apply {{list} {puts [join $list]}}}

Searching for a number in a sorted list in Tcl

I'm using Tcl. I have a sorted list of real numbers. Given the number n I need to find an index of the list element which is:
either less of equal to n;
or greater than n.
Is there any standard way to do this? lsearch expects exact match and can't be used.
With Tcl 8.6 (still in beta) lsearch will do what your asking, the -sorted and new -bisect options allow the following:
-bisect
Inexact search when the list elements are in sorted order. For an increasing list the last index where the element is less than
or equal to the pattern is returned. For a decreasing list the last
index where the element is greater than or equal to the pattern is
returned.
For Tcl versions prior to 8.6 your going to have to roll your own code, given that the list is sorted it should be fairly straightforward to write a binary search with the properties that you require, Rosetta code here contains a description of a pure binary search and also a Tcl implemention. You should be able to use this as your starting point.
Here is a very quick version I created, it returns the index of either the value you search for or the value greater than it. The exception to watch for is the end of the list, searching for a value beyond the largest element returns the largest element. It's only had minimal testing so if you do use it do some additional tests! I also don't stop if the search finds the value, if this is likely to happen often you may want to optimize for this.
set lst [lsort -real [list 1.2 3.4 5.4 7.9 2.3 1.1 0.9 22.7 4.3]]
puts $lst
# Assumes that lst is sorted in ascending order
proc bisect { lst val } {
puts "Looking for $val in $lst"
set len [llength $lst]
# Initial interval - the start to the middle of the list
set start 0
set end [expr $len - 1]
set mid [expr $len / 2]
set lastmid -1
while { $mid != $lastmid } {
if { [expr $val <= [lindex $lst $mid]] } {
# val lies somewhere between the start and the mid
set end $mid
} else {
# val lies somewhere between mid and end
set start [expr $mid + 1]
}
set lastmid $mid
set mid [expr ($start + $end ) / 2]
}
return $mid
}
set res [bisect $lst 2.4]
puts "found [lindex $lst $res] at index $res"
set res [bisect $lst -1]
puts "found [lindex $lst $res] at index $res"
set res [bisect $lst 999]
puts "found [lindex $lst $res] at index $res"
set res [bisect $lst 1.2]
puts "found [lindex $lst $res] at index $res"
set res [bisect $lst 0.9]
puts "found [lindex $lst $res] at index $res"
set res [bisect $lst 22.7]
puts "found [lindex $lst $res] at index $res"