Finding Median and average of list in tcl - tcl

I am having trouble finding a way to calculate the median and average of a list of numbers and the resources online seem to be really limited with Tcl. So far I managed to only print the numbers of the list.
Your help would be greatly appreciated.
proc ladd {l} {
set total 0
set counter 0
foreach nxt $l {
incr total $nxt
incr counter 1
}
puts "$total"
puts "$counter"
set average ($total/$counter)
puts "$average"
}
set a [list 4 3 2 1 15 6 29]
ladd $a

To get the average (i.e., the arithmetic mean) of a list, you can just do:
proc average {list} {
expr {[tcl::mathop::+ {*}$list 0.0] / max(1, [llength $list])}
}
That sums the values in the list (the trailiing 0.0 forces the result to be a floating point value, even if all the added numbers are integers) and divides by the number of elements (or 1 if the list is empty so an empty list gets a mean of 0.0 instead of an error).
To get the median of a list, you have to sort it and pick the middle element.
proc median {list {mode -real}} {
set list [lsort $mode $list]
set len [llength $list]
if {$len & 1} {
# Odd number of elements, unique middle element
return [lindex $list [expr {$len >> 1}]]
} else {
# Even number of elements, average the middle two
return [average [lrange $list [expr {($len >> 1) - 1] [expr {$len >> 1}]]]
}
}
To complete the set, here's how to get the mode of the list if there is a unique one (relevant for some applications where values are selected from a fairly small set):
proc mode {list} {
# Compute a histogram
foreach val $list {dict incr h $val}
# Sort the histogram in descending order of frequency; type-puns the dict as a list
set h [lsort -stride 2 -index 1 -descending -integer $h]
# The mode is now the first element
return [lindex $h 0]
}
I'll leave handling the empty and non-unique cases as an exercise.

Related

How to create an efficient permutation algorithm in Tcl?

I have written the following proc in tcl which gives a permutation of the set {1, 2, ..., n} for some positive integer n:
proc permu {n} {
set list {}
while {[llength $list] < $n} {
set z [expr 1 + int(rand() * $n)]
if {[lsearch $list $z] == -1} {
lappend list $z
}
}
return $list
}
I have used some code snippets from tcl-codes which I found on other web sites in order to write the above one.
The following part of the code is problematic:
[lsearch $list $z] == -1
This makes the code quite inefficient. For example, if n=10000 then it takes a few seconds
until the result is displayed and if n=100000 then it takes several minutes. On the other hand, this part is required as I need to check whether a newly generated number is already in my list.
I need an efficient code to permute the set {1, 2, ..., n}. How can this be solved in tcl?
Thank you in advance!
Looking up a value in a list is a problem that grows in runtime as the list gets larger. A faster way is to look up a key in a dictionary. Key lookup time does not increase as the size of the dictionary increases.
Taking advantage of the fact the Tcl dictionary keys are ordered by oldest to most recent:
proc permu {n} {
set my_dict [dict create]
while {[dict size $my_dict] < $n} {
set z [expr 1 + int(rand() * $n)]
if {![dict exists $my_dict $z]} {
dict set my_dict $z 1
}
}
return [dict keys $my_dict]
}
This fixes the problem of slow list lookup, but the random number z is now the limiting factor. As the dict size approaches $n you need to wait longer and longer for a new value of z to be a unique value.
A different faster approach is to first assign the numbers 1 to n as value to randomized keys in a dict. Next, you can get values of each sorted key.
proc permu2 {n} {
# Add each number in sequence as a value to a dict for a random key.
set random_key_dict [dict create]
for {set i 1} {$i <= $n} {incr i} {
while {1} {
set random_key [expr int(rand() * $n * 100000)]
if {![dict exists $random_key_dict $random_key]} {
dict set random_key_dict $random_key $i
break
}
}
}
# Sort the random keys to shuffle the values.
set permuted_list [list]
foreach key [lsort -integer [dict keys $random_key_dict]] {
lappend permuted_list [dict get $random_key_dict $key]
}
return $permuted_list
}

can I automate the calculation of the moving average for some data in tcl?

I wrote a program that calculates the moving average for some data. The problem is that I can't automatize, I mean if i want to do the process by 10 steps or more it is not smart to write 10 lines with the terms of exchange.
The section that calculates the moving average by 4 steps is:
set aux [lindex $line 4]
set T [lindex $line 1]
set aux1 [lrange $valores 1 1]
set valores [lreplace $valores 0 0 $aux1]
set aux1 [lrange $valores 2 2]
set valores [lreplace $valores 1 1 $aux1]
set aux1 [lrange $valores 3 3]
set valores [lreplace $valores 2 2 $aux1]
set aux1 [lrange $valores 4 4]
set valores [lreplace $valores 3 3 $aux1]
set valores [lreplace $valores 4 4 $aux]
set promP [avg $valores]
I know that i have to use a for loop, but the attempts i have made didn't work.
Assuming that you're keeping a window on the data, it's not too difficult. The trick is to make a procedure to do the critical work.
set WINDOW_SIZE 10
set storedData {}
proc updateMovingAverage {value} {
global storedData WINDOW_SIZE
set storedData [lreplace [list {*}$storedData $value] 0 end-$WINDOW_SIZE]
return [expr {[tcl::mathop::+ {*}$storedData] / double([llength $storedData])}]
}
Or you could create a class:
oo::class create MovingAverage {
variable window size
constructor {{windowSize 10}} {
set window {}
set size $windowSize
}
method item {value} {
set window [lreplace [list {*}$window $value] 0 end-$size]
return
}
method average {} {
return [expr {[tcl::mathop::+ {*}$window] / double([llength $window])}]
}
}
The class splits apart the adding of an item and the calculating of the average. The latter is a standard pattern in Tcl now. The trick with adding an item is to append the item to a list and then trim off the front of the list if it is larger than the desired window; the list {*}$thing $value does the append-an-item, and the lreplace THING 0 end-$wantedLength does the prefix trim (it's replacing them with an empty sequence of items).
Here's a more efficient version.
oo::class create MovingAverage {
variable window size index
constructor {{windowSize 10}} {
set window {}
set size $windowSize
set index 0
}
method item {value} {
lset window $index $value
set index [expr {($index + 1) % $size}]
return
}
method average {} {
return [expr {[tcl::mathop::+ {*}$window] / double([llength $window])}]
}
}
This uses the fact that, froom 8.6 onwards (entirely coincidentally when classes are integrated), the lset command can append an item to a list.
A coroutine version, like Donal mentioned in a comment:
#!/usr/bin/env tclsh
package require Tcl 8.6
proc moving_average {{size 10}} {
set window [list [yield [info coroutine]]]
set index 0
while {true} {
set avg [expr {[tcl::mathop::+ {*}$window] / double([llength $window])}]
set index [expr {($index + 1) % $size}]
lset window $index [yield $avg]
}
}
# Example usage, plus commented-out example usage of Donal's
# class version to compare the two.
#set class_avg [MovingAverage new]
coroutine coro_avg moving_average
for {set n 1} {$n < 20} {incr n} {
# $class_avg item $n
# puts "OO: [$class_avg average]"
puts "coro: [coro_avg $n]"
}
It and the class version have the advantage over the standard proc version like Donal's first example in that it doesn't rely on any global variables, so you can have multiple different data sets with different window sizes all being computed at the same time in different coroutines/objects.

To add preceding digits in a list in Tcl

I am trying to incorporate the preceding digits in a numerical list, say 1-14, before the list 15-43.
I am using TCL for writing my code.
I am expecting the list should be 1,2,3,....43, instead the list is coming to be 15,16,17,...43.
I have tried to incorporate the missing numbers as follows:
set nres2 ""
set x 0
set y [lindex $nres $x]
while {$x < [llength $nres]} {
set i [lindex $nres $x]
while {$y < $i} {
lappend nres2 $y
incr y
}
incr x
incr y
}
This will incorporate missing numbers within a list like, if a list 15,16,17...,43 do not have numbers like 18, 22, 34, etc., it will incorporate these numbers in a separate list named as nres2.
But, I cannot include the preceding numbers of a corresponding list.
Any comments/suggestions will be greatly helpful.
Thanks in advance...
Prathit Chatterjee
I would change the y at the start:
set nres2 ""
set x 0
set y 1
while {$x < [llength $nres]} {
set i [lindex $nres $x]
while {$y < $i} {
lappend nres2 $y
incr y
}
incr x
incr y
}
With nres as 4 6 8 9, nres2 becomes 1 2 3 5 7, which is what I think you are trying to get.
Jerry posted already the immediate fix, but you may think about general improvements over your implementation using pairwise comparisons:
proc printMissings1 {lst} {
set b [lrepeat [lindex $lst end] 0]
foreach i $lst {
lset b $i 1
}
lsearch -exact -integer -all $b 0
}
printMissings1 $nres
lrepeat is used to create a Tcl list (array) of [llength $nres] elements.
lset is used to mark the elements whose indices correspond to values in the given range.
lsearch is used to return the indices of the elements left unmarked, which correspond to the missing values.

tcl to add numbers in columns till pattern mismatches

Hi I need to add numbers in a column till pattern matches and then to start adding numbers after pattern matches, for example:
start 1
start 2
start 3
pattern
start 4
start 5
start 6
I need to have sum as 6 till pattern and 15 after pattern separately, i tried regexp start but it adds all the numbers in 2nd column irrespective of 'pattern', i know sed works, but i need in tcl-regexp only
With minimal change to your current code and your current attempt/method to reach the desired outcome, this is what I suggest:
set sum1 0
set sum2 0
set ind 0
set skip true
while {![eof $file]} {
# Notice the change of $x to x here
gets $file x
if {[regexp start $x]} {
set ind [lindex $x 1]
# Depending on $skip, add the number to either sum1 or sum2
if {$skip == "true"} {
set sum1 [expr $sum1 + $ind]
} else {
set sum2 [expr $sum2 + $ind]
}
}
if {[regexp pattern $x]} {
set skip "false"
}
}
puts $sum1
puts $sum2
Though, I would use the following to make things a bit simpler:
set sum 0
while {[gets $file x] != -1} {
# if there line has "pattern, then simply print the current sum, then resets it to zero
if {[regexp pattern $x]} {
puts $sum
set sum 0
} elseif {[regexp {start ([0-9]+)} $x - number]} {
# if the line matches 'start' followed by <space> and a number, save that number and add it to the sum
# also, I prefer using incr here than expr. If you do want to use expr, brace your expression [expr {$sum+$ind}]
incr sum $number
}
}
# puts the sum
puts $sum

Searching for a number in a sorted list in Tcl

I'm using Tcl. I have a sorted list of real numbers. Given the number n I need to find an index of the list element which is:
either less of equal to n;
or greater than n.
Is there any standard way to do this? lsearch expects exact match and can't be used.
With Tcl 8.6 (still in beta) lsearch will do what your asking, the -sorted and new -bisect options allow the following:
-bisect
Inexact search when the list elements are in sorted order. For an increasing list the last index where the element is less than
or equal to the pattern is returned. For a decreasing list the last
index where the element is greater than or equal to the pattern is
returned.
For Tcl versions prior to 8.6 your going to have to roll your own code, given that the list is sorted it should be fairly straightforward to write a binary search with the properties that you require, Rosetta code here contains a description of a pure binary search and also a Tcl implemention. You should be able to use this as your starting point.
Here is a very quick version I created, it returns the index of either the value you search for or the value greater than it. The exception to watch for is the end of the list, searching for a value beyond the largest element returns the largest element. It's only had minimal testing so if you do use it do some additional tests! I also don't stop if the search finds the value, if this is likely to happen often you may want to optimize for this.
set lst [lsort -real [list 1.2 3.4 5.4 7.9 2.3 1.1 0.9 22.7 4.3]]
puts $lst
# Assumes that lst is sorted in ascending order
proc bisect { lst val } {
puts "Looking for $val in $lst"
set len [llength $lst]
# Initial interval - the start to the middle of the list
set start 0
set end [expr $len - 1]
set mid [expr $len / 2]
set lastmid -1
while { $mid != $lastmid } {
if { [expr $val <= [lindex $lst $mid]] } {
# val lies somewhere between the start and the mid
set end $mid
} else {
# val lies somewhere between mid and end
set start [expr $mid + 1]
}
set lastmid $mid
set mid [expr ($start + $end ) / 2]
}
return $mid
}
set res [bisect $lst 2.4]
puts "found [lindex $lst $res] at index $res"
set res [bisect $lst -1]
puts "found [lindex $lst $res] at index $res"
set res [bisect $lst 999]
puts "found [lindex $lst $res] at index $res"
set res [bisect $lst 1.2]
puts "found [lindex $lst $res] at index $res"
set res [bisect $lst 0.9]
puts "found [lindex $lst $res] at index $res"
set res [bisect $lst 22.7]
puts "found [lindex $lst $res] at index $res"