How to generate a summary from a list? - tcl

I have a list:
set aList {aa aa aa bb bb cc cc cc cc aa aa bb cc cc cc cc}
I would like to generate a summary that looks something like:
3 aa 2 bb 4 cc 2 aa 1 bb 4 cc
I am able to generate the summary by using a foreach loop, but am looking for a better solution.

What you are asking for is basically an RLE encoding command. I don't know of any, but there might be one somewhere -- whether it fits your format is another matter. There is a page called RLE on the wiki, but that's just a user signature.
Otherwise, it's hard to avoid iterating commands when processing an iterative structure. You could try recursion:
set aList {aa aa aa bb bb cc cc cc cc aa aa bb cc cc cc cc}
proc summarize {result theList} {
    if {[llength $theList] == 0} {
        return $result
    } else {
        summarize {*}[count $result $theList 1]
    }
}
proc count {result theList theCount} {
    set theList [lassign $theList theToken]
    if {$theToken eq [lindex $theList 0]} {
        count $result $theList [incr theCount]
    } else {
        list [lappend result $theCount $theToken] $theList
    }
}
summarize [list] $aList
Brief commentary: the summarize function works with two lists, an even-sized (initially empty) list of counts and tokens (result) and a list of unprocessed tokens (theList). If theList is empty, the result list is the result of the operation. If there is at least one token in theList, we prepare a new pair of similar lists by counting how many identical tokens we have in the head of theList, and summarize those two lists instead. For every application, result will be longer by one pair, and theList will be shorter by some number of tokens ≥1.
It could be expressed symbolically (with quasi-ML notation) as
 summarize (R, Ø) → R
 summarize (R, L) → summarize count (R, L, 1)
The count function takes the same two lists, and also a count (initially 1: if theList wasn't empty, we will always have at least one token to count). It breaks off one token at the head of theList and compares it to the (now shortened) theList. If theToken is identical to the first token in theList, we increment theCount and apply count again. If it isn't, we extend result with theCount and theToken, having finished counting one kind of token (execution then passes back through the layers of calls back to summarize).
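Note that calling count with an empty list for theList results in infinite recursion, because the empty token always compares equal to the (empty) head of the empty list. This is not a problem here: summarize never calls count unless there are tokens in theList.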
 count (R, (T :: (T :: L)), N) → count (R, (T :: L), N+1)
 count (R, (a :: L), N) → ((R # (N :: (a :: Ø))), L)
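The empty-list hazard noted above can be checked directly. This is only a sketch: count is copied from this answer so the snippet runs on its own, and the variable names failed and msg are made up for the illustration.

```tcl
proc count {result theList theCount} {
    set theList [lassign $theList theToken]
    if {$theToken eq [lindex $theList 0]} {
        count $result $theList [incr theCount]
    } else {
        list [lappend result $theCount $theToken] $theList
    }
}

# With an empty theList, theToken and [lindex $theList 0] are both the
# empty string, so count keeps calling itself until the interpreter's
# recursion limit raises an error. catch turns that error into a flag.
set failed [catch {count {} {} 1} msg]
puts "failed: $failed"
puts "message: $msg"
```

The same thing happens when the last element of the input is itself an empty string, so the recursive version is best used on lists of non-empty tokens.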

The following function, which runs through the list and maintains a count of the current item, works:
proc group_identical {slist} {
    set newlist {}
    set current [lindex $slist 0]
    set count 0
    foreach ele $slist {
        if { ![string compare $ele $current] } { ;# same as current
            incr count
        } else { ;# new element
            lappend newlist $count $current
            set current $ele
            set count 1
        }
    }
    lappend newlist $count $current
    return $newlist
}
set aList {aa aa aa bb bb cc cc cc cc aa aa bb cc cc cc cc}
puts [group_identical $aList]
Output:
3 aa 2 bb 4 cc 2 aa 1 bb 4 cc
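Since the summary is a run-length encoding, it can be expanded back into the original list, which is a handy way to check any of the encoders above. A minimal sketch; expand_summary is a hypothetical helper name, not part of Tcl or of the answers here.

```tcl
# Hypothetical helper: turn a {count token count token ...} summary
# back into the flat list it was generated from.
proc expand_summary {summary} {
    set result {}
    foreach {n token} $summary {
        # lrepeat builds a list holding n copies of token
        lappend result {*}[lrepeat $n $token]
    }
    return $result
}

set summary {3 aa 2 bb 4 cc 2 aa 1 bb 4 cc}
puts [expand_summary $summary]
```

This relies on {*} expansion, so it assumes Tcl 8.5 or later.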

Related

Pass few but not all optional arguments to a Tcl procedure

In TCL the way to make a parameter optional is to give it a default value. I don't know if there are any other ways too. e.g
proc my_func {a b c {d 10} {e 11} {f 12}} {
...
}
Now in the above example the parameters a, b and c are compulsory. The parameters d, e and f are optional. Is there another way to create optional parameters?
I am in a situation where I need to create a procedure that can be called from a TCL terminal (in Xilinx Vivado) and that has some optional parameters. The user may decide to pass a few or all of the optional parameters, or none at all. The problem is that, when using positional argument passing, it is impossible to tell TCL which optional parameter we are passing to it. What is the solution to this? e.g.
my_func 1 2 3 4 5 6
shall call the my_func with values a=1, b=2, c=3, d=4, e=5 and f=6. Also,
my_func 1 2 3 4
shall call my_func with values a=1, b=2, c=3 and d=4 and the e, f left at their default values. However, I might need to do something like this
my_func 1 2 3 100
where I am passing 100 to f and leaving d and e at their default values. But the above statement will set d to 100 instead and leave e and f at their default values.
What is the solution, since I clearly cannot use the positional argument technique here?
A readable way to design the function is to do it Tk style: use -d 100 options:
proc my_func {a b c args} {
    set opts [dict merge {-d 10 -e 11 -f 12} $args]
    puts "a = $a"
    puts "b = $b"
    puts "c = $c"
    puts "d = [dict get $opts -d]"
    puts "e = [dict get $opts -e]"
    puts "f = [dict get $opts -f]"
}
Then when you use them, you can specify them in any order:
% my_func
wrong # args: should be "my_func a b c ?arg ...?"
% my_func 1 2 3
a = 1
b = 2
c = 3
d = 10
e = 11
f = 12
% my_func 1 2 3 -e 100 -d 200
a = 1
b = 2
c = 3
d = 200
e = 100
f = 12
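One thing dict merge does not do is reject misspelled options: an unknown key such as -x is silently merged and then ignored. A sketch of one way to add that check, assuming the same option names and defaults as above; my_func_checked is a made-up name, and it returns the values as a list instead of printing them so the result is easy to inspect.

```tcl
proc my_func_checked {a b c args} {
    set defaults {-d 10 -e 11 -f 12}
    # Options must arrive as -name value pairs
    if {[llength $args] % 2 != 0} {
        return -code error "options must be given as -name value pairs"
    }
    # Reject any option name that has no default
    foreach {name value} $args {
        if {![dict exists $defaults $name]} {
            return -code error "unknown option \"$name\": must be one of [join [dict keys $defaults] {, }]"
        }
    }
    set opts [dict merge $defaults $args]
    return [list $a $b $c [dict get $opts -d] [dict get $opts -e] [dict get $opts -f]]
}

puts [my_func_checked 1 2 3 -f 100]
if {[catch {my_func_checked 1 2 3 -x 1} msg]} {
    puts "rejected: $msg"
}
```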
If the final argument in your proc definition is literally args, then the remaining arguments (if any) are collected in a list.
This proc demonstrates how d, e and f can be made optional. Each optional argument is passed as a {name value} pair.
proc my_func {a b c args} {
    set defaults {d 10 e 11 f 12}
    foreach {var_name var_value} $defaults {
        set $var_name $var_value
    }
    foreach arg $args {
        set [lindex $arg 0] [lindex $arg 1]
    }
    puts "a:$a b:$b c:$c d:$d e:$e f:$f"
}
tcl8.6.8> my_func 1 2 3
a:1 b:2 c:3 d:10 e:11 f:12
tcl8.6.8> my_func 1 2 3 {d 5} {e 8} {f 99}
a:1 b:2 c:3 d:5 e:8 f:99
tcl8.6.8> my_func 1 2 3 {f 99}
a:1 b:2 c:3 d:10 e:11 f:99
The below is a minor variation on the solutions already suggested. By using dict with, one can unpack the dictionary content into the proc-local scope as variables:
proc my_func {a b c args} {
    set () [dict merge {(d) 10 (e) 11 (f) 12} $args]
    dict with () {}
    puts "a = $a"
    puts "b = $b"
    puts "c = $c"
    puts "d = $(d)"
    puts "e = $(e)"
    puts "f = $(f)"
}
Some remarks:
To avoid collisions with other (existing?) proc-local variables, the optional parameters are denoted as elements of an array named using the empty string: ().
dict with will unpack the so-named keys into that array: (e), (f), ...
The processed optionals can be accessed via $ syntax: $(e), $(f), ...
Watch:
my_func 1 2 3
my_func 1 2 3 (e) 100 (d) 200
Yields:
a = 1
b = 2
c = 3
d = 10
e = 11
f = 12
a = 1
b = 2
c = 3
d = 200
e = 100
f = 12

Count number of unique element in a list

Say I have a list, a b c b b d e e f …, and I don't know how many different kinds of elements are in there.
How do I count the number of occurrences of each unique element and print them out?
The output would look like:
a: 32
b: 12
c: 6
…
You have to count them up. This isn't too hard with an array or dictionary of counters. I'll use a dictionary since then they'll be printed in order of first occurrence. (With an array, you'd get a “random” order or you'd have to sort them.)
set counters {}
foreach item $list {
    dict incr counters $item
}
dict for {item count} $counters {
    puts "${item}: $count"
}
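If the report should be ordered by frequency rather than by first occurrence, the dictionary can be treated as a flat list of key/value pairs and sorted on the value. A sketch, assuming Tcl 8.6 (lsort -stride is not available earlier); the sample list and the name byCount are made up for the example.

```tcl
set list {a b c b b d e e f}

set counters {}
foreach item $list {
    dict incr counters $item
}

# A dict is also a list of alternating keys and values, so lsort can
# sort it in groups of two (-stride 2) on the second member (-index 1).
set byCount [lsort -stride 2 -index 1 -integer -decreasing $counters]

foreach {item count} $byCount {
    puts "${item}: $count"
}
```

Here b (3 occurrences) is printed first and e (2 occurrences) second, followed by the elements that occur once.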
Try this if you have Tcl 8.4 or older:
set lst "a a a a b b b c c c d d a a a f f f f f s s s s"
set unique [lsort -unique $lst]
foreach f $unique {
    set cnt 0
    foreach item $lst {
        # string equal compares as strings; == would treat 1 and 1.0 as equal
        if {[string equal $item $f]} {
            incr cnt
        }
    }
    puts "$f :: $cnt"
}
This gives output like:
% tclsh main.tcl
a :: 7
b :: 3
c :: 3
d :: 2
f :: 5
s :: 4
It can easily be done using lsearch and llength.
Let's say your list is {a c a c s a a c a}; then:
set tempList {a c a c s a a c a}
puts "c : [llength [lsearch -all -exact $tempList c]]"
puts "a : [llength [lsearch -all -exact $tempList a]]"
puts "d : [llength [lsearch -all -exact $tempList d]]"
Output:
c : 3
a : 5
d : 0
Explanation: lsearch -all returns the indices of all matching elements, and llength then counts the length of that list of indices. The -exact option makes lsearch compare the elements literally instead of treating the search term as a glob pattern, which is its default.
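One pitfall with this approach is that lsearch matches glob-style unless told otherwise, so an element containing * or ? can be overcounted. The sketch below wraps the idea in a small proc that always uses -exact; count_of and the sample list are made up for the illustration.

```tcl
# Hypothetical helper: number of elements in theList equal to item
proc count_of {theList item} {
    return [llength [lsearch -all -exact $theList $item]]
}

set tempList {a* ab a a}

# Default glob matching: the pattern a* matches every element here
puts "glob:  [llength [lsearch -all $tempList a*]]"

# Exact matching: only the literal element a* is counted
puts "exact: [count_of $tempList a*]"
```

With this list the glob search reports 4 matches while the exact search reports 1.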
The dict or array solution is the best one and should be preferred. Another way that works on a sorted list of tokens is to match contiguous regions of non-blank tokens.
% regexp -all -inline {(\S+)(?:\s+\1)*} {a a b b b c d d}
{a a} a {b b b} b c c {d d} d
The result is an even-sized list that alternates between a matched region of tokens and the token matched in that region. This can be used to print a frequency report for the tokens in a list stored in the variable list:
foreach {a b} [regexp -all -inline {(\S+)(?:\s+\1)*} [lsort $list]] {
puts "$b: [llength $a]"
}
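Note the limitation that the tokens cannot contain blanks. In addition, the back-reference is not anchored at the end of a token, so a token that is a prefix of the token following it (for example a followed by ab) is miscounted. These problems can be overcome, but it's simpler to use the array / dict solution, which only requires the tokens to be valid list elements.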
Documentation: foreach, llength, lsort, puts, Syntax of Tcl regular expressions, regexp

Is there a way to print an array in the order in which its entries were entered? [TCL]

I have an array in Tcl, say
set count(a) b
set count(b) b
set count(c) b
set count(e) b
set count(d) b
set count(z) b
set count(m) b
When I print this I get output
array names count
d m e a z b c
Is there a way I can get the same order in which I have written the array?
Use a dict instead (mostly the same thing, just another syntax):
dict set count a b
dict set count b b
dict set count c b
dict set count e b
dict set count d b
dict set count z b
dict set count m b
The following prints the keys in insertion order
% dict keys $count
a b c e d z m
If you want to have it both ways, assign to the dictionary and recreate an array when desired using
array unset countArray
array set countArray $count
dict was added in Tcl 8.5. While an array never preserves insertion order for its elements, original insertion order is kept for dict elements even after later assignments.
Dictionaries and arrays are both implemented as hash tables and have some overlap in functionality. However, arrays are primarily containers for variables and allow elements to be individually traced. Dictionaries are containers of values, and can be interchanged with other kinds of data (the dict command ensemble can only use even-sized proper lists).
Documentation: array, dict
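The ordering behaviour described above can be seen in a short session. This is a sketch with made-up keys and values: overwriting an existing key keeps its position, while removing a key and adding it again moves it to the end.

```tcl
set count [dict create a 1 b 2 c 3]

dict set count a 99        ;# overwrite: a keeps its original position
dict set count d 4         ;# new key: appended at the end
set keysAfterUpdate [dict keys $count]
puts $keysAfterUpdate

# Recreate an array from the dictionary when array syntax is wanted
array unset countArray
array set countArray $count
puts $countArray(a)

dict unset count a         ;# remove a ...
dict set count a 1         ;# ... and re-add it: now it is the last key
set keysAfterReinsert [dict keys $count]
puts $keysAfterReinsert
```

The first key list is a b c d and the second is b c d a.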
According to the Tcl wiki, you can't do it:
Array keys are not ordered. It isn't straight-forward to get values out of an array in the same order that they were set. One common alternative is to get the names and then order them. In contrast, values in a dict are ordered.
Using dict (Tcl 8.5 and later) is recommended. This is how you can do it with an array though:
array set foo {}
set fooOrder [list]
trace add variable foo write bar
proc bar {args} {
    global fooOrder
    # args is {arrayName elementName operation}
    lappend fooOrder [lindex $args 1]
}
set foo(a) 10
set foo(c) 20
set foo(b) 30
puts "Default behaviour..."
parray foo
puts "Maintaining the order..."
foreach key $fooOrder {
    puts "foo($key) = $foo($key)"
}
Output:
sharad#ss:~$ tclsh my.tcl
Default behaviour...
foo(a) = 10
foo(b) = 30
foo(c) = 20
Maintaining the order...
foo(a) = 10
foo(c) = 20
foo(b) = 30
sharad#ss:~$
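A trace that appends on every write records a key again each time its element is overwritten. The variant below is a sketch that only records a key the first time it is seen; record_key is a made-up name, the ni operator assumes Tcl 8.5 or later, and unsetting elements is not handled.

```tcl
array set foo {}
set fooOrder [list]

# Write-trace callback: Tcl passes the array name, the element name
# and the operation that triggered the trace.
proc record_key {arrayName key op} {
    global fooOrder
    if {$key ni $fooOrder} {
        lappend fooOrder $key
    }
}
trace add variable foo write record_key

set foo(a) 10
set foo(c) 20
set foo(b) 30
set foo(a) 40   ;# overwriting does not add a second entry for a

foreach key $fooOrder {
    puts "foo($key) = $foo($key)"
}
```

The keys are still reported in the order a, c, b, with a showing its latest value.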

Print multiple tcl lists in a uniform manner

I have a group of lists, some with strings, some with numbers and some with both. All these lists have variable lengths. I would like to know the best way to print them to a file so that the columns are evenly spaced.
For example, I use,
set numbers {0 1 2 3 4}
set type {dog reallybigbaddog thisisaevenlargersentence cat bird}
set paths {aaa bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb ccc ddddddddddddddddd efgh}
puts $fid "NUMBERS\t\tTYPE\tPATHS"
foreach numbersval $numbers typeval $type pathsval $paths {
puts $fid "$numbersval\t\t$typeval\t$pathsval"
}
The result was,
NUMBERS         TYPE    PATHS
0               dog     aaa
1               reallybigbaddog bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
2               thisisaevenlargersentence       ccc
3               cat     ddddddddddddddddd
4               bird    efgh
I tried using "format" based on one of the suggestions on this site, but that resulted in similar output. I guess we need a way to determine what the longest string is, and can't arbitrarily use "\t"? I would appreciate any better suggestions.
For reference, this is how you could do it with struct::matrix and report:
package require struct::matrix
package require report
set nrows 5
set ncols 3
set npads [expr {$ncols + 1}]
struct::matrix m
m add rows $nrows
m add column {0 1 2 3 4}
m add column {dog reallybigbaddog thisisaevenlargersentence cat bird}
m add column {aaa bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb ccc ddddddddddddddddd efgh}
m insert row 0 {NUMBERS TYPE PATHS}
report::report r $ncols
r data set [lrepeat $npads \t]
m format 2string r
(This uses only a fraction of the formatting power of report.) This method can handle values with spaces in them.
Result (there is a tab character to the left of the first column on each row, but it's lost in the formatting here.):
NUMBERS TYPE                            PATHS
0       dog                             aaa
1       reallybigbaddog                 bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
2       thisisaevenlargersentence       ccc
3       cat                             ddddddddddddddddd
4       bird                            efgh
Documentation: expr, lrepeat, package, report package, set, struct::matrix package
In this case, I'd call out to column -t to do the work for me:
set all "NUMBERS TYPE PATHS\n"
foreach n $numbers t $type p $paths {
append all "$n $t $p\n"
}
set formatted [exec column -t << $all]
puts $formatted
NUMBERS  TYPE                       PATHS
0        dog                        aaa
1        reallybigbaddog            bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
2        thisisaevenlargersentence  ccc
3        cat                        ddddddddddddddddd
4        bird                       efgh
A pure Tcl way to do this:
array set maxl {numbers 0 type 0 paths 0}
foreach l {numbers type paths} {
    # the list name is included so the header is measured too
    foreach e [concat $l [set $l]] {
        if {[set len [string length $e]] > $maxl($l)} {
            set maxl($l) $len
        }
    }
}
puts [format "%-*s %-*s %-*s" $maxl(numbers) NUMBERS $maxl(type) TYPE $maxl(paths) PATHS]
foreach n $numbers t $type p $paths {
    puts [format "%-*s %-*s %-*s" $maxl(numbers) $n $maxl(type) $t $maxl(paths) $p]
}
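The same width calculation can be packaged as a reusable proc that works for any number of columns. This is a sketch, not a polished formatter: format_table is a made-up name, and it assumes every row has exactly as many cells as there are headers.

```tcl
# Hypothetical helper: return headers and rows as left-aligned columns
proc format_table {headers rows} {
    # Start with the header widths, then widen for each longer cell
    set widths {}
    foreach h $headers {
        lappend widths [string length $h]
    }
    foreach row $rows {
        set i 0
        foreach cell $row {
            if {[string length $cell] > [lindex $widths $i]} {
                lset widths $i [string length $cell]
            }
            incr i
        }
    }
    # Pad every cell to its column width; drop trailing blanks per line
    set lines {}
    foreach row [linsert $rows 0 $headers] {
        set parts {}
        foreach cell $row w $widths {
            lappend parts [format "%-*s" $w $cell]
        }
        lappend lines [string trimright [join $parts " "]]
    }
    return [join $lines \n]
}

set numbers {0 1 2 3 4}
set type {dog reallybigbaddog thisisaevenlargersentence cat bird}
set paths {aaa bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb ccc ddddddddddddddddd efgh}

# Turn the three parallel lists into a list of rows
set rows {}
foreach n $numbers t $type p $paths {
    lappend rows [list $n $t $p]
}
puts [format_table {NUMBERS TYPE PATHS} $rows]
```

To write to a file instead, pass the channel to puts as in the question (puts $fid ...).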

How to pass a dictionary with more arguments into a proc in tcl?

proc test {a b c } {
puts $a
puts $b
puts $c
}
set test_dict [dict create a 2 b 3 c 4 d 5]
Now I want to pass dict into test like this:
test $test_dict
How can I make test select only the three elements in the dict whose keys match the names of its parameters? The expected output should be:
2
3
4
Because it selects a, b and c from the dictionary but not d. How can I do this? I saw some code that does this, but I can't make it work.
I think you should use dict get:
proc test {test_dic} {
    puts [dict get $test_dic a]
    puts [dict get $test_dic b]
    puts [dict get $test_dic c]
}
set test_dict [dict create a 2 b 3 c 4 d 5]
test $test_dict
Edit:
Another variant would be to use dict with:
proc test {test_dic} {
    dict with test_dic {
        puts $a
        puts $b
        puts $c
    }
}
set test_dict [dict create a 2 b 3 c 4 d 5]
test $test_dict
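Note that in both variants test still receives the whole dictionary as a single argument rather than three separate parameters.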
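If test has to keep its original three-parameter signature, the selection can be done by the caller instead. The sketch below uses info args to look up the parameter names and picks the matching values out of the dictionary; call_with_dict is a made-up helper, test returns its arguments as a list here so the result is easy to check, and a parameter with no matching key will raise an error from dict get.

```tcl
proc test {a b c} {
    return [list $a $b $c]
}

# Hypothetical helper: call procName with the values from dictionary d
# whose keys match the proc's parameter names, in parameter order.
proc call_with_dict {procName d} {
    set argv {}
    foreach param [info args $procName] {
        lappend argv [dict get $d $param]
    }
    return [$procName {*}$argv]
}

set test_dict [dict create a 2 b 3 c 4 d 5]
puts [call_with_dict test $test_dict]
```

The extra key d is simply never looked up, so it is ignored.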