tcl to add numbers in columns till pattern mismatches - tcl

Hi, I need to add the numbers in a column until a pattern matches, and then start a new sum for the numbers after the pattern. For example:
start 1
start 2
start 3
pattern
start 4
start 5
start 6
I need one sum (6) for the lines before the pattern and a separate sum (15) for the lines after it. I tried regexp start, but it adds all the numbers in the second column regardless of 'pattern'. I know sed works, but I need this in Tcl with regexp only.

With minimal change to your current code and your current attempt/method to reach the desired outcome, this is what I suggest:
set sum1 0
set sum2 0
set ind 0
set skip true
while {![eof $file]} {
    # Notice the change of $x to x here
    gets $file x
    if {[regexp start $x]} {
        set ind [lindex $x 1]
        # Depending on $skip, add the number to either sum1 or sum2
        if {$skip == "true"} {
            set sum1 [expr {$sum1 + $ind}]
        } else {
            set sum2 [expr {$sum2 + $ind}]
        }
    }
    if {[regexp pattern $x]} {
        set skip "false"
    }
}
puts $sum1
puts $sum2
Though, I would use the following to make things a bit simpler:
set sum 0
while {[gets $file x] != -1} {
    # If the line has "pattern", print the current sum, then reset it to zero
    if {[regexp pattern $x]} {
        puts $sum
        set sum 0
    } elseif {[regexp {start ([0-9]+)} $x - number]} {
        # If the line matches 'start' followed by a space and a number, save that number and add it to the sum.
        # Also, I prefer incr to expr here. If you do want to use expr, brace your expression: [expr {$sum + $number}]
        incr sum $number
    }
}
# Print the final sum
puts $sum
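If you would rather collect the sums than print them as you go, the same loop can be wrapped in a proc. This is only a sketch: the name section_sums is made up for illustration, and it reads from a string instead of an open file so that it runs on its own.

```tcl
# Hypothetical helper (not from the question): returns a list with one sum
# per section, where sections are separated by lines containing "pattern".
proc section_sums {text} {
    set sums {}
    set sum 0
    foreach line [split $text \n] {
        if {[regexp {pattern} $line]} {
            # End of a section: store its sum and start over
            lappend sums $sum
            set sum 0
        } elseif {[regexp {start\s+([0-9]+)} $line -> number]} {
            incr sum $number
        }
    }
    # Store the sum of the last section
    lappend sums $sum
    return $sums
}

set data "start 1\nstart 2\nstart 3\npattern\nstart 4\nstart 5\nstart 6"
puts [section_sums $data]
```

With the sample input from the question this prints 6 15.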

Related

Calculate average of columns of column with Tcl

I want to calculate the average for this column with tcl
please help me
frame Elec
1 50
2 40
3 30
4 20
If this is for a standalone script, (Warning: Self promotion ahead), I wrote a program called tawk that's like awk except using TCL for scripting, which does most of the work for you:
$ tawk 'line {$NR > 1} { incr sum $F(2) }
END { puts [expr {double($sum) / ($NR - 1)}] }' input.txt
35
# Equivalent awk:
$ awk 'NR > 1 { sum += $2 } END { print (sum / (NR - 1)) }' input.txt
35
If it's part of a larger program, you have to open the file and read and split lines yourself. Maybe something like
# Column number is 1-based
proc avg_column {filename column} {
    set f [open $filename r]
    gets $f ;# Read and discard header line
    set sum 0
    set nlines 0
    while {[gets $f line] >= 0} {
        set columns [regexp -all -inline {\S+} $line]
        incr sum [lindex $columns $column-1]
        incr nlines
    }
    close $f
    return [expr {double($sum) / $nlines}]
}
puts [avg_column input.txt 2]
Not an answer, but some tips. You need to:
open the file
read the header with gets
use a while loop to read lines of the file
use split or regexp to get the 2nd field
sum the values (and count the lines) with expr, or incr if the values are only integers
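The last tip matters when the column holds non-integer values, because incr rejects them. Here is a minimal sketch of those steps using expr for the sum. The proc name avg_column_text is made up, and it takes the file contents as a string (rather than opening a file) so that it is self-contained.

```tcl
# Hypothetical helper: average of a 1-based column in whitespace-separated
# text whose first line is a header. Uses expr so that values like 1.5 work.
proc avg_column_text {text column} {
    # Drop the header line
    set lines [lrange [split [string trim $text] \n] 1 end]
    set sum 0.0
    set n 0
    foreach line $lines {
        set fields [regexp -all -inline {\S+} $line]
        # Skip lines that do not have enough fields
        if {[llength $fields] < $column} continue
        set sum [expr {$sum + [lindex $fields [expr {$column - 1}]]}]
        incr n
    }
    # Avoid dividing by zero when there are no data lines
    expr {$n ? $sum / $n : 0.0}
}

set data "frame Elec\n1 50\n2 40\n3 30\n4 20"
puts [avg_column_text $data 2]
```

For the sample data this prints 35.0.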
If your input happens to be (some sort of CSV), or you can steer it into this direction, then you may use tcllib's csv package:
package require csv
package require struct::matrix
struct::matrix dm
set f [open mydata.csv]
while {[gets $f l] >= 0} {
# sanitize input, line-wise
set l [regsub -all {\s+} $l " "]
csv::split2matrix dm $l " " auto
}
close $f
set columnData [lrange [dm get column 1] 1 end]; # strip off header
puts [expr {double([tcl::mathop::+ {*}$columnData])/[llength $columnData]}]; # compute avg
Some hints:
gets will read your input file line by line;
csv::split2matrix puts each line into a struct::matrix;
/matrix/ get column /n/ gives access to one data column (incl. header field);
tcl::mathop::+ gives access to the built-in addition operator (outside of the [expr] command) and supports 2+ summands.

Finding Median and average of list in tcl

I am having trouble finding a way to calculate the median and average of a list of numbers and the resources online seem to be really limited with Tcl. So far I managed to only print the numbers of the list.
Your help would be greatly appreciated.
proc ladd {l} {
    set total 0
    set counter 0
    foreach nxt $l {
        incr total $nxt
        incr counter 1
    }
    puts "$total"
    puts "$counter"
    set average ($total/$counter)
    puts "$average"
}
set a [list 4 3 2 1 15 6 29]
ladd $a
To get the average (i.e., the arithmetic mean) of a list, you can just do:
proc average {list} {
    expr {[tcl::mathop::+ {*}$list 0.0] / max(1, [llength $list])}
}
That sums the values in the list (the trailing 0.0 forces the result to be a floating point value, even if all the added numbers are integers) and divides by the number of elements (or by 1 if the list is empty, so an empty list gets a mean of 0.0 instead of an error).
To get the median of a list, you have to sort it and pick the middle element.
proc median {list {mode -real}} {
    set list [lsort $mode $list]
    set len [llength $list]
    if {$len & 1} {
        # Odd number of elements, unique middle element
        return [lindex $list [expr {$len >> 1}]]
    } else {
        # Even number of elements, average the middle two
        return [average [lrange $list [expr {($len >> 1) - 1}] [expr {$len >> 1}]]]
    }
}
To complete the set, here's how to get the mode of the list if there is a unique one (relevant for some applications where values are selected from a fairly small set):
proc mode {list} {
    # Compute a histogram
    foreach val $list {dict incr h $val}
    # Sort the histogram in descending order of frequency; type-puns the dict as a list
    set h [lsort -stride 2 -index 1 -descending -integer $h]
    # The mode is now the first element
    return [lindex $h 0]
}
I'll leave handling the empty and non-unique cases as an exercise.
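For readers who want a starting point for that exercise, here is one possible sketch. The name modes and the choice to return a list (empty for empty input, several values when there is a tie) are my own assumptions, not part of the answer above.

```tcl
# Hypothetical variant: returns every value that shares the highest
# frequency, in order of first appearance; an empty list gives an empty result.
proc modes {list} {
    if {[llength $list] == 0} {
        return {}
    }
    # Compute a histogram
    set h {}
    foreach val $list {
        dict incr h $val
    }
    # Find the highest frequency
    set best [tcl::mathfunc::max {*}[dict values $h]]
    # Collect every value that reaches it
    set result {}
    dict for {val count} $h {
        if {$count == $best} {
            lappend result $val
        }
    }
    return $result
}

puts [modes {1 2 2 3 3}]
puts [modes {4 3 2 1 15 6 29 4}]
```

The first call prints 2 3 (a tie), the second prints 4.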

How to add the elements of a list in tcl

I have a list as below:
test = {a[2] r[5] f[6] t[8]} {d[32] g[66] k[88]} {w[2] e[33]}
The size of the test list is variable and can have any number of elements.
I want the total to be calculated as:
total = 4 + 3 + 2 = 9
I am trying something like this but it gives an error.
set result ""
set index1 ""
foreach index1 [llength $test] {
set value [llength [lindex $test $index1]]
result1 = expr [$value + $result1]
puts $result1
}
It gives the error below:
invalid command name "0"
Thanks.
To do something for every member of a list, always use foreach, and incr is great for various sorts of counting things. In this case:
set total 0; # In case the input list is empty
foreach sublist $test {
    incr total [llength $sublist]
}
# The value you are looking for is in the “total” variable
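As an aside, the same total can be computed in a single expression. This sketch assumes Tcl 8.6 (for lmap) and that the list from the question is written as a proper Tcl list, with an outer pair of braces.

```tcl
# The list from the question; the braces keep a[2] etc. from being
# treated as command substitutions.
set test {{a[2] r[5] f[6] t[8]} {d[32] g[66] k[88]} {w[2] e[33]}}

# Build the list of sublist lengths (4 3 2), then add them up.
# The leading 0 keeps the sum well-defined even for an empty list.
set total [tcl::mathop::+ 0 {*}[lmap sublist $test {llength $sublist}]]
puts $total
```

This prints 9. The foreach version is probably easier to read, so treat this as an alternative rather than an improvement.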

TCL incr gives wrong value for zero padded integer

I was trying to increment a number which is padded by zeroes to become a six digit number. But strangely any value other than single digit gives a wrong value. like
set x 000660
incr x 1
gives result 433. Also tried with smaller number like 010 but the result is 9. Why is this happening ?
What is the proper way to solve this issue ?
You can try this way too.
proc getIntVal { x } {
    # Using 'scan' command to get the literal integer value
    set count [ scan $x %d n ]
    if { $count != 1 } {
        return -1
    }
    return $n
}
proc padZero { x } {
    # Using 'format' to pad with leading zeroes.
    return [ format "%05d" $x ]
}
set val 00060
puts "Initial value : $val"
set tmp [ getIntVal $val ]; # 'tmp' will have the value as '60'
incr tmp;
set val [ padZero $tmp ]; # Padding with zero now
puts "Final value : $val"
Numbers beginning with 0, like
000660
are octal integers. It is equivalent to decimal 432.
The same goes for 010 (which is 8 in decimal).
To strip off zeros, try this:
proc stripzeros {value} {
    regsub ^0+(.+) $value \\1 retval
    return $retval
}
For more information, see Tcl FAQ: How can I use numbers with leading zeroes?.
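To see the two readings of the same digits side by side without relying on how a particular Tcl version treats a leading zero, scan can be told the base explicitly. A small sketch (the variable names are arbitrary):

```tcl
# %o reads the digits as base 8, %d reads them as base 10
scan 000660 %o asOctal
scan 000660 %d asDecimal
puts "octal reading:   $asOctal"
puts "decimal reading: $asDecimal"

# The same for the shorter example from the question
scan 010 %o smallOctal
puts "010 read as octal: $smallOctal"
```

This prints 432, 660 and 8 respectively, which matches the values incr was working from in the question (433 is 432 + 1, and 9 is 8 + 1).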
Yu Hao already explained the octal problem, and Dinesh added some procs to work around the issue. I suggest creating one proc that takes a zero-padded integer and returns another zero-padded integer in the same format, and which works just like incr:
proc incr_pad {val args} {
    # Check if increment is given properly
    if {[llength $args] == 0} {
        set args 1
    } elseif {[llength $args] > 1} {
        return -code error {wrong # args: should be "incr_pad varName ?increment?"}
    }
    # Check for integers
    if {![regexp {^[0-9]+$} $val]} {
        return -code error "expected integer but got \"$val\""
    } elseif {![regexp {^[0-9]+$} $args]} {
        return -code error "expected integer but got \"$args\""
    }
    # Get number of digits
    set d [regexp -all {[0-9]} $val]
    # Trim 0s to the left
    set newval [string trimleft $val 0]
    # Now use incr
    incr newval $args
    # Return back the number formatted with the same zero padding as initially given
    return [format "%0${d}d" $newval]
}
With this...
% incr_pad 000660 1
000661
% incr_pad 2.5 1
expected integer but got "2.5"
% incr_pad 02 1.5
expected integer but got "1.5"
% incr_pad 010 2
012
% incr_pad 1 2 3
wrong # args: should be "incr_pad varName ?increment?"
% incr_pad 00024
00025
% incr_pad 999
1000
Of course, you can change the name of the function to a shorter one or one which you find more appropriate.
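One edge case worth checking: as far as I can tell, an all-zero input such as 000 is trimmed to an empty string, and incr then complains. A possible variation that avoids this reads the value with scan %d instead of trimming. The name incr_pad_scan is made up, and the sketch assumes values small enough for scan's %d conversion.

```tcl
# Hypothetical variant of incr_pad: parse with scan %d so that inputs
# consisting only of zeros (e.g. 000) are read as 0 rather than "".
proc incr_pad_scan {val {by 1}} {
    if {![regexp {^[0-9]+$} $val]} {
        return -code error "expected integer but got \"$val\""
    }
    if {![regexp {^[0-9]+$} $by]} {
        return -code error "expected integer but got \"$by\""
    }
    # %d always reads the digits as decimal, leading zeros or not
    scan $val %d n
    # Keep the original width; format widens automatically on overflow
    return [format "%0[string length $val]d" [expr {$n + $by}]]
}

puts [incr_pad_scan 000]
puts [incr_pad_scan 000660 1]
puts [incr_pad_scan 999]
```

This prints 001, 000661 and 1000.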

Script to generate N number of valid ip addresses?

I am new to TCL and trying to learn by doing some simple scripting, I have taken upon to write a simple script which generates valid ip address from a given starting ip address.
I have managed to write one but have run into two problems,
The last octet has a zero getting added in front of the number that is 192.168.1.025
When i specify the starting ip something like this 250.250.5.1 it fails to generate proper ips,
Below is my code:
proc generate {start_addr total_addr} {
    if {$total_addr == 0} {return}
    regexp {([0-9]+\.)([0-9]+\.)([0-9]+\.)([0-9]+)} $start_addr match a b c d
    set filename "output.txt"
    set fileId [open $filename "a"]
    puts $fileId $a$b$c$d
    close $fileId
    while {$a<255 && $b <255 && $c <255 && $d < 255 } {
        set d [expr {$d + 1}];
        set filename "output.txt"
        set fileId [open $filename "a"]
        puts $fileId $a$b$c$d
        close $fileId
        set total_addr [expr {$total_addr - 1}];
        if {$total_addr == 1} {return}
        if {$total_addr > 1 && $d == 255} {
            set c [expr {$c + 1}];
            set d 1
            set filename "output.txt"
            set fileId [open $filename "a"]
            puts $fileId $a$b$c$d
            close $fileId
            set total_addr [expr {$total_addr - 1}];
        }
        if {$total_addr > 1 && $c==255 && $d == 255} {
            set b [expr {$b + 1}];
            set c 1
            set d 1
            set filename "output.txt"
            set fileId [open $filename "a"]
            puts $fileId $a$b$c$d
            close $fileId
            set total_addr [expr {$total_addr - 1}];
        }
        if {$total_addr > 1 && $b == 255 && $c == 255 && $d == 255} {
            set a [expr {$a + 1}];
            set b 1
            set c 1
            set d 1
            set filename "output.txt"
            set fileId [open $filename "a"]
            puts $fileId $a$b$c$d
            close $fileId
            set total_addr [expr {$total_addr - 1}];
        }
    }
}
flush stdout
puts "Please enter the starting IPv4 address with . as delimiter EX: 1.1.1.1"
set start_addr [gets stdin]
regexp {([0-9]+\.)([0-9]+\.)([0-9]+\.)([0-9]+)} $start_addr match a b c d
if {$a <= 255 & $b <= 255 & $c <= 255 & $d <= 255} {
    puts "this is a valid ip address"
} else {
    puts "this not a valid ip address"
}
flush stdout
puts "Please enter the total number of IPv4 address EX: 1000"
set total_addr [gets stdin]
set result [generate $start_addr $total_addr]
For parsing an IP address the simple way, it is better to use scan. If you know C's sscanf() function, Tcl's scan is very similar (in particular, %d matches a decimal number). Like that, we can do:
if {[scan $start_addr "%d.%d.%d.%d" a b c d] != 4} {
    error "some components of address are missing"
}
It's a good idea to throw an error when things go wrong. You can catch them later or just let the script exit, depending on what's right for you. (You still need to check the number range.)
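Putting the scan and the range check together might look roughly like this. The proc name parse_ipv4 and the error messages are placeholders of mine; only scan and error come from the text above.

```tcl
# Hypothetical helper: split a dot-decimal address into four integers,
# raising an error if a component is missing or out of range.
proc parse_ipv4 {addr} {
    if {[scan $addr "%d.%d.%d.%d" a b c d] != 4} {
        error "some components of address are missing"
    }
    foreach octet [list $a $b $c $d] {
        if {$octet < 0 || $octet > 255} {
            error "component $octet is outside the range 0-255"
        }
    }
    return [list $a $b $c $d]
}

puts [parse_ipv4 192.168.1.025]
puts [catch {parse_ipv4 1.2.3.256} msg]
puts $msg
```

The first call prints 192 168 1 25: because %d reads decimal digits, the leading zero from the question's first problem disappears on the way in. The second call prints 1 (an error was caught) followed by the message.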
More generally, there's a package in Tcllib that does IP address parsing. It is far more complete than you're likely to need, but it's there.
Second major thing that you should do? Factor out the code to append a string to a file. It can be a short procedure, short enough that it is obviously right.
proc addAddress {filename address} {
    set fileId [open $filename "a"]
    puts $fileId $address
    close $fileId
}
Then you can replace:
set filename "output.txt"
set fileId [open $filename "a"]
puts $fileId $a$b$c$d
close $fileId
With:
addAddress "output.txt" $a$b$c$d
Less to go wrong. Less noise. (Protip: consider $a.$b.$c.$d there.)
More seriously, your code is just really unlikely to work. It's too complicated. In particular, you should generate one address each time through the loop, and you should concentrate on how to advance the counters right. Using incr to add one to an integer is highly recommended too.
You might try something like this:
incr d
if {$d > 255} {
    set d 1
    incr c
}
if {$c > 255} {
    set c 1
    incr b
}
if {$b > 255} {
    set b 1
    incr a
}
if {$a > 255} {
    set a 1
}
But that's less than efficient. We can do better with this:
if {[incr d] > 255} {
    set d 1
    if {[incr c] > 255} {
        set c 1
        if {[incr b] > 255} {
            set b 1
            if {[incr a] > 255} {
                set a 1
            }
        }
    }
}
That's better (though actual valid IP addresses have a wider range: you can have a 0 or two in the middle, such as in 127.0.0.1…)
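To show how this fits the "one address each time through the loop" idea, here is a sketch that wraps the nested increment in a proc and calls it from a loop. The name next_address is made up, and unlike the snippet above it rolls over to 0 rather than 1, on the assumption that 0 is acceptable in every position.

```tcl
# Hypothetical helper: takes four octets and returns the next address
# as a list of four octets, carrying into the octet to the left.
proc next_address {a b c d} {
    if {[incr d] > 255} {
        set d 0
        if {[incr c] > 255} {
            set c 0
            if {[incr b] > 255} {
                set b 0
                if {[incr a] > 255} {
                    set a 0
                }
            }
        }
    }
    return [list $a $b $c $d]
}

# Generate three addresses starting from 250.250.5.254
set addr {250 250 5 254}
for {set i 0} {$i < 3} {incr i} {
    puts [join $addr .]
    set addr [next_address {*}$addr]
}
```

This prints 250.250.5.254, 250.250.5.255 and 250.250.6.0, which is the kind of start address the question said was failing.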
Splitting the address
Apart from using the ip package in Tcllib, there are a few ways to split up an IPv4 "dot-decimal" address and put the octet values into four variables. The one you used was
regexp {([0-9]+\.)([0-9]+\.)([0-9]+\.)([0-9]+)} $start_addr match a b c d
This basically works, but there are a couple of problems with it. The first problem is that the address 1.234.1.234 will be split up as 1. 234. 1. 234, and then when you try to use the incr command on the first three variables you will get an error message (I suppose that's why you used expr {$x + 1} instead of incr). Instead, write
regexp {(\d+)\.(\d+)\.(\d+)\.(\d+)} $start_addr match a b c d
This expression puts the dots outside the capturing parentheses and places integer values into the variables. It's also a good idea to use the shorthand \d (decimal digit) instead of the [0-9] sets. But you could also do this:
regexp -all -inline -- {\d+} $start_addr
where you simply ask regexp to collect all (-all) unbroken sequences of decimal digits and return them as a list (-inline). Since you get the result as a list, you then need to lassign (list assign) them into variables:
lassign [regexp -all -inline -- {\d+} $start_addr] a b c d
But if you can make do without a regular expression, you should. Donal suggested
scan $start_addr "%d.%d.%d.%d" a b c d
which is fine. Another way is to split the string at the dots:
lassign [split $start_addr .] a b c d
(again you get a list as the result and need to assign it to your variables in a second step).
Checking the result
As Donal wrote, it's a good idea whenever you create data from user input (and in many other situations as well) to check that you did get what you expected to get. If you use an assigning regexp, the command returns 1 or 0 depending on whether the match succeeded or failed. This result can be plugged directly into an if invocation:
if {![regexp {(\d+)\.(\d+)\.(\d+)\.(\d+)} $start_addr match a b c d]} {
    error "input data didn't match IPv4 dot-decimal notation"
}
Donal already gave an example of checking the result of scan. In this case you check against 4 since the command returns the number of successful matches it managed.
if {[scan $start_addr "%d.%d.%d.%d" a b c d] != 4} {
    error "input data didn't match IPv4 dot-decimal notation"
}
If you use either of the list-creating commands (inline regexp or split) you can check the list length of the result:
if {[llength [set result [split $start_addr .]]] == 4} {
    lassign $result a b c d
} else {
    error "input data didn't match IPv4 dot-decimal notation"
}
This check should be followed by checking all variables for octet values (0-255). One convenient way to do this is like this:
proc isoctet args {
    ::tcl::mathop::* {*}[lmap octet $args {expr {0 <= $octet && $octet <= 255}}]
}
(It's usually a good idea to break out tests as functions; it's practically the law* if you are using the tests in several places in your code.)
This command, isoctet, takes a number of values as arguments, lumping them together as a list in the special parameter args. The lmap command creates a new list with the same number of elements as the original list, where the value of each element is the result of applying the given script to the corresponding element in the original list. In this case, lmap produces a list of ones and zeros depending on whether the value was a true octet value or not. Example:
input list: 1 234 567 89
result list: 1 1 0 1
The resulting list is then expanded by {*} into individual arguments to the ::tcl::mathop::* command, which multiplies them together. Why? Because if 1 and 0 can be taken as true and false values, the product of a list of ones and zeros happens to be exactly the same as the logical conjunction (AND, &&) of the same list.
result 1: 1 1 0 1
product : 0 (false)
result 2: 1 1 1 1
product : 1 (true)
So,
if {![isoctet $a $b $c $d]} {
    error "one of the values was outside the (0, 255) range"
}
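If lmap is not available, or if the values may not be integers at all (for instance after a plain split), a loop-based check is one alternative. This is only a sketch under those assumptions; the name isoctet_loop is made up.

```tcl
# Hypothetical alternative to isoctet: returns 1 if every argument is an
# integer in the range 0-255, otherwise 0. Stops at the first bad value.
proc isoctet_loop {args} {
    foreach octet $args {
        if {![string is integer -strict $octet] || $octet < 0 || $octet > 255} {
            return 0
        }
    }
    return 1
}

puts [isoctet_loop 1 234 567 89]
puts [isoctet_loop 1 234 56 89]
puts [isoctet_loop 1 two 3 4]
```

This prints 0, 1 and 0. The string is integer test runs first, so a non-numeric value is rejected before it reaches the numeric comparisons.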
Generating new addresses
Possibly the least sexy way to generate a new address is to use a ready-made facility in Tcl: binary.
binary scan [binary format c* [list $a $b $c $d]] I n
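This invocation first converts a list of integer values (while constraining them to octet size) to a byte string, and then interprets that byte string as a big-endian 32-bit integer. The conversion specifier I means big-endian regardless of the machine's native byte order, which is what is needed here because the first octet of the address is the most significant one (the specifier i would treat it as the least significant).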
Increment the number. Wheee!
incr n
Convert it back to a list of 8-bit values:
binary scan [binary format I $n] c4 parts
The components of parts are now signed 8-bit integers, i.e. the highest value is 127, and the values that should be higher than 127 are now negative values. Convert the values to unsigned (0 - 255) values like this:
lassign [lmap part $parts {expr {$part & 0xff}}] a b c d
and join them up to a dot-decimal string like this:
set addr [join [list $a $b $c $d] .]
If you want more than one new address, repeat the process.
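Repeating the process is easier if the steps are collected in a proc. The following is a sketch of one way to do that; the name next_addr_binary is made up, and it uses a plain foreach instead of lmap so that it does not depend on Tcl 8.6.

```tcl
# Hypothetical helper: takes a dot-decimal address and returns the next one,
# using the binary format / binary scan steps described above.
proc next_addr_binary {addr} {
    lassign [split $addr .] a b c d
    # Four octets -> one big-endian 32-bit integer
    binary scan [binary format c* [list $a $b $c $d]] I n
    incr n
    # Back to four (signed) 8-bit values
    binary scan [binary format I $n] c4 parts
    # Make them unsigned and join them with dots
    set octets {}
    foreach part $parts {
        lappend octets [expr {$part & 0xff}]
    }
    return [join $octets .]
}

set addr 192.168.1.254
for {set i 0} {$i < 3} {incr i} {
    puts $addr
    set addr [next_addr_binary $addr]
}
```

This prints 192.168.1.254, 192.168.1.255 and 192.168.2.0.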
Documentation: binary, error, expr, if, incr, join, lassign, llength, lmap, mathop, proc, regexp, scan, set, split, {*}
lmap is a Tcl 8.6 command. Pure-Tcl implementations for Tcl 8.4 and 8.5 are available here.
*) If there were any laws. What you must learn is that these rules are no different than the rules of the Matrix. Some of them can be bent. Others can be broken.
proc ip_add { ip add } {
    # Braces keep the backslashes intact; inside double quotes, \. collapses
    # to a bare dot, which would match any character.
    set re {^\s*(\d+)\.(\d+)\.(\d+)\.(\d+)\s*$}
    if {[regexp $re $ip match a b c d]} {
        set x [expr {(($a*256+$b)*256+$c)*256+$d+$add}]
        set d [expr {int(fmod($x,256))}]
        set x [expr {int($x/256)}]
        set c [expr {int(fmod($x,256))}]
        set x [expr {int($x/256)}]
        set b [expr {int(fmod($x,256))}]
        set x [expr {int($x/256)}]
        set a [expr {int(fmod($x,256))}]
        return "$a.$b.$c.$d"
    } else {
        puts stderr "invalid ip $ip"
        exit 1
    }
}
set res [ip_add "127.0.0.1" 512]
puts "res=$res"
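The same idea can also be written with integer shifts and masks instead of fmod. This is a sketch rather than a drop-in replacement: the name ip_add_int is made up, it raises an error instead of exiting, and it wraps around past 255.255.255.255, which is my assumption about the desired behaviour.

```tcl
# Hypothetical variant of ip_add using integer arithmetic only.
proc ip_add_int {ip add} {
    if {[scan $ip "%d.%d.%d.%d" a b c d] != 4} {
        error "invalid ip $ip"
    }
    # Combine the octets into one number, add, and keep the low 32 bits
    set x [expr {(((($a * 256 + $b) * 256 + $c) * 256 + $d) + $add) & 0xffffffff}]
    # Pull the four octets back out, most significant first
    set parts {}
    foreach shift {24 16 8 0} {
        lappend parts [expr {($x >> $shift) & 0xff}]
    }
    return [join $parts .]
}

puts [ip_add_int 127.0.0.1 512]
```

This prints 127.0.2.1, since 512 is two steps in the third octet.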