Reading from a CSV file - csv

I have a CSV file containing many rows and columns 2 of which are similar to:
Horizontal-1 Acc. Filename Horizontal-2 Acc. Filename
RSN88_SFERN_FSD172.AT2 RSN88_SFERN_FSD262.AT2
RSN164_IMPVALL.H_H-CPE147.AT2 RSN164_IMPVALL.H_H-CPE237.AT2
RSN755_LOMAP_CYC195.AT2 RSN755_LOMAP_CYC285.AT2
RSN1083_NORTHR_GLE170.AT2 RSN1083_NORTHR_GLE260.AT2
RSN1614_DUZCE_1061-N.AT2 RSN1614_DUZCE_1061-E.AT2
RSN1633_MANJIL_ABBAR--L.AT2 RSN1633_MANJIL_ABBAR--T.AT2
RSN3750_CAPEMEND_LFS270.AT2 RSN3750_CAPEMEND_LFS360.AT2
RSN3757_LANDERS_NPF090.AT2 RSN3757_LANDERS_NPF180.AT2
RSN3759_LANDERS_WWT180.AT2 RSN3759_LANDERS_WWT270.AT2
RSN4013_SANSIMEO_36258021.AT2 RSN4013_SANSIMEO_36258111.AT2
RSN4841_CHUETSU_65004NS.AT2 RSN4841_CHUETSU_65004EW.AT2
RSN4843_CHUETSU_65006NS.AT2 RSN4843_CHUETSU_65006EW.AT2
RSN4844_CHUETSU_65007NS.AT2 RSN4844_CHUETSU_65007EW.AT2
RSN4848_CHUETSU_65011NS.AT2 RSN4848_CHUETSU_65011EW.AT2
In the CSV file I wanna look for the headers "Horizontal-1 Acc. Filename and Horizontal-2 Acc. Filename" and then line by line get the names of each row under these headers one at a time ?
Any suggestion ?
Thanks
RG.

package require csv
package require struct::matrix
::struct::matrix m
m add columns 2
set chan [open data.csv]
::csv::read2matrix $chan m
close $chan
lassign [m get row 0] header1 header2
for {set r 1} {$r < [m rows]} {incr r} {
puts -nonewline [format {%s = %-30s } $header1 [m get cell 0 $r]]
puts [format {%s = %s} $header2 [m get cell 1 $r]]
}
m destroy
I find that the easiest way to deal with csv data sets is by using a matrix. A matrix is sort of a two-dimensional vector with built-ins for searching, sorting and rearranging columns and rows.
First, create a matrix and call it m. It will have two columns from the beginning, but no rows yet.
::struct::matrix m
m add columns 2
Open a channel to read the data file. Pass the channel and the matrix name to the ::csv::read2matrix command. This command will read the csv data and create a matrix row for each data row. The data fields are stored in the columns.
set chan [open data.csv]
::csv::read2matrix $chan m
close $chan
To get the header strings, retrieve row 0.
lassign [m get row 0] header1 header2
To iterate over the data rows, go from 1 (if we didn't have headers, 0) to just under m rows, which is the number of rows in the matrix.
There is a handy report facility that works well with matrices, but I'll just use a for loop here. I'm guessing how you want the data presented:
for {set r 1} {$r < [m rows]} {incr r} {
puts -nonewline [format {%s = %-30s } $header1 [m get cell 0 $r]]
puts [format {%s = %s} $header2 [m get cell 1 $r]]
}
If you're done with the matrix, you might as well destroy it.
m destroy
Solution for the specific problem in the comments.
package require csv
package require struct::matrix
::struct::matrix m
set chan [open foo.csv]
::csv::read2matrix $chan m , auto
close $chan
set f1 [m search column 0 "Result ID"]
set headerRow [lindex $f1 0 1]
set f2 [m search rect 0 $headerRow 0 [expr {[m rows] - 1}] ""]
set f3 [m search row $headerRow "Horizontal-1 Acc. Filename"]
set f4 [m search row $headerRow "Horizontal-2 Acc. Filename"]
set top [expr {$headerRow + 1}]
set bottom [expr {[lindex $f2 0 1] - 1}]
set left [lindex $f3 0 0]
set right [lindex $f4 0 0]
puts [format {Vector=[ %s ]} [concat {*}[m get rect $left $top $right $bottom]]]
m destroy
Obviously, you need to change the filename to the correct name. There is no error handling: in such a simple script it's better to just have the script fail and correct whatever went wrong.
Solution to the second problem, comments below:
package require csv
package require struct::matrix
::struct::matrix m
set chan [open _SearchResults.csv]
::csv::read2matrix $chan m , auto
close $chan
set f1 [m search column 0 {Result ID}]
set headerRow [lindex $f1 0 1]
set f2 [m search -glob rect 0 $headerRow 0 [expr {[m rows] - 1}] { These*}]
set numofRow [lindex $f2 0 1]
set headercol1 [m search row $headerRow { Horizontal-1 Acc. Filename}]
set headercol2 [m search row $headerRow { Horizontal-2 Acc. Filename}]
set indexheaderH1col [lindex $headercol1 0 0]
set indexheaderH2col [lindex $headercol2 0 0]
set rows [m get rect $indexheaderH1col [expr {$headerRow+1}] $indexheaderH2col [expr {$numofRow-1}]]
set rows [lmap row $rows {
lassign $row a b
list [string trim $a] [string trim $b]
}]
foreach row $rows {
puts [format {%-30s %s} {*}$row]
}
puts [format {Vector=[ %s ]} [concat {*}$rows]]
Comments:
You don't need to set the number of columns if you use read2matrix with auto
In this file, there is no empty cell after the table. Instead, we need to search for a string beginning with " These"
Since each cell holds a space character followed by the value, we need to trim off space around the value, otherwise the concatenation will go wrong. The part with the lmap command fixes that
Always brace your expressions
Documentation:
+ (operator),
- (operator),
< (operator),
chan,
close,
concat,
csv (package),
expr,
for,
format,
incr,
lassign,
lindex,
lmap (for Tcl 8.5),
lmap,
open,
package,
puts,
set,
struct::matrix (package),
{*} (syntax)

wipe all
package require csv
package require struct::matrix
::struct::matrix m
m add columns 2
set chan [open _SearchResults.csv]
::csv::read2matrix $chan m , auto
close $chan
set f1 [m search column 0 {Result ID}]
set headerRow [lindex $f1 0 1]
set f2 [m search rect 0 $headerRow 0 [expr {[m rows] - 1}] {}]
set numofRow [lindex [lindex $f2 0 1]]
set headercol1 [m search row $headerRow { Horizontal-1 Acc. Filename}]
set headercol2 [m search row $headerRow { Horizontal-2 Acc. Filename}]
set indexheaderH1col [lindex $headercol1 0 0]
set indexheaderH2col [lindex $headercol2 0 0]
set header1 [m get cell $indexheaderH1col $headerRow]
set header2 [m get cell $indexheaderH2col $headerRow]
for {set r [expr $headerRow+1]} {$r < [expr $numofRow-1]} {incr r} {
puts [format {%-30s %s} [m get cell $indexheaderH1col $r] [m get cell $indexheaderH2col $r]]
}
set vector [concat {*}[m get rect $indexheaderH1col [expr $headerRow+1] $indexheaderH2col [expr $numofRow-1]]]
puts [format {Vector=[ %s ]} [concat {*}[m get rect $indexheaderH1col [expr $headerRow+1] $indexheaderH2col [expr $numofRow-1]]]]

Related

How to recall a variable within the "string map" option in tcl

I want to sequentially update a series of file, from a tmp file distTmp.RST to sequentially dist1.RST, dist2.RST, etc..
For me, the fileutil package in vmd text interface is not working as follows:
My tcl code (add.tcl):
package require fileutil
set F 20.5
set Ff ""
for {set f 0} {$f < 70} {incr f} {
set F [expr {$F+1}]
lappend Ff $F
}
puts $Ff
for {set f 0} {$f < 70} {incr f} {
set M [lindex $Ff $f]
set N [expr {$f+1}]
package require fileutil
::fileutil::updateInPlace distTmp.RST {string map {WWW $M}}
::fileutil::cat dist$N.RST
}
========
The error occurring is
vmd > source add.tcl
can't find package fileutil
vmd >
========
Moreover, when I do not use "fileutil" package, my script is as follows:
set F 20.5
set Ff ""
for {set f 0} {$f < 70} {incr f} {
set F [expr {$F+1}]
lappend Ff $F
}
puts $Ff
for {set f 0} {$f < 70} {incr f} {
set M [lindex $Ff $f]
set N [expr {$f+1}]
set dat [open "distTmp.RST" r]
set out [open "dist$N.RST" w]
while {[gets $dat line] >= 0} {
set newline [string map {WWW $$M} $line]
puts $out $newline
}
}
======
But, there is a problem in recalling the variable $M within a string, and my required output files are as follows:
(base) [Prathit#master]~/APP/OnlyAPP/AlphaFold2/770_res/Charmm-Gui_Dimer-units/E2-E2_3222212666/charmm-gui-3222212666/amber/RSTfiles_Equil>head -n +4 dist1.RST dist2.RST
=> dist1.RST <==
&rst iat = -1, -1, r2 = $$M, r2a = $$M,
==> dist2.RST <==
&rst iat = -1, -1, r2 = $$M, r2a = $$M,
==========
In the above, $$M should be sequentially 21.5, 22.5, and so on....
Kindly let me know of a possible solution.
The fileutil package is part of tcllib. Check that your variable auto_path includes a path where Tcl can find tcllib and fileutil.
Your list for string map is in curly braces, so it's using a literal dollar sign for $M instead of the value of the variable named M.
Change curlies to double quotes or use the list command, as answered in a comment already.
$$M is usually not okay in Tcl. Are you trying to do double interpolation? If so, I recommend using set with one argument, to retrieve a value instead of to set a value. You can use $$ in a subst command, but that's not my preference.
set name John
set var_name name
puts $$var_name --> $name
puts [set $var_name] --> John
puts [set [set var_name]] --> John
puts [subst $$var_name] --> John

How to sum coordinates in TCL

I have a data file (*.dat) containing x, y, z coordinates. As following:
{26.3612117767334 40.19668960571289 54.13957977294922}
{27.351043701171875 40.57518768310547 54.05387496948242}
{29.48208999633789 42.08218765258789 56.42238235473633}
For this file I need to do a math operation as follow:
Xi + (Xf-Xi/4) ; Yi + (Yf-Yi/4) ; Zi + (Zf-Zi/4)
where "i" is the initial position and "f" the final, meaning that Xi,Yi,Zi are the data on the first line and Xf,Yf,Zf the data on the second.
I need to do these calculation for all the lines in a loop and then stored in a separate file, but I do not have idea how to do it in TCL. Thanks in advance for your help.
Since the contents of your file can be treated as a bunch of tcl lists, one per line (so basically a list of lists), parsing it is dead simple.
Something like:
set f [open file.dat]
set coords [read -nonewline $f]
close $f
for {set i 0} {$i < [llength $coords] - 1} {incr i} {
lassign [lindex $coords $i] xi yi zi
lassign [lindex $coords $i+1] xf yf zf
set xn [expr {$xi + ($xf - $xi/4.0)}]
set yn [expr {$yi + ($yf - $yi/4.0)}]
set zn [expr {$zi + ($zf - $zi/4.0)}]
puts "{$xn $yn $zn}"
}
This skips treating the last line as an initial set of coordinates because there is no next set for it.
This is a good opportunity to write a mathfunc:
proc tcl::mathfunc::f {ai af} {
expr {$ai * 0.75 + $af}
}
proc transform {file} {
set fh [open $file]
# read the first line, aka the initial "previous line"
gets $fh line
scan $line {{%f %f %f}} xi yi zi
# process the rest of the file
while {[gets $fh line] != -1} {
scan $line {{%f %f %f}} xf yf zf
puts "{[expr {f($xi, $xf)}] [expr {f($yi, $yf)}] [expr {f($zi, $zf)}]}"
lassign [list $xf $yf $zf] xi yi zi
}
close $fh
}
transform file.dat
outputs
{47.121952533721924 70.72270488739014 94.65855979919434]}
{49.9953727722168 72.51357841491699 96.96278858184814]}
I present an alternate method that uses lrange to pick the overlapping ranges of sublists that participate (so we can then process them element-wise) and then lmap to apply the same transformation expression to each coordinate axis.
# Same read-in code as Shawn's answer; it's the easiest way
set f [open file.dat]
set coords [read -nonewline $f]
close $f
foreach Ci [lrange $coords 0 end-1] Cf [lrange $coords 1 end] {
# I often like to put expressions on their own line for clarity
puts [list [lmap _i $Ci _f $Cf {expr {
$_i + ($_f - $_i/4.0)
}}]]
}
(The wrapping list call in there puts braces around the result of lmap.)

What is the reason of error "Floating point exception (core dumped)"

I'm trying to save output $result of proc Inverse2 which get scheduled after every one second (it is called inside another procedure,that procedure is rescheduled for 1s that is why Inverse2 procedure)
I want to get output which is {x y now} and assign variable to it for latest two instances
x1-> x location at current time (for example at 8.0)
y1-> y location at current time
x2-> x location at (current time+1) (for example at 9.0)
y2-> y location at (current time+1)
and use for further calculations.
Below is a code I have tried but error I got after two iterations is Floating point exception (core dumped). Where I'm doing wrong?
code:
set result {}
proc Inverse2 {m} {
set op [open output.tr w]
global result
global ns
set now [$ns now]
lassign [lindex $m 0 2] x1
lassign [lindex $m 0 3] y1
lassign [lindex $m 0 6] d1
lassign [lindex $m 1 2] x2
lassign [lindex $m 1 3] y2
lassign [lindex $m 1 6] d2
lassign [lindex $m 2 2] x3
lassign [lindex $m 2 3] y3
lassign [lindex $m 2 6] d3
set mt {{? ?} {? ?}}
lset mt 0 0 [expr 2*($x1-$x2)]
lset mt 0 1 [expr 2*($y1-$y2)]
lset mt 1 0 [expr 2*($x1-$x3)]
lset mt 1 1 [expr 2*($y1-$y3)]
set const {{?} {?}}
lset const 0 [expr {(pow($x1,2)+pow($y1,2)-pow($d1,2))-(pow($x2,2)+pow($y2,2)-pow($d2,2))}]
lset const 1 [expr {(pow($x1,2)+pow($y1,2)-pow($d1,2))-(pow($x3,2)+pow($y3,2)-pow($d3,2))}]
#puts $result "$const"
# puts $result "$mt"
set x [expr {double([lindex [Inverse3 $mt] 0 0] * [lindex $const 0]
+ [lindex [Inverse3 $mt] 0 1] * [lindex $const 1])}]
set y [expr {double([lindex [Inverse3 $mt] 1 0] * [lindex $const 0]
+ [lindex [Inverse3 $mt] 1 1] * [lindex $const 1])}]
lappend result "$x $y $now"
puts $result
for {set i 0} {$i< [llength $result]} {incr i} { #for latest two instances
for {set j 1} {$i< [llength $result]} {incr j} {
set X1 [lindex $result $i 0]
set Y1 [lindex $result $i 1]
if {[llength $result] >1} { #to ensure length of list is greater than 1
set X2 [lindex $result $j 0]
set Y2 [lindex $result $j 1]
set v [expr hypot($X2-$X1,$Y2-$Y1)/ ($now-($now-1))]
set theta [expr acos(($X2-$X1)/(hypot($X2-$X1,$Y2-$Y1)))]
set Xp [expr ($X2+($v*$now*cos($theta)))]
set Yp [expr ($Y2+($v*$now*sin($theta)))]
puts "$Xp $Yp"
}
break
}
}
}
Floating point exceptions can come from a few different things. In general, the main culprit is doing something awful like dividing zero by zero. However, Tcl is usually pretty good at ensuring that such things don't crash your program entirely, and instead just generate errors you can catch. Whatever is going on is therefore either one of the trickier cases, or due to running in ns2 and that turning signalling floating point errors on (Tcl's standard implementation disables them precisely to avoid probably-unwarranted fatal crashes).
If it is the latter, moving the processing out of the process into a standard tclsh is the easiest way forward. We can make stronger guarantees about the correctness of behaviour there as we have more control of tricky things like FPU flags.
But if it is the former… the problem should lie in these lines:
set v [expr hypot($X2-$X1,$Y2-$Y1)/ ($now-($now-1))]
set theta [expr acos(($X2-$X1)/(hypot($X2-$X1,$Y2-$Y1)))]
set Xp [expr ($X2+($v*$now*cos($theta)))]
set Yp [expr ($Y2+($v*$now*sin($theta)))]
Of the lines there, the one that looks most suspicious is the calculation of theta. There's several problems with what you're doing (e.g., it won't handle some quadrants correctly due to trigonometric periodicities) but the big nasty is that you've got a division in there that will be by zero if two successive positions are the same. Given that you're able to use hypot(), computing the angle is by far best computed with atan2(), as that deals with tricky edge cases much better (e.g., it has no problems with awful infinities). Try this:
set theta [expr { atan2($Y2-$Y1, $X2-$X1) }]
Also put your expressions in {braces} as I've done above. It permits Tcl to bytecode-compile the expression and makes your code quite a bit faster. It also lets you put spaces in the expression safely, which aids readability a lot even when you're not splitting over multiple lines, and ensures you get (much!) better error messages if you ever happen to use a variable holding a non-numeric value in your expression. In short, it's easy to do and makes your code much better.
Other minor issues
Do you expect ($now-($now-1)) to ever compute anything other than 1? Or at least a value very close to 1.0, given that you're dealing with floating point numbers for simulation time? I think your calculation of v can be safely simplified down to straight use of hypot().
These two nested loops look odd:
for {set i 0} {$i< [llength $result]} {incr i} {
for {set j 1} {$i< [llength $result]} {incr j} {
I think you either mean to do this:
for {set i 0} {$i< [llength $result]} {incr i} {
for {set j 0} {$j< [llength $result]} {incr j} {
if {$i == $j} continue; # Skip the diagonal in the comparison matrix
or this:
for {set i 0} {$i< [llength $result]} {incr i} {
for {set j [expr {$i + 1}]} {$j< [llength $result]} {incr j} {
# Just the upper triangle of the comparison matrix
depending on whether the rest of the code should compare values from both ways round (but never with itself), or just one way round. The latter does less work, but might be wrong if comparisons aren't symmetric (which depends on the details of what you're up to).

How to compare two lines in different files and output the same position in the other line in TCL?

I have two files and I want the output like below. Please help by providing me with a TCL script.
File1:
Name1: F * F F F
Name2: F F *
Name3: F F F F
File2:
Name1: AA, BB, CC, DD, EE,
Name2: AA, BB, CC,
Name3: AA, BB, CC, DD,
Output1:
Name1
AA - FAIL
BB - *
CC - FAIL
<cont>
Name2
AA - FAIL
BB - FAIL
CC - *
<cont>
Output2:
Name1
FAIL - AA CC DD EE
* - BB
Name2
FAIL - AA BB
* - CC
Name3
FAIL - AA BB CC DD
Try this following tested on tclsh8.5
set fd1 [open "input_file_1.txt" r]
set fd2 [open "input_file_2.txt" r]
set opfd [open "output_file.txt" w]
while {[gets $fd1 line] > 0 && [gets $fd2 line2] > 0} {
set line1 [split $line ":"]
set line2 [split $line2 ":"]
puts $opfd [lindex $line1 0]
set last_part_1 [string trim [lindex $line1 1] " "]
set last_part_2 [string trim [lindex $line2 1] " "]
set space_split [split $last_part_1 " "]
set comma_split [split $last_part_2 ","]
for {set i 0} {$i < [llength $space_split]} {incr i} {
puts $opfd "[string trim [lindex $comma_split $i] " "] = [string trim [lindex $space_split $i] " "]"
}
}
close $fd1
close $fd2
close $opfd
There will be file named as output_file.txt created inside current directory which contains your output.
Another way to do it:
package require fileutil
proc getInput filename {
set contents [string trim [::fileutil::cat $filename]]
set rows [split $contents \n]
concat {*}[lmap item $rows {
split $item :
}]
}
set d1 [string map {F Fail} [getInput file1.txt]]
set d2 [string map {, {}} [getInput file2.txt]]
dict for {key values} $d1 {
puts $key
foreach v1 $values v2 [dict get $d2 $key] {
puts " $v2 - $v1"
}
}
This works by recognizing the dictionary-like structure of the data files. If every piece of data is a word without spaces, this version of getInput will coerce the contents of each file to a usable dict. From there, it's just a matter of replacing the F strings with Fail strings and removing the commas, and then doing dictionary iteration over either one of the dicts and pulling in the corresponding values from the other one.
If the values in the second file may contain spaces, getInput should look like this:
proc getInput filename {
set contents [string trim [::fileutil::cat $filename]]
set rows [split $contents \n]
set res {}
foreach item $rows {
lassign [split $item :] key values
if {[string match *,* $values]} {
set values [split [string trimright $values {, }] ,]
}
lappend res $key $values
}
return $res
}
Documentation: concat, dict, foreach, if, lassign, lmap, lmap replacement, package, proc, puts, return, set, split, string

Script to generate N number of valid ip addresses?

I am new to TCL and trying to learn by doing some simple scripting, I have taken upon to write a simple script which generates valid ip address from a given starting ip address.
I have managed to write one but have run into two problems,
The last octet has a zero getting added in front of the number that is 192.168.1.025
When i specify the starting ip something like this 250.250.5.1 it fails to generate proper ips,
Below is my code:
proc generate {start_addr total_addr} {
if {$total_addr == 0} {return}
regexp {([0-9]+\.)([0-9]+\.)([0-9]+\.)([0-9]+)} $start_addr match a b c d
set filename "output.txt"
set fileId [open $filename "a"]
puts $fileId $a$b$c$d
close $fileId
while {$a<255 && $b <255 && $c <255 && $d < 255 } {
set d [expr {$d + 1}];
set filename "output.txt"
set fileId [open $filename "a"]
puts $fileId $a$b$c$d
close $fileId
set total_addr [expr {$total_addr - 1}];
if {$total_addr == 1} {return}
if {$total_addr > 1 && $d == 255} {
set c [expr {$c + 1}];
set d 1
set filename "output.txt"
set fileId [open $filename "a"]
puts $fileId $a$b$c$d
close $fileId
set total_addr [expr {$total_addr - 1}];
}
if {$total_addr > 1 && $c==255 && $d == 255} {
set b [expr {$b + 1}];
set c 1
set d 1
set filename "output.txt"
set fileId [open $filename "a"]
puts $fileId $a$b$c$d
close $fileId
set total_addr [expr {$total_addr - 1}];
}
if {$total_addr > 1 && $b == 255 && $c == 255 && $d == 255} {
set a [expr {$a + 1}];
set b 1
set c 1
set d 1
set filename "output.txt"
set fileId [open $filename "a"]
puts $fileId $a$b$c$d
close $fileId
set total_addr [expr {$total_addr - 1}];
}
}
}
flush stdout
puts "Please enter the starting IPv4 address with . as delimiter EX: 1.1.1.1"
set start_addr [gets stdin]
regexp {([0-9]+\.)([0-9]+\.)([0-9]+\.)([0-9]+)} $start_addr match a b c d
if {$a <= 255 & $b <= 255 & $c <= 255 & $d <= 255} {
puts "this is a valid ip address"
} else {
puts "this not a valid ip address"
}
flush stdout
puts "Please enter the total number of IPv4 address EX: 1000"
set total_addr [gets stdin]
set result [generate $start_addr $total_addr]
For parsing an IP address the simple way, it is better to use scan. If you know C's sscanf() function, Tcl's scan is very similar (in particular, %d matches a decimal number). Like that, we can do:
if {[scan $start_addr "%d.%d.%d.%d" a b c d] != 4} {
error "some components of address are missing"
}
It's a good idea to throw an error when things go wrong. You can catch them later or just let the script exit, depending on what's right for you. (You still need to check the number range.)
More generally, there's a package in Tcllib that does IP address parsing. It is far more complete than you're likely to need, but it's there.
Second major thing that you should do? Factor out the code to append a string to a file. It's can be a short procedure, short enough that it is obviously right.
proc addAddress {filename address} {
set fileId [open $filename "a"]
puts $fileId $address
close $fileId
}
Then you can replace:
set filename "output.txt"
set fileId [open $filename "a"]
puts $fileId $a$b$c$d
close $fileId
With:
addAddress "output.txt" $a$b$c$d
Less to go wrong. Less noise. (Protip: consider $a.$b.$c.$d there.)
More seriously, your code is just really unlikely to work. It's too complicated. In particular, you should generate one address each time through the loop, and you should concentrate on how to advance the counters right. Using incr to add one to an integer is highly recommended too.
You might try something like this:
incr d
if {$d > 255} {
set d 1
incr c
}
if {$c > 255} {
set c 1
incr b
}
if {$b > 255} {
set b 1
incr a
}
if {$a > 255} {
set a 1
}
But that's less than efficient. We can do better with this:
if {[incr d] > 255} {
set d 1
if {[incr c] > 255} {
set c 1
if {[incr b] > 255} {
set b 1
if {[incr a] > 255} {
set a 1
}
}
}
}
That's better (though actual valid IP addresses have a wider range: you can have a 0 or two in the middle, such as in 127.0.0.1…)
Splitting the address
Apart from using the ip package in Tcllib, there are a few ways to split up an IPv4 "dot-decimal" address and put the octet values into four variables. The one you used was
regexp {([0-9]+\.)([0-9]+\.)([0-9]+\.)([0-9]+)} $start_addr match a b c d
This basically works, but there are a couple of problems with it. The first problem is that the address 1.234.1.234 will be split up as 1. 234. 1. 234, and then when you try to use the incr command on the first three variables you will get an error message (I suppose that's why you used expr {$x + 1} instead of incr). Instead, write
regexp {(\d+)\.(\d+)\.(\d+)\.(\d+)} $start_addr match a b c d
This expression puts the dots outside the capturing parentheses and places integer values into the variables. It's also a good idea to use the shorthand \d (decimal digit) instead of the [0-9] sets. But you could also do this:
regexp -all -inline -- {\d+} $start_addr
where you simply ask regexp to collect all (-all) unbroken sequences of decimal digits and return them as a list (-inline). Since you get the result as a list, you then need to lassign (list assign) them into variables:
lassign [regexp -all -inline -- {\d+} $start_addr] a b c d
But if you can make do without a regular expression, you should. Donal suggested
scan $start_addr "%d.%d.%d.%d" a b c d
which is fine. Another way is to split the string at the dots:
lassign [split $start_addr .] a b c d
(again you get a list as the result and need to assign it to your variables in a second step).
Checking the result
As Donal wrote, it's a good idea whenever you create data from user input (and in many other situations as well) to check that you did get what you expected to get. If you use an assigning regexp the command returns 1 or 0 depending on whether the matched succeeded or failed. This result can be plugged directly into an if invocation:
if {![regexp {(\d+)\.(\d+)\.(\d+)\.(\d+)} $start_addr match a b c d]} {
error "input data didn't match IPv4 dot-decimal notation"
}
Donal already gave an example of checking the result of scan. In this case you check against 4 since the command returns the number of successful matches it managed.
if {[scan $start_addr "%d.%d.%d.%d" a b c d] != 4} {
error "input data didn't match IPv4 dot-decimal notation"
}
If you use either of the list-creating commands (inline regexp or split) you can check the list length of the result:
if {[llength [set result [split $start_addr .]]] == 4} {
lassign $result a b c d
} else {
error "input data didn't match IPv4 dot-decimal notation"
}
This check should be followed by checking all variables for octet values (0-255). One convenient way to do this is like this:
proc isoctet args {
::tcl::mathop::* {*}[lmap octet $args {expr {0 <= $octet && $octet <= 255}}]
}
(It's usually a good idea to break out tests as functions; it's practically the law* if you are using the tests in several places in your code.)
This command, isoctet, takes a number of values as arguments, lumping them together as a list in the special parameter args. The lmap command creates a new list with the same number of elements as the original list, where the value of each element is the result of applying the given script to the corresponding element in the original list. In this case, lmap produces a list of ones and zeros depending on whether the value was a true octet value or not. Example:
input list: 1 234 567 89
result list: 1 1 0 1
The resulting list is then expanded by {*} into individual arguments to the ::tcl::mathop::* command, which multiplies them together. Why? Because if 1 and 0 can be taken as true and false values, the product of a list of ones and zeros happens to be exactly the same as the logical conjunction (AND, &&) of the same list.
result 1: 1 1 0 1
product : 0 (false)
result 2: 1 1 1 1
product : 1 (true)
So,
if {![isoctet $a $b $c $d]} {
error "one of the values was outside the (0, 255) range"
}
Generating new addresses
Possibly the least sexy way to generate a new address is to use a ready-made facility in Tcl: binary.
binary scan [binary format c* [list $a $b $c $d]] I n
This invocation first converts a list of integer values (while constraining them to octet size) to a bit string, and then interprets that bit string as a big-endian 32-bit integer (if your machine uses little-endian integers, you should use the conversion specifier i instead of I).
Increment the number. Wheee!
incr n
Convert it back to a list of 8-bit values:
binary scan [binary format I $n] c4 parts
The components of parts are now signed 8-bit integers, i.e. the highest value is 127, and the values that should be higher than 127 are now negative values. Convert the values to unsigned (0 - 255) values like this:
lassign [lmap part $parts {expr {$part & 0xff}}] a b c d
and join them up to a dot-decimal string like this:
set addr [join [list $a $b $c $d] .]
If you want more than one new address, repeat the process.
Documentation: binary, error, expr, if, incr, join, lassign, llength, lmap, mathop, proc, regexp, scan, set, split, {*}
lmap is a Tcl 8.6 command. Pure-Tcl implementations for Tcl 8.4 and 8.5 are available here.
*) If there were any laws. What you must learn is that these rules are no different than the rules of the Matrix. Some of them can be bent. Others can be broken.
proc ip_add { ip add } {
set re "^\\s*(\\d+)\.(\\d+)\.(\\d+)\.(\\d+)\\s*$"
if [regexp $re $ip match a b c d] {
set x [expr {(($a*256+$b)*256+$c)*256+$d+$add}]
set d [expr {int(fmod($x,256))}]
set x [expr {int($x/256)}]
set c [expr {int(fmod($x,256))}]
set x [expr {int($x/256)}]
set b [expr {int(fmod($x,256))}]
set x [expr {int($x/256)}]
set a [expr {int(fmod($x,256))}]
return "$a.$b.$c.$d"
} else {
puts stderr "invalid ip $ip"
exit 1
}
}
set res [ip_add "127.0.0.1" 512]
puts "res=$res"