Calculate average of columns of column with Tcl - tcl

I want to calculate the average for this column with tcl
please help me
frame Elec
1 50
2 40
3 30
4 20

If this is for a standalone script, (Warning: Self promotion ahead), I wrote a program called tawk that's like awk except using TCL for scripting, which does most of the work for you:
$ tawk 'line {$NR > 1} { incr sum $F(2) }
END { puts [expr {double($sum) / ($NR - 1)}] }' input.txt
35
# Equivalent awk:
$ awk 'NR > 1 { sum += $2 } END { print (sum / (NR - 1)) }' input.txt
35
If it's part of a larger program, you have to open the file and read and split lines yourself. Maybe something like
# Column number is 1-based
proc avg_column {filename column} {
set f [open $filename r]
gets $f ;# Read and discard header line
set sum 0
set nlines 0
while {[gets $f line] >= 0} {
set columns [regexp -all -inline {\S+} $line]
incr sum [lindex $columns $column-1]
incr nlines
}
close $f
return [expr {double($sum) / $nlines}]
}
puts [avg_column input.txt 2]

Not an answer, but some tips. You need to:
open the file
read the header with gets
use a while loop to read lines of the file
use split or regexp to get the 2nd field
sum the values (and count the lines) with expr, or incr if the values are only integers

If your input happens to be (some sort of CSV), or you can steer it into this direction, then you may use tcllib's csv package:
package require csv
package require struct::matrix
struct::matrix dm
set f [open mydata.csv]
while {[gets $f l] >= 0} {
# sanitize input, line-wise
set l [regsub -all {\s+} $l " "]
csv::split2matrix dm $l " " auto
}
close $f
set columnData [lrange [dm get column 1] 1 end]; # strip off header
puts [expr {double([tcl::mathop::+ {*}$columnData])/[llength $columnData]}]; # compute avg
Some hints:
gets will read your input file line by line;
csv::split2matrix puts each line into a struct::matrix;
/matrix/ get column /n/ gives access to one data column (incl. header field);
tcl::mathop::+ gives access to the built-in addition operator (outside of the [expr] command) and supports 2+ summands.

Related

Parse CSV into key value pair

I have a csv file which has hostname and attached serial numbers. I want to create a key value pair with key being hostname and value being the list of serial numbers. The serial numbers can be one or many.
For example:
A, 1, 2, 3, 4
B, 5, 6
C, 7, 8, 9
D, 10
I need to access key A and get {1 2 3 4} as output. And if I access D i should get {10}
How should I do this? As the version of TCL i am using doesn't support any packages like CSV and I also won't be able to install it as it is in the server, So I am looking at a solution which doesn't include any packages.
For now, I am splitting the line with \n and then I process each element. Then I split the elements with "," and then I get the host name and serial numbers in a list. I then use the 0th index of the list as hostname and remaining values as serial numbers. Is there a cleaner solution?
I'd do something like:
#!/usr/bin/env tclsh
package require csv
package require struct::queue
set filename "file.csv"
set fh [open $filename r]
set q [struct::queue]
csv::read2queue $fh $q
close $fh
set data [dict create]
while {[$q size] > 0} {
set values [lassign [$q get] hostname]
dict set data $hostname [lmap elem $values {string trimleft $elem}]
}
dict for {key value} $data {
puts "$key => $value"
}
then
$ tclsh csv.tcl
A => 1 2 3 4
B => 5 6
C => 7 8 9
D => 10
The repeated recommendation given here is to use the CSV package for this purpose. See also the answer by #glenn-jackman. If unavailable, the time is better invested in obtaining it at your server side.
To get you started, however, you might want to adopt something along the lines of:
set dat {
A, 1, 2, 3, 4
B, 5, 6
C, 7, 8, 9
D, 10
}
set d [dict create]
foreach row [split [string trim $dat] \n] {
set row [lassign [split $row ,] key]
dict set d [string trim $key] [concat {*}$row]
}
dict get $d A
dict get $d D
Be warned, however, such hand-knitted solutions typically only serve their purpose when you have full control of the data being processed and its representation. Again, time is better invested by obtaining the CSV package.
I tried this way and got it working. Thanks again for your inputs. Yes, I know csv package would be easy but I cannot install it in server/product.
set multihost "host_slno.csv"
set fh1 [open $multihost r]
set data [read -nonewline $fh1]
close $fh1
set hostslnodata [ split $data "\n" ]
set hostslno [dict create];
foreach line $hostslnodata {
set line1 [join [split $line ", "] ]
puts "$line1"
if {[regexp {([A-Za-z0-9_\-]+)\s+(.*)} $line1 match hostname serial_numbers]} {
dict lappend hostslno $hostname $serial_numbers
}
}
puts [dict get $hostslno]
The sourcecode from the csv package is available. If you are unable to install the full csv package, you can include the code from here:
http://core.tcl.tk/tcllib/artifact/2898cd911697ecdb
If you still can't use that option, then stripping out all the whitespace and splitting on "," is required.
An alternative to the earlier answers is using string map:
set row [split [string map {" " ""} $row ] ,]
The string map will remove all spaces, and then split on ","
Once you have converted the lines of text into valid tcl lists:
A 1 2 3 4
B 5 6
C 7 8 9
D 10
Then you can use the lindex and lrange commands to pluck off all the pieces.
foreach row $data {
set server [lindex $row 0]
set serial_numbers [lrange $row 1 end]
dict set ...
One possibility:
set hostslno [dict create]
set multihost "host_slno.csv"
set fh1 [open $multihost]
while {[gets $fh line] >= 0} {
set numbers [lassign [regexp -inline -all {[^\s,]+} $line] hostname]
dict set hostslno $hostname $numbers
}
close $fh1
puts [dict get $hostslno A]

how to split a file to list of lists TCL

I'm coding TCL and I would like to split a file into two lists of lists,
the file contain:
(1,2) (3,4) (5,6)
(7,8) (9,10) (11,12)
and I would like to get two list
one for each line, that contain lists that each one contain to two number
for example:
puts $list1 #-> {1 2} {3 4} {5 6}
puts [lindex $list1 0] #-> 1 2
puts [lindex $list2 2] #-> 11 12
I tried to use regexp and split but no success
The idea of using regexp is good, but you'll need to do some post-processing on its output.
# This is what you'd read from a file
set inputdata "(1,2) (3,4) (5,6)\n(7,8) (9,10) (11,12)\n"
foreach line [split $inputdata "\n"] {
# Skip empty lines.
# (I often put a comment format in my data files too; this is where I'd handle it.)
if {$line eq ""} continue
# Parse the line.
set bits [regexp -all -inline {\(\s*(\d+)\s*,\s*(\d+)\s*\)} $line]
# Example results of regexp:
# (1,2) 1 2 (3,4) 3 4 (5,6) 5 6
# Post-process to build the lists you really want
set list([incr idx]) [lmap {- a b} $bits {list $a $b}]
}
Note that this is building up an array; long experience says that calling variables list1, list2, …, when you're building them in a loop is a bad idea, and that an array should be used, effectively giving variables like list(1), list(2), …, as that yields a much lower bug rate.
An alternate approach is to use a simpler regexp and then have scan parse the results. This can be more effective when the numbers aren't just digit strings.
foreach line [split $inputdata "\n"] {
if {$line eq ""} continue
set bits [regexp -all -inline {\([^()]+\)} $line]
set list([incr idx]) [lmap substr $bits {scan $substr "(%d,%d)"}]
}
If you're not using Tcl 8.6, you won't have lmap yet. In that case you'd do something like this instead:
foreach line [split $inputdata "\n"] {
if {$line eq ""} continue
set bits [regexp -all -inline {\(\s*(\d+)\s*,\s*(\d+)\s*\)} $line]
set list([incr idx]) {}
foreach {- a b} $bits {
lappend list($idx) [list $a b]
}
}
foreach line [split $inputdata "\n"] {
if {$line eq ""} continue
set bits [regexp -all -inline {\([^()]+\)} $line]
set list([incr idx]) {}
foreach substr $bits {
lappend list($idx) [scan $substr "(%d,%d)"]
# In *very* old Tcl you'd need this:
# scan $substr "(%d,%d)" a b
# lappend list($idx) [list $a $b]
}
}
You have an answer already, but it can actually be done a little bit simpler (or at least without regexp, which is usually a good thing).
Like Donal, I'll assume this to be the text read from a file:
set lines "(1,2) (3,4) (5,6)\n(7,8) (9,10) (11,12)\n"
Clean it up a bit, removing the parentheses and any white space before and after the data:
% set lines [string map {( {} ) {}} [string trim $lines]]
1,2 3,4 5,6
7,8 9,10 11,12
One way to do it with good old-fashioned Tcl, resulting in a cluster of variables named lineN, where N is an integer 1, 2, 3...:
set idx 0
foreach lin [split $lines \n] {
set res {}
foreach li [split $lin] {
lappend res [split $li ,]
}
set line[incr idx] $res
}
A doubly iterative structure like this (a number of lines, each having a number of pairs of numbers separated by a single comma) is easy to process using one foreach within the other. The variable res is used for storing result lines as they are assembled. At the innermost level, the pairs are split and list-appended to the result. For each completed line, a variable is created to store the result: its name consists of the string "line" and an increasing index.
As Donal says, it's not a good idea to use clusters of variables. It's much better to collect them into an array (same code, except for how the result variable is named):
set idx 0
foreach lin [split $lines \n] {
set res {}
foreach li [split $lin] {
lappend res [split $li ,]
}
set line([incr idx]) $res
}
If you have the results in an array, you can use the parray utility command to list them in one fell swoop:
% parray line
line(1) = {1 2} {3 4} {5 6}
line(2) = {7 8} {9 10} {11 12}
(Note that this is printed output, not a function return value.)
You can get whole lines from this result:
% set line(1)
{1 2} {3 4} {5 6}
Or you can access pairs:
% lindex $line(1) 0
1 2
% lindex $line(2) 2
11 12
If you have the lmap command (or the replacement linked to below), you can simplify the solution somewhat (you don't need the res variable):
set idx 0
foreach lin [split $lines \n] {
set line([incr idx]) [lmap li [split $lin] {
split $li ,
}]
}
Still simpler is to let the result be a nested list:
set lineList [lmap lin [split $lines \n] {
lmap li [split $lin] {
split $li ,
}
}]
You can access parts of the result similar to above:
% lindex $lineList 0
{1 2} {3 4} {5 6}
% lindex $lineList 0 0
1 2
% lindex $lineList 1 2
11 12
Documentation:
array,
foreach,
incr,
lappend,
lindex,
lmap (for Tcl 8.5),
lmap,
parray,
set,
split,
string
The code works for windows :
TCL file code is :
proc captureImage {} {
#open the image config file.
set configFile [open "C:/main/image_config.txt" r]
#To retrive the values from the config file.
while {![eof $configFile]} {
set part [split [gets $configFile] "="]
set props([string trimright [lindex $part 0]]) [string trimleft [lindex $part 1]]
}
close $configFile
set time [clock format [clock seconds] -format %Y%m%d_%H%M%S]
set date [clock format [clock seconds] -format %Y%m%d]
#create the folder with the current date
set folderPath $props(folderPath)
append folderDate $folderPath "" $date "/"
set FolderCreation [file mkdir $folderDate]
while {0} {
if { [file exists $date] == 1} {
}
break
}
#camera selection to capture image.
set camera "video"
append cctv $camera "=" $props(cctv)
#set the image resolution (XxY).
set resolutionX $props(resolutionX)
set resolutionY $props(resolutionY)
append resolution $resolutionX "x" $resolutionY
#set the name to the save image
set imagePrefix $props(imagePrefix)
set imageFormat $props(imageFormat)
append filename $folderDate "" $imagePrefix "_" $time "." $imageFormat
set logPrefix "Image_log"
append logFile $folderDate "" $logPrefix "" $date ".txt"
#ffmpeg command to capture image in background
exec ffmpeg -f dshow -benchmark -i $cctv -s $resolution $filename >& $logFile &
after 3000
}
}
captureImage
thext file code is :
cctv=Integrated Webcam
resolutionX=1920
resolutionY=1080
imagePrefix=ImageCapture
imageFormat=jpg
folderPath=c:/test/
//camera=video=Integrated Webcam,Logitech HD Webcam C525
This code works for me me accept the code from text file were list of parameters are passed.

Ignoring lines in a file TCL

Suppose I have a file1 with a few queries inside,
Query 1
Query 2
Query 3
And I have a normal text file2 containing a bunch of data
Data 1 Query 1 something something
Data something Query 2 something something
Something Query 3 something something
Data1 continue no query
Data2 continue no query
How do I create a loop such that it ignores the lines containing queries from file1 and prints only lines without the queries in the file? so in this case only these values gets printed
Data1 continue no query
Data2 continue no query
i tried producing the results using this loop script i made
Storing the queries to be ignored from file1 into $wlistItems
set openFile1 [open file1.txt r]
while {[gets openFile1 data] > -1} {
set wlist $data
append wlistItems "{$wlist}\n"
}
close $openFile1
Processing file2 to print lines without ignored queries
set openFile2 [open file2.txt r]
while {[gets $openFile2 data] > -1} {
for {set n 0} {$n < [llength $wListItems]} {incr n} {
if {[regexp -all "[lindex $wListItems $n]" $data all value]} {
continue
}
puts $data
}
}
close $openFile2
However, the script does not skip the lines. It instead prints out repeated data from file2.
while {[gets $openFile2 data] > -1} {
set found 0
for {set n 0} {$n < [llength $wListItems]} {incr n} {
if {[regexp -all "[lindex $wListItems $n]" $data all value]} {
set found 1
break
}
}
if {!$found} {
puts $data
}
}
A simpler solution:
package require fileutil
set queries [join [split [string trim [::fileutil::cat file1]] \n] |]
::fileutil::foreachLine line file2 {
if {![regexp ($queries) $line]} {
puts $line
}
}
The first command (after the package require) reads the file with the queries and packs them up as a set of branches (Query 1|Query 2|Query 3). The second command processes the second file line by line and prints those lines that don't contain any of those branches.
Documentation: fileutil package, if, join, package, puts, Syntax of Tcl regular expressions, regexp, set, split, string
I'd just do this:
puts [exec grep -Fvf file1 file2]

Script to generate N number of valid ip addresses?

I am new to TCL and trying to learn by doing some simple scripting, I have taken upon to write a simple script which generates valid ip address from a given starting ip address.
I have managed to write one but have run into two problems,
The last octet has a zero getting added in front of the number that is 192.168.1.025
When i specify the starting ip something like this 250.250.5.1 it fails to generate proper ips,
Below is my code:
proc generate {start_addr total_addr} {
if {$total_addr == 0} {return}
regexp {([0-9]+\.)([0-9]+\.)([0-9]+\.)([0-9]+)} $start_addr match a b c d
set filename "output.txt"
set fileId [open $filename "a"]
puts $fileId $a$b$c$d
close $fileId
while {$a<255 && $b <255 && $c <255 && $d < 255 } {
set d [expr {$d + 1}];
set filename "output.txt"
set fileId [open $filename "a"]
puts $fileId $a$b$c$d
close $fileId
set total_addr [expr {$total_addr - 1}];
if {$total_addr == 1} {return}
if {$total_addr > 1 && $d == 255} {
set c [expr {$c + 1}];
set d 1
set filename "output.txt"
set fileId [open $filename "a"]
puts $fileId $a$b$c$d
close $fileId
set total_addr [expr {$total_addr - 1}];
}
if {$total_addr > 1 && $c==255 && $d == 255} {
set b [expr {$b + 1}];
set c 1
set d 1
set filename "output.txt"
set fileId [open $filename "a"]
puts $fileId $a$b$c$d
close $fileId
set total_addr [expr {$total_addr - 1}];
}
if {$total_addr > 1 && $b == 255 && $c == 255 && $d == 255} {
set a [expr {$a + 1}];
set b 1
set c 1
set d 1
set filename "output.txt"
set fileId [open $filename "a"]
puts $fileId $a$b$c$d
close $fileId
set total_addr [expr {$total_addr - 1}];
}
}
}
flush stdout
puts "Please enter the starting IPv4 address with . as delimiter EX: 1.1.1.1"
set start_addr [gets stdin]
regexp {([0-9]+\.)([0-9]+\.)([0-9]+\.)([0-9]+)} $start_addr match a b c d
if {$a <= 255 & $b <= 255 & $c <= 255 & $d <= 255} {
puts "this is a valid ip address"
} else {
puts "this not a valid ip address"
}
flush stdout
puts "Please enter the total number of IPv4 address EX: 1000"
set total_addr [gets stdin]
set result [generate $start_addr $total_addr]
For parsing an IP address the simple way, it is better to use scan. If you know C's sscanf() function, Tcl's scan is very similar (in particular, %d matches a decimal number). Like that, we can do:
if {[scan $start_addr "%d.%d.%d.%d" a b c d] != 4} {
error "some components of address are missing"
}
It's a good idea to throw an error when things go wrong. You can catch them later or just let the script exit, depending on what's right for you. (You still need to check the number range.)
More generally, there's a package in Tcllib that does IP address parsing. It is far more complete than you're likely to need, but it's there.
Second major thing that you should do? Factor out the code to append a string to a file. It's can be a short procedure, short enough that it is obviously right.
proc addAddress {filename address} {
set fileId [open $filename "a"]
puts $fileId $address
close $fileId
}
Then you can replace:
set filename "output.txt"
set fileId [open $filename "a"]
puts $fileId $a$b$c$d
close $fileId
With:
addAddress "output.txt" $a$b$c$d
Less to go wrong. Less noise. (Protip: consider $a.$b.$c.$d there.)
More seriously, your code is just really unlikely to work. It's too complicated. In particular, you should generate one address each time through the loop, and you should concentrate on how to advance the counters right. Using incr to add one to an integer is highly recommended too.
You might try something like this:
incr d
if {$d > 255} {
set d 1
incr c
}
if {$c > 255} {
set c 1
incr b
}
if {$b > 255} {
set b 1
incr a
}
if {$a > 255} {
set a 1
}
But that's less than efficient. We can do better with this:
if {[incr d] > 255} {
set d 1
if {[incr c] > 255} {
set c 1
if {[incr b] > 255} {
set b 1
if {[incr a] > 255} {
set a 1
}
}
}
}
That's better (though actual valid IP addresses have a wider range: you can have a 0 or two in the middle, such as in 127.0.0.1…)
Splitting the address
Apart from using the ip package in Tcllib, there are a few ways to split up an IPv4 "dot-decimal" address and put the octet values into four variables. The one you used was
regexp {([0-9]+\.)([0-9]+\.)([0-9]+\.)([0-9]+)} $start_addr match a b c d
This basically works, but there are a couple of problems with it. The first problem is that the address 1.234.1.234 will be split up as 1. 234. 1. 234, and then when you try to use the incr command on the first three variables you will get an error message (I suppose that's why you used expr {$x + 1} instead of incr). Instead, write
regexp {(\d+)\.(\d+)\.(\d+)\.(\d+)} $start_addr match a b c d
This expression puts the dots outside the capturing parentheses and places integer values into the variables. It's also a good idea to use the shorthand \d (decimal digit) instead of the [0-9] sets. But you could also do this:
regexp -all -inline -- {\d+} $start_addr
where you simply ask regexp to collect all (-all) unbroken sequences of decimal digits and return them as a list (-inline). Since you get the result as a list, you then need to lassign (list assign) them into variables:
lassign [regexp -all -inline -- {\d+} $start_addr] a b c d
But if you can make do without a regular expression, you should. Donal suggested
scan $start_addr "%d.%d.%d.%d" a b c d
which is fine. Another way is to split the string at the dots:
lassign [split $start_addr .] a b c d
(again you get a list as the result and need to assign it to your variables in a second step).
Checking the result
As Donal wrote, it's a good idea whenever you create data from user input (and in many other situations as well) to check that you did get what you expected to get. If you use an assigning regexp the command returns 1 or 0 depending on whether the matched succeeded or failed. This result can be plugged directly into an if invocation:
if {![regexp {(\d+)\.(\d+)\.(\d+)\.(\d+)} $start_addr match a b c d]} {
error "input data didn't match IPv4 dot-decimal notation"
}
Donal already gave an example of checking the result of scan. In this case you check against 4 since the command returns the number of successful matches it managed.
if {[scan $start_addr "%d.%d.%d.%d" a b c d] != 4} {
error "input data didn't match IPv4 dot-decimal notation"
}
If you use either of the list-creating commands (inline regexp or split) you can check the list length of the result:
if {[llength [set result [split $start_addr .]]] == 4} {
lassign $result a b c d
} else {
error "input data didn't match IPv4 dot-decimal notation"
}
This check should be followed by checking all variables for octet values (0-255). One convenient way to do this is like this:
proc isoctet args {
::tcl::mathop::* {*}[lmap octet $args {expr {0 <= $octet && $octet <= 255}}]
}
(It's usually a good idea to break out tests as functions; it's practically the law* if you are using the tests in several places in your code.)
This command, isoctet, takes a number of values as arguments, lumping them together as a list in the special parameter args. The lmap command creates a new list with the same number of elements as the original list, where the value of each element is the result of applying the given script to the corresponding element in the original list. In this case, lmap produces a list of ones and zeros depending on whether the value was a true octet value or not. Example:
input list: 1 234 567 89
result list: 1 1 0 1
The resulting list is then expanded by {*} into individual arguments to the ::tcl::mathop::* command, which multiplies them together. Why? Because if 1 and 0 can be taken as true and false values, the product of a list of ones and zeros happens to be exactly the same as the logical conjunction (AND, &&) of the same list.
result 1: 1 1 0 1
product : 0 (false)
result 2: 1 1 1 1
product : 1 (true)
So,
if {![isoctet $a $b $c $d]} {
error "one of the values was outside the (0, 255) range"
}
Generating new addresses
Possibly the least sexy way to generate a new address is to use a ready-made facility in Tcl: binary.
binary scan [binary format c* [list $a $b $c $d]] I n
This invocation first converts a list of integer values (while constraining them to octet size) to a bit string, and then interprets that bit string as a big-endian 32-bit integer (if your machine uses little-endian integers, you should use the conversion specifier i instead of I).
Increment the number. Wheee!
incr n
Convert it back to a list of 8-bit values:
binary scan [binary format I $n] c4 parts
The components of parts are now signed 8-bit integers, i.e. the highest value is 127, and the values that should be higher than 127 are now negative values. Convert the values to unsigned (0 - 255) values like this:
lassign [lmap part $parts {expr {$part & 0xff}}] a b c d
and join them up to a dot-decimal string like this:
set addr [join [list $a $b $c $d] .]
If you want more than one new address, repeat the process.
Documentation: binary, error, expr, if, incr, join, lassign, llength, lmap, mathop, proc, regexp, scan, set, split, {*}
lmap is a Tcl 8.6 command. Pure-Tcl implementations for Tcl 8.4 and 8.5 are available here.
*) If there were any laws. What you must learn is that these rules are no different than the rules of the Matrix. Some of them can be bent. Others can be broken.
proc ip_add { ip add } {
set re "^\\s*(\\d+)\.(\\d+)\.(\\d+)\.(\\d+)\\s*$"
if [regexp $re $ip match a b c d] {
set x [expr {(($a*256+$b)*256+$c)*256+$d+$add}]
set d [expr {int(fmod($x,256))}]
set x [expr {int($x/256)}]
set c [expr {int(fmod($x,256))}]
set x [expr {int($x/256)}]
set b [expr {int(fmod($x,256))}]
set x [expr {int($x/256)}]
set a [expr {int(fmod($x,256))}]
return "$a.$b.$c.$d"
} else {
puts stderr "invalid ip $ip"
exit 1
}
}
set res [ip_add "127.0.0.1" 512]
puts "res=$res"

Looking for a search string in a file and using those lines for processing in TCL

To be more precise:
I need to be looking into a file abc.txt which has contents something like this:
files/f1/atmp.c 98 100
files/f1/atmp1.c 89 100
files/f1/atmp2.c !! 75 100
files/f2/btmp.c 92 100
files/f2/btmp2.c !! 85 100
files/f3/xtmp.c 92 100
The script needs to find "!!" and use those lines to print out the following as output:
atmp2.c 75
btmp2.c 85
Any help?
this should do the trick.
set data {files/f1/atmp.c 98 100
files/f1/atmp1.c 89 100
files/f1/atmp2.c !! 75 100
files/f2/btmp.c 92 100
files/f2/btmp2.c !! 85 100
files/f3/xtmp.c 92 100}
set lines [split $data \n]
foreach line $lines {
set match [regexp {(\S+)\s+!!\s+(\d+)} $line -> file num]
if {$match} {puts "$file $num"}
}
Although regexp has a -all switch I don't think we can use it here as we only get the last match vars with -all
If your file isn't huge, you can slurp the whole thing into memory, split the lines into a TCL list, and then iterate through the list looking for a match. For example:
set fh [open foo]
set lines [read $fh]
close $fh
set lines [split $lines "\n"]
foreach line $lines {
if { [regexp {.*/(\S+\.c)\s*!!\s*(\d+)} $line match file data] } {
puts "$file $data"
}
}
This will successfully return just the lines with "!!" in them. With your posted corpus, the results are:
atmp2.c 75
btmp2.c 85
I might be tempted in this case to exec to awk:
set output [exec awk {$2 == "!!" {print $1, $3}} abc.txt]
puts $output
The trick is to combine the code that reads lines from the file with a regular expression that detects matching lines and extracts the relevant parts (a one-step process with regexp). The only tricky part is working out what exactly to use as the regular expression, so that you get exactly what you want. I'm going to guess that you're after the parts of the filenames after the /, that those filenames won't contain spaces, and that the number you're after is the entirety of the first digit sequence after the double exclamation. (Other formats are possible, some of which are easier to extract with other tools such as scan.) That would give us something like this:
set f [open abc.txt]
while {[gets $f line] >= 0} {
if {[regexp {([^\s/]+)\s+!!\s+(\d+)} $line -> name value]} {
# Or do whatever you want with these
puts "$name $value"
}
}
close $f
(The gets command with two arguments returns the length of line read, or -1 on failure. For normal files the only failure mode is EOF, so we can just terminate the loop when we get a negative value. Other kinds of channels can be more complex…)